This function performs the QCA minimization of an input truth table or, if the input is a dataset, minimizes a set of causal conditions with respect to an outcome. Three minimization methods are available: the classical Quine-McCluskey, the enhanced Quine-McCluskey, and the latest Consistency Cubes algorithm, which is built for performance.
All algorithms return the same exact solutions; see Dusa (2017) and Dusa and Thiem (2015).
minimize(input, include = "", exclude = NULL, dir.exp = "", pi.cons = 0,
         pi.depth = 0, sol.cons = 0, sol.cov = 1, sol.depth = 0,
         min.pin = FALSE, row.dom = FALSE, all.sol = FALSE, details = FALSE,
         use.tilde = FALSE, method = "CCubes", ...)
input | A truth table object (preferred) or a data frame containing calibrated causal conditions and an outcome.
include | A vector of other output values to include in the minimization process.
exclude | A vector of row numbers from the truth table, or a matrix of causal combinations to exclude from the minimization process.
dir.exp | A vector of directional expectations to derive intermediate solutions.
pi.cons | Numerical fuzzy value between 0 and 1, minimal consistency threshold for a prime implicant to be declared as sufficient.
pi.depth | Integer, a maximum number of causal conditions to be used when searching for conjunctive prime implicants.
sol.cons | Numerical fuzzy value between 0 and 1, minimal consistency threshold for a model to be declared as sufficient.
sol.cov | Numerical fuzzy value between 0 and 1, minimal coverage threshold for a model to be declared as necessary.
sol.depth | Integer, a maximum number of prime implicants to be used when searching for disjunctive solutions.
min.pin | Logical, terminate the search at the depth where newly found prime implicants do not contribute to minimally solving the PI chart.
row.dom | Logical, perform row dominance in the prime implicants' chart to eliminate redundant prime implicants.
all.sol | Logical, search for all possible solutions even if not minimal.
details | Logical, print more details about the solution.
use.tilde | Logical, use tilde to signal the absence of conditions.
method | Minimization method, one of "CCubes" (default), "QMC" (the classical Quine-McCluskey) or "eQMC" (the enhanced Quine-McCluskey).
... | Other arguments to be passed to function truthTable().
Most of the time, this function takes a truth table object as the input for the minimization procedure, but the same argument can refer to a data frame containing calibrated columns.
For the latter case, the function minimize() originally had some additional formal arguments which were sent to the function truthTable(): outcome, conditions, n.cut, incl.cut, show.cases, use.letters and inf.test.
All of these parameters are still possible with function minimize(), but since they are sent to the truthTable() function anyway, it is unnecessary to duplicate their explanation here. The only situation which does need an additional description relates to the argument outcome: unlike truthTable(), which accepts a single outcome, the function minimize() accepts multiple outcomes and performs a minimization for each of them (a situation when all columns are considered causal conditions).
The argument include specifies which other truth table rows are included in the minimization process. Most often, the remainders are included but any value accepted in the argument explain is also accepted in the argument include.
The argument exclude is used to exclude truth table rows from the minimization process, from the positive configurations and/or from the remainders. It can be specified as a vector of truth table line numbers, or as a matrix of causal combinations.
The argument dir.exp is used to specify directional expectations, as described by Ragin (2003). They can be specified as a single string, with values separated by commas. For multi-value conditions, several expected values for the same condition are specified together, separated by semicolons. The total length of the directional expectations must match the number of causal conditions specified in the analysis, using a dash "-" if there are no particular expectations for a specific causal condition.
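The format described above can be assembled programmatically. The following is a minimal base R sketch, not part of the QCA package: the helper build_dir_exp() is hypothetical, and the condition names are borrowed from the Lipset examples on this page. It only builds the string, with one comma-separated entry per condition and a dash where no expectation is stated.

```r
# Hypothetical helper (not a QCA function): build a dir.exp string from a
# named character vector of expectations, one entry per causal condition.
build_dir_exp <- function(conditions, expectations) {
    unknown <- setdiff(names(expectations), conditions)
    if (length(unknown) > 0) {
        stop("Unknown condition(s): ", paste(unknown, collapse = ", "))
    }
    # conditions without a stated expectation get a dash
    values <- rep("-", length(conditions))
    names(values) <- conditions
    values[names(expectations)] <- expectations
    paste(values, collapse = ",")
}

conditions <- c("DEV", "URB", "LIT", "IND", "STB")

# presence expected for DEV, LIT and STB, nothing stated for URB and IND
dir_exp <- build_dir_exp(conditions, c(DEV = "1", LIT = "1", STB = "1"))
dir_exp

# several expected values for a multi-value condition, joined by a semicolon
dir_exp_mv <- build_dir_exp(conditions, c(DEV = "1;2", STB = "1"))
dir_exp_mv
```

The resulting string always has as many entries as there are conditions, which is the length requirement stated above.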
Activating the details argument has the effect of printing parameters of fit for each prime implicant and each overall solution, the essential prime implicants being listed in the top part of the table. It also prints the truth table, in case the argument input has been provided as a data frame instead of a truth table object.
The argument use.tilde signals the absence of a causal condition, in a sufficiency relation with the outcome, using a tilde sign "~". It is ignored if the data is multivalent.
By default, the package QCA employs a different search algorithm based on Consistency Cubes (Dusa, 2017), analysing all possible combinations of causal conditions and all possible combinations of their respective levels. The structure of the input dataset (number of causal conditions, number of levels, number of unique rows in the truth table) has a direct impact on the search time, as all of those characteristics become entry parameters when calculating all possible combinations.
Consequently, two kinds of depth arguments are provided:
pi.depth | the maximum number of causal conditions needed to construct a prime implicant, the complexity level where the search can be stopped, as long as the PI chart can be solved.
sol.depth | the maximum number of prime implicants needed to find a solution (to cover all initial positive output configurations).
These arguments introduce a possible new way of deriving prime implicants and solutions, which can lead to different (i.e. even more parsimonious) results compared to the classical Quine-McCluskey. When either of them is modified from the default value of 0, the minimization method is automatically set to "CCubes" and the remainders are automatically included in the minimization.
The search time increases with these depths and, conversely, can be significantly shorter when the depths are smaller. Irrespective of how large pi.depth is, the algorithm will always stop at a maximum complexity level where no new, non-redundant prime implicants are found.
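To see why the search time grows with the depth, the size of the search space can be counted directly. The sketch below is an illustration in base R and not the actual Consistency Cubes implementation: the function count_conjunctions() is hypothetical, and the numbers of levels (three for DEV, two for the other conditions) are assumed from the multi-value Lipset example on this page.

```r
# Hypothetical helper: number of candidate conjunctions that use exactly
# `depth` conditions, given the number of levels of each condition.
count_conjunctions <- function(n_levels, depth) {
    if (depth < 1 || depth > length(n_levels)) return(0)
    # each subset of `depth` conditions contributes the product of its levels
    subsets <- combn(length(n_levels), depth, simplify = FALSE)
    sum(vapply(subsets, function(idx) prod(n_levels[idx]), numeric(1)))
}

# assumed levels: DEV has values 0, 1, 2; the others are binary
n_levels <- c(DEV = 3, URB = 2, LIT = 2, IND = 2)

per_depth <- vapply(1:4, function(d) count_conjunctions(n_levels, d), numeric(1))
per_depth          # candidates at each complexity level
cumsum(per_depth)  # candidates examined up to a given pi.depth
```

Under these assumptions, capping the depth at 2 leaves 39 candidate conjunctions out of 107, which is the kind of saving a smaller pi.depth aims for.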
sol.depth is relevant only when solving the PI chart by activating the argument all.sol. In such a situation, the number of combinations of all possible numbers of prime implicants is potentially too large to be solved in polynomial time and, if not otherwise specified, the depth for the disjunctive solutions is automatically bounded to 5 prime implicants.
The default method to solve the PI chart (when all.sol = FALSE) is to find the minimal number (k) of prime implicants needed to cover all initial positive output configurations, then to search exhaustively through all possible disjunctions of k prime implicants which do cover those configurations.
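The default strategy can be sketched in a few lines of base R. This is a brute-force illustration under stated assumptions, not the package's own solver: the function solve_chart() is hypothetical, and the chart is copied by hand from the Cebotari and Vink PI chart printed in the examples on this page.

```r
# PI chart copied from the Cebotari & Vink example: prime implicants on the
# rows, positive output configurations on the columns
configs <- c("5", "15", "16", "24", "27", "29", "30", "31", "32")
covered <- list(
    "natpride"               = c("5", "15", "27", "29", "31"),
    "democ*ETHFRACT*POLDIS"  = c("15", "16"),
    "DEMOC*ETHFRACT*GEOCON"  = c("29", "30", "31", "32"),
    "DEMOC*ETHFRACT*poldis"  = c("29", "30"),
    "DEMOC*GEOCON*POLDIS"    = c("24", "31", "32"),
    "ETHFRACT*GEOCON*POLDIS" = c("15", "16", "31", "32")
)
chart <- t(vapply(covered, function(x) configs %in% x, logical(length(configs))))
colnames(chart) <- configs

# Hypothetical helper: find the smallest k for which some disjunction of k
# prime implicants covers every column, and return all such disjunctions
solve_chart <- function(chart) {
    for (k in seq_len(nrow(chart))) {
        combos <- combn(nrow(chart), k, simplify = FALSE)
        covers <- Filter(function(idx) {
            all(colSums(chart[idx, , drop = FALSE]) > 0)
        }, combos)
        if (length(covers) > 0) {
            return(lapply(covers, function(idx) rownames(chart)[idx]))
        }
    }
    list()
}

solutions <- solve_chart(chart)
length(solutions[[1]])  # the minimal number k of prime implicants
length(solutions)       # how many disjunctions of k PIs solve the chart
```

For this chart the sketch stops at k = 4 and returns four disjunctions, matching the four models M1 to M4 printed in that example.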
The argument min.pin introduces an additional parameter to control when to stop the search for prime implicants. It is based on the observation by Dusa (2017) that out of the entire set of non-redundant prime implicants, only a subset actually contributes to solving the chart with disjunctions of k PIs. The search can be stopped at the depth where the next subset of PIs does not contribute to solving the PI chart, thus avoiding unnecessary time spent on finding the maximal number of non-redundant PIs. Instead, it finds the minimal ("min") number of PIs ("pin") necessary to obtain exactly the same solutions, with a dramatically improved overall performance.
Once the PI chart is constructed using the prime implicants found in the previous stages, the argument row.dom can be used to further eliminate irrelevant prime implicants when solving the PI chart, applying the principle of row dominance: if a prime implicant A covers the same (initial) positive output configurations as another prime implicant B and at the same time covers other configurations which B does not cover, then B is irrelevant and eliminated.
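The row dominance principle stated above amounts to a strict-superset check between rows of the chart. The following base R sketch is an illustration only: dominated_rows() is a hypothetical helper, not the package's internal code, and the chart is again copied by hand from the Cebotari and Vink example on this page.

```r
# PI chart copied from the Cebotari & Vink example (rows: prime implicants)
configs <- c("5", "15", "16", "24", "27", "29", "30", "31", "32")
covered <- list(
    "natpride"               = c("5", "15", "27", "29", "31"),
    "democ*ETHFRACT*POLDIS"  = c("15", "16"),
    "DEMOC*ETHFRACT*GEOCON"  = c("29", "30", "31", "32"),
    "DEMOC*ETHFRACT*poldis"  = c("29", "30"),
    "DEMOC*GEOCON*POLDIS"    = c("24", "31", "32"),
    "ETHFRACT*GEOCON*POLDIS" = c("15", "16", "31", "32")
)
chart <- t(vapply(covered, function(x) configs %in% x, logical(length(configs))))
colnames(chart) <- configs

# Hypothetical helper: flag every row B for which some other row A covers
# everything B covers plus at least one more configuration
dominated_rows <- function(chart) {
    n <- nrow(chart)
    out <- logical(n)
    names(out) <- rownames(chart)
    for (b in seq_len(n)) {
        for (a in seq_len(n)) {
            if (a == b) next
            covers_same <- all(chart[a, ] >= chart[b, ])
            covers_more <- any(chart[a, ] > chart[b, ])
            if (covers_same && covers_more) out[b] <- TRUE
        }
    }
    out
}

dom <- dominated_rows(chart)
names(dom)[dom]                        # rows that row dominance would drop
reduced <- chart[!dom, , drop = FALSE]
nrow(reduced)                          # rows left to solve the chart with
```

With this chart the check flags two rows, DEMOC*ETHFRACT*poldis and democ*ETHFRACT*POLDIS, leaving four prime implicants.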
The argument all.sol automatically deactivates the argument min.pin, because it aims to exhaustively identify all possible non-redundant disjunctions of n prime implicants that solve the PI chart, where n >= k, with an inflated number of possible solutions. Depending on the complexity of the PI chart, it may sometimes take a very long time to identify all possible non-redundant disjunctive solutions (disjunctions that do not contain a previously found solution).
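The difference between the default search and an exhaustive one can be shown on a small chart. The sketch below uses base R and a made-up toy PI chart (five hypothetical prime implicants A to E over four configurations); all_solutions() is a hypothetical helper that illustrates the idea of keeping only non-redundant disjunctions, and is not the package's algorithm.

```r
# Hypothetical toy PI chart: prime implicants on rows, configurations on columns
chart <- rbind(
    A = c(TRUE,  TRUE,  FALSE, FALSE),
    B = c(FALSE, FALSE, TRUE,  TRUE),
    C = c(TRUE,  FALSE, TRUE,  FALSE),
    D = c(FALSE, TRUE,  FALSE, FALSE),
    E = c(FALSE, FALSE, FALSE, TRUE)
)
colnames(chart) <- c("1", "2", "3", "4")

# Hypothetical helper: enumerate disjunctions of increasing size n and keep
# those that cover all columns without containing an earlier solution
all_solutions <- function(chart) {
    found <- list()
    for (n in seq_len(nrow(chart))) {
        for (idx in combn(nrow(chart), n, simplify = FALSE)) {
            covers_all <- all(colSums(chart[idx, , drop = FALSE]) > 0)
            redundant <- any(vapply(found, function(s) all(s %in% idx), logical(1)))
            if (covers_all && !redundant) {
                found[[length(found) + 1]] <- idx
            }
        }
    }
    vapply(found, function(idx) paste(rownames(chart)[idx], collapse = " + "),
           character(1))
}

models <- all_solutions(chart)
models
```

Here the minimal solution needs k = 2 prime implicants (A + B), while the exhaustive search also returns three non-redundant disjunctions of n = 3 prime implicants, which is the inflation in the number of solutions mentioned above.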
The task of solving the PI chart depends on its size, with prime implicants on the rows and the positive output configurations on the columns. Since the columns are fixed, another possible way to reduce the solving time is to eliminate redundant rows, by activating the argument row.dom.
When minimizing a dataset instead of a truth table, unless otherwise specified, the argument incl.cut is automatically set to the minimum value between pi.cons and sol.cons, then passed to the function truthTable().
The argument sol.cons introduces another possibility to change the method of solving the PI chart. Normally, once the solutions are found among all possible combinations of k prime implicants, consistencies and coverages are subsequently calculated. When sol.cons is lower than 1, solutions are instead searched based on their consistencies, which should be at least equal to this threshold.
An object of class "qca" when using a single outcome, or class "mqca" when using multiple outcomes. These objects are lists having the following components:
tt | The truth table object.
options | Values for the various options used in the function (including defaults).
negatives | The line number(s) of the negative configuration(s).
initials | The initial positive configuration(s).
PIchart | A list containing the PI chart(s).
primes | The prime implicant(s).
solution | A list of solution(s).
essential | A list of essential PI(s).
pims | A list of PI membership scores.
IC | The matrix containing the inclusion and coverage scores for the solution(s).
SA | A list of simplifying assumptions.
i.sol | A list of components specific to intermediate solution(s), each having a PI chart, prime implicant membership scores, (non-simplifying) easy counterfactuals and difficult counterfactuals.
call | The user's command which produced all these objects and result(s).
Cebotari, V.; Vink, M.P. (2013) A Configurational Analysis of Ethnic Protest in Europe. International Journal of Comparative Sociology vol.54, no.4, pp.298-324.
Cebotari, V.; Vink, M.P. (2015) Replication Data for: A configurational analysis of ethnic protest in Europe, Harvard Dataverse, V2, DOI: 10.7910/DVN/PT2IB9
Cronqvist, L.; Berg-Schlosser, D. (2009) Multi-Value QCA (mvQCA), in Rihoux, B.; Ragin, C. (eds.) Configurational Comparative Methods. Qualitative Comparative Analysis (QCA) and Related Techniques, SAGE.
Dusa, A.; Thiem, A. (2015) Enhancing the Minimization of Boolean and Multivalue Output Functions With eQMC. Journal of Mathematical Sociology vol.39, no.2, pp.92-108.
Ragin, C.C. (2003) Recent Advances in Fuzzy-Set Methods and Their Application to Policy Questions. WP 2003-9, COMPASSS.
Ragin, C.C. (2009) Qualitative Comparative Analysis Using Fuzzy-Sets (fsQCA), in Rihoux, B.; Ragin, C. (eds.) Configurational Comparative Methods. Qualitative Comparative Analysis (QCA) and Related Techniques, SAGE.
Ragin, C.C.; Strand, S.I. (2008) Using Qualitative Comparative Analysis to Study Causal Order: Comment on Caren and Panofsky (2005). Sociological Methods & Research vol.36, no.4, pp.431-441.
Rihoux, B.; De Meur, G. (2009) Crisp-Set Qualitative Comparative Analysis (csQCA), in Rihoux, B.; Ragin, C. (eds.) Configurational Comparative Methods. Qualitative Comparative Analysis (QCA) and Related Techniques, SAGE.
# -----
# Lipset binary crisp data
data(LC)

# the associated truth table
ttLC <- truthTable(LC, "SURV", sort.by = "incl, n")
ttLC

  OUT: outcome value
    n: number of cases in configuration
 incl: sufficiency inclusion score

     DEV URB LIT IND STB   OUT    n  incl  PRI
32    1   1   1   1   1     1     4  1.000 1.000
22    1   0   1   0   1     1     2  1.000 1.000
24    1   0   1   1   1     1     2  1.000 1.000
 1    0   0   0   0   0     0     3  0.000 0.000
 2    0   0   0   0   1     0     2  0.000 0.000
 5    0   0   1   0   0     0     2  0.000 0.000
 6    0   0   1   0   1     0     1  0.000 0.000
23    1   0   1   1   0     0     1  0.000 0.000
31    1   1   1   1   0     0     1  0.000 0.000

# conservative solution (Rihoux & De Meur 2009, p.57)
cLC <- minimize(ttLC)
cLC

M1: DEV*urb*LIT*STB + DEV*LIT*IND*STB <=> SURV

# view the Venn diagram for the associated truth table
library(venn)
venn(cLC)
# add details and case names
minimize(ttLC, details = TRUE, show.cases = TRUE)

n OUT = 1/0/C: 8/10/0
  Total      : 18

Number of multiple-covered cases: 2

M1: DEV*urb*LIT*STB + DEV*LIT*IND*STB <=> SURV

                    inclS   PRI   covS   covU   cases
------------------------------------------------------------------
1  DEV*urb*LIT*STB  1.000  1.000  0.500  0.250  FI,IE; FR,SE
2  DEV*LIT*IND*STB  1.000  1.000  0.750  0.500  FR,SE; BE,CZ,NL,UK
------------------------------------------------------------------
   M1               1.000  1.000  1.000

# negating the outcome
ttLCn <- truthTable(LC, "~SURV", sort.by = "incl, n")
minimize(ttLCn)

M1: dev*urb*ind + DEV*LIT*IND*stb <=> surv

# using a tilde instead of upper/lower case names
minimize(ttLCn, use.tilde = TRUE)

M1: ~DEV*~URB*~IND + DEV*LIT*IND*~STB <=> ~SURV

# parsimonious solution, positive output
pLC <- minimize(ttLC, include = "?", details = TRUE, show.cases = TRUE)
pLC

n OUT = 1/0/C: 8/10/0
  Total      : 18

Number of multiple-covered cases: 0

M1: DEV*STB <=> SURV

            inclS   PRI   covS   covU   cases
-----------------------------------------------------------------
1  DEV*STB  1.000  1.000  1.000    -    FI,IE; FR,SE; BE,CZ,NL,UK
-----------------------------------------------------------------
   M1       1.000  1.000  1.000

# the associated simplifying assumptions
pLC$SA

$M1
   DEV URB LIT IND STB
18  1   0   0   0   1
20  1   0   0   1   1
26  1   1   0   0   1
28  1   1   0   1   1
30  1   1   1   0   1

# parsimonious solution, negative output
pLCn <- minimize(ttLCn, include = "?", details = TRUE, show.cases = TRUE)
pLCn

n OUT = 1/0/C: 10/8/0
  Total      : 18

Number of multiple-covered cases: 5

M1: dev + stb <=> surv

        inclS   PRI   covS   covU   cases
--------------------------------------------------------------
1  dev  1.000  1.000  0.800  0.300  GR,PT,ES; IT,RO; HU,PL; EE
2  stb  1.000  1.000  0.700  0.200  GR,PT,ES; HU,PL; AU; DE
--------------------------------------------------------------
   M1   1.000  1.000  1.000

# -----
# Lipset multi-value crisp data (Cronqvist & Berg-Schlosser 2009, p.80)
data(LM)

# truth table
ttLM <- truthTable(LM, "SURV", conditions = "DEV, URB, LIT, IND",
                   sort.by = "incl", show.cases = TRUE)

# conservative solution, positive output
minimize(ttLM, details = TRUE, show.cases = TRUE)

n OUT = 1/0/C: 7/11/0
  Total      : 18

Number of multiple-covered cases: 0

M1: DEV{2}*LIT{1}*IND{1} + DEV{1}*URB{0}*LIT{1}*IND{0} => SURV{1}

                                inclS   PRI   covS   covU   cases
---------------------------------------------------------------------------
1  DEV{2}*LIT{1}*IND{1}         1.000  1.000  0.625  0.625  FR,SE; BE,NL,UK
2  DEV{1}*URB{0}*LIT{1}*IND{0}  1.000  1.000  0.250  0.250  FI,IE
---------------------------------------------------------------------------
   M1                           1.000  1.000  0.875

# parsimonious solution, positive output
minimize(ttLM, include = "?", details = TRUE, show.cases = TRUE)

n OUT = 1/0/C: 7/11/0
  Total      : 18

Number of multiple-covered cases: 0

M1: DEV{2} + DEV{1}*IND{0} => SURV{1}

                  inclS   PRI   covS   covU   cases
-------------------------------------------------------------
1  DEV{2}         1.000  1.000  0.625  0.625  FR,SE; BE,NL,UK
2  DEV{1}*IND{0}  1.000  1.000  0.250  0.250  FI,IE
-------------------------------------------------------------
   M1             1.000  1.000  0.875

# negate the outcome
ttLMn <- truthTable(LM, "~SURV", conditions = "DEV, URB, LIT, IND",
                    sort.by = "incl", show.cases = TRUE)

# conservative solution, negative output
minimize(ttLMn, details = TRUE, show.cases = TRUE)

n OUT = 1/0/C: 9/9/0
  Total      : 18

Number of multiple-covered cases: 0

M1: DEV{0}*URB{0}*IND{0} + DEV{1}*URB{0}*LIT{1}*IND{1} => ~SURV{1}

                                inclS   PRI   covS   covU   cases
------------------------------------------------------------------------------------
1  DEV{0}*URB{0}*IND{0}         1.000  1.000  0.800  0.800  GR,IT,PT,RO,ES; EE,HU,PL
2  DEV{1}*URB{0}*LIT{1}*IND{1}  1.000  1.000  0.100  0.100  AU
------------------------------------------------------------------------------------
   M1                           1.000  1.000  0.900

# parsimonious solution, negative output
minimize(ttLMn, include = "?", details = TRUE, show.cases = TRUE)

n OUT = 1/0/C: 9/9/0
  Total      : 18

Number of multiple-covered cases: 0

M1: DEV{0} + DEV{1}*URB{0}*IND{1} => ~SURV{1}

                         inclS   PRI   covS   covU   cases
-----------------------------------------------------------------------------
1  DEV{0}                1.000  1.000  0.800  0.800  GR,IT,PT,RO,ES; EE,HU,PL
2  DEV{1}*URB{0}*IND{1}  1.000  1.000  0.100  0.100  AU
-----------------------------------------------------------------------------
   M1                    1.000  1.000  0.900

# -----
# Lipset fuzzy sets data (Ragin 2009, p.112)
data(LF)

# truth table using a very low inclusion cutoff
ttLF <- truthTable(LF, "SURV", incl.cut = 0.7, show.cases = TRUE,
                   sort.by = "incl")

# conservative solution
minimize(ttLF, details = TRUE, show.cases = TRUE)

n OUT = 1/0/C: 8/10/0
  Total      : 18

Number of multiple-covered cases: 2

M1: DEV*urb*LIT*STB + DEV*LIT*IND*STB => SURV

                    inclS   PRI   covS   covU   cases
------------------------------------------------------------------
1  DEV*urb*LIT*STB  0.809  0.761  0.433  0.196  FI,IE; FR,SE
2  DEV*LIT*IND*STB  0.843  0.821  0.622  0.385  FR,SE; BE,CZ,NL,UK
------------------------------------------------------------------
   M1               0.871  0.851  0.818

# parsimonious solution
minimize(ttLF, include = "?", details = TRUE, show.cases = TRUE)

n OUT = 1/0/C: 8/10/0
  Total      : 18

Number of multiple-covered cases: 0

M1: DEV*STB => SURV

            inclS   PRI   covS   covU   cases
-----------------------------------------------------------------
1  DEV*STB  0.869  0.848  0.824    -    FI,IE; FR,SE; BE,CZ,NL,UK
-----------------------------------------------------------------
   M1       0.869  0.848  0.824

# intermediate solution using directional expectations
iLF <- minimize(ttLF, include = "?", details = TRUE, show.cases = TRUE,
                dir.exp = "1,1,1,1,1")
iLF

n OUT = 1/0/C: 8/10/0
  Total      : 18

p.sol: DEV*STB

Number of multiple-covered cases: 0

M1: DEV*LIT*STB => SURV

                inclS   PRI   covS   covU   cases
---------------------------------------------------------------------
1  DEV*LIT*STB  0.869  0.848  0.824    -    FI,IE; FR,SE; BE,CZ,NL,UK
---------------------------------------------------------------------
   M1           0.869  0.848  0.824

# -----
# Cebotari & Vink (2013, 2015)
data(CVF)

ttCVF <- truthTable(CVF, outcome = "PROTEST", incl.cut = 0.8,
                    show.cases = TRUE, sort.by = "incl, n")

pCVF <- minimize(ttCVF, include = "?", details = TRUE, show.cases = TRUE)
pCVF

n OUT = 1/0/C: 13/16/0
  Total      : 29

Number of multiple-covered cases: 5

M1: natpride + DEMOC*GEOCON*POLDIS + (democ*ETHFRACT*POLDIS + DEMOC*ETHFRACT*GEOCON) => PROTEST
M2: natpride + DEMOC*GEOCON*POLDIS + (democ*ETHFRACT*POLDIS + DEMOC*ETHFRACT*poldis) => PROTEST
M3: natpride + DEMOC*GEOCON*POLDIS + (DEMOC*ETHFRACT*GEOCON + ETHFRACT*GEOCON*POLDIS) => PROTEST
M4: natpride + DEMOC*GEOCON*POLDIS + (DEMOC*ETHFRACT*poldis + ETHFRACT*GEOCON*POLDIS) => PROTEST

                                                       ---------------------------------
                           inclS   PRI   covS   covU   (M1)   (M2)   (M3)   (M4)
---------------------------------------------------------------------------------
1  natpride                0.899  0.807  0.597  0.121  0.132  0.122  0.136  0.126
2  DEMOC*GEOCON*POLDIS     0.906  0.805  0.342  0.065  0.065  0.070  0.065  0.065
---------------------------------------------------------------------------------
3  democ*ETHFRACT*POLDIS   0.842  0.718  0.299  0.000  0.040  0.040
4  DEMOC*ETHFRACT*GEOCON   0.935  0.826  0.480  0.000  0.085         0.085
5  DEMOC*ETHFRACT*poldis   0.932  0.773  0.417  0.000         0.085         0.085
6  ETHFRACT*GEOCON*POLDIS  0.869  0.786  0.365  0.005                0.045  0.045
---------------------------------------------------------------------------------
   M1                      0.877  0.777  0.805
   M2                      0.877  0.777  0.805
   M3                      0.879  0.782  0.810
   M4                      0.879  0.782  0.810

                           cases
--------------------------------
1  natpride                CrimRussiansUkr,RussiansUkraine; HungariansYugo,KosovoAlbanians; RussiansLatvia; BasquesSpain; AlbaniansFYROM
2  DEMOC*GEOCON*POLDIS     HungariansRom,CatholicsNIreland; AlbaniansFYROM; RussiansEstonia
--------------------------------
3  democ*ETHFRACT*POLDIS   HungariansYugo,KosovoAlbanians; GagauzMoldova
4  DEMOC*ETHFRACT*GEOCON   BasquesSpain; SerbsFYROM,CatalansSpain; AlbaniansFYROM; RussiansEstonia
5  DEMOC*ETHFRACT*poldis   BasquesSpain; SerbsFYROM,CatalansSpain
6  ETHFRACT*GEOCON*POLDIS  HungariansYugo,KosovoAlbanians; GagauzMoldova; AlbaniansFYROM; RussiansEstonia
--------------------------------

# inspect the PI chart
pCVF$PIchart

                        5  15 16 24 27 29 30 31 32
natpride                x  x  -  -  x  x  -  x  -
democ*ETHFRACT*POLDIS   -  x  x  -  -  -  -  -  -
DEMOC*ETHFRACT*GEOCON   -  -  -  -  -  x  x  x  x
DEMOC*ETHFRACT*poldis   -  -  -  -  -  x  x  -  -
DEMOC*GEOCON*POLDIS     -  -  -  x  -  -  -  x  x
ETHFRACT*GEOCON*POLDIS  -  x  x  -  -  -  -  x  x

# DEMOC*ETHFRACT*poldis is dominated by DEMOC*ETHFRACT*GEOCON
# using row dominance to solve the PI chart
pCVFrd <- minimize(ttCVF, include = "?", row.dom = TRUE,
                   details = TRUE, show.cases = TRUE)

# plot the prime implicants on the outcome
pims <- pCVFrd$pims

par(mfrow = c(2, 2))
for(i in 1:4) {
    XYplot(pims[, i], CVF$PROTEST, cex.axis = 0.6)
}
# -----
# temporal QCA (Ragin & Strand 2008)
data(RS)
minimize(RS, outcome = "REC", details = TRUE, show.cases = TRUE)

  OUT: outcome value
    n: number of cases in configuration
 incl: sufficiency inclusion score

     P   E   A   S   EBA   OUT    n  incl  PRI   cases
 3   0   0   0   0    -     0     3  0.000 0.000 15,16,17
15   0   1   0   0    -     0     1  0.000 0.000 14
22   0   1   1   1    0     1     1  1.000 1.000 13
27   1   0   0   0    -     0     1  0.000 0.000 12
30   1   0   0   1    -     0     3  0.000 0.000 9,10,11
36   1   0   1   1    -     0     2  0.000 0.000 7,8
42   1   1   0   1    -     1     1  1.000 1.000 6
44   1   1   1   0    1     1     2  1.000 1.000 4,5
46   1   1   1   1    0     1     1  1.000 1.000 3
47   1   1   1   1    1     1     2  1.000 1.000 1,2

n OUT = 1/0/C: 7/10/0
  Total      : 17

Number of multiple-covered cases: 3

M1: P*E*S + P*E*A*EBA + E*A*eba*S <=> REC

              inclS   PRI   covS   covU   cases
---------------------------------------------------
1  P*E*S      1.000  1.000  0.571  0.143  6; 3; 1,2
2  P*E*A*EBA  1.000  1.000  0.571  0.286  4,5; 1,2
3  E*A*eba*S  1.000  1.000  0.286  0.143  13; 3
---------------------------------------------------
   M1         1.000  1.000  1.000