
Causal Inference Using Machine Learning:

An Application to Human Rights Treaty Ratification

October 28, 2017

Abstract

Recent advances in machine learning can be leveraged and incorporated into a causality
framework to make robust causal inference in political science. I demonstrate the applicabil-
ity of machine learning-based causal inference to the enduring puzzle of human rights treaty
ratification. The literature remains divided when it comes to explaining why states commit
to international human rights law. Many theories have been proposed only to get empirically
disputed. I address this conundrum in a causal variable importance analysis. Specifically, I
employ Judea Pearl’s structural causal framework and use an ensemble machine learning
method to estimate and compare the causal effect of multiple predictors of state decisions
to become and remain a party to three international human rights treaties on civil and po-
litical rights, women’s rights, and the right not to be tortured. The substantive findings have
important implications for our understanding of the issue of human rights treaty ratification.
1 Introduction

Despite its breathtaking advances and widespread impact, machine learning has rarely been

used in political science research. A major reason could be that political scientists tend to focus

on explaining the causal process rather than making predictions about unobserved instances of

an outcome. This cultural distinction between prediction and explanation notwithstanding, it

should not prevent us from taking advantage of powerful machine learning methods to make

robust causal inference. I present a causal variable importance analysis that combines machine

learning and a modern causal inference method to address a thorny problem in international

relations—the puzzle of human rights treaty ratification. This template of causal analysis can be

applied to a large variety of research questions in political science to draw causal inference from

observational data. Unlike studies aiming to detect statistical association, the findings of a causal

variable importance analysis should have a causal interpretation and could be used to make

decisions as to which variables are more important to intervene upon and which policy areas

are more effective to change in order to stop torture and state repression, protect human rights,

reduce war duration, and promote economic development, among others. Since almost every

outcome worth examining in the social sciences is multicausal, evaluating the relative importance

of its causal predictors could be extremely beneficial.

In the field of international relations, the existing literature remains divided over two

unresolved questions regarding (a) whether and how human rights treaties causally impact

state behavior; and (b) why countries ratify human rights treaties in the first place. In this

paper, I take a novel approach to addressing the second question by investigating the factors

that potentially cause states to ratify three major United Nations (UN) human rights treaties,

including the International Covenant on Civil and Political Rights (ICCPR), the Convention on

the Elimination of All Forms of Discrimination against Women (CEDAW), and the Convention

against Torture and Other Cruel, Inhuman or Degrading Treatment or Punishment (CAT). The

goal of my analysis is to estimate and compare the relative causal importance of these factors.

While I do not propose yet another theory as to why states ratify human rights treaties, this

paper nonetheless makes substantive contributions through its innovative methodological application. First, it subjects multiple existing theories to a different kind of empirical testing that

does not merely rely on a statistical significance test of regression model coefficients. Rather,

my theory-testing constructs a causal model that is more transparent in its causal assumptions

and uses machine learning-based estimation methods that are less dependent on functional form

assumptions. These two features of identification transparency and modeling flexibility are miss-

ing in many current empirical inquiries. Second, my analysis provides new substantive insights

into the causal determinants of treaty ratification. Previous research has analyzed predictors

of state commitment to universal treaties (Lupu 2014). Others have applied the machine learning technique of random forests to examine the predictive associations between various covariates

and state repression (Hill and Jones 2014). My investigation improves upon the former by using

machine learning in lieu of parametric linear regression models and upon the latter by endowing

the findings with a causal interpretation.

Fundamentally, my causal analysis follows Judea Pearl’s philosophy of “define first, iden-

tify second, estimate last” (van der Laan and Rose 2011). I start by examining the literature,

describing the research gaps, and reformulating them within a causal inference framework. A

careful review of existing theories and models of treaty ratification is critical not only to iden-

tify the research problems, but also to provide the substantive foundation upon which I can

then construct my graphical model of the causal process. Any causal analysis either implicitly

assumes or explicitly specifies a graphical causal model that represents the underlying data-

generating process. In an experimental setting, this causal model could be relatively simple. For

an observational study, particularly of a complex problem, a graphical causal model could be

substantially more intricate, but also exponentially more important to explicitly specify because

it encodes the many causal assumptions for identification. I then employ Pearl’s causal infer-

ence method (Pearl 2009) to identify the causal effects of interest and use an ensemble machine

learning technique called Super Learner (Polley and van der Laan 2010) to produce more robust

effect estimates. Finally, I interpret the causal findings in their substantive context.
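The estimation step can be previewed here. Super Learner is, at its core, cross-validated stacking: each candidate algorithm produces out-of-fold predictions, and a convex weighting of the candidates is chosen to minimize cross-validated error. A toy sketch with two hand-rolled base learners (the learners, the weight grid, and the simulated data are illustrative simplifications, not the Polley and van der Laan implementation):

```python
import random

def mean_learner(xs, ys):
    """Base learner 1: predict the training-set mean everywhere."""
    m = sum(ys) / len(ys)
    return lambda x: m

def ols_learner(xs, ys):
    """Base learner 2: simple least-squares line y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) or 1e-12
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x

def super_learner(xs, ys, learners, k=5):
    """Cross-validated stacking: weight learners to minimize K-fold CV error."""
    n = len(xs)
    z = [[0.0, 0.0] for _ in range(n)]        # out-of-fold predictions
    for f in range(k):
        fold = set(range(f, n, k))
        tr = [i for i in range(n) if i not in fold]
        fits = [L([xs[i] for i in tr], [ys[i] for i in tr]) for L in learners]
        for i in fold:
            for j, fit in enumerate(fits):
                z[i][j] = fit(xs[i])
    # Grid search over convex weights (two learners -> one free weight w).
    best_w, best_mse = 0.0, float("inf")
    for step in range(101):
        w = step / 100.0
        mse = sum((w * z[i][0] + (1 - w) * z[i][1] - ys[i]) ** 2 for i in range(n)) / n
        if mse < best_mse:
            best_w, best_mse = w, mse
    full = [L(xs, ys) for L in learners]       # refit on all data
    return (lambda x: best_w * full[0](x) + (1 - best_w) * full[1](x)), best_w

random.seed(0)
xs = [random.uniform(0, 10) for _ in range(200)]
ys = [2.0 * x + random.gauss(0, 1) for x in xs]
ensemble, w = super_learner(xs, ys, [mean_learner, ols_learner])
print(w)  # weight on the mean learner; near 0 here, since the truth is linear
```

The actual Super Learner library solves for the weights rather than grid-searching and accepts an arbitrary library of learners, but the cross-validation logic is the same.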

2 Theories and Models of Treaty Ratification

International human rights law is created to protect and promote universal human rights. It es-

tablishes substantive obligations for states parties and designs procedural mechanisms to mon-

itor the implementation of those obligations (De Schutter 2010; Alfredsson et al. 2009; Buer-

genthal 2006). A major global regime is the UN human rights treaty system, which includes

many treaties and their associated monitoring bodies (Keller and Ulfstein 2012; Rodley 2013).

A natural question arises in the literature as to why more and more countries have ratified and

remained committed to human rights treaties that are designed precisely to limit their freedom

in how they treat their own citizens. Figure 1 shows the increasing number of states parties to

three major human rights treaties from 1966 when the ICCPR was opened for ratification until

2013. The question of treaty ratification is a simple, yet vexing, puzzle that scholars have wres-

tled with for a long time. Many theories have been proposed, identifying various explanatory

variables, but consensus remains elusive.


Figure 1: Numbers of states parties to the ICCPR, the CEDAW, and the CAT from 1966 to 2013. The
three treaties were opened for ratification in 1966, 1979, and 1984, respectively.

First, some scholars believe that international socialization and the pressure of normative

conformity cause state leaders to realize that treaty ratification is the expected and appropriate thing to do (Finnemore and Sikkink 1998). Two studies by Goodliffe and Hawkins (2006)

and Hathaway (2007) find correlative evidence to support this argument when they use global

and regional ratification rates as proxies for international socialization. A prominent study that

follows, however, casts doubt on the role of socialization as the driving force behind treaty rat-

ification. Simmons (2009, 90–96) creates a series of variables (measuring regional normative

convergence, socialization opportunities, an index for two different time periods, and informa-

tion environments) that interact with density of regional ratification and argues that regional

ratification rates do not necessarily reflect a normative force as much as a strategic calculation. It

is not immediately clear what causal models that Simmons (2009) assumes would generate the

data and whether and how the effect estimates of those interactive variables could be causally

interpreted.

The second group of explanations focuses on the economic reasons that states voluntarily

commit to universal human rights standards and subject themselves to international monitoring.

According to these explanations, states use ratification as a signaling device to improve their so-

cial standing, expecting to gain material benefits in return, even if they are disingenuous about

treaty compliance. The need for social signaling could be significant given the pressures on

lending institutions, foreign investors, and developed countries to link foreign aid (Lebovic and

Voeten 2006; Spence 2014), international investment (Blanton and Blanton 2007), and pref-

erential trade agreements (Hafner-Burton 2005) to human rights issues in recipient countries.

Participation in international trade in particular has been shown to be a significant predictor of

treaty commitment (Lupu 2014). The transactional rationale of treaty ratification could be even

more pressing for transitional and newly independent countries since they often need external

economic assistance and financial support (Smith-Cannoy 2012, 64–91). This instrumental ar-

gument, however, turns out to have virtually no empirical support according to a critical study

by Nielsen and Simmons (2015). The two authors find no correlation between ratifications of

four major human rights regimes (under the ICCPR and the CAT) and either the amounts of

foreign aid from OECD countries or other measures of tangible and intangible benefits.

Third, the most popular explanations of treaty ratification often identify domestic institu-

tions as the key predictors. An early theory advances what is often referred to as the “lock in”

argument, according to which transitional countries or those facing potential democratic instability tend to join human rights regimes to lock in and consolidate their democratic institutions

(Moravcsik 2000). Although this argument finds some empirical support in another study (Neu-

mayer 2007), there are some dissenting findings as well, indicating that neither new democra-

cies nor unstable, volatile regimes are significant predictors of CAT ratification (Goodliffe and

Hawkins 2006).

Researchers also focus on the interaction of domestic institutions and human rights prac-

tices to explain ratifications (Hathaway 2007). Post-ratification, they argue, states that have

sub-standard human rights protection will likely incur a higher cost of policy adjustment. This

cost, in turn, is more likely to actually materialize if democratic institutions are in place to

constrain state leaders. As a result, a poor human rights record predicts a low probability of

ratification, but only among democracies. Ratification cost may rise as well, depending on the

types of domestic institutions, including constitutional ratification rules, political regimes, and

an independent court system (Simmons 2009, 67–77). Hill (2016a) applies the same logic to

explain how governments selectively make reservations when they ratify human rights treaties

based on their domestic standards and legal institutions. Conversely, autocracies are just as

likely to ratify human rights treaties since their ratifications are usually empty promises that do

not bring any real cost of behavioral change (von Stein 2016). The theoretical expectation is

that, among autocracies, prior human rights practices have little impact on the probability of

treaty ratification.

Generally, it should be noted, states are believed to be less likely to commit to international

treaties if their prior level of compliance is low. This is often known as the selection effect

argument (Downs, Rocke and Barsoom 1996; von Stein 2005; Simmons and Hopkins 2005). In

the literature on international human rights law, however, this selection effect is often treated as

a source of potential bias, where prior measures of the human rights outcome may confound the causal

relationship between human rights treaties and contemporaneous measure of the outcome. The

causal impact of prior human rights practices on treaty ratification is rarely a quantity of interest

to investigate.

For the most part, democracies are also believed to be more likely than autocracies to ratify

human rights treaties (Landman 2005) because of their domestic pressures or an incentive to

export rights-respecting norms. Hafner-Burton, Mansfield and Pevehouse (2015) similarly argue

that autocracies are less likely to join human rights regimes that may expose them to a high cost

of compliance. Vreeland (2008) adds an important caveat, however. He agrees that because

dictators are more inclined to use torture to retain power, they are indeed less likely to ratify

the CAT so as to avoid the cost associated with treaty violations. Yet dictators that co-exist with multiple political parties have to bear the cost of non-ratification in the form of

pressures from the opposition parties. It turns out, according to Vreeland (2008), dictatorships

with multiple parties are actually more likely to ratify the treaty.

Hollyer and Rosendorff (2011) concur with Vreeland (2008), but they differ with respect

to his reasoning. For repressive leaders, the two authors claim, ratifying the CAT can actually

bring some significant signaling benefits with respect to a particular audience: the domestic

opposition. Opposition groups perceive an authoritarian leader’s act of committing to the CAT

(and then flaunting treaty violations) as a credible signal of her strength. As a result, the oppo-

sition is less likely to mount a challenge, in effect prolonging the survival of the authoritarian

leader. The implication is that autocracies are more likely to ratify costly human rights treaties

not because they concede to pressures from the opposition parties as Vreeland (2008) argues,

but rather because they actively seek ratification to reap its domestic signaling benefits. For

many human rights scholars, this credible commitment argument to explain treaty ratification

among autocratic regimes “has some plausibility problems on its face” (Simmons 2011, 743),

but it has not been disputed empirically. Even Hollyer and Rosendorff (2011) have conducted no

causal tests, pointing instead to the statistical association between CAT ratification and several

different outcomes such as leadership survival, level of government repression, and the extent

of opposition efforts.

To summarize, exactly why states ratify human rights treaties is still unclear. There could

be many reasons and multiple theories, but findings are all over the map and often contradict

each other or go untested causally. Whether they are ideational, instrumental, or institutional,

theories of treaty ratification remain contested and the variation in treaty ratification “has not yet

been fully explored” (Hafner-Burton 2012, 271). As Simmons (2011, 737–744) also observes,

the question of why states ratify international human rights law remains “an enduring puzzle.”

My causal variable importance analysis offers a solution to this puzzle by comparing the

causal effect estimates of a large number of theoretically identified predictors of treaty ratifica-

tion across three major human rights treaties. Its novelty is the machine learning-based causal

inference approach that I adopt to address two major limitations in existing empirical inquiries.

First, existing studies mostly rely on regression models and the statistical significance of

ratification predictors. These models almost always make some restrictive parametric assump-

tions such as linearity, normality, and additivity to characterize the shape of the relationships

between treaty ratification and its predictors. Usually no justifications are provided as to why,

for example, a linear functional form or additivity of covariate effects is appropriate or accurate

instead of, for example, exponential, U-shaped, higher-order, or threshold effects. Since we do

not know a priori the underlying data-generating process, and usually it is virtually impossible to

know especially with regard to complex political phenomena, a conveniently specified statistical

model is likely a misspecified one, thus producing unreliable and biased effect estimates.

The second limitation is that virtually every study implies a causal query about the deter-

minants of treaty ratification. Yet, none has openly embraced a causal language and inference

framework within which to formulate and estimate the quantities of interest that correspond to

the research questions. This inattention to causal identification has unfortunate implications.

Scholars usually do not make explicit all their causal assumptions for identification. For ex-

ample, rather than an identification issue, endogeneity is often viewed as a statistical problem

because “there is no agreement on the most appropriate statistical approach” (von Stein 2016,

661). Researchers, as a result, often mistake estimation techniques such as propensity score

matching for an identification strategy (Pearl 2009, 349) and fail to explicitly link causal iden-

tification to transparent causal assumptions. Moreover, researchers often do not employ highly

useful and intuitive causal inference methods such as the backdoor criterion to guide their co-

variate selection and inform their statistical modeling, resorting instead to statistical “fixes” such

as country fixed effects and time trends that, without proper substantive justifications, could be

arbitrary or even counterproductive (Chaudoin, Hays and Hicks 2016).

The following two examples underscore the benefits of embracing a transparent causal

inference framework. In a prominent study of treaty commitment, the researcher fits multiple

regression models and successively regresses ratifications of human rights treaties and optional

protocols and provisions on several predictors that are measured contemporaneously, including

democracy, human rights violations, and their interaction term. The regression coefficient for

democracy is then interpreted as indication that “for each point increase in the measure of

Democracy, states with no human rights violations have between 10 and 54 percent increased

chance of ratifying human rights treaties than nondemocratic ones” (Hathaway 2007, 609).

This modeling procedure and interpretation are appropriate for a causal model repre-

sented in Figure 2a where X denotes democracy, Y stands for human rights violations, and

A is ratification. The majority of the literature, however, suggests that it is at least as likely

that democracy contemporaneously influences the extent of human rights violations rather than

the other way around even if it is possible that state repression may impede democratization

or undermine democracy in the next time period. A different causal model in Figure 2b could

be deemed just as, if not more, plausible, in which conditioning on human rights violations Y

would induce a post-treatment bias in estimating the causal effect of democracy X on ratifi-

cation A. The broader point is that whether the causal effect of interest can be identified and

estimated without bias depends intimately on the topology of the causal model and it is unnec-

essarily difficult, if not impossible, to fairly evaluate the causal model’s substantive plausibility

in the absence of an explicit, preferably graphical, representation of the causal model.


Figure 2: (a) Simplified causal model inferred from Hathaway (2007) of the effect of X (democracy)
on A (treaty ratification), which is confounded by Y (torture practice); (b) Modified causal model
adapted from Hathaway (2007) of the effect of X (democracy) on A (treaty ratification) both directly
and indirectly through Y , suggesting a potential post-treatment bias in the simplified model.
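The contrast between the two models can be checked mechanically. In a DAG, a backdoor path from X to A is any path that starts with an arrow pointing into X; a valid adjustment set must block every backdoor path without touching a directed path from X to A. A minimal path-based sketch (sufficient for three-node graphs; it ignores descendants of colliders, which these examples do not need):

```python
def paths(edges, src, dst):
    """Enumerate simple paths in the undirected skeleton of the DAG."""
    nbrs = {}
    for a, b in edges:
        nbrs.setdefault(a, set()).add(b)
        nbrs.setdefault(b, set()).add(a)
    out = []
    def walk(node, seen):
        if node == dst:
            out.append(list(seen))
            return
        for nxt in nbrs.get(node, ()):
            if nxt not in seen:
                walk(nxt, seen + [nxt])
    walk(src, [src])
    return out

def blocked(path, edges, given):
    """d-separation along one path: a non-collider blocks if conditioned on;
    a collider blocks unless conditioned on (descendants ignored here)."""
    E = set(edges)
    for a, m, b in zip(path, path[1:], path[2:]):
        collider = (a, m) in E and (b, m) in E
        if collider and m not in given:
            return True
        if not collider and m in given:
            return True
    return False

def backdoor_ok(edges, x, y, given):
    """True if `given` blocks every path that starts with an arrow into x."""
    E = set(edges)
    for p in paths(edges, x, y):
        if len(p) > 2 and (p[1], x) in E and not blocked(p, edges, given):
            return False
    return True

def on_causal_path(edges, x, y, given):
    """True if any conditioning variable lies on a directed path x -> ... -> y."""
    E = set(edges)
    for p in paths(edges, x, y):
        if all((a, b) in E for a, b in zip(p, p[1:])) and any(v in given for v in p[1:-1]):
            return True
    return False

model_a = [("Y", "X"), ("Y", "A"), ("X", "A")]   # Figure 2a: Y is a confounder
model_b = [("X", "Y"), ("Y", "A"), ("X", "A")]   # Figure 2b: Y is a mediator

print(backdoor_ok(model_a, "X", "A", set()))     # False: X <- Y -> A is open
print(backdoor_ok(model_a, "X", "A", {"Y"}))     # True: adjusting for Y closes it
print(backdoor_ok(model_b, "X", "A", set()))     # True: no backdoor path exists
print(on_causal_path(model_b, "X", "A", {"Y"}))  # True: Y is post-treatment
```

In model (a), conditioning on Y is required; in model (b), the same conditioning removes the part of the effect transmitted through X → Y → A, which is precisely the post-treatment bias described above.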

For a more complicated example, the study by Vreeland (2008) raises the possibility of

omitted variable bias in explaining the positive correlation between CAT ratification and torture

practices in dictatorships. The situation is represented in Figure 3 where the vertices X, Y , and

A respectively denote multiple parties, torture, and CAT ratification. Failing to condition on

X in this case would confound the potential (non)relationship between Y and A and explain

why “the more a dictatorship practices torture, the more likely it is to sign and ratify the CAT”

(Vreeland 2008, 68).


Figure 3: Simplified causal model inferred from Vreeland (2008) of the effect of X (multiple parties)
on both Y (torture) and A (CAT ratification).

Assuming the goal of Vreeland (2008) is to make causal inference, we can infer from his

statistical models various causal models that the author implicitly assumes. Table 1 in Vreeland

(2008, 83) presents multiple regression models that estimate the instantaneous effect of multi-

ple parties on torture among dictatorships. These models are represented in Figure 4a where X

denotes multiple parties, Y denotes torture, and W1 is a set of control variables (gross domestic

product per capita, population, trade/GDP, civil war, and communist regime). I add the node S

in double circle to indicate the sample selection of only dictatorships.1

Vreeland (2008) then proceeds to estimate the instantaneous effect of multiple parties

(X) on CAT ratification (A) among dictatorships (S). His regression models in Table 3 (Vree-

land 2008, 90) assume the causal model in Figure 4b where W2 is a different set of control

variables (communist regime, lagged regional score of CAT ratification, the number of countries

that have ratified the CAT, the percentage of the population that are Muslims, GDP per capita,

population, and the trade/GDP proportion). Vreeland (2008, 89) also controls for “the log of

the Hathaway torture scale.” This is a curious modeling decision, however, since it implies that

Y is a confounding variable that affects both X and A. Thus, it can be seen that between the
1 The original study does not discuss sample selection and its consequences for identification. Here I assume that
sample selection S, which is based on regime type, is dependent on the control variables W1 and W2 . This is not un-
reasonable since democracy arguably depends on economic development, the presence or absence of civil war, trade,
among others. This assumption is also convenient because we can then remove from consideration the consequences
of sample selection in order to focus on the causal relationships between multiple parties and, respectively, torture
and treaty ratification. In other cases, though, as Bareinboim, Tian and Pearl (2014) demonstrate, sample selection
could potentially render the causal effect of X on Y in Figure 4a non-identifiable from the sample data. For example,
insofar as legally organized political parties (treatment X) and torture (outcome Y ) both influence sample selection
S, that is, the use of torture may suppress and undermine democracy (Y → S) while mobilization by opposition
parties promotes democratization (X → S), we will end up with a collider bias X → S ← Y and the causal effect
of X on Y will not be recoverable from the sample data.


Figure 4: (a) Causal model inferred from Vreeland (2008, 83) of the effect of X (multiple parties) on
Y (torture) among S (dictatorships) with control variables W1 ; (b) Causal model inferred from Vree-
land (2008, 90) of the effect of X (multiple parties) on A (CAT ratification) among S (dictatorships)
with control variables W2 . Arrows of opposite directions between X and Y across the two causal
models suggest incoherent assumptions about the causal process.

causal model in Figure 4a (where X → Y ) and the causal model in Figure 4b (where Y → X),

some incoherent assumptions are made with respect to the contemporaneous causal relationship

between multiple parties and torture. If multiple parties only affect torture as assumed in Fig-

ure 4a but not the other way around, then controlling for torture as Vreeland (2008, 90) does

would introduce a post-treatment bias. It might be that X and Y mutually cause each other

instantaneously, but then it would not be possible to identify the causal effect of X (multiple

parties) on either A (CAT ratification) or Y (torture).

It should be emphasized that I remain agnostic at this point as to whether these causal

models accurately depict the true underlying causal process or which specific statistical methods

are used to estimate the causal quantities of interest from observational data. Nevertheless, the

two examples illustrate the critical importance of graphically representing our causal models.

A graphical model would make explicit our assumptions, consistent or otherwise, about the

underlying data-generating process and reveal potential identification problems that may arise.

3 Causal Variable Importance Analysis of Treaty Ratification

3.1 Notation and causal model formulation

Traditional variable importance analyses use parametric models to estimate the association be-

tween input variables and an outcome, using a variety of metrics such as regression coefficients

and p-values, model fit, or predictive accuracy. Taking a different approach, I instead formulate

variable importance in terms of their average causal effects. Informally, the causal effect of a

variable is defined as the effect of an intervention to fix, as opposed to observe, that variable.

For a binary variable, the treatment and control values are intuitively clear. For a continuous

variable, I use its observed maximum and minimum values.
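On simulated data, this maximum–minimum contrast can be computed by plug-in g-computation: fit an outcome model, then average its predictions with the treatment fixed at its observed extremes. A stylized sketch using the paper's notation (X treatment, W confounder, and a continuous stand-in A for the outcome); the linear outcome model and data-generating values are illustrative assumptions, not the estimation strategy used later in the paper:

```python
import random

random.seed(1)
n = 2000
# Hypothetical confounded data: W drives both the treatment X and the outcome A.
W = [random.gauss(0, 1) for _ in range(n)]
X = [0.8 * w + random.gauss(0, 1) for w in W]
A = [1.5 * x + 2.0 * w + random.gauss(0, 1) for x, w in zip(X, W)]

def ols3(y, x1, x2):
    """Least squares for y = b0 + b1*x1 + b2*x2 via the normal equations."""
    m = len(y)
    cols = [[1.0] * m, x1, x2]
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(c * v for c, v in zip(ci, y)) for ci in cols]
    for i in range(3):                      # Gauss-Jordan elimination
        piv = xtx[i][i]
        xtx[i] = [v / piv for v in xtx[i]]
        xty[i] /= piv
        for j in range(3):
            if j != i:
                f = xtx[j][i]
                xtx[j] = [a - f * b for a, b in zip(xtx[j], xtx[i])]
                xty[j] -= f * xty[i]
    return xty                              # [b0, b1, b2]

b0, b1, b2 = ols3(A, X, W)
x_max, x_min = max(X), min(X)
# Plug-in g-computation: average predictions with X fixed at its max, then min.
psi_max = sum(b0 + b1 * x_max + b2 * w for w in W) / n
psi_min = sum(b0 + b1 * x_min + b2 * w for w in W) / n
print(psi_max - psi_min)  # close to 1.5 * (x_max - x_min), the true contrast
```

The machine learning-based analysis replaces the hand-fit linear model with a flexible ensemble, but the intervention contrast being targeted is the same.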

In an observational setting, the first step in identifying and estimating causal effects is to

build a non-parametric structural causal model as a set of equations to describe, to the best of

our knowledge, the underlying data-generating process. In the following model, W is a set of time-invariant covariates; X1 and X2 are binary or continuous time-varying predictors; Y is the human rights outcome; and A is treaty ratification.2 The subscript t indicates the time periods

during which the variables are measured. Together these equations form a generative system

from which n country–year observations are sampled, each distributed as O = (W, X1t , X2t , At , Yt ) ∼ PO .

W = fW (UW )

X1t = fX1 (W, At−1 , Yt−1 , X1t−1 , X2t−1 , UX1 )

X2t = fX2 (W, At−1 , Yt−1 , X1t−1 , X2t−1 , UX2 ) (1)

At = fA (W, At−1 , Yt−1 , X1t , X2t , UA )

Yt = fY (W, Yt−1 , At , X1t , X2t , UY )
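The generative system in Eq. 1 maps directly to code: each variable is computed from its parents plus an exogenous draw. The coefficients and functional forms below are arbitrary placeholders, since the structural model itself is non-parametric; only the parent sets are taken from Eq. 1:

```python
import math
import random

def simulate_country(T, rng):
    """Sample one country's trajectory from the structural model in Eq. 1.
    Parent sets follow the equations; all coefficients are placeholders."""
    W = rng.gauss(0, 1)          # time-invariant covariates (scalar here)
    X1 = X2 = Y = 0.0            # lagged predictors and outcome at t = 0
    A = 0                        # lagged ratification status
    rows = []
    for t in range(T):
        # X1t and X2t depend only on time t-1 quantities, so compute both
        # from the old values before overwriting them.
        x1 = 0.5 * W + 0.3 * A + 0.2 * Y + 0.4 * X1 + 0.1 * X2 + rng.gauss(0, 1)
        x2 = 0.2 * W + 0.1 * A + 0.3 * Y + 0.1 * X1 + 0.5 * X2 + rng.gauss(0, 1)
        X1, X2 = x1, x2
        # At depends on W, A_{t-1}, Y_{t-1}, and the contemporaneous X1t, X2t.
        logit = -1.0 + 2.0 * A + 0.5 * Y + 0.3 * X1 + 0.3 * X2 + 0.2 * W
        A = int(rng.random() < 1.0 / (1.0 + math.exp(-logit)))
        # Yt depends on W, Y_{t-1}, and the contemporaneous At, X1t, X2t.
        Y = 0.1 * W + 0.6 * Y + 0.4 * A + 0.2 * X1 + 0.1 * X2 + rng.gauss(0, 0.5)
        rows.append((W, X1, X2, A, Y))
    return rows

rng = random.Random(42)
panel = [simulate_country(20, rng) for _ in range(100)]   # 100 countries x 20 years
print(len(panel), len(panel[0]))
```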


2 Quantitative human rights law research mostly focuses on the influence of human rights treaties on state practices. It therefore often considers treaty ratification as the treatment, the impact of which is to be evaluated. In the
epidemiology and biomedical literature, from which I derive a lot of methodological insights, the treatment is usually
denoted A and the outcome Y . To be consistent with the larger research program on international human rights
law, throughout the paper I use A to denote treaty ratification, which is the outcome in this study. The treatments in
my causal variable importance analysis are ratification predictors denoted X such as {X1, X2}. As annotated and
explained later in my graphical causal model, human rights practice, denoted Y , is actually a potential confounder.

A structural causal model is best represented in the form of a directed acyclic graph (DAG). A causal DAG (Darwiche 2009; Elwert 2013; Pearl 2009) comprises a set of nodes/vertices denoting random variables. An edge/arrow denotes one variable's (the parent node) direct

causal influence on another node (the child node). A path in a causal DAG is an arrow or a

sequence of arrows, regardless of their directions, that connects one node to another. A causal

(or directed) path has all arrows pointing in the same direction. Otherwise, it is a

non-causal path.

My causal DAG in Figure 5 has a dynamic structure that reflects a temporal order with

past nodes in the left shaded block and future nodes in the right shaded block. Each block

represents a single time period. There are no arrows or sequences of arrows going from the

block on the right to the block on the left, meaning that no variable in the future should have

a causal influence on any variable in the past. The DAG is also acyclic in the sense that, within

the same temporal block, there are no loops or directed paths going from a node to itself. I

make no assumptions about any of the functional forms f = {fW , fX1 , fX2 , fA , fY }, which is

consistent with the recognition that usually we do not have enough knowledge to specify the

exact functional forms that characterize the relationships between variables. For the sake of

simplicity and without loss of generality, I construct a causal model with only two time-varying

predictors X1 and X2 over two time periods from t − 1 to t. A larger number of predictors over

a longer time span can be represented in a similar fashion.

As in any causal analyses, we have to make a few assumptions about the underlying causal

process. Similar to Díaz et al. (2015, 6), I assume ratification predictors do not instantaneously

affect each other, although they may influence every other predictor in the next time period.

That means, for example, the amount of official development assistance (ODA) and economic

development are conditionally independent from each other in the same time period. ODA

at time t − 1, however, could certainly affect economic development at time t (notationally,

X1t−1 → X2t ). From an identification standpoint, this assumption is necessary because if the

predictors are allowed to mutually cause each other instantaneously, it would render the causal

model cyclical and make it impossible to identify their causal effects.
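This acyclicity requirement is easy to check mechanically. A small sketch (node names and edges are hypothetical, mirroring the two-predictor model): allowing mutual instantaneous causation between X1t and X2t immediately creates a directed cycle.

```python
def has_cycle(edges):
    """Three-color depth-first search for a directed cycle."""
    adj, nodes = {}, set()
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        nodes.update((a, b))
    color = {v: 0 for v in nodes}          # 0=unvisited, 1=in progress, 2=done
    def dfs(v):
        color[v] = 1
        for u in adj.get(v, ()):
            if color[u] == 1 or (color[u] == 0 and dfs(u)):
                return True
        color[v] = 2
        return False
    return any(color[v] == 0 and dfs(v) for v in nodes)

# Lagged influence only, as assumed above: no instantaneous arrows
# between the predictors.
lagged = [("X1[t-1]", "X1[t]"), ("X1[t-1]", "X2[t]"),
          ("X2[t-1]", "X2[t]"), ("X2[t-1]", "X1[t]")]
# Mutual instantaneous causation creates a directed cycle.
mutual = lagged + [("X1[t]", "X2[t]"), ("X2[t]", "X1[t]")]
print(has_cycle(lagged), has_cycle(mutual))  # False True
```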

I further assume the exogenous variables U = {UW , UX1 , UX2 , UA , UY } are jointly independent. As a result, the value of any node is strictly a function of its parent nodes and some

exogenous factors. This implies that observing a variable’s parent nodes will render that variable

independent from other covariates except for its descendants. For example, treaty ratification At

has as its parent nodes time-invariant covariates W , predictors Xt , human rights practice in the

immediate past Yt−1 , and prior ratification status At−1 . If we observe the set {W, Yt−1 , At−1 , Xt },

then At is conditionally independent from other nodes, including all Xt−1 , except for the de-

scendants of At such as Yt and At+1 .


Figure 5: A dynamic graphical causal model with shaded blocks indicating two temporal periods.
Time-invariant covariates W , which precede and potentially affect all other variables, are not repre-
sented. The sufficient adjustment sets to identify the causal effects of X1t → At and X2t → At are
{W, At−1 , Yt−1 , X2t } and {W, At−1 , Yt−1 , X1t }, respectively.

It should be emphasized that, short of a randomization of the treatment as in an experimental design, any observational study that aims to make causal inference has to make this

exogeneity assumption and the only way to justify it is to rely on the domain knowledge in the

literature (Table 1). In other words, since one cannot know if a model accurately represents

the causal process based on a scrutiny of the observed data alone, it is important that the body

of knowledge in the literature should guide and justify the construction of my causal model as

follows. First, the causal dependence Yt−1 → At is informed by the selection effect argument

that a state may make a commitment decision based in part on its prior level of compliance

because it will significantly determine its ratification cost (Downs, Rocke and Barsoom 1996;

von Stein 2005).

Second, I allow for the causal dependencies Xt−1 → Xt and Yt−1 → Yt . This is a routine

assumption in the context of time-series cross-section data structure. Substantively, this assump-

tion also permits the possibility that human rights violations may have some inherent dynamic

that goes beyond contextual factors such as poverty, dictatorship, involvement in conflicts, and

so forth. As Hill and Jones (2014, 674) observe, this argument means that “the governments

can become habituated to the use of violence to resolve political conflict.” I include this causal

relationship, bearing in mind that, in a graphical causal model, an arrow between variables in-

dicates a possible, but not necessarily an actual causal link. A missing arrow, on the other hand,

is equivalent to ruling out any direct causality.

Third, an argument can also be made that human rights practices affect some ratification

predictors in the next time period. An obvious example is that the use of torture and other

extrajudicial measures by the government could intimidate its critics, suppress movements for

democratization, and undermine democracy. The inclusion of the directed arrows Yt−1 → X1t

and Yt−1 → X2t in my causal model is informed by this argument.

Fourth, I similarly posit a direct causal dependence At−1 → At based on the obser-

vation that once governments ratify an international human rights treaty, they are unlikely to

withdraw from that treaty. It should be noted that in many cases withdrawal is entirely legally

possible. Many human rights treaties and their optional protocols have denunciation provisions

that allow states to exit from these institutions, including Article 31 of the CAT, Article 12 of

the First Optional Protocol to the ICCPR, and Article 19 of the Optional Protocol to the CEDAW.

This is not the case with the ICCPR and the CEDAW, which do not have a denunciation clause

or provision. That, however, has not prevented some states from denouncing and attempting

to withdraw from the ICCPR (Tyagi 2009). I therefore code treaty membership as an implicit

annual ratification as opposed to a terminal event. This is also consistent with conventional

modeling practices in the literature that estimates the impact of human rights treaty ratification

as a time-varying treatment.

Finally, the causal dependencies At−1 → X1t and At−1 → X2t suggest that we leave open

the possibility that a human rights treaty, once ratified, could influence state behavior in the next

time period through a variety of mediators such as public opinion and electoral accountability in

democracies (Dai 2005; Wallace 2013), legislative constraints of the executive by the opposition

parties (Lupu 2015), and judicial effectiveness of the domestic court system (Crabtree and Fariss

2015; Powell and Staton 2009).

Table 1 lists the model variables and data sources for their measurements. It also refers

to studies in the literature that similarly classify or assume these variables as time-invariant

covariates, confounders, and ratification predictors. For example, if a study that investigates the

impact of a human rights treaty on state practice includes democracy and independent judiciary

as time-varying control variables in its statistical models, we can infer that study views these two

covariates as ratification predictors. Appendix A provides more detailed variable descriptions,

coding, and data sources.

Given the causal model and its encoded assumptions, I formulate the causal importance

of a predictor in terms of its contemporaneous average causal effect on treaty ratification. It is


denoted by τ = E[At | do(X1t = 1)] − E[At | do(X1t = 0)], where the do-operator is notation for

an active intervention to fix the value of X1. In the interventional framework of causal inference

(Pearl 2009), that means we would intervene on the generative system (Equation set 1) to fix

the equation X1t = fX1 (W, At−1 , X1t−1 , X2t−1 , UX1 ) successively at X1t = 0 and X1t = 1. From the

two resulting modified generative systems At = fA (W, At−1 , Yt−1 , x, X2t , UA ) for x = {0, 1}, we

then compute the difference between the two mean values of treaty ratification, which will be a

consistent estimate of the causal effect of X1 as long as causal identification is established.
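To make the intervention concrete, the following sketch simulates a toy generative system and estimates τ by fixing X1 via do(X1 = x). The functional forms, coefficients, and sample size are illustrative assumptions of this sketch, not the paper’s actual generative system.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def expit(v):
    return 1.0 / (1.0 + np.exp(-v))

def simulate(do_x1=None):
    """Draw from a toy structural system; do_x1 fixes X1 by intervention."""
    W = rng.normal(size=n)                       # time-invariant covariate
    X2 = rng.binomial(1, 0.5, size=n)            # second predictor
    if do_x1 is None:
        X1 = rng.binomial(1, expit(0.5 * W))     # X1 = f_X1(W, U_X1)
    else:
        X1 = np.full(n, do_x1)                   # do(X1 = x): replace f_X1 entirely
    # A = f_A(W, X1, X2, U_A), a stand-in for the ratification equation
    A = rng.binomial(1, expit(-1.0 + 0.8 * X1 + 0.4 * X2 + 0.3 * W))
    return A

# tau = E[A | do(X1 = 1)] - E[A | do(X1 = 0)]
tau = simulate(do_x1=1).mean() - simulate(do_x1=0).mean()
```

Because the intervention deletes the equation for X1 rather than conditioning on its observed value, confounding through W does not contaminate the contrast.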

3.2 Causal identification

Causal identification involves establishing the conditions under which a property of an interven-

tional distribution, such as the expectation E[A | do(X = x)], can be computed without bias from

an observational probability distribution. My causal identification strategy is to identify a valid

adjustment set of observed variables that makes the interventional distribution of the outcome

A (treaty ratification) essentially equivalent to its observed conditional distribution.

Table 1: Model variables

Sets Variables and references


Ratification rules (Simmons 2009) measured by Simmons (2009).
W Domestic legal traditions (Mitchell, Ring and Spellman 2013)
measured by La Porta, Lopez-de Silanes and Shleifer (2008).
ICCPR proportion of ratification globally (Goodliffe and Hawkins 2006; Hathaway 2007)
measured by the Office of the High Commissioner for Human Rights (OHCHR).
CEDAW proportion of ratification globally (Goodliffe and Hawkins 2006; Hathaway 2007)
measured by OHCHR.
CAT proportion of ratification globally (Goodliffe and Hawkins 2006; Hathaway 2007)
measured by OHCHR.
ICCPR proportion of ratification regionally (Goodliffe and Hawkins 2006; Hathaway 2007)
measured by OHCHR.
CEDAW proportion of ratification regionally (Goodliffe and Hawkins 2006; Hathaway 2007)
measured by OHCHR.
CAT proportion of ratification regionally (Goodliffe and Hawkins 2006; Hathaway 2007)
measured by OHCHR.
Democracy/dictatorship classification
(Hathaway 2007; Chapman and Chaudoin 2013; Neumayer 2007)
measured by Cheibub, Gandhi and Vreeland (2010).
X Multiple parties (Vreeland 2008; Hollyer and Rosendorff 2011)
measured by Cheibub, Gandhi and Vreeland (2010).
Transition to/from democracy (Goodliffe and Hawkins 2006; Moravcsik 2000)
measured by Cheibub, Gandhi and Vreeland (2010).
Involvement in militarized interstate dispute (Chapman and Chaudoin 2013)
measured by Melander, Pettersson and Themnér (2016) and Gleditsch et al. (2002).
Judicial independence (Powell and Staton 2009) measured by
Linzer and Staton (2015).
Population size (Hafner-Burton and Tsutsui 2007)
measured by the World Bank Indicators.
Gross domestic product (GDP) per capita (Hafner-Burton and Tsutsui 2007)
measured by the World Bank Indicators.
Participation in international trade (Hafner-Burton 2013)
measured as trade volume/GDP by the World Bank Indicators.
Net official development assistance (Nielsen and Simmons 2015)
measured by the World Bank Indicators.
CIRI torture index (Cingranelli, Richards and Clay 2013).
Y CIRI women’s political rights index (Cingranelli, Richards and Clay 2013).
Human rights dynamic latent score (Fariss 2014).
ICCPR ratification measured by OHCHR.
A CEDAW ratification measured by OHCHR.
CAT ratification measured by OHCHR.

Any causal identification in the setting of observational data ultimately depends on the un-

derlying causal structure, which is best represented by a causal DAG. DAGs are thus an effective

tool to make all causal assumptions transparent and facilitate a clear and easy determination of

sufficient adjustment sets using the backdoor criterion. To illustrate identification of the causal

effect of X1t on At , for example, I apply the following backdoor criterion (Pearl, Glymour and

Jewell 2016, 61–66) to find an adjustment set of variables such that conditioning on that set

will:

(a) block any (non-causal) paths from X1t to At that have an arrow coming into X1t ;

(b) leave open all causal paths from X1t to At ; and

(c) not condition on a collider (a node that lies on any paths between X1t and At and has

two arrows coming into it) or a descendant of a collider (a node connected to a collider

through a directed path emanating from the collider).
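These rules can be checked mechanically. Below is a minimal, pure-Python path-blocking check for the Figure 5 graph; the edge set transcribes the arrows described in the text (W omitted, as in the figure), and names such as "X1p" for X1t−1 are shorthand I introduce for this sketch.

```python
# Edges of the two-period DAG in Figure 5; "p" marks the t-1 period.
EDGES = {("X1p","X1"), ("X1p","X2"), ("X2p","X1"), ("X2p","X2"),
         ("X1p","Ap"), ("X2p","Ap"), ("X1p","Yp"), ("X2p","Yp"),
         ("Ap","Yp"), ("Ap","X1"), ("Ap","X2"), ("Ap","A"),
         ("Yp","X1"), ("Yp","X2"), ("Yp","A"), ("Yp","Y"),
         ("X1","A"), ("X2","A"), ("X1","Y"), ("X2","Y"), ("A","Y")}

def neighbors(v):
    return {b for a, b in EDGES if a == v} | {a for a, b in EDGES if b == v}

def descendants(v):
    out, stack = set(), [v]
    while stack:
        u = stack.pop()
        for a, b in EDGES:
            if a == u and b not in out:
                out.add(b)
                stack.append(b)
    return out

def all_paths(src, dst, path=None):
    path = path or [src]
    if path[-1] == dst:
        yield path
        return
    for nb in neighbors(path[-1]):
        if nb not in path:
            yield from all_paths(src, dst, path + [nb])

def blocked(path, Z):
    """A path is blocked if some non-collider on it is in Z, or some collider
    on it is outside Z and has no descendant in Z."""
    for i in range(1, len(path) - 1):
        prev, v, nxt = path[i - 1], path[i], path[i + 1]
        collider = (prev, v) in EDGES and (nxt, v) in EDGES
        if collider and v not in Z and not (descendants(v) & Z):
            return True
        if not collider and v in Z:
            return True
    return False

def valid_adjustment(x, y, Z):
    if descendants(x) & Z:                 # rule (c): no descendants of treatment
        return False
    for p in all_paths(x, y):
        directed = all((p[i], p[i + 1]) in EDGES for i in range(len(p) - 1))
        if directed and blocked(p, Z):     # rule (b): causal paths stay open
            return False
        if not directed and not blocked(p, Z):  # rule (a): non-causal paths closed
            return False
    return True
```

Running it confirms the derivation in the text: valid_adjustment("X1", "A", {"Ap", "Yp", "X2"}) returns True, while {"Yp"} alone fails because it opens the collider path through Yt−1.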

When we condition on an adjustment set that satisfies the backdoor criterion, we essen-

tially remove all non-causal pathways from X1t to At and render these two variables condition-

ally independent or d-separated and, as a result, the interventional distribution of the outcome

A when X1 is intervened upon is essentially equivalent to its observational distribution. More

generally, when all non-causal paths between a predictor and the outcome are closed off, any

remaining significant correlation between them is evidence of a causal relationship.

From the graphical causal model in Figure 5, I derive a sufficient set of covariates for

adjustment Z1 = {W, At−1 , Yt−1 , X2t } that satisfies the backdoor requirement to identify the

causal effect of X1t on At . Specifically, conditioning on Yt−1 will, according to rule (a), block

five non-causal paths from X1t to At , including (i) X1t ← At−1 → Yt−1 → At ; (ii) X1t ←

X1t−1 → At−1 → Yt−1 → At ; (iii) X1t ← Yt−1 → At ; (iv) X1t ← Yt−1 → X2t → At ;

and (v) X1t ← X2t−1 → Yt−1 → At . Similarly, conditioning on At−1 will, by the same rule,

block two other non-causal paths from X1t to At , including (i) X1t ← At−1 → At and (ii)

X1t ← At−1 → X2t → At .

However, Yt−1 is also a collider on the path X1t ← X1t−1 → Yt−1 ← X2t−1 → X2t →

At . Conditioning on Yt−1 will therefore open that non-causal path and violate rule (b) of the

backdoor requirement. I therefore further condition on X2t to block this non-causal path. For

the same reason, conditioning on the collider At−1 accidentally opens the non-causal path X1t ←

X1t−1 → At−1 ← X2t−1 → X2t → At , which I likewise block by conditioning

on X2t . Conditioning on X2t also happens to block three other non-causal paths that traverse

through X2t , including (i) X1t ← X2t−1 → X2t → At ; (ii) X1t ← At−1 → X2t → At ; and

(iii) X1t ← X2t−1 → At−1 → X2t → At . The latter two of these three non-causal paths run

through At−1 as well and therefore are already blocked when we condition on At−1 .

We should not condition on the contemporaneous measure of human rights practice Yt when

estimating the causal effect of X1t , however. Since it is a collider on the path X1t → Yt ← At ,

conditioning on Yt would violate rule (c) of the backdoor criterion, introducing a non-causal

association between X1t and At and biasing the causal effect estimate of X1t . For identification

of the causal effect of X2t on At , I apply the same rules and similarly derive a sufficient adjust-

ment set Z2 = {W, At−1 , Yt−1 , X1t }. In summary, to identify the contemporaneous causal effect

of a ratification predictor, I condition on time-invariant covariates, immediately prior ratification

status and level of compliance, and other contemporary time-varying covariates.

In addition to a causal variable importance analysis, I use the same graphical causal model

to develop a causal test of many theories of CAT ratification. First, I test the argument by Hath-

away (2007) that democracy (X1t ) and torture practices (Yt ) interact to lower the probability

of CAT ratification (At ). Based on the causal DAG in Figure 5, one should not condition on Yt or,

for that matter, use an interaction term of Yt and X1t while estimating the effect of X1t on At .

Since Yt is a collider on two different paths X1t → Yt ← At and X1t → Yt ← Yt−1 → At , con-

ditioning on Yt will induce a collider bias. I instead causally test this interactive effect argument

by estimating the Yt−1 -specific effect of X1t on At , using the adjustment set Z = {W, At−1 , X2t }

that satisfies the backdoor requirement within each subset of observations based on the values

of Yt−1 (Pearl, Glymour and Jewell 2016, 71–72). The test results will provide evidence as to

whether there is any effect modification by past torture practice, that is, whether the effect of

democracy on treaty ratification varies across levels of compliance in the previous year. The con-

ventional expectation is that the positive causal effect of democracy on treaty ratification will

diminish and eventually reverse its direction as the level of torture in the prior year increases.

Note that we cannot identify the X1t -specific causal effect of Yt−1 on At because of potential

post-treatment bias since X1t could be a descendant of Yt−1 along the path Yt−1 → X1t → At if

the use of torture possibly undermines democratic institutions.

Second, I test Vreeland’s omitted variable bias argument by directly estimating the causal

effect of multiple political parties (X2) on CAT ratification (A) among dictatorships (X1 =

0). The quantity of interest corresponding to the test is formulated as the X1t -specific causal

effect of X2t on At , that is, the causal effect of multiple parties on treaty ratification among

observations with the value X1t = 0. The sufficient adjustment set for identification is Z =

{W, At−1 , Yt−1 , X1t }. As Vreeland (2008, 79) predicts, “the effect of the multiparty institution

is to make a dictatorship more likely to enter into the CAT,” implying a positive causal effect of

multiple parties.

Third, I estimate the average causal effect of prior torture practice on CAT ratification

(Yt−1 → At ) in a causal test of the selection effect argument. This argument is often made

but has rarely been empirically quantified within a causal inference framework. The theoretical

expectation is a negative causal effect of Yt−1 , suggesting that a higher level of torture in the

previous year is expected to cause state leaders to be less likely to ratify the CAT in the following

year. A sufficient adjustment set I derive for identification is Z = {W, At−1 , X1t−1 , X2t−1 }.

Finally, I also test the argument with respect to the signaling benefits of CAT ratification

for dictators (Hollyer and Rosendorff 2011) by estimating the causal effect of torture on CAT

ratification among autocracies, that is, the X1t−1 -specific causal effect of Yt−1 on At . The the-

oretical expectation is that “authoritarian governments that torture heavily are more likely to

sign the treaty than those that torture less” (Hollyer and Rosendorff 2011, 276), which implies

a positive effect of Yt−1 among observations that have the value X1t−1 = 0. A sufficient set that

satisfies the backdoor criterion for causal effect identification is Z = {W, At−1 , X2t−1 }.

3.3 Machine learning-based estimation

Once we have determined the sufficient adjustment sets Z that satisfy the backdoor requirement

for identification of various causal effects, I adopt two machine learning-based methods for

causal effect estimation: substitution estimation and targeted maximum likelihood estimation

(TMLE). My estimation methods are analogous to the OLS estimator if the underlying causal

system in Equation set 1 is assumed to be linear, all covariate effects are additive, and all the

noise terms U are Gaussian. The use of machine learning is aimed at relaxing these assumptions.

For each of the continuous predictors of treaty ratification (global proportion of ratifica-

tion, regional proportions of ratification, population size, GDP per capita, trade/GDP propor-

tion, net amount of ODA, and judicial independence), the substitution estimator (Robins 1986;

Robins, Greenland and Hu 1999) computes τ̂ = (1/n) Σᵢ₌₁ⁿ [Q̄n (1, Zi ) − Q̄n (0, Zi )] as an estimate of

its average causal effect τ = E[A | do(X = 1)] − E[A | do(X = 0)]. Specifically, I fit a prediction

model Q̄n (X, Z) = E[A | X, Z] of treaty ratification A using X and the corresponding sufficient

adjustment set Z. I then reiteratively substitute the predictor values with X = 1 (empirically

maximum value) and X = 0 (empirically minimum value) for each observation, generate the

counterfactual outcomes, and compute the mean difference.

For variance estimation, I use the nonparametric bootstrap method. In the presence of

missing data, my procedure is similar to Daniel et al. (2011, 491) and suggested by Tsiatis

(2007, 362–371). I combine bootstrap with single stochastic imputation rather than multiple

imputation in order to make efficient and still valid inference. In addition to its greater efficiency,

another benefit of combining nonparametric bootstrap and single (improper) imputation is that

we do not have to rely on the normality assumption required by Rubin’s approach (Little

and Rubin 2014) when pooling variances across imputed datasets. Instead, I create distribution-

free confidence intervals, using the 2.5% and 97.5% quantiles of the bootstrap distribution to

obtain the desired coverage.
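A minimal sketch of the nonparametric bootstrap percentile interval, with a simple mean over simulated data standing in for the substitution estimator (both stand-ins are assumptions of this sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(loc=0.1, scale=1.0, size=500)   # per-observation contributions

def estimator(d):
    return d.mean()                               # stand-in for the effect estimate

# B = 500 resamples drawn with replacement from the (singly imputed) data
boots = np.array([estimator(rng.choice(data, size=data.size, replace=True))
                  for _ in range(500)])
lo, hi = np.quantile(boots, [0.025, 0.975])       # distribution-free 95% CI
```

No normality is assumed anywhere: the 2.5% and 97.5% quantiles of the bootstrap distribution form the interval directly.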

The key to obtaining consistent effect estimates with a substitution estimator is to fit a cor-

rectly specified outcome model Q̄n that approximates the (unknown) data generating mecha-

nism. The standard practice is to assume a binomial distribution for the binary outcome of treaty

ratification and then model a property of the outcome distribution as a linear, additive function

of a set of covariates, sometimes with an interaction term included. If these distributional and

functional form assumptions are wrong, as they likely are for highly complex, non-linear

political phenomena, the results will be misspecified models, biased effect estimates,

invalid inference, and misleading conclusions. The ensemble machine learning technique Super

Learner (van der Laan, Polley and Hubbard 2007; Sinisi et al. 2007) offers a powerful solution

to this problem of correct functional forms.

Super Learner has been used in economics (Kreif et al. 2015), political science (Samii,

Paler and Daly 2016), and epidemiology and biomedical research (Neugebauer et al. 2013;

Pirracchio, Petersen and van der Laan 2015). It stacks a user-selected library of predictive

algorithms and uses cross-validation to evaluate the performance of each algorithm in minimiz-

ing a specified loss function. For the binary outcome of treaty ratification, an appropriate loss
function is the negative log-likelihood −log[ Q(X, Z)^A (1 − Q(X, Z))^(1−A) ], which measures the

degree of misfit with the observed data. User-selected predictive algorithms can include simple

main-term linear regression model, semi-parametric generalized additive model (Hastie and Tib-

shirani 1990), regularized regression models (Tibshirani 1996), and non-parametric tree-based

ensemble methods such as boosting (Friedman 2001) and random forest (Breiman 2001). Ta-

ble 2 lists the algorithms I use for my machine learning-based substitution estimation given the

constraints in terms of computational resources.

Table 2: Algorithms used in Super Learning-based Substitution Estimation

Algorithm          Description
GLMnet             Regularized logistic regression with lasso penalty Σⱼ₌₁ᵖ |βj |.
GAM                Generalized additive model.
(Tuned) XGBoost    Extreme gradient boosting (eta = 0.01, depth = 4, ntree = 500).

The use of cross-validation is crucial for the algorithms to generalize well in terms of pre-

dicting unknown outcome values and avoiding overfitting. Super Learner then creates a linear

combination of these algorithms, each of which is weighted by its average predictive accuracy,

to build a hybrid prediction function that performs approximately as well as and usually better

than the best algorithm in the library. The ability of Super Learner to assemble a rich, diverse set

of algorithms makes it particularly effective and much more likely to approximate the underlying

data generating process (Polley and van der Laan 2010).
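A stripped-down version of this stacking logic can be written directly: obtain out-of-fold predictions from each base learner, then find the convex combination of learners that minimizes the cross-validated negative log-likelihood. The learners and simulated data below are illustrative stand-ins, not the library in Table 2.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(3)
n = 2000
X = rng.normal(size=(n, 4))
y = rng.binomial(1, 1/(1 + np.exp(-(X[:, 0]*X[:, 1] + X[:, 2]))))  # non-linear truth

learners = [LogisticRegression(max_iter=1000),
            RandomForestClassifier(n_estimators=200, random_state=0),
            GradientBoostingClassifier(random_state=0)]

# Out-of-fold predicted probabilities from each learner (V-fold cross-validation)
P = np.column_stack([cross_val_predict(m, X, y, cv=10, method="predict_proba")[:, 1]
                     for m in learners])

def nll(w):  # cross-validated negative log-likelihood of the weighted ensemble
    p = np.clip(P @ w, 1e-6, 1 - 1e-6)
    return -np.mean(y*np.log(p) + (1 - y)*np.log(1 - p))

# Convex combination: weights non-negative and summing to one
res = minimize(nll, x0=np.full(3, 1/3), bounds=[(0, 1)]*3,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1})
weights = res.x
```

Because the weights are chosen on out-of-fold predictions, the ensemble’s cross-validated risk is no worse than that of any single learner in the library.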

One state-of-the-art algorithm in this library is extreme gradient boosting (Chen and He 2015;

Chen and Guestrin 2016), a faster implementation of the very popular and effective machine

learning technique of gradient boosting machine (Friedman 2001; Schapire and Freund 2012;

Natekin and Knoll 2013). Extreme gradient boosting (XGBoost) is non-parametric and its tree-

based nature allows it to capture non-linear, interactive dynamics among a large number of

predictors. Furthermore, unlike other tree-based methods such as random forest and gradient

boosting machine, XGBoost has greater computational efficiency, which makes it particularly

suitable to use in the context of nonparametric bootstrap for inference.

The performance of XGBoost could be sensitive to hyper-parameter settings. I employ a

combination of cross-validation and grid search to select the best among a large number of

configurations (comprising varying learning rates, tree depths, and numbers of trees) that are

tuned specifically to each of the three singly imputed ICCPR, CEDAW, and CAT datasets.
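The tuning loop can be sketched as follows. Scikit-learn’s GradientBoostingClassifier stands in for XGBoost (the same eta/depth/ntree hyper-parameters under different names), and the grid and simulated data are illustrative and much smaller than those searched in the paper.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(4)
X = rng.normal(size=(1000, 5))
y = rng.binomial(1, 1/(1 + np.exp(-X[:, 0])))

# Hyper-parameter grid: learning rate (eta), tree depth, and number of trees
grid = {"learning_rate": [0.01, 0.1],
        "max_depth": [2, 4],
        "n_estimators": [100, 300]}

search = GridSearchCV(GradientBoostingClassifier(random_state=0), grid,
                      scoring="neg_log_loss", cv=5).fit(X, y)
best = search.best_params_      # configuration with the lowest CV risk
```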
[Three panels ranking Super Learner, Discrete SL, and XGBoost configurations (XGB_ntree_depth_eta) by V-fold CV risk estimate]

Figure 6: Cross-validated risk of XGB algorithms in predicting (a) ICCPR ratification, (b) CEDAW
ratification, and (c) CAT ratification.

To estimate the causal effect of binary predictors (democracy, multiple political parties,

democratic transition, and involvement in militarized interstate disputes), I use targeted maxi-

mum likelihood estimation (van der Laan and Rose 2011). Similar to the substitution estima-

tor, TMLE also starts by fitting an initial predictive outcome model of treaty ratification Q0n =

E(A|X, Z). It then modifies the initial model Q0n (X, Z) into an updated model Q1n (X, Z), using

the modifying equation logit(Q1n ) = logit(Q0n ) + εn Hn , where the “clever covariate” Hn (X, Z) =

I(X = 1)/gn (X = 1|Z) − I(X = 0)/gn (X = 0|Z) is a function of the treatment mechanism gn = E(X|Z) and the coeffi-

cient εn is obtained via a separate regression model logit(A) = logit(Q0n ) + εn Hn . In the third

and final step, TMLE similarly substitutes two distinct values of a binary predictor, plugs them

into the updated outcome model Q1n (X, Z) to generate the counterfactual outcomes for each ob-

servation, and computes the average causal effect as the mean difference of the counterfactual

outcome values.
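The three TMLE steps can be written out in a compact numerical sketch. Plain logistic regressions stand in for the Super Learner fits of Q0n and gn, and the data are simulated; these stand-ins are assumptions of the sketch, while the fluctuation and substitution steps follow the equations above.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n = 5000
Z = rng.normal(size=(n, 2))                                  # adjustment set
X = rng.binomial(1, 1/(1 + np.exp(-Z[:, 0])))                # binary treatment
A = rng.binomial(1, 1/(1 + np.exp(-(-0.5 + X + Z[:, 1]))))   # binary outcome

def expit(v): return 1/(1 + np.exp(-v))
def logit(p): return np.log(p/(1 - p))
clip = lambda p: np.clip(p, 1e-3, 1 - 1e-3)

# Step 1: initial outcome model Q0n and treatment model gn
XZ = np.column_stack([X, Z])
Qfit = LogisticRegression(max_iter=1000).fit(XZ, A)
Q_obs = clip(Qfit.predict_proba(XZ)[:, 1])
Q_1 = clip(Qfit.predict_proba(np.column_stack([np.ones(n), Z]))[:, 1])
Q_0 = clip(Qfit.predict_proba(np.column_stack([np.zeros(n), Z]))[:, 1])
g = np.clip(LogisticRegression(max_iter=1000).fit(Z, X).predict_proba(Z)[:, 1],
            0.025, 0.975)

# Step 2: fluctuate with the clever covariate Hn = I(X=1)/gn - I(X=0)/(1-gn)
H = X/g - (1 - X)/(1 - g)
def nll(eps):
    p = np.clip(expit(logit(Q_obs) + eps*H), 1e-6, 1 - 1e-6)
    return -np.mean(A*np.log(p) + (1 - A)*np.log(1 - p))
eps = minimize_scalar(nll, bounds=(-1, 1), method="bounded").x

# Step 3: substitute X = 1 and X = 0 into the updated model and average
tau_hat = np.mean(expit(logit(Q_1) + eps/g) - expit(logit(Q_0) - eps/(1 - g)))
```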

TMLE is essentially the substitution estimator but with an additional updating step in be-

tween to incorporate information about treatment assignment. This updating step is at the heart

of the TMLE methodology. It makes the estimator doubly robust by reducing any remaining bias

in the initial outcome model, producing unbiased estimates if either the initial outcome model

Q0n or the treatment assignment model gn is consistent. It is maximally efficient asymptotically

if both Q0n and gn are consistent. Note that both Q0n and gn are already more robust to mis-

specification, and thus more likely to be consistent than standard parametric statistical models,

because I have incorporated machine learning in my estimation.

In short, the TMLE methodology computes causal effect estimates of binary treatment

variables that are more robust than both parametric regression models and propensity score-

based estimators. Machine learning-based TMLE is even more robust and less computationally

expensive than the machine learning-based substitution estimator with bootstrapped samples

thanks to its efficient influence function-based approach to variance estimation (van der Laan

and Rose 2011, 94–97). Because of TMLE’s greater computational efficiency, I am able to employ

a more diverse and richer set of learning algorithms in Table 3.

To handle missing data when estimating the causal effect of binary predictors, I conduct

multiple imputation, using the Amelia II program (Honaker et al. 2011), and combine estimates

across m = 5 imputed data sets. Appendix B provides the summary statistics of the observed

data and Appendix C summarizes the imputation process. The ICCPR, the CEDAW, and the

CAT were opened for ratification at different times. I thus create three separate datasets (and,

correspondingly, 15 imputed datasets) that have different temporal coverage periods, including

1967–2013 for the ICCPR (opened for ratification on 16 December 1966), 1982–2013 for the

CEDAW (adopted and opened for ratification on 18 December 1979, though the CIRI measure of

women’s political rights only begins in 1981), and 1985–2013 for the CAT (opened for ratifi-

cation on 10 December 1984). For algorithmic learning stability and ease of interpretation, I

standardize all continuous covariates into a bounded range between zero and one.
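This standardization is plain min-max scaling; for a covariate x, the transform is (x − min x)/(max x − min x). The values below are illustrative:

```python
import numpy as np

x = np.array([2.3, 7.1, 4.4, 9.8, 0.5])            # illustrative covariate values
x01 = (x - x.min()) / (x.max() - x.min())          # min-max scaled into [0, 1]
```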

Table 3: Algorithms used in Super Learner-based Targeted Maximum Likelihood Estimation

Algorithm       Description
GLMnet          Regularized logistic regression with lasso penalty Σⱼ₌₁ᵖ |βj |.
GAM             Generalized additive model (degree of polynomials = 2).
polymars        Polynomial multivariate adaptive regression with splines.
randomForest    Random Forest (ntree = 1,000).
XGBoost         Extreme gradient boosting (eta = 0.01, depth = 4, ntree = 500).

3.4 Results and interpretation

Table 4 reports the estimates of the contemporaneous average causal effects of the ratification

predictors. Despite some differences, their causal effect estimates are relatively consistent across

the three human rights treaties. First, the results underscore the importance of regional socialization

and norm diffusion in causing states to ratify human rights treaties. Going from the observed

lowest proportion to the observed highest proportion of regional ratifications will increase a

country’s probability of becoming and remaining a state party by somewhere between 7.2%

and 9.5%, depending on the treaty. Density of regional ratification is, in fact, the single most

causally consistent and the second most causally important predictor of treaty ratification across

all three human rights treaties.

Second, similar to other studies in the literature (Landman 2005), my findings further

confirm that democracy is a significant predictor of treaty ratification. In fact, I find that democ-

racy is the most causally important variable for the ratification of the ICCPR and the CEDAW.

Table 4: Causal effect point estimates and 95% CI of predictors on treaty ratification

Predictors ICCPR CEDAW CAT


Super Learner-based Targeted Maximum Likelihood Estimator
Influence function-based CI with multiple imputation
Democracy 0.237 0.116 0.093
[0.121, 0.353] [0.064, 0.168] [−0.065, 0.251]
Multiple parties 0.153 0.197 0.192
[−0.063, 0.370] [−0.114, 0.508] [0.040, 0.344]
Democratic transition 0.186 0.091 −0.013
[−0.080, 0.451] [−0.046, 0.227] [−0.144, 0.118]
Involvement in militarized −0.004 −0.002 −0.010
interstate disputes [−0.015, 0.007] [−0.017, 0.013] [−0.023, 0.004]
Super Learner-based Substitution Estimator
Bootstrap (B = 500) quantile-based CI with single stochastic imputation
Global proportion of ratification −0.011 −0.011 −0.019
[−0.032, 0.000] [−0.025, 0.000] [−0.042, 0.002]
Regional proportions of ratification 0.095 0.072 0.094
[0.039, 0.190] [0.034, 0.155] [0.033, 0.241]
Population size 0.009 0.025 0.028
[−0.004, 0.027] [0.001, 0.087] [0.005, 0.056]
GDP per capita −0.003 −0.017 0.037
[−0.020, 0.011] [−0.043, −0.001] [−0.007, 0.121]
Trade/GDP −0.002 0.007 0.003
[−0.015 , 0.011] [−0.010, 0.032] [−0.014, 0.016]
Net official development assistance 0.014 0.003 0.004
[−0.010, 0.043] [−0.025, 0.019] [−0.027, 0.025]
Judicial independence −0.005 0.029 0.024
[−0.031, 0.014] [0.004, 0.094] [−0.008, 0.108]
Number of countries 192 192 192
Number of years 47 32 29
Number of observations 7,870 5,823 5,354

Being a democracy causes the probability of being a state party to these two treaties to go up

by 23.7% and 11.6%, respectively. Democracy is defined here as having direct election

of the executive, election of the legislature, and an alternation of power, among other criteria

(Cheibub, Gandhi and Vreeland 2010). The coding criteria for democracy, in other words, are

unlikely to overlap conceptually with various measures of human rights outcomes (Hill 2016b;

von Stein 2016). By implication, my findings suggest that the best way to push a state to

ratify and remain committed to human rights treaties is to support its domestic democratic in-

stitutions and promote ratifications by its regional neighbors. In the case of CAT ratification, it

should be cautioned, it is not democracy per se that has a significant causal impact. Rather, it

is the existence of de facto multiple political parties that increases the probability of ratification

by 19.2%.

Third, as to other predictors, their causal importance is either very limited or inconsistent.

Like Goodliffe and Hawkins (2006), I find that democratic transition does not significantly affect

ratification of any of the treaties, indicating a lack of empirical support for the “lock in” argument.

Involvement in militarized interstate disputes is not causally important, either. My findings also share

the skepticism by Nielsen and Simmons (2015) with respect to many economic variables such as

economic development, the amount of ODA received, and participation in international trade.

These variables do not seem to matter causally for human rights treaty ratification. Population

size tends to have a significantly positive, but substantively very small, causal impact, averaging

about 2% across the three treaties. Independence of the judiciary makes states slightly more likely

to ratify the CEDAW, but otherwise has no impact on the ratification of the ICCPR and the CAT.

I employ the same template of causal analysis, including graphical identification and ma-

chine learning-based TMLE estimation, to test many theories of CAT ratification. The results

reported in Table 5 offer several interesting findings. First, I find scant evidence to support

the commonly accepted argument regarding the interactive effect of democratic institutions and

human rights practice on CAT ratification (Hathaway 2007). Instead, my findings suggest that,

irrespective of a state’s torture practice in the year prior, changing the regime type from a

dictatorship to a democracy does not lower the probability of CAT ratification. If anything,

being a democracy causes an increase, not a decrease, by 8.2% in the chance of becoming and

remaining a state party to the CAT even at the highest level of torture practice during the

previous year, although this estimate is not statistically significant.
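The stratum-specific effects tested here come from a targeted maximum likelihood procedure; the basic plug-in (substitution) step behind such estimates can be sketched as follows, leaving out the TMLE targeting update. The row format and the `fit_outcome_model` helper are illustrative assumptions, and the sketch is in Python rather than the R used for the actual analysis.

```python
def stratum_specific_effect(rows, fit_outcome_model, stratum):
    """Plug-in (g-computation) estimate of E[A_t | do(X=1)] - E[A_t | do(X=0)]
    within a stratum of a pre-treatment covariate (here, prior torture level).
    `fit_outcome_model` returns a prediction function Qhat(x, w); both it and
    the row layout are hypothetical."""
    Q = fit_outcome_model(rows)                        # outcome regression on the full data
    sub = [r for r in rows if r["y_prev"] == stratum]  # restrict to the stratum
    # Average the predicted treatment contrast over the stratum's covariates
    diffs = [Q(1, r["w"]) - Q(0, r["w"]) for r in sub]
    return sum(diffs) / len(diffs)
```

In the actual analysis, the outcome regression would be fit with the Super Learner ensemble and then updated by the TMLE targeting step before this averaging.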

One speculative reason could be that executives in non-compliant democracies do

want to ratify and comply because past torture practices were largely the legacy of abusive

government agencies rather than of the executive itself. Such executives, perhaps under pressure from the democratic public,

could have an incentive to ratify the CAT and even use treaty obligations as a way to constrain

domestic abusive forces. In any event, these causal tests partially challenge the conventional

wisdom that poorly performing democracies are reluctant to become a treaty member because

their democratic institutions will make subsequent compliance very costly. Nevertheless, there is

some evidence, though far from conclusive, that being a democracy does increase the probability

of becoming a state party to the CAT by 14% among those countries that did not practice torture

at all—a significantly greater effect than among those that engaged in torture in the immediate

past.

Table 5: CAT ratification theories and causal effect point estimates and 95% CI

Theory tested Notation Mean SE Lower Upper


Interactive effect argument
Democracy w/ No Torture X1t → At at Yt−1 = 2 0.140 0.075 −0.007 0.287
Democracy w/ Occasional Torture X1t → At at Yt−1 = 1 0.056 0.047 −0.037 0.148
Democracy w/ Freq. Torture X1t → At at Yt−1 = 0 0.082 0.071 −0.056 0.221
Omitted variable bias argument
Multiple parties in Dictatorships X2t → At at X1t = 0 0.050 0.043 −0.034 0.134
Selection effect argument
Torture in All Yt−1 → At 0.116 0.044 0.029 0.202
Torture in Democracies Yt−1 → At at X1t−1 = 1 −0.018 0.012 −0.042 0.005
Credible commitment argument
Torture in Dictatorships Yt−1 → At at X1t−1 = 0 0.201 0.125 −0.043 0.445

Second, as indicated previously, the kind of domestic institution that significantly improves

the probability of a country being a CAT member is not democracy in general, but rather the

presence of de facto multiple political parties. However, contrary to Vreeland (2008), multiple

political parties existing under authoritarian regimes do not seem to have a significant causal

impact on treaty ratification. This raises an interesting puzzle: the presence of multiple

political parties appears to be a causally important variable overall, yet its regime type-specific

effects can vary significantly. This also calls for further inquiry into the potentially heterogeneous

causal effects of the different components within the definition of democracy.

Third, I rescale and dichotomize the CIRI torture index (with zero indicating no torture

and one indicating occasional or frequent torture) and test the selection effect argument by

directly estimating the causal impact of torture practices on CAT ratification in the following

time period. States that engage in occasional or even frequent torture practices are actually

11.6% more likely than those engaging in no torture at all to be a state party to the CAT in

the following year. In other words, this is evidence of an adverse selection effect. Governments

whose prior human rights practices do not conform to international standards tend to self-select

into, not away from, the CAT.
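The rescaling, dichotomization, and lagging described here can be sketched as follows. The panel layout and helper names are hypothetical, and this Python sketch is illustrative only (the paper's preprocessing is done in R; see Appendix B.2).

```python
def recode_torture(ciri_torture):
    """Rescale and dichotomize the CIRI torture index.
    Original coding: 0 = frequent, 1 = occasional, 2 = no torture.
    Recoded: 0 = no torture, 1 = occasional or frequent torture."""
    return 0 if ciri_torture == 2 else 1

def lag_within_country(panel):
    """Pair each country-year's ratification status with the previous year's
    recoded torture score, so that practice precedes the ratification decision.
    `panel` is an illustrative dict: (country, year) -> {"torture", "ratified"}."""
    pairs = []
    for (cow, year), row in panel.items():
        prev = panel.get((cow, year - 1))
        if prev is not None:
            pairs.append((recode_torture(prev["torture"]), row["ratified"]))
    return pairs
```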

For a closer look at this surprising finding about the adverse selection effect, I further

disaggregate the sample observations into democracies and dictatorships based on their regime

classification during the time period when their human rights practices are recorded so as not

to introduce a post-treatment bias. It turns out that among democracies, engaging in torture

practices would cause only a small 1.8% decrease in their chance of being a CAT member the

following year. This comports with my previous findings that democracy and rights practices do

not significantly interact to determine CAT ratification.

Among dictatorships, though, the estimates are highly variable and uncertain. The point

estimate suggests that authoritarian regimes that practice torture are, on average, 20% more

likely to ratify the CAT the following year, which seems to support a claim in the literature

that “[t]he empirical record has shown fairly consistently that among non-democracies, the less

compliant are as likely (and in some cases even more likely) to ratify” (von Stein 2016, 661).

However, the high variability of the causal effect estimates means that we do not find solid

empirical support for the counterintuitive claim by Hollyer and Rosendorff (2011) that authoritarian

leaders may be signaling their strength to opposition groups by way of a CAT ratification. In

short, my causal effect estimation indicates that prior torture practices do not significantly make

CAT ratification more likely, even though it points to the potential existence of an adverse

selection effect. This, by implication, reiterates the need to take into account prior rights practices

if one wants to single out and estimate the causal impact of CAT ratification on human rights

practices. Otherwise, the causal effect of the CAT would be biased downward towards zero or

even negative and CAT ratification would likely appear to exacerbate human rights violations.
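The direction of this bias can be illustrated with a toy simulation: if prior torture practice raises both the probability of ratification and the probability of later abuse, a naive comparison of ratifiers and non-ratifiers makes a truly null treaty effect look harmful. All probabilities below are made-up illustrative numbers, not estimates from the paper.

```python
import random

def simulate_confounding(n=100_000, seed=1):
    """Prior torture U raises both ratification A and later abuse Y;
    ratification itself has no effect on Y. The naive treated-control
    contrast in abuse is then positive despite the true null effect."""
    random.seed(seed)
    treated, control = [], []
    for _ in range(n):
        u = random.random() < 0.5                   # prior torture practice
        a = random.random() < (0.7 if u else 0.3)   # torturers ratify more often
        y = random.random() < (0.6 if u else 0.2)   # abuse driven by U only
        (treated if a else control).append(y)
    return sum(treated) / len(treated) - sum(control) / len(control)
```

With these numbers the naive contrast is roughly +0.16, so ratification appears to increase abuse even though its true effect is zero; conditioning on prior practice removes the bias.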

4 Conclusion

Machine learning in many respects has outpaced statistical theory in terms of modeling reality

(Efron and Hastie 2016). Political scientists could leverage these powerful methods in service

of the goal of making causal inference about political behavior and institutions. Embedding

machine learning within a causal inference framework is an effective way to increase model

flexibility while circumventing the inherent issue of model interpretability in machine learning.

One area of application is causal variable importance analysis. As demonstrated in recent

research in public health and biomedical studies (Díaz et al. 2015; Hubbard et al. 2013; Pir-

racchio et al. 2016; Ahern et al. 2016), one can reformulate traditional measures of variable

importance in terms of the causal effects of predictor variables. I adopt a similar template of

analysis that specifically leverages state-of-the-art machine learning techniques and incorporates

them into the structural causal inference framework for causal effect estimation. I then apply

this template to address the puzzle of human rights treaty ratification and test many existing

theories of CAT ratification in the literature.

In terms of broad interpretation, my analysis casts doubt on the instrumental explanations,

questions some popular institutional models, and generally supports the norms-based theories

of human rights treaty ratification. It partially confirms some less intuitive arguments in the

literature while challenging some of the most commonly accepted conventional wisdom, includ-

ing that democracy and state practices interact to determine ratification decisions and that states

self-select into treaty regimes based on their high level of compliance. Importantly, my findings

have a causal, rather than correlative, interpretation. Additionally, the data-adaptive, machine

learning-based estimation methods that I use are much less dependent upon distributional and

functional form assumptions than are traditional statistical models.

Despite the great promise of machine learning and the structural causal inference

framework, the dearth of applied research that combines these two methods suggests that there is a

gap to bridge between methodological advances in causal inference and machine learning on

the one hand and substantive applications in political science research on the other hand. Given

that any causal analysis requires a sufficient understanding of the literature in any particular

research area, applied researchers are probably better positioned to bridge this gap by adopting

machine learning methods in their political science research.

Finally, there is a critical need to openly embrace the structural, interventional framework

of causal inference in political science given that a lot of research questions in the discipline are

explicitly causal queries. This framework has developed significantly in the last decade or so

(Pearl 2014) and has been adopted very successfully in sociology, epidemiology, and biomedical

research. It was also recently presented in a way that makes it much more accessible to scholars

of various methodological persuasions (Pearl, Glymour and Jewell 2016). My application of this

framework to the issue of human rights treaty ratification shows that it can help researchers

clarify confusion about the assumed underlying causal process, identify incoherence in causal

assumptions, and modify our causal models to increase their substantive plausibility. Employing

this structural causal inference framework could be extremely beneficial to applied political

science research.

References
Ahern, Jennifer, K Ellicott Colson, Claire Margerson-Zilko, Alan Hubbard and Sandro Galea. 2016. “Pre-
dicting the Population Health Impacts of Community Interventions: The Case of Alcohol Outlets and
Binge Drinking.” American Journal of Public Health 106(11):1938–1943. 29

Alfredsson, Gudmundur, Jonas Grimheden, BC Ramcharan and Alfred de Zayas. 2009. International
Human Rights Monitoring Mechanisms: Essays in Honour of Jakob Th. Möller. Martinus Nijhoff. 3

Bareinboim, Elias, Jin Tian and Judea Pearl. 2014. Recovering from Selection Bias in Causal and Sta-
tistical Inference. In Proceedings of the 28th AAAI Conference on Artificial Intelligence. pp. 2410–2416.
9

Blanton, Shannon Lindsey and Robert G Blanton. 2007. “What Attracts Foreign Investors? An Examina-
tion of Human Rights and Foreign Direct Investment.” Journal of Politics 69(1):143–155. 4

Breiman, Leo. 2001. “Random Forests.” Machine Learning 45(1):5–32. 21

Buergenthal, Thomas. 2006. “The Evolving International Human Rights System.” American Journal of
International Law 100:783–807. 3

Chapman, Terrence L and Stephen Chaudoin. 2013. “Ratification Patterns and the International Criminal
Court.” International Studies Quarterly 57(2):400–409. 16

Chaudoin, Stephen, Jude Hays and Raymond Hicks. 2016. “Do We Really Know the WTO Cures Cancer?
False Positives and the Effects of International Institutions.” British Journal of Political Science pp. 1–26.
7

Cheibub, José Antonio, Jennifer Gandhi and James Raymond Vreeland. 2010. “Democracy and Dictator-
ship Revisited.” Public Choice 143(1-2):67–101. 16, 25, 36

Chen, Tianqi and Carlos Guestrin. 2016. “XGBoost: A Scalable Tree Boosting System.” arXiv preprint
arXiv:1603.02754 . 21

Chen, Tianqi and Tong He. 2015. Higgs Boson Discovery with Boosted Trees. In JMLR: Workshop and
Conference Proceedings. Number 42 pp. 69–80. 21

Cingranelli, David L., David L. Richards and K. Chad Clay. 2013. “The Cingranelli-Richards (CIRI) Human
Rights Dataset.” CIRI Human Rights Data Website: http://www.humanrightsdata.org. 16

Crabtree, Charles D and Christopher J Fariss. 2015. “Uncovering Patterns among Latent Variables: Human
Rights and De Facto Judicial Independence.” Research & Politics 2(3):2053168015605343. 15

Dai, Xinyuan. 2005. “Why Comply? The Domestic Constituency Mechanism.” International Organization
59(02):363–398. 15

Daniel, Rhian M, Bianca L De Stavola, Simon N Cousens et al. 2011. “gformula: Estimating Causal Effects
in the Presence of Time-varying Confounding or Mediation Using the g-computation Formula.” Stata
Journal 11(4):479. 20

Darwiche, Adnan. 2009. Modeling and Reasoning with Bayesian Networks. Cambridge University Press.
12

De Schutter, Olivier. 2010. International Human Rights Law: Cases, Materials, Commentary. Cambridge
University Press. 3

Díaz, Iván, Alan Hubbard, Anna Decker and Mitchell Cohen. 2015. “Variable Importance and Prediction
Methods for Longitudinal Problems with Missing Variables.” PloS One 10(3):1–17. 12, 29

Downs, George W., David M. Rocke and Peter N. Barsoom. 1996. “Is the Good News about Compliance
Good News about Cooperation?” International Organization 50:379–406. 5, 14

Efron, Bradley and Trevor Hastie. 2016. Computer Age Statistical Inference. Vol. 5 Cambridge University
Press. 28

Elwert, Felix. 2013. Graphical Causal Models. In Handbook of Causal Analysis for Social Research. Springer
pp. 245–273. 12

Elwert, Felix and Christopher Winship. 2014. “Endogenous Selection Bias: The Problem of Conditioning
on a Collider Variable.” Annual Review of Sociology 40:31–53.

Fariss, Christopher J. 2014. “Respect for Human Rights Has Improved Over Time: Modeling the Changing
Standard of Accountability.” American Political Science Review 108(2):297–318. 16, 36

Finnemore, Martha and Kathryn Sikkink. 1998. “International Norm Dynamics and Political Change.”
International Organization 52(4):887–917. 3

Friedman, Jerome H. 2001. “Greedy Function Approximation: A Gradient Boosting Machine.” Annals of
Statistics pp. 1189–1232. 21

Gleditsch, Nils Petter, Peter Wallensteen, Mikael Eriksson, Margareta Sollenberg and Håvard Strand.
2002. “Armed Conflict 1946-2001: A New Dataset.” Journal of Peace Research 39(5):615–637. 16

Goodliffe, Jay and Darren G. Hawkins. 2006. “Explaining Commitment: States and the Convention
against Torture.” Journal of Politics 68(2):358–371. 3, 5, 16, 26

Hafner-Burton, Emilie and Kiyoteru Tsutsui. 2007. “Justice Lost! The Failure of International Human
Rights Law to Matter Where Needed Most.” Journal of Peace Research 44(4):407–425. 16

Hafner-Burton, Emilie M. 2005. “Trading Human Rights: How Preferential Trade Agreements Influence
Government Repression.” International Organization 59(3):593–629. 4

Hafner-Burton, Emilie M. 2012. “International Regimes for Human Rights.” Annual Review of Political
Science 15:265–286. 6

Hafner-Burton, Emilie M. 2013. Forced to Be Good: Why Trade Agreements Boost Human Rights. Cornell
University Press. 16

Hafner-Burton, Emilie M, Edward D Mansfield and Jon CW Pevehouse. 2015. “Human Rights Institutions,
Sovereignty Costs and Democratization.” British Journal of Political Science 45(1):1–27. 6

Hastie, Trevor J and Robert J Tibshirani. 1990. Generalized Additive Models. Vol. 43 CRC Press. 21

Hathaway, Oona A. 2007. “Why Do Countries Commit to Human Rights Treaties?” Journal of Conflict
Resolution 51(4):588–621. 4, 5, 8, 16, 18, 26

Hill, Daniel W. 2016a. “Avoiding Obligation: Reservations to Human Rights Treaties.” Journal of Conflict
Resolution 60(6):1–30. 5

Hill, Daniel W. 2016b. “Democracy and the Concept of Personal Integrity Rights.” Journal of Politics
78(3):822–835. 25, 36

Hill, Daniel W and Zachary M Jones. 2014. “An Empirical Evaluation of Explanations for State Repres-
sion.” American Political Science Review 108(3):1–27. 2, 14

Hollyer, James and B. Peter Rosendorff. 2011. “Why Do Authoritarian Regimes Sign the Convention
against Torture? Signaling, Domestic Politics and Non-Compliance.” Quarterly Journal of Political Sci-
ence 6(3-4):275–327. 6, 16, 19, 28

Honaker, James, Gary King, Matthew Blackwell et al. 2011. “Amelia II: A Program for Missing Data.”
Journal of Statistical Software 45(7):1–47. 23

Hubbard, Alan, Ivan Diaz Munoz, Anna Decker, John B Holcomb, Martin A Schreiber, Eileen M Bulger,
Karen J Brasel, Erin E Fox, Deborah J Del Junco, Charles E Wade et al. 2013. “Time-Dependent
Prediction and Evaluation of Variable Importance Using SuperLearning in High Dimensional Clinical
Data.” The Journal of Trauma and Acute Care Surgery 75(1):S53–S60. 29

Keller, Helen and Geir Ulfstein. 2012. UN Human Rights Treaty Bodies: Law and Legitimacy. Vol. 1
Cambridge University Press. 3

Kreif, Noémi, Richard Grieve, Iván Díaz and David Harrison. 2015. “Evaluation of the Effect of a Contin-
uous Treatment: A Machine Learning Approach with an Application to Treatment for Traumatic Brain
Injury.” Health Economics 24(9):1213–1228. 20

La Porta, Rafael, Florencio Lopez-de Silanes and Andrei Shleifer. 2008. “The Economic Consequences of
Legal Origins.” Journal of Economic Literature 46(2):285–332. 16, 36

Landman, Todd. 2005. Protecting Human Rights: A Comparative Study. Georgetown University Press. 5,
24

Lebovic, James H and Erik Voeten. 2006. “The Politics of Shame: The Condemnation of Country Human
Rights Practices in the UNCHR.” International Studies Quarterly 50(4):861–888. 4

Linzer, Drew A and Jeffrey K Staton. 2015. “A Global Measure of Judicial Independence, 1948–2012.”
Journal of Law and Courts 3(2):223–256. 16

Little, Roderick JA and Donald B Rubin. 2014. Statistical Analysis with Missing Data. John Wiley & Sons.
20

Lupu, Yonatan. 2014. “Why Do States Join Some Universal Treaties but Not Others? An Analysis of Treaty
Commitment Preferences.” Journal of Conflict Resolution pp. 1–32. 2, 4

Lupu, Yonatan. 2015. “Legislative Veto Players and the Effects of International Human Rights Agree-
ments.” American Journal of Political Science 59(3):578–594. 15

Melander, Erik, Therése Pettersson and Lotta Themnér. 2016. “Organized Violence, 1989–2015.” Journal
of Peace Research 53(5):727–742. 16

Mitchell, Sara McLaughlin, Jonathan J Ring and Mary K Spellman. 2013. “Domestic Legal Traditions and
States’ Human Rights Practices.” Journal of Peace Research 50(2):189–202. 16

Moravcsik, Andrew. 2000. “The Origins of Human Rights Regimes: Democratic Delegation in Postwar
Europe.” International Organization 54(2):217–252. 5, 16

Natekin, Alexey and Alois Knoll. 2013. “Gradient Boosting Machines, A Tutorial.” Frontiers in Neuro-
robotics 7. 21

Neugebauer, Romain, Bruce Fireman, Jason A Roy, Marsha A Raebel, Gregory A Nichols and Patrick J
O’Connor. 2013. “Super Learning to Hedge against Incorrect Inference from Arbitrary Parametric
Assumptions in Marginal Structural Modeling.” Journal of Clinical Epidemiology 66(8):S99–S109. 21

Neumayer, Eric. 2007. “Qualified Ratification: Explaining Reservations to International Human Rights
Treaties.” The Journal of Legal Studies 36(2):397–429. 5, 16

Nielsen, Richard A and Beth A Simmons. 2015. “Rewards for Ratification: Payoffs for Participating in the
International Human Rights Regime?” International Studies Quarterly 59(2):197–208. 4, 16, 26

Pearl, Judea. 2009. Causality. Cambridge University Press. 2, 7, 12, 15

Pearl, Judea. 2014. “The Deductive Approach to Causal Inference.” Journal of Causal Inference 2(2):115–
129. 30

Pearl, Judea, Madelyn Glymour and Nicholas P Jewell. 2016. Causal Inference in Statistics: A Primer.
John Wiley & Sons. 17, 18, 30

Pirracchio, Romain, John K Yue, Geoffrey T Manley, Mark J van der Laan, Alan E Hubbard et al. 2016.
“Collaborative Targeted Maximum Likelihood Estimation for Variable Importance Measure: Illustration
for Functional Outcome Prediction in Mild Traumatic Brain Injuries.” Statistical Methods in Medical
Research pp. 1–15. 29

Pirracchio, Romain, Maya L Petersen and Mark van der Laan. 2015. “Improving Propensity Score Esti-
mators’ Robustness to Model Misspecification Using Super Learner.” American Journal of Epidemiology
181(2):108–119. 21

Polley, Eric C and Mark J van der Laan. 2010. “Super Learner in Prediction.” Working Paper Series UC
Berkeley Division of Biostatistics . 2, 21

Powell, Emilia J. and Jeffrey K. Staton. 2009. “Domestic Judicial Institutions and Human Rights Treaty
Violation.” International Studies Quarterly 53(1):149–174. 15, 16

Robins, James. 1986. “A New Approach to Causal Inference in Mortality Studies with a Sustained Expo-
sure Period – Application to Control of the Healthy Worker Survivor Effect.” Mathematical Modelling
7(9-12):1393–1512. 20

Robins, James M, Sander Greenland and Fu-Chang Hu. 1999. “Estimation of the Causal Effect of a
Time-varying Exposure on the Marginal Mean of a Repeated Binary Outcome.” Journal of the American
Statistical Association 94(447):687–700. 20

Rodley, Nigel S. 2013. The Role and Impact of Treaty Bodies. In The Oxford Handbook of International
Human Rights Law, ed. Dinah Shelton. Oxford University Press pp. 621–648. 3

Samii, Cyrus, Laura Paler and Sarah Daly. 2016. “Retrospective Causal Inference with Machine Learning
Ensembles: An Application to Anti-Recidivism Policies in Colombia.” Political Analysis Forthcoming. 20

Schapire, Robert E and Yoav Freund. 2012. Boosting: Foundations and Algorithms. MIT press. 21

Simmons, Beth A. 2009. Mobilizing for Human Rights: International Law in Domestic Politics. Cambridge:
Cambridge University Press. 4, 5, 16, 36

Simmons, Beth A. 2011. “Reflections on Mobilizing for Human Rights.” NYU Journal of International Law
and Politics 44:729–750. 6

Simmons, Beth A and Daniel J Hopkins. 2005. “The Constraining Power of International Treaties: Theory
and Methods.” American Political Science Review 99(04):623–631. 5

Sinisi, Sandra E, Eric C Polley, Maya L Petersen, Soo-Yon Rhee and Mark J van der Laan. 2007. “Super
Learning: An Application to the Prediction of HIV-1 Drug Resistance.” Statistical Applications in Genetics
and Molecular Biology 6(1). 20

Smith-Cannoy, Heather. 2012. Insincere Commitments: Human Rights Treaties, Abusive States, and Citizen
Activism. Georgetown University Press. 4

Spence, Douglas Hamilton. 2014. “Foreign Aid and Human Rights Treaty Ratification: Moving beyond
the Rewards Thesis.” The International Journal of Human Rights 18(4-5):414–432. 4

Tibshirani, Robert. 1996. “Regression Shrinkage and Selection via the Lasso.” Journal of the Royal Statis-
tical Society. Series B (Methodological) pp. 267–288. 21

Tsiatis, Anastasios. 2007. Semiparametric Theory and Missing Data. Springer Science & Business Media.
20

Tyagi, Yogesh. 2009. “The Denunciation of Human Rights Treaties.” British Yearbook of International Law
79(1):86–193. 14

van der Laan, Mark J, Eric C Polley and Alan E Hubbard. 2007. “Super Learner.” Statistical Applications
in Genetics and Molecular Biology 6(1). 20

van der Laan, Mark J and Sherri Rose. 2011. Targeted Learning: Causal Inference for Observational and
Experimental Data. Springer. 2, 23

VanderWeele, Tyler J. 2009. “On the Distinction between Interaction and Effect Modification.” Epidemi-
ology 20(6):863–871.

von Stein, Jana. 2005. “Do Treaties Constrain or Screen? Selection Bias and Treaty Compliance.” Ameri-
can Political Science Review 99(4):611–622. 5, 14

von Stein, Jana. 2016. “Making Promises, Keeping Promises: Democracy, Ratification and Compliance in
International Human Rights Law.” British Journal of Political Science 46(3):655–679. 5, 7, 25, 28

Vreeland, James R. 2008. “Political Institutions and Human Rights: Why Dictatorships Enter into the
United Nations Convention Against Torture.” International Organization 62(1):65. 6, 8, 9, 10, 16, 19,
27

Wallace, Geoffrey PR. 2013. “International Law and Public Attitudes toward Torture: An Experimental
Study.” International Organization 67(01):105–140. 15

A Variable Description
• Treaty ratification status of the ICCPR, CEDAW, CAT: A country–year binary variable coded 1 for
ratification and 0 otherwise. Data are coded manually from the database of the Office of the High
Commissioner for Human Rights.
(http://www.ohchr.org/EN/HRBodies/Pages/HumanRightsBodies.aspx).
• Human rights dynamic latent protection scores: a country–year interval variable that measures
respect for physical integrity rights. Rescaled to a 0–1 range from the empirical range for ease
of estimation and interpretation. The scores were generated by Fariss (2014) using a dynamic
ordinal item-response theory model that accounts for systematic change in the way human rights
abuses have been monitored over time. The human rights scores model builds on data from the
CIRI Human Rights Data Project, the Political Terror Scale, the Ill Treatment and Torture Data
Collection, the Uppsala Conflict Data Program, and several other public sources.
Variable name in original dataset is latentmean.
(http://humanrightsscores.org).
• CIRI women’s political rights: an ordinal variable from 0 – 3 that measures the extent to which
women’s political rights are protected, including the rights to vote, run for political office, hold
elected office, join political parties, and petition government officials.
A score of 0 indicates these rights are not guaranteed by law; a score of 1 indicates rights are
guaranteed by law but severely restricted in practice; a score of 2 indicates rights are guaranteed
by law but moderately restricted in practice; and a score of 3 indicates rights are guaranteed in
law and practice.
(http://www.humanrightsdata.com/p/data-documentation.html).
• CIRI torture index: an ordinal index that measures the extent of torture practice by government
officials or by private individuals at the instigation of government officials. A score of zero indi-
cates frequent torture practice; a score of 1 indicates occasional torture practice; and a score of 2
indicates that torture did not occur in a given year.
(http://www.humanrightsdata.com/p/data-documentation.html).
• Legal origins: a cross-sectional (country) multinomial variable coded for British, French, German,
Scandinavian, and Socialist legal origins. Data are from La Porta, Lopez-de Silanes and Shleifer
(2008). I recoded 1 for common law and 0 otherwise.
• Ratification rules: a cross-sectional (country) five-point ordinal variable (1, 1.5, 2, 3, 4) from Simmons
(2009). Its empirical maximum value, however, is only a score of 3. It measures “the institutional
‘hurdle’ that must be overcome in order to get a treaty ratified.” The coding is based on
descriptions of national constitutions or basic laws.
(http://scholar.harvard.edu/files/bsimmons/files/APP_3.2_Ratification_rules.pdf).
• Global and regional ratification rates: continuous variables measuring the cumulative ratification
rates globally and by region. Regional classification is defined using the United Nations Regional
Groups of Member States, including Africa Group (AG), Asia-Pacific Group (APG), Eastern Euro-
pean Group (EEG), Latin American and Caribbean Group (GRULAC), and Western European and
Others Group (WEOG).
(http://www.un.org/depts/DGACM/RegionalGroups.shtml).
• Democracy: measured by the dummy variable democracy in the Democracy-Dictatorship dataset
by Cheibub, Gandhi and Vreeland (2010). It is coded 1 if the regime qualifies as democratic and 0
otherwise. This measure is preferred to the Polity IV dataset to avoid a conceptual overlap between
democracy and physical integrity rights (Hill 2016b).
(https://sites.google.com/site/joseantoniocheibub/datasets/democracy-and-dictatorship-revisited).

• Multiple parties: an ordinal variable coded 0 for no parties, 1 for a single party, and 2 for multiple
parties. Variable name in original dataset is defacto. I recoded 1 for multiple parties and 0
otherwise.
(https://sites.google.com/site/joseantoniocheibub/datasets/democracy-and-dictatorship-revisited).
• Democratic transition: a binary variable coded 1 when there is transition to or from democracy
and 0 otherwise.
Variable name in original dataset is tt.
(https://sites.google.com/site/joseantoniocheibub/datasets/democracy-and-dictatorship-revisited).
• Judicial independence: a time-series cross-sectional latent score (0 – 1) measuring judicial inde-
pendence. The scores range from 0 (no judicial independence) to 1 (complete judicial indepen-
dence).
(http://polisci.emory.edu/faculty/jkstato/page3/index.html).
• GDP per capita: a country–year interval variable measuring gross domestic product divided by
midyear population measured in current US dollars. A few country-year observations have a GDP
per capita value of zero. I replace these with the next smallest observed value, 65.
(http://data.worldbank.org/indicator/NY.GDP.PCAP.CD).
• Population: a country–year interval variable measuring the total number of residents in a country
regardless of their legal status.
(http://data.worldbank.org/indicator/SP.POP.TOTL).
• Trade: a country–year interval variable measuring the sum of exports and imports of goods and
services as a share of gross domestic product.
(http://data.worldbank.org/indicator/NE.TRD.GNFS.ZS).
• Net ODA received (current USD): data are from the World Bank Indicators database.
(http://data.worldbank.org/indicator/DT.ODA.ODAT.CD).
• Involvement in militarized interstate dispute: a country–year binary variable from the Milita-
rized Interstate Dispute Data (MIDB dataset, version 4.1). It is recoded 1 to indicate a country’s
involvement in any side of a militarized dispute and 0 otherwise between the start year and the
end year of a dispute.
(http://cow.dss.ucdavis.edu/data-sets/MIDs).
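Two of the preprocessing steps described in this appendix, the cumulative regional ratification rates and the replacement of zero GDP-per-capita values, can be sketched as follows. The data layouts are illustrative assumptions; the actual preprocessing is done in R (Appendix B.2).

```python
from collections import defaultdict

def regional_rate(countries, year):
    """Cumulative ratification rate by region: share of each region's
    countries that ratified in `year` or earlier. `countries` is an
    illustrative list of (region, ratification_year) pairs, with
    ratification_year = 0 if the country never ratified."""
    total, ratified = defaultdict(int), defaultdict(int)
    for region, ry in countries:
        total[region] += 1
        if ry and ry <= year:
            ratified[region] += 1
    return {g: ratified[g] / total[g] for g in total}

def fix_zero_gdp(values):
    """Replace zero GDP-per-capita entries with the smallest positive
    observed value (65 in the paper's data), so that ratio and log
    transformations remain well defined."""
    floor = min(v for v in values if v > 0)
    return [floor if v == 0 else v for v in values]
```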

B Summary Statistics
B.1 Summary statistics

Table 6: Summary Statistics

Statistic N Mean St. Dev. Min Max


COW country code 8,062 — — 2 990
Year 8,062 — — 1966 2013
ICCPR ratification 8,062 0.560 0.496 0 1
CEDAW ratification 8,062 0.563 0.496 0 1
CAT ratification 8,062 0.370 0.483 0 1
Human rights scores 8,062 0.345 1.420 −3.110 4.710
CIRI women’s political rights 4,840 1.780 0.649 0 3
CIRI torture index 4,850 0.778 0.747 0 2
Legal origins 7,956 — — 1 5
Ratification rules 7,796 1.800 0.640 1 3
ICCPR global rate 8,062 0.561 0.268 0 0.869
CEDAW global rate 8,062 0.564 0.379 0 0.964
CAT global rate 8,062 0.369 0.316 0 0.792
ICCPR regional rates 8,062 0.563 0.311 0 1
CEDAW regional rates 8,062 0.565 0.397 0 1
CAT regional rates 8,062 0.372 0.356 0 1
Democracy 6,886 0.442 0.497 0 1
Multiple parties 6,886 1.650 0.653 0 2
Transition 6,886 0.018 0.134 0 1
Judicial independence 7,679 0.465 0.321 0.01 0.995
Population 7,798 31,846,961 115,863,080 9,419 1,357,380,000
GDP per capita 7,055 6,907 14,088 37.5 193,648
Trade 6,536 75.7 49.3 0.021 532
Net ODA 7,490 268,622,622 619,681,691 −943,150,000 22,057,090,000
Militarized dispute 7,501 0.308 0.462 0 1

B.2 R code for data preprocessing


• R version 3.3.2 (2016-10-31)
• Platform: x86_64-w64-mingw32/x64 (64-bit)
• Running under: Windows >= 8 x64 (build 9200)

1 options(digits = 3)
2 options(dplyr.width = Inf)
3 rm(list = ls())
4 cat("\014")
5
6 library(dplyr) # Upload dplyr to process data
7 library(tidyr) # tinyr package to tidy data
8 library(foreign) # Read Stata data
9 library(lubridate) # Handle dates data
10 library(stargazer) # Export summary statistics in latex table

38
11 library(reshape2) # convert data sets into long and wide formats
12 library(ggplot2)
13 library(ggthemes)
14
15 ######################################
16 # Ratification and human rights scores
17 ######################################
18
19 # Treaty ratification years by 192 countries
20 # ICCPR open 1966, entry 1976
21 # CAT open 1984, entry 1987
22 # CEDAW open 1979, entry 1981
23 ratifcow <− read.csv("ratification.csv") %>%
24 mutate(catyear = year(as.Date(cat.date, format = "%m/%d/%Y")),
25 iccpryear = year(as.Date(iccpr.date, format = "%m/%d/%Y")),
26 cedawyear = year(as.Date(cedaw.date, format = "%m/%d/%Y"))) %>%
27 dplyr::select(cow, name, catyear, iccpryear, cedawyear) %>%
28 mutate(catyear = ifelse(is.na(catyear), 0, catyear),
29 iccpryear = ifelse(is.na(iccpryear), 0, iccpryear),
30 cedawyear = ifelse(is.na(cedawyear), 0, cedawyear))
31
32 # PTS on 203 states from 1976−2014 by order Amnesty > SD > HRW
33 ptscores <− read.csv("PTS2015.csv") %>%
34 mutate(pts = ifelse(is.na(Amnesty),
35 ifelse(is.na(State.Dept),
36 ifelse(is.na(HRW), NA, HRW),
37 State.Dept), Amnesty)) %>%
38 rename(cow = COWnum, year = Year) %>%
39 dplyr::select(cow, year, pts)
40
41 # HR protection scores on 205 states from 1949−2013
42 hrscores <− read.csv("hrscores.csv") %>%
43 rename(cow = COW, year = YEAR, hrs = latentmean) %>%
44 dplyr::select(cow, year, hrs)
45
46 ######################
47 # Baseline covariates
48 ######################
49
50 # Regional indicator of 194 states
51 regional <− read.csv("region.csv") %>%
52 dplyr::select(cow, region)
53

54 # Legal origin and ratification rules in 187 states
55 # legal origins (1−English, 2−French, 4−German, 5−Scandinavian)
56 # ratification rules 1 (lowest hurdle) to 4 (highest)
57 legalratif <− read.csv("legalratif.csv") %>%
58 mutate(cow, legor = as.factor(legor), ratifrule = as.factor(ratifrule)) %>%
59 dplyr::select(−name)
60
61 # Combine TSCS dataset using PTS (region, legalratif)
62 # 192 states from 1966−2013 (remember to filter by treaty opening years later)
63 data <− ratifcow %>% left_join(hrscores, by = "cow") %>%

64 mutate(iccpr = ifelse(iccpryear == 0, 0, ifelse(year >= iccpryear, 1, 0)),
65 cedaw = ifelse(cedawyear == 0, 0, ifelse(year >= cedawyear, 1, 0)),
66 cat = ifelse(catyear == 0, 0, ifelse(year >= catyear, 1, 0))) %>%
67 dplyr::select(−c(name, iccpryear, cedawyear, catyear)) %>%
68 dplyr::filter(year > 1965) %>%
69 left_join(regional, by = "cow") %>%
70 left_join(legalratif, by = "cow")
71
72 ###############################
73 # Plot number of states parties
74 # to each treaty over time
75 ###############################
76 # Count the number of states parties to each treaty over time since 1966
77 ratify <− data.frame(data %>% group_by(year) %>%
78 summarise(CAT = sum(cat),
79 ICCPR = sum(iccpr),
80 CEDAW = sum(cedaw)))
81
82 # Convert to long format and plot the number of states parties since 1966
83 ratify_long <− melt(ratify, id = "year") %>%
84 filter(year > 1965) %>%
85 rename(Number = value, Treaty = variable, Year = year)
86
87 ggplot(data = ratify_long, aes(x = Year, y = Number, colour = Treaty)) +
88 geom_point(size = 6) + geom_line(size = 2) +
89 xlab("Year") + ylab("Number of States Parties") +
90 labs(x = "Year", y = "Number of States Parties") + theme_wsj() +
91 scale_x_continuous(breaks = c(1965, 1970, 1975, 1980, 1985, 1990,
92 1995, 2000, 2005, 2010, 2015)) +
93 scale_y_continuous(breaks = c(0, 25, 50, 75, 100, 125, 150, 175, 200, 225)) +
94 theme(legend.title = element_text(size = 25, hjust = 3, vjust = 7)) +
95 theme(axis.text = element_text(size = 25)) +
96 theme(legend.text = element_text(size = 25))
97
98 # Time−varying covariates
99 #########################
100 # Calculate regional and global ratification rates from 1966−2013
101 ratify_global <− data.frame(data %>% group_by(year) %>%
102 summarise(iccpr_glbavg = mean(iccpr),
103 cedaw_glbavg = mean(cedaw),
104 cat_glbavg = mean(cat)))
105
106 ratify_regional <− data.frame(data %>% group_by(year, region) %>%
107 summarise(iccpr_regavg = mean(iccpr),
108 cedaw_regavg = mean(cedaw),
109 cat_regavg = mean(cat)))
110
111 # Join data with two diffusion ratification rates
112 data <− dplyr::left_join(data, ratify_global, by = c("year" = "year")) %>%
113 left_join(ratify_regional, by = c("year" = "year", "region" = "region")) %>%
114 dplyr::select(−c(region))
115
116 # CIRI data on 198 states from 1981−2011

117 # torture (0 = frequent, 2 = no)
118 # women’s political rights (0 = no, 3 = law and practice)
119 ciri <− read.csv("CIRI.csv") %>%
120 dplyr::select(cow = COW, year = YEAR, torture = TORT, wpol = WOPOL) %>%
121 mutate(wpol = ifelse(wpol < 0, NA, wpol),
122 torture = ifelse(torture < 0, NA, torture))
123

124 # Latent Judicial Independence estimates for 200 states from 1948−2012
125 judind <− read.csv("lji.csv") %>%
126 dplyr::select(cow = ccode, year = year, ji = LJI)
127
128 # Population in 185 states from 1966−2015
129 pop <− read.csv("Pop-WDI.csv", header = TRUE, stringsAsFactors = FALSE) %>%
130 gather(year, population, −c(name, cow), factor_key = TRUE) %>%
131 mutate(year = as.integer(gsub("X", "", year)),
132 population = as.numeric(population)) %>%
133 dplyr::select(cow, year, population)
134
135 # GDP per capita in constant US dollars in 185 states from 1966−2015
136 GDPpc <− read.csv("GDPpcWDI.csv", header = TRUE, stringsAsFactors = FALSE) %>%
137 gather(year, gdppc, −c(name, cow), factor_key = TRUE) %>%
138 mutate(year = as.integer(gsub("X", "", year)), gdppc = as.numeric(gdppc)) %>%
139 dplyr::select(cow, year, gdppc) %>%
140 mutate(gdppc = ifelse(gdppc <= 0, NA, gdppc))
141

142 # Trade data for 170 states from 1960−2014
143 trade <− read.csv("trade.csv") %>%
144 dplyr::select(−c(Country.Name)) %>%
145 gather(year, trade, −c(COW), factor_key = TRUE) %>%
146 mutate(year = as.integer(gsub("X", "", year))) %>%
147 dplyr::select(cow = COW, year, trade)
148
149 # Net ODA data 172 states from 1960−2015
150 oda <− read.csv("netODA.csv") %>%
151 dplyr::select(−c(name)) %>%
152 gather(year, oda, −c(cow), factor_key = TRUE) %>%
153 mutate(year = as.integer(gsub("X", "", year))) %>%
154 mutate(oda = ifelse(is.na(oda), 0, oda))
155
156 # Democracy−dictatorship data on 202 countries from 1946 − 2008
157 # defacto multiple parties (0 = no, 1 = single, 2 = multiple)
158 # transition to/from democracy
159 dd <− read.dta("DD.dta") %>%
160 dplyr::select(year = year, cow = cowcode2,
161 parties = defacto, transition = tt, democracy = democracy)
162
163 # MID data on 178 states from 1966 to 2013
164 MID <− read.csv("MIDB.csv") %>%
165 dplyr::select(cow = ccode, start = StYear, end = EndYear) %>%
166 right_join(hrscores, by = c("cow")) %>%
167 dplyr::select(cow, start, end, year) %>%
168 filter(start > 1965) %>% filter(year > 1965) %>%
169 mutate(dispute = ifelse(year >= start, ifelse(year <= end, 1, 0), 0)) %>%

170 dplyr::select(cow, year, dispute) %>%
171 distinct() %>% group_by(cow, year) %>% summarize(dispute = max(dispute))
172
173 # Combine data on 192 states from 1966−2013
174 data <− left_join(data, ciri, by = c("cow" = "cow", "year" = "year")) %>%
175 left_join(dd, by = c("cow" = "cow", "year" = "year")) %>%
176 left_join(judind, by = c("cow" = "cow", "year" = "year")) %>%
177 left_join(pop, by = c("cow" = "cow", "year" = "year")) %>%
178 left_join(GDPpc, by = c("cow" = "cow", "year" = "year")) %>%
179 left_join(trade, by = c("cow" = "cow", "year" = "year")) %>%
180 left_join(oda, by = c("cow" = "cow", "year" = "year")) %>%
181 left_join(MID, by = c("cow" = "cow", "year" = "year"))
182

183 # Summary statistics in LaTeX
184 sum(complete.cases(data))
185 stargazer(data)
186
187 # Saving data into drive
188 write.csv(data, "finaldata.csv", row.names = FALSE)
189
190 # Saving data in RData file
191 save.image("datawork.RData")

C Multiple Imputation of Missing Data
C.1 Multiple imputation
Multiple imputation is used to fill in missing data, creating five imputed datasets that cover 192 countries
from 1966–2013. All variables in Table 6 are included in the imputation model to make the missing-at-random
(MAR) assumption as plausible as possible. When modeling and estimating causal effects, however, I subset
the observations to the appropriate time periods: for example, only observations from 1985–2013 are used
when estimating the causal effects of predictive covariates on CAT ratification, and 1982–2013 when modeling
CEDAW ratification. As a result, the fractions of imputed values actually used in estimation tend to be lower
than those reported in Table 7. Among the variables used, the highest missing fractions are for the CIRI
torture index (0.197) and the CIRI measure of women's political rights (0.196).
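The per-variable missing fractions reported in Table 7 can be computed directly from the merged data; a minimal sketch, with a toy data frame standing in for `finaldata.csv`:

```r
# Fraction of NA values per column, sorted in decreasing order
toy <- data.frame(wpol = c(1, NA, 3, NA), trade = c(10, 20, NA, 40))
miss_frac <- sort(colMeans(is.na(toy)), decreasing = TRUE)
round(miss_frac, 3)
```

Here `colMeans(is.na(...))` exploits the fact that `is.na()` returns a logical matrix whose column means are exactly the missing fractions.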

Table 7: Fractions of missing data by variables

Variables Missing fraction


CIRI women’s political rights 0.400
CIRI torture index 0.398
Trade participation 0.189
DD transition 0.146
DD multiple parties 0.146
DD democracy 0.146
GDP per capita 0.125
Judicial independence 0.048
Net ODA 0.071
Involvement in militarized dispute 0.070
Population size 0.033
Ratification rules 0.033
Legal origins 0.013
CAT ratification 0.000
CAT global ratification rate 0.000
CAT regional ratification rates 0.000
CEDAW ratification 0.000
CEDAW global ratification rate 0.000
CEDAW regional ratification rates 0.000
ICCPR ratification 0.000
ICCPR global ratification rate 0.000
ICCPR regional ratification rates 0.000
N of obs. after list-wise deletion 3,615
N of obs. after imputation 8,062

Missingness Map

[Figure: Amelia missingness map; rows index COW country codes (2–990) and columns index the variables in
the imputation model (wpol, torture, trade, democracy, transition, parties, gdppc, oda, dispute, ji,
ratifrule, population, legor, the regional and global ratification rates, the three treaty ratification
indicators, hrs, year, and cow).]

Figure 7: Map of missing data for multiple imputation

C.2 R code for multiple imputation

1 options(digits = 2)
2 options(dplyr.width = Inf)
3 rm(list = ls())
4 cat("\014")
5

6 library(dplyr) # Load dplyr to process data
7 library(tidyr) # tidyr package to tidy data
8 library(foreign) # Read Stata data
9 library(ggplot2) # ggplot graphics
10 library(Amelia) # Multiple imputation
11 library(lubridate) # Handle dates data
12
13 # Summary statistics of raw data and impute GDP = min and conflict NA = 0
14 data <− read.csv("finaldata.csv")
15 stargazer(data)
16

17 # Multiple imputation using Amelia package
18 set.seed(123)
19 mi.data <− amelia(data, m = 5, ts = "year", cs = "cow",

20 p2s = 2, polytime = 3,
21 logs = c("gdppc", "population"),
22 noms = c("legor", "ratifrule", "democracy", "transition",
23 "parties", "dispute"),
24 ords = c("wpol", "torture"),
25 emburn = c(50, 500), boot.type = "none",
26 bounds = rbind(c(3, −3.1, 4.7), c(20, 0, 1),
27 c(21, 9, 21), c(22, 4, 12), c(23, 0, 532),
28 c(24, 0, 24)))
29
30 # Write imputed data sets into CSV files
31 save(mi.data, file = "midata.RData")
32 write.amelia(obj = mi.data, file.stem = "midata", row.names = FALSE)
33
34 # Create missingness map and diagnostics
35 missmap(mi.data)
36 summary(mi.data)
37
38 # Stack all five data sets and export into a CSV file
39 data1 <− read.csv("midata1.csv")
40 data2 <− read.csv("midata2.csv")
41 data3 <− read.csv("midata3.csv")
42 data4 <− read.csv("midata4.csv")
43 data5 <− read.csv("midata5.csv")
44 stackdata <− rbind(data1, data2, data3, data4, data5)
45 write.csv(stackdata, file = "stackeddata.csv", row.names = FALSE)

D R code for main analysis
D.1 R code for XGBoost tuning

1 ###########################################
2 # Tuning XGBoost Hyperparameters Using a Combination of
3 # Grid Search and Cross−validated Super Learner
4 ###########################################
5 options(digits = 3)
6 options(dplyr.width = Inf)
7 options("scipen" = 5)
8 rm(list = ls())
9 cat("\014")
10
11 # Load packages
12 library(dplyr) # Load dplyr to process data
13 library(ggplot2) # visualize data
14 library(ggthemes) # use various themes in ggplot
15 library(SuperLearner) # use Super Learner predictive method
16 library(gam) # algorithm used within TMLE
17 library(glmnet) # algorithm used within TMLE
18 library(randomForest) # algorithm used within TMLE
19 library(xgboost) # algorithm for XGBoost
20 library(xtable)
21 library(Amelia)
22 library(foreach) # do parallel loop
23 library(doParallel) # do parallel loop
24 library(RhpcBLASctl) #multicore
25

26 # Setup parallel computation − use all cores on our computer.
27 num_cores = RhpcBLASctl::get_num_cores()
28
29 # Use all of those cores for parallel SuperLearner.
30 options(mc.cores = num_cores)
31

32 # Check how many parallel workers we are using:
33 getOption("mc.cores")
34
35 # We need to set a different type of seed that works across cores.
36 set.seed(1, "L'Ecuyer-CMRG")
37

38 # Create function rescaling outcome into 0−1
39 std <− function(x) {x = (x − min(x))/(max(x) − min(x))}
40
41 # Read the fifth imputed data set and process data
42 data <− read.csv("midata5.csv")
43 datatuning <− data %>%
44 mutate(legor = ifelse(legor == 1, 1, 0),
45 ratifrule = std(ratifrule),
46 iccpr_glbavg = std(iccpr_glbavg),
47 cedaw_glbavg = std(cedaw_glbavg),
48 cat_glbavg = std(cat_glbavg),

49 iccpr_regavg = std(iccpr_regavg),
50 cedaw_regavg = std(cedaw_regavg),
51 cat_regavg = std(cat_regavg),
52 ji = std(ji), population = std(population),
53 gdppc = std(gdppc), trade = std(trade), oda = std(oda),
54 parties = ifelse(parties < 2, 0, 1),
55 hrs = std(hrs),
56 wpol = std(wpol),
57 torture = std(torture)) %>%
58 group_by(cow) %>%
59 mutate(laghrs = lag(hrs, 1),
60 lagwpol = lag(wpol, 1),
61 lagtorture = lag(torture, 1),
62 lagiccpr = lag(iccpr, 1),
63 lagcedaw = lag(cedaw, 1),
64 lagcat = lag(cat, 1))
65
66 # Use 1967−2013 because ICCPR opened in 12/1966
67 # HRP scores 1948−2013
68 # Total 7,870 obs across 192 countries over 47 years
69 datatuning_iccpr <− datatuning %>%
70 dplyr::select(c(iccpr_glbavg, iccpr_regavg,
71 population, gdppc, trade, oda, ji,
72 democracy, parties, transition, dispute,
73 legor, ratifrule,
74 lagiccpr, laghrs,
75 iccpr,
76 year, cow)) %>%
77 filter(year >= 1967) %>% na.omit()
78 Y1 <− datatuning_iccpr$iccpr
79 X1 <− data.frame(datatuning_iccpr[, 1:15])
80 id1 <− factor(datatuning_iccpr$cow)
81
82 # Use 1982−2012 because CIRI wpol starts at 1981 and stops at 2011
83 # Total 5,631 obs across 192 countries over 31 years
84 datatuning_cedaw <− datatuning %>%
85 dplyr::select(c(cedaw_glbavg, cedaw_regavg,
86 population, gdppc, trade, oda, ji,
87 democracy, parties, transition, dispute,
88 legor, ratifrule,
89 lagcedaw, lagwpol,
90 cedaw,
91 year, cow)) %>%
92 filter(year >= 1982) %>% na.omit()
93 Y2 <− datatuning_cedaw$cedaw
94 X2 <− data.frame(datatuning_cedaw[, 1:15])
95 id2 <− factor(datatuning_cedaw$cow)
96
97 # Use 1985−2012 because CAT opened in 1984
98 # CIRI torture stops at 2011
99 # Total 5,162 obs across 192 countries over 28 years
100 datatuning_cat <− datatuning %>%
101 dplyr::select(c(cat_glbavg, cat_regavg,

102 population, gdppc, trade, oda, ji,
103 democracy, parties, transition, dispute,
104 legor, ratifrule,
105 lagcat, lagtorture,
106 cat,
107 cow, year)) %>%
108 filter(year >= 1985) %>% na.omit()
109 Y3 <− datatuning_cat$cat
110 X3 <− data.frame(datatuning_cat[, 1:15])
111 id3 <− factor(datatuning_cat$cow)
112
113 # 3∗4∗4 = 48 different configurations.
114 tune = list(ntrees = c(200, 500, 1000),
115 max_depth = c(4:7),
116 shrinkage = c(0.01, 0.05, 0.1, 0.2))
117
118 # Set detailed names = T so we can see the configuration for each function.
119 learners = create.Learner("SL.xgboost", tune = tune,
120 detailed_names = T, name_prefix = "XGB")
121
122 # Fit the SuperLearner using ICCPR, CEDAW, and CAT stacked datasets
123 SL.library <− c(learners$names)
124
125 set.seed(3)
126 sl3 = CV.SuperLearner(Y = Y3, X = X3,
127 family = binomial(), SL.library = SL.library,
128 method = "method.NNLS", id = id3, verbose = TRUE,
129 control = list(saveFitLibrary = TRUE, trimLogit = 1e−04),
130 cvControl = list(V = 5L, shuffle = TRUE),
131 parallel = "multicore")
132 plot.CV.SuperLearner(sl3)
133 result_CVcat <− summary.CV.SuperLearner(sl3)$Table
134 result_CVcat[order(result_CVcat$Ave), ]
135
136 set.seed(2)
137 sl2 = CV.SuperLearner(Y = Y2, X = X2,
138 family = binomial(), SL.library = SL.library,
139 method = "method.NNLS", id = id2, verbose = TRUE,
140 control = list(saveFitLibrary = TRUE, trimLogit = 1e−04),
141 cvControl = list(V = 5L, shuffle = TRUE),
142 parallel = "multicore")
143 plot.CV.SuperLearner(sl2)
144 result_CVcedaw <− summary.CV.SuperLearner(sl2)$Table
145 result_CVcedaw[order(result_CVcedaw$Ave), ]
146
147 set.seed(1)
148 sl1 = CV.SuperLearner(Y = Y1, X = X1,
149 family = binomial(), SL.library = SL.library,
150 method = "method.NNLS", id = id1, verbose = TRUE,
151 control = list(saveFitLibrary = TRUE, trimLogit = 1e−04),
152 cvControl = list(V = 5L, shuffle = TRUE),
153 parallel = "multicore")
154 plot.CV.SuperLearner(sl1)

155 result_CViccpr <− summary.CV.SuperLearner(sl1)$Table
156 result_CViccpr[order(result_CViccpr$Ave), ]
157
158 save.image("xgboost-tuning-updated.RData")

D.2 R code for comparing predictive algorithms

1 options(digits = 4)
2 options(dplyr.width = Inf)
3 options("scipen" = 5)
4 rm(list = ls())
5 cat("\014")
6
7 # Load packages
8 library(dplyr) # Load dplyr to process data
9 library(ggplot2) # visualize data
10 library(ggthemes) # use various themes in ggplot
11 library(SuperLearner) # use Super Learner predictive method
12 library(gam) # algorithm used within TMLE
13 library(glmnet) # algorithm used within TMLE
14 library(randomForest) # algorithm used within TMLE
15 library(xgboost) # algorithm for XGBoost
16 library(xtable)
17 library(Amelia)
18 library(foreach) # do parallel loop
19 library(doParallel) # do parallel loop
20 library(RhpcBLASctl) #multicore
21
22 # Setup parallel computation − use all cores on our computer.
23 num_cores = RhpcBLASctl::get_num_cores()
24
25 # Use all of those cores for parallel SuperLearner.
26 options(mc.cores = num_cores)
27
28 # Check how many parallel workers we are using:
29 getOption("mc.cores")
30
31 # We need to set a different type of seed that works across cores.
32 set.seed(1, "L'Ecuyer-CMRG")
33
34 # Create function rescaling outcome into 0−1
35 std <− function(x) {
36 x = (x − min(x))/(max(x) − min(x))
37 }
38

39 # Read the first imputed data set and process data
40 data <− read.csv("midata1.csv")
41 datatuning <− data %>%
42 mutate(legor = ifelse(legor == 1, 1, 0),
43 ratifrule = std(ratifrule),
44 iccpr_glbavg = std(iccpr_glbavg),

45 cedaw_glbavg = std(cedaw_glbavg),
46 cat_glbavg = std(cat_glbavg),
47 iccpr_regavg = std(iccpr_regavg),
48 cedaw_regavg = std(cedaw_regavg),
49 cat_regavg = std(cat_regavg),
50 ji = std(ji), population = std(population),
51 gdppc = std(gdppc), trade = std(trade), oda = std(oda),
52 parties = ifelse(parties < 2, 0, 1),
53 hrs = std(hrs),
54 wpol = std(wpol),
55 torture = std(torture)) %>%
56 group_by(cow) %>%
57 mutate(laghrs = lag(hrs, 1),
58 lagwpol = lag(wpol, 1),
59 lagtorture = lag(torture, 1),
60 lagiccpr = lag(iccpr, 1),
61 lagcedaw = lag(cedaw, 1),
62 lagcat = lag(cat, 1))
63

64 # Use 1967−2013 because ICCPR opened in 12/1966
65 # HRP scores 1948−2013
66 # Total 7,870 obs across 192 countries over 47 years
67 datatuning_iccpr <− datatuning %>%
68 dplyr::select(c(iccpr_glbavg, iccpr_regavg,
69 population, gdppc, trade, oda, ji,
70 democracy, parties, transition, dispute,
71 legor, ratifrule,
72 lagiccpr, laghrs,
73 iccpr,
74 year, cow)) %>%
75 filter(year >= 1967) %>% na.omit()
76 Y1 <− datatuning_iccpr$iccpr
77 X1 <− data.frame(datatuning_iccpr[, 1:15])
78 id1 <− factor(datatuning_iccpr$cow)
79
80 # Tuning XGB
81 XGB_iccpr1 = create.Learner("SL.xgboost",
82 tune = list(ntrees = 500, max_depth = 4, shrinkage = 0.01),
83 detailed_names = T, name_prefix = "XGB_iccpr1")
84 XGB_iccpr2 = create.Learner("SL.xgboost",
85 tune = list(ntrees = 500, max_depth = 5, shrinkage = 0.01),
86 detailed_names = T, name_prefix = "XGB_iccpr2")
87 XGB_iccpr3 = create.Learner("SL.xgboost",
88 tune = list(ntrees = 1000, max_depth = 4, shrinkage = 0.01),
89 detailed_names = T, name_prefix = "XGB_iccpr3")
90 XGB_iccpr4 = create.Learner("SL.xgboost",
91 tune = list(ntrees = 500, max_depth = 6, shrinkage = 0.01),
92 detailed_names = T, name_prefix = "XGB_iccpr4")
93 XGB_iccpr5 = create.Learner("SL.xgboost",
94 tune = list(ntrees = 200, max_depth = 4, shrinkage = 0.05),
95 detailed_names = T, name_prefix = "XGB_iccpr5")
96
97 # Create Super Learner library

98 SL.library_iccpr <− c("SL.glm", "SL.glmnet",
99 "SL.gam", "SL.polymars",
100 "SL.randomForest", "SL.gbm", "SL.xgboost",
101 XGB_iccpr1$names, XGB_iccpr2$names,
102 XGB_iccpr3$names, XGB_iccpr4$names,
103 XGB_iccpr5$names)
104

105 set.seed(1)
106 sl1full = CV.SuperLearner(Y = Y1, X = X1,
107 family = binomial(), SL.library = SL.library_iccpr,
108 method = "method.NNloglik", id = id1, verbose = TRUE,
109 control = list(saveFitLibrary = TRUE, trimLogit = 1e−04),
110 cvControl = list(V = 5L, shuffle = TRUE),
111 parallel = "multicore")
112 plot.CV.SuperLearner(sl1full)
113 result_CViccprfull <− summary.CV.SuperLearner(sl1full)$Table
114 tuning_iccpr <− result_CViccprfull[order(result_CViccprfull$Ave), ]
115 xtable(tuning_iccpr, digits = rep(5, 6))
116

117 save.image("tuning-full.RData")

D.3 R code for variable importance analysis and theory testing

1 #########################################
2 # Estimating Causal Effects of Binary Variables Using SL−based TMLE
3 #########################################
4 options(digits = 4)
5 options(dplyr.width = Inf)
6 rm(list = ls())
7 cat("\014")
8 options("scipen" = 100, "digits" = 4)
9
10 # Load packages
11 library(dplyr) # Manage data
12 library(ggplot2) # visualize data
13 library(ggthemes) # use various themes in ggplot
14 library(SuperLearner) # use Super Learner predictive method
15 library(tmle) # use TMLE method
16 library(gam) # algorithm used within TMLE
17 library(glmnet) # algorithm used within TMLE
18 library(randomForest) # algorithm used within TMLE
19 library(polspline) # algorithm used within TMLE
20 library(xgboost) # algorithm used within TMLE
21 library(xtable) # create LaTeX tables
22 library(Amelia) # combine estimates from multiple imputation
23 library(RhpcBLASctl) #multicore
24 library(parallel) # parallel computing
25
26 # Tuning XGB
27 XGB_cat = create.Learner("SL.xgboost",
28 tune = list(ntrees = 500, max_depth = 4, shrinkage = 0.01),

29 detailed_names = T, name_prefix = "XGB_cat")
30
31 # Create Super Learner library
32 SL.library <− c("SL.glmnet", "SL.gam", "SL.polymars",
33 "SL.randomForest", XGB_cat$names)
34
35 # Set multicore compatible seed.
36 set.seed(1, "L'Ecuyer-CMRG")
37
38 # Setup parallel computation − use all cores on our computer.
39 num_cores = RhpcBLASctl::get_num_cores()
40
41 # Use all of those cores for parallel SuperLearner.
42 options(mc.cores = num_cores)
43
44 # Check how many parallel workers we are using:
45 getOption("mc.cores")
46
47 # Create function rescaling outcome into 0−1
48 std <− function(x) {
49 x = (x − min(x))/(max(x) − min(x))
50 }
51
52 ###############################
53 # Read stacked data sets and process data
54 data <− read.csv("stackeddata.csv")
55 d <− split(data, rep(1:5, each = nrow(data)/5))
56 iccpr_bin <− data.frame(matrix(NA, nrow = 10, ncol = 4))
57 cedaw_bin <− data.frame(matrix(NA, nrow = 10, ncol = 4))
58 cat_bin <− data.frame(matrix(NA, nrow = 10, ncol = 4))
59

60 for (m in 1:5) {
61
62 # Create holders for TMLE estimates for each imputed dataset
63 tmle_bin_iccpr <− data.frame(matrix(NA, nrow = 2, ncol = 4))
64 tmle_bin_cat <− data.frame(matrix(NA, nrow = 2, ncol = 4))
65 tmle_bin_cedaw <− data.frame(matrix(NA, nrow = 2, ncol = 4))
66
67 # Transform variables
68 d[[m]] <− d[[m]] %>%
69 mutate(legor = ifelse(legor == 1, 1, 0), ratifrule = std(ratifrule),
70 iccpr_glbavg = std(iccpr_glbavg),
71 cedaw_glbavg = std(cedaw_glbavg),
72 cat_glbavg = std(cat_glbavg),
73 iccpr_regavg = std(iccpr_regavg),
74 cedaw_regavg = std(cedaw_regavg),
75 cat_regavg = std(cat_regavg),
76 ji = std(ji), population = std(population),
77 gdppc = std(gdppc), trade = std(trade), oda = std(oda),
78 parties = ifelse(parties < 2, 0, 1),
79 hrs = std(hrs), wpol = std(wpol), torture = std(torture)) %>%
80 group_by(cow) %>%
81 mutate(laghrs = lag(hrs), lagwpol = lag(wpol), lagtorture = lag(torture),

82 lagiccpr = lag(iccpr), lagcedaw = lag(cedaw), lagcat = lag(cat))
83
84 # Create a dataset for each treaty ICCPR 1966, CEDAW 1979, CAT 1984
85 # Use 1967−2013 because ICCPR opened in 12/1966 and HRP scores 1948−2013
86 # Use 1982−2013 because CIRI wpol starts at 1981
87 # Use 1985−2013 because CAT opened in 12/1984
88 data_iccpr <− data.frame(d[[m]]) %>%
89 filter(year >= 1967) %>% na.omit()
90 data_cedaw <− data.frame(d[[m]]) %>%
91 filter(year >= 1982) %>% na.omit()
92 data_cat <− data.frame(d[[m]]) %>%
93 filter(year >= 1985) %>% na.omit()
94

95 # Create ICCPR ratification dataset
96 data_iccpr <− data_iccpr %>%
97 dplyr::select(c(democracy, parties, transition, dispute,
98 iccpr_glbavg, iccpr_regavg,
99 population, gdppc, trade, oda, ji,
100 legor, ratifrule,
101 lagiccpr, laghrs,
102 iccpr, cow))
103 # Model ICCPR ratification
104 for (i in 1:4) {
105 # Identify model variables
106 id <− factor(data_iccpr$cow)
107 Y <− data_iccpr$iccpr
108 A <− data_iccpr[, i]
109 W <− data.frame(dplyr::select(data_iccpr, −c(i, 16, 17)))
110 tmle_iccpr <− tmle(Y, A, W,
111 Qbounds = c(0, 1), Q.SL.library = SL.library,
112 gbound = 1e−4, g.SL.library = SL.library,
113 family = "binomial", fluctuation = "logistic",
114 id = id, verbose = TRUE)
115 tmle_bin_iccpr[1:2, i] <− c(tmle_iccpr$estimates$ATE$psi,
116 sqrt(tmle_iccpr$estimates$ATE$var.psi))
117 print(c(m, "ICCPR", i))
118 }
119
120 # Create CEDAW ratification dataset
121 data_cedaw <− data_cedaw %>%
122 dplyr::select(c(democracy, parties, transition, dispute,
123 cedaw_glbavg, cedaw_regavg,
124 ji, population, gdppc, trade, oda,
125 legor, ratifrule,
126 lagcedaw, lagwpol,
127 cedaw, cow))
128 # Model CEDAW ratification
129 for (i in 1:4) {
130 # Identify model variables
131 id <− factor(data_cedaw$cow)
132 Y = data_cedaw$cedaw
133 A = data_cedaw[, i]
134 W = data.frame(dplyr::select(data_cedaw, −c(i, 16, 17)))

135 tmle_cedaw <− tmle(Y, A, W,
136 Qbounds = c(0, 1), Q.SL.library = SL.library,
137 gbound = 1e−4, g.SL.library = SL.library,
138 family = "binomial", fluctuation = "logistic",
139 id = id, verbose = TRUE)
140 tmle_bin_cedaw[1:2, i] <− c(tmle_cedaw$estimates$ATE$psi,
141 sqrt(tmle_cedaw$estimates$ATE$var.psi))
142 print(c(m, "CEDAW", i))
143 }
144
145 # Create CAT ratification dataset
146 data_cat <− data_cat %>%
147 dplyr::select(c(democracy, parties, transition, dispute,
148 cat_glbavg, cat_regavg,
149 ji, population, gdppc, trade, oda,
150 legor, ratifrule,
151 lagcat, lagtorture,
152 cat, cow))
153 # Model CAT ratification
154 for (i in 1:4) {
155 # Identify model variables
156 id <− factor(data_cat$cow)
157 Y = data_cat$cat
158 A = data_cat[, i]
159 W = data.frame(dplyr::select(data_cat, −c(i, 16, 17)))
160 tmle_cat <− tmle(Y, A, W,
161 Qbounds = c(0, 1), Q.SL.library = SL.library,
162 gbound = 1e−4, g.SL.library = SL.library,
163 family = "binomial", fluctuation = "logistic",
164 id = id, verbose = TRUE)
165 tmle_bin_cat[1:2, i] <− c(tmle_cat$estimates$ATE$psi,
166 sqrt(tmle_cat$estimates$ATE$var.psi))
167 print(c(m, "CAT", i))
168 }
169
170 iccpr_bin[(2∗m − 1):(2∗m), ] <− tmle_bin_iccpr
171 cedaw_bin[(2∗m − 1):(2∗m), ] <− tmle_bin_cedaw
172 cat_bin[(2∗m − 1):(2∗m), ] <− tmle_bin_cat
173 }
174
175 # Combine TMLE estimates from 5 imputed datasets
176 variables <− c("Democracy", "Multiple parties",
177 "Transition", "Militarized disputes")
178 vim_bin_iccpr <− data.frame(mi.meld(q = iccpr_bin[c(1, 3, 5, 7, 9), ],
179 se = iccpr_bin[c(2, 4, 6, 8, 10), ],
180 byrow = TRUE))
181 result_iccpr <− data.frame(cbind(t(vim_bin_iccpr[, 1:4]),
182 t(vim_bin_iccpr[, 5:8]))) %>%
183 mutate(variables = variables, mean = X1, sd = X2,
184 lower = X1 − 1.96∗X2, upper = X1 + 1.96∗X2) %>%
185 dplyr::select(variables, mean, lower, upper)
186
187 vim_bin_cedaw <− data.frame(mi.meld(q = cedaw_bin[c(1, 3, 5, 7, 9), ],

188 se = cedaw_bin[c(2, 4, 6, 8, 10), ],
189 byrow = TRUE))
190 result_cedaw <− data.frame(cbind(t(vim_bin_cedaw[, 1:4]),
191 t(vim_bin_cedaw[, 5:8]))) %>%
192 mutate(variables = variables, mean = X1, sd = X2,
193 lower = X1 − 1.96∗X2, upper = X1 + 1.96∗X2) %>%
194 dplyr::select(variables, mean, lower, upper)
195
196 vim_bin_cat <− data.frame(mi.meld(q = cat_bin[c(1, 3, 5, 7, 9), ],
197 se = cat_bin[c(2, 4, 6, 8, 10), ],
198 byrow = TRUE))
199 result_cat <− data.frame(cbind(t(vim_bin_cat[, 1:4]),
200 t(vim_bin_cat[, 5:8]))) %>%
201 mutate(variables = variables, mean = X1, sd = X2,
202 lower = X1 − 1.96∗X2, upper = X1 + 1.96∗X2) %>%
203 dplyr::select(variables, mean, lower, upper)
204
205 effect <− data.frame(rbind(result_iccpr, result_cedaw, result_cat))
206 xtable(effect, digits = c(rep(3, 5)))
207
208 save.image("CVIA-bin-TMLE.RData")
209
210 ################################################
211 # Estimating Causal Effects of Continuous Variables Using SL−based Substitution
212 ################################################
213 options(digits = 4)
214 options(dplyr.width = Inf)
215 rm(list = ls())
216 cat("\014")
217
218 # Load packages
219 library(dplyr) # Manage data
220 library(ggplot2) # visualize data
221 library(ggthemes) # use various themes in ggplot
222 library(SuperLearner) # use Super Learner predictive method
223 library(tmle) # use TMLE method
224 library(gam) # algorithm used within TMLE
225 library(glmnet) # algorithm used within TMLE
226 library(randomForest) # algorithm used within TMLE
227 library(polspline) # algorithm used within TMLE
228 library(xgboost) # algorithm used within TMLE
229 library(xtable) # create LaTeX tables
230 library(Amelia) # combine estimates from multiple imputation
231 library(RhpcBLASctl) #multicore
232 library(doParallel) # parallel computing
233 library(foreach) # parallel computing
234
235 # Tuning XGB
236 XGB = create.Learner("SL.xgboost",
237 tune = list(ntrees = 500, max_depth = 4, shrinkage = 0.01),
238 detailed_names = T, name_prefix = "XGB_cat")
239
240 # Create Super Learner library

241 SL.library <− c("SL.glmnet", "SL.gam", XGB$names)
242
243 # Set multicore compatible seed.
244 set.seed(1, "L'Ecuyer-CMRG")
245
246 # Setup parallel computation − use all cores on our computer.
247 num_cores = RhpcBLASctl::get_num_cores()
248
249 # Use all of those cores for parallel SuperLearner.
250 options(mc.cores = num_cores)
251
252 # Check how many parallel workers we are using:
253 getOption("mc.cores")
254
255 # Create function rescaling outcome into 0−1
256 std <− function(x) {
257 x = (x − min(x))/(max(x) − min(x))
258 }
259

260 # Read the first imputed data set and process data
data <- read.csv("midata1.csv")
data <- data %>%
  mutate(legor = ifelse(legor == 1, 1, 0), ratifrule = std(ratifrule),
         iccpr_glbavg = std(iccpr_glbavg),
         cedaw_glbavg = std(cedaw_glbavg),
         cat_glbavg = std(cat_glbavg),
         iccpr_regavg = std(iccpr_regavg),
         cedaw_regavg = std(cedaw_regavg),
         cat_regavg = std(cat_regavg),
         ji = std(ji), population = std(population),
         gdppc = std(gdppc), trade = std(trade), oda = std(oda),
         parties = ifelse(parties < 2, 0, 1),
         hrs = std(hrs), wpol = std(wpol), torture = std(torture)) %>%
  group_by(cow) %>%
  mutate(laghrs = lag(hrs, 1), lagwpol = lag(wpol, 1), lagtorture = lag(torture, 1),
         lagiccpr = lag(iccpr, 1), lagcedaw = lag(cedaw, 1), lagcat = lag(cat, 1))

# Create a dataset for each treaty: ICCPR 1966, CEDAW 1979, CAT 1984
# Use 1966-2013 because the ICCPR opened in 1966 and HRP scores cover 1948-2013
# Use 1982-2012 because CIRI wpol starts in 1981 and stops in 2011
# Use 1984-2012 because the CAT opened in 1984 and CIRI torture stops in 2011
# Count the number of observations in each dataset
data_iccpr <- data %>%
  filter(year >= 1967) %>% na.omit()
n_iccpr <- nrow(data_iccpr)
data_cedaw <- data %>%
  filter(year >= 1982) %>% na.omit()
n_cedaw <- nrow(data_cedaw)
data_cat <- data %>%
  filter(year >= 1985) %>% na.omit()
n_cat <- nrow(data_cat)
# Set multicore-compatible seed.

set.seed(1, "L'Ecuyer-CMRG")

# Set up parallel computation: use all cores on the machine.
num_cores <- RhpcBLASctl::get_num_cores()

# Use all of those cores for parallel SuperLearner.
options(mc.cores = num_cores)

# Check how many parallel workers we are using:
getOption("mc.cores")
# For bootstrap-based inference, use stochastic imputation with one imputed dataset
# Take quantiles of the bootstrap distribution for the CI; no normality assumption needed
cl <- makeCluster(2)
registerDoParallel(cl)
B <- 500
psi_boot <- data.frame(matrix(NA, nrow = B, ncol = 21))

# Note: %do% evaluates the bootstrap loop sequentially so that psi_boot can be
# filled in by side effect; parallelism happens inside mcSuperLearner.
foreach(b = 1:B, .packages = c("dplyr", "xgboost", "glmnet", "SuperLearner"),
        .verbose = TRUE) %do% {
  bootIndices_iccpr <- sample(1:n_iccpr, replace = TRUE)
  bootIndices_cedaw <- sample(1:n_cedaw, replace = TRUE)
  bootIndices_cat <- sample(1:n_cat, replace = TRUE)
  bootData_iccpr <- data_iccpr[bootIndices_iccpr, ]
  bootData_cedaw <- data_cedaw[bootIndices_cedaw, ]
  bootData_cat <- data_cat[bootIndices_cat, ]

  # Create the ICCPR ratification resample dataset
  bootData_iccpr <- bootData_iccpr %>%
    dplyr::select(c(iccpr_glbavg, iccpr_regavg,
                    population, gdppc, trade, oda, ji,
                    democracy, parties, transition, dispute,
                    legor, ratifrule,
                    lagiccpr, laghrs,
                    iccpr, cow))
  bootData_iccpr <- data.frame(bootData_iccpr)
  niccpr <- nrow(bootData_iccpr)
  # Model ICCPR ratification using the resampled dataset
  psi_iccpr <- data.frame(matrix(NA, nrow = 1, ncol = 7))
  for (i in 1:7) {
    id <- factor(bootData_iccpr$cow)
    Y <- bootData_iccpr$iccpr
    X <- data.frame(bootData_iccpr[, 1:15])
    # G-computation: set covariate i to 1 and to 0, predict, and average the difference
    X1 <- X0 <- X
    X1[, i] <- 1
    X0[, i] <- 0
    newdata <- rbind(X, X1, X0)
    Q_iccpr <- mcSuperLearner(Y = Y, X = X, newX = newdata,
                              SL.library = SL.library,
                              family = "binomial",
                              method = "method.NNloglik",
                              cvControl = list(V = 5L),
                              verbose = TRUE)
    predX1 <- Q_iccpr$SL.predict[(niccpr + 1):(2 * niccpr)]
    predX0 <- Q_iccpr$SL.predict[(2 * niccpr + 1):(3 * niccpr)]
    psi_iccpr[, i] <- mean(predX1 - predX0)
    print(c(b, "ICCPR", i))
  }

  # Create the CEDAW ratification resample dataset
  bootData_cedaw <- bootData_cedaw %>%
    dplyr::select(c(cedaw_glbavg, cedaw_regavg,
                    population, gdppc, trade, oda, ji,
                    democracy, parties, transition, dispute,
                    legor, ratifrule,
                    lagcedaw, lagwpol,
                    cedaw, cow))
  bootData_cedaw <- data.frame(bootData_cedaw)
  ncedaw <- nrow(bootData_cedaw)
  # Model CEDAW ratification using the resampled dataset
  psi_cedaw <- data.frame(matrix(NA, nrow = 1, ncol = 7))
  for (i in 1:7) {
    id <- factor(bootData_cedaw$cow)
    Y <- bootData_cedaw$cedaw
    X <- data.frame(bootData_cedaw[, 1:15])
    X1 <- X0 <- X
    X1[, i] <- 1
    X0[, i] <- 0
    newdata <- rbind(X, X1, X0)
    Q_cedaw <- mcSuperLearner(Y = Y, X = X, newX = newdata,
                              SL.library = SL.library,
                              family = "binomial",
                              method = "method.NNloglik",
                              cvControl = list(V = 5L),
                              verbose = TRUE)
    predX1 <- Q_cedaw$SL.predict[(ncedaw + 1):(2 * ncedaw)]
    predX0 <- Q_cedaw$SL.predict[(2 * ncedaw + 1):(3 * ncedaw)]
    psi_cedaw[, i] <- mean(predX1 - predX0)
    print(c(b, "CEDAW", i))
  }
  # Create the CAT ratification resample dataset
  bootData_cat <- bootData_cat %>%
    dplyr::select(c(cat_glbavg, cat_regavg,
                    population, gdppc, trade, oda, ji,
                    democracy, parties, transition, dispute,
                    legor, ratifrule,
                    lagcat, lagtorture,
                    cat, cow))
  bootData_cat <- data.frame(bootData_cat)
  ncat <- nrow(bootData_cat)
  # Model CAT ratification using the resampled dataset
  psi_cat <- data.frame(matrix(NA, nrow = 1, ncol = 7))
  for (i in 1:7) {
    id <- factor(bootData_cat$cow)
    Y <- bootData_cat$cat
    X <- data.frame(bootData_cat[, 1:15])
    X1 <- X0 <- X
    X1[, i] <- 1
    X0[, i] <- 0
    newdata <- rbind(X, X1, X0)
    Q_cat <- mcSuperLearner(Y = Y, X = X, newX = newdata,
                            SL.library = SL.library,
                            family = "binomial",
                            method = "method.NNloglik",
                            cvControl = list(V = 5L),
                            verbose = TRUE)
    predX1 <- Q_cat$SL.predict[(ncat + 1):(2 * ncat)]
    predX0 <- Q_cat$SL.predict[(2 * ncat + 1):(3 * ncat)]
    psi_cat[, i] <- mean(predX1 - predX0)
    print(c(b, "CAT", i))
  }
  # Combine bootstrap estimates
  psi_boot[b, 1:21] <- cbind(psi_iccpr, psi_cedaw, psi_cat)
}
stopCluster(cl)
psi_boot2 <- psi_boot
psi_boot3 <- psi_boot

lower_quantile <- function(x) quantile(x, probs = 0.025)
upper_quantile <- function(x) quantile(x, probs = 0.975)
mean_boot <- apply(psi_boot, 2, mean)
lower_boot <- apply(psi_boot, 2, lower_quantile)
upper_boot <- apply(psi_boot, 2, upper_quantile)
# Collect effect estimates for the seven covariates, by treaty
variables <- c("Global rate", "Regional rate",
               "Population", "GDP per capita",
               "Trade", "Net ODA", "Judicial independence")

via_iccpr <- data.frame(cbind(mean = mean_boot[1:7],
                              lower = lower_boot[1:7],
                              upper = upper_boot[1:7]))
via_cedaw <- data.frame(cbind(mean = mean_boot[8:14],
                              lower = lower_boot[8:14],
                              upper = upper_boot[8:14]))
via_cat <- data.frame(cbind(mean = mean_boot[15:21],
                            lower = lower_boot[15:21],
                            upper = upper_boot[15:21]))
# Tabulate VIM results for all three treaty ratifications
effect <- rbind(via_iccpr, via_cedaw, via_cat) %>%
  mutate(treaty = rep(c("ICCPR", "CEDAW", "CAT"), each = 7),
         variables = rep(variables, 3)) %>%
  dplyr::select(c(treaty, variables, mean, lower, upper))
row.names(effect) <- NULL
colnames(effect) <- c("Treaty", "Covariate", "Mean", "Lower", "Upper")
xtable(effect, digits = rep(3, 6))

save.image("CVIA-continuous-SL.RData")

######################################
# Testing Theories of Treaty Ratification Using SL-based TMLE
######################################
options(digits = 4)
options(dplyr.width = Inf)
rm(list = ls())
cat("\014")
options("scipen" = 100, "digits" = 4)
# Load packages
library(dplyr)        # manage data
library(ggplot2)      # visualize data
library(ggthemes)     # use various themes in ggplot
library(SuperLearner) # use Super Learner predictive method
library(tmle)         # use TMLE method
library(gam)          # algorithm used within TMLE
library(glmnet)       # algorithm used within TMLE
library(randomForest) # algorithm used within TMLE
library(polspline)    # algorithm used within TMLE
library(xgboost)      # algorithm used within TMLE
library(xtable)       # create LaTeX tables
library(Amelia)       # combine estimates from multiple imputation
library(RhpcBLASctl)  # multicore
library(parallel)     # parallel computing
# Tune XGBoost
XGB <- create.Learner("SL.xgboost",
                      tune = list(ntrees = 500, max_depth = 4, shrinkage = 0.01),
                      detailed_names = TRUE, name_prefix = "XGB")

# Create Super Learner library
SL.library <- c("SL.glmnet", "SL.gam", "SL.polymars",
                "SL.randomForest", XGB$names)

# Set multicore-compatible seed.
set.seed(1, "L'Ecuyer-CMRG")

# Set up parallel computation: use all cores on the machine.
num_cores <- RhpcBLASctl::get_num_cores()

# Use all of those cores for parallel SuperLearner.
options(mc.cores = num_cores)

# Check how many parallel workers we are using:
getOption("mc.cores")

# Create a function rescaling a variable into [0, 1]
std <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

###############################
# Test the effect of Democracy on Ratification among Torture = 0 (frequent torture)
set.seed(0)
data <- read.csv("stackeddata.csv")
d <- split(data, rep(1:5, each = nrow(data)/5))
DemTor0 <- data.frame(matrix(NA, nrow = 2, ncol = 5))
for (m in 1:5) {
  print(c("Democracy on Ratification among Frequent Torture (0)", m))
  # Subset to country-years with frequent torture (lagtorture == 0) and transform variables
  data_dem0 <- d[[m]] %>%
    mutate(parties = ifelse(parties < 2, 0, 1),
           legor = ifelse(legor == 1, 1, 0), ratifrule = std(ratifrule),
           cat_glbavg = std(cat_glbavg), cat_regavg = std(cat_regavg),
           ji = std(ji), population = std(population),
           gdppc = std(gdppc), trade = std(trade), oda = std(oda)) %>%
    group_by(cow) %>% mutate(lagtorture = lag(torture),
                             lagcat = lag(cat)) %>%
    filter(year >= 1985, lagtorture == 0) %>% na.omit()
  # Identify model variables
  id <- factor(data_dem0$cow)
  Y <- data_dem0$cat
  A <- data_dem0$democracy
  W <- data_dem0 %>% dplyr::select(legor, ratifrule,
                                   lagcat,
                                   parties, transition, dispute,
                                   cat_glbavg, cat_regavg,
                                   population, gdppc, trade, oda, ji,
                                   cow)
  W <- data.frame(W) %>% dplyr::select(-c(cow))
  tmle_demtor0 <- tmle(Y, A, W,
                       Qbounds = c(0, 1), Q.SL.library = SL.library,
                       gbound = 1e-4, g.SL.library = SL.library,
                       family = "binomial", fluctuation = "logistic",
                       id = id, verbose = TRUE)
  DemTor0[1:2, m] <- c(tmle_demtor0$estimates$ATE$psi,
                       sqrt(tmle_demtor0$estimates$ATE$var.psi))
}
# Combine the DemTor0 estimates across the five imputed datasets
demtor0_comest <- data.frame(mi.meld(q = DemTor0[1, ], se = DemTor0[2, ],
                                     byrow = FALSE))
###############################
# Test the effect of Democracy on Ratification among Torture = 1 (occasional torture)
set.seed(1)
data <- read.csv("stackeddata.csv")
d <- split(data, rep(1:5, each = nrow(data)/5))
DemTor1 <- data.frame(matrix(NA, nrow = 2, ncol = 5))
for (m in 1:5) {
  print(c("Democracy on Ratification among Occasional Torture (1)", m))
  # Subset to country-years with occasional torture (lagtorture == 1) and transform variables
  data_dem1 <- d[[m]] %>%
    mutate(parties = ifelse(parties < 2, 0, 1),
           legor = ifelse(legor == 1, 1, 0), ratifrule = std(ratifrule),
           cat_glbavg = std(cat_glbavg), cat_regavg = std(cat_regavg),
           ji = std(ji), population = std(population),
           gdppc = std(gdppc), trade = std(trade), oda = std(oda)) %>%
    group_by(cow) %>% mutate(lagtorture = lag(torture),
                             lagcat = lag(cat)) %>%
    filter(year >= 1985, lagtorture == 1) %>% na.omit()
  # Identify model variables
  id <- factor(data_dem1$cow)
  Y <- data_dem1$cat
  A <- data_dem1$democracy
  W <- data_dem1 %>% dplyr::select(legor, ratifrule,
                                   lagcat,
                                   parties, transition, dispute,
                                   cat_glbavg, cat_regavg,
                                   population, gdppc, trade, oda, ji,
                                   cow)
  W <- data.frame(W) %>% dplyr::select(-c(cow))
  tmle_demtor1 <- tmle(Y, A, W,
                       Qbounds = c(0, 1), Q.SL.library = SL.library,
                       gbound = 1e-4, g.SL.library = SL.library,
                       family = "binomial", fluctuation = "logistic",
                       id = id, verbose = TRUE)
  DemTor1[1:2, m] <- c(tmle_demtor1$estimates$ATE$psi,
                       sqrt(tmle_demtor1$estimates$ATE$var.psi))
}
# Combine the DemTor1 estimates across the five imputed datasets
demtor1_comest <- data.frame(mi.meld(q = DemTor1[1, ], se = DemTor1[2, ],
                                     byrow = FALSE))
###############################
# Test the effect of Democracy on Ratification among Torture = 2 (no torture)
set.seed(2)
data <- read.csv("stackeddata.csv")
d <- split(data, rep(1:5, each = nrow(data)/5))
DemTor2 <- data.frame(matrix(NA, nrow = 2, ncol = 5))

for (m in 1:5) {
  print(c("Democracy on Ratification among No Torture (2)", m))
  # Subset to country-years with no torture (lagtorture == 2) and transform variables
  data_dem2 <- d[[m]] %>%
    mutate(parties = ifelse(parties < 2, 0, 1),
           legor = ifelse(legor == 1, 1, 0), ratifrule = std(ratifrule),
           cat_glbavg = std(cat_glbavg), cat_regavg = std(cat_regavg),
           ji = std(ji), population = std(population),
           gdppc = std(gdppc), trade = std(trade), oda = std(oda)) %>%
    group_by(cow) %>% mutate(lagtorture = lag(torture),
                             lagcat = lag(cat)) %>%
    filter(year >= 1985, lagtorture == 2) %>% na.omit()
  # Identify model variables
  id <- factor(data_dem2$cow)
  Y <- data_dem2$cat
  A <- data_dem2$democracy
  W <- data_dem2 %>% dplyr::select(legor, ratifrule,
                                   lagcat,
                                   parties, transition, dispute,
                                   cat_glbavg, cat_regavg,
                                   population, gdppc, trade, oda, ji,
                                   cow)
  W <- data.frame(W) %>% dplyr::select(-c(cow))
  tmle_demtor2 <- tmle(Y, A, W,
                       Qbounds = c(0, 1), Q.SL.library = SL.library,
                       gbound = 1e-4, g.SL.library = SL.library,
                       family = "binomial", fluctuation = "logistic",
                       id = id, verbose = TRUE)
  DemTor2[1:2, m] <- c(tmle_demtor2$estimates$ATE$psi,
                       sqrt(tmle_demtor2$estimates$ATE$var.psi))
}
# Combine the DemTor2 estimates across the five imputed datasets
demtor2_comest <- data.frame(mi.meld(q = DemTor2[1, ], se = DemTor2[2, ],
                                     byrow = FALSE))
###############################
# Test the effect of Torture on Ratification among All Countries
# Contrast: occasional/frequent torture (1) vs. no torture (0)
set.seed(3)
data <- read.csv("stackeddata.csv")
d <- split(data, rep(1:5, each = nrow(data)/5))
TorAll <- data.frame(matrix(NA, nrow = 2, ncol = 5))
for (m in 1:5) {
  print(c("Torture on Ratification among All", m))
  # Transform variables (torture dichotomized: 1 = occasional/frequent, 0 = none;
  # parties dichotomized at 2)
  data_cat <- d[[m]] %>%
    mutate(torture = ifelse(torture < 2, 1, 0),
           parties = ifelse(parties < 2, 0, 1),
           legor = ifelse(legor == 1, 1, 0), ratifrule = std(ratifrule),
           cat_glbavg = std(cat_glbavg), cat_regavg = std(cat_regavg),
           ji = std(ji), population = std(population),
           gdppc = std(gdppc), trade = std(trade), oda = std(oda)) %>%
    group_by(cow) %>% mutate(lagtorture = lag(torture),
                             lagcat = lag(cat),
                             lagdemocracy = lag(democracy),
                             lagparties = lag(parties),
                             lagtransition = lag(transition),
                             lagdispute = lag(dispute),
                             lagcat_glbavg = lag(cat_glbavg),
                             lagcat_regavg = lag(cat_regavg),
                             lagji = lag(ji),
                             lagpopulation = lag(population),
                             laggdppc = lag(gdppc),
                             lagtrade = lag(trade),
                             lagoda = lag(oda)) %>%
    filter(year >= 1985) %>% na.omit()

  # Identify model variables
  id <- factor(data_cat$cow)
  Y <- data_cat$cat
  A <- data_cat$lagtorture
  W <- data_cat %>% dplyr::select(legor, ratifrule,
                                  lagcat,
                                  lagdemocracy, lagparties, lagtransition, lagdispute,
                                  lagcat_glbavg, lagcat_regavg,
                                  lagpopulation, laggdppc, lagtrade, lagoda, lagji,
                                  cow)
  W <- data.frame(W) %>% dplyr::select(-c(cow))
  tmle_torall <- tmle(Y, A, W,
                      Qbounds = c(0, 1), Q.SL.library = SL.library,
                      gbound = 1e-4, g.SL.library = SL.library,
                      family = "binomial", fluctuation = "logistic",
                      id = id, verbose = TRUE)
  TorAll[1:2, m] <- c(tmle_torall$estimates$ATE$psi,
                      sqrt(tmle_torall$estimates$ATE$var.psi))
}
# Combine the TorAll estimates across the five imputed datasets
torall_comest <- data.frame(mi.meld(q = TorAll[1, ], se = TorAll[2, ],
                                    byrow = FALSE))

###############################
# Test the effect of Torture on Ratification among Democracies
# Contrast: occasional/frequent torture (1) vs. no torture (0)
set.seed(4)
data <- read.csv("stackeddata.csv")
d <- split(data, rep(1:5, each = nrow(data)/5))
TorDemo <- data.frame(matrix(NA, nrow = 2, ncol = 5))
for (m in 1:5) {
  print(c("Torture on Ratification among Democracies", m))
  # Subset to democracies (lagdemocracy == 1) and transform variables
  # (torture dichotomized: 1 = occasional/frequent, 0 = none; parties dichotomized at 2)
  data_tordemo <- d[[m]] %>%
    mutate(torture = ifelse(torture < 2, 1, 0),
           parties = ifelse(parties < 2, 0, 1),
           legor = ifelse(legor == 1, 1, 0), ratifrule = std(ratifrule),
           cat_glbavg = std(cat_glbavg), cat_regavg = std(cat_regavg),
           ji = std(ji), population = std(population),
           gdppc = std(gdppc), trade = std(trade), oda = std(oda)) %>%
    group_by(cow) %>% mutate(lagtorture = lag(torture),
                             lagcat = lag(cat),
                             lagdemocracy = lag(democracy),
                             lagparties = lag(parties),
                             lagtransition = lag(transition),
                             lagdispute = lag(dispute),
                             lagcat_glbavg = lag(cat_glbavg),
                             lagcat_regavg = lag(cat_regavg),
                             lagji = lag(ji),
                             lagpopulation = lag(population),
                             laggdppc = lag(gdppc),
                             lagtrade = lag(trade),
                             lagoda = lag(oda)) %>%
    filter(year >= 1985, lagdemocracy == 1) %>% na.omit()

  # Identify model variables
  id <- factor(data_tordemo$cow)
  Y <- data_tordemo$cat
  A <- data_tordemo$lagtorture
  W <- data_tordemo %>% dplyr::select(legor, ratifrule,
                                      lagcat,
                                      lagparties, lagtransition, lagdispute,
                                      lagcat_glbavg, lagcat_regavg,
                                      lagpopulation, laggdppc, lagtrade, lagoda, lagji,
                                      cow)
  W <- data.frame(W) %>% dplyr::select(-c(cow))
  tmle_tordemo <- tmle(Y, A, W,
                       Qbounds = c(0, 1), Q.SL.library = SL.library,
                       gbound = 1e-4, g.SL.library = SL.library,
                       family = "binomial", fluctuation = "logistic",
                       id = id, verbose = TRUE)
  TorDemo[1:2, m] <- c(tmle_tordemo$estimates$ATE$psi,
                       sqrt(tmle_tordemo$estimates$ATE$var.psi))
}

# Combine the TorDemo estimates across the five imputed datasets
tordemo_comest <- data.frame(mi.meld(q = TorDemo[1, ], se = TorDemo[2, ],
                                     byrow = FALSE))
###############################
# Test the effect of Torture on Ratification among Dictatorships
# Contrast: occasional/frequent torture (1) vs. no torture (0)
set.seed(5)
data <- read.csv("stackeddata.csv")
d <- split(data, rep(1:5, each = nrow(data)/5))
TorAuto <- data.frame(matrix(NA, nrow = 2, ncol = 5))

for (m in 1:5) {
  print(c("Torture on Ratification among Dictatorships", m))
  # Subset to dictatorships (lagdemocracy == 0) and transform variables
  # (torture dichotomized: 1 = occasional/frequent, 0 = none; parties dichotomized at 2)
  data_torauto <- d[[m]] %>%
    mutate(torture = ifelse(torture < 2, 1, 0),
           parties = ifelse(parties < 2, 0, 1),
           legor = ifelse(legor == 1, 1, 0), ratifrule = std(ratifrule),
           cat_glbavg = std(cat_glbavg), cat_regavg = std(cat_regavg),
           ji = std(ji), population = std(population),
           gdppc = std(gdppc), trade = std(trade), oda = std(oda)) %>%
    group_by(cow) %>% mutate(lagtorture = lag(torture),
                             lagcat = lag(cat),
                             lagdemocracy = lag(democracy),
                             lagparties = lag(parties),
                             lagtransition = lag(transition),
                             lagdispute = lag(dispute),
                             lagcat_glbavg = lag(cat_glbavg),
                             lagcat_regavg = lag(cat_regavg),
                             lagji = lag(ji),
                             lagpopulation = lag(population),
                             laggdppc = lag(gdppc),
                             lagtrade = lag(trade),
                             lagoda = lag(oda)) %>%
    filter(year >= 1985, lagdemocracy == 0) %>% na.omit()

  # Identify model variables
  id <- factor(data_torauto$cow)
  Y <- data_torauto$cat
  A <- data_torauto$lagtorture
  W <- data_torauto %>% dplyr::select(legor, ratifrule,
                                      lagcat,
                                      lagparties, lagtransition, lagdispute,
                                      lagcat_glbavg, lagcat_regavg,
                                      lagpopulation, laggdppc, lagtrade, lagoda, lagji,
                                      cow)
  W <- data.frame(W) %>% dplyr::select(-c(cow))
  tmle_torauto <- tmle(Y, A, W,
                       Qbounds = c(0, 1), Q.SL.library = SL.library,
                       gbound = 1e-4, g.SL.library = SL.library,
                       family = "binomial", fluctuation = "logistic",
                       id = id, verbose = TRUE)
  TorAuto[1:2, m] <- c(tmle_torauto$estimates$ATE$psi,
                       sqrt(tmle_torauto$estimates$ATE$var.psi))
}
# Combine the TorAuto estimates across the five imputed datasets
torauto_comest <- data.frame(mi.meld(q = TorAuto[1, ], se = TorAuto[2, ],
                                     byrow = FALSE))
#############################
# Test the effect of Multiple Parties on Ratification among Dictatorships
set.seed(6)
data <- read.csv("stackeddata.csv")
d <- split(data, rep(1:5, each = nrow(data)/5))
PartyDic <- data.frame(matrix(NA, nrow = 2, ncol = 5))
for (m in 1:5) {
  print(c("Multiple Parties on Ratification among Dictatorships", m))
  # Subset to dictatorships (democracy == 0) and transform variables
  data_cat <- d[[m]] %>%
    mutate(torture = std(torture),
           parties = ifelse(parties < 2, 0, 1),
           legor = ifelse(legor == 1, 1, 0), ratifrule = std(ratifrule),
           cat_glbavg = std(cat_glbavg), cat_regavg = std(cat_regavg),
           ji = std(ji), population = std(population),
           gdppc = std(gdppc), trade = std(trade), oda = std(oda)) %>%
    group_by(cow) %>% mutate(lagcat = lag(cat), lagtorture = lag(torture)) %>%
    filter(year >= 1985, democracy == 0) %>% na.omit()

  # Identify model variables
  Y <- data_cat$cat
  id <- factor(data_cat$cow)
  A <- data_cat$parties
  W <- data_cat %>% dplyr::select(legor, ratifrule,
                                  lagcat, lagtorture,
                                  transition, dispute,
                                  cat_glbavg, cat_regavg,
                                  population, gdppc, trade, oda, ji,
                                  cow)
  W <- data.frame(W) %>% dplyr::select(-c(cow))
  tmle_partydic <- tmle(Y, A, W,
                        Qbounds = c(0, 1), Q.SL.library = SL.library,
                        gbound = 1e-4, g.SL.library = SL.library,
                        family = "binomial", fluctuation = "logistic",
                        id = id, verbose = TRUE)
  PartyDic[1:2, m] <- c(tmle_partydic$estimates$ATE$psi,
                        sqrt(tmle_partydic$estimates$ATE$var.psi))
}
# Combine estimates of PartyDic effect
partydic_comest <- data.frame(mi.meld(q = PartyDic[1, ], se = PartyDic[2, ],
                                      byrow = FALSE))

#############################
# Assemble the theory-testing results into a table
effect <- data.frame(rbind(demtor0_comest, demtor1_comest, demtor2_comest,
                           torall_comest, torauto_comest, tordemo_comest,
                           partydic_comest))
effect <- cbind(Theory = c("Democracy w/ Torture (Frequent)",
                           "Democracy w/ Torture (Occasional)",
                           "Democracy w/ Torture (Never)",
                           "Torture among All",
                           "Torture among Dictatorships",
                           "Torture among Democracies",
                           "Parties among Dictatorships"),
                effect) %>%
  mutate(Lower = X1 - 1.96 * X2, Upper = X1 + 1.96 * X2) %>%
  rename(Mean = X1, SE = X2)
xtable(effect, digits = rep(3, 6))

save.image("TheoryTesting-TMLE.RData")
