Title: | Linear Model Weights |
---|---|
Description: | Computes the implied weights of linear regression models for estimating average causal effects and provides diagnostics based on these weights. These diagnostics rely on the analyses in Chattopadhyay and Zubizarreta (2023) <doi:10.1093/biomet/asac058> where several regression estimators are represented as weighting estimators, in connection to inverse probability weighting. 'lmw' provides tools to diagnose representativeness, balance, extrapolation, and influence for these models, clarifying the target population of inference. Tools are also available to simplify estimating treatment effects for specific target populations of interest. |
Authors: | Ambarish Chattopadhyay [aut] , Noah Greifer [aut, cre] , Jose Zubizarreta [aut] |
Maintainer: | Noah Greifer <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.0.2 |
Built: | 2024-11-04 04:19:18 UTC |
Source: | https://github.com/ngreifer/lmw |
lmw and lmw_est objects
influence() produces influence measures for lmw objects that can be used as regression diagnostics to identify influential cases. These functions produce similar outputs to lm.influence() but also include the sample influence curve (SIC) values, which combine information about the hat values, residuals, and implied regression weights.
## S3 method for class 'lmw'
influence(model, outcome, data = NULL, ...)
## S3 method for class 'lmw_est'
influence(model, ...)
model |
an |
outcome |
the name of the outcome variable. Can be supplied as a string
containing the name of the outcome variable or as the outcome variable
itself. If not supplied, the outcome variable in the |
data |
an optional data frame containing the outcome variable named in
|
... |
ignored. |
influence() computes the hat values, (weighted) residuals, and sample influence curve (SIC) values for each unit, which can be used as regression diagnostics to assess influence. The weighted residuals are weighted by the sampling weights (if supplied), not the implied regression weights. The SIC values are computed as SIC = (N-1) * w * r / (1 - h), where N is the sample size, w are the units' implied regression weights, r are the (weighted) residuals, and h are the hat values. SIC values are scaled to have a maximum of 1. Higher values indicate greater relative influence.
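The SIC formula can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: it uses the built-in mtcars data with am standing in for a binary treatment, builds URI-style implied weights from the regression's projection matrix, and assumes the scaling to a maximum of 1 is applied to absolute values.

```r
# Illustrative data: mtcars, with 'am' standing in for a binary treatment
d <- mtcars
fit <- lm(mpg ~ am + wt + hp, data = d)
X <- model.matrix(fit)
N <- nrow(X)

h <- hatvalues(fit)   # diagonal of the hat matrix
r <- residuals(fit)   # residuals (no sampling weights in this sketch)

# Assumed construction of the implied weights: the row of (X'X)^{-1} X'
# for the treatment coefficient, sign-flipped for control units
a <- solve(crossprod(X), t(X))[which(colnames(X) == "am"), ]
w <- ifelse(d$am == 1, a, -a)

# SIC = (N - 1) * w * r / (1 - h), then scaled to a maximum of 1
sic_raw <- (N - 1) * w * r / (1 - h)
sic <- abs(sic_raw) / max(abs(sic_raw))  # assumption: scaling uses absolute values

# Units with the largest relative influence
head(sort(sic, decreasing = TRUE))
```

In practice, influence() returns these quantities directly in the hat, wt.res, and sic components.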
A list with the following components:
hat |
a vector containing the diagonal of the hat matrix. |
wt.res |
a vector of (weighted) residuals. |
sic |
a vector containing the scaled SIC values. |
influence.lmw() uses non-standard evaluation to interpret its outcome argument. For programmers who wish to use influence.lmw() inside other functions, an effective way to pass the name of an arbitrary outcome (e.g., y passed as a string) is to use do.call(), for example:
fun <- function(m, y, d) {
  do.call("influence", list(m, y, d))
}
When using influence.lmw() inside lapply() or purrr::map to loop over outcomes, this syntax must be used as well.
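Why do.call() helps can be seen with a toy function. The sketch below is hypothetical and does not reproduce how influence.lmw() parses its argument; outcome_name() is an invented stand-in for any function that captures its argument unevaluated.

```r
# Hypothetical stand-in for a function that captures its argument unevaluated
outcome_name <- function(outcome) {
  expr <- substitute(outcome)  # the expression as written by the caller
  if (is.character(expr)) expr else deparse(expr)
}

# Direct forwarding: the inner function sees the symbol `y`, not its value
naive_wrapper <- function(y) outcome_name(y)

# do.call() evaluates `y` first, so the inner call is built with the string
docall_wrapper <- function(y) do.call("outcome_name", list(y))

naive_wrapper("re78")
docall_wrapper("re78")

# The same pattern when looping over several outcome names
vapply(c("re75", "re78"), docall_wrapper, character(1))
```

The naive wrapper loses the outcome name because the captured expression is the wrapper's own argument symbol, whereas do.call() constructs the inner call from already-evaluated values.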
plot.lmw() for plotting the SIC values; lm.influence() for influence measures for lm objects, which do not include SIC values; hatvalues() for hat values for lm objects (note that lmw_est objects also have a hatvalues() method).
data("lalonde")

# URI regression for ATT
lmw.out1 <- lmw(~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                estimand = "ATT", method = "URI",
                treat = "treat")

# Influence for re78 outcome
infl <- influence(lmw.out1, outcome = "re78")
str(infl)

# Can also be used after lmw_est():
lmw.est1 <- lmw_est(lmw.out1, outcome = "re78")
all.equal(infl, influence(lmw.est1))
This is a subsample of the data from the treated group in the National Supported Work Demonstration (NSW) and the comparison sample from the Panel Study of Income Dynamics (PSID). This data was previously analyzed extensively by Lalonde (1986) and Dehejia and Wahba (1999). The original dataset is available at https://users.nber.org/~rdehejia/nswdata2.html.
lalonde
A data frame with 2675 observations (185 treated, 2490 control). There are 9 variables measured for each individual. In addition, two constructed variables are included: treat_multi, which splits the original control group into two, and Ins, which is a constructed instrumental variable.
"treat" is the treatment assignment (1=treated, 0=control).
"age" is age in years.
"education" is education in number of years of schooling.
"race" is the individual's race/ethnicity (Black, Hispanic, or White).
"married" is an indicator for married (1=married, 0=not married).
"nodegree" is an indicator for whether the individual lacks a high school degree (i.e., has fewer than 12 years of schooling; 1=no degree, 0=degree).
"re74" is income in 1974, in U.S. dollars.
"re75" is income in 1975, in U.S. dollars.
"re78" is income in 1978, in U.S. dollars.
"treat_multi" is a constructed version of "treat" that splits the control group into two levels (1=treated, 2=control group A, 3=control group B).
"Ins" is a binary instrumental variable.
"treat" is the treatment variable, "re78" is the outcome, and the others are pre-treatment covariates. Note that in the original data, "race" is instead coded as two dummy variables, "black" and "hispan".
The data corresponds to the NSW treated sample and the PSID control sample with 1974 earnings included. This specific dataset is different from the one in the MatchIt and cobalt packages, which is a subset of this dataset.
Lalonde, R. (1986). Evaluating the econometric evaluations of training programs with experimental data. American Economic Review 76: 604-620.
Dehejia, R.H. and Wahba, S. (1999). Causal Effects in Nonexperimental Studies: Re-Evaluating the Evaluation of Training Programs. Journal of the American Statistical Association 94: 1053-1062.
Computes the weights implied by a linear outcome regression model that would estimate a weighted difference in outcome means equal to the covariate-adjusted treatment effect resulting from the supplied regression model.
lmw(
  formula,
  data = NULL,
  estimand = "ATE",
  method = "URI",
  treat = NULL,
  base.weights = NULL,
  s.weights = NULL,
  dr.method = "WLS",
  obj = NULL,
  fixef = NULL,
  target = NULL,
  target.weights = NULL,
  contrast = NULL,
  focal = NULL
)
formula |
a one-sided formula with the treatment and covariates on the right-hand side corresponding to the outcome regression model to be fit. The outcome variable is not involved in computing the weights and does not need to be specified. See Details for how this formula is interpreted in light of other options. |
data |
a data frame containing the variables named in |
estimand |
the estimand of interest, which determines how covariates
are centered. Should be one of |
method |
the method used to estimate the weights; either |
treat |
the name of the treatment variable in |
base.weights |
a vector of base weights. See Details. If omitted and
|
s.weights |
a vector of sampling weights. See Details. If omitted and
|
dr.method |
the method used to incorporate the |
obj |
a |
fixef |
optional; a string or one-sided formula containing the name of
the fixed effects variable in |
target |
a list or data frame containing the target values for each
covariate included in |
target.weights |
a vector of sampling weights to be applied to
|
contrast |
for multi-category treatments with |
focal |
the level of the treatment variable to be considered "focal"
(i.e., the "treated" level when |
formula is interpreted differently depending on whether method is "URI" or "MRI". When method = "URI", the formula is taken literally as the right-hand side of the outcome model formula. The only difference is that the covariates will be centered based on the argument to estimand (see below). When method = "MRI", all references to the treatment are removed (i.e., covariate interactions with treatment become covariate main effects if not already present), and the new formula is taken as the right-hand side of the model formula fit within each treatment group. This is equivalent to allowing all covariates to have both main effects and interactions with treatment after centering the covariates based on the argument to estimand. Allowing the treatment to interact with all covariates with method = "URI" is equivalent to specifying method = "MRI", and, for binary treatments, the returned weights will be the same when fixef = NULL.
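The idea of implied weights for a URI model can be illustrated in base R. The sketch below is an assumption-laden illustration rather than the package's code: it uses mtcars with am as a stand-in binary treatment, takes the row of (X'X)^{-1} X' that produces the treatment coefficient, and flips its sign for control units so that each group's weights sum to 1.

```r
# Illustrative data: 'am' as treatment, 'wt' and 'hp' as covariates
d <- mtcars
X <- model.matrix(~ am + wt + hp, data = d)
y <- d$mpg

# Row of (X'X)^{-1} X' for the treatment coefficient, so tau_hat = sum(a * y)
a <- solve(crossprod(X), t(X))[which(colnames(X) == "am"), ]

# Implied weights: sign-flipped for controls so each group's weights sum to 1
w <- ifelse(d$am == 1, a, -a)

# Treatment coefficient from the regression
tau_reg <- unname(coef(lm(mpg ~ am + wt + hp, data = d))["am"])

# Weighted difference in outcome means using the implied weights
tau_wtd <- sum(w[d$am == 1] * y[d$am == 1]) -
  sum(w[d$am == 0] * y[d$am == 0])

all.equal(tau_reg, tau_wtd)
c(treated = sum(w[d$am == 1]), control = sum(w[d$am == 0]))
```

Some of these weights can be negative, which is one of the features the package's diagnostics are meant to surface.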
When any treatment-by-covariate interactions are present in formula or when method = "MRI", covariates are centered at specific values to ensure the resulting weights correspond to the desired estimand as supplied to the estimand argument. For the ATE, the covariates are centered at their means in the full sample. For the ATT and ATC, the covariates are centered at their means in the treatment or control group (i.e., the focal group), respectively. For the CATE, the covariates are centered according to the argument supplied to target (see below). Note that when covariate-by-covariate interactions are present, they will be centered after computing the interaction rather than the interaction being computed on the centered covariates unless estimand = "CATE", in which case the covariates will be centered at the values specified in target prior to involvement in interactions.
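The role of centering can be checked with a small base-R sketch. It assumes mtcars with am as a stand-in treatment and is not the package's code: after centering the covariates at their treated-group means, the treatment coefficient in a fully interacted model matches the ATT obtained by fitting separate regressions in each group and averaging the predicted differences over the treated units.

```r
d <- mtcars
treated <- d$am == 1
covs <- c("wt", "hp")

# Center each covariate at its mean in the treated (focal) group
dc <- d
for (v in covs) dc[[v]] <- d[[v]] - mean(d[[v]][treated])

# Fully interacted model on the centered covariates
fit <- lm(mpg ~ am * (wt + hp), data = dc)
att_coef <- unname(coef(fit)["am"])

# Same estimand by separate regressions within each treatment group
fit1 <- lm(mpg ~ wt + hp, data = d[treated, ])
fit0 <- lm(mpg ~ wt + hp, data = d[!treated, ])
att_gcomp <- mean(predict(fit1, newdata = d[treated, ]) -
                    predict(fit0, newdata = d[treated, ]))

all.equal(att_coef, att_gcomp)
```

Centering at the full-sample means instead would make the same coefficient correspond to the ATE.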
When estimand = "CATE", target can be supplied either as a single target profile (i.e., a list or a data frame with one row) or as a target dataset, potentially with its own sampling weights, which are supplied to target.weights. The variables included in target must correspond to all the named covariates in formula; for example, if formula = ~ X1 + log(X1) + X2 + X1:X2, values in target must be given for X1 and X2, but not log(X1) or X1:X2. To choose a target profile value for a factor corresponding to a proportion (e.g., a target value of .5 for a variable like sex indicating a target population with a 50-50 sex split), the factor variable must be split into a numeric variable beforehand, e.g., using model.matrix() or cobalt::splitfactor(). target values cannot be given to variables specified using $, [[]], or [] (e.g., data$X1), so an error will be thrown if they are used in formula. When a target dataset is supplied, covariates will be centered at their means in the (target.weights-weighted) target dataset.
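Splitting a factor into a numeric indicator with model.matrix() might look like the following sketch. The data frame and the variable names (sex, age, sex_M) are hypothetical and chosen only for illustration.

```r
# Toy data (hypothetical) with a two-level factor
d <- data.frame(sex = factor(c("F", "M", "M", "F", "M")),
                age = c(30, 41, 25, 37, 52))

# Split the factor into a 0/1 numeric indicator using model.matrix()
d$sex_M <- as.numeric(model.matrix(~ sex, data = d)[, "sexM"])

# The numeric indicator can now take a proportion as its target value
target.prof <- list(sex_M = 0.5, age = 35)
str(target.prof)
```

The numeric indicator (here sex_M) would then replace the original factor in the formula supplied to lmw().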
Base weights (base.weights) and sampling weights (s.weights) are similar in that they both involve combining weights with an outcome regression model. However, they differ in a few ways. Sampling weights are primarily used to adjust the target population; when the outcome model is fit, it is fit using weighted least squares, and when target balance is assessed, it is assessed using the sampling weighted population as the target population. Centering of covariates in the outcome model is done using the sampling weighted covariate means. Base weights are primarily used to offer a second level of balancing beyond the implied regression weights; they can be incorporated into the effect estimate either using weighted least squares or using the augmented inverse probability weighting (AIPW) estimator. Base weights do not change the target population, so when target balance is assessed, it is assessed using the unweighted population as the target population.
Some forms of weights both change the target population and provide an extra layer of balancing, like propensity score weights that target estimands other than the ATT, ATC, or ATE (e.g., overlap weights), or matching weights where the target population is defined by the matching (e.g., matching with a caliper, cardinality matching, or coarsened exact matching). Because these weights change the target population, they should be supplied to s.weights to ensure covariates are appropriately centered. When there are no treatment-by-covariate interactions and method = "URI", whether weights are supplied to base.weights or s.weights will not matter for the estimation of the weights but will affect the target population in balance assessment.
When both base.weights and s.weights are supplied, e.g., when the base weights are the result of a propensity score model fit with sampling weights, it is assumed the base weights do not incorporate the sampling weights; that is, it is assumed that to estimate a treatment effect without regression adjustment, the base weights and the sampling weights would have to be multiplied together. This is true, for example, for the weights in a matchit or weightit object (see below), but not for weights in the output of MatchIt::match.data() (unless called with include.s.weights = FALSE) or for weights resulting from CBPS::CBPS().
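As a small illustration of this assumption, the following sketch computes an unadjusted weighted difference in means using the product of the two sets of weights. All values are made up for illustration and do not come from any dataset in the package.

```r
# Hypothetical toy values
y  <- c(10, 12, 9, 14, 8, 11)            # outcome
tr <- c(1, 1, 1, 0, 0, 0)                # treatment indicator
bw <- c(1.2, 0.8, 1.0, 2.0, 0.5, 0.5)    # base weights (exclude sampling weights)
sw <- c(1, 2, 1, 1, 1, 2)                # sampling weights

# Without regression adjustment, the effective weight is the product
w <- bw * sw

est <- weighted.mean(y[tr == 1], w[tr == 1]) -
  weighted.mean(y[tr == 0], w[tr == 0])
est
```

If the base weights already incorporated the sampling weights, multiplying again would double-count them, which is why the distinction matters when passing weights to lmw().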
Regression weights can be computed in a matched or weighted sample by supplying a matchit or weightit object (from MatchIt or WeightIt, respectively) to the obj argument of lmw(). The estimand, focal group (if any), base weights, and sampling weights (if any) will be taken from the supplied object and used in the calculation of the implied regression weights, unless these have been supplied separately to lmw(). The weights component of the supplied object containing the matching or balancing weights will be passed to base.weights and the s.weights component will be passed to s.weights. Arguments supplied to lmw() will take precedence over the corresponding components in the obj object.
There are a few differences when the treatment has multiple (i.e., more than 2) categories. If estimand is "ATT" or "ATC", an argument should be supplied to focal identifying which group is the treated or control (i.e., "focal") group, respectively.
The key difference, though, is when method = "URI", because in this case the contrast between each pair of treatment groups has its own weights and its own implied target population. Because lmw() only produces one set of weights, an argument must be supplied to contrast identifying which groups are to be used as the contrast for computing the weights. In addition, to compute the treatment effect corresponding to the chosen contrast as a weighted difference in outcome means, the difference must be taken between the weighted mean of the non-reference group and the weighted mean of all other groups combined, rather than simply the weighted mean of the reference group.
The implication of this is that contrast statistics computed in the weighted sample involve all units, even those not in the contrasted groups, whereas statistics computed in the unweighted sample only involve units in the contrasted groups. See summary.lmw() for more information on assessing balance using the regression weights for multi-category treatments. Given these complications, it is generally best to use method = "MRI" with multi-category treatments.
A fixed effects variable can be supplied to the fixef argument. This is equivalent to adding the fixed effects variable as a predictor that does not interact with the treatment or any other covariate. The difference is that computation is much faster when the fixed effect has many levels because demeaning is used rather than including the fixed effect variable as a collection of dummy variables. When using URI, the weights will be the same regardless of whether the fixed effect variable is included as a covariate or supplied to fixef; when using MRI, results will differ because the fixed effect variable does not interact with treatment. The fixed effects variable will not appear in the summary.lmw() output (but can be added using the addlvariables argument) or in the model output of lmw_est() or summary.lmw_est(). Because it does not interact with the treatment, the distribution of the fixed effect variable may not correspond to the target population, so caution should be used if it is expected the treatment effect varies across levels of this variable (in which case it should be included as a predictor). Currently only one fixed effect variable is allowed.
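The equivalence between demeaning and dummy variables can be verified with a base-R sketch. It uses mtcars with cyl as a stand-in fixed effects variable and is only an illustration of the general technique, not of how lmw() implements it.

```r
d <- mtcars

# (a) Fixed effect entered as a collection of dummy variables
fit_dummy <- lm(mpg ~ am + wt + factor(cyl), data = d)

# (b) Demean the outcome and predictors within levels of the fixed effect
demean <- function(x, g) x - ave(x, g)
dd <- data.frame(mpg = demean(d$mpg, d$cyl),
                 am  = demean(d$am,  d$cyl),
                 wt  = demean(d$wt,  d$cyl))
fit_demean <- lm(mpg ~ 0 + am + wt, data = dd)

# Coefficients on the treatment and covariate agree
coef(fit_dummy)[c("am", "wt")]
coef(fit_demean)

# Residual degrees of freedom differ because the demeaned fit
# does not count the absorbed fixed effect levels
c(dummy = df.residual(fit_dummy), demeaned = df.residual(fit_demean))
```

The difference in residual degrees of freedom is why standard errors computed on a demeaned fit by outside functions may need adjustment.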
An lmw object, which contains the following components:
treat |
the treatment variable, given as a factor. |
weights |
the computed implied regression weights. |
covs |
a data frame containing the covariates included in the model formula. |
estimand |
the requested estimand. |
method |
the method used to estimate the weights
( |
base.weights |
the weights supplied to
|
s.weights |
the weights supplied to
|
dr.method |
when |
call |
the
original call to |
fixef |
the fixed effects variable if
supplied to |
formula |
the model formula. |
target |
the supplied target profile or dataset when |
contrast |
the contrasted treatment groups. |
focal |
the focal
treatment level when |
Chattopadhyay, A., & Zubizarreta, J. R. (2023). On the implied weights of linear regression for causal inference. Biometrika, 110(3), 615–629. doi:10.1093/biomet/asac058
summary.lmw() for summarizing balance and representativeness; plot.lmw() for plotting features of the weights; lmw_est() for estimating treatment effects from lmw objects; influence.lmw() for influence measures; lm() for fitting standard regression models.
data("lalonde")

# URI regression for ATT
lmw.out1 <- lmw(~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                estimand = "ATT", method = "URI",
                treat = "treat")
lmw.out1
summary(lmw.out1)

# MRI regression for ATT
lmw.out2 <- lmw(~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                estimand = "ATT", method = "MRI",
                treat = "treat")
lmw.out2
summary(lmw.out2)

# MRI regression for ATT after propensity score matching
m.out <- MatchIt::matchit(treat ~ age + education + race +
                            married + nodegree + re74 + re75,
                          data = lalonde, method = "nearest",
                          estimand = "ATT")
lmw.out3 <- lmw(~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                method = "MRI", treat = "treat", obj = m.out)
lmw.out3
summary(lmw.out3)

# MRI regression for CATE with given target profile
target.prof <- list(age = 25, education = 11, race = "black",
                    married = 0, nodegree = 1, re74 = 0,
                    re75 = 0)
lmw.out4 <- lmw(~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                estimand = "CATE", method = "MRI",
                treat = "treat", target = target.prof)
lmw.out4
summary(lmw.out4)

# MRI regression for CATE with given target dataset (in
# this case, will give the same as with estimand = "ATT")
target.data <- subset(lalonde, treat == 1)
lmw.out4 <- lmw(~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                estimand = "CATE", method = "MRI",
                treat = "treat", target = target.data)
lmw.out4
summary(lmw.out4)

# URI regression with fixed effects for 'race'
lmw.out5 <- lmw(~ treat + age + education + married +
                  nodegree + re74 + re75, data = lalonde,
                method = "URI", treat = "treat",
                fixef = ~race)
lmw.out5

# Produces the same weights as when included as a covariate
all.equal(lmw.out1$weights, lmw.out5$weights)

# MRI for a multi-category treatment, ATT with 1 as the focal
# group
lmw.out6 <- lmw(~ treat_multi + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                estimand = "ATT", method = "MRI",
                treat = "treat_multi", focal = "1")
lmw.out6
summary(lmw.out6)

# URI for a multi-category treatment; need to specify
# contrast because only two groups can be compared at
# a time
lmw.out7 <- lmw(~ treat_multi + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                estimand = "ATE", method = "URI",
                treat = "treat_multi", contrast = c("2", "3"))
lmw.out7
summary(lmw.out7)
lmw_est() fits the outcome regression corresponding to the model used to compute the weights in the supplied lmw object and returns the model coefficients and their covariance matrix. Use summary.lmw_est() to compute and view the treatment effect and potential outcome mean estimates and their standard errors.
lmw_est(x, ...)
## S3 method for class 'lmw'
lmw_est(x, outcome, data = NULL, robust = TRUE, cluster = NULL, ...)
## S3 method for class 'lmw_aipw'
lmw_est(x, outcome, data = NULL, robust = TRUE, cluster = NULL, ...)
## S3 method for class 'lmw_iv'
lmw_est(x, outcome, data = NULL, robust = TRUE, cluster = NULL, ...)
x |
an |
... |
other arguments passed to |
outcome |
the name of the outcome variable. Can be supplied as a string
containing the name of the outcome variable or as the outcome variable
itself. If not supplied, the outcome variable in the |
data |
an optional data frame containing the outcome variable named in
|
robust |
whether to compute the robust covariance matrix for the model
coefficients. Allowable values include those allowed for the |
cluster |
the clustering variable(s) for computing a cluster-robust
covariance matrix. See |
lmw_est() uses lm.fit() or lm.wfit() to fit the outcome regression model (and first stage model for lmw_iv objects) and returns the output of these functions augmented with other components related to the estimation of the weights. Unlike with lm.[w]fit(), the covariance matrix of the parameter estimates is also included in the output.
For lmw objects, the model fit is that supplied to the formula input to lmw() except that it is fit in a dataset appropriately centered to ensure the estimand corresponds with the one requested. When method = "MRI" in the call to lmw(), the model is fit as an interaction between the treatment and all the (centered) terms in the model formula. The results will be similar to those from using lm() on this model and supplied data except that the covariates are centered beforehand. The product of the sampling weights and base weights supplied to lmw(), if any, will be supplied to lm.wfit() to fit the model using weighted least squares.
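A minimal base-R sketch of this weighted fit is shown below. The weights are hypothetical placeholders and mtcars stands in for real data; the point is only that lm.wfit() with the product of the two weight vectors gives the same coefficients as lm() with a weights argument.

```r
d <- mtcars
X <- model.matrix(~ am + wt + hp, data = d)
y <- d$mpg

# Hypothetical weights, purely for illustration
s.w <- rep(c(1, 2), length.out = nrow(d))            # sampling weights
b.w <- rep(c(0.5, 1, 1.5, 1), length.out = nrow(d))  # base weights

# Weighted least squares with the product of the weights
fit_wfit <- lm.wfit(X, y, w = s.w * b.w)

# Same fit through the formula interface
d$w <- s.w * b.w
fit_lm <- lm(mpg ~ am + wt + hp, data = d, weights = w)

all.equal(fit_wfit$coefficients, coef(fit_lm))

# lm.wfit() returns no covariance matrix of the estimates
names(fit_wfit)
```

This sketch omits the covariate centering that lmw_est() applies before fitting.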
For lmw_aipw objects, the model is fit as above except that base weights are not included in the model fitting and are instead used to compute additional augmentation terms that are added to the estimated potential outcome means from the outcome regression. The variance-covariance matrix is computed using M-estimation; this corresponds to the HC0 robust covariance matrix for the model parameters with the base weights treated as fixed, which yields conservative standard errors for the ATE. Inference is only approximate for the ATT and ATC.
For lmw_iv objects, the first stage model is constructed by removing the treatment from the supplied model formula, adding the instrumental variable as a main effect, and using the treatment variable as the outcome. For the second stage (reduced form) model, the fitted values of the treatment from the first stage model are used in place of the treatment in the outcome model. The results are similar to those from using ivreg::ivreg(), and the coefficient estimates will be the same except for the intercept due to the centering of covariates.
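The two-stage construction described above can be sketched in base R. The variable roles below are hypothetical and chosen only so the example runs on mtcars (outcome mpg, treatment hp, instrument disp, covariate wt); the closed-form estimator is included as a check for this just-identified case.

```r
d <- mtcars

# First stage: treatment regressed on the instrument and covariates
stage1 <- lm(hp ~ disp + wt, data = d)
d$hp_hat <- fitted(stage1)

# Second stage: fitted treatment replaces the treatment in the outcome model
stage2 <- lm(mpg ~ hp_hat + wt, data = d)

# Closed-form IV estimator for comparison: (Z'X)^{-1} Z'y
X <- model.matrix(~ hp + wt, data = d)     # treatment and covariates
Z <- model.matrix(~ disp + wt, data = d)   # instrument and covariates
beta_iv <- drop(solve(crossprod(Z, X), crossprod(Z, d$mpg)))

all.equal(unname(coef(stage2)), unname(beta_iv), tolerance = 1e-6)
```

The coefficients agree, but the standard errors reported by the second-stage lm() are not valid for 2SLS because they use residuals based on the fitted treatment; the covariance matrix returned by lmw_est() should be used instead.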
Although some coefficients in the model may be interpretable as treatment effect estimates, summary.lmw_est() should be used to view and extract the treatment effect and potential outcome mean estimates, standard errors, and other model statistics. The output of lmw_est() should rarely be used except to be supplied to summary().
An lmw_est object with the following components:
coefficients , residuals , fitted.values , effects , weights , rank , df.residual , qr
|
for |
model.matrix |
the model matrix (supplied to the |
vcov |
the estimated covariance matrix of
the parameter estimates as produced by |
lmw.weights |
the implied regression
weights computed by |
call |
the call to
|
estimand |
the requested estimand. |
focal |
the
focal treatment level when |
method |
the method used to estimate the weights ( |
robust |
the type of standard error used. |
outcome |
the name of the outcome variable. |
treat_levels |
the levels of the treatment. |
When AIPW is used, the object will be of class lmw_est_aipw, which inherits from lmw_est, and contains the additional components:
coef_aipw |
the model-predicted potential outcome means ( |
vcov_aipw |
the covariance matrix
of the quantities in |
When weights are included in the estimation (i.e., base.weights or s.weights supplied to lmw() or lmw_iv()), any units with weights equal to zero will be removed from the data prior to model fitting.
Methods exist for lmw_est objects for model.matrix(), vcov(), hatvalues(), sandwich::bread(), and sandwich::estfun(), all of which are used internally to compute the parameter estimate covariance matrix. The first two simply extract the corresponding component from the lmw_est object and the last three imitate the corresponding methods for lm objects (or ivreg objects for lmw_iv inputs). Other regression-related functions, such as coef(), residuals(), and fitted(), use the default methods and should work correctly with lmw_est objects.
Note that when fixed effects are supplied through the fixef argument to lmw() or lmw_iv(), standard error estimates computed using functions outside lmw may not be accurate due to issues relating to degrees of freedom. In particular, this affects conventional and HC1-robust standard errors. Otherwise, sandwich::vcovHC() can be used to compute standard errors (setting type = "const" for conventional standard errors), though sandwich::vcovCL() may not work as expected and should not be used. To calculate cluster-robust standard errors, supply an argument to cluster in lmw_est().
lmw_est() uses non-standard evaluation to interpret its outcome argument. For programmers who wish to use lmw_est() inside other functions, an effective way to pass the name of an arbitrary outcome (e.g., y passed as a string) is to use do.call(), for example:
fun <- function(model, outcome, data) {
  do.call("lmw_est", list(model, outcome, data))
}
When using lmw_est() inside lapply() or purrr::map to loop over outcomes, this syntax must be used as well.
summary.lmw_est() for viewing and extracting the treatment effect and potential outcome mean estimates, standard errors, and other model statistics; lmw() or lmw_iv() for estimating the weights that correspond to the model estimated by lmw_est(); lm() and lm.fit() for fitting the corresponding model; ivreg() in the ivreg package for fitting 2SLS models; influence.lmw_est() for influence measures.
data("lalonde")

# MRI regression for ATT
lmw.out1 <- lmw(~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                estimand = "ATT", method = "MRI",
                treat = "treat")
lmw.fit1 <- lmw_est(lmw.out1, outcome = "re78")
lmw.fit1
summary(lmw.fit1)

# MRI regression for ATT after propensity score matching
m.out <- MatchIt::matchit(treat ~ age + education + race +
                            married + nodegree + re74 + re75,
                          data = lalonde, method = "nearest",
                          estimand = "ATT")
lmw.out2 <- lmw(~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                method = "MRI", treat = "treat", obj = m.out)

## Using a cluster-robust SE with subclass (pair membership)
## as the cluster variable
lmw.fit2 <- lmw_est(lmw.out2, outcome = "re78",
                    cluster = ~subclass)
lmw.fit2
summary(lmw.fit2)

# AIPW for ATE with MRI regression after propensity score
# weighting
ps <- glm(treat ~ age + education + race + married +
            nodegree + re74 + re75, data = lalonde,
          family = binomial)$fitted
ipw <- ifelse(lalonde$treat == 1, 1/ps, 1/(1-ps))
lmw.out3 <- lmw(re78 ~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                method = "MRI", treat = "treat",
                base.weights = ipw, dr.method = "AIPW")
lmw.fit3 <- lmw_est(lmw.out3)
lmw.fit3
summary(lmw.fit3)

# MRI for multi-category treatment ATE
lmw.out3 <- lmw(~ treat_multi + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                estimand = "ATE", method = "MRI",
                treat = "treat_multi")
lmw.fit3 <- lmw_est(lmw.out3, outcome = "re78")
lmw.fit3
summary(lmw.fit3)
Computes the weights implied by an instrumental variable (IV) model that would estimate a weighted difference in outcome means equal to the treatment effect resulting from the supplied model fit with two-stage least squares.
lmw_iv(
  formula,
  data = NULL,
  estimand = "ATE",
  method = "URI",
  treat = NULL,
  iv,
  base.weights = NULL,
  s.weights = NULL,
  obj = NULL,
  fixef = NULL,
  target = NULL,
  target.weights = NULL,
  contrast = NULL,
  focal = NULL
)
formula |
a one-sided formula with the treatment and covariates on the right-hand side corresponding to the second-stage (reduced form) outcome regression model to be fit. If an outcome variable is supplied on the left-hand side, it will be ignored. This model should not include an IV. See Details for how this formula is interpreted in light of other options. |
data |
a data frame containing the variables named in |
estimand |
the estimand of interest, which determines how covariates
are centered. Should be one of |
method |
the method used to estimate the weights; either |
treat |
the name of the treatment variable in |
iv |
a character vector or one-sided formula containing the names of
the IVs in |
base.weights |
a vector of base weights. See Details. If omitted and
|
s.weights |
a vector of sampling weights. See Details. If omitted and
|
obj |
a |
fixef |
optional; a string or one-sided formula containing the name of
the fixed effects variable in |
target |
a list or data frame containing the target values for each
covariate included in |
target.weights |
a vector of sampling weights to be applied to
|
contrast |
ignored. |
focal |
the level of the treatment variable to be considered "focal"
(i.e., the "treated" level when |
lmw_iv()
computes weights that make the weighted difference in
outcome means between the treatment groups equal to the two-stage least
squares (2SLS) estimate of the treatment effect. formula
corresponds
to the second-stage (reduced form) model, with the treatment replaced by its
fitted values resulting from the first stage model. The first stage is fit
by replacing the treatment in the supplied formula
with the IVs named
in iv
and using the treatment as the outcome. The treatment is
assumed to be endogenous and the supplied instrumental variables assumed to
be instruments conditional on the other covariates, which are assumed to
be exogenous.
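The two-stage construction described above can be sketched in base R. This is an illustration on simulated data, not the package's implementation: the variable names (`x`, `z`, `u`, `trt`, `y`) and the data-generating process are assumptions, and the sketch covers only a URI-style model with no treatment-by-covariate interactions. It shows that the 2SLS treatment coefficient can be rewritten as a weighted difference in outcome means.

```r
set.seed(1)
n <- 200
x <- rnorm(n)                      # exogenous covariate (simulated)
z <- rbinom(n, 1, 0.5)             # instrument (hypothetical)
u <- rnorm(n)                      # unobserved confounder
trt <- as.numeric(0.8 * z + 0.5 * x + u + rnorm(n) > 0.5)  # endogenous treatment
y <- 1 + 2 * trt + x + u + rnorm(n)

# First stage: treatment regressed on the instrument and the covariate
trt_hat <- fitted(lm(trt ~ z + x))

# Second stage: outcome regressed on the fitted treatment and the covariate
stage2 <- lm(y ~ trt_hat + x)
est_2sls <- unname(coef(stage2)["trt_hat"])

# The 2SLS coefficient is linear in y: est = sum(a * y), where a is the
# treatment row of (Xhat'Xhat)^{-1} Xhat'
X_hat <- cbind(1, trt_hat, x)
a <- solve(crossprod(X_hat), t(X_hat))[2, ]

# Rescale so the weights have a mean of 1 within each treatment group
w <- ifelse(trt == 1, sum(trt == 1) * a, -sum(trt == 0) * a)

# Weighted difference in outcome means using the implied weights
est_w <- weighted.mean(y[trt == 1], w[trt == 1]) -
  weighted.mean(y[trt == 0], w[trt == 0])

all.equal(est_2sls, est_w)
```

Some of the weights in `w` can be negative, which is why the weighted means are computed from sums rather than from resampling.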
When any treatment-by-covariate interactions are present in formula
or when method = "MRI"
, covariates are centered at specific values to
ensure the resulting weights correspond to the desired estimand as supplied
to the estimand
argument. For the ATE, the covariates are centered at
their means in the full sample. For the ATT and ATC, the covariates are
centered at their means in the treatment or control group (i.e., the
focal
group), respectively. For the CATE, the covariates are centered
according to the argument supplied to target
(see below). Note that
when covariate-by-covariate interactions are present, they will be centered
after computing the interaction rather than the interaction being computed
on the centered covariates unless estimand = "CATE"
, in which case
the covariates will be centered at the values specified in target
prior to involvement in interactions. Note that the resulting effect estimate
does not actually correspond to the estimand supplied unless all effect
heterogeneity is due to the included covariates.
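The role of centering can be illustrated with a small base R sketch. It uses ordinary least squares rather than 2SLS, and the simulated variables (`x`, `trt`, `y`) are assumptions made for illustration. With a treatment-by-covariate interaction, centering the covariate at its mean in the treated group makes the treatment coefficient equal to the average of the model-implied unit-level effects among the treated.

```r
set.seed(2)
n <- 500
x <- rnorm(n)
trt <- rbinom(n, 1, plogis(x))
y <- 1 + trt * (1 + 0.5 * x) + x + rnorm(n)

# Uncentered interaction model: the implied effect for a unit with
# covariate value x is b_trt + b_int * x
fit_raw <- lm(y ~ trt * x)
b <- unname(coef(fit_raw)[c("trt", "trt:x")])
att_by_hand <- mean(b[1] + b[2] * x[trt == 1])

# Center x at its mean in the treated (focal) group, then refit
x_c <- x - mean(x[trt == 1])
fit_att <- lm(y ~ trt * x_c)
att_coef <- unname(coef(fit_att)["trt"])

all.equal(att_by_hand, att_coef)
```

Centering at the full-sample mean or at the control-group mean would target the ATE or ATC in the same way.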
When treatment-by-covariate interactions are included in formula
,
additional instruments will be formed as the product of the supplied IVs and
the interacting covariates. When method = "MRI"
, instruments will be
formed as the product of the supplied IVs and each of the covariates. All
treatment-by-covariate interactions are considered endogenous.
Base weights (base.weights
) and sampling weights (s.weights
)
are similar in that they both involve combining weights with an outcome
regression model. However, they differ in a few ways. Sampling weights are
primarily used to adjust the target population; when the outcome model is
fit, it is fit using weighted least squares, and when target balance is
assessed, it is assessed using the sampling weighted population as the
target population. Centering of covariates in the outcome model is done
using the sampling weighted covariate means. Base weights are primarily used
to offer a second level of balancing beyond the implied regression weights,
i.e., to fit the 2SLS models in the base-weighted sample. Base weights do
not change the target population, so when target balance is assessed, it is
assessed using the unweighted population as the target population.
Some forms of weights both change the target population and provide an extra
layer of balancing, like propensity score weights that target estimands
other than the ATT, ATC, or ATE (e.g., overlap weights), or matching weights
where the target population is defined by the matching (e.g., matching with
a caliper, cardinality matching, or coarsened exact matching). Because these
weights change the target population, they should be supplied to
s.weights
to ensure covariates are appropriately centered. In
lmw_iv()
, whether weights are supplied to base.weights
or
s.weights
will not matter for the estimation of the weights but will
affect the target population in balance assessment.
When both base.weights
and s.weights
are supplied, e.g., when
the base weights are the result of a propensity score model fit with
sampling weights, it is assumed the base weights do not incorporate the
sampling weights; that is, it is assumed that to estimate a treatment effect
without regression adjustment, the base weights and the sampling
weights would have to be multiplied together. This is true, for example, for
the weights in a matchit
or weightit
object (see below) but
not for weights in the output of MatchIt::match.data()
unless called
with include.s.weights = FALSE
or weights resulting from
CBPS::CBPS()
.
Instrumental variable regression weights can be computed in a matched or weighted sample
by supplying a matchit
or weightit
object (from MatchIt
or WeightIt, respectively) to the obj
argument of lmw_iv()
.
The estimand, base weights, and sampling weights (if any) will be taken from
the supplied object and used in the calculation of the implied regression
weights, unless these have been supplied separately to lmw_iv()
. The
weights
component of the supplied object containing the matching or
balancing weights will be passed to base.weights
and the
s.weights
component will be passed to s.weights
. Arguments
supplied to lmw_iv()
will take precedence over the corresponding
components in the obj
object.
Multi-category treatments are not
currently supported for lmw_iv()
.
A fixed effects variable can be supplied to the
fixef
argument. This is equivalent to adding the fixed effects
variable as an exogenous predictor that does not interact with the
treatment, IV, or any other covariate. The difference is that computation is
much faster when the fixed effect has many levels because demeaning is used
rather than including the fixed effect variable as a collection of dummy
variables. When using URI, the weights will be the same regardless of
whether the fixed effect variable is included as a covariate or supplied to
fixef
; when using MRI, results will differ because the fixed effect
variable does not interact with treatment. The fixed effects variable will
not appear in the summary.lmw()
output (but can be added using the
addlvariables
argument) or in the model output of lmw_est()
or
summary.lmw_est()
. Because it does not interact with the
treatment, the distribution of the fixed effect variable may not correspond
to the target population, so caution should be used if it is expected the
treatment effect varies across levels of this variable (in which case it
should be included as a predictor). Currently only one fixed effect variable
is allowed.
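The equivalence between demeaning and dummy variables can be checked with a short base R sketch. It uses ordinary least squares on simulated data for a URI-style model; the names `g`, `x`, `trt`, `y`, and the helper `demean()` are hypothetical and are not part of the package.

```r
set.seed(3)
n <- 300
g <- factor(sample(letters[1:10], n, replace = TRUE))  # fixed effect variable
x <- rnorm(n) + as.numeric(g) / 5
trt <- rbinom(n, 1, plogis(x - 1))
y <- 1 + 1.5 * trt + x + as.numeric(g) + rnorm(n)

# Approach 1: include the fixed effect as a set of dummy variables
fit_dummy <- lm(y ~ trt + x + g)

# Approach 2: subtract group means from every variable, then fit
# without an intercept
demean <- function(v) v - ave(v, g)
y_w <- demean(y)
trt_w <- demean(trt)
x_w <- demean(x)
fit_within <- lm(y_w ~ 0 + trt_w + x_w)

coef_dummy <- unname(coef(fit_dummy)[c("trt", "x")])
coef_within <- unname(coef(fit_within))

all.equal(coef_dummy, coef_within)
```

The second approach estimates two coefficients instead of eleven, which is where the speed gain comes from when the fixed effect has many levels.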
An lmw_iv
object, which inherits from lmw
objects and
contains the following components:
treat |
the treatment variable, given as a factor. |
weights |
the computed implied regression weights. |
covs |
a data frame containing the covariates included in the model formula. |
estimand |
the requested estimand. |
method |
the method
used to estimate the weights ( |
base.weights |
the weights supplied to |
s.weights |
the weights supplied to |
call |
the
original call to |
fixef |
the fixed effects variable
if supplied to |
formula |
the model formula. |
iv |
the instrumental variables, given as a one-sided formula. |
target |
the supplied covariate target values when |
contrast |
the contrasted treatment groups. |
focal |
the focal treatment levels when
|
All functions that lack a specific lmw_iv
method work with
lmw_iv
objects as they do for lmw
objects, such as
summary.lmw()
, plot.lmw()
, etc.
Chattopadhyay, A., & Zubizarreta, J. R. (2023). On the implied weights of linear regression for causal inference. Biometrika, 110(3), 615–629. doi:10.1093/biomet/asac058
summary.lmw()
for summarizing balance and
representativeness; plot.lmw()
for plotting features of the
weights; lmw_est()
for estimating treatment effects from
lmw_iv
objects; influence.lmw()
for influence measures;
ivreg()
in the ivreg package for fitting 2SLS models.
# URI for the ATT using instrument `Ins`
lmw.out <- lmw_iv(~ treat + age + education + race + re74,
                  data = lalonde, estimand = "ATT",
                  method = "URI", treat = "treat",
                  iv = ~Ins)
lmw.out
summary(lmw.out, addlvariables = ~married + re75)
Produces plots to diagnose properties of the weights, including their distribution, to what degree the distribution of covariates involves extrapolation in the weighted sample, and how much influence each unit has on the effect estimate.
## S3 method for class 'lmw'
plot(x, type = "weights", ...)
x |
an |
type |
the type of plot to display. Allowable options include
|
... |
further arguments passed to specific types of plots. When
Other arguments are passed to When
When
|
When type = "weights"
, plot.lmw()
produces a density plot of
the weights within each treatment group. By construction, these weights will
have a mean of 1. Some weights may be negative. The effective sample size
(ESS) and original sample size (N) will be displayed in the upper right
corner of the plot when ess = TRUE
.
When type = "extrapolation"
, plot.lmw()
produces a plot of the
distribution of weights and covariates for each treatment group. Each dot
represents a unit, with values arranged on the x-axis according to their
covariate value and the size of the dots corresponding to the magnitude of
the weight. Units with positive weights are displayed in black in the upper
portion of the plot, and units with negative weights are displayed in red in
the lower portion. Having many and large red points indicates a high degree
of extrapolation. All points are equally transparent, so darker regions
indicate multiple points with the same value. The vertical lines indicate
the weighted mean of the covariate in each group, and the X indicates the
mean of the covariate in the target sample as determined by the
estimand
argument in the original call to lmw()
. A large
discrepancy between the vertical lines and Xs indicates a lack of balance
between the treatment group and target sample. When estimand = "CATE"
in the original call to lmw()
, any variables supplied to variables
that were not given a target value will not have the target mean displayed.
When type = "influence"
, plot.lmw()
produces a plot of the
scaled sample influence curve (SIC) for each unit by index. It does so by
calling influence.lmw()
, which fits the outcome model to extract
residuals and compute the SIC as SIC = (N-1) * w * r / (1 - h)
, where
N
is the sample size, w
are the units' implied regression
weights, r
are the residuals, and h
are the hat values. SIC
values are scaled to have a maximum of 1. Higher values indicate greater
relative influence.
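The SIC formula above can be reproduced by hand in base R. This is a minimal sketch on simulated data, not the code used by influence.lmw(): the implied weights are derived directly from an OLS design matrix for a URI-style model, and dividing the absolute SIC values by their maximum is an assumption about how the scaling to a maximum of 1 is done.

```r
set.seed(4)
n <- 100
x <- rnorm(n)
trt <- rbinom(n, 1, plogis(x))
y <- 1 + trt + x + rnorm(n)

fit <- lm(y ~ trt + x)
X <- model.matrix(fit)

# Implied regression weights: treatment row of (X'X)^{-1} X', rescaled
# to have a mean of 1 within each treatment group
a <- solve(crossprod(X), t(X))[2, ]
w <- ifelse(trt == 1, sum(trt == 1) * a, -sum(trt == 0) * a)

r <- residuals(fit)   # residuals
h <- hatvalues(fit)   # hat values

# SIC = (N-1) * w * r / (1 - h)
sic <- (n - 1) * w * r / (1 - h)

# Assumed scaling: absolute values divided by the largest absolute value
sic_scaled <- abs(sic) / max(abs(sic))

# Indices of the most influential units
head(order(sic_scaled, decreasing = TRUE))
```

A unit needs a large weight, a large residual, or high leverage to stand out, and the product form means any one of these near zero suppresses its influence.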
A plot is displayed, and x
is invisibly returned.
lmw()
, summary.lmw()
,
plot.summary.lmw()
data("lalonde")

# URI regression for ATT
lmw.out1 <- lmw(~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                estimand = "ATT", method = "URI",
                treat = "treat")
lmw.out1

# Distribution of weights
plot(lmw.out1, type = "weights")

# Extrapolation/representativeness for age and married
plot(lmw.out1, type = "extrapolation",
     variables = ~age + married)

# Extrapolation/representativeness for race
plot(lmw.out1, type = "extrapolation",
     variables = ~race)

# Influence for re78 outcome
plot(lmw.out1, type = "influence", outcome = "re78")
lmw_est object
Produces plots to diagnose the regression model fit to estimate the
treatment effect. These include an influence plot based on the sample
influence curve (SIC) and the regression diagnostics plots available for
lm
objects in plot.lm()
.
## S3 method for class 'lmw_est'
plot(x, type = "influence", ...)
x |
an |
type |
the type of plot to display. Allowable options include
|
... |
When
When |
When type = "influence"
, plot.lmw_est()
produces a plot of the
scaled sample influence curve (SIC) for each unit by index. It does so by
calling influence.lmw_est()
, which extracts the model residuals
and computes the SIC as SIC = (N-1) * w * r / (1 - h)
, where N
is the sample size, w
are the units' implied regression weights,
r
are the residuals, and h
are the hat values. SIC values are
scaled to have a maximum of 1. Higher values indicate greater relative
influence.
When type = "lm"
, plot.lmw_est()
produces several plots
displayed sequentially according to the arguments supplied to plot()
.
These plots are produced by plot.lm()
to diagnose the
distribution of residuals and other measures of leverage and influence.
A plot is displayed, and x
is invisibly returned.
lmw_est()
, influence.lmw_est()
,
plot.lm()
data("lalonde")

# URI regression for ATT
lmw.out1 <- lmw(~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                estimand = "ATT", method = "URI",
                treat = "treat")

lmw.fit1 <- lmw_est(lmw.out1, outcome = "re78")
lmw.fit1

# Influence using SIC
plot(lmw.fit1, type = "influence")

# Usual regression diagnostics
plot(lmw.fit1, type = "lm", which = 1)
Produces Love plots (also known as dot plots) of balance statistics to
summarize balance visually. The plots are generated using
dotchart()
and points()
.
## S3 method for class 'summary.lmw'
plot(
  x,
  stats,
  abs = TRUE,
  var.order = "data",
  threshold = NULL,
  layout = "vertical",
  ...
)
x |
a |
stats |
a vector of the names of the columns in the |
abs |
|
var.order |
how the variables should be ordered. Allowable options
include |
threshold |
numeric values at which to place vertical lines indicating
a balance threshold. These can make it easier to see for which variables
balance has been achieved given a threshold. Multiple values can be supplied
to add multiple lines. When |
layout |
how the multiple plots should be laid out. Allowable options
include |
... |
further arguments passed to |
Love plots will be produced for the requested statistics in the
summary.lmw
output. How these plots are arranged depends on the value
supplied to layout
, which uses layout()
to arrange the
plots.
A plot is displayed, and x
is invisibly returned.
data("lalonde")

# URI regression for ATT
lmw.out1 <- lmw(~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                estimand = "ATT", method = "URI",
                treat = "treat")
lmw.out1

(s <- summary(lmw.out1))

plot(s)
plot(s, stats = "SMD", abs = FALSE)
lmw object
Computes balance statistics for an lmw
object created by
lmw()
. Balance involves not only the similarity between the treatment
groups but also the similarity between each treatment group and the target
population.
## S3 method for class 'lmw'
summary(
  object,
  un = TRUE,
  addlvariables = NULL,
  standardize = TRUE,
  data = NULL,
  stat = "balance",
  ...
)

## S3 method for class 'lmw_multi'
summary(
  object,
  un = TRUE,
  addlvariables = NULL,
  standardize = TRUE,
  data = NULL,
  contrast = NULL,
  stat = "balance",
  ...
)

## S3 method for class 'summary.lmw'
print(x, digits = max(3, getOption("digits") - 4), ...)
object |
an |
un |
|
addlvariables |
additional variables for which balance statistics are to
be computed along with the covariates in the |
standardize |
|
data |
an optional data frame containing variables named in
|
stat |
|
... |
ignored. |
contrast |
for multi-category treatments with |
x |
a |
digits |
the number of digits to print. |
summary.lmw()
produces covariate balance or distribution
statistics and effective sample sizes before and after adjustment by the
regression weights and base weights, if supplied. For each covariate, the
following balance statistics are computed when stat = "balance"
:
SMD
- the standardized mean difference (SMD) between the
treated and control groups
TSMD Treated
- the target
standardized mean difference (TSMD) between the treated group and target
sample
TSMD Control
- the TSMD between the control
group and target sample
KS
- the Kolmogorov-Smirnov (KS)
statistic between the treated and control groups
TKS Treated
-
the target KS (TKS) statistic between the treated group and target sample
TKS Control
- the TKS statistic between the control group and
target sample
For multi-category treatments with method = "MRI"
, balance statistics are
computed between each treatment group and the target sample.
When stat = "distribution"
the mean and standard deviation of each
covariate are computed before and after adjustment and for the target sample.
(Standard deviations are printed in parentheses for visual clarity.)
After weighting with the regression weights, the mean difference between
the treated and control groups of each covariate included in the original
call to lmw()
will be equal to zero. However, the mean difference between
each treatment group and the target sample may not be equal to zero when
method = "URI"
in the call to lmw()
, and covariates supplied to
addlvariables
not included in the call to lmw()
may not be well
balanced.
When s.weights
are supplied to lmw()
, the unadjusted statistics (if
requested) will incorporate the sampling weights. When base.weights
are
supplied to lmw()
, the unadjusted statistics will not incorporate the
base weights; rather, balance with base weights applied (if supplied) will
be produced in a separate balance table (see Value below).
SMDs are computed as the difference between the (weighted) means divided by
a standardization factor, which is the standard deviation of the covariate
in the target sample. When estimand = "ATT"
in the call to lmw()
, the
standardization factor is the standard deviation in the treated group; when
estimand = "ATC"
, the standardization factor is the standard deviation in
the control group; when estimand = "ATE"
or when estimand = "CATE"
and
a target profile is supplied, the standardization factor is the square root
of the average of the variances of both treatment groups; when estimand = "CATE"
and a target dataset is supplied, the standardization factor is the
standard deviation in the target dataset. When s.weights
is supplied, the
standardization factor is computed including the sampling weights;
otherwise it is computed in the unweighted sample.
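The choice of standardization factor can be sketched in base R. The simulated covariate `age`, the treatment indicator `trt`, and the placeholder weights `w` are assumptions for illustration; this is not the package's internal code, and sampling weights are ignored.

```r
set.seed(5)
n <- 150
trt <- rbinom(n, 1, 0.4)
age <- rnorm(n, mean = 30 + 3 * trt, sd = 8)
w <- runif(n, 0.5, 1.5)   # placeholder weights (hypothetical)

# Difference between the weighted covariate means of the two groups
mean_diff <- weighted.mean(age[trt == 1], w[trt == 1]) -
  weighted.mean(age[trt == 0], w[trt == 0])

# ATT: standard deviation in the treated group (unweighted sample)
sf_att <- sd(age[trt == 1])

# ATE: square root of the average of the two group variances
sf_ate <- sqrt((var(age[trt == 1]) + var(age[trt == 0])) / 2)

smd_att <- mean_diff / sf_att
smd_ate <- mean_diff / sf_ate

c(ATT = smd_att, ATE = smd_ate)
```

Because the factor is computed once in the unweighted (or sampling-weighted) sample, SMDs before and after adjustment share a denominator and differ only through the weighted means.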
For binary covariates, the KS statistic is equal to the unstandardized difference in means and is computed as such.
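The identity for binary covariates can be verified with a small unweighted example in base R. The two vectors below are made-up illustrative values, and the KS statistic is computed directly from the empirical CDFs.

```r
# Hypothetical binary covariate values in each treatment group
x_treated <- c(1, 1, 0, 1, 0)
x_control <- c(0, 0, 1, 0, 0, 0)

# KS statistic: largest absolute gap between the two empirical CDFs
grid <- sort(unique(c(x_treated, x_control)))
ks_stat <- max(abs(ecdf(x_treated)(grid) - ecdf(x_control)(grid)))

# Absolute unstandardized difference in means
mean_diff <- abs(mean(x_treated) - mean(x_control))

all.equal(ks_stat, mean_diff)
```

For a 0/1 variable the two CDFs differ only at 0, where each equals one minus the group mean, so the gap is the difference in means.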
When estimand = "CATE"
in the original call to lmw()
, any variables
supplied to addlvariables
that were not given a target value will not
have any target statistics computed (e.g., TSMD, TKS, target means, etc.).
The effective sample size (ESS) is computed within each group as $(\sum w)^2 / \sum w^2$, where $w$ are the weights in that group.
With uniform weights, this is equal to the sample size.
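A minimal base R sketch of the ESS, assuming the usual formula of the squared sum of the weights divided by the sum of the squared weights. The helper name `ess()` is hypothetical and is not exported by the package.

```r
# Effective sample size for a vector of weights from one group
ess <- function(w) sum(w)^2 / sum(w^2)

w_uniform <- rep(1, 10)
w_uneven <- c(rep(0.5, 5), rep(1.5, 5))

ess(w_uniform)  # equals the sample size with uniform weights
ess(w_uneven)   # smaller than the sample size with variable weights
```

The formula is unchanged by multiplying all weights by a constant, so it does not matter whether the weights are scaled to have a mean of 1.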
A summary.lmw
object, which contains the following components:
call |
The original call to |
nn |
The (effective) sample sizes before and after weighting. |
bal.un |
When |
bal.base.weighted |
When |
bal.weighted |
When |
dist.un |
When |
dist.base.weighted |
When
|
dist.weighted |
When |
method |
The method used to estimate the weights (i.e., URI or MRI) |
base.weights.origin |
If base weights were supplied through the
|
With multi-category treatments and method = "MRI"
, the object will also
inherit from class summary.lmw_multi
.
lmw()
for computing the implied regression weights,
plot.summary.lmw()
for plotting the balance statistics in a Love plot,
plot.lmw()
for assessing the representativeness and extrapolation of the
weights
data("lalonde")

# URI regression for ATT
lmw.out1 <- lmw(~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                estimand = "ATT", method = "URI",
                treat = "treat")
lmw.out1

summary(lmw.out1)

summary(lmw.out1, stat = "distribution")

# Adding additional variables to summary, removing unweighted
summary(lmw.out1, un = FALSE,
        addlvariables = ~I(age^2) + I(nodegree*re74))

# MRI regression for ATT after PS matching
m.out <- MatchIt::matchit(treat ~ age + education + race +
                            married + nodegree + re74 + re75,
                          data = lalonde, method = "nearest",
                          estimand = "ATT")
lmw.out2 <- lmw(~ treat + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                method = "MRI", treat = "treat", obj = m.out)
lmw.out2

summary(lmw.out2)

# MRI for a multi-category treatment ATE
lmw.out3 <- lmw(~ treat_multi + age + education + race + married +
                  nodegree + re74 + re75, data = lalonde,
                estimand = "ATE", method = "MRI",
                treat = "treat_multi")
lmw.out3

summary(lmw.out3)

summary(lmw.out3, contrast = c("2", "1"))
lmw_est fits
summary()
computes the treatment effect and potential outcome mean
estimates from the supplied lmw_est
object. It functions similarly to
summary.lm()
in producing estimate tables with the estimates,
standard errors, t-statistics, and p-values. Other model statistics can be
additionally requested.
## S3 method for class 'lmw_est_aipw'
summary(object, model = FALSE, ci = TRUE, alpha = 0.05, ...)

## S3 method for class 'lmw_est'
summary(object, model = FALSE, ci = TRUE, alpha = 0.05, ...)
object |
an |
model |
|
ci |
|
alpha |
when |
... |
ignored. |
summary.lmw_est()
produces a table of treatment effect estimates
corresponding to all possible pairwise contrasts between the treatment
levels. These treatment effects generalize to the population implied by the
regression weights, which depends on the supplied estimand, whether sampling
weights were provided, and which of the MRI or URI models was requested. The
treatment effects are computed using linear contrasts of the outcome model
coefficients.
When method = "MRI"
, the potential outcome mean estimates are also
reported. These correspond to the potential outcome means in the population
implied by the regression weights. When method = "URI"
, only the
treatment effects are estimated; the model-implied outcome means do not
correspond to the potential outcome means for the population implied by the
regression weights. That is, while the treatment effect generalizes to the
population defined by the regression weights, the estimated potential
outcome means do not and so are not reported.
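The link between MRI, potential outcome means, and a linear contrast of coefficients can be sketched in base R. This is an illustration on simulated data for the ATE, not the package's implementation; all variable names are assumptions. Separate regressions are fit in each treatment group, predictions are averaged over the full sample, and the difference is compared with the treatment coefficient from a fully interacted model with the covariate centered at its full-sample mean.

```r
set.seed(6)
n <- 400
x <- rnorm(n)
trt <- rbinom(n, 1, plogis(0.5 * x))
y <- 2 + trt * (1 + x) + 0.5 * x + rnorm(n)
dat <- data.frame(y, trt, x)

# Separate outcome regressions within each treatment group
fit1 <- lm(y ~ x, data = dat[dat$trt == 1, ])
fit0 <- lm(y ~ x, data = dat[dat$trt == 0, ])

# Potential outcome means: average predictions over the full sample
mu1 <- mean(predict(fit1, newdata = dat))
mu0 <- mean(predict(fit0, newdata = dat))
ate_mri <- mu1 - mu0

# Equivalent single model: full interaction with x centered at its
# full-sample mean; the treatment coefficient is the effect estimate
dat$x_c <- dat$x - mean(dat$x)
fit_int <- lm(y ~ trt * x_c, data = dat)
ate_coef <- unname(coef(fit_int)["trt"])

all.equal(ate_mri, ate_coef)
```

In the interacted model the intercept plays the role of `mu0`, which is why potential outcome means are available under MRI but not from a model without treatment-by-covariate interactions.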
When model = TRUE
, the model coefficients and their test statistics
are additionally produced. It is inappropriate to interpret or report these
values as they have no causal interpretation. This is especially true when
using AIPW, as the model coefficients do not incorporate the augmentation
terms.
A summary.lmw_est
object with the following components:
call |
the original call to |
means |
a matrix
containing the estimated potential outcome means, their standard errors,
confidence interval limits (if requested with |
coefficients |
a matrix containing the treatment effect estimates and
their standard errors, t-statistics, and p-values. When |
model.coefficients |
when |
aliased |
when |
sigma , df , r.squared , adj.r.squared
|
the residual standard deviation, degrees of
freedom components, R-squared, and adjusted R-squared. See
|
Other components containing information for printing are also included.
lmw_est()
for fitting the outcome regression model,
summary.lm()
for more information on the output components
# See examples at `help("lmw_est")`