Course Notes: A Crash Course on Causality -- Week 4: Inverse Probability of Treatment Weighting (IPTW)


Inverse Probability of Treatment Weighting

Motivating example

  • Suppose there is a single confounder $X$, with propensity scores $P(A=1 \mid X=1) = 0.1$ and $P(A=1 \mid X=0) = 0.8$

  • In propensity score matching, for subjects with $X=1$, only 1 out of every 9 controls will be matched to a treated subject

    • Thus, 1 person in the treated group counts the same as 9 people from the control group

    • So rather than matching, we could use all of the data, but down-weight each control subject so that it counts as just 1/9 of a treated subject

Inverse probability of treatment weighting (IPTW)

  • IPTW weights: inverse of the probability of treatment received

    • For treated subjects, weight by $1/P(A=1 \mid X)$
    • For control subjects, weight by $1/P(A=0 \mid X)$
  • In the previous example

    • For $X=1$, the weight for a treated subject is $1/0.1 = 10$, and the weight for a control subject is $1/0.9 = 10/9$

    • For $X=0$, the weight for a treated subject is $1/0.8 = 5/4$, and the weight for a control subject is $1/0.2 = 5$ (see the code sketch after this list)

  • Motivation: in survey sampling, it is common to oversample some subpopulations and then use the Horvitz-Thompson estimator to estimate population means
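As a quick numeric check, here is a minimal plain-Python sketch reproducing the weights computed above (the propensity scores are the ones from the motivating example):

```python
# Propensity scores from the motivating example: P(A=1 | X=1) = 0.1, P(A=1 | X=0) = 0.8
propensity = {1: 0.1, 0: 0.8}

for x, p in propensity.items():
    w_treated = 1 / p          # IPTW weight for a treated subject with this X
    w_control = 1 / (1 - p)    # IPTW weight for a control subject with this X
    print(f"X={x}: treated weight = {w_treated:.2f}, control weight = {w_control:.2f}")

# X=1: treated weight = 10.00, control weight = 1.11  (i.e., 10/9)
# X=0: treated weight = 1.25,  control weight = 5.00  (i.e., 5/4 and 5)
```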

Pseudo population

  • IPTW creates a pseudo-population in which treatment assignment no longer depends on $X$

    • So there is no confounding in the pseudo-population

  • In the original population, some people were more likely to get treated based on their X’s

  • In the pseudo-population, everyone is equally likely to get treated, regardless of their X’s

Estimation with IPTW

  • We can estimate $E(Y^1)$ as $$\hat{E}(Y^1) = \frac{\sum_{i=1}^{n} \frac{1}{\pi_i} A_i Y_i}{\sum_{i=1}^{n} \frac{1}{\pi_i} A_i}$$

    • where $\pi_i = P(A_i = 1 \mid X_i)$ is the propensity score
    • The numerator is the sum of the $Y$'s in the treated pseudo-population
    • The denominator is the number of subjects in the treated pseudo-population
  • We can estimate $E(Y^0)$ as $$\hat{E}(Y^0) = \frac{\sum_{i=1}^{n} \frac{1}{1-\pi_i} (1-A_i) Y_i}{\sum_{i=1}^{n} \frac{1}{1-\pi_i} (1-A_i)}$$

  • Average treatment effect: $E(Y^1) - E(Y^0)$ (see the code sketch below)
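A minimal NumPy sketch of these estimators on simulated data; the data-generating process is illustrative, and the true propensity scores are used in place of estimated ones just to keep the example short:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Simulated data with a single binary confounder X, as in the motivating example
X = rng.binomial(1, 0.5, size=n)
pi = np.where(X == 1, 0.1, 0.8)                  # propensity scores P(A=1 | X)
A = rng.binomial(1, pi)                          # treatment received
Y = 2.0 * A + 3.0 * X + rng.normal(size=n)       # true causal effect of A is 2.0

# IPTW estimators of E(Y^1) and E(Y^0) (sums over the pseudo-population)
EY1 = np.sum(A * Y / pi) / np.sum(A / pi)
EY0 = np.sum((1 - A) * Y / (1 - pi)) / np.sum((1 - A) / (1 - pi))

print("E(Y^1) estimate:", EY1)
print("E(Y^0) estimate:", EY0)
print("ATE estimate:   ", EY1 - EY0)             # should be close to 2.0
```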

Marginal Structural Models

Marginal structural models

  • Marginal structural models (MSM): a model for the mean of the potential outcomes

  • Marginal: not conditional on the confounders (population average)

  • Structural: for potential outcomes, not observed outcomes

Linear MSM and logistic MSM

  • Linear MSM: $E(Y^a) = \psi_0 + \psi_1 a$, for $a = 0, 1$

    • $E(Y^0) = \psi_0$ and $E(Y^1) = \psi_0 + \psi_1$
    • So the average causal effect is $E(Y^1) - E(Y^0) = \psi_1$
  • Logistic MSM: $\operatorname{logit}\{E(Y^a)\} = \psi_0 + \psi_1 a$, for $a = 0, 1$

    • So the causal odds ratio $\dfrac{P(Y^1=1)/\{1-P(Y^1=1)\}}{P(Y^0=1)/\{1-P(Y^0=1)\}} = e^{\psi_1}$ (see the short derivation below)
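The odds-ratio expression follows directly from differencing the logistic MSM at $a = 1$ and $a = 0$; a short derivation using only the model stated above:

```latex
% logit{E(Y^a)} = psi_0 + psi_1 a, and E(Y^a) = P(Y^a = 1) for a binary outcome, so
\[
\operatorname{logit}\{P(Y^1=1)\} - \operatorname{logit}\{P(Y^0=1)\}
  = (\psi_0 + \psi_1) - \psi_0
  = \psi_1 ,
\]
% and exponentiating this difference of log odds gives the causal odds ratio:
\[
\frac{P(Y^1=1)\,/\,\{1-P(Y^1=1)\}}{P(Y^0=1)\,/\,\{1-P(Y^0=1)\}} = e^{\psi_1}.
\]
```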

MSM with effect modification

  • Suppose V is a variable that modifies the effect of A

  • A linear MSM with effect modification: $E(Y^a \mid V) = \psi_0 + \psi_1 a + \psi_3 V + \psi_4 a V$, for $a = 0, 1$

    • So the average causal effect is $E(Y^1 \mid V) - E(Y^0 \mid V) = \psi_1 + \psi_4 V$
  • General MSM: $g\{E(Y^a \mid V)\} = h(a, V; \psi)$
    • $g(\cdot)$: link function
    • $h(\cdot)$: a function specifying the parametric form in $a$ and $V$ (typically additive and linear)

MSM estimation using pseudo-population

  • Because of confounding, the MSM $g\{E(Y^a)\} = \psi_0 + \psi_1 a$ is different from the GLM (generalized linear model) $g\{E(Y_i \mid A_i)\} = \psi_0 + \psi_1 A_i$ fit to the observed data

  • Pseudo-population (obtained from IPTW) is free of confounding

    • We therefore estimate the MSM by fitting a GLM with the IPTW weights

MSM estimation steps

  1. Estimate the propensity score using logistic regression

  2. Create weights

    • Inverse of propensity score for treated subjects
    • Inverse of one minus propensity score for control subjects
  3. Specify the MSM of interest

  4. Use software to fit a weighted generalized linear model

  5. Use an asymptotic (sandwich) variance estimator

    • This accounts for the fact that the pseudo-population might be larger than the actual sample size
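A compact sketch of steps 1–4 on simulated data, using statsmodels (the variable names and data-generating process are illustrative, not from the course). Step 5 is approximated here with a heteroskedasticity-robust (HC0) covariance, which treats the estimated weights as fixed; the bootstrap in the next subsection is an alternative.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5_000
X = rng.normal(size=n)                           # a single confounder
p_true = 1 / (1 + np.exp(-(0.5 + 1.5 * X)))      # true P(A=1 | X)
A = rng.binomial(1, p_true)
Y = 2.0 * A + 1.0 * X + rng.normal(size=n)       # true psi_1 = 2.0

# Step 1: estimate the propensity score with logistic regression
design_ps = sm.add_constant(X)
pi_hat = sm.Logit(A, design_ps).fit(disp=0).predict(design_ps)

# Step 2: IPTW weights (inverse probability of the treatment actually received)
w = np.where(A == 1, 1 / pi_hat, 1 / (1 - pi_hat))

# Steps 3-4: linear MSM E(Y^a) = psi_0 + psi_1 a, fit as a weighted linear model
# (the Gaussian / identity-link case of a weighted GLM) of Y on A
design_msm = sm.add_constant(A)
res = sm.WLS(Y, design_msm, weights=w).fit(cov_type="HC0")
print(res.params)   # [psi_0_hat, psi_1_hat]; psi_1_hat should be close to 2.0
print(res.bse)      # robust standard errors (estimated weights treated as known)
```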

Bootstrap

  • We may also use the bootstrap to estimate the standard error

  • Bootstrap steps

    1. Randomly sample with replacement from the original sample

    2. Estimate parameters

    3. Repeat steps 1 and 2 many times

    4. Use the standard deviation of the bootstrap estimates as an estimate of the standard error
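A minimal sketch of these bootstrap steps for the IPTW estimate of the average treatment effect, assuming NumPy arrays X, A, Y as in the earlier sketches; estimate_ate and bootstrap_se are illustrative helpers, not course code:

```python
import numpy as np
import statsmodels.api as sm

def estimate_ate(X, A, Y):
    """Re-estimate the propensity score and the IPTW estimate of E(Y^1) - E(Y^0)."""
    design = sm.add_constant(X)
    pi = sm.Logit(A, design).fit(disp=0).predict(design)
    ey1 = np.sum(A * Y / pi) / np.sum(A / pi)
    ey0 = np.sum((1 - A) * Y / (1 - pi)) / np.sum((1 - A) / (1 - pi))
    return ey1 - ey0

def bootstrap_se(X, A, Y, n_boot=500, seed=0):
    rng = np.random.default_rng(seed)
    n = len(Y)
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)                         # step 1: resample with replacement
        estimates.append(estimate_ate(X[idx], A[idx], Y[idx]))   # step 2: re-estimate the parameter
    # steps 3-4: repeat, then use the SD of the bootstrap estimates as the standard error
    return np.std(estimates, ddof=1)
```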

Assessing covariate balance with weights

Covariate balance check with standardized differences

  • Covariate balance can be checked on the weighted sample using the standardized mean difference (SMD): $$\text{smd} = \frac{\bar{X}_{\text{treatment}} - \bar{X}_{\text{control}}}{\sqrt{\dfrac{s^2_{\text{treatment}} + s^2_{\text{control}}}{2}}}$$

    • Weighted means: $\bar{X}_{\text{treatment}}$, $\bar{X}_{\text{control}}$
    • Weighted variances: $s^2_{\text{treatment}}$, $s^2_{\text{control}}$
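A minimal NumPy sketch of this weighted SMD for a single covariate, with illustrative argument names (x: covariate values, a: treatment indicator, w: IPTW weights):

```python
import numpy as np

def weighted_smd(x, a, w):
    """Standardized mean difference of x between treated (a == 1) and control (a == 0),
    using weighted means and weighted variances."""
    stats = {}
    for group in (1, 0):
        m = (a == group)
        mean = np.average(x[m], weights=w[m])               # weighted mean
        var = np.average((x[m] - mean) ** 2, weights=w[m])  # weighted variance
        stats[group] = (mean, var)
    return (stats[1][0] - stats[0][0]) / np.sqrt((stats[1][1] + stats[0][1]) / 2)
```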

Balance check tools

  • Table 1

  • SMD plot

If imbalance after weighting

  • Refine propensity score model

    • Interactions
    • Non-linearity
  • Then reassess balance
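For instance, interactions and non-linear terms can be added through a formula interface; a sketch with an illustrative data frame and made-up column names (A, age, severity):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Illustrative data; the variables and coefficients are made up for the sketch
rng = np.random.default_rng(2)
df = pd.DataFrame({"age": rng.normal(50, 10, 2000), "severity": rng.normal(0, 1, 2000)})
logit_p = -10 + 0.2 * df["age"] + 0.5 * df["severity"]
df["A"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# Refined propensity score model: add an interaction and a quadratic term
ps_model = smf.logit("A ~ age + severity + age:severity + I(age ** 2)", data=df).fit(disp=0)
pi_hat = ps_model.predict(df)
# Recompute the IPTW weights from pi_hat, then recheck balance (e.g., weighted SMDs)
```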

Problems and remedies for large weights

Larger weights lead to more noise

  • For a subject with a large weight, that subject's outcome data can greatly affect the parameter estimates

  • A subject with a large weight can also affect standard error estimation via the bootstrap, depending on whether that subject is selected in a given resample

  • An extremely large weight means the probability of receiving that treatment was very small, which points to a potential violation of the positivity assumption

Check weights via plots and summary statistics

  • Investigate very large weights: identify the subjects with large weights and find what’s unusual about them
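A quick sketch of such checks, assuming the weights are in a NumPy array w and the corresponding data in a pandas DataFrame df (both names are illustrative):

```python
import numpy as np

# Summary statistics of the weights: a long right tail signals potential positivity problems
print("weight percentiles (50/90/95/99/max):", np.percentile(w, [50, 90, 95, 99, 100]))

# Identify the subjects with the largest weights and inspect what is unusual about them
largest = np.argsort(w)[-10:]
print(df.iloc[largest])
```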

Option 1: trimming the tails

  • Large weights occur in the tails of the propensity score distribution

  • Trim the tails to eliminate some of the extreme weights

    • Remove treated subjects whose propensity scores are above the 98th percentile of the distribution among controls
    • Remove control subjects whose propensity scores are below the 2nd percentile of the distribution among treated
  • Note: trimming the tails changes the population
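A minimal sketch of this trimming rule, assuming estimated propensity scores pi_hat and a treatment indicator a as NumPy arrays (illustrative names):

```python
import numpy as np

upper = np.percentile(pi_hat[a == 0], 98)   # 98th percentile of the scores among controls
lower = np.percentile(pi_hat[a == 1], 2)    # 2nd percentile of the scores among treated

# Drop treated subjects above the upper cutoff and controls below the lower cutoff
keep = ~((a == 1) & (pi_hat > upper)) & ~((a == 0) & (pi_hat < lower))
# Refit the propensity score and weights on the trimmed sample; the target population has changed
```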

Option 2: truncating the weights

  • Another option to deal with large weights is truncation

  • Weight truncation steps

    1. Determine a maximum allowable weight
      • Can be a specific value (e.g., 100)
      • Can be based on a percentile (e.g., the 99th)
    2. If a weight is greater than the maximum allowable value, set it to the maximum allowable value
  • Bias-variance trade-off
    • Truncation: introduces some bias, but smaller variance
    • No truncation: unbiased, but larger variance
  • Truncating extremely large weights can therefore result in estimators with lower mean squared error (MSE)
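A minimal sketch of weight truncation, with the cap set at the 99th percentile of the weights (w is an illustrative NumPy array of IPTW weights):

```python
import numpy as np

cap = np.percentile(w, 99)          # maximum allowable weight (could instead be a fixed value, e.g., 100)
w_truncated = np.minimum(w, cap)    # any weight above the cap is set to the cap
```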
