Course Notes: A Crash Course on Causality -- Week 2: Confounding and Directed Acyclic Graphs (DAGs)


Confounding

Confounding

  • Confounders: variables that affect both the treatment and the outcome

    • If we assign treatment based on a coin flip, since the coin flip doesn’t affect the outcome, it’s not a confounder

    • If older people are at higher risk of heart disease (the outcome) and are more likely to receive the treatment, then age is a confounder

  • To control for confounders, we need to

    1. Identify a set of variables \(X\) that will make the ignorability assumption hold
       • Causal graphs will help answer this question
    2. Use statistical methods to control for these variables and estimate causal effects

Causal Graphs

Overview of graphical models

  • Encode assumptions about relationships among variables

    • Tell us which variables are independent, dependent, conditionally independent, etc.

Terminology of Directed Acyclic Graphs (DAGs)

Terminology of graphs

  • Directed graph: shows that \(A\) affects \(Y\)

  • Undirected graph: \(A\) and \(Y\) are associated with each other

  • Nodes or vertices: \(A\) and \(Y\)

    • We can think of them as variables
  • Edge: the link between \(A\) and \(Y\)

  • Directed graph: all edges are directed

  • Adjacent variables: variables connected by an edge

Paths

  • A path is a way to get from one vertex to another, traveling along edges

  • There are 2 paths from \(W\) to \(B\):
    • \(W \rightarrow Z \rightarrow B\)
    • \(W \rightarrow Z \rightarrow A \rightarrow B\)
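As a concrete check of the two paths listed above, here is a minimal sketch using NetworkX, assuming the figure's edges are \(W \rightarrow Z\), \(Z \rightarrow A\), \(Z \rightarrow B\), and \(A \rightarrow B\) (the structure implied by the two listed paths).

```python
import networkx as nx

# Assumed edges, implied by the two listed paths:
# W -> Z, Z -> A, Z -> B, A -> B
G = nx.DiGraph([("W", "Z"), ("Z", "A"), ("Z", "B"), ("A", "B")])

# Enumerate every simple directed path from W to B
for path in nx.all_simple_paths(G, source="W", target="B"):
    print(" -> ".join(path))
# Prints the two paths: W -> Z -> B and W -> Z -> A -> B
```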

Directed Acyclic Graphs (DAGs)

  • No undirected edges

  • No cycles

  • This is a DAG

More terminology

  • \(A\) is \(Z\)’s parent
  • \(D\) has two parents, \(B\) and \(Z\)
  • \(B\) is a child of \(Z\)
  • \(D\) is a descendant of \(A\)
  • \(Z\) is an ancestor of \(D\)

Relationship between DAGs and probability distributions

DAG example 1

  • \(C\) is independent of all of the other variables \[ P(C\mid A, B, D) = P(C) \]

  • \(B\) and \(C, D\) are independent, conditional on \(A\) \[ P(B\mid A, C, D) = P(B\mid A) \Longleftrightarrow B \perp C, D \mid A \]

  • \(B\) and \(D\) are marginally dependent \[ P(B\mid D) \neq P(B) \]

DAG example 2

  • \(A\) and \(B\) are independent, conditional on \(C\) and \(D\) \[ P(A\mid B, C, D) = P(A\mid C, D) \Longleftrightarrow A \perp B \mid C, D \]

  • \(C\) and \(D\) are independent, conditional on \(A\) and \(B\) \[ P(D\mid A, B, C) = P(D\mid A, B) \Longleftrightarrow D \perp C \mid A, B \]

Decomposition of joint distributions

  1. Start with roots (nodes with no parents)

  2. Proceed down the descendant line, always conditioning on parents

  • \(P(A, B, C, D) = P(C)P(D)P(A\mid D)P(B\mid A)\)

  • \(P(A, B, C, D) = P(D)P(A\mid D)P(B\mid D)P(C\mid A, B)\)
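Both factorizations follow the same recipe: each variable is conditioned on its parents. As a minimal numeric sketch of the first factorization, \(P(A, B, C, D) = P(C)P(D)P(A\mid D)P(B\mid A)\), the binary probability tables below are made up purely for illustration.

```python
import itertools

# Illustrative (made-up) binary probability tables for the factorization
# P(A, B, C, D) = P(C) P(D) P(A | D) P(B | A)
p_C = {0: 0.7, 1: 0.3}                    # root node C
p_D = {0: 0.6, 1: 0.4}                    # root node D
p_A_given_D = {0: {0: 0.8, 1: 0.2},       # p_A_given_D[d][a] = P(A = a | D = d)
               1: {0: 0.3, 1: 0.7}}
p_B_given_A = {0: {0: 0.9, 1: 0.1},       # p_B_given_A[a][b] = P(B = b | A = a)
               1: {0: 0.4, 1: 0.6}}

def joint(a, b, c, d):
    """Joint probability built from the DAG: each node conditioned on its parents."""
    return p_C[c] * p_D[d] * p_A_given_D[d][a] * p_B_given_A[a][b]

# The factorization defines a proper joint distribution: the probabilities sum to 1
total = sum(joint(a, b, c, d)
            for a, b, c, d in itertools.product([0, 1], repeat=4))
print(round(total, 10))  # 1.0
```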

Compatibility between DAGs and distributions

  • In the above examples, the DAGs admit the probability factorizations. Hence, the probability function and the DAG are compatible

  • DAGs that are compatible with a particular probability function are not necessarily unique

  • Example 1:

  • Example 2:

  • In both of the above examples, \(A\) and \(B\) are dependent, i.e., \(P(A, B) \neq P(A) P(B)\)

Types of paths, blocking, and colliders

Types of paths

  • Forks

  • Chains

  • Inverted forks

When do paths induce associations?

  • If nodes \(A\) and \(B\) are on the ends of a path, then they are associated (via this path), if

    • Some information flows to both of them (aka Fork), or
    • Information from one makes it to the other (aka Chain)
  • Example: information flows from \(E\) to \(A\) and \(B\)

  • Example: information from \(A\) makes it to \(B\)

Paths that do not induce association

  • Information from \(A\) and \(B\) collide at \(G\)

  • \(G\) is a collider

  • \(A\) and \(B\) both affect \(G\):

    • Information does not flow from \(G\) to either \(A\) or \(B\)
    • So \(A\) and \(B\) are independent (if this is the only path between them)
  • If there is a collider anywhere on the path from \(A\) to \(B\), then no association between \(A\) and \(B\) comes from this path

Blocking on a chain

  • Paths can be blocked by conditioning on nodes in the path

  • In the graph below, \(G\) is a node in the middle of a chain. If we condition on \(G\), then we block the path from \(A\) to \(B\)

  • For example, \(A\) is the temperature, \(G\) is whether sidewalks are icy, and \(B\) is whether someone falls
    • \(A\) and \(B\) are associated marginally
    • But if we condition on the sidewalk condition \(G\), then \(A\) and \(B\) are independent (see the simulation sketch below)
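A short simulation of this chain, with made-up probabilities (\(A\) = cold day, \(G\) = icy sidewalk, \(B\) = someone falls), shows the marginal association disappearing once we condition on \(G\).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000

# Chain A -> G -> B with illustrative, made-up probabilities
A = rng.random(n) < 0.5                     # cold day?
G = rng.random(n) < np.where(A, 0.8, 0.1)   # icy sidewalk, more likely when cold
B = rng.random(n) < np.where(G, 0.3, 0.05)  # fall, more likely when icy

# Marginally, A and B are associated: P(B | A) differs from P(B | not A)
print(B[A].mean(), B[~A].mean())            # clearly different

# Conditional on G (the middle of the chain), the association disappears
print(B[G & A].mean(), B[G & ~A].mean())    # approximately equal
print(B[~G & A].mean(), B[~G & ~A].mean())  # approximately equal
```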

Blocking on a fork

  • Associations on a fork can also be blocked

  • In the following fork, if we condition on \(G\), then the path from \(A\) to \(B\) is blocked

No need to block a collider

  • The opposite situation occurs if we condition on a collider

  • In the following inverted fork

    • Originally \(A\) and \(B\) are not associated, since information collides at \(G\)
    • But if we condition on \(G\), then \(A\) and \(B\) become associated
  • Example: \(A\) and \(B\) are the states of two on/off switches, and \(G\) is whether the lightbulb is lit up.

    • The two switches \(A\) and \(B\) are determined by two independent coin flips

    • \(G\) is lit up only if both \(A\) and \(B\) are in the on state

    • Conditional on \(G\), the two switches are no longer independent: for example, if \(G\) is off and \(B\) is on, then \(A\) must be off (see the simulation below)
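A short simulation of the two-switch example makes the collider effect visible: \(A\) and \(B\) are independent coin flips, and the bulb \(G\) is lit only when both are on.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

A = rng.random(n) < 0.5   # switch A on?
B = rng.random(n) < 0.5   # switch B on?
G = A & B                 # bulb lit only when both switches are on

# Marginally, A and B are independent: P(A on | B) is about 0.5 either way
print(A[B].mean(), A[~B].mean())

# Conditional on the collider G, they become dependent:
# among "bulb off" observations, knowing B is on forces A to be off
off = ~G
print(A[off & B].mean(), A[off & ~B].mean())  # 0.0 vs. roughly 0.5
```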

d-separation

d-separation

  • A path is d-separated by a set of nodes \(C\) if

    • It contains a chain (\(D\rightarrow E \rightarrow F\)) and the middle part is in \(C\), or

    • It contains a fork (\(D\leftarrow E \rightarrow F\)) and the middle part is in \(C\), or

    • It contains an inverted fork (\(D\rightarrow E \leftarrow F\)), and the middle part is not in \(C\), nor are any descendants of it

  • Two nodes, \(A\) and \(B\), are d-separated by a set of nodes \(C\) if \(C\) blocks every path from \(A\) to \(B\). Thus \[ A\perp B \mid C \]

  • Recall the ignorability assumption \[ Y^0, Y^1 \perp A \mid X \]
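The d-separation rules above can be checked programmatically. Below is a minimal sketch on the three basic path types, using NetworkX's d-separation utility (exposed as nx.d_separated in older releases and nx.is_d_separator in newer ones).

```python
import networkx as nx

# NetworkX exposes the d-separation check as is_d_separator in newer releases
# and as d_separated in older ones; use whichever is available.
d_sep = getattr(nx, "is_d_separator", None) or nx.d_separated

chain    = nx.DiGraph([("D", "E"), ("E", "F")])   # chain:         D -> E -> F
fork     = nx.DiGraph([("E", "D"), ("E", "F")])   # fork:          D <- E -> F
collider = nx.DiGraph([("D", "E"), ("F", "E")])   # inverted fork: D -> E <- F

# Chain and fork: conditioning on the middle node E blocks the path
print(d_sep(chain, {"D"}, {"F"}, {"E"}))     # True
print(d_sep(fork,  {"D"}, {"F"}, {"E"}))     # True

# Inverted fork: D and F are d-separated by the empty set,
# but conditioning on the collider E opens the path
print(d_sep(collider, {"D"}, {"F"}, set()))  # True
print(d_sep(collider, {"D"}, {"F"}, {"E"}))  # False
```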

Confounders on paths

  • A simple DAG: \(X\) is a confounder of the relationship between treatment \(A\) and outcome \(Y\)

  • A slightly more complicated graph

    • \(V\) affects \(A\) directly
    • \(V\) affects \(Y\) indirectly, through \(W\)
    • Thus, \(V\) is a confounder

Frontdoor and backdoor paths

Frontdoor paths

  • A frontdoor path from \(A\) to \(Y\) is one that begins with an arrow emanating out of \(A\)

  • We do not worry about frontdoor paths, because they capture effects of treatment

  • Example: \(A\rightarrow Y\) is a frontdoor path from \(A\) to \(Y\)

  • Example: \(A\rightarrow Z \rightarrow Y\) is a frontdoor path from \(A\) to \(Y\)

Do not block nodes on the frontdoor path

  • If we are interested in the causal effect of \(A\) on \(Y\), we should not control for (aka block) \(Z\)

    • This is because controlling for \(Z\) would be controlling for an effect of treatment

  • Causal mediation analysis involves understanding frontdoor paths from \(A\) to \(Y\)

Backdoor paths

  • Backdoor paths from treatment \(A\) to outcome \(Y\) are paths from \(A\) to \(Y\) that begin with an arrow pointing into \(A\)

  • Here, \(A \leftarrow X \rightarrow Y\) is a backdoor path from \(A\) to \(Y\)

  • Backdoor paths confound the relationship between \(A\) and \(Y\), so they need to be blocked!

  • To sufficiently control for confounding, we must identify a set of variables that block all backdoor paths from treatment to outcome

    • Recall the ignorability: if \(X\) is this set of variables, then \(Y^0, Y^1 \perp A \mid X\)

Criteria

  • Next we will discuss two criteria to identify sets of variables that are sufficient to control for confounding

    • Backdoor path criterion: if the graph is known
    • Disjunctive cause criterion: if the graph is not known

Backdoor path criterion

Backdoor path criterion

  • Backdoor path criterion: a set of variables \(X\) is sufficient to control for confounding if

    • It blocks all backdoor paths from treatment to the outcome, and
    • It does not include any descendants of treatment
  • Note: the solution \(X\) is not necessarily unique

Backdoor path criterion: a simple example

  • There is one backdoor path from \(A\) to \(Y\)

    • It is not blocked by a collider
  • Sets of variables that are sufficient to control for confounding:

    • \(\{V\}\), or
    • \(\{W\}\), or
    • \(\{V, W\}\)

Backdoor path criterion: a collider example

  • There is one backdoor path from \(A\) to \(Y\)

    • It is blocked by a collider \(M\), so there is no confounding
  • If we condition on \(M\), then it opens a path between \(V\) and \(W\) (checked in the sketch below)

  • Sets of variables that are sufficient to control for confounding:
    • \(\{\}\), \(\{V\}\), \(\{W\}\), \(\{M, V\}\), \(\{M, W\}\), \(\{M, V, W\}\)
    • But not \(\{M\}\)
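These claims can be verified with the same d-separation check, under an assumption about the figure (not reproduced in these notes): the DAG here is taken to be \(W \rightarrow A\), \(W \rightarrow M\), \(V \rightarrow M\), \(V \rightarrow Y\), \(A \rightarrow Y\). The backdoor test deletes the arrows out of \(A\) and asks whether the candidate set d-separates \(A\) and \(Y\).

```python
import networkx as nx

# Assumed DAG for this example (the figure is not reproduced in these notes):
# W -> A, W -> M, V -> M, V -> Y, A -> Y
G = nx.DiGraph([("W", "A"), ("W", "M"), ("V", "M"), ("V", "Y"), ("A", "Y")])

# d-separation check, compatible with old and new NetworkX APIs
d_sep = getattr(nx, "is_d_separator", None) or nx.d_separated

# Conditioning on the collider M opens a path between V and W
print(d_sep(G, {"V"}, {"W"}, set()))   # True: marginally d-separated
print(d_sep(G, {"V"}, {"W"}, {"M"}))   # False: dependent once we condition on M

# Backdoor test: delete the arrows out of A, then check whether the
# candidate set d-separates A and Y
G_back = G.copy()
G_back.remove_edges_from(list(G.out_edges("A")))
for S in [set(), {"M"}, {"M", "V"}, {"M", "W"}]:
    print(sorted(S), d_sep(G_back, {"A"}, {"Y"}, S))
# The empty set, {M, V}, and {M, W} block the backdoor path; {M} alone does not
```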

Backdoor path criterion: an example with multiple backdoor paths

  • First path: \(A \leftarrow Z \leftarrow V \rightarrow Y\)

    • No collider on this path
    • So controlling for either \(Z\), \(V\), or both is sufficient
  • Second path: \(A \leftarrow W \rightarrow Z \leftarrow V \rightarrow Y\)

    • \(Z\) is a collider
    • So controlling for \(Z\) opens a path between \(W\) and \(V\)
    • We can block this path by controlling for \(\{V\}\), \(\{W\}\), \(\{Z, V\}\), \(\{Z, W\}\), or \(\{Z, V, W\}\)
  • To block both paths, it's sufficient to control for

    • \(\{V\}\), \(\{Z, V\}\), \(\{Z, W\}\), or \(\{Z, V, W\}\)
    • But not \(\{Z\}\) or \(\{W\}\) (see the check below)
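The adjustment sets listed above can be checked the same way. Here the edges follow directly from the two paths written out on this slide, plus the treatment arrow \(A \rightarrow Y\); none of the candidate sets contains a descendant of \(A\), so the backdoor criterion reduces to d-separation in the graph with \(A\)'s outgoing arrows removed.

```python
import networkx as nx

# Edges implied by the two backdoor paths plus the treatment effect:
# V -> Z, Z -> A, W -> A, W -> Z, V -> Y, A -> Y
G = nx.DiGraph([("V", "Z"), ("Z", "A"), ("W", "A"),
                ("W", "Z"), ("V", "Y"), ("A", "Y")])

# d-separation check, compatible with old and new NetworkX APIs
d_sep = getattr(nx, "is_d_separator", None) or nx.d_separated

# Backdoor test: delete the arrows out of A, then check d-separation of A and Y
G_back = G.copy()
G_back.remove_edges_from(list(G.out_edges("A")))

for S in [{"V"}, {"Z", "V"}, {"Z", "W"}, {"Z", "V", "W"}, {"Z"}, {"W"}]:
    print(sorted(S), d_sep(G_back, {"A"}, {"Y"}, S))
# True for {V}, {Z, V}, {Z, W}, {Z, V, W}; False for {Z} and {W}
```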

Disjunctive cause criterion

Disjunctive cause criterion

  • For many problems, it is difficult to write down accurate DAGs

  • In this case, we can use the disjunctive cause criterion: control for all observed causes of the treatment, the outcome, or both

  • If there exists a set of observed variables that satisfy the backdoor path criterion, then the variables selected based on the disjunctive cause criterion are sufficient to control for confounding

  • The disjunctive cause criterion does not always select the smallest set of variables to control for, but it is conceptually simple

Example

  • Observed pre-treatment variables: \(\{M, W, V\}\)

  • Unobserved pre-treatment variables: \(\{U_1, U_2\}\)

  • Suppose we know that \(W\) and \(V\) are causes of \(A\), \(Y\), or both

  • Suppose \(M\) is not a cause of either \(A\) or \(Y\)

  • Compare two methods for selecting variables:

  1. Use all pre-treatment covariates: \(\{M, W, V\}\)
  2. Use variables based on the disjunctive cause criterion: \(\{W, V\}\)

Example continued: hypothetical DAG 1

  1. Use all pre-treatment covariates: \(\{M, W, V\}\)
     • Satisfy backdoor path criterion? Yes
  2. Use variables based on disjunctive cause criterion: \(\{W, V\}\)
     • Satisfy backdoor path criterion? Yes

Example continued: hypothetical DAG 2

  1. Use all pre-treatment covariates: \(\{M, W, V\}\)
     • Satisfy backdoor path criterion? Yes
  2. Use variables based on disjunctive cause criterion: \(\{W, V\}\)
     • Satisfy backdoor path criterion? Yes

Example continued: hypothetical DAG 3

  1. Use all pre-treatment covariates: \(\{M, W, V\}\)
     • Satisfy backdoor path criterion? No
  2. Use variables based on disjunctive cause criterion: \(\{W, V\}\)
     • Satisfy backdoor path criterion? Yes

Example continued: hypothetical DAG 4

  1. Use all pre-treatment covariates: \(\{M, W, V\}\)
     • Satisfy backdoor path criterion? No
  2. Use variables based on disjunctive cause criterion: \(\{W, V\}\)
     • Satisfy backdoor path criterion? No
