*For the pdf slides, click here*

# Confounding

### Confounding

Confounders: variables that affect both the treatment and the outcome

If we assign treatment based on a coin flip, since the coin flip doesn’t affect the outcome, it’s not a confounder

If older people are at higher risk of heart disease (the outcome) and are more likely to receive the treatment, then age is a confounder

To control for confounders, we need to

- Identify a set of variables \(X\) that will make the ignorability assumption hold

- Causal graphs will help answer this question

- Use statistical methods to control for these variables and estimate causal effects

# Causal Graphs

### Overview of graphical models

Encode assumption about relationship among variables

- Tells use which variables are independent, dependent, conditionally independent, etc

## Terminologies of Directed Acyclic Graphs (DAGs)

### Terminology of graphs

- Directed graph: shows that \(A\) affects \(Y\)

- Undirected graph: \(A\) and \(Y\) are associated with each other

Nodes or vertices: \(A\) and \(Y\)

- We can think of them as variables

Edge: the link between \(A\) and \(Y\)

Directed graph: all edges are directed

Adjacent variables: if connected by an edge

### Paths

A path is a way to get from one vertex to another, traveling along edges

- There are 2 paths from \(W\) to \(B\):
- \(W \rightarrow Z \rightarrow B\)
- \(W \rightarrow Z \rightarrow A \rightarrow B\)

### Directed Acyclic Graphs (DAGs)

- No undirected paths

- No cycles

- This is a DAG

### More terminology

- \(A\) is \(Z\)’s parent
- \(D\) has two parents, \(B\) and \(Z\)
- \(B\) is a child of \(Z\)
- \(D\) is a descendant of \(A\)
- \(Z\) is a ancestor of \(D\)

## Relationship between DAGs and probability distributions

### DAG example 1

C is independent of all variables \[ P(C\mid A, B, D) = P(C) \]

\(B\) and \(C, D\) are independent, conditional on \(A\) \[ P(B\mid A, C, D) = P(B\mid A) \Longleftrightarrow B \perp C, D \mid A \]

\(B\) and \(D\) are marginally dependent \[ P(B\mid D) \neq P(B) \]

### DAG example 2

\(A\) and \(B\) are independent, conditional on \(C\) and \(D\) \[ P(A\mid B, C, D) = P(A\mid C, D) \Longleftrightarrow A \perp B \mid C, D \]

\(C\) and \(D\) are independent, conditional on \(A\) and \(B\) \[ P(D\mid A, B, C) = P(D\mid A, B) \Longleftrightarrow D \perp C \mid A, B \]

### Decomposition of joint distributions

Start with roots (nodes with no parents)

Proceed down the descendant line, always conditioning on parents

- \(P(A, B, C, D) = P(C)P(D)P(A\mid D)P(B\mid A)\)

- \(P(A, B, C, D) = P(D)P(A\mid D)P(B\mid D)P(C\mid A, B)\)

### Compatibility between DAGs and distributions

In the above examples, the DAGs admit the probability factorizations. Hence, the probability function and the DAG are compatible

DAGs that are compatible with a particular probability function are not necessarily unique

Example 1:

- Example 2:

- In both of the above examples, \(A\) and \(B\) are dependent, i.e., \(P(A, B) \neq P(A) P(B)\)

## Types of paths, blocking, and colliders

### Types of paths

- Forks

- Chains

- Inverted forks

### When do paths induce associations?

If nodes \(A\) and \(B\) are on the ends of a path, then they are associated (via this path), if

- Some information flows to both of them (aka Fork), or
- Information from one makes it to the other (aka Chain)

Example: information flows from \(E\) to \(A\) and \(B\)

- Example: information from \(A\) makes it to \(B\)

### Paths that do not induce association

- Information from \(A\) and \(B\) collide at \(G\)

\(G\) is a collider

\(A\) and \(B\) both affect \(G\):

- Information does not flow from \(G\) to either \(A\) or \(B\)
**So \(A\) and \(B\) are independent**(if this is the only path between them)

If there is a collider anywhere on the path from \(A\) to \(B\), then no association between \(A\) and \(B\) comes from this path

### Blocking on a chain

Paths can be blocked by conditioning on nodes in the path

In the graph below, \(G\) is a node in the middle of a chain. If we condition on \(G\), then we block the path from \(A\) to \(B\)

- For example, \(A\) is the temperature, \(G\) is whether sidewalks are icy, and \(B\) is whether someone falls
- \(A\) and \(B\) are associated marginally
- But if we conditional on the sidewalk condition \(G\), then \(A\) and \(B\) are independent

### Blocking on a fork

Associations on a fork can also be blocked

In the following fork, if we condition on \(G\), then the path from \(A\) to \(B\) is block

### No need to to block a collider

- The opposite situation occurs if a conllider is blocked

In the following inverted fork

- Originally \(A\) and \(B\) are not associated, since information collides at \(G\)
- But if we condition on \(G\), then \(A\) and \(B\) become associated

Example: \(A\) and \(B\) are the states of two on/off switches, and \(G\) is whether the lightbulb is lit up.

The two switches \(A\) and \(B\) are determined by two independent coin flips

\(G\) is lit up only if both \(A\) and \(B\) are in the on state

Conditional on \(G\), the two switches are not independent: if \(G\) is off, then \(A\) must be off if \(B\) is on

## d-separation

### d-separation

A path is d-separated by a set of nodes \(C\) if

It contains a chain (\(D\rightarrow E \rightarrow F\)) and the middle part is in \(C\), or

It contains a fork (\(D\leftarrow E \rightarrow F\)) and the middle part is in \(C\), or

It contains an inverted fork (\(D\rightarrow E \leftarrow F\)), and the middle part is not in \(C\), nor are any descendants of it

Two nodes, \(A\) and \(B\), are d-separated by a set of nodes \(C\) if it

**blocks every path**from \(A\) to \(B\). Thus \[ A\perp B \mid C \]Recall the ignorability assumption \[ Y^0, Y^1 \perp A \mid X \]

### Confounders on paths

- A simple DAG: \(X\) is a confounder between the relationship between treatment \(A\) and outcome \(Y\)

A slightly more complicated graph

- \(V\) affects \(A\) directly
- \(V\) affects \(Y\) indirectly, through \(W\)
- Thus, \(V\) is a confounder

## Frontdoor and backdoor paths

### Frontdoor paths

A frontdoor path from \(A\) to \(Y\) is one that

**begins with an arrow emanating out of \(A\)**We do not worry about frontdoor paths, because they capture effects of treatment

Example: \(A\rightarrow Y\) is a frontdoor path from \(A\) to \(Y\)

- Example: \(A\rightarrow Z \rightarrow Y\) is a frontdoor path from \(A\) to \(Y\)

### Do not block nodes on the frontdoor path

If we are interested in the causal effect of \(A\) on \(Y\), we should not control for (aka block) \(Z\)

- This is because controlling for \(Z\) would be controlling for an affect of treatment

- Causal mediation analysis involves understanding frontdoor paths from \(A\) and \(Y\)

### Backdoor paths

Backdoor paths from treatment \(A\) to outcome \(Y\) are paths from \(A\) to \(Y\) that

**travels through arrows going into \(A\)**Here, \(A \leftarrow X \rightarrow Y\) is a backdoor path from \(A\) to \(Y\)

Backdoor paths confound the relationship between \(A\) and \(Y\), so they need to be blocked!

To sufficiently control for confounding, we must identify a set of variables that block all backdoor paths from treatment to outcome

- Recall the ignorability: if \(X\) is this set of variables, then \(Y^0, Y^1 \perp A \mid X\)

### Criteria

Next we will discuss two criteria to identify sets of variables that are sufficient to control for confounding

- Backdoor path criterion: if the graph is known
- Disjunctive cause criterion: if the graph is not known

## Backdoor path criterion

### Backdoor path criterion

Backdoor path criterion: a set of variables \(X\) is sufficient to control for confounding if

- It blocks all backdoor paths from treatment to the outcome, and
- It does not include any descendants of treatment

Note: the solution \(X\) is not necessarily unique

### Backdoor path criterion: a simple example

There is one backdoor path from \(A\) to \(Y\)

- It is not blocked by a collider

Sets of variables that are sufficient to control for confounding:

- \(\{V\}\), or
- \(\{W\}\), or
- \(\{V, W\}\)

### Backdoor path criterion: a collider example

There is one backdoor path from \(A\) to \(Y\)

- It is blocked by a collider \(M\), so there is no confounding

If we condition on \(M\), then it open a path between \(V\) and \(W\)

- Sets of variables that are sufficient to control for confounding:
- \(\{\}\), \(\{V\}\), \(\{W\}\), \(\{M, V\}\), \(\{M, W\}\), \(\{M, V, W\}\)
- But not \(\{M\}\)

### Backdoor path criterion: a multi backdoor paths example

First path: \(A \leftarrow Z \leftarrow V \rightarrow Y\)

- No collider on this path
- So controlling for either \(Z\), \(V\), or both is sufficient

Second path: \(A \leftarrow W \rightarrow Z \leftarrow V \rightarrow Y\)

- \(Z\) is a collider
- So controlling \(Z\) opens a path between \(W\) and \(V\)
- We can block \(\{V\}\), \(\{W\}\), \(\{Z, V\}\), \(\{Z, W\}\), or \(\{Z, V, W\}\)

To block both paths, it’s sufficient to control for

- \(\{V\}\), \(\{Z, V\}\), \(\{Z, W\}\), or \(\{Z, V, W\}\)
- But not \(\{Z\}\) or \(\{W\}\)

## Disjunctive cause criterion

### Disjunctive cause criterion

- For many problems, it is difficult to write down accurate DAGs

In this case, we can use the disjunctive cause criterion: control for all observed causes of the treatment, the outcome, or both

If there exists a set of observed variables that satisfy the backdoor path criterion, then the variables selected based on the disjunctive cause criterion are sufficient to control for confounding

Disjunctive cause criterion does not always select the smallest set of variable to control for, but it is conceptually simple

### Example

Observed pre-treatment variables: \(\{M, W, V\}\)

Unobserved pre-treatment variables: \(\{U_1, U_2\}\)

Suppose we know: \(W, V\) are causes of \(A\), \(Y\) or both

Suppose \(M\) is not a cause of either \(A\) or \(Y\)

- Comparing two methods for selecting variables

- Use all pre-treatment covariates: \(\{M, W, V\}\)
- Use variables based on disjunctive cause criterion: \(\{W, V\}\)

### Example continued: hypothetical DAG 1

- Use all pre-treatment covariates: \(\{M, W, V\}\)

- Satisfy backdoor path criterion? Yes

- Use variables based on disjunctive cause criterion: \(\{W, V\}\)

- Satisfy backdoor path criterion? Yes

### Example continued: hypothetical DAG 2

- Use all pre-treatment covariates: \(\{M, W, V\}\)

- Satisfy backdoor path criterion? Yes

- Use variables based on disjunctive cause criterion: \(\{W, V\}\)

- Satisfy backdoor path criterion? Yes

### Example continued: hypothetical DAG 3

- Use all pre-treatment covariates: \(\{M, W, V\}\)

- Satisfy backdoor path criterion? No

- Use variables based on disjunctive cause criterion: \(\{W, V\}\)

- Satisfy backdoor path criterion? Yes

### Example continued: hypothetical DAG 4

- Use all pre-treatment covariates: \(\{M, W, V\}\)

- Satisfy backdoor path criterion? No

- Use variables based on disjunctive cause criterion: \(\{W, V\}\)

- Satisfy backdoor path criterion? No

### References

Coursera class: “A Crash Course on Causality: Inferring Causal Effects from Observational Data”, by Jason A. Roy (University of Pennsylvania)