# Should I Include Covariates in Diff-in-Diff?

Scott Cunningham, Jun 1

I have heard the following enough times that it has registered. And it happens among people who are usually fairly seasoned researchers. So both the frequency and the speakers have made me think it's probably a common enough belief. And that is this:

> If I include covariates, and my diff-in-diff estimates change, then I do not believe the diff-in-diff estimates.

It comes in many forms, but that's usually it in a nutshell. Today's post is probably going to be the first of a few Substacks on it, because I'm going to try to be brief, and that will require splitting the topic across a couple of posts. But first, I flipped a coin three times, it came up heads all three times, and therefore this post is paywalled (the paywall appears further down).

Thanks again for your support! If you're dying to learn more about the importance of including covariates in diff-in-diff, then consider becoming a paying subscriber! At $5/month, which is the absolute bare minimum Substack allows me to charge, it's a steal!

* * *

### Why do you include covariates in diff-in-diff?

It is well known that diff-in-diff has one key assumption called parallel trends. And if you satisfy it, you don't need to include any covariates as controls. Let me start with an illustration of what it means to satisfy parallel trends. Our outcome will be earnings, and I will compare college educated workers (our treatment group) with high school only workers (our control group). We will represent the untreated potential outcome as _Y(0)_ and the treated potential outcome as _Y(1)_, and therefore a treatment effect as _Y(1) - Y(0)_.

First, let's say that men's high school only earnings grow +10 a year, but women's high school only earnings grow +8 a year (euros, dollars, pounds, anything). We can write this as:

$$Y_{it}(0) = \alpha_i + 8t + 2(M_i \times t) + \varepsilon_{it}$$

where _M_ is a dummy variable equal to 1 if biologically male and 0 if biologically female, _alpha_ is a level constant that can differ between males and females if we want, and _epsilon_ is zero in expectation. Hence when _M=1_, _E[Y(0)]_ grows at a rate of 10, and when _M=0_, _E[Y(0)]_ grows at a rate of 8. Notice that this is an outcome model. It states that there is a "return" to being male and a "return" to being female, but that the two are not the same.

But subtly, notice also that that return is the same whether you are treated or not. If you are treated, then of course we never see _Y(0)_. We only see _Y(1)_. But that just means that for college educated workers, _Y(0)_ is counterfactual. 

And in this outcome model, we are saying that high school only males have different trends than females -- not just different levels (i.e., alpha) but trends. 
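
As a minimal sketch, the outcome model described above can be written as a small Python function. The function name `expected_y0` and the baseline levels of 30 and 45 are assumptions made only for illustration; the growth rates of 10 and 8 come from the text.

```python
def expected_y0(male: int, t: int, alpha: float = 0.0) -> float:
    """E[Y(0)] at time t: a level alpha plus a sex-specific trend.

    The trend is 8 per period for females (male=0) and 8 + 2 = 10 for males (male=1).
    """
    return alpha + 8 * t + 2 * male * t


# Growth in expected untreated earnings between t=0 and t=1.
# The baseline level (alpha) is hypothetical and cancels in the difference.
for male, label, alpha in [(1, "male", 30.0), (0, "female", 45.0)]:
    growth = expected_y0(male, t=1, alpha=alpha) - expected_y0(male, t=0, alpha=alpha)
    print(label, growth)
# male 10.0
# female 8.0
```

Changing `alpha` shifts the level of earnings but leaves the printed growth rates untouched, which is the sense in which sex here affects trends and not just levels.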

* * *

### Balanced 

Second, let's say that 75% of our college educated workers are males and 75% of our high school educated workers are males. Now let's take a first difference for everyone in the sample:

$$\Delta Y_{it}(0) = Y_{it}(0) - Y_{i,t-1}(0) = 8 + 2M_i + \Delta\varepsilon_{it}$$

When we take expectations, we get:

$$E[\Delta Y_{it}(0)] = 8 + 2E[M_i]$$

Note that the _alpha_ dropped out because it was a constant for each person _i_. So even if we allowed males and females to make different baseline earnings, the first difference wipes them out. It just doesn't wipe out the effect of sex on _trends_. That's the key here. 

Now, recall I said that the two groups were balanced: 75% of the treatment group is male and 75% of the control group is male. This means that we can use that equation to calculate the trend in average earnings for each group, and since the trend does not depend on treatment status, it will be the same for both. It will be 9.5, because 8 + 2 × 0.75 = 8 + 1.5 = 9.5.
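
The arithmetic above can be checked in a few lines of Python. This is only a sketch: the names `SHARE_MALE` and `group_trend` are made up for the example, while the 75% shares and the growth rates of 10 and 8 are the ones given in the text.

```python
# Share of males in each group, as stated in the text.
SHARE_MALE = {"college": 0.75, "high_school": 0.75}


def group_trend(share_male: float) -> float:
    """Expected first difference of Y(0) for a group: 8 + 2 * (share male)."""
    return 8 + 2 * share_male


trends = {group: group_trend(share) for group, share in SHARE_MALE.items()}
print(trends)  # {'college': 9.5, 'high_school': 9.5}

# Same number as a weighted average of the male (10) and female (8) trends.
weighted = 0.75 * 10 + 0.25 * 8
print(weighted)  # 9.5
```

Both routes give 9.5 for each group, so the difference in untreated trends between the college and high school groups is zero.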

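So the two groups are balanced, they both grow at 9.5, and thus the college group and the high school group satisfy _unconditional parallel trends_. As a result, _you do not need to control for sex_ in your diff-in-diff. You do not because every 2x2 is equal to this:

$$\hat{\delta}^{2\times 2} = ATT + \Big( E[\Delta Y(0) \mid D=1] - E[\Delta Y(0) \mid D=0] \Big)$$

where _D=1_ denotes the college group, _D=0_ denotes the high school group, and the term in parentheses is the parallel trends bias.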

And since we just showed that there isn't a parallel trends bias, the 2x2 is an unbiased and consistent estimate of the ATT. Done...
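
To see the balanced case at work, here is a small simulation sketch using only the Python standard library. Everything numeric beyond the 8 and 10 growth rates and the 75% male shares is an assumption chosen for illustration: the true ATT of 5, the baseline levels, the noise scale, the sample size, and the names `simulate_group`, `TRUE_ATT`, and `did_2x2`.

```python
import random
from statistics import mean


def simulate_group(n, share_male, treated, att, rng):
    """Return each person's change in observed earnings from t=0 to t=1."""
    diffs = []
    n_male = round(n * share_male)
    for i in range(n):
        male = 1 if i < n_male else 0
        # Hypothetical person-specific level; it differs by sex and by group,
        # but it cancels in the first difference.
        alpha = rng.gauss(50 + 5 * male + 20 * treated, 10)
        y_pre = alpha + rng.gauss(0, 1)                    # Y(0) at t=0
        y0_post = alpha + 8 + 2 * male + rng.gauss(0, 1)   # Y(0) at t=1
        y_post = y0_post + (att if treated else 0.0)       # add effect if treated
        diffs.append(y_post - y_pre)
    return diffs


rng = random.Random(0)
TRUE_ATT = 5.0  # hypothetical treatment effect, not from the original post

college = simulate_group(20_000, 0.75, True, TRUE_ATT, rng)
high_school = simulate_group(20_000, 0.75, False, TRUE_ATT, rng)

# 2x2 diff-in-diff with no covariates: difference in mean first differences.
did_2x2 = mean(college) - mean(high_school)
print(round(did_2x2, 2))
```

With both groups 75% male, the estimate without any control for sex should land close to the assumed ATT of 5, up to sampling noise.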

(C) 2026 scott cunningham  