9 Fixed Effects: The Intuition
The previous chapter introduced panel data models with equations and estimation procedures. This section takes a step back and builds the intuition visually — why do we need fixed effects, what goes wrong without them, and how the within estimator actually works.
9.1 One Regression Line Isn’t Enough
Imagine a coach tracking training hours and performance for athletes on two teams: Junior Varsity (JV) and Varsity. JV players start with a baseline of 30 skill points; Varsity players start at 60. Both teams have the same return to training — every additional 10 hours adds 1 performance point (slope = 0.1).
If we ignore team membership and run a single regression, we force one intercept on data that has two:
\[ \text{Performance}_i = \beta_0 + \beta_1 \, \text{Hours}_i + \varepsilon_i \]
A single \(\beta_0\) cannot be right for both groups. JV’s true intercept is 30 and Varsity’s is 60. One number has to compromise.
When the two teams train the same amount on average, team membership is uncorrelated with training hours, so the OLS slope is approximately correct. But that is luck, not design, and it is fragile: the slope estimate breaks down the moment the distribution of training hours differs across groups.
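The lucky case can be checked with a small simulation. This is a minimal sketch, assuming NumPy is available; the sample size, noise level, range of hours, and seed are illustrative assumptions, not values from the text. Both teams are given the identical set of training hours, so team membership is exactly uncorrelated with hours in the sample.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed (assumption)
n = 500                          # players per team (assumption)

# Both teams follow the same training schedule, so mean hours are equal.
base_hours = rng.uniform(0, 100, n)
hours = np.concatenate([base_hours, base_hours])
varsity = np.concatenate([np.zeros(n), np.ones(n)])

# Model from the text: baselines 30 (JV) and 60 (Varsity), common slope 0.1.
perf = 30 + 30 * varsity + 0.1 * hours + rng.normal(0, 1, 2 * n)

# Pooled OLS with a single intercept, ignoring team membership.
X = np.column_stack([np.ones_like(hours), hours])
intercept, slope = np.linalg.lstsq(X, perf, rcond=None)[0]

print(f"pooled slope:     {slope:.3f}")
print(f"pooled intercept: {intercept:.1f}")
```

Under these assumptions the pooled slope should land close to 0.1, while the pooled intercept should sit near 45, halfway between the two true baselines and right for neither team.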
9.2 Omitted Variable Bias in Disguise
Suppose Varsity players train more than JV players — maybe because they’re more motivated, or because their coaches push harder. Now the group effect (being Varsity = higher baseline) is positively correlated with training hours. OLS can’t distinguish “this player performs better because they trained more” from “this player performs better because they’re on Varsity.” It conflates the two.
This is omitted variable bias. The omitted variable is group membership. Map it to the OVB formula you already know:
\[ \hat{\beta}_1^{\text{short}} = \hat{\beta}_1^{\text{long}} + \hat{\beta}_2 \times \hat{\delta}_1 \]
where \(\hat{\beta}_2\) is the effect of being Varsity on performance (positive, because of the higher baseline) and \(\hat{\delta}_1\) is the slope from regressing the Varsity indicator on training hours. When Varsity trains more, \(\hat{\delta}_1 > 0\), so the bias \(\hat{\beta}_2 \times \hat{\delta}_1\) is positive: OLS overstates the return to training.
Flip it: if JV players train more (catching up), \(\hat{\delta}_1 < 0\) and the bias is negative. OLS might say training has zero effect, or even a negative one, when the true effect is positive.
Same true slope. Same baselines. Yet the OLS estimate swings wildly depending on which group trains more.
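The OVB formula can be verified numerically, since it holds exactly in any sample. The sketch below is illustrative only: the helper names `ols` and `short_vs_long`, the ranges of training hours, the noise level, and the seed are all assumptions introduced for this example.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed (assumption)
n = 500                          # players per team (assumption)

def ols(y, *regressors):
    """Return OLS coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones_like(y), *regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0]

def short_vs_long(jv_range, var_range):
    """Simulate both teams and return the pieces of the OVB formula."""
    hours = np.concatenate([rng.uniform(*jv_range, n),
                            rng.uniform(*var_range, n)])
    varsity = np.concatenate([np.zeros(n), np.ones(n)])
    perf = 30 + 30 * varsity + 0.1 * hours + rng.normal(0, 1, 2 * n)

    b1_short = ols(perf, hours)[1]               # team membership omitted
    _, b1_long, b2 = ols(perf, hours, varsity)   # team membership included
    d1 = ols(varsity, hours)[1]                  # Varsity indicator on hours
    return b1_short, b1_long, b2, d1

# Hypothetical hour ranges: one team averages 60 hours, the other 40.
varsity_more = short_vs_long(jv_range=(20, 60), var_range=(40, 80))
jv_more = short_vs_long(jv_range=(40, 80), var_range=(20, 60))

for label, (b1_short, b1_long, b2, d1) in [("Varsity trains more", varsity_more),
                                           ("JV trains more", jv_more)]:
    print(f"{label}: short = {b1_short:.3f}, "
          f"long + b2*d1 = {b1_long + b2 * d1:.3f}, long = {b1_long:.3f}")
```

In both scenarios the short-regression slope matches the long slope plus \(\hat{\beta}_2 \times \hat{\delta}_1\) up to floating-point error. With these assumed ranges, the long slope should stay near 0.1 while the short slope is pushed well above it in the first scenario and well below it in the second.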
9.3 Class Imbalance Shifts the Intercept
Even when training hours are distributed identically across groups, sample composition causes problems. With the slope estimated correctly, the OLS intercept is approximately a weighted average of the group intercepts, with weights equal to each group’s share of the sample:
\[ \hat{\beta}_0 \approx \frac{n_{\text{JV}}}{n} \cdot \beta_{0,\text{JV}} + \frac{n_{\text{Var}}}{n} \cdot \beta_{0,\text{Var}} \]
With 80% JV, the intercept is pulled toward 30. With 80% Varsity, it’s pulled toward 60. The slope might be fine, but the single intercept is wrong for every subgroup. And the researcher may not control the sample composition.
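A quick simulation makes the pull of sample composition visible. This is a sketch under stated assumptions: the function `pooled_intercept`, the shared schedule of 200 hour values, the noise level, and the seed are hypothetical choices for illustration. Each team reuses the same hour values (repeated to reach the desired team size), so hours are distributed identically across teams.

```python
import numpy as np

rng = np.random.default_rng(2)            # arbitrary seed (assumption)
base_hours = rng.uniform(0, 100, 200)     # one shared training schedule

def pooled_intercept(jv_copies, var_copies):
    """Pooled OLS intercept and slope for a given team mix."""
    hours_jv = np.tile(base_hours, jv_copies)
    hours_var = np.tile(base_hours, var_copies)
    hours = np.concatenate([hours_jv, hours_var])
    varsity = np.concatenate([np.zeros(hours_jv.size), np.ones(hours_var.size)])
    perf = 30 + 30 * varsity + 0.1 * hours + rng.normal(0, 1, hours.size)

    X = np.column_stack([np.ones_like(hours), hours])
    intercept, slope = np.linalg.lstsq(X, perf, rcond=None)[0]
    return intercept, slope

mostly_jv = pooled_intercept(jv_copies=4, var_copies=1)    # 80% JV
mostly_var = pooled_intercept(jv_copies=1, var_copies=4)   # 80% Varsity

print(f"80% JV:      intercept = {mostly_jv[0]:.1f}, slope = {mostly_jv[1]:.3f}")
print(f"80% Varsity: intercept = {mostly_var[0]:.1f}, slope = {mostly_var[1]:.3f}")
```

The weighted-average formula predicts 0.8 × 30 + 0.2 × 60 = 36 for the first mix and 0.2 × 30 + 0.8 × 60 = 54 for the second. The simulated intercepts should fall close to those values while the slope stays near 0.1 in both cases.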
9.4 The Fix: Let Each Group Have Its Own Intercept
Every problem above stems from forcing one baseline on data with two. The fix is simple: let each group \(j\) have its own intercept:
\[ y_{ij} = \alpha_j + \beta \, x_{ij} + \varepsilon_{ij} \]
The subscript \(j\) on the intercept does all the work. Instead of one number shared by everyone, \(\alpha_j\) is a different number for each group. This is the core idea behind fixed effects.
9.5 Estimation: The Within Estimator
There are two equivalent ways to estimate a fixed effects model. The brute-force approach is to include a dummy variable for each group (Least Squares Dummy Variables, or LSDV). That works, but with 500 groups you’d need 500 dummies (or 499 plus a constant).
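The dummy-variable approach can be sketched in a few lines of NumPy. Everything specific here is an assumption for illustration: 50 groups of 20 observations (rather than 500 groups), normally distributed group intercepts, and a regressor built to be correlated with those intercepts so that pooled OLS is biased.

```python
import numpy as np

rng = np.random.default_rng(3)            # arbitrary seed (assumption)
n_groups, n_per = 50, 20                  # illustrative sizes (assumption)

group = np.repeat(np.arange(n_groups), n_per)
alpha = rng.normal(50, 10, n_groups)      # true group intercepts

# Groups with higher baselines also have higher x, which biases pooled OLS.
x = alpha[group] + rng.normal(0, 5, group.size)
y = alpha[group] + 0.1 * x + rng.normal(0, 1, group.size)

# LSDV: one dummy column per group and no separate constant.
dummies = (group[:, None] == np.arange(n_groups)).astype(float)
X_lsdv = np.column_stack([x, dummies])
coef = np.linalg.lstsq(X_lsdv, y, rcond=None)[0]
beta_lsdv, alpha_hat = coef[0], coef[1:]

# Pooled OLS with a single intercept, for comparison.
X_pooled = np.column_stack([np.ones_like(x), x])
beta_pooled = np.linalg.lstsq(X_pooled, y, rcond=None)[0][1]

print(f"pooled slope: {beta_pooled:.3f}")
print(f"LSDV slope:   {beta_lsdv:.3f}")
```

With this design the LSDV slope should be close to the true 0.1, while the pooled slope is far larger because it attributes the group baselines to `x`. The cost is visible in `X_lsdv`: its width grows by one column per group.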
The elegant approach is demeaning. Start with the model:
\[ y_{ij} = \alpha_j + \beta \, x_{ij} + \varepsilon_{ij} \]
Take the group mean of both sides:
\[ \bar{y}_j = \alpha_j + \beta \, \bar{x}_j + \bar{\varepsilon}_j \]
Notice \(\alpha_j\) survives averaging because it doesn’t vary within the group. Now subtract:
\[ y_{ij} - \bar{y}_j = \beta(x_{ij} - \bar{x}_j) + (\varepsilon_{ij} - \bar{\varepsilon}_j) \]
The fixed effect cancels. You’re left with a simple regression on demeaned data — no intercept, no \(\alpha_j\). Just run OLS on the within-group deviations.
This is why it’s called the within estimator: it uses only variation within groups. All between-group differences are absorbed by the fixed effects. The between-group variation — which is exactly where the omitted variable bias lives — is gone.
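The demeaning steps above can be sketched as follows. The helper `demean`, the group sizes, the data-generating values, and the seed are assumptions made for this example. The dummy-variable regression is included only to check that the two approaches return the same slope.

```python
import numpy as np

rng = np.random.default_rng(4)            # arbitrary seed (assumption)
n_groups, n_per = 50, 20                  # illustrative sizes (assumption)

group = np.repeat(np.arange(n_groups), n_per)
alpha = rng.normal(50, 10, n_groups)      # true group intercepts
x = alpha[group] + rng.normal(0, 5, group.size)
y = alpha[group] + 0.1 * x + rng.normal(0, 1, group.size)

def demean(values, groups):
    """Subtract each observation's group mean."""
    group_means = np.bincount(groups, weights=values) / np.bincount(groups)
    return values - group_means[groups]

# Within estimator: OLS without an intercept on the demeaned data.
x_within = demean(x, group)
y_within = demean(y, group)
beta_within = (x_within @ y_within) / (x_within @ x_within)

# Dummy-variable regression on the raw data, for comparison.
dummies = (group[:, None] == np.arange(n_groups)).astype(float)
beta_lsdv = np.linalg.lstsq(np.column_stack([x, dummies]), y, rcond=None)[0][0]

print(f"within slope: {beta_within:.6f}")
print(f"LSDV slope:   {beta_lsdv:.6f}")
```

The two slopes agree up to floating-point error, and no dummy columns are needed for the within estimate. One caveat if you compute standard errors by hand from the demeaned regression: the degrees of freedom must still account for the estimated group means.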
9.6 What Fixed Effects Can’t Do
The power of FE is that it controls for every unobservable that is constant within a group (in panel data, every time-invariant characteristic of a unit). But that power comes with a cost: anything that doesn’t vary within a group gets wiped out by the demeaning. If you want to know whether teaching hospitals have faster recovery, or whether Varsity athletes have a higher ceiling, FE can’t help. Those variables are constant within each group, so they vanish when you subtract the group mean, and their effect cannot be separated from \(\alpha_j\).
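A tiny example shows the wipe-out directly. The six observations and the helper `demean` below are hypothetical, chosen only to keep the arithmetic easy to follow; the groups are the two teams.

```python
import numpy as np

# Two groups (0 = JV, 1 = Varsity) with three hypothetical players each.
group = np.repeat(np.arange(2), 3)
varsity = group.astype(float)             # constant within each group
hours = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])

def demean(values, groups):
    """Subtract each observation's group mean."""
    group_means = np.bincount(groups, weights=values) / np.bincount(groups)
    return values - group_means[groups]

varsity_within = demean(varsity, group)
hours_within = demean(hours, group)

print(varsity_within)   # every entry is zero
print(hours_within)     # within-group deviations remain
```

After demeaning, the Varsity indicator is a column of zeros, so there is no variation left from which to estimate its coefficient. Training hours still vary within each team, which is why their slope remains estimable.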
This is the motivation for random effects, which we’ll see next.