2  Heteroskedasticity

Non-Constant Error Variance and Robust Inference

Categories: Heteroskedasticity, Inference, Cross-Section
Author

Jake Anderson

Published

March 3, 2026

Modified

March 4, 2026

This chapter covers what happens when the variance of our errors is not constant — a violation called heteroskedasticity — and what we can do about it.

2.1 Motivation

Imagine you’re studying how food expenditure changes with income. For low-income households, spending is tightly clustered — there’s only so much variation when your budget is small. But for high-income households, some people eat out every night while others save aggressively. The spread of food expenditure grows with income.

This is heteroskedasticity: the variance of the error term changes across observations. In our regression framework:

\[ y_i = \beta_0 + \beta_1 x_i + e_i, \quad \text{Var}(e_i) = \sigma_i^2 \]

When \(\sigma_i^2\) is not the same for all \(i\), we have a problem.

2.2 What Goes Wrong?

Here’s the good news and the bad news:

  • Good news: OLS is still unbiased and consistent. The coefficients themselves are fine.
  • Bad news: The usual standard errors are wrong. This means:
    • Confidence intervals have incorrect coverage
    • t-tests reject too often (or too rarely)
    • F-tests are unreliable

In short: your coefficient estimates are okay, but everything you say about them (significance, confidence) could be completely off.

Warning: Why This Is Sneaky

Heteroskedasticity doesn’t show up in your coefficient table — the estimates look normal. It only reveals itself when you look at the residuals or run a formal test. This is why diagnostic checks are essential.

2.3 Detecting Heteroskedasticity

2.3.1 Visual Inspection

The simplest check is plotting residuals against fitted values. Look for:

  • Funnel shape: Spread increases (or decreases) with \(\hat{y}\)
  • Systematic patterns: Any non-random structure in the spread

A flat, random cloud of residuals = good. A funnel or fan = heteroskedasticity.
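The funnel check can be made concrete with a quick simulation. The sketch below (Python for illustration; the chapter's own examples use R, and all names here are hypothetical) generates data whose error spread grows with \(x\), fits OLS, and compares the residual spread in the lower and upper halves of the fitted values:

```python
# Illustrative funnel check: simulate heteroskedastic data, fit OLS,
# and compare residual spread across halves of the fitted values.
import numpy as np

rng = np.random.default_rng(42)
n = 500
x = rng.uniform(1, 10, n)
e = rng.normal(0, 0.5 * x)            # error SD grows with x: heteroskedastic
y = 2.0 + 3.0 * x + e

# OLS fit via least squares
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
fitted = X @ beta_hat

# Split residuals by fitted value: spread should be larger in the top half
order = np.argsort(fitted)
low_sd = resid[order[: n // 2]].std()
high_sd = resid[order[n // 2 :]].std()
print(low_sd, high_sd)                # high_sd should clearly exceed low_sd
```

A residual-vs-fitted scatter of these data would show the classic fan shape; the two standard deviations quantify it.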

2.3.2 Breusch-Pagan Test

The Breusch-Pagan (BP) test formalizes the visual check. The idea: if the variance depends on the regressors, then the squared residuals should be predictable from \(x\).

Steps:

  1. Estimate the model by OLS and get residuals \(\hat{e}_i\)
  2. Regress \(\hat{e}_i^2\) on the regressors: \(\hat{e}_i^2 = \alpha_0 + \alpha_1 x_{i1} + \cdots + \alpha_k x_{ik} + v_i\)
  3. Compute \(\chi^2 = N \times R^2\) from step 2
  4. Compare to \(\chi^2_{(k)}\) — reject if test stat exceeds critical value

\[ H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_k = 0 \quad \text{(homoskedasticity)} \tag{2.1}\]

\[ H_1: \text{at least one } \alpha_j \neq 0 \quad \text{(heteroskedasticity)} \]

Worked example: with \(N = 88\) observations, two regressors, and \(R^2 = 0.0847\) from the auxiliary regression:

\(\chi^2 = N \times R^2 = 88 \times 0.0847 = 7.45\)

Compare to \(\chi^2_{(2)}\) at 5%: critical value is 5.99. Since \(7.45 > 5.99\), we reject homoskedasticity.
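The four steps above can be sketched by hand in Python (the chapter works in R; this simulated example is illustrative only):

```python
# Hand-rolled Breusch-Pagan test on simulated heteroskedastic data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1, 10, n)
y = 2.0 + 3.0 * x + rng.normal(0, x)   # error SD rises with x

# Step 1: OLS residuals
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
e_hat = y - X @ beta_hat

# Step 2: auxiliary regression of squared residuals on the regressors
e2 = e_hat ** 2
alpha_hat, *_ = np.linalg.lstsq(X, e2, rcond=None)
r2 = 1 - ((e2 - X @ alpha_hat) ** 2).sum() / ((e2 - e2.mean()) ** 2).sum()

# Steps 3-4: chi-square statistic with k = 1 regressor
bp_stat = n * r2
p_value = stats.chi2.sf(bp_stat, df=1)
print(bp_stat, p_value)               # small p-value: reject homoskedasticity
```

In practice one would call R's `bptest()` (or `statsmodels` in Python) rather than hand-rolling the auxiliary regression, but the mechanics are exactly the four steps listed above.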

2.3.3 White Test

The White test is more general — it doesn’t assume a specific form for the heteroskedasticity. Instead of regressing \(\hat{e}_i^2\) on just the regressors, it includes their squares and cross-products:

\[ \hat{e}_i^2 = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_1^2 + \alpha_4 x_2^2 + \alpha_5 x_1 x_2 + v_i \]

Same test statistic: \(\chi^2 = N \times R^2\). The White test can detect more general forms of heteroskedasticity, but uses more degrees of freedom.
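With two regressors, the White auxiliary regression above has \(k = 5\) slope terms. A hedged Python sketch on simulated data (the chapter is R-based; all names here are illustrative):

```python
# White test: auxiliary regression with levels, squares, and cross-product.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 300
x1 = rng.uniform(1, 5, n)
x2 = rng.uniform(1, 5, n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(0, x1 * x2)  # variance depends on both

X = np.column_stack([np.ones(n), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
e2 = (y - X @ beta_hat) ** 2

# Auxiliary regressors: x1, x2, their squares, and the cross-product (k = 5)
Z = np.column_stack([np.ones(n), x1, x2, x1**2, x2**2, x1 * x2])
gamma_hat, *_ = np.linalg.lstsq(Z, e2, rcond=None)
r2 = 1 - ((e2 - Z @ gamma_hat) ** 2).sum() / ((e2 - e2.mean()) ** 2).sum()

white_stat = n * r2
p_value = stats.chi2.sf(white_stat, df=5)
print(white_stat, p_value)
```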

2.3.4 Non-Constant Variance Test (ncvTest)

R’s ncvTest() tests whether \(\sigma_i^2 = \sigma^2 f(\hat{y}_i)\) — that is, whether the variance is some continuous function of the fitted values. This is useful when you see a funnel shape but aren’t sure which regressor is causing it.

Note: BP vs. ncvTest vs. White
  • BP test: Tests if variance depends on specific regressors \(x_j\)
  • ncvTest: Tests if variance depends on fitted values \(\hat{y}\)
  • White test: Tests using squares and cross-products — most general

Each has its place. Use ncvTest when you see a funnel in residuals vs. fitted values. Use BP when you suspect a specific variable. Use White for a general check.

2.3.5 Goldfeld-Quandt Test

This test splits the data into two groups (e.g., low income vs. high income) and compares the error variance in each group:

  1. Order observations by the suspected variable
  2. Split at the median (or another point)
  3. Estimate separate regressions for each group
  4. Compute \(F = \hat{\sigma}^2_{\text{high}} / \hat{\sigma}^2_{\text{low}}\), the ratio of the estimated error variances (each group's \(SSE\) divided by its degrees of freedom; with equal-sized groups this reduces to \(SSE_{\text{high}} / SSE_{\text{low}}\))
  5. If \(F\) is large, reject homoskedasticity
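The five steps can be sketched on simulated data as follows (Python for illustration; the chapter itself works in R):

```python
# Goldfeld-Quandt test: order, split at the median, fit each half, compare variances.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 200
x = np.sort(rng.uniform(1, 10, n))        # step 1: order by the suspect variable
y = 2.0 + 3.0 * x + rng.normal(0, x)      # error SD grows with x

def sse(xg, yg):
    """Fit OLS on one group and return its sum of squared errors."""
    Xg = np.column_stack([np.ones(len(xg)), xg])
    b, *_ = np.linalg.lstsq(Xg, yg, rcond=None)
    return ((yg - Xg @ b) ** 2).sum()

# Steps 2-3: split at the median and estimate each half separately
half = n // 2
sse_low, sse_high = sse(x[:half], y[:half]), sse(x[half:], y[half:])

# Steps 4-5: variance ratio; each half has half - 2 residual degrees of freedom
df = half - 2
F = (sse_high / df) / (sse_low / df)
p_value = stats.f.sf(F, df, df)
print(F, p_value)                         # large F, small p: reject homoskedasticity
```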

2.4 Fixing Heteroskedasticity

2.4.1 Option 1: Robust Standard Errors

The simplest fix: keep OLS but use heteroskedasticity-consistent (HC) standard errors, also called White or sandwich standard errors.

These give correct inference without changing the coefficients:

  • HC0: White’s original estimator
  • HC1: HC0 with finite-sample correction (multiply by \(N/(N-K)\)) — this is the most commonly used version

In R (with the lmtest and car packages loaded): coeftest(model, vcov. = hccm(model, type = "hc1"))

Can robust SEs be smaller than the OLS SEs? Yes — robust SEs can go either direction. If the variance is higher where \(x\) has less leverage, robust SEs may actually shrink. The key: robust SEs are correct under heteroskedasticity, while the OLS SEs are wrong — and wrong can mean too big or too small.
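The HC0/HC1 sandwich can be computed by hand in a few lines. A minimal sketch in Python on simulated data (the chapter uses R's hccm(); in Python one would normally rely on statsmodels):

```python
# HC1 sandwich standard errors, computed by hand:
# (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}, scaled by N/(N-K).
import numpy as np

rng = np.random.default_rng(3)
n, k = 200, 2                              # k = columns of X, incl. intercept
x = rng.uniform(1, 10, n)
y = 2.0 + 3.0 * x + rng.normal(0, x)

X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
e_hat = y - X @ beta_hat

# HC0: White's original sandwich estimator
meat = X.T @ (e_hat[:, None] ** 2 * X)
vcov_hc0 = XtX_inv @ meat @ XtX_inv

# HC1: finite-sample correction N/(N - K)
vcov_hc1 = vcov_hc0 * n / (n - k)
robust_se = np.sqrt(np.diag(vcov_hc1))

# Conventional OLS SEs for comparison
sigma2 = (e_hat ** 2).sum() / (n - k)
ols_se = np.sqrt(np.diag(sigma2 * XtX_inv))
print(ols_se, robust_se)
```

Note that the coefficients themselves come from plain OLS; only the variance matrix, and hence the SEs, changes.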

2.4.2 Option 2: Weighted Least Squares (WLS)

If you know (or can estimate) the form of the heteroskedasticity, WLS is more efficient than OLS with robust SEs.

Idea: Give less weight to observations with high variance and more weight to observations with low variance.

If \(\text{Var}(e_i) = \sigma^2 x_i^2\), then \(\sigma_i = \sigma x_i\). Since \(\sigma\) is an unknown constant, it is enough to divide through by \(x_i\), which corresponds to the weight:

\[ w_i = \frac{1}{x_i} \]

The transformed model divides everything by \(x_i\):

\[ \frac{y_i}{x_i} = \beta_0 \frac{1}{x_i} + \beta_1 + \frac{e_i}{x_i} \]

The transformed error \(e_i / x_i\) now has constant variance \(\sigma^2\).
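Because \(\sigma_i = \sigma x_i\) with \(\sigma\) an unknown constant, dividing by \(x_i\) (equivalently, by \(\sigma_i\) up to that constant) gives the same coefficient estimates. A minimal sketch in Python on simulated data (the chapter itself uses R's lm(..., weights = ...)):

```python
# WLS as OLS on the transformed variables when Var(e_i) = sigma^2 x_i^2.
import numpy as np

rng = np.random.default_rng(4)
n = 500
x = rng.uniform(1, 10, n)
y = 2.0 + 3.0 * x + rng.normal(0, 2.0 * x)    # sigma = 2, so Var = sigma^2 x^2

# Transformed regression: y/x on 1/x (the beta0 term) and 1 (the beta1 term)
X_t = np.column_stack([1.0 / x, np.ones(n)])
y_t = y / x
beta_hat, *_ = np.linalg.lstsq(X_t, y_t, rcond=None)
b0_wls, b1_wls = beta_hat
print(b0_wls, b1_wls)                         # should be near the true 2 and 3
```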

WarningR’s weights Argument

In R’s lm() function, weights = w means the fit minimizes \(\sum_i w_i \hat{e}_i^2\), so each observation is effectively multiplied by \(\sqrt{w_i}\). If you set weights = 1/income, each observation is divided by \(\sqrt{\text{income}}\), not by income. To divide by income — appropriate when \(\sigma_i \propto \text{income}\) — use weights = 1/income^2. This is a common source of confusion on exams.

2.4.3 Option 3: Feasible GLS (FGLS)

When you don’t know the variance form, you can estimate it:

  1. Estimate the model by OLS, get residuals \(\hat{e}_i\)
  2. Model the variance: regress \(\ln(\hat{e}_i^2)\) on \(\ln(x_i)\) to estimate \(\gamma\) in \(\sigma_i^2 = \sigma^2 x_i^\gamma\)
  3. Compute estimated weights \(\hat{w}_i = 1/\hat{\sigma}_i\)
  4. Run WLS with the estimated weights

This two-step procedure is FGLS — “feasible” because we estimated the weights rather than knowing them.
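The two-step procedure can be sketched as follows (Python on simulated data, for illustration; the chapter presents FGLS in R):

```python
# FGLS: estimate gamma in sigma_i^2 = sigma^2 x_i^gamma, then run WLS.
import numpy as np

rng = np.random.default_rng(5)
n = 1000
x = rng.uniform(1, 10, n)
y = 2.0 + 3.0 * x + rng.normal(0, x)       # true variance sigma^2 x^gamma, gamma = 2

# Step 1: OLS residuals
X = np.column_stack([np.ones(n), x])
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
e_hat = y - X @ beta_ols

# Step 2: regress ln(e^2) on ln(x); the slope estimates gamma
Z = np.column_stack([np.ones(n), np.log(x)])
coef, *_ = np.linalg.lstsq(Z, np.log(e_hat ** 2), rcond=None)
gamma_hat = coef[1]

# Steps 3-4: weight by 1/sigma_i_hat (proportional to x^(gamma/2)) and rerun
sigma_i = x ** (gamma_hat / 2)
beta_fgls, *_ = np.linalg.lstsq(X / sigma_i[:, None], y / sigma_i, rcond=None)
print(gamma_hat, beta_fgls)
```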

3 HGL Textbook Exercise 8.1: Food Expenditure

A researcher estimates: \[\widehat{\text{food\_exp}} = 83.42 + 10.21 \cdot \text{income}\]

With two sets of standard errors:

             Estimate   OLS SE   White (HC1) SE
Intercept       83.42    43.41            27.46
income          10.21     2.09             1.81

(a) Does heteroskedasticity affect the coefficient estimates?

No. The coefficient estimates (83.42 and 10.21) are identical in both columns. OLS is still unbiased under heteroskedasticity — only the standard errors change.

(b) Compute the 95% CI for \(\beta_1\) using White SEs. Use \(t_c = 2.024\).

\[CI = 10.21 \pm 2.024 \times 1.81 = 10.21 \pm 3.66 = [6.55, 13.87]\]

Common mistake: Using the OLS SE (2.09) instead of the White SE (1.81) gives \(10.21 \pm 2.024 \times 2.09 = [5.98, 14.44]\) — a wider interval that overstates our uncertainty.

(c) A Goldfeld-Quandt test yields \(F = 3.82\) with \(p = 0.003\). Interpret.

The GQ test splits the sample by income and compares error variances. With \(p = 0.003 < 0.05\), we reject homoskedasticity. The variance of food expenditure is significantly larger for high-income households — consistent with the funnel shape we’d expect.


Tip: What’s next?

Heteroskedasticity in panel data leads to cluster-robust standard errors. For time-dependent error structures, see Time Series.