16  Hypothesis Testing in MR

Same Logic, More Coefficients, New Questions

Keywords: Hypothesis Testing, Multiple Regression, Linear Combinations

Author: Jake Anderson
Published: March 21, 2026
Modified: March 26, 2026

Abstract

Hypothesis testing in multiple regression uses the same \(t\)-test framework as simple regression, with \(N - K\) degrees of freedom. The new capability is testing linear combinations of coefficients, which arise naturally when the quantity of interest (like a marginal effect) involves multiple parameters.

16.1 Individual \(t\)-Tests: What Changes from SLR

The test statistic for a single coefficient is identical to simple regression:

Definition 16.1 (\(t\)-Test in Multiple Regression) \[ t = \frac{b_k - c}{\text{se}(b_k)} \sim t_{(N-K)} \quad \text{under } H_0: \beta_k = c \tag{16.1}\]

The only mechanical change is degrees of freedom: \(N - 2\) becomes \(N - K\), where \(K\) is the total number of parameters (intercept plus all slopes). For the most common test of significance (\(H_0: \beta_k = 0\)), the statistic simplifies to \(t = b_k / \text{se}(b_k)\), and rejection rules follow the usual one-tail or two-tail logic.

Degrees of freedom: Each estimated parameter costs one degree of freedom. With \(K\) parameters, the residual degrees of freedom are \(N - K\). This reduces the power of each individual \(t\)-test.
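As a quick numerical sketch of Definition 16.1 (the coefficient and standard error here are hypothetical, and the \(t_{(N-K)}\) distribution is approximated by the standard normal, which is reasonable when \(N - K\) is large):

```python
# Sketch of the t-test in Definition 16.1. The numbers are hypothetical, and
# NormalDist approximates t_(N-K), which is reasonable when N - K is large.
from statistics import NormalDist

def t_stat(b_k, se_bk, c=0.0):
    # t = (b_k - c) / se(b_k)
    return (b_k - c) / se_bk

def two_sided_p_large_df(t):
    # large-df approximation: t_(N-K) is close to N(0, 1)
    return 2 * (1 - NormalDist().cdf(abs(t)))

t = t_stat(0.55, 0.11)        # hypothetical b_k = 0.55, se = 0.11
p = two_sided_p_large_df(t)   # t = 5.0, so p is far below alpha = 0.05
```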

| Test type | Rejection rule | When to use |
|---|---|---|
| Two-sided: \(H_1: \beta_k \neq c\) | Reject if \(\lvert t \rvert > t_{c}\) | No directional prediction |
| Right-sided: \(H_1: \beta_k > c\) | Reject if \(t > t_c\) | Theory predicts positive effect |
| Left-sided: \(H_1: \beta_k < c\) | Reject if \(t < -t_c\) | Theory predicts negative effect |

| \(\alpha\) | Two-sided \(t_c\) (large \(N\)) | One-sided \(t_c\) (large \(N\)) |
|---|---|---|
| 0.10 | 1.645 | 1.282 |
| 0.05 | 1.960 | 1.645 |
| 0.01 | 2.576 | 2.326 |
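The large-\(N\) critical values in the table come straight from the standard normal quantile function; a stdlib-only sketch:

```python
# Reproduce the large-N critical values from the standard normal quantile
# function (exact t quantiles would additionally need the degrees of freedom).
from statistics import NormalDist

def t_crit_large_n(alpha, two_sided=True):
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)

# alpha = 0.05: two-sided gives about 1.960, one-sided about 1.645
```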

16.2 Marginal Effects with Polynomial Terms

When the model includes a quadratic term, the individual coefficient on \(x\) is no longer the marginal effect. Consider the wage model:

\[ \text{WAGE}_i = \beta_1 + \beta_2 \text{EDUC}_i + \beta_3 \text{EXPER}_i + \beta_4 \text{EXPER}_i^2 + e_i \]

The marginal effect of experience is the derivative:

\[ \frac{\partial\, E(\text{WAGE})}{\partial\, \text{EXPER}} = \beta_3 + 2\beta_4 \cdot \text{EXPER} \tag{16.2}\]

\(\beta_3\) alone is the marginal effect only at \(\text{EXPER} = 0\), which is rarely informative. Always evaluate the marginal effect at a substantively meaningful experience level.

The marginal effect depends on the level of experience. With estimates \(b_3 = 0.55\) and \(b_4 = -0.0063\), the marginal effect at 10 years is \(0.55 + 2(-0.0063)(10) = 0.424\) dollars per hour. At 40 years it falls to \(0.046\), which is economically negligible.

\(\implies\) You cannot report a single number for “the effect of experience” without specifying at what level of experience you are evaluating it.
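A one-line sketch of Equation 16.2 with the chapter's estimates:

```python
# Marginal effect of experience from Equation 16.2, using the estimates
# b3 = 0.55 and b4 = -0.0063 reported in the text.
def marginal_effect(exper, b3=0.55, b4=-0.0063):
    # dE(WAGE)/dEXPER = b3 + 2 * b4 * EXPER
    return b3 + 2 * b4 * exper

# marginal_effect(10) is about 0.424 dollars/hour; marginal_effect(40) about 0.046
```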

Caution: A common exam mistake

Students report \(b_3 = 0.55\) as “the effect of experience.” This is wrong when a quadratic term is present. The coefficient \(b_3\) is the marginal effect at \(\text{EXPER} = 0\) only. At any other experience level, you must use Equation 16.2.

16.3 Testing Linear Combinations of Coefficients

Testing whether the marginal effect at a specific experience level differs from zero requires testing a linear combination \(\theta = \beta_3 + 2\beta_4 \cdot \text{EXPER}_0\). The estimated combination is \(\hat{\theta} = b_3 + 2b_4 \cdot \text{EXPER}_0\), and its variance uses the covariance matrix of the coefficient estimates:

Theorem 16.1 (Variance of a Linear Combination) \[ \widehat{\text{Var}}(\hat{\theta}) = c_3^2 \widehat{\text{Var}}(b_3) + c_4^2 \widehat{\text{Var}}(b_4) + 2c_3 c_4 \widehat{\text{Cov}}(b_3, b_4) \tag{16.3}\]

where \(c_3 = 1\) and \(c_4 = 2 \cdot \text{EXPER}_0\).

The covariance term \(\widehat{\text{Cov}}(b_3, b_4)\) comes from the estimated covariance matrix that software computes for every regression. Omitting this term is a common exam mistake: it is almost never zero in practice.

The test statistic is \(t = \hat{\theta} / \text{se}(\hat{\theta}) \sim t_{(N-K)}\), the same structure as any individual \(t\)-test. The only new ingredient is computing the standard error from the covariance matrix.

Never omit the covariance. For quadratic models, \(b_3\) and \(b_4\) are almost always negatively correlated, so \(\widehat{\text{Cov}}(b_3, b_4) < 0\). Dropping this term inflates the variance estimate.
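A sketch of Theorem 16.1 with the chapter's covariance-matrix entries, evaluated at \(\text{EXPER}_0 = 20\), showing how omitting the covariance term inflates the variance:

```python
# Variance of theta-hat = b3 + 2*b4*EXPER0 (Theorem 16.1), using the
# covariance-matrix entries reported in the chapter, evaluated at EXPER0 = 20.
var_b3, var_b4, cov_b3b4 = 0.0121, 0.0000040, -0.00022
exper0 = 20
c3, c4 = 1, 2 * exper0      # weights: theta = c3*b3 + c4*b4

var_theta = c3**2 * var_b3 + c4**2 * var_b4 + 2 * c3 * c4 * cov_b3b4
var_wrong = c3**2 * var_b3 + c4**2 * var_b4   # common mistake: covariance dropped

# var_theta is about 0.0009, but var_wrong is about 0.0185 -- over 20 times larger
```

Because \(c_3\) and \(c_4\) are both positive and the covariance is negative, dropping the covariance here inflates the standard error by a factor of about 4.5.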

flowchart TD
    A["Define θ = c₃β₃ + c₄β₄"] --> B["Compute θ̂ = c₃b₃ + c₄b₄"]
    B --> C["Compute Var(θ̂) using<br/>Var(b₃), Var(b₄), Cov(b₃,b₄)"]
    C --> D["se(θ̂) = √Var(θ̂)"]
    D --> E["t = θ̂ / se(θ̂)"]
    E --> F{"Compare to t(N-K)"}
    F -->|"|t| > t_c"| G["Reject H₀"]
    F -->|"|t| ≤ t_c"| H["Fail to reject H₀"]

    style A fill:#1E5A96,color:#fff
    style G fill:#C41E3A,color:#fff
    style H fill:#2E8B57,color:#fff
Figure 16.1: Recipe for testing a linear combination of coefficients.

Interactive: Linear Combination Tester

Enter the coefficient estimates and covariance matrix entries. The widget computes the \(t\)-statistic for the marginal effect at your chosen experience level and shows whether the test rejects at 5%.

viewof b3_input = Inputs.range([0, 2], {value: 0.55, step: 0.01, label: "b₃ (EXPER coeff)"})
viewof b4_input = Inputs.range([-0.02, 0], {value: -0.0063, step: 0.0001, label: "b₄ (EXPER² coeff)"})
viewof exper0_input = Inputs.range([0, 50], {value: 20, step: 1, label: "Evaluate at EXPER ="})
viewof var_b3 = Inputs.range([0.001, 0.05], {value: 0.0121, step: 0.001, label: "Var(b₃)"})
viewof var_b4 = Inputs.range([0.000001, 0.0001], {value: 0.000004, step: 0.000001, label: "Var(b₄)"})
viewof cov_b3b4 = Inputs.range([-0.001, 0.001], {value: -0.00022, step: 0.00001, label: "Cov(b₃, b₄)"})
viewof nk_input = Inputs.range([30, 2000], {value: 996, step: 1, label: "N - K"})

lc_result = {
  const c3 = 1, c4 = 2 * exper0_input;
  const thetaHat = b3_input + 2 * b4_input * exper0_input;
  const varTheta = c3**2 * var_b3 + c4**2 * var_b4 + 2 * c3 * c4 * cov_b3b4;
  const seTheta = Math.sqrt(Math.max(varTheta, 0.0000001));
  const tStat = thetaHat / seTheta;
  const z = 1.96; // large-df (normal) 5% two-sided critical value
  const tc = z + (z**3 + z) / (4 * nk_input); // first-order df correction so the N - K slider matters
  const reject = Math.abs(tStat) > tc;

  return {thetaHat, varTheta, seTheta, tStat, reject, tc};
}

Plot.plot({
  width: 600, height: 200,
  x: {label: "t-statistic", domain: [-6, 6]},
  y: {domain: [0, 0.45]},
  marks: [
    Plot.areaY(
      d3.range(-6, 6, 0.05).map(t => ({t, y: Math.exp(-t*t/2) / Math.sqrt(2 * Math.PI)})),
      {x: "t", y: "y", fill: "#eee"}
    ),
    Plot.areaY(
      d3.range(-1.96, 1.96, 0.05).map(t => ({t, y: Math.exp(-t*t/2) / Math.sqrt(2 * Math.PI)})),
      {x: "t", y: "y", fill: "#ddd"}
    ),
    Plot.ruleX([lc_result.tStat], {stroke: lc_result.reject ? "#C41E3A" : "#2E8B57", strokeWidth: 3}),
    Plot.ruleX([-1.96, 1.96], {stroke: "#333", strokeDasharray: "4,4"})
  ]
})
html`<div style="padding:1em; background:${lc_result.reject ? '#fde8e8' : '#e8f5e8'}; border-radius:8px; margin-top:0.5em">
  <strong>Marginal effect at EXPER = ${exper0_input}:</strong> θ̂ = ${lc_result.thetaHat.toFixed(4)}<br/>
  <strong>se(θ̂):</strong> ${lc_result.seTheta.toFixed(4)}<br/>
  <strong>t-statistic:</strong> ${lc_result.tStat.toFixed(3)}<br/>
  <strong>Decision (α = 0.05):</strong> ${lc_result.reject ? "Reject H₀: marginal effect ≠ 0" : "Fail to reject H₀"}
</div>`
Figure 16.2: Linear combination tester. Adjust coefficients and covariance entries to see how the t-statistic for the marginal effect changes. The gray region is the non-rejection zone.

16.4 Individual Significance \(\neq\) Joint Significance

Suppose both EXPER and EXPER\(^2\) have insignificant individual \(t\)-statistics (\(|t| < 1.96\)). Should we drop them and conclude experience has no effect on wages?

No. The two variables are highly correlated (\(r > 0.95\) in most datasets), which inflates both standard errors. Each \(t\)-test asks: “Does this variable contribute, given that the other is already in the model?” When two regressors share most of their variation, neither looks significant alone, but together they capture the curvature of the wage-experience profile.

Warning: Individually insignificant does not mean jointly insignificant

When regressors are correlated, neither may look significant alone. The F-test evaluates them together. Never drop variables based on individual \(t\)-tests alone when theory says they belong.

The question “Does experience affect wages?” requires a joint test:

\[ H_0: \beta_3 = 0 \text{ and } \beta_4 = 0 \qquad H_1: \text{at least one} \neq 0 \]

This is a joint hypothesis, and the \(t\)-test cannot handle it. The appropriate tool is the F-test, which evaluates multiple restrictions simultaneously. For now, remember: individually insignificant does not mean jointly insignificant.

The \(F\)-test for joint significance asks: “Do these variables contribute anything as a group?” The individual \(t\)-test asks: “Does this variable contribute anything given the others?” These are different questions with potentially different answers.
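A simulation sketch of this phenomenon (hypothetical data, not the textbook's wage sample; the joint test uses the standard restricted-versus-unrestricted form \(F = [(SSE_R - SSE_U)/J] \,/\, [SSE_U/(N-K)]\)):

```python
# Simulation sketch (hypothetical data, not the textbook's wage sample):
# when EXPER and EXPER^2 are nearly collinear, individual t-statistics can be
# small while the joint F-test still rejects. F uses the standard
# restricted-vs-unrestricted form: F = [(SSE_R - SSE_U)/J] / [SSE_U/(N-K)].
import numpy as np

rng = np.random.default_rng(42)
n = 200
exper = rng.uniform(20, 30, n)          # narrow range -> severe collinearity
X = np.column_stack([np.ones(n), exper, exper**2])
y = 5 + 0.55 * exper - 0.0063 * exper**2 + rng.normal(0, 1.0, n)

# unrestricted OLS fit
b = np.linalg.lstsq(X, y, rcond=None)[0]
K = X.shape[1]
resid = y - X @ b
sse_u = resid @ resid
s2 = sse_u / (n - K)
cov_b = s2 * np.linalg.inv(X.T @ X)
t_stats = b / np.sqrt(np.diag(cov_b))   # individual t-tests

# restricted fit under H0: both experience coefficients are zero
sse_r = ((y - y.mean()) ** 2).sum()
J = 2
F = ((sse_r - sse_u) / J) / s2          # joint test of the two restrictions

corr = np.corrcoef(exper, exper**2)[0, 1]   # how collinear are the regressors?
```

With this design the two regressors correlate above 0.99 and the joint \(F\) lands far above the 5% critical value (about 3.0 at these degrees of freedom), however the individual \(t\) statistics come out for a given draw.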

16.5 Practice

Using the wage model estimates \(b_3 = 0.55\), \(b_4 = -0.0063\), \(\widehat{\text{Var}}(b_3) = 0.0121\), \(\widehat{\text{Var}}(b_4) = 0.0000040\), \(\widehat{\text{Cov}}(b_3, b_4) = -0.00022\), and \(N - K = 996\), test whether the marginal return to experience is significantly different from zero at 20 years of experience (\(\alpha = 0.05\)).

The marginal effect at 20 years is \(\hat{\theta} = 0.55 + 2(-0.0063)(20) = 0.55 - 0.252 = 0.298\). The weights are \(c_3 = 1\), \(c_4 = 40\).

\[ \widehat{\text{Var}}(\hat{\theta}) = (1)^2(0.0121) + (40)^2(0.0000040) + 2(1)(40)(-0.00022) = 0.0121 + 0.0064 - 0.0176 = 0.0009 \]

\[ \text{se}(\hat{\theta}) = \sqrt{0.0009} = 0.030 \qquad t = \frac{0.298}{0.030} = 9.93 \]

Since \(|9.93| > 1.962\), we reject \(H_0\). At 20 years, the marginal return to experience is significantly positive.
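The solution can be checked numerically (same figures as above; the 5% critical value uses the normal approximation 1.96, close to the exact \(t_{(996)}\) value of 1.962 used in the text):

```python
# Numerical check of the practice solution. The critical value uses the
# large-N normal approximation (1.96); the exact t_(996) value is about 1.962.
import math
from statistics import NormalDist

b3, b4 = 0.55, -0.0063
var_b3, var_b4, cov_b3b4 = 0.0121, 0.0000040, -0.00022
exper0, alpha = 20, 0.05

theta_hat = b3 + 2 * b4 * exper0                    # 0.298
c3, c4 = 1, 2 * exper0
var_theta = c3**2 * var_b3 + c4**2 * var_b4 + 2 * c3 * c4 * cov_b3b4
t = theta_hat / math.sqrt(var_theta)                # about 9.93
t_c = NormalDist().inv_cdf(1 - alpha / 2)           # about 1.96
reject = abs(t) > t_c                               # True: reject H0
```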
