13  Functional Forms

When a Straight Line Won’t Do

Categories: Functional Forms · Nonlinear Models · Simple Regression

Author: Jake Anderson

Published: March 21, 2026 · Modified: March 26, 2026

Abstract

A linear model imposes a constant marginal effect, which often conflicts with economic theory. This chapter develops four nonlinear specifications (log-linear, linear-log, log-log, quadratic), derives the coefficient interpretation for each, and explains how to choose a functional form using theory and residual diagnostics.

13.1 Motivation: Dollars or Percentages?

Does each additional year of schooling raise wages by the same dollar amount, or by the same percentage? These are very different claims. A constant dollar amount means the jump from 12 to 13 years of education adds the same as the jump from 16 to 17. A constant percentage means the dollar gain from 16 to 17 is larger because the base wage is higher. The choice of functional form determines which story the regression tells.

The linear model assumes constant marginal effects. Most economic relationships have diminishing (or changing) returns, so a linear specification is often the wrong default.

When we plot wages against education, the linear fit looks reasonable at first. But the residuals fan out as education increases: the variance grows with \(x\). Plotting \(\ln(\text{wage})\) against education straightens the relationship and stabilizes the residual variance. This motivates the log-linear specification.

13.2 The Log-Linear Model

Definition 13.1 (Log-Linear Model) \[ \ln(y) = \beta_1 + \beta_2 x + e \tag{13.1}\]

A one-unit increase in \(x\) changes \(y\) by approximately \(100\beta_2\)%.

The derivation follows from the log difference:

\[ \ln(y_1) - \ln(y_0) = \beta_2 \cdot \Delta x \implies 100 \times \ln\!\left(\frac{y_1}{y_0}\right) \approx 100\beta_2 \cdot \Delta x \]

The approximation \(100 \times \ln(y_1/y_0) \approx \%\Delta y\) is accurate when the percentage change is under about 20%. For larger changes the gap widens: the log difference understates a percentage increase (for example, \(100\ln(1.5) \approx 40.5\), not 50) and overstates the magnitude of a percentage decrease.
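A few lines of JavaScript make the accuracy claim concrete; the starting value of 100 is arbitrary, chosen only for readability:

```javascript
// Compare the log approximation 100*ln(y1/y0) with the exact percentage change.
function logApprox(y0, y1) {
  return 100 * Math.log(y1 / y0);
}
function exactPct(y0, y1) {
  return 100 * (y1 - y0) / y0;
}

for (const pct of [5, 10, 20, 50]) {
  const y0 = 100, y1 = y0 * (1 + pct / 100);
  console.log(`exact ${exactPct(y0, y1).toFixed(1)}%  log approx ${logApprox(y0, y1).toFixed(1)}%`);
}
// The gap is small below ~20% and widens quickly after that:
// a 50% increase maps to a log difference of ≈ 40.5.
```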

The log-linear model has a natural economic foundation. If education is an investment yielding a constant rate of return \(r\) per year, then \(\text{WAGE} = \text{WAGE}_0 (1 + r)^{\text{EDUC}}\). Taking logs gives \(\ln(\text{WAGE}) = \ln(\text{WAGE}_0) + \ln(1 + r) \cdot \text{EDUC}\), so the slope coefficient estimates \(\ln(1 + r) \approx r\) for small \(r\): the rate of return to schooling. This is the Mincer wage equation, the workhorse of labor economics.
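A quick check of this derivation, with an illustrative rate of return and base wage (not estimates):

```javascript
// If wages grow at a constant rate r per year of schooling,
// ln(wage) is exactly linear in education with slope ln(1 + r).
// r and wage0 below are illustrative values only.
const r = 0.08, wage0 = 10;
const wage = educ => wage0 * Math.pow(1 + r, educ);

// One extra year of schooling shifts ln(wage) by a constant amount:
const slope = Math.log(wage(13)) - Math.log(wage(12));
console.log(slope, Math.log(1 + r)); // identical up to rounding
// For small r, ln(1 + r) ≈ r, so the log-linear slope ≈ the rate of return.
```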

Tip: When to use log-linear

The log-linear form is natural when the dependent variable is always positive (wages, prices, expenditures), theory suggests constant percentage effects (returns to education, growth rates), or residuals from a linear model fan out with \(x\).

13.3 The Linear-Log Model

Definition 13.2 (Linear-Log Model) \[ y = \beta_1 + \beta_2 \ln(x) + e \tag{13.2}\]

A 1% increase in \(x\) changes \(y\) by approximately \(\beta_2 / 100\) units.

This model captures diminishing returns: doubling income from $20k to $40k has the same dollar effect on food spending as doubling from $40k to $80k. The marginal effect is \(dy/dx = \beta_2 / x\), which declines as \(x\) grows.
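The equal-dollar-effect-of-doubling property follows directly from \(\beta_2 \ln(2x) - \beta_2 \ln(x) = \beta_2 \ln 2\). A sketch with made-up coefficients:

```javascript
// In the linear-log model y = b1 + b2*ln(x), any doubling of x adds
// b2*ln(2) to y, whatever the starting level. Coefficients are illustrative.
const b1 = 50, b2 = 30;
const y = x => b1 + b2 * Math.log(x);

const jump1 = y(40000) - y(20000); // doubling income from $20k
const jump2 = y(80000) - y(40000); // doubling income from $40k
console.log(jump1.toFixed(4), jump2.toFixed(4)); // both equal b2*ln(2) ≈ 20.79
```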

13.4 The Log-Log Model

Definition 13.3 (Log-Log (Constant Elasticity) Model) \[ \ln(y) = \beta_1 + \beta_2 \ln(x) + e \tag{13.3}\]

A 1% increase in \(x\) changes \(y\) by approximately \(\beta_2\)%. The coefficient \(\beta_2\) is a constant elasticity.

Economists use this specification extensively because elasticities are the natural language of demand curves, production functions, and Engel curves. The Cobb-Douglas production function \(Y = AK^{\alpha}L^{\beta}\) takes exactly this form after logging both sides.

Cobb-Douglas: \(\ln Y = \ln A + \alpha \ln K + \beta \ln L\). The coefficients \(\alpha\) and \(\beta\) are output elasticities with respect to capital and labor.

The shape of the relationship in levels depends on \(\beta_2\): values between 0 and 1 give diminishing returns (increasing at a decreasing rate), values above 1 give increasing returns, and negative values give an inverse relationship like a demand curve. Both \(x > 0\) and \(y > 0\) are required for the log transformation.
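The constant-elasticity claim can be verified numerically: at any \(x\), the point elasticity \((dy/dx)(x/y)\) of the levels function \(y = e^{\beta_1} x^{\beta_2}\) equals \(\beta_2\). The coefficients below are illustrative:

```javascript
// Numerical check that the log-log model has constant elasticity b2.
const beta1 = 1.0, beta2 = 0.7; // illustrative values
const yLevel = x => Math.exp(beta1 + beta2 * Math.log(x)); // y = e^{b1} * x^{b2}

function elasticity(f, x) {
  // point elasticity (dy/dx)*(x/y), via a central difference
  const h = x * 1e-6;
  const dydx = (f(x + h) - f(x - h)) / (2 * h);
  return dydx * x / f(x);
}

console.log(elasticity(yLevel, 2), elasticity(yLevel, 50)); // ≈ 0.7 at both
```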

13.5 The Quadratic Model

\[ y = \beta_1 + \beta_2 x + \beta_3 x^2 + e \tag{13.4}\]

Some relationships are non-monotonic: the effect of \(x\) on \(y\) changes sign. The classic example is experience and wages. Early in a career, each year of experience raises wages; late in a career, the gains taper off and wages may eventually decline. Log models cannot capture this because they are always monotonic. The quadratic model allows the slope to change sign.

The marginal effect is \(dy/dx = \beta_2 + 2\beta_3 x\), which is a linear function of \(x\). Setting it to zero gives the turning point \(x^* = -\beta_2 / (2\beta_3)\). When \(\beta_3 < 0\), the model traces an inverted U (wages peak at \(x^*\) years of experience). When \(\beta_3 > 0\), it traces a U shape.

Turning point: \(x^* = -\beta_2 / (2\beta_3)\). Check that \(x^*\) falls within the range of your data; if it does not, the quadratic term captures curvature rather than a true peak.
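With illustrative wage-experience coefficients (\(\beta_3 < 0\), so an inverted U), the turning-point arithmetic looks like this:

```javascript
// Turning point and marginal effects for the quadratic model.
// Coefficients are illustrative, not estimates.
const b1q = 5, b2q = 1.2, b3q = -0.02;
const marginalEffect = x => b2q + 2 * b3q * x;
const turningPoint = -b2q / (2 * b3q);

console.log(turningPoint);       // ≈ 30: wages peak at 30 years of experience
console.log(marginalEffect(10)); // ≈ 0.8: wages still rising
console.log(marginalEffect(40)); // ≈ -0.4: past the peak, wages declining
```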

13.6 Coefficient Interpretation: Summary

Functional form interpretations.

| Model | Equation | Interpretation of \(\beta_2\) |
|---|---|---|
| Linear | \(y = \beta_1 + \beta_2 x\) | 1-unit \(\Delta x \implies \beta_2\)-unit \(\Delta y\) |
| Log-linear | \(\ln(y) = \beta_1 + \beta_2 x\) | 1-unit \(\Delta x \implies \approx 100\beta_2\%\ \Delta y\) |
| Linear-log | \(y = \beta_1 + \beta_2 \ln(x)\) | 1% \(\Delta x \implies \approx \beta_2/100\)-unit \(\Delta y\) |
| Log-log | \(\ln(y) = \beta_1 + \beta_2 \ln(x)\) | 1% \(\Delta x \implies \approx \beta_2\%\ \Delta y\) |
| Quadratic | \(y = \beta_1 + \beta_2 x + \beta_3 x^2\) | Marginal effect \(= \beta_2 + 2\beta_3 x\) (varies with \(x\)) |

13.7 Choosing a Functional Form

flowchart TD
    A["What does theory suggest?"] --> B{"Constant % effect?"}
    A --> C{"Diminishing returns?"}
    A --> D{"Effect changes sign?"}
    A --> E{"Elasticity is the<br/>natural parameter?"}
    B -->|Yes| F["Log-Linear<br/>ln(y) = β₁ + β₂x"]
    C -->|Yes| G{"y always positive?"}
    G -->|Yes| H["Log-Log<br/>ln(y) = β₁ + β₂ln(x)"]
    G -->|No| I["Linear-Log<br/>y = β₁ + β₂ln(x)"]
    D -->|Yes| J["Quadratic<br/>y = β₁ + β₂x + β₃x²"]
    E -->|Yes| H

    style F fill:#1E5A96,color:#fff
    style H fill:#1E5A96,color:#fff
    style I fill:#1E5A96,color:#fff
    style J fill:#1E5A96,color:#fff
Figure 13.1: Decision guide for choosing a functional form. Start with theory, then check residual plots.

Start with economic theory. Does the relationship have diminishing returns? Use log-log, linear-log, or quadratic. Is the effect best expressed in percentages? Use log-linear. Does the effect change sign? Use quadratic. Is the parameter an elasticity? Use log-log.

Then check the residuals. A U-shaped pattern in the residuals against \(x\) suggests a missed quadratic term. A fan shape suggests trying a log transformation. Finally, compare fits using \(R^2\) only when the dependent variable is the same (\(y\) vs. \(y\), or \(\ln y\) vs. \(\ln y\)). If the dependent variables differ, use the generalized \(R^2\): \(R_g^2 = [\text{Corr}(y, \hat{y})]^2\), which measures predictive accuracy in the original units regardless of the internal transformation.

Caution: Comparing \(R^2\) across functional forms

You cannot compare \(R^2\) from a linear model (\(y\) on the left) with \(R^2\) from a log model (\(\ln y\) on the left). The dependent variables differ, so the SST values differ. Use the generalized \(R^2 = [\text{Corr}(y, \hat{y})]^2\) to make valid comparisons.
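A minimal sketch of the generalized \(R^2\) computation; the tiny dataset is made up for illustration:

```javascript
// Generalized R² = [corr(y, ŷ)]², usable across functional forms because it
// always compares predictions to y in the original units.
function corr(a, b) {
  const n = a.length;
  const ma = a.reduce((s, v) => s + v, 0) / n;
  const mb = b.reduce((s, v) => s + v, 0) / n;
  let sab = 0, saa = 0, sbb = 0;
  for (let i = 0; i < n; i++) {
    sab += (a[i] - ma) * (b[i] - mb);
    saa += (a[i] - ma) ** 2;
    sbb += (b[i] - mb) ** 2;
  }
  return sab / Math.sqrt(saa * sbb);
}

// Hypothetical observations and predictions back-transformed to levels:
const yObs = [10, 12, 15, 19, 24];
const yHatLevels = [10.5, 12.1, 14.6, 18.8, 24.3];
const R2gen = corr(yObs, yHatLevels) ** 2;
console.log(R2gen.toFixed(3));
```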

If you estimated \(\ln(y) = \hat{\beta}_1 + \hat{\beta}_2 x\), the natural predictor \(\hat{y}_n = \exp(\hat{\beta}_1 + \hat{\beta}_2 x)\) systematically underpredicts \(y\) because of Jensen’s inequality. The corrected predictor \(\hat{y}_c = \hat{y}_n \cdot e^{\hat{\sigma}^2/2}\) adjusts for this by incorporating the residual variance. Use the corrected predictor in samples with \(N > 30\).
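The Jensen's-inequality bias is easy to see by simulation. Under an assumed data-generating process \(\ln(y) = m + e\) with \(e \sim N(0, s^2)\), \(E[y] = e^{m + s^2/2}\), not \(e^m\):

```javascript
// Simulate y = exp(m + e), e ~ N(0, s²), and compare the natural predictor
// exp(m) with the corrected predictor exp(m + s²/2). m and s are assumed values.
function randNormal() {
  // Box-Muller transform
  let u = 0, v = 0;
  while (u === 0) u = Math.random();
  while (v === 0) v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

const m = 2.0, s = 0.5, N = 200000;
let sum = 0;
for (let i = 0; i < N; i++) sum += Math.exp(m + s * randNormal());
const meanY = sum / N;

console.log(meanY.toFixed(2));                   // ≈ 8.37: the true mean of y
console.log(Math.exp(m).toFixed(3));             // 7.389: natural predictor, too low
console.log(Math.exp(m + s * s / 2).toFixed(3)); // 8.373: corrected predictor
```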

Interactive: Functional Form Explorer

Select a functional form from the dropdown to see how it fits simulated wage-education data. The left panel shows the fitted curve in the original units. The right panel shows the residuals against \(x\). The coefficient interpretation updates below.

viewof modelChoice = Inputs.select(
  ["Linear", "Log-Linear", "Log-Log", "Linear-Log", "Quadratic"],
  {value: "Linear", label: "Functional form"}
)

ff_data = {
  const rng = d3.randomLcg(123);
  const rnorm = d3.randomNormal.source(rng)(0, 1);
  const N = 80;

  // True DGP: log-linear with some curvature
  const educ = Array.from({length: N}, () => 8 + rng() * 12);
  const wage = educ.map(e => Math.exp(1.0 + 0.08 * e + 0.15 * rnorm()));

  // Fit each model
  function fitLinear(xs, ys) {
    const xbar = d3.mean(xs), ybar = d3.mean(ys);
    const Sxx = d3.sum(xs.map(x => (x - xbar)**2));
    const Sxy = d3.sum(xs.map((x,i) => (x - xbar)*(ys[i] - ybar)));
    const b2 = Sxy / Sxx, b1 = ybar - b2 * xbar;
    return {b1, b2, predict: x => b1 + b2 * x};
  }

  let fitted, residuals, interpretation;
  const lnWage = wage.map(w => Math.log(w));
  const lnEduc = educ.map(e => Math.log(e));

  if (modelChoice === "Linear") {
    const m = fitLinear(educ, wage);
    fitted = educ.map(x => m.predict(x));
    residuals = wage.map((y,i) => y - fitted[i]);
    interpretation = `1 year more education → $${m.b2.toFixed(2)} change in wage`;
  } else if (modelChoice === "Log-Linear") {
    const m = fitLinear(educ, lnWage);
    fitted = educ.map(x => Math.exp(m.predict(x)));
    residuals = wage.map((y,i) => y - fitted[i]);
    interpretation = `1 year more education → ≈${(m.b2*100).toFixed(1)}% change in wage`;
  } else if (modelChoice === "Log-Log") {
    const m = fitLinear(lnEduc, lnWage);
    fitted = educ.map(x => Math.exp(m.predict(Math.log(x))));
    residuals = wage.map((y,i) => y - fitted[i]);
    interpretation = `1% more education → ≈${m.b2.toFixed(2)}% change in wage (elasticity)`;
  } else if (modelChoice === "Linear-Log") {
    const m = fitLinear(lnEduc, wage);
    fitted = educ.map(x => m.predict(Math.log(x)));
    residuals = wage.map((y,i) => y - fitted[i]);
    interpretation = `1% more education → ≈$${(m.b2/100).toFixed(3)} change in wage`;
  } else {
    // Quadratic: fit y = b1 + b2*x + b3*x^2 by OLS, solving the normal
    // equations for the two centered regressors z1 = x - x̄, z2 = x² - mean(x²)
    const x2 = educ.map(x => x * x);
    const xbar = d3.mean(educ), x2bar = d3.mean(x2), ybar = d3.mean(wage);
    const z1 = educ.map(x => x - xbar);
    const z2 = x2.map(v => v - x2bar);
    const S11 = d3.sum(z1.map(z => z * z));
    const S22 = d3.sum(z2.map(z => z * z));
    const S12 = d3.sum(z1.map((z, i) => z * z2[i]));
    const S1y = d3.sum(z1.map((z, i) => z * (wage[i] - ybar)));
    const S2y = d3.sum(z2.map((z, i) => z * (wage[i] - ybar)));
    const det = S11 * S22 - S12 * S12;
    const b2 = (S22 * S1y - S12 * S2y) / det;
    const b3 = (S11 * S2y - S12 * S1y) / det;
    const b1 = ybar - b2 * xbar - b3 * x2bar;
    const predict = x => b1 + b2 * x + b3 * x * x;
    fitted = educ.map(x => predict(x));
    residuals = wage.map((y,i) => y - fitted[i]);
    interpretation = `Marginal effect = ${b2.toFixed(2)} + 2×(${b3.toFixed(4)})×educ (varies with education level)`;
  }

  // Generalized R² = [corr(y, ŷ)]², comparable across functional forms
  const ybar2 = d3.mean(wage), fbar = d3.mean(fitted);
  const Syf = d3.sum(wage.map((y, i) => (y - ybar2) * (fitted[i] - fbar)));
  const Syy = d3.sum(wage.map(y => (y - ybar2)**2));
  const Sff = d3.sum(fitted.map(f => (f - fbar)**2));
  const R2gen = (Syf * Syf) / (Syy * Sff);

  return {educ, wage, fitted, residuals, interpretation, R2gen};
}

html`<div style="display:flex; gap:1em; flex-wrap:wrap">
<div style="flex:1; min-width:320px">
${Plot.plot({
  width: 340, height: 300,
  x: {label: "Education (years)"},
  y: {label: "Wage ($/hr)"},
  marks: [
    Plot.dot(ff_data.educ.map((x,i) => ({x, y: ff_data.wage[i]})), {x: "x", y: "y", r: 2.5, fill: "#888"}),
    Plot.line(
      ff_data.educ.map((x,i) => ({x, y: ff_data.fitted[i]})).sort((a,b) => a.x - b.x),
      {x: "x", y: "y", stroke: "#1E5A96", strokeWidth: 2.5}
    )
  ]
})}
</div>
<div style="flex:1; min-width:320px">
${Plot.plot({
  width: 340, height: 300,
  x: {label: "Education (years)"},
  y: {label: "Residual"},
  marks: [
    Plot.ruleY([0], {stroke: "#aaa"}),
    Plot.dot(ff_data.educ.map((x,i) => ({x, y: ff_data.residuals[i]})), {x: "x", y: "y", r: 2.5, fill: "#C41E3A"})
  ]
})}
</div>
</div>`
html`<div style="margin-top:0.5em">
<strong>Interpretation:</strong> ${ff_data.interpretation}<br/>
<strong>Generalized R²:</strong> ${(ff_data.R2gen * 100).toFixed(1)}%
</div>`
Figure 13.2: Functional form explorer. Select a model to see the fitted curve and residual pattern. Well-specified models produce random residuals; misspecified ones show systematic patterns.

13.8 Practice

A researcher estimates \(\ln(Q) = 4.10 - 1.12 \ln(P)\) using annual chicken consumption data. Interpret the slope coefficient. Is chicken demand elastic or inelastic?

The coefficient \(-1.12\) is a constant elasticity. A 1% increase in the price of chicken reduces quantity demanded by approximately 1.12%. Since \(|\beta_2| = 1.12 > 1\), demand is slightly elastic: the percentage change in quantity exceeds the percentage change in price, so total revenue falls when price rises.
