LECTURE 5
Introduction to Econometrics
Nonlinear specifications and dummy variables
November 27, 2020

TESTING MULTIPLE HYPOTHESES REVISITED
• Suppose we have the model
yi = β0 + β1xi1 + β2xi2 + β3xi3 + εi
• Suppose we want to test multiple linear hypotheses in this model
• For example, we want to see whether the following restrictions on the coefficients hold jointly:
β1 + β2 = 1 and β3 = 0
• We cannot use a t-test in this case (a t-test can be used for only one hypothesis at a time)
• We will use an F-test

RESTRICTED VS. UNRESTRICTED MODEL
• We can reformulate the model by imposing the restrictions as if they were true (the model under H0)
• We call this the restricted model, as opposed to the unrestricted model
• The unrestricted model is
yi = β0 + β1xi1 + β2xi2 + β3xi3 + εi
• The restricted model can be derived to have the following form:
y∗i = β0 + β1x∗i + εi ,
where y∗i = yi − xi2 and x∗i = xi1 − xi2
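The restricted form follows from substituting the two restrictions β1 + β2 = 1 and β3 = 0 into the unrestricted model. A short derivation, written out as a sketch in LaTeX:

```latex
\begin{align*}
y_i &= \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \varepsilon_i \\
    &= \beta_0 + \beta_1 x_{i1} + (1 - \beta_1) x_{i2} + 0 \cdot x_{i3} + \varepsilon_i
       && \text{(using } \beta_2 = 1 - \beta_1,\ \beta_3 = 0\text{)} \\
y_i - x_{i2} &= \beta_0 + \beta_1 (x_{i1} - x_{i2}) + \varepsilon_i \\
y_i^{*} &= \beta_0 + \beta_1 x_i^{*} + \varepsilon_i
\end{align*}
```

The residuals of this transformed regression are the residuals of the original model with the restrictions imposed, so its sum of squared residuals can be compared directly with that of the unrestricted model.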

IDEA OF THE F-TEST
• If the restrictions are true, then the restricted model fits the data about as well as the unrestricted model
  - residuals are nearly the same
• If the restrictions are false, then the restricted model fits the data poorly
  - residuals from the restricted model are much larger than those from the unrestricted model
• The idea is thus to compare the residuals from the two models

IDEA OF THE F-TEST
• How to compare residuals in the two models?
  - Calculate the sum of squared residuals in the two models
  - Test if the difference between the two sums is equal to zero (statistically)
  - H0: the difference is zero (residuals in the two models are the same, restrictions hold)
  - HA: the difference is positive (residuals in the restricted model are bigger, restrictions do not hold)
• Sum of squared residuals:
$$SSR = \sum_{i=1}^{n} e_i^2, \quad \text{where } e_i = y_i - \hat{y}_i \text{ is the residual}$$

F-TEST
• The test statistic is defined as
$$F = \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n - k - 1)} \sim F_{q,\,n-k-1},$$
where:
SSRr . . . sum of squared residuals from the restricted model
SSRur . . . sum of squared residuals from the unrestricted model
q . . . number of restrictions
n . . . number of observations
k . . . number of estimated slope coefficients in the unrestricted model (the intercept is not counted)
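The following is a minimal sketch of how the F statistic could be computed for the restrictions β1 + β2 = 1 and β3 = 0. The data are simulated (an assumption, not from the lecture), and the helper functions `solve` and `ols_ssr` are hypothetical names written only for this illustration; they fit OLS through the normal equations using the standard library.

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols_ssr(columns, y):
    """Regress y on an intercept plus the given columns; return the sum of squared residuals."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

# Simulated data in which H0 is true: beta1 + beta2 = 1 and beta3 = 0
random.seed(0)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 + 0.4 * a + 0.6 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

# Unrestricted model: y on x1, x2, x3
ssr_ur = ols_ssr([x1, x2, x3], y)

# Restricted model: y* = y - x2 regressed on x* = x1 - x2
y_star = [yi - b for yi, b in zip(y, x2)]
x_star = [a - b for a, b in zip(x1, x2)]
ssr_r = ols_ssr([x_star], y_star)

q, k = 2, 3  # two restrictions, three slope coefficients
F = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))
print(f"SSR_r = {ssr_r:.3f}, SSR_ur = {ssr_ur:.3f}, F = {F:.3f}")
```

The resulting F would then be compared with the critical value of the F distribution with q and n − k − 1 degrees of freedom, taken from a table or a statistics package.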

GOODNESS OF FIT MEASURE
• We know that education and experience have a significant influence on wages
• But how important are they in determining wages?
• How much of the difference in wages between people is explained by differences in education and in experience?
• How well does variation in the independent variable(s) explain variation in the dependent variable?
• These are the questions answered by the goodness of fit measure - R²

TOTAL AND EXPLAINED VARIATION
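• Total variation in the dependent variable:
$$\sum_{i=1}^{n} (y_i - \bar{y})^2$$
• Predicted value of the dependent variable = part that is explained by the independent variables:
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$
(case of a regression line, for simplicity of notation)
• Explained variation in the dependent variable:
$$\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$$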

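GOODNESS OF FIT - R²
• Denote:
$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2 \quad \text{(total sum of squares)}$$
$$SSE = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 \quad \text{(explained sum of squares)}$$
• Define the measure of the goodness of fit:
$$R^2 = \frac{SSE}{SST} = \frac{\text{Explained variation in } y}{\text{Total variation in } y}$$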

GOODNESS OF FIT - R²
• In OLS models with an intercept: 0 ≤ R² ≤ 1
• R² tells us what percentage of the total variation in the dependent variable is explained by the variation in the independent variable(s)
  - R² = 0.3 means that the independent variables can explain 30% of the variation in the dependent variable
• Higher R² means a better fit of the regression model (not necessarily a better model!)

DECOMPOSING THE VARIANCE
• For models with an intercept, R² can be rewritten using the decomposition of variance.
• Variance decomposition:
$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} e_i^2$$

VARIANCE DECOMPOSITION AND R²
• Variance decomposition: SST = SSE + SSR
• Intuition: total variation can be divided between the explained variation and the unexplained variation
  - residual ei (unexplained part)
• We can rewrite R²:
$$R^2 = \frac{SSE}{SST} = \frac{SST - SSR}{SST} = 1 - \frac{SSR}{SST}$$
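As an illustration, the decomposition SST = SSE + SSR and the two equivalent expressions for R² can be checked numerically. The sketch below fits a regression line with an intercept; the data are hypothetical and chosen only for the example.

```python
# Hypothetical data: years of education (x) and hourly wage (y)
x = [8, 10, 12, 12, 14, 16, 16, 18]
y = [9.0, 11.5, 12.0, 14.5, 15.0, 19.0, 17.5, 22.0]

n = len(y)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS estimates for the regression line y = b0 + b1 * x
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
sse = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained variation
ssr = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained variation

r2 = sse / sst
r2_alt = 1 - ssr / sst
print(f"SST = {sst:.3f}, SSE + SSR = {sse + ssr:.3f}")
print(f"R2 = {r2:.4f}, 1 - SSR/SST = {r2_alt:.4f}")
```

The two printed sums agree, and so do the two versions of R², because the model includes an intercept.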

ADJUSTED R²
• The sum of squared residuals (SSR) decreases (or at least does not increase) when additional explanatory variables are introduced into the model, whereas the total sum of squares (SST) remains the same
  - $R^2 = 1 - \frac{SSR}{SST}$ increases if we add explanatory variables
  - Models with more variables automatically have a better fit.
• To deal with this problem, we define the adjusted R²:
$$R^2_{adj} = 1 - \frac{SSR/(n - k - 1)}{SST/(n - 1)} \le R^2$$
(k is the number of slope coefficients, not counting the intercept)
• This measure introduces a “punishment” for including more explanatory variables
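A small sketch of the penalty at work, using hypothetical sums of squares (not taken from any data set in the lecture). Here k is taken to be the number of slope coefficients, and `adjusted_r2` is a helper name introduced only for this example.

```python
def r2(ssr, sst):
    """Ordinary R-squared: 1 - SSR/SST."""
    return 1 - ssr / sst

def adjusted_r2(ssr, sst, n, k):
    """Adjusted R-squared with n observations and k slope coefficients."""
    return 1 - (ssr / (n - k - 1)) / (sst / (n - 1))

n, sst = 50, 100.0

# Baseline model with k = 3 explanatory variables (hypothetical SSR)
r2_base = r2(40.0, sst)
adj_base = adjusted_r2(40.0, sst, n, k=3)

# Add a nearly useless fourth variable: SSR falls only slightly
r2_more = r2(39.9, sst)
adj_more = adjusted_r2(39.9, sst, n, k=4)

print(f"k=3: R2 = {r2_base:.4f}, adjusted R2 = {adj_base:.4f}")
print(f"k=4: R2 = {r2_more:.4f}, adjusted R2 = {adj_more:.4f}")
```

In this example R² rises when the fourth variable is added, while the adjusted R² falls, because the small drop in SSR does not compensate for the lost degree of freedom.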

FOUR IMPORTANT SPECIFICATION CRITERIA
Does a variable belong to the equation?
1. Theory: Is the variable’s place in the equation unambiguous and theoretically sound? Does intuition tell you it should be included?
2. t-test: Is the variable’s estimated coefficient significant in the expected direction?
3. R²: Does the overall fit of the equation improve (enough) when the variable is added to the equation?
4. Bias: Do other variables’ coefficients change significantly when the variable is added to the equation?

FOUR IMPORTANT SPECIFICATION CRITERIA
• If all conditions hold, the variable belongs in the equation
• If none of them holds, the variable is irrelevant and can be safely excluded
• If the criteria give contradictory answers, most importance should be attributed to theoretical justification
  - Therefore, if theory (intuition) says that a variable belongs in the equation, we include it (even though its coefficient might be insignificant!).

NONLINEAR SPECIFICATION
• We will discuss different specifications that are nonlinear in dependent and independent variables, and their interpretation
• We will define the notion of a dummy variable and we will show its different uses in linear regression models

NONLINEAR SPECIFICATION
• The relationship between the dependent variable and the explanatory variables is not always linear
  - The use of OLS requires that the equation be linear in coefficients
  - However, there is a wide variety of functional forms that are linear in coefficients while being nonlinear in variables!
• We have to choose carefully the functional form of the relationship between the dependent variable and each explanatory variable
  - The choice of a functional form should be based on the underlying economic theory and/or intuition
  - Do we expect a curve instead of a straight line? Does the effect of a variable peak at some point and then start to decline?

LINEAR FORM
y = β0 + β1x1 + β2x2 + ε
• Assumes that the effect of the explanatory variable on the dependent variable is constant:
$$\frac{\partial y}{\partial x_k} = \beta_k, \quad k = 1, 2$$
• Interpretation: if xk increases by 1 unit (in which xk is measured), then y will change by βk units (in which y is measured)
• The linear form is used as the default functional form until strong evidence that it is inappropriate is found

LOG-LOG FORM
ln y = β0 + β1 ln x1 + β2 ln x2 + ε
• Assumes that the elasticity of the dependent variable with respect to the explanatory variable is constant:
$$\frac{\partial \ln y}{\partial \ln x_k} = \frac{\partial y / y}{\partial x_k / x_k} = \beta_k, \quad k = 1, 2$$
• Interpretation: if xk increases by 1 percent, then y will change by βk percent
• Before using a double-log model, make sure that there are no negative or zero observations in the data set

EXAMPLE
• Estimating the production function of the Indian sugar industry:
$$\widehat{\ln Q} = 2.70 + \underset{(0.14)}{0.59} \ln L + \underset{(0.17)}{0.33} \ln K$$
Q . . . output
L . . . labor
K . . . capital employed
Interpretation: if we increase the amount of labor by 1%, the production of sugar will increase by 0.59%, ceteris paribus.
Ceteris paribus is a Latin phrase meaning ’other things being equal’.
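The elasticity reading is a first-order approximation. As a quick check, the sketch below computes the exact percentage change in output implied by the estimated log-log model when one input rises by 1% and the other is held fixed; the function name is introduced only for this illustration.

```python
# Estimated elasticities from the sugar production example
beta_labor = 0.59
beta_capital = 0.33

def pct_change_in_output(elasticity, pct_change_in_input):
    """Exact percentage change in Q implied by a log-log model, other inputs fixed."""
    return ((1 + pct_change_in_input / 100) ** elasticity - 1) * 100

labor_effect = pct_change_in_output(beta_labor, 1.0)
capital_effect = pct_change_in_output(beta_capital, 1.0)

print(f"1% more labor   -> {labor_effect:.3f}% more output")
print(f"1% more capital -> {capital_effect:.3f}% more output")
```

For a 1% change the exact figures are very close to the coefficients themselves, which is why the coefficients can be read directly as elasticities.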

LOG-LINEAR FORMS
• Linear-log form:
y = β0 + β1 ln x1 + β2 ln x2 + ε
Interpretation: if xk increases by 1 percent, then y will change by approximately (βk/100) units (k = 1, 2)
• Log-linear form:
ln y = β0 + β1x1 + β2x2 + ε
Interpretation: if xk increases by 1 unit, then y will change by approximately (βk ∗ 100) percent (k = 1, 2)

EXAMPLES OF LOG LINEAR FORMS
• Estimating demand for chicken meat:
Y . . . annual chicken consumption (kg)
PC . . . price of chicken
PB . . . price of beef
YD . . . annual disposable income
• Interpretation: An increase in annual disposable income by 1% increases chicken consumption by 0.12 kg per year, ceteris paribus.
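To see where a statement like "0.12 kg per 1%" comes from in a linear-log model, the sketch below assumes a coefficient of 12 on ln YD. That value is an assumption chosen to be consistent with the interpretation above, not a number quoted from the estimated equation.

```python
import math

beta_income = 12.0  # assumed coefficient on ln(YD), for illustration only

def change_in_y(beta, pct_change_in_x):
    """Exact change in y (in units of y) in a linear-log model."""
    return beta * math.log(1 + pct_change_in_x / 100)

exact = change_in_y(beta_income, 1.0)
approx = beta_income / 100  # the beta/100 rule of thumb

print(f"exact change:  {exact:.4f} kg")
print(f"approximation: {approx:.4f} kg")
```

The rule of thumb β/100 works because ln(1.01) is very close to 0.01.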

EXAMPLES OF LOG LINEAR FORMS
• Estimating the influence of education and experience on wages:
wage . . . annual wage (USD)
educ . . . years of education
exper . . . years of experience
• Interpretation: An increase in education by one year increases annual wage by 9.8%, ceteris paribus. An increase in experience by one year increases annual wage by 1%, ceteris paribus.
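The percentage reading of a log-linear model is an approximation that works well for small coefficients. The sketch below compares it with the exact change, exp(β) − 1. The coefficients 0.098 and 0.010 are assumed here because they match the interpretation above; they are not quoted from the estimated equation.

```python
import math

beta_educ = 0.098   # assumed coefficient on educ
beta_exper = 0.010  # assumed coefficient on exper

def exact_pct_change(beta, delta_x=1.0):
    """Exact percentage change in y when x rises by delta_x in a log-linear model."""
    return (math.exp(beta * delta_x) - 1) * 100

educ_exact = exact_pct_change(beta_educ)
exper_exact = exact_pct_change(beta_exper)

print(f"education:  approx {beta_educ * 100:.1f}%, exact {educ_exact:.2f}%")
print(f"experience: approx {beta_exper * 100:.1f}%, exact {exper_exact:.2f}%")
```

The gap between the approximate and exact figures grows with the size of the coefficient, so the exact formula is worth using when β is large.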

POLYNOMIAL FORM
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \varepsilon$$
• To determine the effect of x1 on y, we need to calculate the derivative:
$$\frac{\partial y}{\partial x_1} = \beta_1 + 2 \beta_2 x_1$$
• Clearly, the effect of x1 on y is not constant, but changes with the level of x1
• We might also have higher order polynomials, e.g.:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_1^3 + \beta_4 x_1^4 + \varepsilon$$
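For the quadratic form, the marginal effect β1 + 2·β2·x1 can be evaluated at several levels of x1, and the point where it changes sign is x1 = −β1/(2·β2). The sketch below uses hypothetical coefficients chosen only for illustration.

```python
def marginal_effect(x1, beta1, beta2):
    """Derivative of y = b0 + b1*x1 + b2*x1^2 with respect to x1."""
    return beta1 + 2 * beta2 * x1

def turning_point(beta1, beta2):
    """Value of x1 at which the marginal effect equals zero."""
    return -beta1 / (2 * beta2)

# Hypothetical coefficients: positive linear term, negative quadratic term
beta1, beta2 = 2.0, -0.05

effects = {x: marginal_effect(x, beta1, beta2) for x in (0, 10, 20, 30)}
peak = turning_point(beta1, beta2)

for x, effect in effects.items():
    print(f"x1 = {x:2d}: marginal effect = {effect:+.2f}")
print(f"effect of x1 changes sign at x1 = {peak:.1f}")
```

With β2 < 0 the effect of x1 is positive but shrinking up to the turning point and negative beyond it.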

EXAMPLE OF POLYNOMIAL FORM
• The impact of the number of hours of studying on the grade from Introductory Econometrics:
• To determine the effect of hours on grade, calculate the derivative:
  - Decreasing returns to hours of studying: more hours imply a higher grade, but the positive effect of an additional hour of studying decreases with more hours

CHOICE OF CORRECT FUNCTIONAL FORM
• The functional form has to be correctly specified in order to avoid biased and inconsistent estimates
  - Remember that one of the OLS assumptions is that the model is correctly specified
• Ideally: the specification is given by the underlying theory of the equation
• In reality: the underlying theory does not give a precise functional form
• In most cases, either the linear form is adequate, or common sense will point out an easy choice from among the alternatives

CHOICE OF CORRECT FUNCTIONAL FORM
• Nonlinearity of explanatory variables
  - often approximated by a polynomial form
  - missing higher powers of a variable can be detected as omitted variables (see next lecture)
• Nonlinearity of the dependent variable
  - harder to detect based on the statistical fit of the regression
  - R² is not comparable across models in which y is transformed
  - dependent variables are often transformed to log form in order to make their distribution closer to the normal distribution

DUMMY VARIABLES
• Dummy variable - takes on the values of 0 or 1, depending on a qualitative attribute
• Examples of dummy variables:

INTERCEPT DUMMY
• A dummy variable included in a regression alone (not interacted with other variables) is an intercept dummy
• It changes the intercept for the subset of data defined by a dummy variable condition:
yi = β0 + β1Di + β2xi + εi
where
$$D_i = \begin{cases} 1 & \text{if the } i\text{-th observation meets a particular condition} \\ 0 & \text{otherwise} \end{cases}$$
• We have
yi = (β0 + β1) + β2xi + εi if Di = 1
yi = β0 + β2xi + εi if Di = 0

INTERCEPT DUMMY
[Figure: y plotted against X. Two parallel lines with slope β2: intercept β0 for Di = 0 and intercept β0 + β1 for Di = 1.]

EXAMPLE
• Estimating the determinants of wages:
• Interpretation of the dummy variable M: men earn on average $2.156 per hour more than women, ceteris paribus

SLOPE DUMMY
• If a dummy variable is interacted with another variable (x), it is a slope dummy.
• It changes the relationship between x and y for a subset of data defined by a dummy variable condition:
yi = β0 + β1xi + β2(xi · Di) + εi
• We have
yi = β0 + (β1 + β2)xi + εi if Di = 1
yi = β0 + β1xi + εi if Di = 0

SLOPE DUMMY
[Figure: y plotted against X. Two lines with common intercept β0: slope β1 for Di = 0 and slope β1 + β2 for Di = 1.]

EXAMPLE
• Estimating the determinants of wages:
• Interpretation: men gain on average 17 cents per hour more than women for each additional year of education, ceteris paribus

SLOPE AND INTERCEPT DUMMIES
• Allow for both a different slope and a different intercept for two subsets of data distinguished by a qualitative condition:
yi = β0 + β1Di + β2xi + β3(xi · Di) + εi
where
$$D_i = \begin{cases} 1 & \text{if the } i\text{-th observation meets a particular condition} \\ 0 & \text{otherwise} \end{cases}$$
• We have
yi = (β0 + β1) + (β2 + β3)xi + εi if Di = 1
yi = β0 + β2xi + εi if Di = 0
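The two cases can be recovered mechanically from the single equation with an intercept dummy and a slope dummy. The sketch below uses hypothetical coefficients (not estimates from the lecture) and a helper named `predicted_y` introduced only for this example.

```python
def predicted_y(x, d, b0, b1, b2, b3):
    """Fitted value from y = b0 + b1*D + b2*x + b3*(x*D), ignoring the error term."""
    return b0 + b1 * d + b2 * x + b3 * (x * d)

# Hypothetical coefficients, for illustration only
b0, b1, b2, b3 = 5.0, 2.0, 0.5, 0.25

# Intercepts: fitted value at x = 0 in each group
intercept_d0 = predicted_y(0, 0, b0, b1, b2, b3)  # b0
intercept_d1 = predicted_y(0, 1, b0, b1, b2, b3)  # b0 + b1

# Slopes: change in the fitted value when x rises from 0 to 1
slope_d0 = predicted_y(1, 0, b0, b1, b2, b3) - intercept_d0  # b2
slope_d1 = predicted_y(1, 1, b0, b1, b2, b3) - intercept_d1  # b2 + b3

print(f"D = 0: intercept {intercept_d0}, slope {slope_d0}")
print(f"D = 1: intercept {intercept_d1}, slope {slope_d1}")
```

Setting b3 = 0 gives the pure intercept dummy case, and setting b1 = 0 gives the pure slope dummy case.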

SLOPE AND INTERCEPT DUMMIES
[Figure: y plotted against X. Two lines: intercept β0 and slope β2 for Di = 0; intercept β0 + β1 and slope β2 + β3 for Di = 1.]

DUMMY VARIABLES - MULTIPLE CATEGORIES
• What if a variable defines three or more qualitative attributes?
• Example: level of education - elementary school, high school, and college
• Define and use a set of dummy variables:
• Should we also include a third dummy in the regression, which is equal to 1 for people with elementary education?
  - No, unless we exclude the intercept!
  - Using the full set of dummies leads to perfect multicollinearity (dummy variable trap)
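The dummy variable trap can be seen directly in the data: with one dummy per category, the dummies sum to 1 for every observation, which is exactly the intercept column. The sketch below uses a small hypothetical sample and variable names chosen only for illustration.

```python
# Hypothetical sample of education levels
education = ["elementary", "high school", "college",
             "college", "high school", "elementary"]

# One dummy per category
elementary = [1 if e == "elementary" else 0 for e in education]
high_school = [1 if e == "high school" else 0 for e in education]
college = [1 if e == "college" else 0 for e in education]

intercept = [1] * len(education)

# Full set of dummies: their sum reproduces the intercept column exactly
full_sum = [a + b + c for a, b, c in zip(elementary, high_school, college)]
print(full_sum == intercept)  # True: perfect multicollinearity

# Dropping one category (elementary as the base group) removes the problem
reduced_sum = [b + c for b, c in zip(high_school, college)]
print(reduced_sum == intercept)  # False
```

With elementary education as the omitted base group, the coefficients on the remaining dummies are interpreted relative to that group.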

SUMMARY
• We discussed different nonlinear specifications of a regression equation and their interpretation
• We defined the concept of a dummy variable and we showed its use
• Further readings:
  - Studenmund, Chapter 7
  - Wooldridge, Chapters 6 & 7