Econometrics
Introduction to Linear Regression Analysis II
Anna Donina
Lecture 3

REVISION: THE PREVIOUS LECTURE
(Desired) properties of an estimator:
• An estimator is unbiased if the mean of its distribution is equal to the value of the parameter it is estimating
• An estimator is consistent if it converges to the value of the true parameter as the sample size increases
• An estimator is efficient if the variance of its sampling distribution is the smallest possible

REVISION: THE PREVIOUS LECTURE
• We explained the principle of the OLS estimator: minimizing the sum of squared differences between the observations and the regression line
yi = β0 + β1xi + εi
• We found the formulas for the estimates:
β̂1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²,  β̂0 = ȳ − β̂1x̄

REVISION: THE PREVIOUS LECTURE
• We explained that the stochastic error term must be present in a regression equation because of:
1. omission of many minor influences (unavailable data)
2. measurement error
3. possibly incorrect functional form
4. stochastic character of unpredictable human behavior
• All these factors are included in the error term and may alter its properties
• The properties of the error term determine the properties of the estimates

WARM-UP EXERCISE
• You receive a unique dataset that includes the wages of all citizens of Brno as well as their experience (number of years spent working). Obviously, you are very curious about the effect of experience on wages.
• You run an OLS regression of monthly wage in CZK on the number of years of experience and obtain the following results:
[estimated equation shown on the slide]
1. Interpret the meaning of the coefficient of experi.
2. Use the estimates to determine the average wage of a person with 1, 5, 20, and 40 years of experience.
3. Do the predicted wages seem realistic? Explain your answer.

ON TODAY’S LECTURE
• We will derive the estimation formula for multivariate OLS
• We will list the assumptions about the error term and the explanatory variables that are required in classical regression models
• We will show that under these assumptions, OLS is the best estimator available for regression models
• The rest of the course will mostly deal, in one way or another, with the question of what to do when one of the classical assumptions is not met
• Readings: Studenmund - chapter 4; Wooldridge - chapter 3

ORDINARY LEAST SQUARES WITH SEVERAL EXPLANATORY VARIABLES
• Usually, there is more than one explanatory variable in a regression model
• Multivariate model with k explanatory variables:
yi = β0 + β1xi1 + β2xi2 + ... + βkxik + εi
• For observations 1, 2, ..., n, we have:
y1 = β0 + β1x11 + β2x12 + ... + βkx1k + ε1
y2 = β0 + β1x21 + β2x22 + ... + βkx2k + ε2
...
yn = β0 + β1xn1 + β2xn2 + ... + βkxnk + εn

MATRIX NOTATION
• We can write these n equations in matrix form: y is the n×1 vector of observations of the dependent variable, X is the n×(k+1) matrix whose i-th row is (1, xi1, xi2, ..., xik), β is the (k+1)×1 vector of coefficients, and ε is the n×1 vector of error terms
• In simplified notation:
y = Xβ + ε

OLS - DERIVATION UNDER MATRIX NOTATION (OPTIONAL)
• OLS minimizes the sum of squared residuals S(β) = (y − Xβ)′(y − Xβ)
• The first-order condition −2X′y + 2X′Xβ̂ = 0 yields the normal equations X′Xβ̂ = X′y
• Provided X′X is invertible (no perfect collinearity), the OLS estimate is
β̂ = (X′X)⁻¹X′y
• A minimal numeric illustration of this formula is sketched below
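To make the formula concrete, here is a minimal Python/NumPy sketch that generates synthetic data and computes β̂ = (X′X)⁻¹X′y via the normal equations. Everything in it (the sample size, the data-generating values in beta_true, the variable names) is illustrative, not taken from the lecture.

```python
import numpy as np

# Minimal sketch: OLS via the normal equations, beta_hat = (X'X)^(-1) X'y.
rng = np.random.default_rng(42)
n, k = 200, 2

x = rng.normal(size=(n, k))           # two explanatory variables
X = np.column_stack([np.ones(n), x])  # prepend a column of ones for the intercept
beta_true = np.array([1.0, 0.5, -2.0])
eps = rng.normal(size=n)              # error term: zero mean, constant variance
y = X @ beta_true + eps

# Solve X'X beta = X'y instead of inverting X'X explicitly (numerically safer)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # should lie close to beta_true
```

Solving the normal equations with np.linalg.solve avoids forming (X′X)⁻¹ explicitly; the algebra is the same as in the formula above.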
MEANING OF REGRESSION COEFFICIENT
• Consider the multivariate model
Q = β0 + β1P + β2Ps + β3Y + ε
estimated as
Q̂ = 31.50 − 0.73P + 0.11Ps + 0.23Y
Q ... quantity demanded
P ... commodity’s price
Ps ... price of a substitute
Y ... disposable income
• The meaning of β1 is the impact of a one-unit increase in P on the dependent variable Q, holding constant the other included independent variables Ps and Y
• When the price increases by 1 unit (and the price of the substitute good and income remain the same), quantity demanded decreases by 0.73 units

EXERCISE
• Remember the unique dataset that includes the wages of all citizens of Brno as well as their experience (number of years spent working).
• Because you realize that wages may not depend linearly on experience, you add the additional variable experi² to your model and obtain the following results:
ŵagei = 14450 + 1160·experi − 25·experi²
1. What is the overall impact of increasing the number of years of experience by 1 year?
2. Use the estimates to determine the average wage of a person with 1, 5, 20, and 40 years of experience.
3. Do the predicted wages seem realistic now? Explain your answer.
(A quick numeric check of these questions is sketched below.)
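If you want to verify your answers, the sketch below simply evaluates the estimated equation from the slide. The helper names wage_hat and marginal_effect are ours; the marginal effect is the derivative of the fitted quadratic, 1160 − 50·exper.

```python
# Numeric check of the exercise, using the reported estimates
# wage_hat = 14450 + 1160*exper - 25*exper^2.

def wage_hat(exper: float) -> float:
    """Predicted monthly wage (CZK) at a given number of years of experience."""
    return 14450 + 1160 * exper - 25 * exper**2

def marginal_effect(exper: float) -> float:
    """Impact of one more year of experience: d(wage)/d(exper) = 1160 - 50*exper."""
    return 1160 - 50 * exper

for years in (1, 5, 20, 40):
    print(years, wage_hat(years), marginal_effect(years))
```

Because of the quadratic term, the marginal effect depends on experience: predicted wages rise while 1160 − 50·exper > 0, i.e. up to 23.2 years of experience, and fall afterwards.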
THE CLASSICAL ASSUMPTIONS
1. Linearity: the regression model is linear in the parameters (coefficients)
2. Random sampling: the data is a random sample drawn from the population and each data point follows the population equation
3. No perfect collinearity: the values of the explanatory variables are not all the same and no explanatory variable is a perfect linear function of any other explanatory variable(s)
4. Zero conditional mean: the values of the explanatory variables must contain no information about the mean of the unobserved factors - explanatory variables are uncorrelated with the error term
5. Homoskedasticity: the error term has a constant variance
6. Normality of the error term: the error term is normally distributed

1. LINEARITY IN PARAMETERS
The regression model is linear in coefficients.
• Linearity in variables is not required
• Example: the production function Y = A·K^β1·L^β2, for which we suppose A = e^(β0+ε), can be transformed so that
ln Y = β0 + β1 ln K + β2 ln L + ε
and the linearity in coefficients is restored
• Note that it is the linearity in coefficients that allows us to rewrite the general regression model in matrix form

EXERCISE
Which of the following models is/are linear?
[list of candidate models shown on the slide]

2. RANDOM SAMPLING
The data is a random sample drawn from the population and each data point follows the population equation.
• Discussed during the previous class

3. NO PERFECT COLLINEARITY
The values of the explanatory variables are not all the same and no explanatory variable is a perfect linear function of any other explanatory variable(s).
• If this condition does not hold, we talk about (multi)collinearity
• Multicollinearity can be perfect or imperfect
• Perfect multicollinearity: one explanatory variable is an exact linear function of one or more other explanatory variables
➢ In this case, OLS is incapable of distinguishing one variable from the other
➢ OLS estimation cannot be conducted
➢ Example: we include dummy variables for men and women together with the intercept

3. NO PERFECT COLLINEARITY
• Imperfect multicollinearity: there is a linear relationship between the variables, but there is some error in that relationship
➢ Example: we include two variables that proxy for individual health status
• Consequences of multicollinearity:
➢ Estimated coefficients remain unbiased
➢ But the standard errors of the estimates are inflated, making variables appear insignificant even though they might be significant
• (Potential) solution: drop one of the variables

EXERCISE
• Which of the following pairs of independent variables would violate the assumption of no perfect multicollinearity? (That is, which pairs of variables are perfect linear functions of each other?) A simulation sketch follows the list.
▪ right shoe size and left shoe size (of students in the class)
▪ consumption and disposable income (in the CR)
▪ Xi and 2Xi
▪ Xi and (Xi)²
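The dummy-trap example from the collinearity slide (and equally the pair Xi and 2Xi) can be simulated directly. In the sketch below, on synthetic and purely illustrative data, dummies for men and women are included together with an intercept, so one column of X is an exact linear function of the others and X′X is singular.

```python
import numpy as np

# Perfect multicollinearity: dummies for men and women plus an intercept.
rng = np.random.default_rng(0)
n = 100
male = rng.integers(0, 2, size=n)  # 1 if male, 0 otherwise
female = 1 - male                  # exact linear function: female = 1 - male

X = np.column_stack([np.ones(n), male, female])

# The intercept column equals male + female in every row, so X'X is singular
print(np.linalg.matrix_rank(X.T @ X))  # prints 2, not 3

try:
    np.linalg.solve(X.T @ X, X.T @ rng.normal(size=n))
except np.linalg.LinAlgError as err:
    print("OLS cannot be computed:", err)  # singular matrix
```

OLS has no way to apportion the intercept among the three columns, which is exactly why estimation cannot be conducted under perfect collinearity.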
4. BEFORE ZERO CONDITIONAL MEAN
The error term has a zero population mean.
• Notation: E[εi] = 0 or E[ε] = 0
• Idea: observations are distributed around the regression line; the average of the deviations is zero
• On average, we make no “mistakes”
• This assumption is satisfied as long as there is an intercept included in the equation

4. ZERO CONDITIONAL MEAN
All explanatory variables are uncorrelated with the error term.
• Notation: E[xiεi] = 0 or E[Xjε] = 0
• If an explanatory variable and the error term were correlated with each other, OLS would be likely to attribute some of the variation in y to the x when it actually came from the error term
➢ Example: impact of skipping classes on exam scores
➢ Motivated students are less likely to skip classes → negative correlation between skipped and the error term
• Leads to biased and inconsistent estimates
➢ We will solve this problem using the IV approach

5. HOMOSKEDASTICITY
The error term has a constant variance: Var(εi|Xi) = σ²
• If this is not satisfied, we talk about heteroskedasticity
• The assumption states that each observation of the error is drawn from a distribution with the same variance and thus varies in the same manner around the regression line
• If the error term is heteroskedastic, it is more difficult for OLS to get precise estimates of the coefficients of the explanatory variables
• Technically: the OLS estimate will be consistent, but not efficient

5. HOMOSKEDASTICITY - GRAPHICAL REPRESENTATION
[Figures: scatter of Y against x with constant spread around the regression line (homoskedasticity) vs. spread that changes with x (heteroskedasticity)]

5. HOMOSKEDASTICITY
• Heteroskedasticity is often present in cross-sectional data
• Example: analysis of household consumption patterns
➢ The variance of the consumption of certain goods might be greater for higher-income households
➢ They have more discretionary income than lower-income households
• We will solve this problem using Huber-White robust standard errors

6. NORMALITY OF THE ERROR TERM
The error term is normally distributed.
• This is an empirical question
• Normality of the error term is inherited by the estimate β̂
• Knowing the distribution of the estimate allows us to find its confidence intervals and to test hypotheses about the coefficients

PROPERTIES OF THE OLS ESTIMATE
• The OLS estimate is defined by the formula
β̂ = (X′X)⁻¹X′y, where y = Xβ + ε
• Hence, it depends on the random variable ε, and thus β̂ is a random variable itself
• The properties of β̂ are based on the properties of ε

EXPECTED VALUE OF THE OLS ESTIMATOR
• Under assumptions 1-4, OLS is unbiased:
E[β̂] = β
• The estimated coefficients may be smaller or larger, depending on the sample
• However, on average, they will be equal to the true parameters
▪ NOTE: in a given sample, the estimates may differ considerably from the true values

VARIANCE OF THE OLS ESTIMATOR
• Under assumptions 1-5, OLS is efficient; the variance of the estimator of coefficient j is
Var(β̂j) = σ² / [SSTj(1 − Rj²)]
where SSTj is the total variation in xj and Rj² is the R-squared from regressing xj on the other explanatory variables
• The error variance σ² increases the variance of the estimator
• Variation in the explanatory variable reduces the variance of the estimator

GAUSS-MARKOV THEOREM
Under assumptions 1-5, the OLS estimator of β is the best linear unbiased estimator (BLUE) of the regression coefficients.
• NOTE: assumption 6, normality, is not needed for this theorem
• Meaning of the Gauss-Markov theorem: among all estimators that are linear functions of y and unbiased, OLS has the smallest variance

EXPECTED VALUE OF THE OLS ESTIMATE (OPTIONAL)
• β̂ = (X′X)⁻¹X′y = (X′X)⁻¹X′(Xβ + ε) = β + (X′X)⁻¹X′ε
• Taking expectations and using the zero conditional mean assumption: E[β̂] = β

VARIANCE OF THE OLS ESTIMATE (OPTIONAL)
• Var(β̂|X) = (X′X)⁻¹X′ Var(ε|X) X(X′X)⁻¹ = σ²(X′X)⁻¹, using Var(ε|X) = σ²I under homoskedasticity

NORMALITY OF THE OLS ESTIMATE
• β̂ = β + (X′X)⁻¹X′ε is a linear function of ε, so if the error term is normally distributed, then
β̂ ~ N(β, σ²(X′X)⁻¹)

CONSISTENCY OF THE OLS ESTIMATE
• When no explanatory variable is correlated with the error term (assumption 4), the OLS estimate is consistent:
plim β̂ = β as n → ∞
• In other words: as the number of observations increases, the estimate converges to the true value of the coefficient

CONSISTENCY OF THE OLS ESTIMATE
• As long as the OLS estimate β̂ is consistent, the residuals are consistent estimates of the error term
• If we have consistent estimates of the error term, we can test whether it satisfies the classical assumptions
• Moreover, possible deviations from the classical model can be corrected
• Consequently, the assumption of zero correlation between the explanatory variables and the error term is the most important one to satisfy in regression models

SUMMARY
• We expressed the multivariate OLS model in matrix notation y = Xβ + ε and we found the formula for the estimate:
β̂ = (X′X)⁻¹X′y
• We listed the classical assumptions of regression models:
➢ model linear in parameters, random sampling, explanatory variables linearly independent
➢ (normally distributed) error term with zero mean and constant variance
➢ no correlation between the error term and the explanatory variables
• We showed that if these assumptions hold, the OLS estimate is
➢ consistent (if no correlation between X and ε)
➢ unbiased (if no correlation between X and ε)
➢ efficient (if homoskedasticity and no autocorrelation of ε)
➢ normally distributed (if ε normally distributed)
(A short simulation illustrating unbiasedness and consistency is sketched below.)
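To close, a small simulation sketch of two of these properties under the classical assumptions. All settings (sample sizes, beta_true, normal homoskedastic errors, the number of replications) are illustrative assumptions, not prescribed by the lecture.

```python
import numpy as np

# Simulation sketch: unbiasedness and consistency of OLS.
rng = np.random.default_rng(1)
beta_true = np.array([2.0, 0.7])

def ols_slope(n: int) -> float:
    """Draw one sample of size n satisfying the classical assumptions
    and return the OLS estimate of the slope."""
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = X @ beta_true + rng.normal(size=n)  # normal, homoskedastic errors
    return np.linalg.solve(X.T @ X, X.T @ y)[1]

# Unbiasedness: the slope estimate averaged over many small samples
slopes = [ols_slope(50) for _ in range(5000)]
print(np.mean(slopes))  # close to the true value 0.7

# Consistency: a single estimate approaches the true value as n grows
for n in (50, 500, 50000):
    print(n, ols_slope(n))
```

Averaging over repeated samples illustrates E[β̂] = β, while the final loop illustrates plim β̂ = β: individual estimates still differ from 0.7 in any finite sample, but the deviations shrink as n increases.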