LECTURE 3
Introduction to Econometrics
INTRODUCTION TO LINEAR REGRESSION ANALYSIS II
Hieu Nguyen
Fall semester, 2024

REVISION: THE PREVIOUS LECTURE
- (Desired) properties of an estimator:
  - An estimator is unbiased if the mean of its distribution is equal to the value of the parameter it is estimating
  - An estimator is consistent if it converges to the value of the true parameter as the sample size increases
  - An estimator is efficient if the variance of its sampling distribution is the smallest possible

REVISION: THE PREVIOUS LECTURE
- We explained the principle of the OLS estimator: minimizing the sum of squared differences between the observations and the regression line
  yi = β0 + β1·xi + εi
- We found the formulae for the estimates:
  β̂1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²
  β̂0 = ȳ − β̂1·x̄

REVISION: THE PREVIOUS LECTURE
- We explained that the stochastic error term must be present in a regression equation because of:
  1. omission of many minor influences (unavailable data)
  2. measurement error
  3. possibly incorrect functional form
  4. stochastic character of unpredictable human behavior
- Remember that all of these factors are included in the error term and may alter its properties
- The properties of the error term determine the properties of the estimates

WARM-UP EXERCISE
- You receive a unique dataset that includes the wages of all citizens of Brno as well as their experience (number of years spent working). Obviously, you are very curious about the effect of experience on wages.
- You run an OLS regression of monthly wage in CZK on the number of years of experience and obtain the following results: [estimated equation not reproduced in these notes]
  1. Interpret the meaning of the coefficient of experi.
  2. Use the estimates to determine the average wage of a person with 1, 5, 20, and 40 years of experience.
  3. Do the predicted wages seem realistic? Explain your answer.

ON TODAY'S LECTURE
- We will derive estimation formulas for multivariate OLS
- We will list the assumptions about the error term and the explanatory variables that are required in classical regression models
- We will show that under these assumptions, OLS is the best estimator available for regression models
- The rest of the course will mostly deal, in one way or another, with the question of what to do when one of the classical assumptions is not met
- Readings: Studenmund - chapter 4; Wooldridge - chapters 5, 8, 9, 12

ORDINARY LEAST SQUARES WITH SEVERAL EXPLANATORY VARIABLES
- Usually, there is more than one explanatory variable in a regression model
- Multivariate model with k explanatory variables:
  yi = β0 + β1·xi1 + β2·xi2 + ... + βk·xik + εi
- For observations 1, 2, ..., n, we have:
  y1 = β0 + β1·x11 + β2·x12 + ... + βk·x1k + ε1
  y2 = β0 + β1·x21 + β2·x22 + ... + βk·x2k + ε2
  ...
  yn = β0 + β1·xn1 + β2·xn2 + ... + βk·xnk + εn

MATRIX NOTATION
- We can stack these n equations in matrix form, or in a simplified notation: Y = Xβ + ε
- Here Y is the (n × 1) vector of observations of the dependent variable, X is the (n × (k+1)) matrix of explanatory variables (including a column of ones for the intercept), β is the ((k+1) × 1) vector of coefficients, and ε is the (n × 1) vector of error terms

OLS - DERIVATION UNDER MATRIX NOTATION (OPTIONAL)
- OLS minimizes the sum of squared residuals S(β) = (Y − Xβ)′(Y − Xβ)
- The first-order condition −2X′Y + 2X′Xβ = 0 gives the normal equations X′Xβ̂ = X′Y
- Hence β̂ = (X′X)⁻¹X′Y, provided X′X is invertible (no perfect collinearity); a numerical sketch follows
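As a numerical companion to the derivation above, here is a minimal sketch (not part of the original slides) that applies the formula β̂ = (X′X)⁻¹X′Y to simulated data; the use of numpy, the sample size, and all variable names are illustrative assumptions.

```python
# Minimal sketch: OLS via the normal equations, beta_hat = (X'X)^(-1) X'y.
# Simulated data; everything here is illustrative, not the lecture's code.
import numpy as np

rng = np.random.default_rng(42)
n, k = 500, 2                       # observations, explanatory variables
X = np.column_stack([np.ones(n),    # column of ones for the intercept
                     rng.normal(size=(n, k))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(size=n)

# Solve X'X beta = X'y directly (numerically safer than forming the inverse)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)                     # should be close to beta_true
```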
MEANING OF REGRESSION COEFFICIENT
- Consider the multivariate model
  Q = β0 + β1·P + β2·Ps + β3·Y + ε
  estimated as
  Q̂ = 31.50 − 0.73·P + 0.11·Ps + 0.23·Y
  where Q is quantity demanded, P the commodity's price, Ps the price of a substitute, and Y disposable income
- The meaning of β1 is the impact of a one-unit increase in P on the dependent variable Q, holding constant the other included independent variables Ps and Y
- When price increases by 1 unit (and the price of the substitute good and income remain the same), quantity demanded decreases by 0.73 units

EXERCISE
- Remember the unique dataset that includes the wages of all citizens of Brno as well as their experience (number of years spent working).
- Because you realize that wages may not be linearly dependent on experience, you add an additional variable experi² to your model and obtain the following results:
  wagei = 14450 + 1160·experi − 25·experi²
  1. What is the overall impact of increasing the number of years of experience by 1 year?
  2. Use the estimates to determine the average wage of a person with 1, 5, 20, and 40 years of experience.
  3. Do the predicted wages seem realistic now? Explain your answer.

THE CLASSICAL ASSUMPTIONS
1. Linearity: the regression model is linear in the parameters (coefficients)
2. Random sampling: the data is a random sample drawn from the population and each data point follows the population equation
3. No perfect collinearity: the values of explanatory variables are not all the same and no explanatory variable is a perfect linear function of any other explanatory variable(s)
4. Zero conditional mean: values of explanatory variables must contain no information about the mean of the unobserved factors - explanatory variables are uncorrelated with the error term
5. Homoskedasticity: the error term has a constant variance
6. Normality of the error term: the error term is normally distributed

1. LINEARITY IN PARAMETERS
The regression model is linear in coefficients.
- Linearity in variables is not required
- Example: the production function Y = A·K^β1·L^β2, for which we suppose A = e^(β0+ε), can be transformed so that
  ln Y = β0 + β1·ln K + β2·ln L + ε
  and the linearity in coefficients is restored
- Note that it is the linearity in coefficients that allows us to rewrite the general regression model in matrix form

EXERCISE
Which of the following models is/are linear?
[list of candidate models not reproduced in these notes]

2. RANDOM SAMPLING
The data is a random sample drawn from the population and each data point follows the population equation.
- Discussed during the last class

3. NO PERFECT COLLINEARITY
The values of explanatory variables are not all the same and no explanatory variable is a perfect linear function of any other explanatory variable(s).
- If this condition does not hold, we talk about (multi)collinearity
- Multicollinearity can be perfect or imperfect
- Perfect multicollinearity: one explanatory variable is an exact linear function of one or more other explanatory variables
  - In this case, OLS is incapable of distinguishing one variable from the other, and estimation cannot be conducted (see the numerical sketch below)
  - Example: we include dummy variables for both men and women together with the intercept
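To see why perfect collinearity is fatal for OLS, the following sketch (my illustration of the men/women dummy example from the slide, not the lecture's code) shows that X′X loses full rank and the normal equations cannot be solved; numpy and all names are assumptions.

```python
# Sketch: with perfect collinearity, X'X is singular and OLS breaks down.
import numpy as np

rng = np.random.default_rng(0)
n = 100
female = rng.integers(0, 2, size=n)              # dummy: 1 if female
male = 1 - female                                # dummy: 1 if male
X = np.column_stack([np.ones(n), female, male])  # intercept + both dummies

# intercept column = female + male for every observation: an exact linear
# relation, so X'X has rank 2 instead of 3
print(np.linalg.matrix_rank(X.T @ X))            # prints 2
try:
    np.linalg.solve(X.T @ X, X.T @ rng.normal(size=n))
except np.linalg.LinAlgError as err:
    print("Normal equations cannot be solved:", err)
```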
3. NO PERFECT COLLINEARITY
- Imperfect multicollinearity: there is a linear relationship between the variables, but there is some error in that relationship
  - Example: we include two variables that both proxy for individual health status
- Consequences of multicollinearity:
  - Estimated coefficients remain unbiased
  - But the standard errors of the estimates are inflated, making variables appear insignificant even though they might be significant
- Solution: drop one of the variables

EXERCISE
- Which of the following pairs of independent variables would violate the assumption of no multicollinearity? (That is, which pairs of variables are perfect linear functions of each other?)
  - right shoe size and left shoe size (of students in the class)
  - consumption and disposable income (in the United States over the last 30 years)
  - Xi and 2Xi
  - Xi and (Xi)²

4. ZERO CONDITIONAL MEAN
The error term has a zero population mean.
- Notation: E[εi] = 0 or E[ε] = 0
- Idea: observations are distributed around the regression line, and the average of the deviations is zero
- On average, we make no "mistakes"
- This assumption is satisfied as long as there is an intercept included in the equation

4. ZERO CONDITIONAL MEAN
All explanatory variables are uncorrelated with the error term.
- Notation: E[xi·εi] = 0 or E[Xj·ε] = 0
- If an explanatory variable and the error term were correlated with each other, the OLS estimates would be likely to attribute some of the variation in y to the x when it actually came from the error term
- Example: impact of skipping classes on exam scores - motivated students are less likely to skip classes, so there is a negative correlation between the number of skipped classes and the error term
- A violation leads to biased and inconsistent estimates
- We will solve this problem using the IV approach

5. HOMOSKEDASTICITY
The error term has a constant variance: Var(εi|Xi) = σ²
- If this is not satisfied, we talk about heteroskedasticity
- The assumption states that each observation of the error is drawn from a distribution with the same variance and thus varies in the same manner around the regression line
- If the error term is heteroskedastic, it is more difficult for OLS to get precise estimates of the coefficients of the explanatory variables
- Technically: the OLS estimate will be consistent, but not efficient

5. HOMOSKEDASTICITY
- Heteroskedasticity is often present in cross-sectional data
- Example: analysis of household consumption patterns
  - The variance of the consumption of certain goods might be greater for higher-income households
  - These have more discretionary income than lower-income households do
- We will solve this problem using Huber-White robust standard errors (a sketch follows the figures below)

5. HOMOSKEDASTICITY - GRAPHICAL REPRESENTATION
[Figure: homoskedastic errors - observations spread evenly around the regression line at all values of x; axes: x, Y]
[Figure: heteroskedastic errors - the spread of observations around the regression line changes with x; axes: x, Y]
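The robust-standard-errors remedy mentioned above can be sketched as follows. This is an illustration on simulated heteroskedastic data, assuming the statsmodels library; the lecture itself prescribes no software, and all names and parameter values are my own.

```python
# Sketch: heteroskedasticity-robust (Huber-White) standard errors
# with statsmodels, on simulated data where the error variance grows with x.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1000
x = rng.uniform(0, 10, size=n)
eps = rng.normal(scale=0.5 * x, size=n)     # non-constant error variance
y = 1.0 + 2.0 * x + eps

X = sm.add_constant(x)
classical = sm.OLS(y, X).fit()              # classical standard errors
robust = sm.OLS(y, X).fit(cov_type="HC1")   # Huber-White robust errors

print(classical.bse)   # misleading under heteroskedasticity
print(robust.bse)      # robust standard errors
```

Note that the coefficient estimates are identical in both fits; only the standard errors (and hence the inference) change.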
6. NORMALITY OF THE ERROR TERM
The error term is normally distributed.
- This is an empirical question
- Normality of the error term is inherited by the estimate β̂
- Knowing the distribution of the estimate allows us to find its confidence intervals and to test hypotheses about coefficients

PROPERTIES OF THE OLS ESTIMATE
- The OLS estimate is defined by the formula β̂ = (X′X)⁻¹X′y, where y = Xβ + ε
- Hence, it is dependent on the random variable ε and thus is a random variable itself
- The properties of β̂ are based on the properties of ε

EXPECTED VALUE OF THE OLS ESTIMATOR
- Under assumptions 1-4, OLS is unbiased: E[β̂] = β
- The estimated coefficients may be smaller or larger, depending on the sample
- However, on average, they will be equal to the true parameters
- NOTE: in a given sample, estimates may differ considerably from the true values

VARIANCE OF THE OLS ESTIMATOR
- Under assumptions 1-5, OLS is efficient: Var(β̂) = σ²(X′X)⁻¹
- The error variance σ² increases the variance of the estimator
- Variation in the explanatory variables reduces the variance of the estimator

GAUSS-MARKOV THEOREM
Under assumptions 1-5, the OLS estimator of β is the best linear unbiased estimator (BLUE) of the regression coefficients.
- NOTE: assumption 6, normality, is not needed for this theorem
- The Gauss-Markov theorem means that of all estimators that are linear in y and unbiased, OLS has the smallest variance

EXPECTED VALUE OF THE OLS ESTIMATE (OPTIONAL)
- E[β̂] = E[(X′X)⁻¹X′(Xβ + ε)] = β + (X′X)⁻¹X′E[ε] = β

VARIANCE OF THE OLS ESTIMATE (OPTIONAL)
- Var(β̂) = E[(β̂ − β)(β̂ − β)′] = (X′X)⁻¹X′E[εε′]X(X′X)⁻¹ = σ²(X′X)⁻¹, using E[εε′] = σ²I under homoskedasticity and no autocorrelation

NORMALITY OF THE OLS ESTIMATE
- If ε is normally distributed, then β̂ = β + (X′X)⁻¹X′ε is a linear function of ε and is therefore normally distributed as well: β̂ ~ N(β, σ²(X′X)⁻¹)

CONSISTENCY OF THE OLS ESTIMATE
- When no explanatory variables are correlated with the error term (assumption 4), the OLS estimate is consistent: plim(n→∞) β̂ = β
- In other words: as the number of observations increases, the estimate converges to the true value of the coefficient (illustrated by the simulation sketch at the end of these notes)

CONSISTENCY OF THE OLS ESTIMATE
- As long as the OLS estimate of β is consistent, the residuals are consistent estimates of the error term
- If we have consistent estimates of the error term, we can test whether it satisfies the classical assumptions
- Moreover, possible deviations from the classical model can be corrected
- As a consequence, the assumption of zero correlation between explanatory variables and the error term is the most important one to satisfy in regression models

SUMMARY
- We expressed the multivariate OLS model in matrix notation y = Xβ + ε and we found the formula of the estimate: β̂ = (X′X)⁻¹X′y
- We listed the classical assumptions of regression models:
  - model linear in parameters, random sampling, explanatory variables linearly independent
  - (normally distributed) error term with zero mean and constant variance
  - no correlation between error term and explanatory variables
- We showed that if these assumptions hold, the OLS estimate is
  - consistent (if no correlation between X and ε)
  - unbiased (if no correlation between X and ε)
  - efficient (if homoskedasticity and no autocorrelation of ε)
  - normally distributed (if ε normally distributed)
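To close, a small Monte Carlo sketch (not from the slides) illustrating the consistency result referenced above: the estimated slope approaches the true β1 as the sample size n grows. The data-generating process and all parameter values are illustrative assumptions.

```python
# Sketch: consistency of OLS - the slope estimate converges to the
# true value as the sample size grows.
import numpy as np

rng = np.random.default_rng(7)
beta0, beta1 = 1.0, 2.0

for n in (50, 500, 5000, 50000):
    x = rng.normal(size=n)
    y = beta0 + beta1 * x + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])
    b = np.linalg.solve(X.T @ X, X.T @ y)
    print(n, b[1])   # slope estimates approach beta1 = 2.0
```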