Econometrics 2 - Lecture 6
Models Based on Panel Data


Contents
- Panel Data
- Pooling Independent Cross-sectional Data
- Panel Data: Pooled OLS Estimation
- Panel Data Models
- Fixed Effects Model
- Random Effects Model
- Analysis of Panel Data Models
- Panel Data in GRETL
April 19, 2013
Hackl, Econometrics 2, Lecture 6

Types of Data
- Population of interest: individuals, households, companies, countries
- Types of observations
  - Cross-sectional data: observations of all units of a population, or of a representative subset,
    at one specific point in time
  - Time series data: series of observations on units of the population over a period of time
  - Panel data (longitudinal data): repeated observations of (the same) population units collected
    over a number of periods; data set with both a cross-sectional and a time series aspect;
    multi-dimensional data
- Cross-sectional and time series data are one-dimensional, special cases of panel data
- Pooling independent cross-sections: (only) similar to panel data

Example: Individual Wages
- Verbeek’s data set “males”
- Sample of
  - 545 full-time working males
  - each person observed yearly after completion of school in 1980 till 1987
- Variables
  - wage: log of hourly wage (in USD)
  - school: years of schooling
  - exper: age – 6 – school
  - dummies for union membership, married, black, Hispanic, public sector
  - others

Panel Data in GRETL
- Three types of data:
  - Cross-sectional data: matrix of observations, variables in the columns, each row corresponding
    to the set of variables observed for a unit
  - Time series data: matrix of observations, each column a time series, rows correspond to
    observation periods (annual, quarterly, etc.)
  - Panel data: matrix of observations with special data structure
    - Stacked time series: each column one variable, with stacked time series corresponding to
      observational units
    - Stacked cross sections: each column one variable, with stacked cross sections corresponding
      to observation periods
    - Use of index variables: index variables defined for units and observation periods

Stacked Data: Examples
Stacked time series

         unit   year   x1      x2
  1:1    1      2009   1.197   252
  1:2    1      2010   1.369   269
  1:3    1      2011   1.675   275
  ...    ...    ...    ...     ...
  2:1    2      2009   1.220   198
  2:2    2      2010   1.397   212
  2:3    2      2011   1.569   275
  ...    ...    ...    ...     ...

Stacked cross sections

         unit   year   x1      x2
  1:1    1      2009   1.197   252
  2:1    2      2009   1.220   198
  3:1    3      2009   1.173   167
  ...    ...    ...    ...     ...
  1:2    1      2010   1.369   269
  2:2    2      2010   1.397   212
  3:2    3      2010   1.358   201
  ...    ...    ...    ...     ...
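
The two layouts hold the same records in a different sort order. A minimal Python sketch, assuming each record is a (unit, year, x1, x2) tuple with values from the example tables; the function names are illustrative and not part of GRETL:

```python
# Records as (unit, year, x1, x2), taken from the example tables.
records = [
    (1, 2009, 1.197, 252),
    (1, 2010, 1.369, 269),
    (2, 2009, 1.220, 198),
    (2, 2010, 1.397, 212),
]


def stacked_time_series(rows):
    """Sort by unit first, then by year: one block of T rows per unit."""
    return sorted(rows, key=lambda r: (r[0], r[1]))


def stacked_cross_sections(rows):
    """Sort by year first, then by unit: one block of N rows per period."""
    return sorted(rows, key=lambda r: (r[1], r[0]))


for row in stacked_cross_sections(records):
    print(row)
```

Sorting back with stacked_time_series restores the original order, so no information is lost in either layout.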

Panel Data Files
- Files with one record per observation
  - For each unit (individual, company, country, etc.) T records
  - Stacked time series or stacked cross sections
  - Allows easy differencing
  - Time-constant variable: on each record the same value
- Files with one record per unit
  - Each record contains all observations for all T periods
  - Time-constant variables are stored only once

Panel Data
- Typically data at micro-economic level (individuals, households, firms), but also at
  macro-economic level (e.g., countries)
- Notation:
  - N: Number of cross-sectional units
  - T: Number of time periods
- Types of panel data:
  - Large T, small N: “long and narrow”
  - Small T, large N: “short and wide”
  - Large T, large N: “long and wide”
- Example: Data set “males”: short (T = 8) and wide (N = 545) panel (N ≫ T)

Panel Data: Some Examples
- Data set “males”: wages and related variables
  - short and wide panel (N = 545, T = 8)
  - rich in information (~40 variables)
  - unobserved heterogeneity
- Grunfeld investment data: investments in plant and equipment by
  - N = 10 firms
  - for each firm, T = 20 yearly observations for 1935-1954
- Penn World Table: purchasing power parity and national income accounts for
  - N = 189 countries/territories
  - for some or all of the years 1950-2009 (T ≤ 60)

Use of Panel Data
- Econometric models for describing the behaviour of cross-sectional units over time
- Panel data models
  - Allow controlling individual differences, comparing behaviour, analysing dynamic adjustment,
    measuring effects of policy changes
  - More realistic models
  - Allow more detailed or sophisticated research questions
- Methodological implications
  - Dependence of sample units in time-dimension
  - Some variables might be time-constant (e.g., variable school in “males”, population size in the
    Penn World Table dataset)
  - Missing values

Example: Wages and Experience
- Data set “males”
- Independent random samples for 1980 and 1987
- N_80 = N_87 = 100
- Variables: wage (log of hourly wage), exper (age – 6 – years of schooling)

                        1980                   1987
                    Full set   sample     Full set   sample
  wage    mean      1.39       1.37       1.87       1.89
          st.dev.   0.558      0.598      0.467      0.475
  exper   mean      3.01       2.96       10.02      9.99
          st.dev.   1.65       1.29       1.65       1.85
  exp(wage)         4.01                  6.49

Pooling of Samples
- Independent random samples:
  - Pooling gives an independently pooled cross section
  - OLS estimates with higher precision, tests with higher power
  - Requires
    - the same distributional properties of sampled variables
    - the same relation between variables in the samples

Example: Wages and Experience
- Some wage equations (for standard errors see the table of wage equations):
  - 1980 data
      wage = 1.315 + 0.026*exper, R² = 0.006
  - 1987 data
      wage = 2.441 – 0.057*exper, R² = 0.041
  - pooled 1980 and 1987 data
      wage = 1.289 + 0.052*exper, R² = 0.128
  - pooled data with dummy d87
      wage = 1.441 – 0.016*exper + 0.583*d87, R² = 0.177
  - pooled sample with dummy d87 and interaction
      wage = 1.315 + 0.026*exper + 1.126*d87 – 0.083*d87*exper
- d87: dummy for observations from 1987

Wage Equations
- Wage equations, dependent variable: wage (log of hourly wage)

                        1980     1987     80+87    80+87    80+87
  Interc.     coeff     1.315    2.441    1.289    1.441    1.315
              s.e.      0.050    0.120    0.031    0.036    0.045
  exper       coeff     0.026    -0.057   0.052    -0.016   0.026
              s.e.      0.014    0.012    0.004    0.009    0.013
  d87         coeff                                0.583    1.126
              s.e.                                 0.073    0.141
  d87*exper   coeff                                         -0.083
              s.e.                                          0.019
  R² (%)                0.6      4.1      12.8     17.7     19.2

- At least the intercept changes from 1980 to 1987

Pooled Independent Cross-sectional Data
- Pooling of two independent cross-sectional samples
    y_it = β_1 + β_2 x_it + ε_it for i = 1,...,N, t = 1,2
- Implicit assumption: identical β_1, β_2 for i = 1,...,N, t = 1,2
- OLS estimation requires
  - homoskedastic and uncorrelated ε_it
      E{ε_it} = 0, Var{ε_it} = σ² for i = 1,...,N, t = 1,2
      Cov{ε_i1, ε_j2} = 0 for all i, j with i ≠ j
  - exogenous x_it
- For the analysis of panel data, often a more realistic model is needed, taking into consideration
  - changing coefficients
  - correlated error terms
  - endogenous regressors

Model with Time Dummy
- Model for pooled independent cross-sectional data in presence of changes:
  - Dummy variable d: indicator for t = 2 (d_t = 0 for t = 1, d_t = 1 for t = 2)
      y_it = β_1 + β_2 x_it + β_3 d_t + β_4 d_t*x_it + ε_it
    allows changes (from t = 1 to t = 2)
    - of intercept from β_1 to β_1 + β_3
    - of coefficient of x from β_2 to β_2 + β_4
- Tests for constancy of (1) β_1 or (2) β_1, β_2 over time (cf. Chow test)
    H0(1): β_3 = 0 or H0(2): β_3 = β_4 = 0
- Similarly testing for constancy of σ² over time
- Generalization to more than two time periods
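
As a check on the interpretation of β_3 and β_4, the following Python sketch plugs in the estimates of the pooled 1980/1987 wage equation with d87 and d87*exper (1.315, 0.026, 1.126, –0.083) and recovers the period-specific intercepts and slopes. The function name is illustrative.

```python
def period_coefficients(b1, b2, b3, b4):
    """Return (intercept, slope) for t = 1 and for t = 2 in the model
    y = b1 + b2*x + b3*d + b4*d*x, where d = 1 only in period t = 2."""
    first = (b1, b2)
    second = (b1 + b3, b2 + b4)
    return first, second


# Estimates from the pooled wage equation with dummy d87 and interaction.
first, second = period_coefficients(1.315, 0.026, 1.126, -0.083)
print("1980: intercept %.3f, slope %.3f" % first)
print("1987: intercept %.3f, slope %.3f" % second)
```

The implied 1987 values agree with the separate 1987 regression (2.441 and –0.057): a model with a full set of interactions reproduces the period-by-period estimates.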

Example: Wages and Experience
- Wage equation
    wage_it = β_1 + β_2 exper_it + β_3 d_t + ε_it
- Wages might depend also on other variables; omitted variables are covered by the error terms
  - black: time-constant variable, omission may cause autocorrelation of error terms; similar other
    time-constant factors like hisp
  - mar (married): variable which is for many (not all) units time-constant, similar rural, union,
    ne (living in north east), etc.; omission may cause autocorrelation
  - school: omission may cause endogeneity of exper
- Unobserved and unobservable variables can have similar effects, e.g., parental background,
  attitudes, etc.

Problems with Sample Pooling
- The analysis of the data (y_it, x_it), i = 1,...,N, t = 1,2, by OLS estimation of the parameters
  of model
    y_it = β_1 + β_2 x_it + ε_it
  (or extensions based on a year dummy for t = 2) may not fulfil usual requirements
  - The independence assumption across time may be unrealistic
  - Main reason is that effects of non-measured and non-measurable variables are only covered by
    the error terms
  - Also exogeneity of regressors may be unrealistic
- Consequences: OLS estimates
  - biased and inconsistent
  - not efficient
- Panel data models allow more adequate analyses

Models for Panel Data
- Model for y, based on panel data from N cross-sectional units and T periods
    y_it = β_0 + x_it'β_1 + ε_it
    i = 1, ..., N: sample unit
    t = 1, ..., T: time period of sample
    x_it and β_1: K-vectors
- β_0 and β_1: represent intercept and K regression coefficients; are assumed to be identical for
  all units and all time periods
- ε_it: represents unobserved factors that may affect y_it
  - Assumption that ε_it are uncorrelated over time not realistic
  - Standard errors of OLS estimates misleading, OLS estimation not efficient

Fixed Effects Model
- The general model
    y_it = β_0 + x_it'β_1 + ε_it
- Specification for the error terms: two components
    ε_it = α_i + u_it
  - α_i: unit-specific, time-constant factors, also called unobserved (individual) heterogeneity;
    may be correlated with x_it
  - u_it ~ IID(0, σ_u²); uncorrelated over time; represents unobserved factors that change over
    time, also called idiosyncratic or time-varying error
  - ε_it: also called composite error
- Fixed effects (FE) model
    y_it = Σ_j α_j d_ij + x_it'β_1 + u_it
    d_ij: dummy variable for unit j: d_ij = 1 if i = j, otherwise d_ij = 0
- Overall intercept omitted; unit-specific intercepts α_i

Random Effects Model
- Starting point is again the model
    y_it = β_0 + x_it'β_1 + ε_it
  with composite error ε_it = α_i + u_it
- Specification for the error terms:
  - u_it ~ IID(0, σ_u²); uncorrelated over time
  - α_i ~ IID(0, σ_a²); represents all unit-specific, time-constant factors; correlation of error
    terms over time only via the α_i
  - α_i and u_it are assumed to be mutually independent and independent of x_js for all j and s
- Random effects (RE) model
    y_it = β_0 + x_it'β_1 + α_i + u_it
- Unbiased and consistent (N → ∞) estimation of β_0 and β_1
- Efficient estimation of β_0 and β_1: takes error covariance structure into account; GLS
  estimation

Fixed Effects (FE) Model
- Model for y, based on panel data for T periods
    y_it = α_i + x_it'β + u_it, u_it ~ IID(0, σ_u²)
    i = 1, ..., N: sample unit
    t = 1, ..., T: time period of sample
- α_i: fixed parameter, represents all unit-specific, time-constant factors, unobserved
  (individual) heterogeneity
- x_it: all K components are assumed to be independent of all u_it; may be correlated with α_i
- Model with dummies d_ij = 1 for i = j and 0 otherwise:
    y_it = Σ_j α_j d_ij + x_it'β + u_it
- Number of coefficients: N + K
- Main interest: estimators for β

FE Model Parameters: Estimation
- FE model with dummies d_ij = 1 for i = j and 0 otherwise:
    y_it = Σ_j α_j d_ij + x_it'β + u_it
  Number of coefficients: N + K
- Various estimation procedures are available
  - Least squares dummy variable (LSDV) estimator
  - Within or fixed effects estimator
  - First-difference estimator
- A special case
  - Differences-in-differences (DD or DID or D-in-D) estimator

Least Squares Dummy Variable (LSDV) Estimator
- Estimation procedure for the N + K parameters β and α_i of the FE model
    y_it = Σ_j α_j d_ij + x_it'β + u_it
- OLS estimation
  - NT observations for estimating N + K coefficients
  - Numerically costly, not attractive
  - Estimates for α_i usually not of interest
- Fixed effects and first-difference estimators are more attractive

Fixed Effects Estimation
- Within transformation: transforms y_it into time-demeaned ÿ_it by subtracting the average
  ȳ_i = (Σ_t y_it)/T:
    ÿ_it = y_it - ȳ_i
  analogously ẍ_it and ü_it, for i = 1,...,N, t = 1, ..., T
- Model in time-demeaned variables
    ÿ_it = ẍ_it'β + ü_it
- Pooled OLS estimator b_FE for β
- b_FE: “fixed effects estimator”, also called “within estimator”
- Uses time variation in y and x within each cross-sectional observation; explains deviations of
  y_it from ȳ_i (not of ȳ_i from ȳ_j!)
- GRETL: Model > Panel > Fixed or random effects ...

The Fixed Effects Estimator
- FE model
    y_it = α_i + x_it'β + u_it, u_it ~ IID(0, σ_u²)
  x_it are assumed to be independent of all u_it but may be correlated with α_i
- Estimation of β from the model in time-demeaned variables
    ÿ_it = ẍ_it'β + ü_it
  gives
    b_FE = (Σ_i Σ_t ẍ_it ẍ_it')⁻¹ Σ_i Σ_t ẍ_it ÿ_it
- Time-demeaning differences away time-constant factors α_i
- Under the assumption that x_it are independent of all u_it: b_FE is unbiased
- b_FE coincides with LSDV estimator
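
A minimal Python sketch of the within estimator for a single regressor (K = 1), on made-up data in which α_i is correlated with the level of x_it. The data and function names are assumptions for illustration; the pooled OLS slope is computed only for comparison.

```python
def within_estimator(y, x):
    """Fixed effects (within) estimate of beta for a single regressor.

    y and x are lists of per-unit lists: y[i][t], x[i][t].
    """
    num = 0.0
    den = 0.0
    for y_i, x_i in zip(y, x):
        y_bar = sum(y_i) / len(y_i)
        x_bar = sum(x_i) / len(x_i)
        for y_it, x_it in zip(y_i, x_i):
            num += (x_it - x_bar) * (y_it - y_bar)  # demeaned x times demeaned y
            den += (x_it - x_bar) ** 2              # demeaned x squared
    return num / den


def pooled_slope(y, x):
    """Pooled OLS slope (with intercept), ignoring the panel structure."""
    ys = [v for y_i in y for v in y_i]
    xs = [v for x_i in x for v in x_i]
    y_bar = sum(ys) / len(ys)
    x_bar = sum(xs) / len(xs)
    num = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    den = sum((a - x_bar) ** 2 for a in xs)
    return num / den


# Made-up data: y_it = alpha_i + 2 * x_it, the unit with low x has the high alpha.
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
alpha = [10.0, 0.0]
y = [[a + 2.0 * v for v in x_i] for a, x_i in zip(alpha, x)]

b_fe = within_estimator(y, x)
b_ols = pooled_slope(y, x)
print(b_fe, b_ols)
```

In this constructed case the within estimator recovers the slope 2, while the pooled slope even has the wrong sign, because the time-constant α_i is removed only by the demeaning.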

Wage Equations
- Wage equations, dependent variable: wage (log of hourly wage)

                        Pooled   FE       FE       FE       FE
                        80+87    80+87    80+87    80+87    80...87
  Interc.     coeff     1.289    1.285    1.432    1.307    1.237
              s.e.      0.031    0.031    0.036    0.045    0.016
  exper       coeff     0.052    0.053    -0.013   0.029    0.063
              s.e.      0.004    0.004    0.009    0.013    0.002
  d87         coeff                       0.564    1.107
              s.e.                        0.073    0.141
  d87*exper   coeff                                -0.083
              s.e.                                 0.019
  adj R² (%)            12.8     13.7     18.1     19.5     55.6

Properties of Fixed Effects Estimator
    b_FE = (Σ_i Σ_t ẍ_it ẍ_it')⁻¹ Σ_i Σ_t ẍ_it ÿ_it
- Unbiased if all x_it are independent of all u_it
- Normally distributed if normality of u_it is assumed
- Consistent (for N → ∞) if x_it are strictly exogenous, i.e., E{x_it u_is} = 0 for all s, t
- Asymptotically normally distributed
- Covariance matrix
    V{b_FE} = σ_u² (Σ_i Σ_t ẍ_it ẍ_it')⁻¹
- Estimated covariance matrix: substitution of σ_u² by
    s_u² = (Σ_i Σ_t ũ_it ũ_it)/[N(T-1)]
  with the residuals ũ_it = ÿ_it - ẍ_it'b_FE
- Attention! The standard OLS estimate of the covariance matrix underestimates the true values

Estimator for αi
- Time-constant factors α_i, i = 1, ..., N
- Estimates based on the fixed effects estimator b_FE
    a_i = ȳ_i - x̄_i'b_FE
  with averages over time ȳ_i and x̄_i for the i-th unit
- Consistent (for T → ∞) if x_it are strictly exogenous
- Potentially interesting aspects of estimates a_i
  - Distribution of the a_i, i = 1, ..., N
  - Value of a_i for unit i of special interest

First-Difference Estimator
Elimination of time-constant factors α_i by differencing
    ∆y_it = y_it – y_i,t-1 = ∆x_it'β + ∆u_it
    ∆x_it and ∆u_it analogously defined as ∆y_it = y_it – y_i,t-1
First-difference estimator: OLS estimation
    b_FD = (Σ_i Σ_t ∆x_it ∆x_it')⁻¹ Σ_i Σ_t ∆x_it ∆y_it
Properties
- Consistent (for N → ∞) under slightly weaker conditions than b_FE
- Slightly less efficient than b_FE due to serial correlations of the ∆u_it
- For T = 2, b_FD and b_FE coincide
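
The coincidence of the two estimators for T = 2 can be checked numerically. A minimal Python sketch for a single regressor on made-up data; the function names and numbers are assumptions for illustration.

```python
def first_difference_estimator(y, x):
    """OLS of the differenced y on the differenced x (no intercept), K = 1."""
    num = 0.0
    den = 0.0
    for y_i, x_i in zip(y, x):
        for t in range(1, len(y_i)):
            dy = y_i[t] - y_i[t - 1]
            dx = x_i[t] - x_i[t - 1]
            num += dx * dy
            den += dx * dx
    return num / den


def within_estimator(y, x):
    """Fixed effects (within) estimate for a single regressor."""
    num = 0.0
    den = 0.0
    for y_i, x_i in zip(y, x):
        y_bar = sum(y_i) / len(y_i)
        x_bar = sum(x_i) / len(x_i)
        for y_it, x_it in zip(y_i, x_i):
            num += (x_it - x_bar) * (y_it - y_bar)
            den += (x_it - x_bar) ** 2
    return num / den


# Made-up panel with N = 3 units and T = 2 periods.
x = [[1.0, 3.0], [2.0, 2.5], [0.0, 4.0]]
y = [[2.1, 5.8], [7.0, 8.3], [-1.0, 6.5]]

b_fd = first_difference_estimator(y, x)
b_fe = within_estimator(y, x)
print(b_fd, b_fe)
```

With two periods the demeaned values are plus and minus half the difference, so numerator and denominator of b_FE are both half of those of b_FD and the ratio is the same.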

Differences-in-Differences Estimator
Natural experiment or quasi-experiment:
- Exogenous event, e.g., a new law, changes in operating conditions
- Treatment group, control group
- Assignment to groups not at random (unlike in a true experiment)
- Data: before event, after event
Model for response y_it
    y_it = δ r_it + μ_t + α_i + u_it, i = 1,...,N, t = 1 (before), 2 (after event)
- Dummy r_it = 1 if the i-th unit has received the treatment in period t (treatment group, t = 2),
  r_it = 0 otherwise
- δ: treatment effect
- Fixed effects model (for differencing away time-constant factors):
    ∆y_it = y_i2 – y_i1 = δ ∆r_it + μ_0 + ∆u_it
  with μ_0 = μ_2 – μ_1

Wage Differences 1980 - 1987
- Effect of ethnicity
  - wage (log of hourly wage): increases from 1.419 (1980) to 1.892 (1987)
  - i.e., increase of hourly wage from USD 4.13 (1980) to 6.63 (1987)
  - Does the wage increase depend on ethnicity?
- Dummy black_it = 1 if the i-th person is African-American, black_it = 0 otherwise
- Model for wage:
    wage_it = μ_t + α_i + u_it, i = 1,...,N, t = 1980, 1987
- α_i: time-constant factors, e.g., schooling, rural, industry, etc.
- Model for differences with μ_0 = μ_1987 – μ_1980
    ∆wage_it = μ_0 + δ black_it + ∆u_it
- At least the intercept changes from 1980 to 1987

Wage Differences, cont’d
- Increase of wage (log of hourly wage)
    ∆wage_it = μ_0 + δ black_it + ∆u_it
  OLS estimation gives (N = 545, 63 African-Americans)

               μ_0      δ        adj R²
  Estimate     0.491    -0.154   0.47
  Std.err.     0.027    0.081

- Differences in wage and in hourly wages

                     black = 0   black = 1   all
                     μ_0         μ_0 + δ
  wage (average)     0.491       0.337       0.473
  hourly wages       1.634       1.401       1.605
  Increase (%)       63.4        40.1        60.5
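
The rows “hourly wages” and “Increase (%)” follow from the log differences by exponentiation. A short Python sketch using the estimates μ_0 = 0.491 and δ = –0.154 from the table; the function name is illustrative.

```python
import math

mu0, delta = 0.491, -0.154  # OLS estimates of the difference equation


def pct_increase(log_change):
    """Percentage change of the hourly wage implied by a change in log wage."""
    return (math.exp(log_change) - 1.0) * 100.0


non_black = pct_increase(mu0)       # black = 0: mu_0
black = pct_increase(mu0 + delta)   # black = 1: mu_0 + delta
print(round(non_black, 1), round(black, 1))
```

Up to rounding this reproduces the reported increases of about 63.4% and 40.1%.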

Estimator of Treatment Effect
Effect of treatment (event) by comparing units
- with and without treatment
- before and after treatment
Model for panel data y_it
    y_it = δ r_it + μ_t + α_i + u_it, i = 1,...,N, t = 1 (before), 2 (after event)
Differences-in-differences (DD or DID or D-in-D) estimator of treatment effect δ
    d_DD = ∆ȳ_treated - ∆ȳ_untreated
    ∆ȳ_treated: average difference y_i2 – y_i1 of treatment group units
    ∆ȳ_untreated: average difference y_i2 – y_i1 of control group units
- Treatment effect δ measured as difference between changes of y with and without treatment
- d_DD consistent if E{∆r_it ∆u_it} = 0
- Allows correlation between time-constant factors α_i and r_it
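
A minimal Python sketch of the differences-in-differences estimator on made-up data with a common time effect of 1.0 and a treatment effect of 0.5. The data and the function name are assumptions for illustration.

```python
def did_estimator(y_before, y_after, treated):
    """Differences-in-differences estimate of the treatment effect.

    y_before[i], y_after[i]: outcome of unit i before and after the event.
    treated[i]: True if unit i belongs to the treatment group.
    """
    diffs_treated = [a - b for b, a, r in zip(y_before, y_after, treated) if r]
    diffs_control = [a - b for b, a, r in zip(y_before, y_after, treated) if not r]
    mean_treated = sum(diffs_treated) / len(diffs_treated)
    mean_control = sum(diffs_control) / len(diffs_control)
    return mean_treated - mean_control


# Unit-specific levels differ strongly but cancel in the differences.
y_before = [3.0, 5.0, 2.0, 8.0]
y_after = [4.5, 6.5, 3.0, 9.0]
treated = [True, True, False, False]
print(did_estimator(y_before, y_after, treated))
```

The levels of the units (the α_i) never enter the estimate, which is why correlation between α_i and group membership does no harm.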

Random Effects Model
Model:
    y_it = β_0 + x_it'β + α_i + u_it, u_it ~ IID(0, σ_u²)
- Time-constant factors α_i: stochastic variables with identical distribution for all units
    α_i ~ IID(0, σ_a²)
- Attention! More information about α_i than in the fixed effects model
- α_i + u_it: error term with two components
  - Unit-specific component α_i, time-constant
  - Remainder u_it, assumed to be uncorrelated over time
- α_i, u_it: mutually independent, independent of x_js for all j and s
- OLS estimators for β_0 and β are unbiased, consistent, not efficient (see next slide)

GLS Estimator
α_i i_T + u_i: T-vector of error terms for i-th unit, T-vector i_T = (1, ..., 1)'
Ω = Var{α_i i_T + u_i}: covariance matrix of α_i i_T + u_i
    Ω = σ_a² i_T i_T' + σ_u² I_T
Inverted covariance matrix
    Ω⁻¹ = σ_u⁻² {[I_T – (i_T i_T')/T] + ψ (i_T i_T')/T}
with ψ = σ_u²/(σ_u² + Tσ_a²)
(i_T i_T')/T: transforms into averages
I_T – (i_T i_T')/T: transforms into deviations from average
GLS estimator
    b_GLS = [Σ_i Σ_t ẍ_it ẍ_it' + ψT Σ_i (x̄_i – x̄)(x̄_i – x̄)']⁻¹
            [Σ_i Σ_t ẍ_it ÿ_it + ψT Σ_i (x̄_i – x̄)(ȳ_i – ȳ)]
with the average ȳ over all i and t, analogous x̄
- ψ = 0: b_GLS coincides with b_FE; b_GLS and b_FE equivalent for large T
- ψ = 1: b_GLS coincides with the OLS estimators for β_0 and β
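
The form of the inverted covariance matrix can be verified in a few lines. The sketch below introduces the shorthand P and M (not used on the slides) for the two projection matrices:

```latex
\begin{align*}
P &= \tfrac{1}{T}\, i_T i_T', \qquad M = I_T - P, \qquad
     P^2 = P,\quad M^2 = M,\quad PM = 0, \\
\Omega &= \sigma_a^2\, i_T i_T' + \sigma_u^2 I_T
        = \sigma_u^2\, M + (\sigma_u^2 + T\sigma_a^2)\, P, \\
\Omega^{-1} &= \sigma_u^{-2}\, M + (\sigma_u^2 + T\sigma_a^2)^{-1} P
        = \sigma_u^{-2}\,\bigl[\, M + \psi P \,\bigr],
        \qquad \psi = \frac{\sigma_u^2}{\sigma_u^2 + T\sigma_a^2}.
\end{align*}
```

Multiplying the second and third lines gives M + P = I_T, since the cross terms vanish with PM = 0.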

Between Estimator
Model for individual means ȳ_i and x̄_i:
    ȳ_i = β_0 + x̄_i'β + α_i + ū_i, i = 1, ..., N
OLS estimator
    b_B = [Σ_i (x̄_i – x̄)(x̄_i – x̄)']⁻¹ Σ_i (x̄_i – x̄)(ȳ_i – ȳ)
is called the between estimator
- Consistent if x_it strictly exogenous, uncorrelated with α_i
- GLS estimator can be written as
    b_GLS = ∆b_B + (I_K - ∆)b_FE
  ∆: weighting matrix, proportional to the inverse of Var{b_B}
  - Matrix-weighted average of between estimator b_B and within estimator b_FE
  - The more accurate b_B, the more weight b_B has in b_GLS
  - b_GLS: optimal combination of b_B and b_FE, more efficient than b_B and b_FE

GLS Estimator: Properties
    b_GLS = [Σ_i Σ_t ẍ_it ẍ_it' + ψT Σ_i (x̄_i – x̄)(x̄_i – x̄)']⁻¹
            [Σ_i Σ_t ẍ_it ÿ_it + ψT Σ_i (x̄_i – x̄)(ȳ_i – ȳ)]
- Unbiased, if x_it are independent of all α_i and u_it
- Consistent for N or T or both tending to infinity if
  - E{ẍ_it u_it} = 0
  - E{x̄_i u_it} = 0, E{x̄_i α_i} = 0
  - The conditions on x̄_i are required also for consistency of b_B
- More efficient than the between estimator b_B and the within estimator b_FE; also more efficient
  than the OLS estimator

Random Effects Estimator
EGLS or Balestra-Nerlove estimator: calculation of b_GLS from the model
    y_it – ϑȳ_i = β_0(1 – ϑ) + (x_it – ϑx̄_i)'β + v_it
with ϑ = 1 – ψ^(1/2), v_it ~ IID(0, σ_v²)
Quasi-demeaned y_it – ϑȳ_i and x_it – ϑx̄_i
Two-step estimator:
1. Step 1: Transformation parameter ψ calculated from (method by Swamy & Arora)
   - within estimation: s_u² = (Σ_i Σ_t ũ_it ũ_it)/[N(T-1)]
   - between estimation: s_B² = (1/N) Σ_i (ȳ_i – b_0B – x̄_i'b_B)² = s_a² + (1/T) s_u²
   - s_a² = s_B² – (1/T) s_u²
2. Step 2:
   - Calculation of 1 – [s_u²/(s_u² + T s_a²)]^(1/2) for parameter ϑ
   - Transformation of y_it and x_it
   - OLS estimation gives the random effects estimator b_RE for β
Random Effects Estimator: Properties
b_RE: EGLS estimator of β from
    y_it – ϑȳ_i = β_0(1 – ϑ) + (x_it – ϑx̄_i)'β + v_it
with ϑ = 1 – ψ^(1/2), ψ = σ_u²/(σ_u² + Tσ_a²)
- Asymptotically normally distributed under weak conditions
- Covariance matrix
    Var{b_RE} = σ_u² [Σ_i Σ_t ẍ_it ẍ_it' + ψT Σ_i (x̄_i – x̄)(x̄_i – x̄)']⁻¹
- More efficient than the within estimator b_FE (if ψ > 0)

Wage Equations, 1980-1987
- Dependent variable: wage (log of hourly wage)

              Between     Fixed Effects   Random Effects   Pooled OLS
  Intercept   0.511       1.053           -0.079           0.049
  school      0.089***    --              0.100***         0.095***
  exper       -0.032      0.118***        0.111***         0.087***
  exper2      0.004       -0.004***       -0.004***        -0.003***
  union       0.262***    0.082***        0.109***         0.179***
  mar         0.184***    0.045**         0.064***         0.126***
  black       -0.141***   --              -0.149***        -0.150***
  rural       0.188***    0.049*          -0.026           -0.138***

Summary of Estimators
- Between estimator
- Fixed effects (within) estimator
- Combined estimators
  - OLS estimator
  - Random effects (EGLS) estimator
- First-difference estimator

  Estimator                   Consistent, if
  Between            b_B      x_it strictly exog., x_it and α_i uncorr.
  Fixed effects      b_FE     x_it strictly exog.
  OLS                b        x_it and α_i uncorr., x_it and u_it contemp. uncorr.
  Random effects     b_RE     conditions for b_B and b_FE are met
  First-difference   b_FD     E{∆x_it ∆u_it} = 0

Fixed Effects or Random Effects?
Random effects model
    E{y_it | x_it} = x_it'β
- Large values N; of interest: population characteristics (β), not characteristics of individual
  units (α_i)
- More efficient estimation of β, given adequate specification of the time-constant model
  characteristics
Fixed effects model
    E{y_it | x_it} = x_it'β + α_i
- Of interest: besides population characteristics (β), also characteristics of individual units
  (α_i), e.g., of countries or companies; rather small values N
- Large values of N, if x_it and α_i correlated: consistent estimator b_FE in case of correlated
  x_it and α_i

Diagnostic Tools
- Test of common intercept of all units
  - Applied to pooled OLS estimation: Rejection indicates preference for fixed or random effects
    model
  - Applied to fixed effects estimation: Non-rejection indicates preference for pooled OLS
    estimation
- Hausman test (of correlation between x_it and α_i):
  - Null hypothesis that GLS estimates are consistent
  - Rejection indicates preference for fixed effects model
- Test of non-constant variance of the error terms, Breusch-Pagan test
  - Rejection indicates preference for fixed or random effects model
  - Non-rejection indicates preference for pooled OLS estimation

Hausman Test
Tests of correlation between x_it and α_i
    H0: x_it and α_i are uncorrelated
Test statistic:
    ξ_H = (b_FE - b_RE)' [Ṽ{b_FE} - Ṽ{b_RE}]⁻¹ (b_FE - b_RE)
with estimated covariance matrices Ṽ{b_FE} and Ṽ{b_RE}
- b_RE: consistent if x_it and α_i are uncorrelated
- b_FE: consistent also if x_it and α_i are correlated
Under H0: plim(b_FE - b_RE) = 0
- ξ_H asymptotically chi-squared distributed with K d.f.
- K: dimension of x_it and β
Hausman test may indicate also other types of misspecification
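
For a single coefficient (K = 1) the Hausman statistic reduces to a scalar ratio. A minimal Python sketch with hypothetical estimates and variances (none of the numbers are taken from the lecture's tables); the function name is illustrative.

```python
def hausman_statistic(b_fe, b_re, var_fe, var_re):
    """Hausman statistic for a single coefficient (K = 1):
    squared difference of the estimates over the difference of their variances."""
    diff_var = var_fe - var_re
    if diff_var <= 0:
        # Can happen in finite samples; the statistic is then not usable.
        raise ValueError("estimated variance difference is not positive")
    return (b_fe - b_re) ** 2 / diff_var


CHI2_1_CRIT_5PCT = 3.841  # 5% critical value, chi-squared with 1 d.f.

# Hypothetical values, for illustration only.
xi = hausman_statistic(b_fe=0.50, b_re=0.44, var_fe=0.0009, var_re=0.0005)
print(xi, xi > CHI2_1_CRIT_5PCT)
```

A value above the critical limit rejects H0 and points to the fixed effects model; for K > 1 the same quadratic form requires a matrix inverse.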

Robust Inference
Consequences of heteroskedasticity and autocorrelation of the error term:
- Standard errors and related tests are incorrect
- Inefficiency of estimators
Robust covariance matrix for estimator b of β from y_it = x_it'β + ε_it
    b = (Σ_i Σ_t x_it x_it')⁻¹ Σ_i Σ_t x_it y_it
- Adjustment of covariance matrix similar to Newey-West: assuming uncorrelated error terms for
  different units (E{ε_it ε_js} = 0 for all i ≠ j)
    V{b} = (Σ_i Σ_t x_it x_it')⁻¹ Σ_i Σ_t Σ_s e_it e_is x_it x_is' (Σ_i Σ_t x_it x_it')⁻¹
  e_it: OLS residuals
- Allows for heteroskedasticity and autocorrelation within units
- Called panel-robust estimate of the covariance matrix
Analogous variants of the Newey-West estimator for robust covariance matrices of random effects and
fixed effects estimators
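
A minimal Python sketch of the panel-robust variance for the scalar case (one regressor, no intercept), following the triple-sum formula literally. Data and function names are made up for illustration.

```python
def ols_slope(x, y):
    """Pooled OLS slope for y_it = b * x_it + e_it (no intercept)."""
    sxy = sum(a * b for x_i, y_i in zip(x, y) for a, b in zip(x_i, y_i))
    sxx = sum(a * a for x_i in x for a in x_i)
    return sxy / sxx


def panel_robust_variance(x, y):
    """Panel-robust variance of the pooled OLS slope (scalar regressor)."""
    b = ols_slope(x, y)
    sxx = sum(a * a for x_i in x for a in x_i)
    meat = 0.0
    for x_i, y_i in zip(x, y):
        e_i = [yv - b * xv for xv, yv in zip(x_i, y_i)]  # OLS residuals of unit i
        # Double sum over periods t and s within the same unit only.
        for t in range(len(x_i)):
            for s in range(len(x_i)):
                meat += e_i[t] * e_i[s] * x_i[t] * x_i[s]
    return meat / (sxx * sxx)


# Made-up panel: N = 2 units, T = 2 periods.
x = [[1.0, 2.0], [1.0, -1.0]]
y = [[2.0, 2.0], [1.0, 0.0]]
print(ols_slope(x, y), panel_robust_variance(x, y))
```

Products of residuals from different units never enter the middle term, which is the assumption E{ε_it ε_js} = 0 for i ≠ j at work.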

Testing for Autocorrelation and Heteroskedasticity
Tests for heteroskedasticity and autocorrelation in random effects model error terms
- Computationally cumbersome
Tests based on fixed effects model residuals
- Easier case
- Applicable for testing in both fixed and random effects case

Test for Autocorrelation
Durbin-Watson test for autocorrelation in the fixed effects model
- Error term u_it = ρu_i,t-1 + v_it
  - Same autocorrelation coefficient ρ for all units
  - v_it iid across time and units
- Test of H0: ρ = 0 against ρ > 0
- Adaptation of Durbin-Watson statistic
- Tables with critical limits d_U and d_L for K, T, and N; e.g., Verbeek’s Table 10.1
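
The formula of the adapted statistic is not recoverable from the slide text. The Python sketch below assumes the panel Durbin-Watson statistic of Bhargava, Franzini and Narendranathan, i.e., the sum over units of squared first differences of the fixed effects residuals divided by the sum of squared residuals; the residuals used are hypothetical.

```python
def panel_durbin_watson(resid):
    """Panel Durbin-Watson statistic (assumed form).

    resid[i][t]: fixed effects residual of unit i in period t.
    Differences are taken within units only, never across units.
    """
    num = sum((r_i[t] - r_i[t - 1]) ** 2
              for r_i in resid
              for t in range(1, len(r_i)))
    den = sum(v * v for r_i in resid for v in r_i)
    return num / den


# Hypothetical residuals for two units and four periods.
resid = [[0.5, 0.4, -0.3, -0.6], [-0.2, -0.1, 0.1, 0.2]]
print(panel_durbin_watson(resid))
```

As in the time series case, values well below 2 point to positive autocorrelation; the decision uses the tabulated limits d_L and d_U.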

Test for Heteroskedasticity
Breusch-Pagan test for heteroskedasticity of fixed effects model residuals
- V{u_it} = σ²h(z_it'γ); unknown function h(.) with h(0) = 1, J-vector z
- H0: γ = 0, homoskedastic u_it
- Auxiliary regression of squared residuals on intercept and regressors z
- Test statistic: N(T-1) times R² of auxiliary regression
- Chi-squared distribution with J d.f. under H0

Wage Equations, 1980-1987
- Fixed effects estimation, standard and HAC standard errors

              Coeff.   s.e.     HAC s.e.   ∆
  Intercept   1.053    0.0276   0.0384     1.39
  exper       0.118    0.0084   0.0108     1.29
  exper2      -0.004   0.0006   0.0007     1.17
  union       0.082    0.0193   0.0227     1.18
  mar         0.045    0.0183   0.0210     1.15
  rural       0.049    0.0290   0.0391     1.35

  ∆: ratio of HAC s.e. to s.e.

Goodness-of-Fit
Goodness-of-fit measures for panel data models: different from OLS estimated regression models
- Focus may be on within or between variation in the data
- The usual R² measure relates to OLS-estimated models
Definition of goodness-of-fit measures: squared correlation coefficients between actual and fitted
values
- R²_within: squared correlation between within transformed actual and fitted y_it; maximized by
  within estimator
- R²_between: based upon individual averages of actual and fitted y_it; maximized by between
  estimator
- R²_overall: squared correlation between actual and fitted y_it; maximized by OLS
Corresponds to the decomposition
    [1/(TN)] Σ_i Σ_t (y_it – ȳ)² = [1/(TN)] Σ_i Σ_t (y_it – ȳ_i)² + [1/N] Σ_i (ȳ_i – ȳ)²

Goodness-of-Fit, cont’d
Fixed effects estimator b_FE
- Explains the within variation
- Maximizes R²_within
    R²_within(b_FE) = corr²{ŷ_it^FE – ŷ_i^FE, y_it – ȳ_i}
Between estimator b_B
- Explains the between variation
- Maximizes R²_between
    R²_between(b_B) = corr²{ŷ_i^B, ȳ_i}
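
The split of the total variation into a within and a between part, on which these measures rest, can be checked numerically. A minimal Python sketch for a balanced panel with made-up values; the function name is illustrative.

```python
def variance_decomposition(y):
    """Return (total, within, between) variation of y[i][t], balanced panel."""
    N = len(y)
    T = len(y[0])
    grand_mean = sum(v for y_i in y for v in y_i) / (N * T)
    unit_means = [sum(y_i) / T for y_i in y]
    total = sum((v - grand_mean) ** 2 for y_i in y for v in y_i) / (N * T)
    within = sum((v - m) ** 2
                 for y_i, m in zip(y, unit_means)
                 for v in y_i) / (N * T)
    between = sum((m - grand_mean) ** 2 for m in unit_means) / N
    return total, within, between


# Made-up panel: N = 2 units, T = 2 periods.
y = [[1.0, 3.0], [5.0, 7.0]]
total, within, between = variance_decomposition(y)
print(total, within, between)
```

In this example most of the variation is between units, the situation in which a within estimator can only explain a small share of the total.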

Wage Equations, 1980-1987
- Dependent variable: wage (log of hourly wage)

              Between     F.E.        R.E.        OLS
  Intercept   0.511       1.053       -0.079      0.049
  school      0.089***    --          0.100***    0.095***
  exper       -0.032      0.118***    0.111***    0.087***
  exper2      0.004       -0.004***   -0.004***   -0.003***
  union       0.262***    0.082***    0.109***    0.179***
  mar         0.184***    0.045**     0.064***    0.126***
  black       -0.141***   --          -0.149***   -0.150***
  rural       0.188***    0.049*      -0.026      -0.138***
  R² (%)      16.07       5.66        18.42       19.70

Panel Data and GRETL
Estimation of panel models
Pooled OLS
- Model > Ordinary Least Squares …
- Special diagnostics on the output window: Tests > Panel diagnostics
Fixed and random effects models
- Model > Panel > Fixed or random effects…
- Provide diagnostic tests
  - Fixed effects model: Test for common intercept of all units
  - Random effects model: Breusch-Pagan test, Hausman test
Further estimation procedures
- Between estimator
- Weighted least squares
- Instrumental variable panel procedure

Your Homework
1. Use Verbeek’s data set MALES which contains panel data for 545 full-time working males over the
   period 1980-1987. Estimate a wage equation which explains the individual log wages by the
   variables years of schooling, years of experience and its square, and dummy variables for union
   membership, being married, black, Hispanic, and working in the public sector. Use (i) pooled
   OLS, (ii) the between and (iii) the within estimator, and (iv) the random effects estimator.

A Model for Two-period Panel Data
- Model for y, based on panel data for two periods:
    y_it = β_0 + δ_1 d_t + β_1 x_it + ε_it
         = β_0 + δ_1 d_t + β_1 x_it + α_i + u_it
    i = 1,..., N: sample units of the panel
    t = 1, 2: time period of sample
    d_t: dummy for period t = 2
- ε_it = α_i + u_it: composite error
- α_i: represents all unit-specific, time-constant factors; also called unobserved (individual)
  heterogeneity
- u_it: represents unobserved factors that change over time, also called idiosyncratic or
  time-varying error
  - u_it (and ε_it) may be correlated over time for the same unit
- Model is called unobserved or fixed effects model

Estimation of the Parameters of Interest
- Parameter of interest is β_1
- Estimation concepts:
  1. Pooled OLS estimation of β_1 from y_it = β_0 + δ_1 d_t + β_1 x_it + ε_it based on the pooled
     dataset, ε_it = α_i + u_it
     - Inconsistent, if x_it and α_i are correlated
     - Incorrect standard errors due to correlation of u_it (and ε_it) over time; typically too
       small standard errors
  2. First-difference estimator: OLS estimation of β_1 from the first-difference equation
       ∆y_i = y_i2 – y_i1 = δ_1 + β_1 ∆x_i + ∆u_i
     - α_i are differenced away; correlation of x_it and α_i not relevant
     - Correlation of u_it (and ε_it) over time not relevant
  3. Fixed effects estimation (see below)

Wage Equations
- Data set “males”, cross-sectional samples for 1980 and 1987
- (1): OLS estimation in pooled sample
- (2): OLS estimation in pooled sample with interaction dummy

                      (1)      (2)
  interc.   coeff     1.045    1.241
            s.e.      0.048    0.056
  exper     coeff     0.160    0.073
            s.e.      0.017    0.021
  exper2    coeff     -0.008   -0.006
            s.e.      0.001    0.001
  d87       coeff              0.479
            s.e.               0.076
  R² (%)              16.2     19.0

Pooled OLS Estimation
- Model for y, based on panel data from T periods:
    y_it = x_it'β + ε_it
- Pooled OLS estimation of β
  - Assumes equal unit means α_i
  - Consistent if x_it and ε_it (at least contemporaneously) uncorrelated
- Diagnostics of interest:
  - Test whether panel data structure to be taken into account
  - Test whether fixed or random effects model preferable
- In GRETL: output window of OLS estimation applied to panel data structure offers a special test:
  Test > Panel diagnostics
  - Tests H0: pooled model preferable to fixed effects and random effects model
  - Hausman test (H0: random effects model preferable to fixed effects model)