Econometrics
Werner G. Müller
Department of Applied Statistics (Institut für Angewandte Statistik), Johannes-Kepler-Universität Linz
WS 2009/10

What is Econometrics?
[Book covers: Marno Verbeek, A Guide to Modern Econometrics, third edition; Jeffrey M. Wooldridge, Introductory Econometrics]
"[...], econometrics is the interaction of economic theory, observed data and statistical methods. It is the interaction of these three that makes econometrics interesting, challenging, and perhaps, difficult." Verbeek (2000, 2008)
"Econometrics is based upon the development of statistical methods for estimating economic relationships, testing economic theories, and evaluating and implementing government and business policy." Wooldridge (2006)

13.10.2009 phd-course "Econometrics", © W.G. Müller, JKU Linz

Criticism
"That there is anyone I would trust with it at the present stage, or that this brand of statistical alchemy is ripe to become a branch of science, I am not yet persuaded. But Newton, Boyle, and Locke all played with alchemy. So let him [Tinbergen] continue." Keynes (1939)
"Econometricians are essentially statisticians who imagine that they can capture the complex world of the economy in numbers. Compared with that, even communism was subtle." Gansterer (2003)

Slides are based on
Manuscript for the planned textbook "Ökonometrie Praxis!" by W.G. Müller (JKU) and T. Url (Wifo), in German.
Excerpts and additional material are available on the IFAS homepage: www.ifas.jku.at.
Complementary GRETL course by Daniel Němec (Masaryk University, Brno).

Used Software
The examples in the manuscript were produced in EViews 6. Homework should be done in GRETL.
GRETL is free. Tip: an inexpensive student version of EViews 4.1 is fully sufficient for replicating most of the examples in this course.

Chapter 1 - Contents
1 Inflation (Cagan model and data - simple linear regression, descriptively)
1.1 Hyperinflation
1.2 Data entry (working with EViews and GRETL)
1.3 Graphical display of data (descriptive statistics)
1.4 Data transformation
1.5 An estimation technique (least squares principle)
1.6 Spurious correlation (spurious regression)
1.7 Homework exercise (Lui data in the Cagan model)

1. Inflation
"If the governments devalue the currency in order to betray all creditors, you politely call this procedure inflation." George Bernard Shaw (1856-1950)

The Consumer Price Index
A consumer price index (CPI) $P_{Ct}$ summarizes the prices of the $N$ most important consumed goods and services in period $t$, each compared with its price in a base year $B$ of the index:

$$P_{Ct} = w_1 \frac{P_{1t}}{P_{1B}} + w_2 \frac{P_{2t}}{P_{2B}} + \dots + w_N \frac{P_{Nt}}{P_{NB}} = \sum_{i=1}^{N} w_i \frac{P_{it}}{P_{iB}}$$

The importance of the $i$-th component of the basket of goods is reflected by the weight $w_i$. The more of a component is consumed, the higher its weight in the index and the more strongly its price changes affect the value of $P_{Ct}$. In the base year the value of the index is $P_{Ct} = 1$, because $t = B$ holds, the prices are therefore identical, and the weights sum to one. Published price indices are usually multiplied by a factor of 100, so that $P_{Ct} = 100$ holds for the base year.
[Figure: consumer price index series, 2005-2007. Source: Statistik Austria]

The Inflation Rate
The inflation rate $\pi_t \times 100\%$ measures the percent change of a price index between two consecutive observation periods, i.e. $\pi_t = (P_t - P_{t-1})/P_{t-1}$.
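The index and the inflation rate defined above can be sketched in a few lines of Python. This is only an illustration: the three-good basket, its weights and its prices are made-up numbers, and the function names `price_index` and `inflation_rate` are hypothetical helpers, not part of EViews or GRETL.

```python
def price_index(weights, prices_t, prices_base, scale=100.0):
    """Weighted sum of price relatives P_it / P_iB, scaled so the base year equals `scale`."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    return scale * sum(w * p / pb for w, p, pb in zip(weights, prices_t, prices_base))


def inflation_rate(index):
    """Relative change between consecutive periods: (P_t - P_{t-1}) / P_{t-1}."""
    return [(cur - prev) / prev for prev, cur in zip(index, index[1:])]


# Hypothetical basket of three goods (illustrative numbers, not real data)
weights = [0.5, 0.3, 0.2]
base_prices = [2.0, 10.0, 5.0]
prices = [
    base_prices,        # base year B
    [2.1, 10.0, 5.5],   # period B + 1
    [2.2, 11.0, 5.5],   # period B + 2
]

cpi = [price_index(weights, p, base_prices) for p in prices]
rates = inflation_rate(cpi)

for t, value in enumerate(cpi):
    print(f"period {t}: index = {value:.2f}")
for t, r in enumerate(rates, start=1):
    print(f"period {t}: inflation = {100 * r:.2f} %")
```

Because the weights sum to one, the index equals 100 in the base year by construction, which matches the convention described on the slide.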
An alternative definition is $\pi_t = \log P_t - \log P_{t-1}$. This assumes a continuous exponential growth process $P_t = P_{t-T} \exp(\pi_t T)$, which holds only approximately. For $T = 1$ we therefore have $P_t = P_{t-1} \exp(\pi_t)$ and hence $\pi_t = \log(P_t / P_{t-1})$.
[Figure: Inflation rate 2000-2009, change over the previous year in %. Source: Statistik Austria, WIFO; graphic: WKO/Statistik]

The Economic Impact of Inflation
• Unemployment: Phillips curve
• Growth: money demand
See, for example, the fourth edition of the well-known textbook by Blanchard (2006).

Theory of Money Demand
Money market equilibrium: $M^d/P = Y\,L(r + \pi^e) = M/P$.
[Figure: money market and LM curve (real balances, national income). Graphics and Java applet from the thesis of Walter Ebner, WU 1999]

1.1 Hyperinflation
[Equation garbled in the source: an expression in $\Delta M$, $M$, $P$ and $Y\,L(r + \pi^e)$]
[Figure: Austrian Kronen banknote]
Cagan, P., "The Monetary Dynamics of Hyperinflation", in Friedman, M., "Studies in the Quantity Theory of Money", University of Chicago Press, Chicago, 1956, pp. 25-117.

1.2 Data Entry
[Screenshot: EViews main window with the menus File, Edit, Object, View, Proc, Quick, Options, Window, Help; no database or workfile loaded]
[Screenshot: GRETL main window with no data file loaded]

Dataset
Monthly data for Austria, from January 1921 until August 1922.
M/P: 0.1053 0.1010 0.1010
π: 0.0601 0.1407 0.0682
π^e: 0.0382 0.0428 0.0440
0.1064 0.1064 0.1010 0.1235 0.1010 0.0909 0.0719
0.0431 0.0058 0.1391 -0.1075 0.2747 0.2832 0.4960 0.5782 0.8517
0.0440 0.0421 0.0470 0.0394 0.0509 0.0622 0.0834 0.1075 0.3753 0.1204 0.3569 0.1319 0.1283 0.1319 0.0269 0.1266 0.1492 0.1278 0.3435 0.1384 0.2871 0.1458 0.6544 0.1704 0.2038
[Figure: 500,000 Kronen banknote]

1.3 The Graphical Display of Data
The line graph:
[Figure: line graph of M_P and PIE, 1921:01 to 1922:07]

The scatterplot:
Quantification by means of the coefficient of correlation (p. 429):

$$\rho = \frac{\mathrm{Cov}(M/P, \pi^e)}{\sqrt{\mathrm{Var}(M/P)\,\mathrm{Var}(\pi^e)}}, \qquad \rho = -0.974$$

[Figure: scatterplot of M_P against PIE]

1.4 Data Transformation
Data-driven, e.g. the Box-Cox transformation:

$$Y^{(\lambda)} = \begin{cases} \dfrac{Y^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0, \\ \ln(Y) & \text{if } \lambda = 0. \end{cases}$$

Theory-driven, e.g. from a money demand function whose inflation elasticity is proportional to expected inflation: $M/P = \exp(-\alpha \pi^e - \gamma)$.

The non-linear relationship between money demand and expected inflation can be transformed into a linear relationship by taking logarithms on both sides:

$$\log(M/P) = -\alpha \pi^e - \gamma$$

For further analysis the variable M_P must therefore be logarithmically transformed.
[Figure: log M_P against PIE]

Parameter Estimation
Regression towards the mean, Sir Francis Galton (1886)
[Figure: Galton's plate "Rate of Regression in Hereditary Stature": "The Deviates of the Children are to those of their Mid-Parents as 2 to 3. When Mid-Parents are taller than mediocrity, their Children tend to be shorter than they. When Mid-Parents are shorter than mediocrity, their Children tend to be taller than they."]

1.5 An Estimation Method
The most common method of regression analysis is the so-called least squares (LS) approach (pp. 8ff.). Here the sum of squared vertical deviations of the points in the scatterplot from the regression line is minimized, thus in our example

$$\min_{\alpha, \gamma} \sum_t \left( \log(M/P)_t + \alpha \pi_t^e + \gamma \right)^2$$

[Figure: LOGM_P vs. PIE with regression line]
The solutions $\hat\alpha$ and $\hat\gamma$ of this minimization problem are the (ordinary) least squares estimators of the parameters. They determine the location (intercept) and slope of the regression line.

In GRETL
[Screenshot: GRETL main window with data file 01_austria.gdt (variables const, M_P, PI, PIE) and the model specification dialog (dependent variable, robust standard errors, lags; monthly data, full range shown as 1921:10 - 1923:05)]

Concrete Estimates
[Screenshot: EViews Equation Specification dialog: "Dependent variable followed by list of regressors including ARMA and PDL terms, OR an explicit equation like Y=c(1)+c(2)*X."]
The entries under Coefficient denote the estimates $-\hat\alpha = -8.74$ and $-\hat\gamma = -1.87$.

Backtransformation
via Global Fit Options
[Screenshot: EViews Global Fit Options dialog; Y transformations: None, Logarithmic (selected), Inverse, Power, Box-Cox; X transformations: None (selected), Logarithmic, Inverse, Power, Box-Cox, Polynomial; Robustness Iterations; Fitted Y series (optional)]
[Figure: Log M_P vs. PIE with fitted curve]
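The estimation and backtransformation steps above can be illustrated with a small sketch. It assumes noise-free synthetic data generated from $M/P = \exp(-\alpha\pi^e - \gamma)$ with the made-up values $\alpha = 9$ and $\gamma = 2$; these are not the Austrian data and not the slide's estimates. The helper `ols_line` is a hypothetical name for the textbook closed-form solution of simple least squares.

```python
import math


def ols_line(x, y):
    """Intercept and slope that minimise the sum of squared vertical deviations."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Synthetic, noise-free data (hypothetical parameter values, NOT the Austrian data)
alpha, gamma = 9.0, 2.0
pie = [0.04 + 0.01 * t for t in range(20)]           # expected inflation
m_p = [math.exp(-alpha * p - gamma) for p in pie]    # real balances M/P

# Step 1: logarithmic transformation makes the relationship linear
log_m_p = [math.log(v) for v in m_p]

# Step 2: least squares on the transformed data
intercept, slope = ols_line(pie, log_m_p)
alpha_hat, gamma_hat = -slope, -intercept            # slope = -alpha, intercept = -gamma

# Step 3: backtransformation of the fitted line to the original scale
fitted_m_p = [math.exp(intercept + slope * p) for p in pie]

print(f"alpha_hat = {alpha_hat:.4f}, gamma_hat = {gamma_hat:.4f}")
```

With real data the points scatter around the line, so the estimates would not reproduce the generating parameters exactly as they do here.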
Requirements (pp. 15-16)
• The statistical model is linear in the parameters.
• The errors $\varepsilon_t$ are random, independent of each other, and have expectation 0 and constant variance $\sigma^2$.
• The regressors $X$ are strictly exogenous, i.e. independent of past, current and future errors, and linearly independent.
• The number of observations exceeds the number of regressors.

Matrix Notation (p. 12)
The linear model can be written as $y = X\beta + \varepsilon$ with

$$y = \begin{pmatrix} y_1 \\ \vdots \\ y_T \end{pmatrix}, \quad X = \begin{pmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_T \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}, \quad \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_T \end{pmatrix}$$

Here $\beta = \{-\gamma, -\alpha\}' = \{\beta_0, \beta_1\}'$ are the parameters, $\varepsilon = \{\varepsilon_1, \dots, \varepsilon_T\}'$ the error terms, $y = \{y_1, \dots, y_T\}'$ the regressand with $y_t = \log(M/P)_t$, and $X$ the regressor matrix with $x = \{x_1, \dots, x_T\}'$ and $x_t = \pi_t^e$.

OLS Estimator in Matrix Notation (p. 13)
The least squares estimator for $\beta$ can now be written compactly as

$$\hat\beta = (X'X)^{-1} X'y \qquad (2.19)$$

which is a particular linear combination of the observations of the regressand.

Derivation (pp. 12-13 and Appendix A)
Principle of least squares: $\hat\beta = \arg\min_{\beta} S(\beta)$.
Partial differentiation of

$$S(\beta) = \varepsilon'\varepsilon = (y - X\beta)'(y - X\beta) = y'y - 2y'X\beta + \beta'X'X\beta$$

leads to

$$\frac{\partial S}{\partial \beta} = -2X'y + 2X'X\beta$$

Setting this equal to zero yields the normal equations $X'X\hat\beta = X'y$.

Invertibility
We require the matrix $X'X$ to be of full rank, and thus also $X$ (see Requirements).
Caution! This fails when
• $T < k$ (the number of observations is smaller than the number of parameters), or
• there are exact linear relations between the regressor vectors.

Residuals (p. 9)
Almost as a by-product of estimation we obtain the so-called residuals.
They correspond to the vertical distances of the observations from the regression line, i.e. $\hat\varepsilon = y - X\hat\beta = y - \hat y$. The residuals thus reflect the difference between each observed value of the explained variable $y$ and the value $\hat y$ that is predicted (estimated) by the model.

1.6 Spurious Correlation (p. 323)
When two time series both consist of growing or falling values, they often exhibit high correlations although there is no causal relationship between them.
Solution: "detrending"
[Figure 2.3: Correlation between the decline in breeding stork pairs and the decline in births in the Federal Republic of Germany, 1965-1980 (after St… [30])]

Trend
Long-term development of a variable.
Linear detrending:

$$\hat\varepsilon_{yt} = \log(M/P)_t - \hat\beta_{y0} - \hat\beta_{y1}\, t$$
$$\hat\varepsilon_{xt} = \pi_t^e - \hat\beta_{x0} - \hat\beta_{x1}\, t$$

Corrected Result
Dependent Variable: RESIDLOGM_P
Method: Least Squares
Sample: 1921:01 1922:08
Included observations: 20

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           1.99E-16      0.009838     2.02E-14      1.0000
RESIDPIE    -10.84415     0.637443     -17.01195     0.0000

R-squared            0.941446    Mean dependent var      2.08E-16
Adjusted R-squared   0.938193    S.D. dependent var      0.176964
S.E. of regression   0.043995    Akaike info criterion   -3.314839
Sum squared resid    0.034840    Schwarz criterion       -3.215266
Log likelihood       35.14839    F-statistic             289.4065
Durbin-Watson stat   1.973688    Prob(F-statistic)       0.000000
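The normal equations, the residuals and the linear detrending described above can be sketched in plain Python. Everything in this example is an assumption made for illustration: the two trending series are synthetic, and `solve` and `ols` are hypothetical helper names, not EViews or GRETL functions. The final comparison illustrates a standard result (Frisch-Waugh-Lovell): the slope from the regression of detrended $y$ on detrended $x$ equals the coefficient of $x$ in a regression that includes the trend as an additional regressor.

```python
import math


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular: regressors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z


def ols(columns, y):
    """Solve the normal equations X'X b = X'y; `columns` holds the regressors incl. the constant."""
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[t] for b, c in zip(beta, columns)) for t in range(len(y))]
    resid = [yt - ft for yt, ft in zip(y, fitted)]
    return beta, resid


# Two synthetic trending series (made-up numbers, no causal link built in)
T = 20
const = [1.0] * T
trend = [float(t) for t in range(1, T + 1)]
x = [0.02 * t + 0.05 * math.sin(t) for t in trend]
y = [1.0 - 0.1 * t + 0.05 * math.cos(3 * t) for t in trend]

# Linear detrending: keep the residuals of each series regressed on constant and trend
_, e_y = ols([const, trend], y)
_, e_x = ols([const, trend], x)

# Regression of detrended y on detrended x
beta_detrended, _ = ols([const, e_x], e_y)

# Multiple regression with the trend as an additional regressor
beta_multiple, resid = ols([const, x, trend], y)

print(f"slope after detrending:     {beta_detrended[1]:.6f}")
print(f"coefficient of x with trend: {beta_multiple[1]:.6f}")
```

Solving the normal equations directly keeps the sketch close to the derivation on the slides; numerical software typically uses more stable decompositions for the same task.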
The (Multiple) 2-Regressor Model (p. 19)
Again the linear model is written as $y = X\beta + \varepsilon$ with

$$y = \begin{pmatrix} y_1 \\ \vdots \\ y_T \end{pmatrix}, \quad X = \begin{pmatrix} 1 & x_{11} & x_{21} \\ \vdots & \vdots & \vdots \\ 1 & x_{1T} & x_{2T} \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix}, \quad \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_T \end{pmatrix}$$

Here $\beta = \{-\gamma, -\alpha, \delta\}' = \{\beta_0, \beta_1, \beta_2\}'$ are the parameters, $\varepsilon = \{\varepsilon_1, \dots, \varepsilon_T\}'$ the error terms, $y = \{y_1, \dots, y_T\}'$ the regressand with $y_t = \log(M/P)_t$, and $X$ the regressor matrix with $x_1 = \{\pi_1^e, \dots, \pi_T^e\}'$ and $x_2$ the linear trend $t$.

Frisch-Waugh(-Lovell) Theorem
The coefficients from the regression on linearly detrended data coincide with those from the multiple regression that includes the trend as an additional regressor.
A more general version follows later.

1st Homework: Inflation during the Southern Sung Dynasty
Lui, F.T., "Cagan's Hypothesis and the First Nationwide Inflation of Paper Money in World History", Journal of Political Economy, 91(6), 1983, pp. 1067-1074 (also in "Major Inflations in History", edited by F.H. Capie, Cheltenham, U.K., Edward Elgar Publishing Ltd., 1991).

Period       M_t     P_t
1161-1170    100     100.0
1171-1180    204     86.7
1181-1190    224     107.3
1191-1200    827     183.9
1201-1210    1429    279.8
1211-1220    2347    280.2
1221-1230    2755    335.5
1240         4949    4032.2

2.3 The Coefficient of Determination (p. 21)
The coefficient of determination is a measure of goodness of fit, determined from the variance of the residuals as follows:

$$R^2 = 1 - \frac{\sum_t \hat\varepsilon_t^2}{\sum_t (y_t - \bar y)^2} \qquad (2.42)$$

In the regression model with intercept it holds that $0 \le R^2 \le 1$.

Exhibit 1
For each of the five questions below, circle the most correct response. In answering these questions use the following definitions and diagram:
Regression line: the line that minimizes the sum of squared errors.
Coefficient of determination: $R^2$, calculated as one minus the ratio of the sum of squared errors to the total sum of squares.
Total sum of squares: the sum of squared deviations of Y around its mean.

1. For the four sample points, which line best represents the average of the Y data? (Note that the point (0, 0) is one of the observations.)
A. Ya  B. Yb  C. Yc  D. Yd  E. Ye
2. For the four sample points, which line best represents the regression line?
A. Ya  B. Yb  C. Yc  D. Yd  E. Ye
3. The coefficient of determination for the regression line in Question 2 is:
A. Less than 0  B. 0  C. Between 0 and .5  D. .5  E. Greater than .5
4. If the regression line is forced to pass through the origin, which line best represents the regression line incorporating this zero-intercept constraint?
A. Ya  B. Yb  C. Yc  D. Yd  E. Ye
5. The coefficient of determination for the constrained regression in Question 4 is:
A. Less than 0  B. 0  C. Between 0 and .5  D. .5  E. Greater than .5

[Figure: diagram of the four sample points with the candidate lines Ya, Yb, Yc, Yd and Ye]
[Figure 2]

Econometrics on TV
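The definition of the coefficient of determination used in the exhibit, one minus the ratio of the sum of squared errors to the total sum of squares, can be checked numerically. The sketch below uses four made-up points (they are not the points of the exhibit's diagram) and hypothetical helper names. It illustrates why $R^2$ lies between 0 and 1 when an intercept is included, but can become negative when the line is forced through the origin.

```python
def r_squared(y, fitted):
    """One minus the ratio of the sum of squared errors to the total sum of squares."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst


def fit_with_intercept(x, y):
    """Least squares line with intercept; returns the fitted values."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    return [intercept + slope * xi for xi in x]


def fit_through_origin(x, y):
    """Least squares line constrained to pass through (0, 0); returns the fitted values."""
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
    return [slope * xi for xi in x]


# Four hypothetical sample points (illustrative only)
x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 9.0, 8.0, 7.5]

r2_intercept = r_squared(y, fit_with_intercept(x, y))
r2_origin = r_squared(y, fit_through_origin(x, y))

print(f"R^2 with intercept:  {r2_intercept:.4f}")
print(f"R^2 through origin:  {r2_origin:.4f}")
```

In this example the constrained line fits worse than the horizontal line at the mean of y, so its sum of squared errors exceeds the total sum of squares and $R^2$ falls below zero.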