LECTURE 1
Introduction to Econometrics
Ján Palguta
September 20, 2016

WHAT IS ECONOMETRICS?

"To beginning students, it may seem as if econometrics is an overly complex obstacle to an otherwise useful education. (...) To professionals in the field, econometrics is a fascinating set of techniques that allows the measurement and analysis of economic phenomena and the prediction of future economic trends."
Studenmund (Using Econometrics: A Practical Guide)

Econometrics is a set of statistical tools and techniques for quantitative measurement of actual economic and business phenomena.

It attempts to
- quantify economic reality
- bridge the gap between the abstract world of economic theory and the real world of human activity

It has three major uses:
1. describing economic reality
2. testing hypotheses about economic theory
3. forecasting future economic activity

EXAMPLE

Consumer demand for a particular commodity can be thought of as a relationship between
- quantity demanded ($Q$)
- the commodity's price ($P$)
- the price of a substitute good ($P_s$)
- disposable income ($Y$)

Theoretical functional relationship: $Q = f(P, P_s, Y)$

Econometrics allows us to specify: $Q = 31.50 - 0.73P + 0.11P_s + 0.23Y$

INTRODUCTORY ECONOMETRICS COURSE

Lecturer: Ján Palguta (CERGE-EI, Prague), 171922@mail.muni.cz

Lectures:
- Tuesday, 10:15-11:00 a.m., room VT 203
- Tuesday, 11:05 a.m.-12:45 p.m., room VT 203

Web: https://is.muni.cz/auth/el/1456/podzim2016/BPE_INEC/

Course requirements:
- 4 home assignments (4 × 10 = 40 points)
- written final exam (60 points)
- to pass the course, a student must earn at least 30 points on the exam and at least 50 points in total

Recommended literature:
- Studenmund, A. H., Using Econometrics: A Practical Guide
- Adkins, L., Using gretl for Principles of Econometrics
- Wooldridge, J. M., Introductory Econometrics: A Modern Approach

COURSE CONTENT

Lectures:
- Lecture 1: Introduction, review of statistical background
- Lectures 2-5: Linear regression models
- Lectures 6-12: Violations of standard assumptions
- Lecture 13: Final exam

In-class exercises:
- will serve to clarify and apply the concepts presented in lectures
- we will use statistical software (Gretl) to solve the exercises

LECTURE 1

Introduction, review of statistical background
- probability theory
- statistical inference

Readings:
- Studenmund, A. H., Using Econometrics: A Practical Guide, Chapter 17
- Wooldridge, J. M., Introductory Econometrics: A Modern Approach, Appendix B and C

RANDOM VARIABLES

A random variable $X$ is a variable whose numerical value is determined by chance. It is a quantification of the outcome of a random phenomenon.

Discrete random variable: has a countable number of possible values
- Example: the number of times a coin is flipped before heads is obtained

Continuous random variable: can take on any value in an interval
- Example: the time until the first goal is scored in a football match between FC Barcelona and Real Madrid

DISCRETE RANDOM VARIABLES

A discrete random variable is described by listing its possible values and the probability that it takes on each value.

Probability distribution of a variable $X$ that can take values $x_1, x_2, x_3, \dots$:
$P(X = x_1) = p_1$
$P(X = x_2) = p_2$
$P(X = x_3) = p_3$
...
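The listing above can be made concrete with a small sketch. It assumes a fair six-sided die (an illustrative assumption: each face has probability 1/6); the names `die_distribution` and `probability_at_most` are hypothetical and not part of the course materials.

```python
from fractions import Fraction

# Probability distribution of a six-sided die, ASSUMED fair for illustration:
# each possible value x_i in 1..6 has probability p_i = 1/6.
die_distribution = {value: Fraction(1, 6) for value in range(1, 7)}


def probability_at_most(distribution, x):
    """Add up P(X = x_i) over all listed values x_i that do not exceed x."""
    return sum(p for value, p in distribution.items() if value <= x)


# The listed probabilities of a valid distribution must add up to one.
total_probability = sum(die_distribution.values())
p_three = die_distribution[3]
p_at_most_four = probability_at_most(die_distribution, 4)

print(total_probability)  # 1
print(p_three)            # 1/6
print(p_at_most_four)     # 2/3
```

Exact fractions are used instead of floating-point numbers so that the probabilities add up to exactly one.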
Cumulative distribution function (CDF):
$$F_X(x) = P(X \le x) = \sum_{i \ge 1,\; x_i \le x} P(X = x_i)$$

[Figure: six-sided die, probability density function]
[Figure: six-sided die, histogram of data (100 rolls)]
[Figure: six-sided die, histogram of data (1000 rolls)]

CONTINUOUS RANDOM VARIABLES

The probability density function (PDF) $f_X(x)$ describes the relative likelihood that the random variable $X$ takes on a particular value $x$.

Cumulative distribution function (CDF):
$$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt$$

Computational rule: $P(X \ge x) = 1 - P(X \le x)$

EXPECTED VALUE AND MEDIAN

Expected value (mean): the (long-run) average value of a random variable
- Discrete variable: $E[X] = \sum_{i \ge 1} x_i P(X = x_i)$
- Continuous variable: $E[X] = \int_{-\infty}^{+\infty} x f_X(x)\,dx$
- Example: the mean of a fair six-sided die is $E[X] = \frac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = 3.5$

Median: "the value in the middle"

EXERCISE 1

A researcher is analyzing data on the financial wealth of 100 professors at a small liberal arts college. The values of their wealth range from $400 to $400,000, with a mean of $40,000 and a median of $25,000. However, when entering these data into a statistical software package, the researcher mistakenly enters $4,000,000 for the person with $400,000 wealth. How much does this error affect the mean and the median?

VARIANCE AND STANDARD DEVIATION

Variance: measures the extent to which the values of a random variable are dispersed around the mean. If values (outcomes) are far from the mean, variance is high. If they are close to the mean, variance is low.
$$\mathrm{Var}[X] = E\left[(X - E[X])^2\right]$$

Standard deviation:
$$\sigma_X = \sqrt{\mathrm{Var}[X]}$$

DANCING STATISTICS

Watch the video "Dancing statistics: Explaining the statistical concept of variance through dance":
https://www.youtube.com/watch?v=pGfwj4GrUlA&list=PLEzw67WWDg82xKriFiOoixGpNLXK2GNs9&index=4

Use the 'dancing' terminology to answer these questions:
1. How do we define variance?
2. How can we tell if variance is large or small?
3. What does it mean to evaluate variance within a set?
4. What does it mean to evaluate variance between sets?
5. What is the homogeneity of variance?
6. What is the heterogeneity of variance?

COVARIANCE, CORRELATION, INDEPENDENCE

Covariance: how, on average, two random variables vary with one another. Do the two variables move in the same direction or in opposite directions? It measures the amount of linear dependence between two variables.
$$\mathrm{Cov}(X, Y) = E\left[(X - E[X])(Y - E[Y])\right] = E[XY] - E[X]E[Y]$$

Correlation: a similar concept to covariance, but easier to interpret. It takes values between -1 and 1.
$$\mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$$

INDEPENDENCE OF VARIABLES

Independence: $X$ and $Y$ are independent if the conditional probability distribution of $X$ given the observed value of $Y$ is the same as if the value of $Y$ had not been observed.

If $X$ and $Y$ are independent, then $\mathrm{Cov}(X, Y) = 0$ (the converse does not hold in general).

Dancing statistics: explaining the statistical concept of correlation through dance
https://www.youtube.com/watch?v=VFjaBh12C6s&index=3&list=PLEzw67WWDg82xKriFiOoixGpNLXK2GNs9

RANDOM VECTORS

Sometimes we deal with vectors of random variables.

Example:
$$X = \begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix}$$

Expected value:
$$E[X] = \begin{pmatrix} E[X_1] \\ E[X_2] \\ E[X_3] \end{pmatrix}$$

Variance/covariance matrix:
$$\mathrm{Var}[X] = \begin{pmatrix} \mathrm{Var}[X_1] & \mathrm{Cov}(X_1, X_2) & \mathrm{Cov}(X_1, X_3) \\ \mathrm{Cov}(X_2, X_1) & \mathrm{Var}[X_2] & \mathrm{Cov}(X_2, X_3) \\ \mathrm{Cov}(X_3, X_1) & \mathrm{Cov}(X_3, X_2) & \mathrm{Var}[X_3] \end{pmatrix}$$

STANDARDIZED RANDOM VARIABLES

Standardization makes different variables easier to compare.

Define $Z$ to be the standardized variable of $X$:
$$Z = \frac{X - \mu_X}{\sigma_X}$$

The standardized variable $Z$ measures how many standard deviations $X$ is below or above its mean.

Whatever the expected value and variance of $X$, it always holds that $E[Z] = 0$ and $\mathrm{Var}[Z] = \sigma_Z = 1$.

NORMAL (GAUSSIAN) DISTRIBUTION

Notation: $X \sim N(\mu, \sigma^2)$
$E[X] = \mu$
$\mathrm{Var}[X] = \sigma^2$

Dancing statistics
https://www.youtube.com/watch?v=dr1DynUzjq0&index=2&list=PLEzw67WWDg82xKriFiOoixGpNLXK2GNs9

EXERCISE 2

A woman wrote to Dear Abby, saying that she had been pregnant for 310 days before giving birth. Completed pregnancies are normally distributed with a mean of 266 days and a standard deviation of 16 days. Use statistical tables to determine the probability that a completed pregnancy lasts
- at least 270 days
- at least 310 days

CHI SQUARED DISTRIBUTION

Chi-squared distribution with $k$ degrees of freedom: $\chi^2_k$

Let $Z_i \sim N(0, 1)$ be independent for $i = 1, \dots, k$. Then
$$X = \sum_{i=1}^{k} Z_i^2 \sim \chi^2_k$$

t AND F DISTRIBUTIONS

Student's t distribution with $n$ degrees of freedom: $t_n$

Fisher-Snedecor F distribution with $m$ and $n$ degrees of freedom: $F_{m,n}$

Let $Z \sim N(0, 1)$, $X \sim \chi^2_m$ and $Y \sim \chi^2_n$ be independent. Then
$$\frac{Z}{\sqrt{Y/n}} \sim t_n \quad \text{and} \quad \frac{X/m}{Y/n} \sim F_{m,n}$$

Note that as $n$ grows, the t distribution approaches $N(0, 1)$.

Why do we care? These distributions are used to construct confidence intervals and to test hypotheses.

SUMMARY

Today we revised some concepts from statistics that we will use throughout our econometrics classes.

It was a very brief overview, intended only to indicate what students are expected to know already.

The focus was on properties of statistical distributions and on working with normal distribution tables.

NEXT LECTURE

We will go through the terminology of sampling and estimation.

We will start with regression analysis and introduce the Ordinary Least Squares estimator.
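As a numerical cross-check on the table lookup asked for in Exercise 2, the sketch below standardizes each threshold with $Z = (X - \mu_X)/\sigma_X$ and then applies the rule $P(X \ge x) = 1 - P(X \le x)$. It uses Python's standard-library `statistics.NormalDist` in place of printed tables, which is a substitution made here for illustration; the function and variable names are hypothetical and not part of the course materials.

```python
from statistics import NormalDist

# Values taken from Exercise 2: completed pregnancies ~ N(266, 16^2), in days.
MEAN_DAYS = 266.0
SD_DAYS = 16.0

# Standard normal distribution N(0, 1); plays the role of the printed table.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def z_score(x, mean, sd):
    """Standardize x: how many standard deviations x lies above the mean."""
    return (x - mean) / sd


def upper_tail_probability(x, mean, sd):
    """Return P(X >= x) for X ~ N(mean, sd^2), computed as 1 - P(Z <= z)."""
    return 1.0 - standard_normal.cdf(z_score(x, mean, sd))


z_270 = z_score(270, MEAN_DAYS, SD_DAYS)
z_310 = z_score(310, MEAN_DAYS, SD_DAYS)
p_270 = upper_tail_probability(270, MEAN_DAYS, SD_DAYS)
p_310 = upper_tail_probability(310, MEAN_DAYS, SD_DAYS)

print(f"z = {z_270:.2f}, P(X >= 270) = {p_270:.4f}")
print(f"z = {z_310:.2f}, P(X >= 310) = {p_310:.4f}")
```

The two z-values, 0.25 and 2.75, are the entries one would look up in a standard normal table. The resulting probabilities are roughly 0.40 and 0.003 and should agree with the table up to its rounding precision.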