Introduction to gretl
Applied Economics
Dali Laxton
16.10.2020

Outline
1 What is gretl?
2 gretl Basics
3 Importing Data
4 Saving as gretl File
5 Running a Script
6 First Exercises
7 Commands on Datasets
8 Commands on Variables
9 Graphs

What is gretl?
• gretl is an acronym for Gnu Regression, Econometrics and Time-series Library
• it is free econometrics software
• it has an easy-to-use graphical user interface (GUI)
• it runs least-squares, maximum-likelihood, systems estimators...
• it outputs results to several formats
• very important for us in this course: it admits scripts (sequences of commands saved in a file)

How do I get gretl?
• Handy for remote learning: since it is free, open-source software, anyone can install it on a home desktop without needing a license
• To install it, go to http://gretl.sourceforge.net
• It runs on Windows, Mac, and Linux

Installation
• Open the webpage and scroll down to the Download section. Windows users: click the circled button.
• Choose the .exe file that is appropriate for your system.
• Run the .exe file and choose the default options to proceed. Do not forget to add the gretl directory to your PATH.

Start gretl
To start gretl, type gretl in your search box and select it; an empty gretl window will pop up. You can either run commands directly in the gretl console or write scripts and save them for future use.
[Screenshots: gretl console; gretl scripts]

How do I work with gretl? (1/2)
• the easiest way for beginners is to use the graphical user interface
• you can also use the console button on the toolbar: from the prompt (?) you can execute gretl commands one line at a time
• the most efficient way is to use scripts:
  1 create a script file, write gretl commands, one per line, and save it
  2 run the script using the GUI
  3 inspect the output
  4 if needed, change the script file, save it, and go back to step 2

Running commands in the script
Press CTRL + R or click the corresponding toolbar button (shown on the slide). A window with the output (or errors) will open. Scripts are saved as .inp files.

How do I work with gretl? (2/2)
You know your way around the GUI, but want to know about scripts...
  1 actions you perform with the GUI are stored as a script in a file called session.inp
  2 gretl comes with over 70 practice scripts
  3 the manual gives good advice and devotes several chapters to good programming solutions
  4 this course will provide you with tested scripts

Opening datasets
• gretl comes with built-in datasets which you can play with.
• Go to File -> Open data -> Sample file to see a long list of datasets. Alternatively, choose User file instead of Sample file to open a specific data file on your computer.
• If you want to open a file with a different extension, for example an Excel file, set the file-type filter in the User file dialog to Excel; otherwise nothing will be shown in the window.
• For now, let's open a sample file, the very first one offered, 'abdata', by double-clicking it.

Playing with data
The following window will pop up. You can now check the values of each variable by double-clicking it. You can also edit the values, etc.
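
The GUI steps for opening the sample file can also be written as a short script. The following is a minimal sketch, assuming the sample file abdata.gdt can be found in gretl's default sample-data location; the choice of commands is illustrative and not taken from the slides.

```gretl
# Sketch only: script counterpart of opening the sample file via the GUI.
# Assumption: abdata.gdt is found in gretl's default sample-data location.
open abdata.gdt

# List the variables in the dataset
varlist

# Summary statistics for all variables (mean, minimum, maximum, std. dev. only)
summary --simple
```

Saved as an .inp file and run with CTRL + R from the script editor, this should open an output window with the variable list followed by the summary table.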
Operations on variables
• See summary statistics of variables by highlighting them and clicking Variable -> Summary statistics in the upper panel.
• Change the values of a variable by highlighting it and clicking Data -> Edit values.

Saving as gretl File

Saving as a new gretl file
You can open other types of file in gretl (for example Excel), perform operations on the data, and then save the result as a gretl file (.gdt) by clicking File -> Save data as -> Standard format.

Running a Script

Looking at the session script
Tools -> Command log displays the operations that we have done so far.

More on scripts
• File/Script files/New script opens the command script editor
• if a command is too long for one line, use the backslash (\) as a continuation character
• scripts (and the console) require you to use the correct language syntax
• gretl's language is case sensitive: gretl considers x to be different from X
• you can find all the commands in the gretl command reference, under Help -> Command reference
• in the console window, you can type help

First Exercises

Exercise 1
• Load the file engel.gdt. The dataset contains two variables named income and foodexp. The variable foodexp is the annual expenditure on food in a household and income is annual income measured in $100 increments.
• Using the GUI, display the logs of both variables.
• Using the GUI, summarize foodexp for households with income below the median.
• Using the GUI, plot foodexp against income.
• Write all the commands in a script file and save it. Open the script and run the commands.
• Save the new dataset.

Exercise 2
The file BWGHT.csv contains information about infants' birth weight, sex and race, the income of their families, their parents' education and the number of cigarettes the mother smoked per day during pregnancy (msmoke).
  1 Import into gretl the file BWEIGHT.xlsx.
  2 Compute the average birth weight for all the babies in the sample, and separately for boys and girls.
  3 Compute the proportion of mothers who smoked during pregnancy.
  4 Which family variables do you think are related to children's birth weight? Compute the correlation between bwght and those variables.
  5 Save the new dataset in gretl format.
  6 Write down all the previous commands in a script and save it. Close your session, open the script and run all the commands together.

Basic commands for data management (1/2)
Commands on the entire dataset
• open: opens a data file, replacing any data file already open
• smpl: defines the sample range
• dataset: sorts/clears/transposes/adds observations and more
• setobs: declares the structure of the data (cross-section, time-series, panel)
• append: appends the content of a data file to the current dataset
• store: saves the data into a file

Basic commands for data management (2/2)
Basic commands on variables
• genr: creates a new variable
• delete: removes variables
• setinfo: sets attributes of a variable
• rename: renames a variable
• summary: shows summary statistics for variables
• print: lists the values of variables

Commands on Datasets

open dataname --www --sheet="name" --coloffset=# --rowoffset=#
• opens a dataset, replacing any already loaded data
• --www opens a database on the gretl server
• with spreadsheets, --sheet selects the worksheet, and --coloffset and --rowoffset set the first column and row to read
• the first row must contain valid variable names. In the case of an ASCII or CSV import, if the file contains no row with variable names, the program will automatically add names v1, v2 and so on.
    open C:\there\mydata.xls --sheet=mysheet --coloffset=3 --rowoffset=2
• opens worksheet mysheet from C:\there\mydata.xls
• reads the data starting from the fourth column and third row

smpl (#start #end | condition --restrict | # --random | full) --replace --balanced
• condition --restrict: restricts the sample to observations that satisfy the condition
• # --random: # cases are randomly selected
• full: restores the full data range
• sample restrictions are cumulative by default: --replace turns off all previous restrictions
Examples (using Example2.xls)
    smpl YEAR!=1976 --restrict
    smpl EMP > 3 --restrict --replace
    smpl 50 --random

dataset (addobs # | transpose | sortby varname | resample # | clear)
• addobs: adds extra observations at the end
• transpose: transposes the current dataset
• sortby: sorts the data by varname (dsortby: descending order); a list of variables can be provided; available only for undated data
• resample: random sampling; constructs a new dataset by random sampling, with replacement, of the rows of the current dataset. The original dataset can be retrieved via the command smpl full.
• clear: clears out the current data
Examples
    dataset sortby EMP
    dataset resample 500
    dataset clear

setobs #freq #start (--cross-section | --time-series | --stacked-cross-section | --stacked-time-series)
• #freq represents the frequency in time-series data
• in panels, #freq is the number of units in stacked cross-sections, or of periods in stacked time series
• for cross-sections, #freq=1
• #start=1 for panels and cross-sections
• in time series, #start is the starting date

setobs unitvar timevar --panel-vars
• imposes a panel interpretation
• sorts the data as stacked time series, by ascending values of unitvar
Examples
    setobs 1 1 --cross-section
    setobs 20 1:1 --stacked-time-series
    setobs unit year --panel-vars

append newdata --time-series
• opens a data file and appends its content to the current dataset
• First case: additional observations for existing variables
• Second case: new variables (best if #obs compatible)
• Third case: appends a time series to a panel
First case
    open C:\there\Example2.xls --sheet=first100
    append C:\there\Example2.xls --sheet=moreunits
  appends worksheet moreunits from C:\there\Example2.xls
Second case
    append C:\there\Example2.xls --sheet=wages
  appends worksheet wages from C:\there\Example2.xls

append: third case
You have a panel and you want to add a variable which is available in time-series form. For example, you want to add annual CPI data to a panel in order to deflate nominal income figures.
• open the data:
    open C:\there\Example2.xls --sheet=first100
• you need to have a panel:
    setobs unit year --panel-vars
    append C:\there\Example2.xls --sheet=cpi

store datafile [varlist] --gzipped --overwrite
• by default, data are saved in gretl format
• also exports to CSV (using --csv) and other formats
    store C:\there\mydata.gdt
  saves the current data to C:\there\mydata.gdt

Commands on Variables

[genr] newvar = formula
• a formula is a well-formed function of variables
• the range over which the result is written depends on the current sample
• arithmetical operators: ^, *, /, +, -
• boolean operators: ! (negation), && (AND), || (OR), >, <, ==, >=, <=, !=
• look at the gretl Function Reference (Help/Function Reference) for built-in functions
Examples
    genr y = 3 + 2 * x1 + 5 * x2 + error
    D1976 = (YEAR == 1976)
    genr avgy = mean(y)

delete [varlist] --db
• removes the listed variables
• if no varlist is given, it deletes the last (highest numbered) variable from the dataset
• --db: deletes variables from a gretl database

setinfo varname -d "thislabel" -n "thisname" --discrete --continuous
• -d "thislabel": thislabel is set as the variable's descriptive label
• -n "thisname": thisname is used in place of the variable's name in graphs
• --discrete: marks the variable as discrete (by default variables are continuous)
Examples
    setinfo x1 -d "Description of x1" -n "Graph name"
    setinfo z --discrete

rename varname newname
• changes the name of the variable
• names must be of 31 characters maximum (15 in older versions of gretl)
• they must start with a letter
• they must be composed of only letters, digits, and the underscore character
Example in bwght.gdt
    rename bwght birth_weight

summary [varlist] --simple --by=byvar
• prints summary statistics for the variables in varlist
• if varlist is omitted, it prints statistics for all variables
• --simple: only prints the mean, minimum, maximum and standard deviation
• --by=byvar: statistics are printed for sub-samples defined by the values taken by byvar
Example in bwght.gdt
    summary bwght
    summary bwght --simple
    summary bwght --simple --by=male
    summary bwght --simple --by=parity

print [varlist] --byobs --no-dates
• prints the values of the variables in varlist
• if no list is given, prints the values of all variables
• --byobs: data are printed by observation, not by variable
• you can also print strings
Examples
    print bwght male --byobs
    print bwght ; male --byobs :
    print "This is a comment"

Graphs

A (very) brief graph menu
• gnuplot yvars xvar: xy graphs
• scatters yvar ; xvarlist: pairwise scatterplots
• freq var, or using the GUI Variable/Frequency distribution: histograms
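
The graph commands listed above could be tried out with a script along the following lines. This is a rough sketch, not part of the original slides: it assumes the sample file abdata.gdt is available and that it contains variables named n, w and k; substitute variables from your own dataset otherwise.

```gretl
# Sketch only: the three graph commands from the menu above.
# Assumption: abdata.gdt contains the variables n, w and k.
open abdata.gdt

# XY graph of n (vertical axis) against w (horizontal axis);
# --output=display asks gretl to show the plot rather than write it to a file
gnuplot n w --output=display

# Pairwise scatterplots of n against w and of n against k
scatters n ; w k

# Frequency distribution of n
freq n
```

When a script is run in batch mode, plotting commands without an --output option may write the plot to a file in the working directory instead of displaying it, so it is worth checking there if no graph window appears.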