PSY532 R101: A practical guide to using R as your everyday statistical tool

Faculty of Social Studies
Autumn 2014
Extent and Intensity
1/1/0. 4 credit(s). Type of Completion: z (credit).
Teacher(s)
Anastasia Ejova, PhD. (lecturer)
doc. Mgr. Stanislav Ježek, Ph.D. (lecturer)
Mgr. Kamila Dufková (assistant)
Guaranteed by
Anastasia Ejova, PhD.
Department of Psychology – Faculty of Social Studies
Supplier department: Department of Psychology – Faculty of Social Studies
Timetable
Mon 17:00–18:30 PC26
Course Enrolment Limitations
The course is only offered to the students of the study fields the course is directly associated with.
fields of study / plans the course is directly associated with
There are 48 fields of study the course is directly associated with.
Course objectives
This course has three aims. The first is to show you how to become completely independent of SPSS, should you find yourself in a workplace without an expensive SPSS license. The second is to provide a “refresher” in common statistical analyses. No matter what software you see yourself using in the future, this is a chance to revise statistical theory and even learn new concepts that R requires you to consider as you tell it what to do. The final aim is to make you excited about the wide range of analyses possible in R. Often, when reviewers comment on drafts of your papers, they will suggest small statistical checks that are easier to perform in R than in SPSS. Having completed this course, you should feel confident exploring any of these specialised operations in R.
Syllabus
  • 1. Data management: Entering, inspecting and “cleaning” your data in R
    This lecture and seminar set will describe how to import data into R from Excel and SPSS. Students will additionally become acquainted with the RStudio interface, which divides the screen into four meaningful windows for viewing and analysing data. The data set we will be using for most of the course will also be introduced, alongside the methods of obtaining means, standard deviations and other descriptive statistics in R.
  • 2. Basic ANOVA and regression
    The lecture will briefly revise the principles of analysis of variance and regression before running through examples in R. We will cover t-tests, one-way ANOVA with contrasts and post-hoc tests, factorial ANOVA, repeated measures ANOVA, analysis of covariance, simple linear regression, and hierarchical linear regression. The seminar will introduce “bootstrapping” in relation to ANOVA and regression.
  • 3. Graphing
    R has many “apps” (or “packages”) for drawing graphs, and in these two weeks we will learn one of the most popular, a package called ggplot2. While enabling you to draw many different kinds of quick graphs to explore data, this package also allows you to adjust graph features for publications and presentations.
  • 4. Working with skewed, clustered and categorical data
    Often we wish to examine group-based differences in variables that are not normally distributed, clustered based on some other variable such as gender, or categorical (e.g., responses belonging to one of two categories – “yes” and “no”). This lecture-seminar set will cover techniques for working with this kind of data: generalized linear modelling, multilevel modelling, logistic regression and chi-square analysis. A new data set will be introduced for some of the analyses. Diverting our attention from R a little, we will discuss how results from some of these analyses tend to be interpreted and reported in research articles.
  • 5. Bayesian data analysis
    We will dedicate one week to Bayesian data analysis, an approach to hypothesis testing that is gaining popularity around the world. SPSS is not suitable for this kind of analysis, whereas R is among the programs that offer many options. Under the Bayesian approach, belief in a hypothesis should be based on the data and prior assumptions about the probability of the hypothesis. The lecture will highlight the convenience of reporting degree of belief in a hypothesis as opposed to the usual practice of reporting the probability of data at least as extreme as those observed if the null hypothesis were true. Our exploration of “prior assumptions” will be largely in the context of a Bayesian analysis R package (arm) where the prior assumption is that the variable being predicted in a linear model has few rather than many predictors. The package will be used to re-analyse our course data sets, with reporting and interpretation discussed.
  • 6. Handling missing data
    Among researchers using SPSS, a common approach to handling missing data is the replacement of missing values according to an expectation-maximisation (EM) algorithm. We will discuss what this algorithm means in broad terms and then consider the advantages of using the algorithm to calculate multiple possible missing values instead of just one. The norm package in R does precisely this, giving us what is termed a “multiple imputation”. The lecture will cover an alternative approach to multiple imputation that is useful if categorical variables are among those with missing values. The mi package developed for this purpose will be demonstrated. In the seminar, we will put R aside for another moment and discuss an emerging methodological trend – planned missingness. This is the approach of asking participants to answer a smaller subset of survey questions rather than the full set. Analysis of the full survey is subsequently possible following multiple imputation.
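As a rough illustration of the Bayesian idea in topic 5, the sketch below uses base R only (not the arm package) to combine a prior belief with some data into a posterior degree of belief, and contrasts this with a classical p-value. The numbers (8 “yes” answers out of 10) and the two candidate hypotheses are invented for illustration and are not taken from the course data sets.

```r
# Hypothetical data: 8 "yes" answers out of 10 (invented for illustration)
n_yes   <- 8
n_total <- 10

# Two competing hypotheses about the probability of a "yes" (assumed values)
p_h0 <- 0.5  # null hypothesis: "yes" and "no" are equally likely
p_h1 <- 0.8  # alternative hypothesis: "yes" is much more likely

# Prior assumption: both hypotheses are equally believable before seeing data
prior_h1 <- 0.5
prior_h0 <- 1 - prior_h1

# Likelihood of the observed data under each hypothesis
lik_h0 <- dbinom(n_yes, size = n_total, prob = p_h0)
lik_h1 <- dbinom(n_yes, size = n_total, prob = p_h1)

# Bayes' rule: posterior belief is proportional to prior times likelihood
posterior_h1 <- (prior_h1 * lik_h1) / (prior_h1 * lik_h1 + prior_h0 * lik_h0)
posterior_h0 <- 1 - posterior_h1

# Classical two-sided p-value: probability of data at least this extreme
# if the null hypothesis were true
p_value <- binom.test(n_yes, n_total, p = p_h0)$p.value

print(round(c(posterior_h1 = posterior_h1, p_value = p_value), 3))
```

With these invented numbers, the posterior belief in the alternative hypothesis is about 0.87, while the two-sided p-value is about 0.11. Changing prior_h1 shows how much the conclusion depends on the prior assumption.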
Language of instruction
English
Further comments (probably available only in Czech)
Study Materials
The course is also listed under the following terms Spring 2015, Autumn 2016, Autumn 2017, Autumn 2018.
  • Permalink: https://is.muni.cz/course/fss/autumn2014/PSY532