PSY544 – Introduction to Factor Analysis
Homework assignment 2, Fall 2019
Due midnight, December 9, 2019
Data from a sample of N = 139 decathlon athletes were collected concerning their performance
in each individual decathlon event (100m, Long Jump, Shot Put, High Jump, 400m, 110m High
Hurdles, Discus, Pole Vault, Javelin, 1500m). All performance variables were coded in such a
way that the higher a person's score, the better his/her performance. A 10 x 10 correlation
matrix was computed from the data:
Part 1
Obtain unrotated Maximum Likelihood (MWL) solutions for m = 0, 2, 3 and 4 factors. Let CEFA
save the iteration details and increase the maximum number of iterations above the default
(to 300, for example). You will encounter “parameters on the boundary” (likely Heywood
cases) – ignore this problem for the rest of the homework and don’t let it affect your responses to
the questions below.
a) For the 2-factor solution, explain (using your own words, but correctly and sensibly, so I
understand) why the sample value of the discrepancy function 𝐹̂ (“Disc Fun” in CEFA output) in
the “iteration details” follows the pattern shown in the output, from one iteration to the next.
Explain what is meant by “convergence” in this context.
b) For the m = 2, 3 and 4 solutions, create a table showing the sample value of the discrepancy
function 𝐹̂, the likelihood-ratio test statistic (𝜒²), the degrees of freedom (df), the effective
number of parameters (npar), the estimated RMSEA and the associated 90% CI.
For the 3-factor solution…:
c) …comment on the meaning of each of the values you have put into the table for the previous
question. Explain not what they imply for any of the models, but what the statistics / numbers
represent.
d) …show the relationship (mathematically) between 𝐹̂ and the likelihood-ratio test statistic.
e) …show the relationship (mathematically) between 𝐹̂ and the estimated population discrepancy
function value, 𝐹̂0.
f) …show the relationship (mathematically) between 𝐹̂0 and the point estimate of RMSEA.
g) …compute the final communality for the second variable (Long Jump) using the factor
loadings. What does this value mean?
h) Based on the results in your table and the residual correlations for each model, how many
factors should be retained? Justify your response. For the purposes of this question, disregard
the interpretability of the solution.
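The quantities in Part 1 can be sketched numerically as follows. This is a minimal illustration under commonly used conventions (multiplier n = N - 1, RMSEA after Browne and Cudeck), which should be checked against what CEFA actually reports. The value of 𝐹̂ and the loadings are hypothetical, not taken from the decathlon data, and the function names are made up for this example. The 90% CI for RMSEA is not computed here.

```python
import math

def efa_df(p, m):
    """Degrees of freedom of an unrotated factor model with p variables and m factors."""
    return ((p - m) ** 2 - (p + m)) // 2

def fit_summary(f_hat, n_obs, p, m):
    """Derive fit statistics from the sample discrepancy function value f_hat."""
    n = n_obs - 1                        # multiplier n = N - 1 (assumed convention)
    df = efa_df(p, m)
    npar = p * (p + 1) // 2 - df         # effective number of parameters
    chi2 = n * f_hat                     # likelihood-ratio test statistic
    f0_hat = max(f_hat - df / n, 0.0)    # estimated population discrepancy, truncated at 0
    rmsea = math.sqrt(f0_hat / df)       # RMSEA point estimate
    return {"df": df, "npar": npar, "chi2": chi2, "f0_hat": f0_hat, "rmsea": rmsea}

def communality(loadings_row):
    """Communality of one variable: sum of its squared loadings (orthogonal factors)."""
    return sum(l ** 2 for l in loadings_row)

# Hypothetical inputs: F-hat = 0.20 for a 3-factor model, N = 139, p = 10.
summary = fit_summary(f_hat=0.20, n_obs=139, p=10, m=3)
for name, value in summary.items():
    print(f"{name:7s} {value:.4f}")

# Hypothetical loadings of one variable on three orthogonal factors.
print("communality", round(communality([0.6, 0.3, 0.2]), 4))
```

The communality formula as written assumes uncorrelated factors, which holds for the unrotated solution.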
Part 2
Obtain the eigenvalues of the sample correlation matrix R (CEFA gives you those, or compute
them some other way).
a) What number of factors is suggested by the Kaiser criterion? Compare this with your answer
to h) in Part 1 and comment on any difference.
b) Draw a scree plot and use it to make a judgement on the number of factors to be retained.
Again, compare this with your answer to h) in Part 1 and comment on any difference.
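If you compute the eigenvalues "some other way", one option is NumPy. The sketch below uses a small hypothetical 4 x 4 correlation matrix (not the decathlon matrix) and made-up helper names. It sorts the eigenvalues in descending order, applies the Kaiser rule (eigenvalues greater than 1), and prints a crude text version of a scree plot.

```python
import numpy as np

# Hypothetical correlation matrix, for illustration only.
R = np.array([
    [1.0, 0.6, 0.0, 0.0],
    [0.6, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
])

def sorted_eigenvalues(corr):
    """Eigenvalues of a symmetric matrix, largest first."""
    return np.linalg.eigvalsh(corr)[::-1]

def kaiser_count(eigenvalues):
    """Number of eigenvalues greater than 1 (Kaiser criterion)."""
    return int(np.sum(eigenvalues > 1.0))

eig = sorted_eigenvalues(R)

# Crude text scree plot: one row per eigenvalue, bar length proportional to its size.
for k, ev in enumerate(eig, start=1):
    print(f"{k:2d}  {ev:6.3f}  " + "*" * int(round(10 * ev)))

print("Factors suggested by the Kaiser criterion:", kaiser_count(eig))
```

For a correlation matrix the eigenvalues sum to the number of variables, which is a quick sanity check on the output.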
Part 3
Conduct a rotation on the 4-factor OLS solution.
a) State what rotation method you used and whether it’s an orthogonal or an oblique rotation.
Explain the defining difference between the two.
b) In what sense is the rotated solution “better” than the unrotated one? Did the rotation help
you in any way?
c) Briefly interpret the factors in the rotated solution (i.e., what could these modelled variables
represent in real life?)
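To make the idea of rotation concrete, here is a minimal sketch of one orthogonal rotation, Varimax, using the widely used SVD-based iteration. It is not CEFA's implementation, and the loading matrix is hypothetical: a simple structure deliberately rotated away by 30 degrees. An orthogonal rotation leaves the communalities unchanged and keeps the factors uncorrelated.

```python
import numpy as np

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings (larger = simpler)."""
    return float(np.sum(np.var(loadings ** 2, axis=0)))

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation; returns rotated loadings and the rotation matrix."""
    p, m = loadings.shape
    T = np.eye(m)
    d_old = 0.0
    for _ in range(max_iter):
        Lr = loadings @ T
        # Gradient-like matrix of the Varimax criterion at the current rotation.
        G = Lr ** 3 - Lr @ np.diag(np.sum(Lr ** 2, axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ G)
        T = u @ vt                      # nearest orthogonal matrix
        d_new = s.sum()
        if d_new < d_old * (1 + tol):   # stop when the improvement is negligible
            break
        d_old = d_new
    return loadings @ T, T

# Hypothetical simple structure, rotated by 30 degrees to hide it.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
theta = np.deg2rad(30)
rot = np.array([[np.cos(theta), np.sin(theta)],
                [-np.sin(theta), np.cos(theta)]])
L = simple @ rot

L_rot, T = varimax(L)
print(np.round(L_rot, 3))
print("criterion before:", round(varimax_criterion(L), 4))
print("criterion after: ", round(varimax_criterion(L_rot), 4))
```

The columns of the rotated matrix may come out in a different order or with flipped signs relative to the original simple structure; that does not change the interpretation.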
Optional – Extra credit (feel free to do both, either, or none)
You’re already familiar with some of the many rotational criteria (Varimax, Quartimax, ...).
Now let’s get creative.
a) Look up some rotational criterion we haven’t covered in class. Copy the simplicity /
complexity function into your assignment and explain in your own words what the gist of the
function is, or point out some of the differences between the function and the functions we have
covered in class.
b) Try to come up with your own simplicity or complexity function and explain it. Of course this
is no easy job, so don’t think too hard about it and just jot some ideas down and explain how
they would relate to the principle of simple structure.
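As an illustration of what evaluating a complexity function looks like, the sketch below computes the Crawford-Ferguson family criterion for a loading matrix. Whether this family counts as "not covered in class" is an assumption. The function name and the example loadings are hypothetical. The parameter kappa weights row (variable) complexity against column (factor) complexity: kappa = 0 corresponds to Quartimax and kappa = 1/p to Varimax for orthogonal rotation.

```python
import numpy as np

def cf_complexity(loadings, kappa):
    """Crawford-Ferguson complexity: smaller values mean simpler loading patterns."""
    sq = np.asarray(loadings, dtype=float) ** 2
    # Row complexity: products of squared loadings on different factors, per variable.
    row_term = np.sum(sq.sum(axis=1) ** 2 - (sq ** 2).sum(axis=1))
    # Column complexity: products of squared loadings of different variables, per factor.
    col_term = np.sum(sq.sum(axis=0) ** 2 - (sq ** 2).sum(axis=0))
    return float((1 - kappa) * row_term + kappa * col_term)

# Hypothetical loadings: perfect simple structure versus the same structure rotated 30 degrees.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.5]])
theta = np.deg2rad(30)
rot = np.array([[np.cos(theta), np.sin(theta)],
                [-np.sin(theta), np.cos(theta)]])
mixed = simple @ rot

p = simple.shape[0]
for kappa in (0.0, 1.0 / p):
    print(f"kappa={kappa:.2f}  simple={cf_complexity(simple, kappa):.4f}  "
          f"mixed={cf_complexity(mixed, kappa):.4f}")
```

With kappa = 0 the criterion is zero exactly when every variable loads on a single factor, which is one way of formalising simple structure.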