Fitting the Common Factor Model II
PSY544 – Introduction to Factor Analysis
Week 6

Maximum likelihood estimation
•In the preceding lectures, we covered the least-squares approach to estimating the model parameters (the model matrices).
•An important property of the techniques covered previously is that they do not make any distributional assumptions about the manifest variables.
•Now, we will briefly cover another technique – maximum likelihood (ML) estimation.

Maximum likelihood estimation
•Maximum likelihood (ML) estimation does make distributional assumptions.
•These assumptions concern the normality of the data. In other words, ML estimation assumes the MVs are normally distributed.
•ML estimation is used not only in FA but quite widely in statistics overall, so it comes in handy to have an idea of how it works. Right? :)

Maximum likelihood estimation
•ML estimation, in general, can be described as follows:
•1) You assume you have a random sample from some well-defined population.
•2) You assume that the distribution of the manifest variables has some particular form (in our case, multivariate normality).
•The likelihood function can be defined as follows:
•L = likelihood function = function(data, parameters)

Maximum likelihood estimation
•The likelihood function indicates the likelihood of the data, given the model parameters.
•The principle goes as follows: given the data we have, we find the values of the parameters that maximize the likelihood function. These values of the parameters are the maximum likelihood estimates of the real (unknown) parameters.
•Let me illustrate with aliens.

Maximum likelihood estimation
•I'll spare you most of the math here; the important thing is that you understand MLE (maximum likelihood estimation) conceptually.
•The actual values of the likelihood function tend to be very small, which can give computers a hard time (due to rounding errors) and can be grossly inefficient.
•For these reasons, we work with a function that is inversely related to the likelihood function – a function based on -2 × log(likelihood). So we are really looking for the minimum of this function, which corresponds to the maximum of the likelihood.

Maximum likelihood estimation
•[Equation slides (the ML discrepancy function and the iterative minimization procedure) – not captured in this text version.]

Maximum likelihood estimation
•The convergence criterion is usually defined as the change in the discrepancy function values between two successive iterations. When the difference in these values drops below some pre-defined point, the iterations cease.
•Heywood cases can occur, just like in the iterative least squares case.

Maximum likelihood estimation
•After convergence, the likelihood ratio test statistic is computed by multiplying the value of the discrepancy function by (N – 1).
•This statistic is used in assessing model fit and testing hypotheses about model fit (we'll get there shortly).
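Supplementary sketch (not part of the original slides): since the equation slides above were not captured, the following minimal Python example shows what ML fitting amounts to in practice. It minimizes the standard ML discrepancy function, F_ML = ln|Sigma| + tr(S Sigma^-1) - ln|S| - p with Sigma = Lambda Lambda' + Psi, over the loadings and (log) unique variances, and then forms the likelihood ratio statistic (N – 1)F_ML. The toy data, variable names, and optimizer choice are illustrative assumptions, not the lecture's own code.

```python
import numpy as np
from scipy.optimize import minimize

def ml_discrepancy(theta, S, p, m):
    """Standard ML discrepancy F_ML = ln|Sigma| + tr(S Sigma^-1) - ln|S| - p,
    where Sigma = Lambda Lambda' + Psi is the model-implied covariance matrix.
    Parameters are packed as [Lambda (p*m values), log unique variances (p values)]."""
    Lam = theta[:p * m].reshape(p, m)
    psi = np.exp(theta[p * m:])                  # log scale keeps unique variances positive
    Sigma = Lam @ Lam.T + np.diag(psi)
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    _, logdet_S = np.linalg.slogdet(S)
    return logdet_Sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_S - p

# Toy data: p = 6 manifest variables, one common factor (purely illustrative)
rng = np.random.default_rng(0)
N, p, m = 200, 6, 1
scores = rng.normal(size=(N, m))                 # factor scores
Lam_true = np.full((p, m), 0.7)
X = scores @ Lam_true.T + rng.normal(scale=0.7, size=(N, p))
S = np.cov(X, rowvar=False)

# Iterative minimization of the discrepancy function (a quasi-Newton optimizer here)
theta0 = np.concatenate([np.full(p * m, 0.5), np.log(np.full(p, 0.5))])
res = minimize(ml_discrepancy, theta0, args=(S, p, m), method="L-BFGS-B")

F_ML = res.fun                                   # minimized discrepancy
T = (N - 1) * F_ML                               # likelihood ratio test statistic
df = ((p - m) ** 2 - (p + m)) // 2               # degrees of freedom of the m-factor model
print(round(F_ML, 4), round(T, 2), df)
```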
Summary
•We have described three different methods for fitting the common factor model to sample data:
•Principal factors with prior communality estimates (non-iterative)
•Ordinary least squares (iterative principal factors)
•Maximum likelihood
•These are methods for fitting the model to data. Many more methods exist, but OLS and ML are the ones commonly available.

Evaluating model fit
•Trying to fit a model to data is all nice and neat, but what is it good for without a way to evaluate how well the model fits?
•Models have an ultimate purpose – explanation, description, and understanding (or prediction, but that's not very hot in psychology) through simplification.
•Models represent an attempt to simplify reality as much as we can so that we can understand it better. Simplify too much, and your model will not successfully represent reality. Simplify too little, and your model becomes unwieldy and begins to defy its purpose.

Evaluating model fit
•Models are never true, by definition (they're models!).
•Anyway, when evaluating model fit, we ask ourselves a number of questions:
•How appropriate is the model for the data?
•How well does the model reconstruct the observed data?
•…

Test of perfect fit
•[Equation slide – not captured in this text version.]

Test of perfect fit
•Note that the null hypothesis H0 plays a different role from the one it usually has in research. We actually don't want to reject this null hypothesis – rejecting the null means rejecting the model. Failing to reject the null means that the model is plausible.

Test of perfect fit
•[Equation slide – not captured in this text version.]

DF intermezzo
•[Slides on the model's degrees of freedom – not captured in this text version.]

Test of perfect fit
•If (N – 1)F_ML is significant, we reject the null hypothesis. In other words, we reject the model with m factors.
•If (N – 1)F_ML is NOT significant, we fail to reject the null hypothesis. In other words, we fail to reject the notion that the model with m factors fits perfectly in the population.
•Seems like a nice, clear, beautiful way to assess the fit of our model, don't you think?

Test of perfect fit
•It has one tiny problem. It doesn't make sense.
•The problem is – did we ever believe H0 to be true? We know it isn't.
•We are building a model – an approximation to reality. What sense does it make to think that an approximation to reality will be perfect? It's called an approximation for a reason! What's the point of testing something we know isn't true?

Test of perfect fit
•Moreover, the likelihood ratio test statistic is defined as (N – 1)F_ML, which follows the chi-square distribution if the model is correct.
•Even if the discrepancy (F_ML) is small, as N increases, so will the test statistic. A large enough N will inevitably result in a significant likelihood ratio test statistic, and thus in rejecting the model. With N large enough, even a well-fitting, parsimonious model will be rejected.
•Basically, the only thing the test tells us is whether the sample size is large enough to reject the null hypothesis.

Test of perfect fit
•We hope for a model that fits well enough, not for a model that fits perfectly (if it does fit perfectly, something is probably wrong – like the model being overly complex and not parsimonious).
•However, folks still use this test, and you will see it in almost EVERY paper that contains a factor analysis model. I myself still put it in papers, just because I don't want to deal with nagging reviewers.
•Be wiser than most FA users and don't pay attention to this test.

RMSEA
•[Slides with the definition of RMSEA – formula not captured in this text version.]

RMSEA
•Browne & Cudeck (1992) provide the following guidelines for interpreting RMSEA:
•< .05 – close fit
•.05 to .08 – good fit
•.08 to .10 – acceptable fit
•> .10 – unacceptable fit
•These numbers are guidelines; they should NOT be used as cutoffs (which is what everyone does, of course).

RMSEA
•Note that the formula for RMSEA contains the degrees of freedom: RMSEA "prefers" simpler models over more complex models.
•Generally speaking, if the value of the discrepancy function were the same for two models, one with m = 2 factors and one with m = 3 factors, RMSEA would favor the simpler model (m = 2 factors), as the sketch below illustrates.
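Supplementary sketch (not part of the original slides): the RMSEA formula slides were not captured here; the usual point estimate is RMSEA = sqrt(max(F_ML/df - 1/(N - 1), 0)). The short Python example below uses that formula, with made-up numbers, to illustrate the parsimony preference described on the previous slide.

```python
import math

def fa_df(p, m):
    """Degrees of freedom of the common factor model with p MVs and m factors."""
    return ((p - m) ** 2 - (p + m)) // 2

def rmsea(F_ml, df, N):
    """Standard RMSEA point estimate: sqrt(max(F_ML / df - 1 / (N - 1), 0))."""
    return math.sqrt(max(F_ml / df - 1.0 / (N - 1), 0.0))

# Same (hypothetical) discrepancy value for two models; the simpler model
# (m = 2, more degrees of freedom) receives the smaller, i.e. better, RMSEA.
N, p, F_ml = 300, 12, 0.20
print(rmsea(F_ml, fa_df(p, 2), N))   # m = 2
print(rmsea(F_ml, fa_df(p, 3), N))   # m = 3
```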
Confidence intervals for RMSEA
•One cool thing about RMSEA is that we know its theoretical distribution, and thus we can calculate confidence intervals around the point estimate.
•The confidence intervals are provided by some software (CEFA, R, …); a sketch of the computation appears at the end of these notes.
•Use the confidence intervals! They give you information that the point estimate does not (e.g., the uncertainty about the point estimate).
•The point estimate might be, say, .06, but you might get different CIs:
•(0.00; 0.14) – we don't know whether the model fits great or not so much
•(0.05; 0.07) – we know the model probably fits well

ML vs OLS
•We've learned about both ML and OLS. Remember, both are just different methods for fitting the same model to data.
•In other words, both are methods for estimating the model parameters. Typically, the estimates will not differ much.
•Both are optimal under their own definition of optimality.
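Supplementary sketch (not part of the original slides), referenced from the confidence-interval slide above: one standard way to compute an RMSEA confidence interval uses the fact that (N – 1)F_ML follows a noncentral chi-square distribution. The Python example below searches for the noncentrality values that place the observed statistic at the 95th and 5th percentiles and converts them to RMSEA units; the function name and the example numbers are illustrative assumptions, not any particular package's implementation.

```python
import numpy as np
from scipy.stats import chi2, ncx2
from scipy.optimize import brentq

def rmsea_ci(T, df, N, level=0.90):
    """Confidence interval for RMSEA from the test statistic T = (N - 1) * F_ML."""
    def lam_for(target_p):
        # noncentrality lam such that P(X <= T) = target_p for X ~ ncx2(df, lam)
        if chi2.cdf(T, df) < target_p:          # even lam = 0 is too large: bound is 0
            return 0.0
        return brentq(lambda lam: ncx2.cdf(T, df, lam) - target_p, 1e-9, 10 * T + 100)
    lam_lower = lam_for((1 + level) / 2)        # e.g. 0.95 -> lower bound of the interval
    lam_upper = lam_for((1 - level) / 2)        # e.g. 0.05 -> upper bound of the interval
    to_rmsea = lambda lam: np.sqrt(lam / (df * (N - 1)))
    return to_rmsea(lam_lower), to_rmsea(lam_upper)

# Two hypothetical analyses with the same RMSEA point estimate (.06) but different N:
# the larger sample gives a much narrower interval, as discussed on the slide above.
df, eps = 43, 0.06
for N in (150, 1500):
    T = (N - 1) * df * eps**2 + df              # statistic whose point estimate is exactly eps
    print(N, rmsea_ci(T, df, N))
```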