Comparing groups
Chi-square test
T-test
E0420 Week 3

Let's analyze!
• https://stats.idre.ucla.edu/other/mult-pkg/whatstat/
• Testing associations between:
  • 1 categorical independent variable (IV, predictor, exposure variable)
    • T-test can handle only 2 categories (if more, use ANOVA)
    • Chi-square is not limited by the number of categories
  • 1 dependent variable (DV, outcome)
    • Categorical: Chi-square test
    • Continuous: T-test
• Association
  • The value of one variable tells us something about the value of the other variable
  • The distribution of one variable varies according to the other variable

Let's analyze!
• Categorical IV:
  • Gender (males vs females), type of exposure (eating fish vs not eating fish), work or school group/class (scientists vs administrators), or experimental/treatment group
• DV:
  • Continuous: BMI, depression score, hours slept per night
  • Categorical: presence of a diagnosis (diabetes, pregnancy, depression), success or failure
• Comparing means (T-test) or proportions (Chi-square) of a DV across 2 groups/levels of an IV
• A word about categorization of continuous IVs
  • Mean/median/tercile split
  • Changes the original information, reduces the effect size, and can produce spurious effects

How does Chi-square work?
• 1. Constructing a contingency table (2x2 table, cross tabulation)
  • = number of cases in each category
• 2. Comparing the observed numbers in each category to the numbers expected in each category if there is no association (= null hypothesis)
  • = obtaining the χ2 statistic
• 3. Using the χ2 distribution for the specific degrees of freedom to determine how likely the obtained χ2 is if the null hypothesis is true
  • = determining statistical significance (p-value)
  • If the probability is 5% (α = .05) or less, the χ2 is statistically significant
  • In general, a large χ2 suggests rejecting the null hypothesis
Testing the association between 2 categorical variables
- The Chi-Square Test in Structural Equation Modeling: This tests the null hypothesis that the predicted model and observed data are equal.
Because you want your predictions to match the actual data as closely as possible, you do not want to reject this null hypothesis. In other words, a nonsignificant result for this test indicates good model fit.

1. Contingency table
Observed numbers
                 Feeling depressed
           Yes          No           Total
  Male     96 (40%)     144 (60%)    240
  Female   72 (24%)     228 (76%)    300
  Total    168 (31%)    372 (69%)    540
- For large contingency tables, the test is valid if fewer than 20% of the expected numbers are under 5 and none is less than 1

2. Obtaining the χ2 statistic
Observed numbers
                 Feeling depressed
           Yes          No           Total
  Male     96 (40%)     144 (60%)    240
  Female   72 (24%)     228 (76%)    300
  Total    168 (31%)    372 (69%)    540
Expected numbers
                 Feeling depressed
           Yes          No           Total
  Male     75           165          240
  Female   93           207          300
  Total    168 (31%)    372 (69%)    540
- Assumptions:
  - The levels (or categories) of the variables are mutually exclusive
  - Independence of the groups

2. Obtaining the χ2 statistic
• Expected number in a cell = row total × column total / N (e.g., 240 × 168 / 540 ≈ 75 for depressed males)
• χ2 = Σ (O − E)^2 / E, summed over all cells, where O is the observed and E the expected number in a cell
• In our case, χ2 = 15.97

3. Determining statistical significance
• Degrees of freedom (df) = (r − 1)(c − 1), where r is the number of rows and c is the number of columns
• The critical value is based on the selected alpha level and df
• In our case, df = (2 − 1)(2 − 1) = 1 and χ2 = 15.97, which exceeds the critical value of 3.84 for α = .05
• https://www.mun.ca/biology/scarr/4250_Chi-square_critical_values.html
- The number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary, i.e., the number of independent pieces of information that go into the estimate of a parameter
- In general, the degrees of freedom of an estimate of a parameter equal the number of independent scores that go into the estimate minus the number of parameters used as intermediate steps in the estimation of the parameter itself (for example, the sample variance has N − 1 degrees of freedom, since it is computed from N random scores minus the 1 parameter estimated as an intermediate step, the sample mean)

Chi-square distribution

Chi-square write-up
• A chi-square test was performed to examine the association between gender and
feeling depressed. The association between these variables was significant, χ2 (1, N = 540) = 15.97, p < .001. Men were more likely than women to feel depressed (40% vs 24%).

T-test use
• Purpose:
  • Comparing means of a continuous DV across 2 groups (binary IV)
  • Useful when we have "natural" IV groups (e.g., experimental condition)
• Types:
  • Independent T-test
  • Paired-samples T-test

T-test assumptions
• Assumption of independence
  • Applies to the independent t-test
• Assumption of normality
  • The DV should be approximately normally distributed
• Assumption of homogeneity of variance
  • The two independent samples are assumed to be drawn from populations with identical population variances
  • The variances of the DV should be equal in both IV groups

Normal/Gaussian distribution
• Also called the bell curve
• https://commons.wikimedia.org/wiki/File:Wechsler.svg
• https://commons.wikimedia.org/wiki/File:Distribution_of_Annual_Household_Income_in_the_United_States_2010.png

How does T-test work?
• 1. Obtaining the mean and SD of the DV in each IV group
• 2. Comparing the observed difference in the group means to the expected difference (= null hypothesis)
  • = obtaining the t statistic
• 3. Using the t distribution for the specific degrees of freedom to determine how likely the obtained t is if the null hypothesis is true
  • = determining statistical significance (p-value)
  • If the probability is 5% (α = .05) or less, the t is statistically significant
  • In general, a large t suggests rejecting the null hypothesis
- In a t-test, the expected difference is usually 0

1. Obtaining means and SDs
• Testing differences in depression scores on the Beck Depression Inventory (BDI) between males and females
• Females
  • Mean = 9
  • SD = 2
  • N = 50
• Males
  • Mean = 6
  • SD = 3
  • N = 40

2. Obtaining the t statistic
• Paired-samples T-test
  • t = mean of the paired differences / (SD of the paired differences / sqrt(n))
• Independent T-test
  • Equal variance assumed: t = (M1 − M2) / sqrt(sp^2 × (1/n1 + 1/n2)), where sp^2 = ((n1 − 1) × SD1^2 + (n2 − 1) × SD2^2) / (n1 + n2 − 2) is the pooled variance
  • Equal variance not assumed: t = (M1 − M2) / sqrt(SD1^2/n1 + SD2^2/n2)
- When calculating the magnitude of the effect (i.e., the difference between the samples), we need to take into account the measurement error in the samples
- If the observations in our sample fluctuate a lot around the mean, the difference might be due to this fluctuation (measurement error) rather than to a genuine effect

3. Determining statistical significance
• df = n − 1 for paired samples
• df = (n1 − 1) + (n2 − 1) for independent samples
• The critical value is based on the selected alpha level, the df, and whether we are testing a one-tailed or a two-tailed hypothesis
• In our case, df = (50 − 1) + (40 − 1) = 88 and t = 5.67 (equal variance assumed)

One-tailed or two-tailed hypothesis

T-test write-up
• The 25 participants who received the drug intervention (M = 480, SD = 34.5) demonstrated significantly better peak flow scores than the 28 participants in the control group (M = 425, SD = 31), t(51) = 2.1, p = .04.
• There was a significant increase in the volume of alcohol consumed in the week after the end of semester (M = 8.7, SD = 3.1) compared to the week before the end of semester (M = 3.2, SD = 1.5), t(52) = 4.8, p < .001.
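
Worked examples in code
- The two worked examples from these slides (the gender by depression table and the BDI comparison) can be recomputed with a short script. What follows is only a minimal sketch in plain Python, not part of the original slides: the function names chi_square_independence and independent_t are hypothetical, and the closed-form p-value via math.erfc is an assumption that holds only for df = 1.
- Computed from the unrounded expected numbers, the chi-square statistic comes out near 15.93, slightly below the 15.97 reported on the slides; the conclusion (p < .001) is the same.

```python
import math

# Observed counts from the slides.
# Rows: male, female. Columns: feeling depressed yes, no.
observed = [[96, 144],
            [72, 228]]


def chi_square_independence(table):
    """Return (chi2, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under the null hypothesis: row total * column total / N
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


def independent_t(m1, sd1, n1, m2, sd2, n2, equal_var=True):
    """Return (t, df) for an independent t-test computed from summary statistics."""
    v1, v2 = sd1 ** 2, sd2 ** 2
    if equal_var:
        # Pooled variance, df = (n1 - 1) + (n2 - 1)
        df = (n1 - 1) + (n2 - 1)
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
        se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    else:
        # Unequal variances (Welch), with Welch-Satterthwaite df
        a, b = v1 / n1, v2 / n2
        se = math.sqrt(a + b)
        df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return (m1 - m2) / se, df


chi2, chi2_df, expected = chi_square_independence(observed)
# Assumption: this shortcut for the upper-tail probability is valid for df = 1 only.
chi2_p = math.erfc(math.sqrt(chi2 / 2)) if chi2_df == 1 else None

# BDI example: females (M = 9, SD = 2, N = 50) vs males (M = 6, SD = 3, N = 40)
t_pooled, df_pooled = independent_t(9, 2, 50, 6, 3, 40)
t_welch, df_welch = independent_t(9, 2, 50, 6, 3, 40, equal_var=False)

print(f"chi2 = {chi2:.2f}, df = {chi2_df}, p = {chi2_p:.6f}")
print(f"t (equal variance assumed) = {t_pooled:.2f}, df = {df_pooled}")
print(f"t (equal variance not assumed) = {t_welch:.2f}, df = {df_welch:.1f}")
```

- With equal variance assumed, the script reproduces t = 5.67 with df = 88. Without that assumption, t is about 5.43 with roughly 65 degrees of freedom, so the choice of variant changes the numbers but not the conclusion here.
- The t-test p-value is left out on purpose: the Python standard library has no t distribution, so in practice one would compare t against a critical-value table or use a statistics package.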