9 One-way analysis of variance

The topic will be introduced by two examples. We may be interested in the effect of the time of day (early morning, forenoon, afternoon, evening, night) on the productivity of a manual worker (expressed as the total number of manufactured products). Or we would like to know whether the mother's education (university, secondary, elementary) has an effect on the length of the nursing period (expressed as a number of weeks). Generally, the analysis of variance answers the question whether a qualitative random variable (the factor A) has an effect on a quantitative random variable X. In the first example the factor "time of day" operates at 5 levels; in the second the factor "mother's education" operates at 3 levels.

To find out whether the factor A affects the random variable X, $n_i$ independent observations of the variable X are acquired at the i-th factor level. Thus if the factor A has r levels, then one random sample is assigned to each level and these r random samples are mutually independent:

Level of factor A | Random sample                | Distribution
level 1           | $X_{11}, \ldots, X_{1n_1}$   | $N(\mu_1, \sigma^2)$
level 2           | $X_{21}, \ldots, X_{2n_2}$   | $N(\mu_2, \sigma^2)$
...               | ...                          | ...
level r           | $X_{r1}, \ldots, X_{rn_r}$   | $N(\mu_r, \sigma^2)$

If the factor A does not affect the random variable X, then the expected values $\mu_1, \mu_2, \ldots, \mu_r$ should be the same. Thus at the significance level $\alpha$ we are testing the hypothesis

H0: $\mu_1 = \mu_2 = \ldots = \mu_r$ versus H1: at least two of the expected values are different.

This test can be thought of as an extension of the two-sample t-test, which raises the question whether H0 could not be tested by means of $\binom{r}{2}$ separate two-sample t-tests, each of them at the significance level $\alpha$. If at least one t-test rejected the equality of two expected values, then H0 about the equality of all r expected values could be rejected, and at the same time we would know which pair of expected values differs. But this procedure does not meet the condition that the probability of a type I error should be at most $\alpha$; the error would be substantially greater. This condition is satisfied by the ANOVA (analysis of variance) method, which tests the hypothesis about the equality of all r population means. If this hypothesis is rejected at the significance level $\alpha$, it is often desirable to know which factor level is responsible for the difference between the population means. There are many multiple comparison procedures that deal with this problem; we will illustrate them using Tukey's and Scheffé's multiple comparison methods.

First let us introduce the customary ANOVA notation.

Notation 9.1
- $n = \sum_{i=1}^{r} n_i$ ... the total number of observations
- $M_i = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}$ ... the sample mean of the i-th sample
- $M = \frac{1}{n} \sum_{i=1}^{r} \sum_{j=1}^{n_i} X_{ij}$ ... the total sample mean of all n observations
- $S_T = \sum_{i=1}^{r} \sum_{j=1}^{n_i} (X_{ij} - M)^2$ ... the total sum of squares (the statistic $S_T$ has $f_T = n - 1$ degrees of freedom)
- $S_A = \sum_{i=1}^{r} n_i (M_i - M)^2$ ... the factor (or between-groups) sum of squares (the statistic $S_A$ has $f_A = r - 1$ degrees of freedom)
- $S_E = \sum_{i=1}^{r} \sum_{j=1}^{n_i} (X_{ij} - M_i)^2$ ... the error (or within-groups) sum of squares (the statistic $S_E$ has $f_E = n - r$ degrees of freedom)

Remark 9.2
- The statistic $S_T$ is nothing but the numerator of the expression for the variance of all n observations of the random variable X. It characterizes the total variation of all n observations $X_{ij}$ around the total mean M.
- The statistic $S_A$ measures the proximity of the r sample means to each other and brings out the impact of the factor on the variability.
- The statistic $S_E$ is a pooled measure of the variation within the particular samples; this variation arises randomly and is not caused by the factor.

Each sum of squares has its degrees of freedom, given by the number of independent quantities within the considered sum. Considering the statistic $S_T$, the n observations correspond to the n summands of the sum $\sum_{i=1}^{r} \sum_{j=1}^{n_i} (X_{ij} - M)$. But these summands are not totally arbitrary: they have to satisfy the condition $\sum_{i=1}^{r} \sum_{j=1}^{n_i} (X_{ij} - M) = 0$. Thus $S_T$ has $f_T = n - 1$ degrees of freedom. Analogously, the statistic $S_A$ has r summands which have to satisfy the condition $\sum_{i=1}^{r} n_i (M_i - M) = 0$; thus $S_A$ has $f_A = r - 1$ degrees of freedom. The degrees of freedom of the statistic $S_E$ can be calculated from the formula $f_E = f_T - f_A$. (This formula follows from the relationship $S_T = S_A + S_E$, which is stated in the following theorem.) Thus $f_E = n - r$.
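The quantities of Notation 9.1 are easy to compute directly from their definitions. The following minimal sketch (Python; the data and variable names are made up for illustration and are not part of the original text) evaluates $M_i$, M and the three sums of squares, and numerically confirms the decomposition $S_T = S_A + S_E$ stated in Theorem 9.3 below.

```python
# Illustrative data only: r = 3 factor levels of unequal size.
samples = [
    [5.1, 4.9, 5.4, 5.0],   # observations at factor level 1
    [6.2, 5.8, 6.0],        # observations at factor level 2
    [4.7, 5.2, 4.9, 5.0],   # observations at factor level 3
]

r = len(samples)                          # number of factor levels
n = sum(len(s) for s in samples)          # total number of observations
M_i = [sum(s) / len(s) for s in samples]  # sample means M_i
M = sum(sum(s) for s in samples) / n      # total sample mean M

# Sums of squares from Notation 9.1
S_T = sum((x - M) ** 2 for s in samples for x in s)
S_A = sum(len(s) * (m - M) ** 2 for s, m in zip(samples, M_i))
S_E = sum((x - m) ** 2 for s, m in zip(samples, M_i) for x in s)

print(f"S_T = {S_T:.4f}, S_A + S_E = {S_A + S_E:.4f}")  # equal (Theorem 9.3)
print(f"f_T = {n - 1}, f_A = {r - 1}, f_E = {n - r}")
```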
Theorem 9.3 With the notation of 9.1 it holds that:
1.) $S_T = S_A + S_E$.
2.) $S^2 = \frac{S_E}{n-r}$, where $S^2$ is the weighted mean of the sample variances.
3.) $\frac{S_E}{\sigma^2} \sim \chi^2(n - r)$. [Hence $E\left(\frac{S_E}{n-r}\right) = \sigma^2$.]
4.) The variables $\frac{S_E}{\sigma^2}$ and $\frac{S_A}{\sigma^2}$ are independent.
In addition, if H0: $\mu_1 = \mu_2 = \ldots = \mu_r$ is true, then
5.) $\frac{S_A}{\sigma^2} \sim \chi^2(r - 1)$. [Hence $E\left(\frac{S_A}{r-1}\right) = \sigma^2$ if the assumption holds.]

9.4 Testing the hypothesis that the population means are equal

At the significance level $\alpha$ we are testing the hypothesis

H0: $\mu_1 = \mu_2 = \ldots = \mu_r$ versus H1: at least two of the expected values are different.

The random variables $X_{ij}$, $i = 1, \ldots, r$; $j = 1, \ldots, n_i$, follow the normal distribution $X_{ij} \sim N(\mu_i, \sigma^2)$. Thus $X_{ij}$ can be expressed as follows:

$X_{ij} = \mu_i + \varepsilon_{ij} = \mu + \alpha_i + \varepsilon_{ij}$,

where
- $\varepsilon_{ij}$ are independent random variables following the distribution $N(0, \sigma^2)$,
- $\mu$ is that part of the expected value E(X) which is common to all r random samples,
- $\alpha_i$ is the population effect of the i-th factor level.

(The parameters $\mu$ and $\alpha_i$ are unknown, and we require the validity of the equation $\sum_{i=1}^{r} n_i \alpha_i = 0$.) The null hypothesis is true exactly when the factor A has no effect at all on the random variable X; thus the null can be rewritten in the form

H0: $\alpha_1 = \alpha_2 = \ldots = \alpha_r = 0$.

Hence if H0 is true, then $X_{ij} = \mu + \varepsilon_{ij}$.

The statistic $M_i$ is a point estimator of the mean value $\mu_i$, the statistic M is a point estimator of the mean value $\mu$, and the statistic $M_i - M$ is a point estimator of the effect $\alpha_i = \mu_i - \mu$.

To make a decision about the null hypothesis, we compare the mean squares $S_A/f_A$ and $S_E/f_E$, whose expected values are the same under a true null hypothesis. The test statistic

$F_A = \frac{S_A/f_A}{S_E/f_E}$

follows (under a true null) Fisher's distribution $F(f_A, f_E)$. Obviously, large differences between the $M_i$ and M bring evidence against the null; thus the statistic $S_A$ is responsible for rejecting or not rejecting the null, while the statistic $S_E$ is instrumental in estimating the parameter $\sigma^2$. The null hypothesis is therefore rejected at the significance level $\alpha$ if

$F_A = \frac{S_A/f_A}{S_E/f_E} \geq F_{1-\alpha}(r - 1, n - r)$.

The results of the one-way analysis of variance are often displayed in a table similar to the following one.

Source of variability | sum of squares | degrees of freedom | mean square | test statistic
factor                | $S_A$          | $f_A = r - 1$      | $S_A/f_A$   | $F_A = \frac{S_A/f_A}{S_E/f_E}$
error                 | $S_E$          | $f_E = n - r$      | $S_E/f_E$   |
total                 | $S_T$          | $f_T = n - 1$      |             |

When the null hypothesis in the analysis of variance is rejected, we are interested in comparing all pairs of expected values to find at least one pair of different expected values which caused the rejection. These pairs may be identified by multiple comparison methods.
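As a sketch of the test procedure of 9.4, the following Python fragment computes $F_A$ for the same made-up data as above and compares it with the quantile $F_{1-\alpha}(r-1, n-r)$. Assuming SciPy is available, stats.f.ppf supplies the quantile, and stats.f_oneway serves as an independent cross-check (it reports a p-value rather than a critical value).

```python
from scipy import stats

# Same illustrative data as in the previous sketch.
samples = [[5.1, 4.9, 5.4, 5.0], [6.2, 5.8, 6.0], [4.7, 5.2, 4.9, 5.0]]
r = len(samples)
n = sum(len(s) for s in samples)
M = sum(sum(s) for s in samples) / n
S_A = sum(len(s) * (sum(s) / len(s) - M) ** 2 for s in samples)
S_E = sum((x - sum(s) / len(s)) ** 2 for s in samples for x in s)

alpha = 0.05
F_A = (S_A / (r - 1)) / (S_E / (n - r))        # test statistic F_A
F_crit = stats.f.ppf(1 - alpha, r - 1, n - r)  # F_{1-alpha}(r-1, n-r)
print(f"F_A = {F_A:.3f}, F_crit = {F_crit:.3f}, reject H0: {F_A >= F_crit}")

# Cross-check against the library routine (same statistic, p-value form):
F_lib, p = stats.f_oneway(*samples)
print(f"f_oneway: F = {F_lib:.3f}, p = {p:.4f}")
```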
9.5 Tukey's method

This method requires equal sample sizes, thus $p := n_1 = n_2 = \ldots = n_r$. At the significance level $\alpha$ we are testing the hypotheses

H0: $\mu_k = \mu_l$ versus H1: $\mu_k \neq \mu_l$.

The hypothesis about the equality $\mu_k = \mu_l$ is rejected at the significance level $\alpha$ when

$|M_k - M_l| \geq q_{1-\alpha}(r, n - r) \frac{S}{\sqrt{p}}$,

where the values $q_{1-\alpha}(r, n - r)$ are tabulated and are known as the studentized range quantiles. This procedure identifies all pairs k, l in which the expected values $\mu_k, \mu_l$ differ significantly at the significance level $\alpha$.

9.6 Scheffé's method

The advantages of this method are its simplicity and its applicability to groups of unequal size. It is known to be relatively insensitive to departures from normality and homogeneity of variance. At the significance level $\alpha$ we are testing the hypotheses

H0: $\mu_k = \mu_l$ versus H1: $\mu_k \neq \mu_l$.

The hypothesis about the equality $\mu_k = \mu_l$ is rejected at the significance level $\alpha$ when

$|M_k - M_l| \geq S \sqrt{(r-1) \left( \frac{1}{n_k} + \frac{1}{n_l} \right) F_{1-\alpha}(r - 1, n - r)}$.

The following situation may occur: the hypothesis H0: $\mu_1 = \mu_2 = \ldots = \mu_r$ is rejected, but the multiple comparison methods do not identify any pair with a significant difference of expected values. Then some more complicated combination of expected values, known as a contrast, is significantly different.

Here let us recall the ANOVA assumptions; tests of these assumptions will follow.

9.7 Assumptions of ANOVA

According to the established notation the random samples should have the following properties:
1.) Normality: $X_{i1}, \ldots, X_{in_i} \sim N(\mu_i, \sigma^2)$, $i = 1, \ldots, r$.
2.) Independence: the particular random samples are mutually independent.
3.) Homoskedasticity: the variances of the particular samples are equal, thus $\sigma^2 := \sigma_1^2 = \ldots = \sigma_r^2$.

The normality is either known or tests of normality may be used. (Generally, ANOVA is not too sensitive to departures from normality.) The independence should follow from the design of the experiment. Finally, the homoskedasticity should be verified, so we have to run a test that the r population variances are equal. The following tests may be used.

9.8 Levene's test

Levene's test is in fact one-way ANOVA formally applied to the variables $|X_{ij} - M_i|$. At the significance level $\alpha$ we are testing the hypothesis

H0: $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_r^2 := \sigma^2$ versus H1: at least two of the variances are different.

Let us denote $Z_{ij} = |X_{ij} - M_i|$. Then, following the ANOVA notation, we set
- $M_{Z_i} = \frac{1}{n_i} \sum_{j=1}^{n_i} Z_{ij}$
- $M_Z = \frac{1}{n} \sum_{i=1}^{r} \sum_{j=1}^{n_i} Z_{ij}$
- $S_{ZA} = \sum_{i=1}^{r} n_i (M_{Z_i} - M_Z)^2$
- $S_{ZE} = \sum_{i=1}^{r} \sum_{j=1}^{n_i} (Z_{ij} - M_{Z_i})^2$

If the null hypothesis about equal variances is true, then the statistic

$F_{ZA} = \frac{S_{ZA}/(r-1)}{S_{ZE}/(n-r)} \sim F(r - 1, n - r)$.

The null hypothesis about equal variances is rejected at the significance level $\alpha$ if $F_{ZA} \geq F_{1-\alpha}(r - 1, n - r)$.

9.9 Bartlett's test

If all r sample sizes are at least 7, Bartlett's test about the equality of variances may be used. Its disadvantage is a substantial sensitivity to violations of the normality assumption. If the null hypothesis about equal variances is true, then the statistic

$B = \frac{1}{C} \left[ (n - r) \ln S^2 - \sum_{i=1}^{r} (n_i - 1) \ln S_i^2 \right] \sim \chi^2(r - 1)$,

where
- $C = 1 + \frac{1}{3(r-1)} \left( \sum_{i=1}^{r} \frac{1}{n_i - 1} - \frac{1}{n-r} \right)$,
- $S_i^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} (X_{ij} - M_i)^2$,
- $S^2 = \frac{1}{n-r} \sum_{i=1}^{r} (n_i - 1) S_i^2 = \frac{S_E}{n-r}$.

The null hypothesis about equal variances is rejected at the significance level $\alpha$ if $B \geq \chi^2_{1-\alpha}(r - 1)$.

9.10 Summing up

An outline of the steps:
1.) Verify the assumptions of ANOVA; to verify the homoskedasticity use Levene's test or Bartlett's test.
2.) Using the ANOVA table, make a decision about the null hypothesis stating that the population means are equal.
3.) If the hypothesis about equal population means is rejected, the multiple comparison methods may be used. These methods aim to identify the pairs which caused the rejection; use Tukey's method or Scheffé's method.
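Assuming SciPy is available, both homoskedasticity tests of step 1.) can be run in one line each; note that scipy.stats.levene matches the definition in 9.8 only with center='mean' (its default, center='median', is the Brown-Forsythe variant). A small sketch with the same made-up data as in the earlier fragments:

```python
from scipy import stats

# Same illustrative data as above.
samples = [[5.1, 4.9, 5.4, 5.0], [6.2, 5.8, 6.0], [4.7, 5.2, 4.9, 5.0]]

# Levene's test of 9.8: deviations from the group means, hence center='mean'.
W, p_levene = stats.levene(*samples, center='mean')
print(f"Levene: statistic = {W:.3f}, p = {p_levene:.4f}")

# Bartlett's test of 9.9:
B, p_bartlett = stats.bartlett(*samples)
print(f"Bartlett: statistic = {B:.3f}, p = {p_bartlett:.4f}")

# Large p-values mean the hypothesis of equal variances is not rejected,
# so the ANOVA of step 2.) may proceed.
```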
Example 9.11 Considering four sorts of potatoes, we are interested in the total weight of potatoes from one bunch. The results in kg are presented in the following table:

sort | weights
I.   | 0.9  0.8  0.6  0.9
II.  | 1.3  1.0  1.3
III. | 1.3  1.5  1.6  1.1  1.5
IV.  | 1.1  1.2  1.0

Run a test at $\alpha = 5\%$ that the expected values of the bunch weights are not affected by the sort. If you reject the null, find out at $\alpha = 5\%$ which pairs of sorts are different.

Solution. We assume the data to be realizations of four mutually independent normally distributed random samples with equal population variance, thus $X_{i1}, \ldots, X_{in_i} \sim N(\mu_i, \sigma^2)$; $i = 1, 2, 3, 4$. We are testing the hypothesis that all four population means are equal:

H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$ versus H1: at least two of the expected values are different.

First we determine the realizations of the needed statistics:

$m_1 = 0.8$; $m_2 = 1.2$; $m_3 = 1.4$; $m_4 = 1.1$; $m = 1.14$;
$S_E = 0.3$; $S_A = 0.816$; $S_T = 1.116$; further $r = 4$; $n = 15$.

Source | sum of squares | degrees of freedom | mean square        | test statistic
factor | $S_A = 0.816$  | $f_A = r - 1 = 3$  | $S_A/3 = 0.272$    | $F_A = \frac{S_A/f_A}{S_E/f_E} = 9.97$
error  | $S_E = 0.3$    | $f_E = n - r = 11$ | $S_E/11 = 0.02727$ |
total  | $S_T = 1.116$  | $f_T = n - 1 = 14$ |                    |

The critical region is $W = \langle F_{0.95}(3, 11); \infty) = \langle 3.59; \infty)$. The realization of the test statistic satisfies $9.97 \in W$, thus H0 about equal population means is rejected at $\alpha = 0.05$. Using Scheffé's method we identify the pairs which caused the rejection at $\alpha = 0.05$.

Compared sorts | difference $|M_k - M_l|$ | right side of the inequality
I., II.        | 0.4                      | 0.41
I., III.       | 0.6*                     | 0.36
I., IV.        | 0.3                      | 0.41
II., III.      | 0.2                      | 0.40
II., IV.       | 0.1                      | 0.44
III., IV.      | 0.3                      | 0.40

At $\alpha = 0.05$ the sorts I. and III. are different. (The asterisk in the table identifies the pair in which the difference $|M_k - M_l|$ is significant.)
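For readers who want to replay Example 9.11, the following sketch (Python with SciPy; our code, not part of the original solution) reproduces the ANOVA table entries and the Scheffé comparison table from the data above.

```python
from math import sqrt
from scipy import stats

# Data of Example 9.11 (weights in kg).
sorts = {
    "I":   [0.9, 0.8, 0.6, 0.9],
    "II":  [1.3, 1.0, 1.3],
    "III": [1.3, 1.5, 1.6, 1.1, 1.5],
    "IV":  [1.1, 1.2, 1.0],
}
samples = list(sorts.values())
r = len(samples)
n = sum(len(s) for s in samples)
means = {k: sum(v) / len(v) for k, v in sorts.items()}
M = sum(sum(s) for s in samples) / n

S_A = sum(len(s) * (sum(s) / len(s) - M) ** 2 for s in samples)
S_E = sum((x - sum(s) / len(s)) ** 2 for s in samples for x in s)
F_A = (S_A / (r - 1)) / (S_E / (n - r))
F_crit = stats.f.ppf(0.95, r - 1, n - r)
print(f"F_A = {F_A:.2f}, F_0.95(3,11) = {F_crit:.2f}")  # 9.97 vs 3.59

# Scheffe's pairwise comparisons (Section 9.6); '*' marks a significant pair.
S = sqrt(S_E / (n - r))
names = list(sorts)
for a in range(r):
    for b in range(a + 1, r):
        k, l = names[a], names[b]
        diff = abs(means[k] - means[l])
        bound = S * sqrt((r - 1) * (1 / len(sorts[k]) + 1 / len(sorts[l])) * F_crit)
        flag = "*" if diff >= bound else ""
        print(f"{k} vs {l}: |M_k - M_l| = {diff:.2f}, bound = {bound:.2f} {flag}")
```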