STATISTICAL TESTS BASED ON RANKS

Jana Jurečková
Charles University, Prague

Contents

1 Basic concepts of hypotheses testing in nonparametric setup
  1.1 Introduction
  1.2 Principle of invariance in hypotheses testing
2 Properties of ranks and of order statistics
3 Locally most powerful rank tests
4 Selected two-sample rank tests
  4.1 Two-sample tests of location
  4.2 Two-sample rank tests of scale
  4.3 Rank tests of H_0 against general two-sample alternatives based on the empirical distribution functions
  4.4 Modification of tests in the presence of ties
5 Tests for comparison of the treatments based on paired observations
  5.1 Rank tests of H_1
  5.2 One-sample Wilcoxon test
  5.3 Sign test
6 Tests of independence in bivariate population
  6.1 Spearman test
  6.2 Quadrant test
7 Rank tests for comparison of several treatments
  7.1 One-way classification
  7.2 Kruskal-Wallis rank test
  7.3 Two-way classification (random blocks)

Chapter 1  Basic concepts of hypotheses testing in nonparametric setup

1.1 Introduction

Let X = (X_1, ..., X_n) be a random vector (vector of observations) and let H and K be two disjoint sets of probability distributions on (R^n, B_n). We say that X fulfills the hypothesis if the distribution of X belongs to H, and that X fulfills the alternative if its distribution belongs to K. We shall use the same symbols H and K to denote either the hypothesis/alternative or the corresponding set of distributions.
The hypothesis typically states homogeneity, symmetry, or independence, while the alternative means inhomogeneity, asymmetry, dependence, etc. The problem is to decide between the hypothesis and the alternative on the basis of the observations X_1, ..., X_n. Every rule which assigns exactly one of the decisions "accept H" or "reject H" to each point x = (x_1, ..., x_n) is called a (nonrandomized) test of the hypothesis H against the alternative K. Such a test partitions the sample space X into two complementary parts: the critical region (rejection region) A_K and the acceptance region A_H. The test rejects H if x ∈ A_K and accepts H if x ∈ A_H.

If we perform the test on the basis of observations x, then either our decision is correct or we make one of the following two kinds of errors: (1) we reject H even though it is correct (error of the first kind); (2) we accept H even though it is incorrect (error of the second kind). It is desirable to use the test with the smallest possible probabilities of both errors. If the true distribution P of X satisfies P ∈ H, then the probability of the error of the first kind equals P(X ∈ A_K), and sup_{P ∈ H} P(X ∈ A_K) is called the size of the test with critical region A_K. If the true distribution Q of X satisfies Q ∈ K, then the probability of the error of the second kind equals Q(X ∈ A_H) = 1 − Q(X ∈ A_K). The probability β(Q) = Q(X ∈ A_K), Q ∈ K, is called the power of the test against the alternative Q, and the function β(Q) : K → [0,1] is called the power function of the test. The desirable test maximizes the power function uniformly over the whole alternative and has a small probability of the error of the first kind for all distributions from the hypothesis. The testing theory, and the search for optimal tests, simplifies considerably when we supplement the family of tests by randomized tests.
A randomized test, observing x, rejects H with probability Φ(x) and accepts H with probability 1 − Φ(x), where 0 ≤ Φ(x) ≤ 1 for all x is the test function. The set of randomized tests coincides with the set {Φ(x) : 0 ≤ Φ ≤ 1}; hence it is convex and weakly compact.

If X has distribution P, then the test Φ rejects H with probability

    β_Φ(P) = E_P Φ(X) = ∫_X Φ(x) dP(x).

Intuitively, the best test should satisfy

    β_Φ(Q) = E_Q Φ(X) := max   for all Q ∈ K    (1.1)

and simultaneously

    β_Φ(P) = E_P Φ(X) := min   for all P ∈ H.    (1.2)

Because no test satisfies both conditions simultaneously, the optimal test is defined in the following way: select a small number α, 0 < α < 1, called the significance level, and among all tests satisfying

    β_Φ(P) ≤ α   for all P ∈ H    (1.3)

we look for the test satisfying (1.1). Such a test, if it exists, is called the uniformly most powerful test of size ≤ α, briefly the uniformly most powerful α-test of H against K.

The hypothesis [alternative] is called simple if H [K] is a one-point set; otherwise it is called composite. The test of a simple hypothesis against a simple alternative is given by the fundamental Neyman-Pearson lemma.

THEOREM 1.1.1 (Neyman-Pearson lemma). Let P and Q be two probability distributions with densities p and q with respect to some measure μ (e.g., μ = P + Q). Then, for testing the simple hypothesis H : {P} against the simple alternative K : {Q}, there exist a test Φ and a constant k such that

    E_P Φ(X) = α    (1.4)

and

    Φ(x) = 1 if q(x) > k·p(x),
    Φ(x) = 0 if q(x) < k·p(x).    (1.5)

This test is the most powerful α-test of H against K.

1.2 Principle of invariance in hypotheses testing

Let g be a one-to-one transformation of X onto X. We say that the problem of testing H against K is invariant with respect to g if g retains both H and K, i.e.

    X satisfies H iff gX satisfies H,
    X satisfies K iff gX satisfies K.
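Before passing to invariance, the Neyman-Pearson construction of Theorem 1.1.1 can be sketched numerically. The 4-point sample space below is a made-up illustration (not from the text): the critical region is filled in order of decreasing likelihood ratio q(x)/p(x), randomizing on the boundary point so that (1.4) holds exactly.

```python
from fractions import Fraction

def neyman_pearson(p, q, alpha):
    """Most powerful alpha-test of H: {P} vs K: {Q} on a finite sample space.
    Returns the test function phi(x): 1 = reject, 0 = accept, else randomize."""
    phi, size = {}, Fraction(0)
    # Fill the critical region in order of decreasing likelihood ratio q/p.
    for x in sorted(p, key=lambda x: q[x] / p[x], reverse=True):
        if size + p[x] <= alpha:      # the whole point fits: phi(x) = 1
            phi[x] = Fraction(1)
            size += p[x]
        else:                         # boundary point: randomize so E_P phi = alpha
            phi[x] = (alpha - size) / p[x]
            size = alpha
    return phi

# Made-up 4-point example with alpha = 1/4.
p = {0: Fraction(4, 10), 1: Fraction(3, 10), 2: Fraction(2, 10), 3: Fraction(1, 10)}
q = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10), 3: Fraction(4, 10)}
phi = neyman_pearson(p, q, Fraction(1, 4))
size = sum(phi[x] * p[x] for x in p)    # equals alpha, i.e. condition (1.4)
power = sum(phi[x] * q[x] for x in q)   # beta_phi(Q), the attained power
```

Here the point x = 3 (largest ratio q/p) enters the critical region fully, x = 2 is randomized, and the size is exactly α = 1/4.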
If the problem of testing H against K is invariant with respect to a group G of transformations of X onto X, then it is natural to consider only the tests which are themselves invariant, i.e. the tests Φ satisfying Φ(gx) = Φ(x) for all x ∈ X and all g ∈ G. We shall then look for the most powerful invariant α-test. In some cases there exists a statistic T(X), called the maximal invariant, such that every invariant test is a function of T(X).

DEFINITION 1.2.1 The statistic T = T(X) is called maximal invariant with respect to the group G of transformations provided T is invariant, i.e.

    T(gx) = T(x)   for all x ∈ X, g ∈ G,

and if T(x_1) = T(x_2), then there exists g ∈ G such that x_2 = g x_1.

The structure of invariant tests is characterized in the following theorem.

THEOREM 1.2.1 Let T(X) be the maximal invariant with respect to the group G of transformations. Then the test Φ is invariant with respect to G if and only if there exists a function h such that Φ(x) = h(T(x)) for all x ∈ X.

Proof. (i) If Φ(x) = h(T(x)) for all x, then Φ(gx) = h(T(gx)) = h(T(x)) = Φ(x) for all g ∈ G, and hence Φ is invariant. (ii) Let Φ be invariant and let T(x_1) = T(x_2). Then, by the definition, x_2 = g x_1 for some g ∈ G, and hence Φ(x_2) = Φ(x_1).

Examples of maximal invariants

1. Let x = (x_1, ..., x_n) and let G be the group of translations gx = (x_1 + c, ..., x_n + c), c ∈ R^1. Then the maximal invariant is, e.g., T(x) = (x_2 − x_1, ..., x_n − x_1).

2. Let G be the group of orthonormal transformations R^n → R^n. Then T(x) = Σ_{i=1}^n x_i^2 is the maximal invariant.

3. Let G be the set of n! permutations of x_1, ..., x_n. Then the vector of ordered components of x (vector of order statistics) T(x) = (x_{n:1} ≤ x_{n:2} ≤ ... ≤ x_{n:n}) is the maximal invariant with respect to G.

4. Let G be the set of transformations x'_i = f(x_i), i = 1, ..., n, such that f : R^1 → R^1 is a continuous and strictly increasing function. Consider only the points of the sample space X with different components. Let R_i be the rank of x_i among x_1, ..., x_n, i.e.
R_i = Σ_{j=1}^n I[x_j ≤ x_i], i = 1, ..., n. Then T(x) = (R_1, ..., R_n) is the maximal invariant for G. Indeed, a continuous and increasing function does not change the ranks of the components of x, i.e. T is invariant with respect to G. On the other hand, let two different vectors x and x' have the same vector of ranks R_1, ..., R_n. Put f(x_i) = x'_i, i = 1, ..., n, let f be linear on the intervals [x_{n:1}, x_{n:2}], ..., [x_{n:n−1}, x_{n:n}], and define f on the rest of the real line so that it is strictly increasing. Such an f always exists, hence T is the maximal invariant.

Chapter 2  Properties of ranks and of order statistics

Let X = (X_1, ..., X_n) be the vector of observations; denote by X_{n:1} ≤ X_{n:2} ≤ ... ≤ X_{n:n} the components of X ordered according to increasing magnitude. The vector X_(.) = (X_{n:1}, ..., X_{n:n}) is called the vector of order statistics and X_{n:i} is called the i-th order statistic. Assume that the components of X are different and define the rank of X_i as R_i = Σ_{j=1}^n I[X_j ≤ X_i]. Then the vector R of ranks of X takes on values in the set R of n! permutations (r_1, ..., r_n) of (1, ..., n). The first property of X_(.) and R is described in the following proposition.

Proposition 1. The pair (X_(.), R) is a sufficient statistic for any family of absolutely continuous probability distributions of X.

Proof. If X_(.) = x_(.) and R = r are prescribed, then

    P(X ∈ A | X_(.) = x_(.), R = r) = P((x_{n:r_1}, ..., x_{n:r_n}) ∈ A | X_(.) = x_(.), R = r) = 0 or 1

depending on whether (x_{n:r_1}, ..., x_{n:r_n}) is an element of A or not; this probability does not depend on the original distribution of X, and this is precisely the property defining sufficiency.

DEFINITION 2.0.1 We say that the random vector X satisfies the hypothesis of randomness H_0 if it has a probability distribution with density of the form

    p(x) = Π_{i=1}^n f(x_i),   x ∈ R^n,

where f is an arbitrary one-dimensional density. In other words, X satisfies the hypothesis of randomness provided its components are independent identically distributed (i.i.d.)
random variables with an absolutely continuous distribution.

The following theorem gives the general form of the distributions of X_(.) and of R.

THEOREM 2.0.1 Let X have the density p_n(x_1, ..., x_n). (i) The vector X_(.) of order statistics has the distribution with the density

    p̄(x_{n:1}, ..., x_{n:n}) = Σ_{r ∈ R} p_n(x_{n:r_1}, ..., x_{n:r_n})  if x_{n:1} < ... < x_{n:n},
    p̄(x_{n:1}, ..., x_{n:n}) = 0  otherwise.    (2.1)

(ii) The conditional distribution of R given X_(.) = x_(.) has the form

    P(R = r | X_(.) = x_(.)) = p_n(x_{n:r_1}, ..., x_{n:r_n}) / Σ_{r' ∈ R} p_n(x_{n:r'_1}, ..., x_{n:r'_n})    (2.2)

for any r ∈ R and any x_{n:1} < ... < x_{n:n}.

The distributions of X_(.) and R simplify considerably under the hypothesis H_0; this is described in the following theorem.

THEOREM 2.0.2 If X satisfies the hypothesis of randomness H_0, then X_(.) and R are independent, the vector of ranks R has the uniform discrete distribution

    P(R = r) = 1/n!,   r ∈ R,    (2.3)

and the distribution of X_(.) has the density

    p̄(x_{n:1}, ..., x_{n:n}) = n! p(x_{n:1}, ..., x_{n:n})  if x_{n:1} < ... < x_{n:n},
    p̄(x_{n:1}, ..., x_{n:n}) = 0  otherwise.

Finally, the following theorem summarizes some properties of the marginal distributions of R and X_(.) under H_0.

THEOREM 2.0.3 Let X satisfy the hypothesis H_0. Then

    (i)   P(R_i = j) = 1/n for all i, j = 1, ..., n;
    (ii)  P(R_i = k, R_j = m) = 1/(n(n−1)) for 1 ≤ i, j, k, m ≤ n, i ≠ j, k ≠ m;
    (iii) E R_i = (n+1)/2, i = 1, ..., n;
    (iv)  var R_i = (n²−1)/12, i = 1, ..., n;
    (v)   cov(R_i, R_j) = −(n+1)/12, i ≠ j.

Chapter 3  Locally most powerful rank tests

By the Neyman-Pearson lemma, the most powerful rank α-test of H_0 against a simple alternative Q rejects when n! Q(R = r) > k_α and randomizes when n! Q(R = r) = k_α, where k_α and the randomization constant γ are determined by

    #{r : n! Q(R = r) > k_α} + γ #{r : n! Q(R = r) = k_α} = n! α,   0 < α < 1.

However, many composite alternatives of practical interest are too rich, and uniformly most powerful rank tests against such alternatives do not exist. Then we may make an excursion to local tests and look for a rank test most powerful locally in a neighborhood of the hypothesis.

DEFINITION 3.0.1 Let d(Q) be a measure of the distance of the alternative Q ∈ K from the hypothesis H. The α-test Φ_0 is called locally most powerful in the class M
of α-tests of H against K if, for any other test Φ ∈ M, there exists ε > 0 such that β_{Φ_0}(Q) ≥ β_Φ(Q) for every Q satisfying 0 < d(Q) < ε.

We shall illustrate the structure of the locally most powerful rank tests of H_0 against a class of alternatives covering shift and regression in location and scale.

THEOREM 3.0.2 Let A = {g(x, θ) : θ ∈ J} be a class of densities such that

    (a) J ⊂ R^1 is an open interval containing 0;
    (b) g(x, θ) is absolutely continuous in θ for almost all x;
    (c) for almost all x there exists the limit

        ġ(x, 0) = lim_{θ→0} [g(x, θ) − g(x, 0)] / θ

    and

        lim_{θ→0} ∫_{−∞}^{∞} |[g(x, θ) − g(x, 0)] / θ| dx = ∫_{−∞}^{∞} |ġ(x, 0)| dx.

Consider the alternative K = {q_Δ : Δ > 0}, where

    q_Δ(x_1, ..., x_n) = Π_{i=1}^n g(x_i, Δ c_i),   c_1, ..., c_n given numbers.

Then the test with the critical region

    Σ_{i=1}^n c_i a_n(R_i, g) ≥ k

is the locally most powerful rank test of H_0 against K at significance level α = P(Σ_{i=1}^n c_i a_n(R_i, g) ≥ k), where P is any distribution satisfying H_0, the scores are

    a_n(i, g) = E[ ġ(X_{n:i}, 0) / g(X_{n:i}, 0) ],   i = 1, ..., n,

and X_{n:1}, ..., X_{n:n} are the order statistics corresponding to a random sample of size n from the population with density g(x, 0).

Let us apply the theorem to find the locally most powerful rank tests of H_0 against some standard alternatives.

I. We start with the alternative of shift in location: we test H_0 on the random vector (X_1, ..., X_N) against the alternative K_1 = {q_Δ : Δ > 0}, where

    q_Δ(x_1, ..., x_N) = Π_{i=1}^m f(x_i) · Π_{i=m+1}^N f(x_i − Δ),    (3.1)

and f is a fixed absolutely continuous density such that

    ∫_{−∞}^{∞} |f'(x)| dx < ∞.    (3.2)

Then the family of densities A with g(x, θ) = f(x − θ) and J = R^1 fulfills the conditions (a)-(c) of Theorem 3.0.2, and the locally most powerful rank α-test of H_0 against K_1 has the critical region

    Σ_{i=m+1}^N a_N(R_i, f) ≥ k,    (3.3)

where k satisfies the condition P(Σ_{i=m+1}^N a_N(R_i, f) ≥ k) = α, P ∈ H_0, the scores are

    a_N(i, f) = E[ −f'(X_{N:i}) / f(X_{N:i}) ],   i = 1, ..., N,    (3.4)

and X_{N:1} < ... < X_{N:N} are the order statistics corresponding to a sample of size N from the distribution with density f.
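As a quick illustration (a sketch, not from the text): for the logistic density f(x) = e^{−x}/(1 + e^{−x})², a direct computation gives −f'(x)/f(x) = 2F(x) − 1, so the scores (3.4) become a_N(i, f) = E[2F(X_{N:i}) − 1] = 2 E U_{N:i} − 1 = 2i/(N+1) − 1, linear in i; this is the score structure behind the Wilcoxon test met below.

```python
import math

def logistic_F(x):
    """Logistic distribution function F(x) = 1/(1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))

def minus_fprime_over_f(x, h=1e-6):
    """-f'(x)/f(x) for the logistic density f(x) = e^{-x}/(1+e^{-x})^2,
    obtained by numerical differentiation of log f(x) = -x - 2 log(1+e^{-x})."""
    logf = lambda t: -t - 2.0 * math.log1p(math.exp(-t))
    return -(logf(x + h) - logf(x - h)) / (2.0 * h)

def wilcoxon_scores(N):
    """Exact logistic scores a_N(i, f) = 2i/(N+1) - 1, using E U_{N:i} = i/(N+1)
    for uniform order statistics."""
    return [2.0 * i / (N + 1) - 1.0 for i in range(1, N + 1)]
```

The numerical check `minus_fprime_over_f(x)` agrees with the closed form 2F(x) − 1, and the scores sum to zero, as centered scores should.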
The scores (3.4) may also be written as

    a_N(i, f) = E φ(U_{N:i}, f),   i = 1, ..., N,    (3.5)

where

    φ(u, f) = −f'(F^{−1}(u)) / f(F^{−1}(u)),   0 < u < 1,    (3.6)

F is the distribution function with density f, and U_{N:1} < ... < U_{N:N} are the order statistics corresponding to a sample of size N from the uniform R(0,1) distribution. In practice the scores are often replaced by the approximate scores

    a_N(i, f) ≈ φ(i/(N+1), f),   i = 1, ..., N.    (3.7)

II. Consider now the alternative of shift in scale: we test H_0 against K_2 = {q_Δ : Δ > 0}, where

    q_Δ(x_1, ..., x_N) = Π_{i=1}^m f(x_i − μ) · Π_{i=m+1}^N e^{−Δ} f((x_i − μ) e^{−Δ}),   Δ > 0,    (3.8)

f is an absolutely continuous density satisfying ∫ |x f'(x)| dx < ∞, and μ is a nuisance parameter. Then the family of densities A with g(x, θ) = e^{−θ} f((x − μ) e^{−θ}), J = R^1, fulfills the conditions (a)-(c) of Theorem 3.0.2, and the locally most powerful test has the critical region

    Σ_{i=m+1}^N a_{1N}(R_i, f) ≥ k,    (3.9)

where k is determined by the condition P(Σ_{i=m+1}^N a_{1N}(R_i, f) ≥ k) = α, P ∈ H_0, and the scores have the form

    a_{1N}(i, f) = E[ −1 − X_{N:i} f'(X_{N:i}) / f(X_{N:i}) ] = E φ_1(U_{N:i}, f),   i = 1, ..., N,    (3.10)

where

    φ_1(u, f) = −1 − F^{−1}(u) f'(F^{−1}(u)) / f(F^{−1}(u)),   0 < u < 1.

III. Finally, consider the regression alternative K_3 = {q_Δ : Δ > 0}, where

    q_Δ(x_1, ..., x_N) = Π_{i=1}^N f(x_i − Δ c_i)

with a fixed absolutely continuous density f satisfying (3.2) and given constants c_1, ..., c_N, Σ_{i=1}^N c_i² > 0. Then the locally most powerful test has the critical region

    Σ_{i=1}^N c_i a_N(R_i, f) ≥ k    (3.11)

with the scores (3.5) and with k determined by the condition P(Σ_{i=1}^N c_i a_N(R_i, f) ≥ k) = α.

Chapter 4  Selected two-sample rank tests

Consider two random samples (X_1, ..., X_m) and (Y_1, ..., Y_n) with respective distribution functions F and G. For brevity we also write (X_1, ..., X_m, Y_1, ..., Y_n) = (Z_1, ..., Z_N) with N = m + n. The hypothesis of randomness for the vector (Z_1, ..., Z_N) can in this special case be reformulated as H_0 : F = G.

4.1 Two-sample tests of location

Consider first testing H_0 against the alternative

    K_1 : G(x) ≤ F(x) for all x ∈ R^1,  G(x) ≠ F(x) for at least one x.

K_1 is a one-sided alternative stating that the random variable Y is stochastically larger than X. The problem of testing H_0 against K_1 is invariant with respect to the group G of transformations z'_i = g(z_i), i = 1, ..., N, where g is any continuous strictly increasing function. As we have seen before, the vector of ranks R_1, ..., R_N of Z_1, ..., Z_N is the maximal invariant with respect to G. Then, by Theorem 1.2.1, the class of invariant tests coincides with the class of rank tests; hence we shall restrict our considerations to the rank tests.
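The invariance that justifies restricting attention to rank tests is easy to check numerically: any continuous strictly increasing g leaves the rank vector of the pooled sample unchanged. A minimal sketch with hypothetical data:

```python
import math

def ranks(z):
    """R_i = #{j : z_j <= z_i}, the rank of z_i among z_1, ..., z_N
    (components assumed distinct)."""
    return [sum(zj <= zi for zj in z) for zi in z]

# Hypothetical pooled two-sample data (X-sample first, then Y-sample).
z = [1.3, -0.2, 2.1, 0.7, 3.5, 0.9]
r = ranks(z)
# A continuous strictly increasing transformation, e.g. g(z) = exp(z),
# preserves the ordering and hence the ranks: the rank vector is invariant.
r_transformed = ranks([math.exp(zi) for zi in z])
```

Since `r == r_transformed`, any statistic built from the ranks is unchanged by g, which is exactly why the rank vector is the maximal invariant here.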
However, we can reduce the class of tests still further by the following considerations. Because both (X_1, ..., X_m) and (Y_1, ..., Y_n) are random samples, the distribution of the vector of ranks (R_1, ..., R_m, R_{m+1}, ..., R_{m+n}) is symmetric in the first m and in the last n arguments under all pairs of distributions F and G. Hence a sufficient statistic for (R_1, ..., R_m, R_{m+1}, ..., R_{m+n}) is the pair of vectors of ordered ranks of the two samples; moreover, each of these two vectors uniquely determines the other, so that every rank test may be based on the ordered ranks of the second sample alone. A particularly important special case of K_1 is the shift alternative G(x) = F(x − Δ) with some Δ > 0.

If we know that F is normal, we use the two-sample t-test. Generally, the test statistic of any rank test is a function of the ordered ranks of the second sample. Theorem 3.0.2 and example I above show that the locally most powerful test generally has a critical region of the form

    Σ_{i=m+1}^N a_N(R_i) ≥ k;

hence the test criterion really depends only on the ordered ranks of the Y's. The scores a_N(i) = E φ(U_{N:i}) (which may be approximated by a_N(i) = φ(i/(N+1))), i = 1, ..., N, are generated by an appropriate score function φ : (0,1) → R^1. We shall now describe three basic tests of this type, the ones most often used in practice. Each of them is locally most powerful for some special F, but the probabilities of the error of the first kind are the same for all F ∈ H_0.

(i) Wilcoxon (Mann-Whitney) test. The Wilcoxon test has the critical region

    W = Σ_{i=m+1}^N R_i ≥ k_α,    (4.2)

i.e., the test function

    Φ(z) = 1 if W > k_α,  γ if W = k_α,  0 if W < k_α,

where k_α and γ are determined so that P_{H_0}(W > k_α) + γ P_{H_0}(W = k_α) = α, 0 < α < 1 (typically α = 0.05 or α = 0.01). This test is locally most powerful against K_1 with F logistic, i.e. with the density f(x) = e^{−x}/(1 + e^{−x})², x ∈ R.

For small m and n, the critical value k_α can be determined directly: for each combination s_1 < ... < s_n of the numbers 1, ..., N we calculate Σ_{i=1}^n s_i and order these (N choose n) values in increasing magnitude.
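The enumeration just described can be sketched directly: every n-subset of {1, ..., N} of Y-ranks is equally likely under H_0, so listing all subset sums gives the exact null distribution of W (a small-sample sketch; for large N one switches to tables or the normal approximation discussed below).

```python
from itertools import combinations
from fractions import Fraction

def wilcoxon_null_distribution(m, n):
    """Exact H0 distribution of W = sum of the Y-sample ranks: each of the
    binomial(N, n) subsets {s_1 < ... < s_n} of {1, ..., N} is equally likely."""
    N = m + n
    counts = {}
    for subset in combinations(range(1, N + 1), n):
        w = sum(subset)
        counts[w] = counts.get(w, 0) + 1
    total = sum(counts.values())                     # = binomial(N, n)
    return {w: Fraction(c, total) for w, c in sorted(counts.items())}

dist = wilcoxon_null_distribution(4, 3)              # m = 4, n = 3, N = 7
mean_W = sum(w * p for w, p in dist.items())         # should equal n(N+1)/2 = 12
tail = sum(p for w, p in dist.items() if w >= 15)    # P(W >= 15), a tail probability
```

From such a table one reads off the critical value k_α as the smallest w whose upper-tail probability does not exceed α, randomizing at the boundary as in (4.2).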
The critical region is then formed by the M_N largest sums, where M_N = α(N choose n); if no integer M_N satisfies this condition, we take the largest integer M_N less than α(N choose n) and randomize for the combination leading to the (M_N + 1)-st largest value. This systematic procedure, though precise, becomes difficult for large N, where we should use tables of critical values.

There exist various tables for the Wilcoxon test, organized in various ways. Many tables provide the critical values of the Mann-Whitney statistic

    U_N = Σ_{i=m+1}^N Σ_{j=1}^m I[Z_i > Z_j];

one easily sees that U_N and W_N are in the one-to-one relation W_N = U_N + n(n+1)/2.

For an application of the Wilcoxon test we may alternatively use the dual form of the Wilcoxon statistic: let Z_{N:1} < ... < Z_{N:N} be the order statistics of the pooled sample and define V_1, ..., V_N in the following way: V_i = 0 if Z_{N:i} belongs to the first sample and V_i = 1 if Z_{N:i} belongs to the second sample. Then W_N = Σ_{i=1}^N i V_i.

For large m and n, where there are no tables, we use the normal approximation of W_N: if m, n → ∞, then under H_0 the statistic W_N is asymptotically normal in the sense that

    lim P_{H_0}( (W_N − E W_N)/√(var W_N) ≤ x ) = Φ(x),   x ∈ R^1,    (4.3)

where Φ is the standard normal distribution function. To be able to use the normal approximation (4.3), we must know the expectation and variance of W_N under H_0. The following theorem gives the expectation and variance of a more general linear rank statistic, covering the Wilcoxon as well as other rank tests.

THEOREM 4.1.1 Let the random vector (R_1, ..., R_N) have the discrete uniform distribution on the set R of all permutations of the numbers 1, ..., N, i.e. P(R = r) = 1/N!, r ∈ R; let c_1, ..., c_N and a_1 = a(1), ..., a_N = a(N) be arbitrary constants. Then the expectation and variance of the linear rank statistic

    S_N = Σ_{i=1}^N c_i a(R_i)    (4.4)

are

    E S_N = (1/N) Σ_{i=1}^N c_i Σ_{j=1}^N a_j = N c̄ ā,    (4.5)

    var S_N = (1/(N−1)) Σ_{i=1}^N (c_i − c̄)² Σ_{j=1}^N (a_j − ā)²,    (4.6)

where c̄ = (1/N) Σ_{i=1}^N c_i and ā = (1/N) Σ_{i=1}^N a_i.

Proof.
Actually,

    E S_N = Σ_{i=1}^N c_i E a(R_i) = Σ_{i=1}^N c_i · (1/N) Σ_{j=1}^N a_j,

    var S_N = Σ_{i=1}^N c_i² var a(R_i) + Σ_{i≠j} c_i c_j cov(a(R_i), a(R_j))
            = var a(R_1) Σ_{i=1}^N c_i² + cov(a(R_1), a(R_2)) Σ_{i≠j} c_i c_j.

Theorem 2.0.3 further gives

    var a(R_1) = (1/N) Σ_{i=1}^N (a_i − ā)²,
    cov(a(R_1), a(R_2)) = −(1/(N(N−1))) Σ_{j=1}^N (a_j − ā)²,

hence, using Σ_{i≠j} c_i c_j = (Σ_{i=1}^N c_i)² − Σ_{i=1}^N c_i² = N² c̄² − Σ_{i=1}^N c_i²,

    var S_N = (1/(N−1)) Σ_{i=1}^N (c_i − c̄)² Σ_{j=1}^N (a_j − ā)².

As a special case we get the parameters of the Wilcoxon statistic under H_0:

    E W_N = n(N+1)/2,   var W_N = mn(N+1)/12.    (4.7)

The tables of critical values profit from the fact that the distribution of W_N under H_0 is symmetric around E W_N. If we test H_0 against the left-sided alternative (Δ < 0, the second sample shifted to the left with respect to the first one), we reject H_0 if W_N ≤ 2 E W_N − k_α. A sufficient condition for the symmetry of a linear rank statistic, covering the Wilcoxon case, follows from the next theorem.

THEOREM 4.1.2 Let (R_1, ..., R_N) be a random vector with discrete uniform distribution on the set R of permutations of 1, ..., N. Let c_1, ..., c_N and a_1 = a(1), ..., a_N = a(N) be constants such that either

    a_i + a_{N−i+1} = K = const,   i = 1, ..., N,    (4.8)

or

    c_i + c_{N−i+1} = K = const,   i = 1, ..., N.    (4.9)

Then the distribution of the statistic S_N = Σ_{i=1}^N c_i a(R_i) is symmetric around E S_N, i.e. S_N − E S_N and −(S_N − E S_N) have the same distribution.

Proof. Under (4.8), 2Nā = Σ_{i=1}^N a_i + Σ_{i=1}^N a_{N−i+1} = NK, hence a_i + a_{N−i+1} = 2ā, i = 1, ..., N. Because (N − R_1 + 1, ..., N − R_N + 1) and (R_1, ..., R_N) have the same distribution, S'_N = Σ_{i=1}^N c_i a(N − R_i + 1) has the same distribution as S_N, and

    S'_N = 2ā Σ_{i=1}^N c_i − S_N = 2 E S_N − S_N,

hence

    P(S_N − E S_N = s) = P(S'_N − E S'_N = s) = P(E S_N − S_N = s)

holds for any s. Analogously, under (4.9), c_i + c_{N−i+1} = 2c̄, i = 1, ..., N, and (R_N, ..., R_1) has the same distribution as (R_1, ..., R_N).
Hence S'_N = Σ_{i=1}^N c_{N−i+1} a(R_i) = Σ_{i=1}^N c_i a(R_{N−i+1}) has the same distribution as S_N, and S'_N = 2c̄ Σ_{i=1}^N a_i − S_N = 2 E S_N − S_N, so again S_N − E S_N and E S_N − S_N have the same distribution.

(ii) The van der Waerden test. This test is based on the statistic

    Σ_{i=m+1}^N Φ^{−1}( R_i/(N+1) ),

where Φ is the standard normal distribution function. The van der Waerden test is convenient for testing H_0 against K_1 if the distribution function F has approximately normal tails. In fact, the test is asymptotically optimal against normal alternatives, and its asymptotic relative efficiency (Pitman efficiency) with respect to the t-test equals 1 under normal F and is ≥ 1 under all nonnormal F. For these good properties the test can be recommended; for large m, n, if tables are not at our disposal, we may use critical values based on the normal approximation N(E S_N, var S_N), with the moments given by Theorem 4.1.1.

4.2 Two-sample rank tests of scale

The locally most powerful rank test against the scale alternative K_4 [the two-sample version of the alternative (3.8)] is given by (3.9) and (3.10). However, instead of the tests optimal against some special shapes of F, whose scores have a complicated form, we shall rather describe tests with simple scores which are really used in practice. Notice, by (3.10), that the score function φ_1 for the scale alternatives is no longer monotone but U-shaped, and the test statistics are of the form

    S_N = Σ_{i=m+1}^N φ_1( R_i/(N+1) ).    (4.11)

(i) The Siegel-Tukey test. This test is based on a reordering of the observations, leading to new ranks and to a test statistic whose distribution under H_0 is the same as that of the Wilcoxon statistic. Let Z_{N:1} < Z_{N:2} < ... < Z_{N:N} be the order statistics corresponding to the pooled sample of N = m + n variables. Reorder this vector in the following way:

    Z_{N:1}, Z_{N:N}, Z_{N:N−1}, Z_{N:2}, Z_{N:3}, Z_{N:N−2}, Z_{N:N−3}, Z_{N:4}, Z_{N:5}, ...    (4.12)

and denote by R̃_i the new rank of Z_i with respect to the order (4.12), i = 1, ..., N. The critical region of the Siegel-Tukey test has the form

    S'_N = Σ_{i=m+1}^N R̃_i ≤ k'_α,

where k'_α is determined so that P_{H_0}(S'_N < k'_α) + γ P_{H_0}(S'_N = k'_α) = α.
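The reordering (4.12) alternates from the two extremes inward: one value from the bottom, then two from the top, two from the bottom, and so on, so that extreme observations receive small new ranks. A sketch of the rank assignment (an illustration, not the text's notation):

```python
def siegel_tukey_order(N):
    """Indices i of the order statistics Z_{N:i} listed in the order (4.12):
    1, N, N-1, 2, 3, N-2, N-3, 4, 5, ...  The k-th entry is the index whose
    order statistic receives the new rank k+1."""
    order = []
    low, high = 1, N
    take_low, count = True, 1          # a single low value first,
    while low <= high:
        for _ in range(count):
            if low > high:
                break
            if take_low:
                order.append(low); low += 1
            else:
                order.append(high); high -= 1
        take_low = not take_low
        count = 2                      # then alternate in pairs
    return order

def siegel_tukey_ranks(z):
    """New ranks R~_i of z_1, ..., z_N with respect to the order (4.12)
    (components assumed distinct)."""
    N = len(z)
    idx_sorted = sorted(range(N), key=lambda i: z[i])   # positions of Z_{N:1}, ...
    new_rank = [0] * N
    for k, i in enumerate(siegel_tukey_order(N)):       # Z_{N:i} gets rank k+1
        new_rank[idx_sorted[i - 1]] = k + 1
    return new_rank
```

Under a scale difference the Y-observations tend to the extremes, hence to small new ranks, which is why the test rejects for small S'_N.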
The distribution of S'_N under H_0 coincides with the distribution of the Wilcoxon statistic, hence we may use the tables of the Wilcoxon test. However, unlike the Wilcoxon test, the Siegel-Tukey test has a rather low Pitman efficiency with respect to the F-test under normal F, namely 6/π² ≈ 0.608. On the other hand, we should not use the two-sample F-test of scale unless we are sure of normality, because that test is very sensitive to deviations from the normal distribution.

(ii) Quartile test. Put in (4.11)

    φ_1(u) = 1 if 0 < u < 0.25 or 0.75 < u < 1,
    φ_1(u) = 0.5 if u = 0.25 or u = 0.75,
    φ_1(u) = 0 if 0.25 < u < 0.75,

and we get the test statistic

    S_N = Σ_{i=m+1}^N φ_1( R_i/(N+1) ),    (4.13)

and reject H_0 for large values of S_N. Unless N + 1 is divisible by 4, the value of S_N is the number of observations of the Y-sample which belong either to the first or to the fourth quartile of the pooled sample. If N is divisible by 4, then S_N has the hypergeometric distribution under H_0, analogously to the median test.

4.3 Rank tests of H_0 against general two-sample alternatives based on the empirical distribution functions

Again, X_1, ..., X_m and Y_1, ..., Y_n are two samples with respective distribution functions F and G. We wish to test the hypothesis of randomness H_0 : F = G either against the one-sided alternative

    K_5^+ : G(x) ≤ F(x) for all x,  F ≠ G,

or against the general alternative

    K_5 : F ≠ G.

This case is not covered by Theorem 3.0.2; moreover, testing against K_5 is invariant with respect to all continuous transformations, and there is no reasonable maximal invariant under this setup. In this case we usually use tests based on the empirical distribution functions, which are the maximum likelihood estimators of the theoretical distribution functions in the nonparametric setup. Among these tests we shall describe the Kolmogorov-Smirnov tests; another well-known test of this type is the Cramér-von Mises test.
The empirical distribution function F_m corresponding to the sample X_1, ..., X_m is defined as

    F_m(x) = (1/m) Σ_{i=1}^m I[X_i ≤ x],   x ∈ R^1,    (4.14)

and similarly G_n(x) = (1/n) Σ_{i=1}^n I[Y_i ≤ x]. The two-sided Kolmogorov-Smirnov test of H_0 against K_5 is based on the statistic

    D_{mn} = max_x |F_m(x) − G_n(x)|    (4.15)

and has the test function

    Φ = 1 if D_{mn} > C_α,  γ if D_{mn} = C_α,  0 if D_{mn} < C_α.

The statistic D_{mn} is a rank statistic, though not a linear one. To see this, consider the order statistics Z_{N:1} < ... < Z_{N:N} of the pooled sample and define the indicators V_1, ..., V_N, where V_i = 0 if Z_{N:i} comes from the X-sample and V_i = 1 otherwise. Because F_m and G_n are nondecreasing step functions, the maximum in (4.15) can be attained only at one of the points Z_{N:1}, ..., Z_{N:N}, where, with S_i = Σ_{j=1}^i (1 − V_j) denoting the number of X-observations among Z_{N:1}, ..., Z_{N:i},

    F_m(Z_{N:i}) − G_n(Z_{N:i}) = S_i/m − (i − S_i)/n = ((m+n)/(mn)) (S_i − m·i/N),

which gives the value of the test criterion

    D_{mn} = ((m+n)/(mn)) max_{1 ≤ i ≤ N} | S_i − m·i/N |.    (4.16)

Now V_i = 1 if and only if one of the ranks R_{m+1}, ..., R_N equals i; thus V_1, ..., V_N depend only on the ranks, and so does D_{mn}. This implies that the distribution of D_{mn} under H_0 is the same for all continuous F. The expression (4.16) is also used for the calculation of D_{mn}. An analogous consideration applies to the one-sided Kolmogorov-Smirnov criterion D^+_{mn} for testing H_0 against K_5^+, which can be expressed in the form

    D^+_{mn} = ((m+n)/(mn)) max_{1 ≤ i ≤ N} ( S_i − m·i/N ),

and for which

    lim_{m,n→∞} P( √(mn/(m+n)) D^+_{mn} ≤ λ ) = 1 − e^{−2λ²},   λ > 0.    (4.17)

4.4 Modification of tests in the presence of ties

If both distribution functions F and G are continuous, then all observations are different with probability 1 and the ranks are well defined. However, in practice we round the observations to a finite number of decimal places, so in fact we record all measurements on a countable grid. In such a case the possibility of ties cannot be ignored, and we should consider possible modifications of the rank tests for this situation. Let us first make several general remarks:

• If the tied observations belong to the same sample, then their mutual ordering does not affect the value of the test criterion. Hence we mainly need to consider ties between observations from different samples.

• A small number of tied observations may simply be omitted, but this is paid for by a loss of information.
• Some test statistics are well defined even in the presence of ties; the ties may only change the probabilities of the errors of the first and second kind. Consider the Kolmogorov-Smirnov test as an example: the definitions of the empirical distribution function (4.14) and of the test criterion (4.15) make sense even in the presence of ties. However, if we use the tabulated critical values of the Kolmogorov-Smirnov test in this situation, the size of the critical region will be smaller than the prescribed significance level. Indeed, we may then consider our observations X_1, ..., X_m, Y_1, ..., Y_n as data rounded from continuous data X*_1, ..., X*_m, Y*_1, ..., Y*_n. The possible values of F_m(x) − G_n(x), x ∈ R^1, then form a subset of the possible values of F*_m(x) − G*_n(x), x ∈ R^1, where F*_m and G*_n are the empirical distribution functions of the X*'s and Y*'s, respectively; hence

    max_x [F_m(x) − G_n(x)] ≤ max_x [F*_m(x) − G*_n(x)],

and similarly for the maxima of the absolute values.

We shall describe two possible modifications of the rank tests in the presence of ties: randomization and the method of midranks.

Randomization. Let Z_1, ..., Z_N be the pooled sample. Take independent random variables U_1, ..., U_N, uniformly R(0,1) distributed and independent of Z_1, ..., Z_N, and order the pairs (Z_1, U_1), ..., (Z_N, U_N) lexicographically: (Z_i, U_i) < (Z_j, U_j) if either Z_i < Z_j, or Z_i = Z_j and U_i < U_j. Denote by R*_1, ..., R*_N the ranks of the pairs (Z_1, U_1), ..., (Z_N, U_N) in this ordering. We shall say that Z_1, ..., Z_N satisfy the hypothesis H if they are independent and identically distributed (not necessarily with an absolutely continuous distribution). Then, under H, the vector (R*_1, ..., R*_N) is uniformly distributed over the set R of permutations of 1, ..., N. We shall demonstrate this in an important special case, when Z_1, ..., Z_N take on equidistant values, e.g. when the data are rounded to k decimal places.

THEOREM 4.4.1 Let Z_1, ..., Z_N be random variables satisfying the hypothesis H which take on values from the set {a + kd; k = 0, ±1, ±2, ...}, a ∈ R^1, d > 0.
Then the vector R* = (R*_1, ..., R*_N) has the probability distribution

    P(R* = r) = 1/N!,   r ∈ R.

Proof. We may assume, without loss of generality, that Z_1, ..., Z_N take on integer values. Then the random variable T_i = Z_i + U_i is equivalent to the pair (Z_i, U_i), because Z_i = [T_i] and U_i = T_i − [T_i] with probability 1, i = 1, ..., N. Because P(T_i = t) = 0 for all t ∈ R^1, the distribution function of T_i is continuous. Moreover, (Z_i, U_i) < (Z_j, U_j) if and only if T_i < T_j; hence R*_1, ..., R*_N are the ranks of the i.i.d. continuous random variables T_1, ..., T_N, and the assertion follows from Theorem 2.0.2.

Method of midranks. Here each observation in a group of tied observations receives the midrank, i.e. the average of the ranks that the members of the group would share under an arbitrary ordering within the group.

Chapter 5  Tests for comparison of the treatments based on paired observations

Let (X_1, Y_1), ..., (X_N, Y_N) be paired observations, e.g. measurements on the same subjects before and after a treatment. The hypothesis H_1 of no treatment effect states that, for each i, the pairs (X_i, Y_i) and (Y_i, X_i) have the same distribution; the natural alternative states that Y is stochastically larger than X, i.e. P(Y_i > x) ≥ P(X_i > x) for all x. In this section we shall consider the rank tests of H_1 against various alternatives.

5.1 Rank tests of H_1

Apply the following transformation to (X_i, Y_i), i = 1, ..., N:

    Z_i = Y_i − X_i,   W_i = X_i + Y_i,   i = 1, ..., N.    (5.2)

Under H_1, the distribution of the vector (Z_1, W_1), ..., (Z_N, W_N) is symmetric around the w-axis, while under the alternative it is shifted in the direction of the positive half-axis z. The problem of testing H_1 against such an alternative is invariant with respect to the transformations z'_i = z_i, w'_i = g(w_i), i = 1, ..., N, where g is a one-to-one function with a finite number of discontinuities. The vector (Z_1, ..., Z_N) is the maximal invariant with respect to such transformations; hence the invariant tests are exactly the functions of (Z_1, ..., Z_N), which forms a random sample from some one-dimensional distribution with a continuous distribution function D. The problem of testing H_1 is then equivalent to testing

    H'_1 : D(z) + D(−z) = 1 for all z ∈ R^1,    (5.3)

stating that the distribution D is symmetric around 0, against the alternative

    K'_1 : D(z + Δ) + D(−z + Δ) = 1 for all z, with some Δ > 0,    (5.4)

which means that the distribution is shifted in the direction of positive z. The distribution D is uniquely determined by the triple (p, F_1, F_2) with p = P(Z < 0), F_1(z) = P(|Z| < z | Z < 0) and F_2(z) = P(Z < z | Z > 0). Equivalent expressions for H'_1 and K'_1 are

    H''_1 : p = 1/2, F_2 = F_1;   K''_1 : p < 1/2, F_2 ≤ F_1.

This problem is invariant with respect to the group G of transformations z'_i = g(z_i), i = 1, ..., N, where g is a continuous, odd and increasing function.
We can easily see that the maximal invariant with respect to G is (S_1, ..., S_μ, R_1, ..., R_ν), where S_1, ..., S_μ are the ranks of the absolute values of the negative Z's among |Z_1|, ..., |Z_N| and R_1, ..., R_ν are the ranks of the positive Z's among |Z_1|, ..., |Z_N|. Moreover, the vectors S'_1 < ... < S'_μ and R'_1 < ... < R'_ν of ordered ranks are sufficient for (S_1, ..., S_μ, R_1, ..., R_ν), and each of them uniquely determines the other; hence it finally suffices to consider only, say, R'_1 < ... < R'_ν, and the invariant tests of H_1 [or of H'_1] depend only on R'_1 < ... < R'_ν.

Let ν be the number of positive components of (Z_1, ..., Z_N). Then ν is a binomial random variable B(N, π); π = 1/2 under H_1, and, for any fixed n,

    P_{H_1}(R'_1 = r_1, ..., R'_ν = r_ν, ν = n)
      = P_{H_1}(R'_1 = r_1, ..., R'_ν = r_ν | ν = n) P_{H_1}(ν = n) = (1/2)^N    (5.5)

for any n-tuple (r_1, ..., r_n), 1 ≤ r_1 < ... < r_n ≤ N. The total number of such tuples is 2^N, so the critical region of any rank test of size α = k·2^{−N} contains exactly k such points (r_1, ..., r_n). However, among such critical regions there generally does not exist a uniformly most powerful one for H''_1 against K''_1. We usually consider H''_1 against the alternative of shift in location, under which (Z_1, ..., Z_N) has the density q_Δ, Δ > 0:

    q_Δ(z_1, ..., z_N) = Π_{i=1}^N f(z_i − Δ),   Δ > 0,    (5.6)

where f is a one-dimensional symmetric density, f(−x) = f(x), x ∈ R^1; Δ = 0 under H_1 [or H''_1]. The locally most powerful rank test of H_1 against (5.6) has the critical region

    Σ_{i=1}^N a⁺_N(R⁺_i, f) sign Z_i ≥ k_α,    (5.7)

where R⁺_i is the rank of |Z_i| among |Z_1|, ..., |Z_N| and the scores a⁺_N(i, f) have the form

    a⁺_N(i, f) = E φ⁺(U_{N:i}, f),   i = 1, ..., N,   with φ⁺(u, f) = φ((u+1)/2, f),   0 < u < 1,

φ being the score function (3.6).

5.2 One-sample Wilcoxon test

The criterion of the one-sample Wilcoxon test is

    W⁺ = Σ_{i: Z_i > 0} R⁺_i,

the sum of the ranks of |Z_i| among |Z_1|, ..., |Z_N| over the positive components of (Z_1, ..., Z_N), ν being the number of positive components. Obviously Σ_{i=1}^N R⁺_i sign Z_i = 2W⁺ − N(N+1)/2. We reject H_1 if W⁺ ≥ C_α, i.e. if the test criterion exceeds the critical value.
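The statistic W⁺ and the identity Σ R⁺_i sign Z_i = 2W⁺ − N(N+1)/2 can be sketched directly (hypothetical paired differences, assuming no ties and no zero differences):

```python
def signed_rank_statistics(z):
    """One-sample Wilcoxon statistic W+ = sum of the ranks R+_i of |z_i|
    (among |z_1|, ..., |z_N|) over the positive z_i, together with the
    signed form sum_i R+_i sign(z_i).  Components assumed nonzero, |z_i| distinct."""
    abs_rank = [sum(abs(zj) <= abs(zi) for zj in z) for zi in z]   # R+_i
    w_plus = sum(r for r, zi in zip(abs_rank, z) if zi > 0)
    signed = sum(r * (1 if zi > 0 else -1) for r, zi in zip(abs_rank, z))
    return w_plus, signed

# Hypothetical paired differences z_i = y_i - x_i.
z = [1.2, -0.4, 2.5, 0.3, -1.7, 0.9]
w_plus, signed = signed_rank_statistics(z)
N = len(z)
# The two forms are in one-to-one relation: signed = 2*W+ - N(N+1)/2.
```

Because of this one-to-one relation, tables and approximations for either form of the statistic can be used interchangeably.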
For large N, when tables of critical values are not available, we may use the normal approximation of (W^{++} − E_{H_1} W^{++}) / (var_{H_1} W^{++})^{1/2}, where E_{H_1} W^{++} = N(N + 1)/4 and var_{H_1} W^{++} = N(N + 1)(2N + 1)/24. The exact test of size α is

Ψ(W^{++}) = 1 ... W^{++} > C_α;  γ ... W^{++} = C_α;  0 ... W^{++} < C_α,  (5.14)

where C_α and γ are determined by the equation

P_{H_1}(W^{++} > C_α) + γ P_{H_1}(W^{++} = C_α) = α.  (5.15)

5.3 Sign test

The criterion of the sign test is simply the number of positive components among Z_1, ..., Z_N and its distribution under H_1 is binomial b(N, 1/2). For large N we can again use the normal approximation. If all distribution functions D_1, ..., D_N coincide, the sign test is the locally most powerful rank test of H_1 for D of the double-exponential type with the density d(z) = ½ e^{−|z−Δ|}, z ∈ R^1. When we want to use the sign test, we need not know the exact values X_i, Y_i, i = 1, ..., N; it is sufficient to know the signs of the differences Y_i − X_i. This is a very convenient property of the sign test: we can use it even for qualitative observations of the type "drug A gives better pain relief than drug B". As a matter of fact, we do not have any better test under such conditions.

Chapter 6

Tests of independence in bivariate population

Let (X_1, Y_1), ..., (X_n, Y_n) be a random sample from a bivariate distribution with a continuous distribution function F(x, y). We want to test the hypothesis of independence

H_2 : F(x, y) = F_1(x) F_2(y)  (6.1)

where F_1 and F_2 are arbitrary distribution functions. The most natural alternative to H_2 is positive [or negative] dependence, but it is too wide and we can hardly expect to find a uniformly most powerful test against it. Instead of it we consider the alternative

X_i = X^0_i + Δ Z_i,  Y_i = Y^0_i + Δ Z_i,  Δ > 0,  i = 1, ..., n,  (6.2)

where X^0_i, Y^0_i, Z_i, i = 1, ..., n, are independent and their distributions do not depend on i. Independence then means that Δ = 0. Let R_1, ..., R_n be the ranks of X_1, ..., X_n and let S_1, ..., S_n be the ranks of Y_1, ..., Y_n, respectively.
Under the hypothesis of independence, the vectors (R_1, ..., R_n) and (S_1, ..., S_n) are independent and both have the uniform distribution on the set R of permutations of 1, ..., n. The locally most powerful rank test of H_2 against the alternative K_2, in which X^0_i has the density f_1 and Y^0_i the density f_2, both densities continuously differentiable, has the critical region

∑_{i=1}^n a_n(R_i, f_1) a_n(S_i, f_2) ≥ C_α  (6.3)

with the scores a_n(i, f) given in (3.4), which are usually replaced by the approximate scores (3.7). We shall briefly describe the two best-known rank tests of independence.

6.1 Spearman test

The Spearman test is based on the correlation coefficient of (R_1, ..., R_n) and (S_1, ..., S_n):

r_S = [(1/n) ∑_{i=1}^n R_i S_i − R̄ S̄] / [((1/n) ∑_{i=1}^n (R_i − R̄)²)^{1/2} ((1/n) ∑_{i=1}^n (S_i − S̄)²)^{1/2}]  (6.4)

where R̄ = S̄ = (n+1)/2 and (1/n) ∑_{i=1}^n (R_i − R̄)² = (1/n) ∑_{i=1}^n (S_i − S̄)² = (n² − 1)/12. Then we can express (6.4) in the simpler form

r_S = 12/(n(n² − 1)) ∑_{i=1}^n (R_i − (n+1)/2)(S_i − (n+1)/2)  (6.5)

    = 12/(n(n² − 1)) ∑_{i=1}^n R_i S_i − 3(n+1)/(n−1).  (6.6)

The Spearman test rejects H_2 if r_S ≥ C_α or, equivalently, if S = ∑_{i=1}^n R_i S_i ≥ C*_α. In some tables we find the critical values for the statistic

S' = ∑_{i=1}^n (R_i − S_i)²  (6.7)

for which r_S = 1 − 6S'/(n(n² − 1)). The test based on S' rejects H_2 if S' ≤ C'_α. For large n we use the normal approximation with

E_{H_2} S = n(n+1)²/4,  var_{H_2} S = n²(n+1)²(n−1)/144.

The Spearman test is locally most powerful against the alternatives of the logistic type.

6.2 Quadrant test

This test is based on the criterion

Q = ¼ ∑_{i=1}^n [sign(R_i − (n+1)/2) + 1][sign(S_i − (n+1)/2) + 1]  (6.8)

and rejects H_2 for large values of Q. For even n, Q is equal to the number of pairs (X_i, Y_i) for which X_i lies above the X-median and Y_i lies above the Y-median. The statistic Q then has, under the hypothesis H_2, the hypergeometric distribution

Pr(Q = q) = \binom{m}{q} \binom{m}{m−q} / \binom{n}{m}  (6.9)

for q = 0, 1, ..., m, m = n/2. For large n we use the normal approximation with the parameters

E Q = n/4,  var Q = n²/(16(n − 1)).  (6.10)
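A minimal sketch of the two criteria (hypothetical data, no ties assumed; the helper `ranks` is ours, not from the text):

```python
# Sketch: Spearman statistic via S' = sum (R_i - S_i)^2 and, for even n,
# the quadrant statistic Q; no ties are assumed.

def ranks(x):
    """Ranks of x_1, ..., x_n (rank 1 = smallest), assuming no ties."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0] * len(x)
    for rk, i in enumerate(order, start=1):
        r[i] = rk
    return r

def spearman_r(x, y):
    """r_S = 1 - 6 S' / (n(n^2 - 1)) with S' = sum (R_i - S_i)^2."""
    n = len(x)
    r, s = ranks(x), ranks(y)
    s_prime = sum((ri - si) ** 2 for ri, si in zip(r, s))
    return 1 - 6 * s_prime / (n * (n * n - 1))

def quadrant_q(x, y):
    """For even n: number of pairs with both ranks above (n+1)/2."""
    n = len(x)
    r, s = ranks(x), ranks(y)
    half = (n + 1) / 2
    return sum(1 for ri, si in zip(r, s) if ri > half and si > half)
```

For perfectly concordant samples r_S = 1 and, with n = 2m, Q = m; for perfectly discordant samples r_S = −1 and Q = 0.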
Chapter 7

Rank tests for comparison of several treatments

7.1 One-way classification

We want to compare the effects of p treatments; the experiment is organized in such a way that the i-th treatment is applied to n_i subjects with the results x_{i1}, ..., x_{in_i}, i = 1, ..., p, ∑_{i=1}^p n_i = n. Then x_{i1}, ..., x_{in_i} is a random sample from a distribution with a distribution function F_i, i = 1, ..., p. The hypothesis of no difference between the treatments can then be expressed as the hypothesis of equality of p distribution functions, namely

H_2 : F_1 = F_2 = ... = F_p  (7.1)

and we can consider this hypothesis either against the general alternative

K_2 : F_i(x) ≠ F_j(x)  (7.2)

at least for one pair i, j and at least for some x = x_0, or against the more special alternative

K'_2 : F_i(x) = F(x − Δ_i),  i = 1, ..., p,  (7.3)

with Δ_i ≠ Δ_j at least for one pair i, j. The alternative (7.3) says that each treatment shifts the location of the observations by Δ_i and that at least two treatments differ in their effects. The classical test for this situation is the F-test of the analysis of variance; this test works well if we can assume that F_i ∼ N(μ + α_i, σ²), i = 1, ..., p. We obtain the usual model of the analysis of variance

X_{ij} = μ + α_i + e_{ij},  j = 1, ..., n_i;  i = 1, ..., p,  (7.4)

where the e_{ij} are independent random variables with the normal distribution N(0, σ²). The hypothesis H_2 can then be reformulated as H_2 : α_1 = α_2 = ... = α_p = 0. The F-test rejects the hypothesis H_2 provided

T = [(n − p)/(p − 1)] · ∑_{i=1}^p n_i (X̄_{i·} − X̄_{··})² / ∑_{i=1}^p ∑_{j=1}^{n_i} (X_{ij} − X̄_{i·})² ≥ C_α  (7.5)

where X̄_{i·} = (1/n_i) ∑_{j=1}^{n_i} X_{ij}, i = 1, ..., p, X̄_{··} = (1/n) ∑_{i=1}^p ∑_{j=1}^{n_i} X_{ij}, and where the critical value C_α is found in the tables of the F-distribution with (p − 1, n − p) degrees of freedom.

7.2 Kruskal-Wallis rank test

Let us order all observations x_{11}, ..., x_{1n_1}, x_{21}, ..., x_{2n_2}, ..., x_{p1}, ..., x_{pn_p} according to increasing magnitude. Let R_{i1}, ..., R_{in_i} be the ranks of the observations x_{i1}, ..., x_{in_i}. Let R*_{i1} < ... < R*_{in_i} be the same ranks ordered according to increasing magnitude.
Then, under the hypothesis H_2,

P_{H_2}(R*_{11} = r_{11}, ..., R*_{1n_1} = r_{1n_1}, ..., R*_{p1} = r_{p1}, ..., R*_{pn_p} = r_{pn_p}) = (∏_{i=1}^p n_i!) / n!  (7.6)

for any permutation (r_{11}, ..., r_{1n_1}, ..., r_{p1}, ..., r_{pn_p}) of the numbers 1, ..., n such that r_{i1} < ... < r_{in_i} for all i = 1, ..., p. Denote

R̄_{i·} = (1/n_i) ∑_{j=1}^{n_i} R_{ij},  i = 1, ..., p,  R̄_{··} = (1/n) ∑_{i=1}^p ∑_{j=1}^{n_i} R_{ij} = (n+1)/2.

If we replace X̄_{i·} and X̄_{··} in (7.5) by R̄_{i·} and R̄_{··}, respectively, i = 1, ..., p, we obtain

T_R = [(n − p)/(p − 1)] · ∑_{i=1}^p n_i (R̄_{i·} − R̄_{··})² / ∑_{i=1}^p ∑_{j=1}^{n_i} (R_{ij} − R̄_{i·})²  (7.7)

and this is proportional to the criterion of the Kruskal-Wallis test,

K = 12/(n(n+1)) ∑_{i=1}^p n_i (R̄_{i·} − (n+1)/2)²  (7.8)

  = 12/(n(n+1)) ∑_{i=1}^p n_i R̄²_{i·} − 3(n+1).

In the special case p = 2, the Kruskal-Wallis test reduces to the two-sided (two-sample) Wilcoxon test. We reject the hypothesis H_2 provided K ≥ C_α, where the critical value C_α is either obtained from special tables or, if p ≥ 3 and n_i ≥ 5, i = 1, ..., p, we use the asymptotic critical values: it can be shown that, under H_2 and for large n_1, ..., n_p, the criterion K has asymptotically the χ² distribution with p − 1 degrees of freedom.

Remark. In the case of ties between the observations we replace the ranks by the midranks, similarly as in the case of the Wilcoxon test.

7.3 Two-way classification (random blocks)

We want to compare p treatments, but simultaneously we want to reduce the influence of the non-homogeneity of the sample units. We can then organize the experiment in such a way that we divide the subjects into n homogeneous groups, the so-called blocks, and compare the effects of the treatments within each block separately. The subjects in a block are usually assigned the treatments in a random way. Let us consider the simplest of these models, with n blocks, each containing p units, where each treatment is applied just once in each block. We assume that the blocks are independent of each other.
The observations can be formally described by the following table:

Block    Treatment:  1       2       3       ...     p
1                    x_{11}  x_{12}  x_{13}  ...     x_{1p}
2                    x_{21}  x_{22}  x_{23}  ...     x_{2p}
...
n                    x_{n1}  x_{n2}  x_{n3}  ...     x_{np}

The observation x_{ij} is the measured effect of the j-th treatment applied in the i-th block. We assume that the X_{ij} are independent random variables and that X_{ij} has a continuous distribution function F_{ij}, j = 1, ..., p; i = 1, ..., n. We wish to verify the hypothesis that there is no significant difference among the treatments, hence

H_3 : F_{i1} = F_{i2} = ... = F_{ip},  i = 1, ..., n,  (7.9)

against the alternative

K_3 : F_{ij} ≠ F_{ik}  (7.10)

at least for one i and at least for one pair j, k, or against the more special alternative

K'_3 : F_{ij}(x) = F_i(x − Δ_j),  j = 1, ..., p;  i = 1, ..., n,  (7.11)

Δ_j ≠ Δ_k at least for one pair j, k.  (7.12)

The classical test of H_3 is the F-test corresponding to the model

X_{ij} = μ + α_i + β_j + e_{ij},  j = 1, ..., p;  i = 1, ..., n,  (7.13)

where the e_{ij} are independent random variables with the normal distribution N(0, σ²), μ is the main additive effect, α_i is the effect of the i-th block and β_j is the effect of the j-th treatment, j = 1, ..., p; i = 1, ..., n. The hypothesis H_3 then reduces to the form β_1 = β_2 = ... = β_p. The critical region of the F-test of H_3 has the form

T = (n − 1) ∑_{j=1}^p n (X̄_{·j} − X̄_{··})² / ∑_{i=1}^n ∑_{j=1}^p (X_{ij} − X̄_{i·} − X̄_{·j} + X̄_{··})² ≥ C_α  (7.14)

where C_α is the critical value of the F-distribution with p − 1 and (p − 1)(n − 1) degrees of freedom.

Friedman rank test

Order the observations within each block and denote the corresponding ranks R_{i1}, ..., R_{ip}; i = 1, ..., n. We arrange the ranks in the following table:

Block    Treatment:  1       2       3       ...     p       Row average
1                    R_{11}  R_{12}  R_{13}  ...     R_{1p}  (p+1)/2
2                    R_{21}  R_{22}  R_{23}  ...     R_{2p}  (p+1)/2
...
n                    R_{n1}  R_{n2}  R_{n3}  ...     R_{np}  (p+1)/2
Column average       R̄_{·1}  R̄_{·2}  R̄_{·3}  ...     R̄_{·p}  R̄_{··} = (p+1)/2

where R̄_{·j} = (1/n) ∑_{i=1}^n R_{ij} and R̄_{··} = (1/np) ∑_{i=1}^n ∑_{j=1}^p R_{ij}. The Friedman test is based on the criterion

Q_n = 12n/(p(p+1)) ∑_{j=1}^p (R̄_{·j} − (p+1)/2)²  (7.15)

    = 12/(np(p+1)) ∑_{j=1}^p R²_{·j} − 3n(p+1),

where R_{·j} = ∑_{i=1}^n R_{ij}; large values of the criterion are significant.
As n → ∞, the distribution of Q_n is approximately χ² with p − 1 degrees of freedom. In the case p = 2, the Friedman test reduces to the two-sided sign test. The Friedman test is applicable to the comparison of p treatments even in the situation where we observe only the ranks rather than the exact values of the treatment effects.
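As a closing sketch, the Kruskal-Wallis criterion (7.8) and the Friedman criterion (7.15) can be computed directly from their rank definitions (illustrative data only; ties are assumed absent, so midranks are not needed):

```python
# Sketch: Kruskal-Wallis criterion K of (7.8) and Friedman criterion
# Q_n of (7.15); ties are assumed absent.

def kruskal_wallis(samples):
    """samples: list of p lists; K = 12/(n(n+1)) sum n_i Rbar_i^2 - 3(n+1)."""
    pooled = [x for sample in samples for x in sample]
    n = len(pooled)
    order = sorted(range(n), key=lambda i: pooled[i])
    rank = [0] * n
    for r, i in enumerate(order, start=1):
        rank[i] = r
    total, pos = 0.0, 0
    for sample in samples:
        ni = len(sample)
        rbar = sum(rank[pos:pos + ni]) / ni   # average rank of the i-th sample
        total += ni * rbar ** 2
        pos += ni
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

def friedman(blocks):
    """blocks: n lists of p values; Q_n = 12/(np(p+1)) sum R_.j^2 - 3n(p+1)."""
    n, p = len(blocks), len(blocks[0])
    col = [0] * p                             # column rank sums R_.j
    for row in blocks:
        order = sorted(range(p), key=lambda j: row[j])
        for r, j in enumerate(order, start=1):
            col[j] += r
    return 12.0 / (n * p * (p + 1)) * sum(c * c for c in col) - 3 * n * (p + 1)
```

For p = 2 the Kruskal-Wallis criterion is, as stated above, the square of the standardized two-sample Wilcoxon statistic, and the Friedman criterion corresponds to the two-sided sign test.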