Chapter 1
Displaying the Order in a Group of Numbers

Learning Objectives

To understand, including being able to carry out any necessary procedures:
   Descriptive versus inferential statistics.
   Frequency tables.
   Terminology of variable, value, and score.
   Grouped frequency tables.
   Histograms.
   Frequency polygons.
   Unimodal and bimodal frequency distributions.
   Symmetrical and skewed frequency distributions.
   Normal and kurtotic frequency distributions.
   Ways in which frequency tables and their graphic equivalents can be misused.

Chapter Outline

I. The Two Branches of Statistical Methods
   A. Descriptive statistics are used to summarize and make understandable a group of numbers collected in a research study. (This is the focus of Chapters 1-4.)
   B. Inferential statistics are used to draw conclusions based on but going beyond the numbers actually collected in the research. (This is the focus of Chapters 5-17.)
II. Frequency Tables
   A. They show how frequently each of the different values occurs.
   B. They make the pattern of numbers clear at a glance.
   C. To understand them, some basic statistics terminology is needed.
      1. Variable: a characteristic that can take on different values.
      2. Value: a number or category that a score can have.
      3. Score: a particular person's value on a variable.
   D. There are three steps to creating a frequency table.
      1. Make a list down the page of each possible value, starting from the highest and ending with the lowest.
      2. Go one by one through the group of scores you wish to describe, making a mark for each next to the corresponding value on your list.
      3. Make a neat table of how many times each value on your list occurs.
III. Grouped Frequency Tables
   A. They are used when there are so many different possible values that the frequency table is too cumbersome to give a simple account of the information.
   B. They group together the values of all cases falling within a certain interval.
   C. There are four steps to creating a grouped frequency table.
      1. Subtract the lowest from the highest value to find the range.
      2. Divide the range by a reasonable interval size.
         a. Use 2, 3, 5, 10, or a multiple of 10 if possible.
         b. The result, after rounding up, should be a reasonable number of intervals (in general, no fewer than 5, no more than 15).
      3. Make a list of the intervals from highest to lowest, making the lower end of each interval equal to a multiple of the interval size. (Be sure that the intervals do not overlap.)
      4. Proceed as you would for an ordinary frequency table.
IV. Histograms
   A. They are one type of graphic display of the information in a frequency table.
   B. They are a kind of bar chart (in which the bars are put right next to each other without divisions); the height of each bar corresponds to the frequency of each value or interval in the frequency table.
   C. They look like a city skyline.
   D. There are four steps to creating a histogram.
      1. Make a frequency table (or a grouped frequency table).
      2. Place the scale of values (or intervals) along the bottom of a page.
         a. The values (or intervals) should go from left to right, from lowest to highest.
         b. For a grouped frequency table it is conventional to mark only the midpoint of each interval, in the center of each bar. (The midpoint is figured as the point halfway between the start of the interval and the start of the next interval.)
      3. Make a scale of frequencies along the left edge of the page.
      4. Make a bar for each value (or interval), the height corresponding to the frequency of the value (or interval) it represents.
   E. Making a histogram will be easiest if you use graph paper.
V. Frequency Polygons
   A. They are another type of graphic display of the information in a frequency table.
   B. They are a kind of line graph in which the bottom of the graph shows the values or intervals (as in a histogram), and the line moves from point to point, with the height of each point showing the number of cases in that value or interval.
   C. They look like a mountain-peak skyline.
   D. There are five steps to creating a frequency polygon.
      1. Make a frequency table (or a grouped frequency table).
      2. Place the scale of values (or intervals) along the bottom.
         a. The values (or intervals) should go from left to right, from lowest to highest.
         b. Be sure to include one extra value (or interval) above and one extra value (or interval) below the values (or intervals) that actually have any scores in them.
         c. For a grouped frequency table, use the midpoint of each interval. (The midpoint is figured as the point halfway between the start of the interval and the start of the next interval.)
      3. Along the left of the page make a scale of frequencies that runs from 0 at the bottom to the highest frequency in any value (or interval).
      4. Mark a point above the center of each value (or interval) corresponding to the frequency of that value (or interval).
      5. Connect the points with lines.
VI. Shapes of Frequency Distributions
   A. A frequency table, histogram, or frequency polygon describes a frequency distribution, that is, how the number of cases or "frequencies" are spread out or "distributed."
   B. It is useful to describe in words the key aspects of the way numbers are distributed, which can be thought of as the shape of the histogram or frequency polygon that represents the frequency distribution.
   C. Unimodal and bimodal.
      1. A distribution with a single high peak is unimodal. (This is the most common in psychology.)
      2. A distribution with two major peaks is bimodal.
      3. Any distribution with two or more peaks is called multimodal.
      4. A distribution in which all the values have about the same frequency is called rectangular.
   D. Symmetrical and skewed.
      1. A distribution with approximately equal numbers of cases (and a similar shape) on both sides of the middle is symmetrical. (Approximations to this are the most common in psychology.)
      2. A distribution that is clearly not symmetrical is skewed.
         a. The direction of skew refers to the side with the long tail.
         b. A distribution that is skewed to the right (the positive side of the middle) is also called positively skewed.
         c. A distribution that is skewed to the left (the negative side of the middle) is also called negatively skewed.
      3. In practice, highly skewed distributions come up in psychology mainly when what is being measured has some upper or lower limit.
         a. The situation in which many scores pile up at the low end because it is not possible to have any lower score is called a floor effect.
         b. The situation in which many scores pile up at the high end because it is not possible to have any higher score is called a ceiling effect.
   E. Kurtosis.
      1. If a distribution is particularly flat or particularly peaked, it has something called kurtosis.
      2. The standard of comparison is the normal curve, a bell-shaped curve that is widely approximated in frequency distributions in psychological research, and in nature generally. (See Chapter 5.)
      3. How peaked and pinched together versus flat and spread out a distribution is, compared to the normal curve, is called its degree of kurtosis.
VII. Controversies and Limitations: How Frequency Tables and Their Graphic Equivalents Can Be Misused
   A. Failure to use equal interval sizes.
   B. Exaggeration of proportions.
      1. Ordinarily the height of a histogram or frequency polygon should begin at 0 or the lowest value of the scale and continue to the highest value of the scale.
      2. The overall proportion of the graph should be about 1.5 times as wide as it is tall.
VIII. Frequency Tables, Histograms, and Frequency Polygons in Research Articles
   A. They are mainly used by researchers as an intermediary step in the process of more elaborate statistical analyses.
   B. Frequency tables are used in two situations.
      1. Sometimes they are presented to compare frequencies on two or more variables or for two or more groups.
      2. Most often they are used when the values of the variable are categories rather than numbers.
   C.
      Histograms and frequency polygons almost never appear in research articles (except articles about statistics), but their shape is sometimes commented on in the text of the article, particularly if the distribution seems to be far from normal.

How to Make a Frequency Table

I. Make a list down the page of each possible value, starting from the highest and ending with the lowest.
II. Go one by one through the group of scores you wish to describe, making a mark for each next to the corresponding value on your list.
III. Make a neat table showing how many times each value on your list occurs.

How to Make a Grouped Frequency Table

I. Subtract the lowest from the highest value to find the range.
II. Divide the range by a reasonable interval size.
   1. Use 2, 3, 5, 10, or a multiple of 10 if possible.
   2. The result, after rounding up, should be a reasonable number of intervals (in general, no fewer than 5, no more than 15).
III. Make a list of the intervals from highest to lowest, making the lower end of each interval equal to a multiple of the interval size. (Be sure that the intervals do not overlap.)
IV. Proceed as you would for an ordinary frequency table.

How to Make a Histogram

I. Make a frequency table (or a grouped frequency table).
II. Place the scale of values (or intervals) along the bottom of a page.
   A. The values (or intervals) should go from left to right, from lowest to highest.
   B. For a grouped frequency table, it is conventional to mark only the midpoint of each interval, placed in the center of each bar. (The midpoint is figured as the point halfway between the start of the interval and the start of the next interval.)
III. Along the left of the page make a scale of frequencies that runs from 0 at the bottom to the highest frequency in any value (or interval).
IV. Make a bar for each value (or interval), the height corresponding to the frequency of the value (or interval) it represents.

How to Make a Frequency Polygon

I.
   Make a frequency table (or a grouped frequency table).
II. Place the scale of values (or intervals) along the bottom of a page.
   A. Be sure to include one extra value (or interval) above and one extra value (or interval) below the values (or intervals) that actually have any scores in them.
   B. For a grouped frequency table, use the midpoint of each interval. (The midpoint is figured as the point halfway between the start of the interval and the start of the next interval.)
III. Along the left of the page make a scale of frequencies that runs from 0 at the bottom to the highest frequency in any value (or interval).
IV. Mark a point above the center of each value (or interval) corresponding to the frequency of that value (or interval).
V. Connect the points with lines.

Chapter Self-Tests

Multiple-Choice Questions

1. Psychologists use ______ statistics like frequency tables to help make sense of the numbers they collect.
   a. inferential.
   b. descriptive.
   c. intuitive.
   d. abstract.
2. A researcher studies the amount of self-confidence people have after doing well on a test. Self-confidence in this study is a
   a. score.
   b. descriptive statistic.
   c. value.
   d. variable.
3. A psychologist administers a personality scale on which people can get a score of any number from 24 to 86. The results would be described using a grouped frequency table rather than an ordinary frequency table because an ordinary frequency table would
   a. have too many values.
   b. not be able to include all cases.
   c. create a skewed distribution.
   d. have to start at 0 (or 0%).
4. To determine the interval size to use in a grouped frequency table, you first find the range and then try different numbers to divide it by, trying to end up with an interval size that is
   a. twice the range.
   b. an odd number if possible.
   c. some common regular number (such as 2, 3, 5, or 10).
   d. some even number if possible.
5. What is generally the largest number of intervals you would want in a grouped frequency table?
6.
   A histogram
   a. is similar to a line graph.
   b. always approximates a normal curve.
   c. is a graphic description of a frequency table.
   d. describes the relation between two variables.
7. In a frequency polygon, the vertical (up and down) dimension represents
   a. frequency.
   b. possible values the variable can take.
   c. intensity of the variable.
   d. mean score.
8. Suppose 50 people take a math exam: 25 math experts and 25 people very poor at math. This distribution will probably be ______.
   a. unimodal.
   b. bimodal.
   c. normal.
   d. skewed.
9. Describe the distribution of the following scores (you will probably need to make a frequency table, histogram, or frequency polygon to do this).
   1, 10, 6, 8, 7, 5, 5, 4, 9, 2, 9, 8, 6, 7, 8, 3, 4, 3, 5, 5, 7, 6, 4, 6, 6, 7
   a. unimodal and approximately normal.
   b. bimodal and negatively skewed.
   c. normal and positively skewed.
   d. normal and negatively skewed.
10. A graphic display of a frequency distribution is misleading when
   a. the proportions are close to 1 1/2 across to 1 up.
   b. the frequencies are grouped or percentages are used.
   c. some of the intervals are larger than others.
   d. the intervals (or values) are put across the bottom.

Fill-In Questions

1. When making a frequency table, the ______ are listed in the first column and the frequencies corresponding to each of these are listed in the next column.
2. A researcher used a(n) ______ because there were too many different values for an ordinary frequency table.
3. Because the range went from 10 to 99, the researcher used a(n) ______ of 10 for the grouped frequency table.
4. A graph describing a frequency table and looking like a city skyline is called a(n) ______.
5. A(n) ______ distribution shows up in a frequency polygon as two peaks that are much higher than all the others.
6. If the number of scores at each value is approximately the same, this creates a(n) ______ distribution, which is also symmetrical.
7.
   The distribution of incomes in a small community tends to be ______ skewed, because most people earn small to modest incomes, and a decreasing number earn large incomes (though a very small number earn a very great deal).
8. In a pilot test of a planned memory study, the majority received a perfect score. Seeking to avoid such a(n) ______ in the real study, the researcher made the task more difficult.
9. The ______ represents a particular unimodal, symmetrical distribution commonly found in psychology research and nature generally.
10. ______ refers to a distribution being much more peaked or flat than is typical of distributions in psychology.

Essays and Problems

1. Under what conditions would you prefer to make a grouped frequency table over an ordinary frequency table? Why?
2. Twenty-four adolescent girls whose parents had just divorced completed a questionnaire about their attitudes toward their parents. Their scores were as follows (the scale goes from 0, very negative attitude, to 80, very positive attitude):
   (a) Make a grouped frequency table.
   (b) Make a histogram based on the grouped frequency table.
   (c) Describe in words the shape of the histogram.
3. The average life spans in captivity for 43 different mammals (as reported in the 1989 World Almanac, p. 258) are as follows:
   12201825205 15 12 1220615 8123540157 10815 2042520712151512331101251520121210165
   (a) Make a grouped frequency table.
   (b) Make a frequency polygon based on the grouped frequency table.
   (c) Describe in words the shape of the frequency polygon.
4. Explain what it means to have a floor effect in a distribution of scores. Give an example.

Using SPSS/PC+ Studentware Plus with this Chapter

If you are using SPSS for the first time, before proceeding with the material in this section, read the Appendix on Getting Started and the Basics of Using SPSS/PC+ Studentware Plus. You can use SPSS to create frequency tables and histograms, including grouped frequency tables and histograms based on those grouped frequency tables.
You should work through the example, following the procedures step by step. Then look over the description of the general principles involved and try to create frequency tables and histograms on your own for some of the problems listed in the Suggestions for Additional Practice. Finally, you may want to try the suggestions for using the computer to deepen your understanding, and you can explore the additional SPSS procedure, the box plot, described at the end.

I. An Example
   A. Data: Fourteen students rated the quality of social life in their dormitory on a 1 to 10 scale. The ratings were 6, 9, 9, 4, 9, 7, 1, 7, 4, 10, 7, 1, 9, and 4.
   B. Follow the instructions in the SPSS Appendix for starting up SPSS and be sure the cursor is in the ScratchPad window.

Chapter 2
The Mean, Variance, Standard Deviation, and Z Scores

Learning Objectives

To understand, including being able to conduct any necessary computations:
   The mean.
   Statistical symbols and formulas.
   The mode and the median.
   Variance.
   Standard deviation.
   Z scores and converting to and from raw scores.
   Objections to using statistical methods.
   How mean, variance, and standard deviation are reported in research articles.

Chapter Outline

I. The Mean
   A. It is usually the best single number for describing a group of scores.
   B. It is the ordinary average, the sum of all the scores divided by the number of scores.
   C. It gives the central tendency, or the general, typical, or representative value of the group of scores.
   D. You can visualize the mean as a kind of balancing point for the distribution of scores: Imagine a board balanced over a log; on the board are piles of blocks distributed along the board, one block for each score in the distribution; the mean is the point on the board where the weight of the blocks on each side would exactly balance.
   E. The formula for the mean is $M = \Sigma X / N$.
      1. $M$ is a symbol for the mean.
      2. $\Sigma$ is the symbol for "sum of."
      3. $X$ refers to scores in the distribution of the variable X.
      4. $N$ stands for the number of scores in the distribution.
   F. The mode is an alternative measure of central tendency.
      1. It is the most common single number in a distribution.
      2. It is the value with the largest frequency in a frequency table, the high point or peak of a distribution's frequency polygon or histogram.
      3. In a perfectly symmetrical unimodal distribution, and in a few other cases, the mode is the same as the mean.
      4. When the mode is not the same as the mean, it generally corresponds less well to what we intuitively understand as central tendency.
      5. The mode, unlike the mean, can be unaffected by changes in some scores.
      6. Psychologists rarely use the mode.
   G. The median is another alternative measure of central tendency.
      1. It is the middle score if you line up all the scores from highest to lowest.
      2. If there are two middle scores, as can happen with an even number of cases, the median is their average (which falls between the two values when they differ).
      3. It is less distorted by a few extreme cases (outliers) than the mean.
II. The Variance and the Standard Deviation
   A. These measure how spread out a distribution is.
   B. Variance.
      1. It is the average of each score's squared difference from the mean.
         a. First subtract the mean of the distribution from each score in the distribution.
         b. Square each of these deviation scores (so that positive and negative deviations do not cancel each other out when summed).
         c. Add up all the squared deviation scores.
         d. Divide this sum of squared deviations by the number of cases.
      2. You can visualize the variance as an average area, considering each squared deviation as a square whose sides are the amount of deviation.
      3. The variance plays an important role in many other statistical procedures, but is used only occasionally by itself as a descriptive statistic because it is scaled in squared units, a metric that does not give a very direct, intuitive sense of just how spread out the distribution is.
   C. The standard deviation.
      1. It is the most widely used statistic for describing the spread of a distribution.
      2. It is the square root of the variance.
      3. Roughly speaking, the standard deviation is the average amount that scores differ from the mean. (It is not exactly this because the squaring, summing, and taking the square root does not quite give the same result.)
   D. The variance formula is $SD^2 = \Sigma (X - M)^2 / N$.
      1. $SD^2$ is the symbol for the variance.
      2. $\Sigma (X - M)^2$ describes the sum of squared deviations from the mean.
      3. The sum of squared deviations from the mean is also symbolized as $SS$; thus, $SD^2 = SS / N$.
   E. The standard deviation formula is $SD = \sqrt{SD^2}$.
   F. The above formulas are "definitional"; in other books you may see them in a different "computational" form.
      1. Computational formulas are mathematically equivalent and easier to use when computing by hand or with a hand calculator.
      2. Computational formulas are rarely used today in research practice because most statistical computations are done by computer.
      3. This textbook emphasizes the definitional formulas (the version of the formula that corresponds to and reminds you of the definition of the statistic).
         a. Thus, doing exercises reinforces understanding.
         b. But the exercises involve fewer and simpler numbers than most real research situations, so that the total time to complete the exercises is no more than if you were using computational formulas.
      4. The traditional computational formulas are provided in a Chapter Appendix.
   G. Variance and standard deviation are sometimes computed using the sum of squared deviations divided by N - 1.
      1. These situations are described beginning in Chapter 9.
      2. Hand calculators and computer outputs sometimes give this figure instead of the one from the formula emphasized in this chapter.
III. Z Scores
   A. A Z score is an ordinary score transformed so that it better describes that score's location in a distribution.
   B. It is the number of standard deviations the score is above the mean (if it is a positive Z score) or the number of standard deviations the score is below the mean (if it is a negative Z score). (Thus, the standard deviation serves as a kind of standard yardstick through changing raw scores to Z scores.)
   C. Z scores have many practical uses and are crucial ingredients in many of the statistical procedures in the rest of this book.
   D. Scores on different scales can be easily compared once they are converted to Z scores.
   E. The formula for converting from a raw score is $Z = (X - M) / SD$.
   F. The formula for converting to a raw score is $X = (Z)(SD) + M$.
   G. There are three characteristics of a distribution of Z scores.
      1. The mean is always exactly 0 (because converting to a Z score involves subtracting the mean from each raw score).
      2. The standard deviation is always exactly 1.0 (because converting to a Z score involves dividing each deviation score by the standard deviation).
      3. The variance is always exactly 1.0 (because it is the square of the standard deviation, which is always 1.0).
IV. Controversies and Limitations: The Tyranny of the Mean
   A. Although the use of statistics is a central part of psychology, there are several opposing schools of thought.
   B. Behaviorism was the first to criticize statistical procedures.
      1. Behaviorism dominated much of the history of psychology research and rejected the study of inner states because they are impossible to observe objectively; hence it focused instead on externally observable behavior.
      2. Its leading modern exponent, B. F.
         Skinner, was opposed to using statistics in behavioral research because the averaging of observations over many cases can lose or distort the information revealed by each case.
   C. Humanistic psychology was the next to criticize statistics.
      1. In the 1950s it became the "third force," in reaction to behaviorism and psychoanalysis (the only significant applied psychology at the time).
      2. It holds that human experience should be studied intact, as a whole, as it is experienced by individuals.
      3. It does not object to all uses of statistics, but notes that human experience can never be fully reduced to numbers and that each individual's experience is unique.
      4. This emphasis on an idiographic approach (in-depth study of the single person) over a nomothetic approach (the search for general laws that apply over many persons) has been a long tradition in clinical and personality psychology as well.
   D. Qualitative research is sometimes seen as an alternative or complement to quantitative research methods.
      1. Qualitative methods are based in part on phenomenology, a philosophical position opposed to logical positivism (a basis of modern scientific thinking that holds there is an objective reality to be known); phenomenology seeks instead to gain a deep understanding of the unique reality of each individual.
      2. Qualitative methods also come from the ethnographic research tradition in cultural anthropology.
      3. Many proponents hold that one should first use qualitative methods to discover which variables are most important, then determine their incidence in the larger population through quantitative methods.
   E. The "statistical mood." Some depth psychologists raise the concern that emphasizing averages dilutes our feelings about the importance of the single individual and thus, among other consequences, makes immoral actions more likely.
V. The Mean, Variance, Standard Deviation, and Z Scores as Described in Research Articles
   A.
      The mean, variance, and standard deviation are commonly reported, and in a variety of ways: in the text, in tables, or in graphs.
   B. Z scores rarely appear in research articles.

Formulas

I. Mean ($M$)
   Formula in words: Sum of all the scores divided by the number of scores.
   Formula in symbols: $M = \Sigma X / N$ (2-1)
   $\Sigma$ is an instruction to sum all the scores.
   $X$ is each score in the distribution of the variable X.
   $\Sigma X$ is the sum of all the scores in the distribution of the variable X.
   $N$ is the number of scores in the distribution.
II. Variance ($SD^2$)
   Formula in words: Average of each score's squared difference from the mean.
   Formula in symbols: $SD^2 = \Sigma (X - M)^2 / N$ or $SD^2 = SS / N$ (2-2, 2-3)
   $\Sigma (X - M)^2$ is the sum of squared deviations from the mean.
   $SS$ is the sum of squared deviations from the mean.
III. Standard Deviation ($SD$)
   Formula in words: Square root of the variance (square root of the average of each score's squared difference from the mean).
   Formula in symbols: $SD = \sqrt{SD^2}$ (2-4)
IV. Z Scores from Raw Scores
   Formula in words: The number of standard deviations above or below the mean: the deviation score (the score minus the mean) divided by the standard deviation.
   Formula in symbols: $Z = (X - M) / SD$ (2-7)
V. Raw Scores from Z Scores
   Formula in words: Multiply the Z score times the standard deviation (to get the raw deviation above or below the mean) and add the mean.
   Formula in symbols: $X = (Z)(SD) + M$ (2-8)

How to Compute the Mean

I. Add up all the scores.
II. Divide by the number of scores.

How to Compute the Variance and the Standard Deviation

I. Compute the mean ($M$): Add up all the scores and divide by the number of scores.
II. Compute the deviation scores: Subtract the mean from each score.
III. Compute the squared deviation scores: Multiply each deviation score times itself.
IV. Compute the sum of the squared deviation scores ($SS$): Add up all the squared deviation scores.
V.
   Compute the variance ($SD^2$), the average of the squared deviation scores: Divide the sum of the squared deviation scores by the number of scores.
VI. Compute the standard deviation ($SD$), the square root of the average of the squared deviation scores: Find the square root of the number computed above.

How to Convert a Raw Score to a Z Score

I. Compute the deviation score: Subtract the mean from the raw score.
II. Compute the Z score: Divide the deviation score by the standard deviation.

How to Convert a Z Score to a Raw Score

I. Compute the deviation score: Multiply the Z score by the standard deviation.
II. Compute the raw score: Add the mean to the deviation score.

Outline for Writing Essays on the Logic and Computations for the Mean, Variance, Standard Deviation, and Z Scores

The reason for writing essay answers in the practice problems and tests is that this task develops and then demonstrates what matters so very much: your comprehension of the logic behind the computations. (It is also a place where those better at words than numbers can shine, and where those better at numbers can develop their skills at explaining in words.) Thus, to do well, be sure to do the following in each essay: (a) give the reasoning behind each step; (b) relate each step to the specifics of the content of the particular study you are analyzing; (c) state the various formulas in nontechnical language, because as you define each term you show you understand it (although once you have defined it in nontechnical language, you can use it from then on in the essay); (d) look back and be absolutely certain that you made it clear just why that formula or procedure was applied and why it is the way it is.
The outlines below are examples of ways to structure your essays. There are other completely correct ways to go about it. And this is an outline for an answer; you are to write the answer out in paragraph form.
Examples of full essays are in the answers to Set I Practice Problems in the back of the text. These essays are sometimes long for you to write (and for others to grade). But this is the very best way to be sure you understand everything thoroughly. You engrain it in your mind. The time is never wasted. It is an excellent way to study.

Essays on Finding the Mean, Variance, and Standard Deviation

I. Find the mean.
   A. Procedure: $M = \Sigma X / N$.
   B. Explanation: This is the ordinary average, the sum of the scores divided by the number of scores.
II. Find the variance.
   A. Procedure: $SD^2 = \Sigma (X - M)^2 / N$.
   B. Explanation.
      1. Describes the spread of the scores.
      2. Finds the average of the squared amount each score differs from the mean.
III. Find the standard deviation.
   A. Procedure: $SD = \sqrt{SD^2}$.
   B. Explanation.
      1. The variance is an average of squared scores; by taking its square root, the measure of spread of the scores is returned to ordinary nonsquared scores.
      2. The result is approximately the average amount each score varies from the mean.
      3. To be exact, it is the square root of the average squared deviation from the mean.
      4. Due to the squaring, averaging, and square root process, it is not quite the same as the average amount each score varies from the mean, but the use of squaring in the process avoids mathematical problems (for example, it eliminates the sign of the deviations, that is, the fact that some are negative and some positive).

Essays Involving Z Scores

I. Find a Z score based on a raw score.
   A. Procedure: $Z = (X - M) / SD$.
   B. Explanation.
      1. Explain mean and standard deviation (as described in outline above).
      2. Finding a Z score converts an ordinary score to its number of standard deviations above or below the mean.
      3. This is done by subtracting the mean from the score and dividing the result by the standard deviation.
      4. This puts the score on a scale that is highly standard: for example, high scores (those above the mean) are always positive Z scores, low scores (those below the mean) are always negative Z scores, and the amount a Z score is above or below the mean is in direct proportion to the standard deviation.
      5. This procedure puts scores on different variables onto the same scale, permitting comparisons between them.
II. Find a raw score based on a Z score.
   A. Procedure: $X = (Z)(SD) + M$.
   B. Explanation.
      1. Explain mean and standard deviation (as described in outline above).
      2. This converts a Z score, a special score that indicates a score's number of standard deviations above or below the mean, back to an ordinary score.
      3. This is done by multiplying the Z score times the standard deviation to get the number of ordinary score units above or below the mean, and then adding this to the mean to get the actual raw score.

Chapter Self-Tests

Multiple-Choice Questions

1. Six students record the amount of time studied on a particular evening (rounded off to the nearest hour). They report 0, 0, 1, 1, 4, and 6 hours. What is the mean time studied?
2. What does "$\Sigma X$" refer to?
   a. standard deviation of X.
   b. expected value of X.
   c. estimated value of X.
   d. sum of X.
3. What is the mode of the following scores?
   0, 1, 1, 1, 2, 3, 7, 8, 8, 9, 15
4. What is the median of the following scores?
   0, 1, 1, 1, 2, 3, 7, 8, 8, 9, 15
5. In the following set of scores, which would be the preferred measure of central tendency?
   5, 41, 42, 42, 44, 46, 47, 47, 47
   a. mean.
   b. median.
   c. mode.
   d. standard error.
6. The variance is
   a. the sum of the squared deviations from the mean.
   b. the average of the deviations from the mean.
   c. the sum of the square roots of the deviations from the mean.
   d. the average of the squared deviations from the mean.
7. If the variance is 7, what is the standard deviation?
   a. $\sqrt{7}$.
   b. $7^2$.
   c. $7 - M$.
   d. $(7 - M)^2$.
8. What is the variance of the four scores, 1, 5, 5, and 9?
9.
The standard deviation of a distribution of Z scores is always
  a. 0.
  b. 1.
  c. smaller than the mean of the raw scores.
  d. greater than the standard deviation of the raw scores.
10. A person has a Z score of .5. If the mean of the distribution is 71 and the standard deviation is 20, what is this person's raw score?

Fill-In Questions
1. The sum of the scores divided by the number of scores is the _____.
2. The mean, median, and mode are examples of indicators of the _____ of a distribution.
3. A study produces the scores 14, 15, 17, 18, and 18. What is N? _____
4. The _____ of the scores 6, 6, 6, 7, and 9 is 6.
5. If you line up all the scores from highest to lowest, the middle score is the _____.
6. If a group of scores are 14, 17, 17, 18, 18, and 91, the score of 91 is called a(n) _____.
7. In symbols, the formula for the variance is _____.
8. A deviation score is the score minus _____.
9. The _____ is the most commonly used descriptive statistic for indicating spread of a group of scores.
10. The formula for converting a raw score to a Z score is _____.

Problems and Essays
1. A psychologist administers a test of hand-eye coordination to eight severely depressed adult men and finds scores of 3.1, 3.8, 4.0, 4.5, 4.5, 5.4, 6.0, and 8.7. (a) Compute the mean, variance, and standard deviation. (b) Explain what you have done and what the results mean to a person who has never had a course in statistics.
2. A person visits a vocational counselor and is administered various tests. The person scores 50 on a test that measures aptitude for a career in sales (for people in general on this test, M = 40, SD = 4) and 95 on a test of aptitude for a career in education (for people in general on this test, M = 80, SD = 20). (On both tests high scores mean greater aptitude.) (a) In relation to other people, what is this person's greater aptitude? (Be sure to show the calculations that are the basis of your answer.) (b) Explain what you have done and the basis of your conclusion to a person who has never had a course in statistics.
3.
Two children complete a standard test of vocabulary which was sent off to a special service for scoring. The school counselor received the results in terms of Z scores. One child, Mary, received a Z score of 1.23 and the other child, Susan, received a Z score of -.62. The school counselor is interested also in the raw number correct each student received. Looking up the information in the test's manual, the school counselor found out that the mean raw score for this test is 42 questions correct, with a standard deviation of 6.5. (a) What are the raw numbers correct for each child? (b) Explain what you have done to a person who has never had a course in statistics.
4. A researcher administered a questionnaire to a group of healthy adults all over 80 years old. The questionnaire included one item that asked about happiness with life, using a 10-point scale from 1 = very unhappy to 10 = very happy. The researcher reported the result on this scale as follows: "For the 65 subjects who completed this item, M = 6.83, SD = 2.41." Explain what this result means to a person who has never had a course in statistics.

Using SPSS/PC+ Studentware Plus with this Chapter
If you are using SPSS for the first time, before proceeding with the material in this section, read the Appendix on Getting Started and the Basics of Using SPSS/PC+ Studentware Plus.
You can use SPSS to compute the mean and variance and to convert a series of raw scores to Z scores. However, as you will see, the variance that SPSS gives you is computed using a slightly different formula than what you have been using in Chapter 2: instead of dividing the sum of squared deviations by N, it divides by N - 1. (This other way of computing the variance is also correct, but is used for a different purpose that you will learn in Chapter 9.) Thus, when computing the variance using SPSS, you have to make an adjustment to what SPSS gives you, using your hand calculator.
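The adjustment just described can be sketched in a few lines of Python. Python is not part of this guide's SPSS procedure; it is used here only as an illustration. The sketch assumes the therapy-session scores from the example in this section and relies on the standard library's `statistics.variance`, which divides by N - 1 in the same way the SPSS output is described above. The function name `n_minus_1_to_n` is hypothetical.

```python
import math
import statistics

# Therapy-session scores from the example in this section (fictional data).
sessions = [7, 8, 8, 7, 3, 1, 6, 9, 3, 8]


def n_minus_1_to_n(variance_n_minus_1, n):
    """Convert a variance computed with N - 1 in the denominator
    to the Chapter 2 variance, which divides by N."""
    return variance_n_minus_1 * (n - 1) / n


n = len(sessions)
# statistics.variance divides the sum of squared deviations by N - 1.
spss_style_variance = statistics.variance(sessions)
chapter2_variance = n_minus_1_to_n(spss_style_variance, n)
# The standard deviation is the square root of the adjusted variance.
chapter2_sd = math.sqrt(chapter2_variance)

print(round(spss_style_variance, 2))  # variance with N - 1
print(round(chapter2_variance, 2))    # variance with N
print(round(chapter2_sd, 2))          # standard deviation
```

Multiplying by (N - 1) / N works because both versions share the same sum of squared deviations and differ only in the number they divide it by.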
Once you have made the adjustment, you can take the square root of your result (again with your hand calculator) to get the standard deviation.
You should work through the example, following the procedures step by step. Then look over the description of the general principles involved and try the procedures on your own for some of the problems listed in the Suggestions for Additional Practice. Finally, you may want to try the suggestions for using the computer to deepen your understanding, and you can explore the additional, advanced SPSS procedure at the end, involving skew and kurtosis.
I. Example
  A. Data: The number of therapy sessions for each of 10 clients of a psychotherapist (fictional data), from the example in the text. The numbers of sessions are 7, 8, 8, 7, 3, 1, 6, 9, 3, and 8.
  B. Follow the instructions in the SPSS Appendix for starting up SPSS and be sure the cursor is in the ScratchPad window.
  C. Enter the data as follows.
    1. Type DATA LIST FREE / SESSIONS. and press Enter to go to the next line.
    2. Type BEGIN DATA. and press Enter to go to the next line.
    3. Type 7, the number of sessions for the first subject, and press Enter.
    4. Type 8, the number of sessions for the second subject, and press Enter. At this point the screen should look like Figure SG2-1.

Chapter 3 Correlation
Learning Objectives
To understand, including being able to conduct any necessary computations or carry out any necessary procedures:
■ Scatter diagrams.
■ Patterns of correlation: linear, positive, negative, curvilinear, and no correlation.
■ The correlation coefficient (r).
■ Causality and correlation.
■ Use of proportionate reduction in error (r²) to compare correlations.
■ Effect of restriction in range and unreliability of measurement on the correlation coefficient.
■ Binomial-effect-size display.
■ How correlation results are presented in research articles.
Chapter Outline
I. Key Terms and Concepts
  A.
The pattern of high scores on one variable going with high scores on the other variable, low scores going with low scores, and moderate with moderate, is an example of a correlation.
  B. When two variables are correlated, and one is considered a cause of the other, they are distinguished by different terms.
    1. The variable considered as cause is called the independent variable.
    2. The variable considered as effect is called the dependent variable.
  C. When two variables are correlated and it is not clear what their causal relationship is (a common situation in research involving correlations), we still often speak of predicting one variable from the other.
    1. The variable being predicted from is called the predictor variable.
    2. The variable being predicted about is still usually called the dependent variable. (The proper but rarely used term is "criterion variable.")
II. Graphing Correlations: The Scatter Diagram
  A. A scatter diagram displays the degree and pattern of relation of the two variables.
  B. Making a scatter diagram involves three steps.
    1. Draw the axes and determine which variable should go on which axis.
      a. The independent or predictor variable goes on the horizontal axis.
      b. The dependent variable goes on the vertical axis.
    2. Determine the range of values to use for each variable and mark them on the axes.
      a. Your numbers should go upward on each axis, starting from where the axes meet.
      b. Ordinarily, begin with the lowest value your measure can possibly have, or zero, and continue to the highest value your measure can possibly have.
      c. When there is no obvious or reasonable lowest or highest possible value, begin or end at a value that is as high or low as people ordinarily score in the group of people of interest for your study.
    3. Mark a dot for the pair of scores for each case.
      a. Locate the place on the horizontal axis for that person's score on the predictor variable.
      b. Move up to the height on the vertical axis which represents that person's score on the dependent variable and mark a clear dot.
      c. If there are two cases in one place, you can either put the number "2" in that place or locate a second dot as near as possible to the first (if possible touching), but being sure it is clear that there are in fact two dots in the one place.
III. Patterns of Correlation
  A. These can be identified by the general pattern of dots on the scatter diagram.
  B. Linear correlation is where the pattern of dots follows a straight line.
  C. Positive correlation (or positive linear correlation) is the term for one pattern.
    1. It refers to low scores on one variable going with low scores on the other variable, mediums with mediums, and highs with highs.
    2. On the scatter diagram, the dots follow a line which slopes up and to the right. (That is, the line has a positive slope.)
  D. Negative correlation (or negative linear correlation) is the term for another pattern.
    1. It refers to low scores on one variable going with high scores on the other variable, mediums with mediums, and highs with lows.
    2. On the scatter diagram, the dots follow a line which slopes down and to the right. (That is, the line has a negative slope.)
  E. Curvilinear correlation is yet another pattern.
    1. It refers to when the relationship between two variables does not follow any kind of straight line, positive or negative, but instead follows a curving or more complex pattern.
    2. Not all curvilinear relationships are even simple curves.
  F. No correlation means no pattern: the two variables are completely unrelated to each other.
    1. On the scatter diagram the dots are spread everywhere, and there is no line, straight or otherwise, that is any reasonable representation of a simple trend.
    2. Note that in actual research situations sometimes the relationship between two variables does exist (that is, it is not really a no-correlation situation), but it is not very strong and thus hard to see visually in a scatter diagram.
IV. Computing an Index of Degree of Linear Correlation: The Pearson Correlation Coefficient (r)
  A. The degree of correlation can be considered numerically or graphically.
    1. It is the extent to which there is a clear pattern, some particular relationship, between the distributions of scores on two variables.
      a. For a positive linear correlation, it is the extent to which high numbers go with highs, mediums with mediums, lows with lows.
      b. For a negative linear correlation, it is the extent to which lows go with highs, etc.
    2. In terms of a scatter diagram, a high degree of linear correlation means that the dots all fall very close to a straight line (the line sloping up or down depending on whether the linear correlation is positive or negative).
  B. The computation of the degree of linear correlation has a clever logic behind it.
    1. The first requirement of any such measure is that it be a consistent way of deciding for both variables what is a high and what is a low score (and how high is a high and how low is a low); this is accomplished by converting all scores to Z scores.
      a. This puts both variables on the same scale.
      b. A high score will always be positive.
      c. A low score will always be negative.
      d. The extent to which a score is high or low will be in proportion to its standard deviation.
    2. A second requirement is a way to combine this information to get a number that reflects the degree to which highs go with highs, etc.; this is accomplished by computing the sum of cross-products of Z scores.
      a. A cross-product of Z scores is the Z score of the subject on one variable times the Z score of that subject on the other variable.
      b. If highs go with highs (positive Z times positive Z) or lows with lows (negative Z times negative Z), the result is positive in either case; thus a high positive sum (of these cross-products, over all subjects) results when there is a strong positive linear correlation.
      c. If highs go with lows (positive Z times negative Z) or lows with highs (negative Z times positive Z), the result is negative in either case; thus a high negative sum results when there is a strong negative linear correlation.
      d. If highs sometimes go with highs (positive Z times positive Z) but sometimes highs go with lows (positive Z times negative Z), and so forth, the result is that positives and negatives cancel out; thus the sum is near zero when there is no correlation.
    3. A third requirement is that the measure of the degree of correlation have some standard scale; this is accomplished by finding the average of the cross-products of Z scores (dividing the sum by the number of cases).
      a. This is called r, the Pearson correlation coefficient.
      b. When there is a perfect positive linear correlation, r = 1.
      c. When there is a perfect negative linear correlation, r = -1.
      d. When there is no linear correlation, r = 0.
      e. In-between degrees of correlation have values between 0 and 1 (for positive correlations) and between 0 and -1 (for negative correlations).
  C. In terms of a formula: r = Σ(ZxZy) / N.
  D. A computational formula for the correlation coefficient is described in Chapter Appendix I.
V. Testing the Statistical Significance of the Correlation Coefficient
  A. The correlation coefficient, by itself, is a descriptive statistic: it describes the degree and direction of linear correlation in the particular group measured.
  B. However, when conducting research in psychology we are often more interested in a particular set of scores as representative of the larger population which we have not directly studied.
  C.
A correlation is said to be "significant" if it is very unlikely (less than 5% or 1% probability) that we could have obtained a correlation this big if in fact the overall group had no correlation.
  D. The logic and procedures of statistical significance are the major focus of the text starting with Chapter 5, but are not considered in any more detail in this chapter on correlation (except in Chapter Appendix II).
VI. Issues in Interpreting the Correlation Coefficient
  A. Causality and correlation.
    1. For any particular correlation between variables X and Y, there are three possible directions of causality.
      a. X causes Y.
      b. Y causes X.
      c. Some third factor could be causing both X and Y.
    2. The issue is confused by two uses of the word "correlation":
      a. A statistical procedure.
      b. A type of research design (that does not use random assignment and is often, but not always, analyzed using the correlation coefficient).
  B. Comparing magnitude of correlations.
    1. Larger values of r (values further from zero) indicate a higher degree of correlation, but are not proportionally related: for example, an r of .4 is not twice as strong as an r of .2.
    2. To compare correlations with each other, the measure you use is r², called the proportionate reduction in error. (The meaning of this statistic is discussed in detail in Chapter 4.)
  C. Restriction in range: When only a limited range of the possible values on one or both variables is included in the group studied, the resulting correlation cannot be properly extended to apply to the entire range of values the variable might have among people in general.
  D. Unreliability of measurement.
    1. A measure that is not perfectly accurate is unreliable.
    2. If one or both variables in a correlation are unreliable, the correlation is reduced (attenuated).
    3. More advanced texts describe formulas for estimating what the correlation between two variables would be if the two variables were perfectly reliable (called disattenuating or correcting for attenuation).
VII. Controversies and Recent Developments: What is a Large Correlation?
  A. Traditionally a large correlation is considered to be .5 or above, a moderate correlation to be about .3, and a small correlation to be about .1.
  B. In fact, in psychology it is rare to obtain correlations greater than .4.
  C. It is traditional to caution that a low correlation is not very important even if it is statistically significant.
  D. Even experienced research psychologists tend to overestimate the degree of association that a correlation coefficient represents.
  E. However, Rosnow and Rosenthal (1989) have argued that in many cases even very low correlations can have important practical implications.
  F. They illustrate their point with a binomial-effect-size display:
    1. This is a table in which the cases are divided in half on each variable, and the number of cases in each of the four combinations is considered.
    2. In this layout, the difference in percentages between the two halves is exactly what you would get if you computed the correlation coefficient.
VIII. How Correlation Coefficients Are Described in Research Articles
  A. Correlation coefficients are often described in the text of a research article.
    1. They are usually reported with the letter r, an equal sign, and the correlation coefficient (e.g., r = .31).
    2. Sometimes the significance level, such as "p < .05," will also be reported.
  B. A table of correlations, called a correlation matrix, is common when correlations have been computed among many variables.

Formula
Correlation coefficient (r)
Formula in words: Average of the cross-product of Z scores.
Formula in symbols: r = Σ(ZxZy) / N   (3-1)
Zx is the Z score for each case on the X variable.
Zy is the Z score for each case on the Y variable.
ZxZy is the cross-product of Z scores (for each case, Zx times Zy).
N is the number of cases.

How to Graph and Compute a Correlation
I. Construct a scatter diagram.
  A. Draw the axes and determine which variable should go on which axis.
  B. Determine the range of values to use for each variable and mark them on the axes.
  C. Mark a dot for the pair of scores for each case.
  D. Determine if clearly curvilinear; if so, do not compute the correlation coefficient (or do so with the understanding that you are only describing the degree of linear relationship).
II. Estimate the direction and degree of linear correlation.
III. Compute the correlation coefficient.
  A. Convert all scores to Z scores.
  B. Compute the cross-product of the Z scores for each case.
  C. Sum the cross-products of the Z scores.
  D. Divide by the number of cases.
IV. Check the sign and size of your computed correlation coefficient against your estimate from the scatter diagram.

Outline for Writing Essays on the Logic and Computations for a Correlation Problem
The reason for your writing essay questions in the practice problems and tests is that this task develops and then demonstrates what matters so very much: your comprehension of the logic behind the computations. (It is also a place where those better at words than numbers can shine, and where those better at numbers can develop their skills at explaining in words.) Thus, to do well, be sure to do the following in each essay: (a) give the reasoning behind each step; (b) relate each step to the specifics of the content of the particular study you are analyzing; (c) state the various formulas in nontechnical language, because as you define each term you show you understand it (although once you have defined it in nontechnical language, you can use it from then on in the essay); (d) look back and be absolutely certain that you made it clear just why that formula or procedure was applied and why it is the way it is.
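As a concrete companion to the computation steps listed under "How to Graph and Compute a Correlation" (convert to Z scores, multiply, sum, divide by N), here is a minimal Python sketch. It is an illustration only, not the guide's own procedure. It assumes the five-manager data (number supervised and stress level) that this guide uses in its SPSS example for this chapter, and the function names `z_scores` and `pearson_r` are hypothetical.

```python
import math

# Number supervised (X) and stress level (Y) for five managers
# (fictional data from this chapter's SPSS example).
supervised = [6, 8, 3, 10, 8]
stress = [7, 8, 1, 8, 6]


def z_scores(scores):
    """Convert raw scores to Z scores, using the SD that divides by N."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)
    return [(x - mean) / sd for x in scores]


def pearson_r(xs, ys):
    """Average of the cross-products of Z scores: r = sum(Zx * Zy) / N."""
    zx, zy = z_scores(xs), z_scores(ys)
    cross_products = [a * b for a, b in zip(zx, zy)]
    return sum(cross_products) / len(xs)


r = pearson_r(supervised, stress)
print(round(r, 2))       # correlation coefficient
print(round(r ** 2, 2))  # proportionate reduction in error (r squared)
```

Note that the Z scores here use the standard deviation computed by dividing by N, matching the Chapter 2 formula; a standard deviation based on N - 1 would not give r when the sum of cross-products is divided by N.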
The outlines below are examples of ways to structure your essays. There are other completely correct ways to go about it. And this is an outline for an answer; you are to write the answer out in paragraph form. Examples of full essays are in the answers to Set I Practice Problems in the back of the text. These essays are necessarily very long for you to write (and for others to grade). But this is the very best way to be sure you understand everything thoroughly. One shortcut you may see on a test is that you'll be asked to write your answer for someone who understands statistics up to the point of the new material you are studying. You can choose to take the same shortcut in these practice problems (maybe writing for someone who understands right up to whatever point you yourself start being just a little unclear). But every time you write for a person who has never had statistics at all, you review the logic behind the entire course. You engrain it in your mind. Over and over. The time is never wasted. It is an excellent way to study.
I. Construct a scatter diagram.
  A. Procedure: A two-dimensional graph with each variable on one axis and a dot for each case representing its scores on the two variables.
  B. Explanation: This graph shows the pattern of relationship between the two variables.
II. Estimate the direction and degree of linear correlation.
  A. Procedure: Inspect the pattern of dots.
  B. Explanation: The general pattern of dots shows whether highs go with highs, lows with lows (or the reverse), indicating the degree of association. (Note the pattern for your particular data.)
III. Compute the correlation coefficient.
  A. Procedure: r = Σ(ZxZy) / N.
  B. Explanation.
    1. The correlation represents the degree to which high scores go with high scores and low scores with low scores (or if describing a negative correlation, the reverse).
    2.
One can identify the extent to which a score is high or low by converting to Z scores (explain meaning of a Z score, and mean and standard deviation in lay terms, as per the explanation in the outline for writing essays in Chapter 2 of this Study Guide).
      a. With Z scores, a high score is always positive.
      b. A low score is always negative.
      c. The degree to which a score is high or low is in proportion to the standard deviation.
    3. After converting all scores to Z scores, one multiplies each individual's Z score on one variable times that individual's Z score on the other variable.
      a. If describing a positive correlation, note that if highs go with highs then positives will always be multiplied by positives (giving a positive product) and if lows go with lows then negatives will always be multiplied by negatives (also giving a positive product); the sum over all cases will thus be high and positive.
      b. If describing a negative correlation, note that if highs go with lows then positives will always be multiplied by negatives (giving a negative product); the sum over all cases will thus be a large negative number.
    4. One then sums the cross-products and divides by the number of cases to get an average cross-product (called r).
      a. If describing a positive correlation, note that the more highs go with highs and lows with lows (or the dots fall near a straight line that slopes up), the closer r is to 1, with no association being an r of 0. (Discuss the degree of correlation of your particular result.)
      b. If describing a negative correlation, note that the more highs go with lows and lows with highs (or the dots fall near a straight line that slopes down), the closer r is to -1, with no association being an r of 0. (Discuss the degree of correlation of your particular result.)
IV. Check the sign and size of your computed correlation coefficient against your estimate from the scatter diagram (Step II above).

Chapter Self-Tests
Multiple-Choice Questions
1.
A variable that is considered to be an effect is called a(n)
  a. dependent variable.
  b. predictor variable.
  c. causal variable.
  d. independent variable.
2. Which of these statements about scatter diagrams is true?
  a. their usual purpose is to describe the relationship between three or more variables.
  b. when the dots on the graph seem to form a straight line, this is called a curvilinear correlation.
  c. the lowest to highest values of the independent variable are marked on the vertical axis and the lowest to highest values of the dependent variable are marked on the horizontal axis.
  d. each individual's pair of scores is represented as a dot on this two-dimensional graph.
3. A study finds that the more exercise people do, the less money they spend on medical treatment, but only up to a point. Beyond that point, the more exercise they do, the more money they spend on medical treatment. The relation between amount of exercise and money spent on medical treatment represents
  a. a positive linear correlation.
  b. a negative linear correlation.
  c. a curvilinear correlation.
  d. no correlation (that is, neither linear nor curvilinear).
4. Which choice best describes the data on this scatter diagram?
  a. no correlation.
  b. curvilinear correlation.
  c. positive linear correlation.
  d. negative linear correlation.
5. Which of the following statements is true about the correlation coefficient?
  a. it is an index of the degree of curvilinear correlation.
  b. it is the average of the cross-products of Z scores.
  c. it is symbolized as c.
  d. it is highly positive when there is a strong negative linear correlation.
6. An employer conducts a survey of how much coffee workers drink each day and how much work they get done. The result is a positive correlation that is statistically significant at the .05 level. What should she conclude?
  a. Coffee increases the rate at which people work.
  b. Working quickly causes people to drink more coffee.
  c.
Having a faster metabolism causes people to work faster and to crave coffee.
  d. She cannot make any definite conclusions about the direction of causality just from knowing that the correlation is positive and significant.
7. Given that the correlation coefficient is .3, what is the proportionate reduction in error?
8. Suppose you were conducting a study of the relation between appearance (good looks) and income. Which of the following is likely to DECREASE the size of the correlation you would find?
  a. studying a group of people who vary a great deal on appearance.
  b. using a measure of income that is highly exact.
  c. using a measure of good looks that is fairly ambiguous.
  d. none of the above.
9. In psychology a large correlation coefficient is traditionally considered to be a correlation that is at least
  a. .3
  b. .5
  c. .7
  d. .9
10. In a research article, the correlations among several variables are often presented in a table called a
  a. contingency table.
  b. scatter diagram.
  c. correlation matrix.
  d. C table.

Fill-In Questions
1. If X is considered to be the cause of Y, X is called the _____ variable.
2. In a scatter diagram of the relation between amount of tiredness and confusion in thinking (in which tiredness is considered to be the cause of confusion in thinking), a dot located at 6 across and 4 up means the person had a score of 6 on _____.
3. A scatter diagram shows a pattern of dots that follow a line that starts going up and to the right and then about half-way along stops going up and just stays flat as it continues to the right. This pattern is an example of _____ correlation.
4. A study finds that the more self-confidence people have, the less fear they have of meeting new people. The relation between self-confidence and fear of meeting new people is an example of a(n) _____ correlation.
5. If two variables have a positive linear correlation, then in general the cross-products of Z scores should be _____.
6.
When there is a perfect negative correlation, the average of the cross-product of Z scores equals _____.
7. If pairs of scores for a group of people are highly correlated, a researcher will often conclude that it is likely that for people in general these scores are highly correlated. The researcher would say that this result is statistically _____.
8. A correlation of .6 is considered to be _____ times as strong a relationship as a correlation of .2.
9. When a correlation is computed between two variables that are measured with questionnaires having a lot of ambiguous questions (and are thus not very reliable), the resulting correlation is likely to be _____ ("lower than," "higher than," or "the same as") if the two variables were measured with a completely unambiguous questionnaire.
10. Rosnow and Rosenthal have shown that even small correlations can have important implications. One way of illustrating this is by making a table in which each variable is divided into high and low and the numbers of cases falling into each high-low combination of the two variables is shown. This kind of a table is called a(n) _____.

Problems and Essays
1. Four individuals kept records for a month on how many eggs they ate per day and then were measured on their cholesterol level. Here are the (fictional) results:

  Average Eggs Eaten Per Day    Cholesterol Level
  2                             210
  0                             100
  1                             180
  5                             270

(a) Make a scatter diagram of the raw data. (b) Describe the general pattern of the data in words. (c) Compute the correlation coefficient. (d) Compute the proportionate reduction in error. (e) Indicate plausible directions of causality in terms of the variables involved. (f) Explain your result to a person who has never had a course in statistics.
2. The following (fictional) data are from six LA gang leaders.
Here are raw scores and Z scores of the size of each leader's gang and of each leader's willingness to help in a campaign to stop gang violence:

  Gang Leader    Size of Gang       Rated Willingness to Help Campaign
                 Raw      Z         Raw      Z
  Gang A         24       -1.2      10       1.6
  Gang B         106      .4        4        -.5
  Gang C         42       -.9       7        .5
  Gang D         70       -.3       6        .2
  Gang E         90       .1        5        -.2
  Gang F         178      1.9       1        -1.6

(a) Make a scatter diagram of the raw data. (b) Describe the general pattern of the data in words. (c) Compute the correlation coefficient. (d) Compute the proportionate reduction in error. (e) Indicate plausible directions of causality in terms of the variables involved. (f) Explain your answer to a person who has never had a course in statistics.
3. A Canadian social psychologist conducted a (fictional) survey of attitudes towards a particular immigrant nationality among members of that nationality. The survey also included questions about how much the respondents identified with their group of national origin and about how many generations the person's parents had been in Canada (which ranged from 1, first-generation immigrants, through 6). The researcher reported the following results: "There appears to be a strong association between positive attitudes and degree of identification, with a correlation between the two measures of r = .51, p < .01. However, there was little association between positive attitudes and number of generations, r = .16, not significant." Explain this result to a person who has never had a course in statistics. (Discuss the statistical-significance aspect only in a very general way.)
4. An educational psychologist analyzed the relation between high school students' performance in various classes as measured by their grades and reported the (fictional) results in the following table.

             English    Math     Science    History
  English               .12      .19        .53**
  Math                           .68**      .09
  Science                                   .28*
  History

Explain these results to a person who has never had a course in statistics.
(Discuss the statistical-significance aspect only in a very general way.)

Using SPSS/PC+ Studentware Plus with this Chapter
If you are using SPSS for the first time, before proceeding with the material in this section, read the Appendix on Getting Started and the Basics of Using SPSS/PC+ Studentware Plus.
You can use SPSS to create a scatter diagram and to compute a correlation coefficient. You should work through the example, following the procedures step by step. Then look over the description of the general principles involved and try the procedures on your own for some of the problems listed in the Suggestions for Additional Practice. Finally, you may want to try the suggestions for using the computer to deepen your understanding, and you can explore the advanced SPSS procedure, a correlation matrix, at the end.
I. Example
  A. Data: The number supervised and stress level for five managers (fictional data), from the example in the text. The scores for the managers for the two variables are number supervised 6, stress 7; 8, 8; 3, 1; 10, 8; and 8, 6.

Chapter 4 Prediction
Learning Objectives
To understand, including being able to conduct any necessary computations:
■ Bivariate prediction model with Z scores and associated formulas, terminology, and symbols.
■ Why prediction is sometimes called regression.
■ Two methods of bivariate prediction with raw scores and associated terminology, formulas, and symbols.
■ Regression line.
■ Proportionate reduction in error in bivariate prediction and associated formulas, terminology, and symbols.
■ Multiple regression with Z scores and associated formulas, terminology, and symbols. (You are not expected to be able to compute βs, but should be able to interpret them and make predictions using them if they are given.)
■ Multiple regression with raw scores and associated formulas, terminology, and symbols.
  (You are not expected to be able to compute a and bs, but should be able to interpret them and make predictions using them if they are given.)
• The multiple correlation coefficient.
• Limitations of regression.
• Basic issues in interpreting the relative importance of predictor variables in multiple regression.
• How results of studies using multiple regression are reported in research articles.

Chapter Outline

I. Terminology of Bivariate Prediction
   A. In bivariate prediction (same as bivariate regression), we use a person's score on a predictor (or independent) variable to make predictions about a person's score on a dependent variable.
   B. The predictor variable is usually labeled X, the dependent variable Y.
II. The Bivariate Prediction Model with Z Scores
   A. A person's predicted Z score on the dependent variable is found by multiplying a particular number, called a regression coefficient, times that person's Z score on the predictor variable.
   B. Because we are working with Z scores (which are also called standard scores), the regression coefficient in this case is called a standardized regression coefficient.
      1. It is symbolized by the Greek letter beta ($\beta$).
      2. Because it is a kind of measure of how much weight or importance to give the predictor variable, it is called a "beta weight."
   C. Formula and symbols: Predicted $Z_Y = (\beta)(Z_X)$.
   D. $\beta = r$ (in bivariate prediction). Below is some insight as to why.
      1. If there is no correlation ($r = 0$), $\beta = 0$: The predictor variable is irrelevant, and the best predictor is the mean of the dependent variable (a $Z_Y$ of 0). ($Z_Y = [\beta][Z_X] = [0][Z_X] = 0$.)
      2. If there is a perfect correlation ($r = 1$), $\beta = 1$: The dependent variable's Z score exactly equals the predictor variable's Z score. ($Z_Y = [\beta][Z_X] = [1][Z_X] = Z_X$.)
      3. Thus, in the intermediate cases, when r is between 0 and 1, the best number for beta is also between 0 and 1.
   E. Why prediction is sometimes called regression.
      1. When there is less than a perfect correlation between two variables, the predicted dependent variable Z score is some fraction (the value of r) of the predictor variable Z score.
      2. As a result, the predicted dependent variable Z score is closer to its mean (that is, it regresses or returns toward a Z of zero).
III. Bivariate Prediction Using Raw Scores
   A. Convert the raw score on the predictor variable to a Z score.
   B. Multiply beta times this Z score to get the predicted Z score on the dependent variable.
   C. Convert the predicted Z score on the dependent variable to a raw score.
IV. Direct Raw-Score-to-Raw-Score Prediction
   A. Reduces the three-step process (III above) to a single formula which automatically takes into account the conversion into and from Z scores.
   B. Raw-score prediction formula: $\hat{Y} = a + (b)(X)$.
      1. b (the raw-score regression coefficient): $b = (\beta)(SD_Y / SD_X)$.
      2. a (the regression constant): $a = M_Y - (b)(M_X)$.
V. The Regression Line
   A. Graphic representation of the prediction model.
   B. A line in which the horizontal axis represents the predictor variable scores and the vertical axis represents the predicted scores on the dependent variable.
   C. The slope of the regression line.
      1. It equals b, the raw-score regression coefficient.
      2. This equivalence (of the slope of the regression line and b) emphasizes that a regression coefficient serves as a kind of rate of exchange between the predictor and dependent variable (that is, for each unit increase in the predictor variable, there is a predicted fractional increase in the dependent variable equal to b).
   D. How to draw the regression line.
      1. Draw and label the axes of the graph, as for a scatter diagram.
      2. Pick any value of the predictor variable, compute (using the prediction rule) the corresponding predicted value on the dependent variable, and mark the point on the graph.
      3. Do the same thing again, starting with any other value of the predictor variable (it is easiest to draw an accurate line if the predictor-variable values used in this step and in step 2 are far apart).
      4. Draw a line that passes through the two marks.
   E. You can check the accuracy of a drawn regression line by finding any third point.
   F. The point where the regression line crosses the vertical axis, the "Y intercept," is the point at which $X = 0$, and thus $\hat{Y} = a$.
VI. Error and Proportionate Reduction of Error
   A. The accuracy of predictions using a prediction model can be estimated by considering how much error would result using the prediction model to predict the scores used to compute the correlation coefficient in the first place.
   B. Error is the actual score minus the predicted score.
   C. We use squared error (in part because positive and negative errors would otherwise offset each other): $\text{Error}^2 = (Y - \hat{Y})^2$.
   D. Graphic interpretation of error: The vertical distance between the dot for a subject's actual score and the regression line.
   E. Proportionate reduction in error.
      1. It is the most common way to think about the accuracy of a prediction model.
      2. It is a comparison of squared errors indicating how much better predictions are likely to be using a particular model than making predictions without the model.
   F. Strategy for computing proportionate reduction in error.
      1. Compute total of squared errors using the prediction model: $SS_E = \sum (Y - \hat{Y})^2$.
      2. Compute total of squared errors not using a prediction model (that is, predicting the mean for each dependent variable score): $SS_T = \sum (Y - M)^2$.
      3. Compute the reduction using the prediction rule: Reduction $= SS_T - SS_E$.
      4. Compute the proportionate reduction: Reduction$/SS_T$, or $(SS_T - SS_E)/SS_T$.
   G. Some insight into meaning of proportionate reduction in error.
      1. If the prediction model is no improvement, $SS_E = SS_T$; thus, reduction = 0 and proportionate reduction = 0 or 0%.
      2. If the prediction model predicts perfectly, $SS_E = 0$; thus reduction $= SS_T - 0 = SS_T$, and proportionate reduction = 1 or 100%.
      3. For in-between cases, where the prediction rule is some improvement but does not predict perfectly, proportionate reduction in error is between 0% and 100%.
   H. Proportionate reduction in error $= r^2$.
   I. Proportionate reduction in error is also called the proportion of variance accounted for (because $SS_T$ is also the sum of squared deviations from the mean, the essential ingredient in the variance).
   J. Graphic interpretation of proportionate reduction in error.
      1. If each score is predicted to be the mean, the line representing these predictions would be a horizontal line.
      2. The proportionate reduction in error can be thought of as the extent to which the regression line's accuracy is greater than the horizontal line's accuracy.
VII. Extension to Multiple Regression and Correlation
   A. The association between a dependent variable and two or more predictor variables is called multiple correlation; making predictions in this situation is called multiple regression.
   B. These procedures have become increasingly important in psychology and are today very widely used.
   C. Because this is an advanced topic, it is only introduced in a general way in this text.
   D. Z score prediction model in multiple regression: Predicted $Z_Y = (\beta_1)(Z_{X_1}) + (\beta_2)(Z_{X_2}) + (\beta_3)(Z_{X_3}) + \ldots$
   E. In multiple regression, $\beta$ for a predictor variable is not the same as r for that predictor variable with the dependent variable.
      1. $\beta$ is usually lower (closer to 0) than r because part of what any one predictor variable measures will overlap with what the other predictor variables measure.
      2. In multiple regression, $\beta$ is based on the unique, distinctive contribution of the variable, excluding any overlap with other predictor variables.
   F. Raw-score prediction model in multiple regression: $\hat{Y} = a + (b_1)(X_1) + (b_2)(X_2) + (b_3)(X_3) + \ldots$
      (each b gives the raw-score rate of exchange for its predictor variable at any given levels of the other predictor variables).
   G. The multiple correlation coefficient (R) describes the overall correlation between the predictor variables, taken together, and the dependent variable.
      1. Due to overlap in predicting the dependent variable, R is usually less than (and cannot be more than) the sum of each predictor variable's r with the dependent variable.
      2. R ranges from 0 to 1.0 (unlike r, it cannot be negative).
   H. Proportionate reduction in error in multiple regression.
      1. It follows the same principle as in bivariate regression except errors are calculated using predictions based on the multiple regression prediction rule.
      2. Proportionate reduction in error $= R^2$.
VIII. Controversies and Limitations
   A. The limitations of correlation (see Chapter 3) apply with equal or greater force to bivariate and multiple prediction.
      1. It assesses only the linear aspect of a relationship.
      2. It is distorted by restriction in range.
      3. It is distorted by unreliability of measures.
      4. It does not indicate direction of underlying causal relationships.
   B. There is controversy over how to assess the relative importance of the several predictor variables in predicting the dependent variable.
      1. For purely predictive purposes, the regression coefficients (either standardized or raw score) serve well.
      2. But for a theoretical understanding of the relative importance of the different predictors, the regression coefficients are not necessarily the best indicators.
         a. They reflect only the unique contribution of the predictor variable to the prediction after considering all the other predictors.
         b. Thus, when predicting by itself, without considering the other predictors (that is, using r), a variable may appear to have a quite different importance relative to the other predictors.
      3. Whenever the predictor variables are correlated with each other (called multicollinearity, the usual situation in multiple regression), there is no agreed-upon approach to the question of the relative importance of these sorts of predictor variables.
IX. Prediction Models as Described in Research Articles
   A. Bivariate prediction models are rarely cited in psychology research articles; in most cases, simple correlations are reported.
   B. Multiple regression models are commonly reported (either in the text or a table), including the following.
      1. The regression coefficients ($\beta$s, or a and bs, or all of these).
      2. $R^2$.
      3. R.
      4. Statistical significance of $R^2$, regression coefficients, or both.

Formulas

I. Predicted Z score for a particular subject on the dependent variable (predicted $Z_Y$) in bivariate prediction.
   Formula in words: Standardized regression coefficient times the subject's Z score on the predictor variable.
   Formula in symbols: Predicted $Z_Y = (\beta)(Z_X)$ (4-1)
   $\beta$ is the standardized regression coefficient.
   $Z_X$ is the subject's known Z score on the predictor variable.

II. Standardized regression coefficient ($\beta$) in bivariate prediction.
   Formula in words: It is the same as the correlation coefficient.
   Formula in symbols: $\beta = r$
   r is the ordinary correlation coefficient between the predictor variable and the dependent variable.

III. Predicted raw score for a particular subject on the dependent variable ($\hat{Y}$) in bivariate prediction.
   Formula in words: The raw-score regression constant plus the product of the raw-score regression coefficient times the subject's score on the predictor variable.
   Formula in symbols: $\hat{Y} = a + (b)(X)$ (4-2)
   a is the raw-score regression constant.
   b is the raw-score regression coefficient.
   X is the subject's known raw score on the predictor variable.

IV. Raw-score regression coefficient (b)
   Formula in words: The standardized regression coefficient times the ratio of the standard deviation of the dependent variable to the standard deviation of the predictor variable.
   Formula in symbols: $b = (\beta)(SD_Y / SD_X)$ (4-3)
   $SD_Y$ is the standard deviation of the dependent variable.
   $SD_X$ is the standard deviation of the predictor variable.

V. Raw-score regression constant (a)
   Formula in words: The mean of the dependent variable minus the product of the raw-score regression coefficient times the mean of the predictor variable.
   Formula in symbols: $a = M_Y - (b)(M_X)$ (4-4)
   $M_Y$ is the mean of the dependent variable.
   $M_X$ is the mean of the predictor variable.

VI. Sum of squared error using the prediction rule ($SS_E$)
   Formula in words: Sum over all the scores of the square of each dependent variable score minus its predicted score using the prediction rule.
   Formula in symbols: $SS_E = \sum (Y - \hat{Y})^2$

VII. Sum of squared error predicting each score from the mean ($SS_T$)
   Formula in words: Sum over all the scores of the square of each dependent variable score minus the mean of the dependent variable scores.
   Formula in symbols: $SS_T = \sum (Y - M)^2$

VIII. Proportionate reduction in error ($r^2$)
   Formula in words: Amount of squared error reduced (sum of squared error using the mean to predict minus sum of squared error using the prediction rule) divided by the total squared error available to be reduced (sum of squared error using the mean to predict).
   Formula in symbols: $r^2 = (SS_T - SS_E) / SS_T$ (4-6)

IX. Predicted Z score for a particular subject on the dependent variable (predicted $Z_Y$) in multiple prediction.
   Formula in words: Sum, over all predictor variables, of the product of each predictor variable's standardized regression coefficient times the subject's Z score on that predictor variable.
   Formula in symbols: Predicted $Z_Y = (\beta_1)(Z_{X_1}) + (\beta_2)(Z_{X_2}) + (\beta_3)(Z_{X_3}) + \ldots$ (4-8)

X. Predicted raw score for a particular subject on the dependent variable ($\hat{Y}$) in multiple prediction.
   Formula in words: Raw-score regression constant, plus the sum over all predictor variables of the product of each predictor variable's regression coefficient times the subject's score on that predictor variable.
   Formula in symbols: $\hat{Y} = a + (b_1)(X_1) + (b_2)(X_2) + (b_3)(X_3) + \ldots$ (4-9)

How to Construct a Z-Score Prediction Model in Bivariate Regression

I. Compute the correlation coefficient.
II. Set $\beta$ equal to r.
III. The prediction model: Predicted $Z_Y = (\beta)(Z_X)$

How to Make a Z-Score Prediction for a Particular Subject in Bivariate Regression

I. Determine the Z-score prediction model.
II. Find the Z score for that subject's score on the predictor variable: $Z_X = (X - M_X) / SD_X$.
III. Substitute the above Z score into the prediction model and solve.

How to Make a Raw-Score Prediction for a Particular Subject in Bivariate Regression Based on a Z-Score Prediction Rule

I. Determine the Z-score prediction model.
II. Find the Z score for that subject's score on the predictor variable: $Z_X = (X - M_X) / SD_X$.
III. Substitute the above Z score into the prediction model and solve for the predicted Z score on the dependent variable.
IV. Find the raw score corresponding to that subject's predicted Z score on the dependent variable: $\hat{Y} = (SD_Y)(\text{Predicted } Z_Y) + M_Y$

How to Construct a Raw-Score Prediction Model in Bivariate Regression

I. Determine the Z-score prediction model.
II. Compute b: $b = (\beta)(SD_Y / SD_X)$.
III. Compute a: $a = M_Y - (b)(M_X)$.
IV. The prediction model is $\hat{Y} = a + (b)(X)$.

How to Make a Raw-Score Prediction for a Particular Subject in Bivariate Regression Based on a Raw-Score Prediction Rule

I. Determine the raw-score prediction model.
II. Substitute the subject's raw score on the predictor variable into the prediction model and solve.

How to Draw a Regression Line

I. Draw and label the axes for a scatter diagram of the two variables, with the predictor variable on the horizontal axis and the dependent variable on the vertical axis.
II. Pick a low value of the predictor variable, compute the corresponding predicted value on the dependent variable, and mark the point on the graph.
III. Pick a high value on the predictor variable, compute the corresponding predicted value on the dependent variable, and mark the point on the graph.
IV. Draw a line that passes through the two marks.
V. Check the accuracy of your line by finding any third point and being sure it falls on the line.

How to Compute the Proportionate Reduction in Error in Bivariate Regression (Using the Method Involving Computation of Errors)

I. Determine the Z-score or raw-score prediction model.
II. Compute for each dependent variable score the corresponding raw-score predicted value for that score.
III. Determine the sum of squared errors using the bivariate prediction rule.
   A. Find the error for each score: subtract the predicted score from the actual score; that is, error $= Y - \hat{Y}$.
   B. Square each error.
   C. Sum the squared errors; that is, $SS_E = \sum (Y - \hat{Y})^2$.
IV. Determine the sum of squared errors using the mean to predict.
   A. Find the error for each score: subtract the mean from the actual score; that is, error $= Y - M_Y$.
   B. Square each error.
   C. Sum the squared errors; that is, $SS_T = \sum (Y - M_Y)^2$.
V. Find the reduction in error: Subtract the sum of squared error using the bivariate prediction rule from the sum of squared error using the mean to predict; that is, reduction in error $= SS_T - SS_E$.
VI. Find the proportionate reduction in error: Divide the reduction in error by the sum of squared error using the mean to predict; that is, proportionate reduction in error = reduction in error $/ SS_T$, or proportionate reduction in error $= (SS_T - SS_E) / SS_T$.
VII. Cross-check your calculation by squaring the correlation coefficient; $r^2$ should equal the proportionate reduction in error as computed above.

How to Make a Z-Score Prediction for a Particular Subject in Multiple Regression

I. Identify the Z-score prediction model (it must be given to you, since you have not learned how to compute this model in this text).
II. Find the Z score for that subject's score on each predictor variable: For example, for the first predictor variable, $Z_{X_1} = (X_1 - M_{X_1}) / SD_{X_1}$.
III. Substitute the above Z scores into the prediction model and solve.

How to Make a Raw-Score Prediction for a Particular Subject in Multiple Regression

I. Identify the raw-score prediction model (it must be given to you, since you have not learned how to compute this model in this text).
II. Substitute the subject's predictor variable scores into the prediction model and solve.

Outline for Writing Essays on the Logic and Computations for a Bivariate Prediction Problem

The reason for your writing essay questions in the practice problems and tests is that this task develops and then demonstrates what matters so very much: your comprehension of the logic behind the computations. (It is also a place where those better at words than numbers can shine, and for those better at numbers to develop their skills at explaining in words.)

Thus, to do well, be sure to do the following in each essay: (a) give the reasoning behind each step; (b) relate each step to the specifics of the content of the particular study you are analyzing; (c) state the various formulas in nontechnical language, because as you define each term you show you understand it (although once you have defined it in nontechnical language, you can use it from then on in the essay); (d) look back and be absolutely certain that you made it clear just why that formula or procedure was applied and why it is the way it is.

The outlines below are examples of ways to structure your essays. There are other completely correct ways to go about it. And this is an outline for an answer; you are to write the answer out in paragraph form. Examples of full essays are in the answers to Set I Practice Problems in the back of the text.
These essays are necessarily very long for you to write (and for others to grade). But this is the very best way to be sure you understand everything thoroughly. One shortcut you may see on a test is that you may be asked to write your answer for someone who understands statistics up to the point of the new material you are studying. You can choose to take the same shortcut in these practice problems (maybe writing for someone who understands right up to whatever point you yourself start being just a little unclear). But every time you write for a person who has never had statistics at all, you review the logic behind the entire course. You engrain it in your mind. Over and over. The time is never wasted. It is an excellent way to study.

I. Construct a scatter diagram and compute the correlation: Carry out and explain procedures as described in the Outline for Writing Essays in Chapter 3. (This also includes explaining mean, standard deviation, and Z scores.)
II. Identify the predictor and dependent variable in your particular problem.
III. Determine the bivariate prediction model with Z scores.
   A. Procedure: Set up formula, Predicted $Z_Y = (\beta)(Z_X)$, setting $\beta$ equal to r.
   B. Explanation.
      1. The formula describes the optimal principle that allows prediction.
      2. Some insight into why the correlation coefficient represents that key relationship.
         a. If there is no correlation ($r = 0$), the predictor variable is irrelevant and the best predictor is the mean of the dependent variable (which for Z scores is always 0). ($Z_Y = [\beta][Z_X] = [0][Z_X] = 0$.)
         b. If there is a perfect correlation ($r = 1$), $\beta = 1$: The dependent variable's Z score exactly equals the predictor variable's Z score. ($Z_Y = [\beta][Z_X] = [1][Z_X] = Z_X$.)
         c. Thus, in the intermediate cases, when r is between 0 and 1, the best number for beta is also between 0 and 1.
IV. Determine bivariate prediction model using raw scores.
   A. Procedure: Compute a and b and set up the formula $\hat{Y} = a + (b)(X)$, in which $b = (\beta)(SD_Y / SD_X)$ and $a = M_Y - (b)(M_X)$.
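The procedure in IV.A can be tried out numerically. The short Python sketch below is an illustration only, not part of the text: it borrows the five managers' scores (number supervised and stress level) from the SPSS example earlier in this guide as sample data, and the function and variable names are made up for the sketch. It figures standard deviations by dividing by N, which is an assumption of the sketch; the ratio of the two standard deviations, and so b, would come out the same with N - 1.

```python
import math

# Sample data: the five managers from the SPSS example earlier in this guide.
number_supervised = [6, 8, 3, 10, 8]   # predictor variable, X
stress_level = [7, 8, 1, 8, 6]         # dependent variable, Y


def mean(scores):
    return sum(scores) / len(scores)


def sd(scores):
    """Standard deviation figured by dividing by N (an assumption of this sketch)."""
    m = mean(scores)
    return math.sqrt(sum((s - m) ** 2 for s in scores) / len(scores))


def correlation(xs, ys):
    """r as the average of the cross-products of Z scores."""
    mx, my, sdx, sdy = mean(xs), mean(ys), sd(xs), sd(ys)
    return sum(((x - mx) / sdx) * ((y - my) / sdy) for x, y in zip(xs, ys)) / len(xs)


def raw_score_model(xs, ys):
    """Return (a, b) for the rule: predicted Y = a + (b)(X)."""
    beta = correlation(xs, ys)        # beta = r in bivariate prediction
    b = beta * sd(ys) / sd(xs)        # b = (beta)(SD_Y / SD_X)
    a = mean(ys) - b * mean(xs)       # a = M_Y - (b)(M_X)
    return a, b


a, b = raw_score_model(number_supervised, stress_level)
print(f"r = {correlation(number_supervised, stress_level):.2f}")
print(f"Predicted stress = {a:.2f} + ({b:.2f})(number supervised)")
print(f"Prediction for X = 9: {a + b * 9:.2f}")
```

With these five scores the rule works out to roughly Predicted stress = -.75 + (.96)(number supervised), so a manager at the mean of X (7 supervised) is predicted to be at the mean of Y (a stress level of 6).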
   B. Explanation.
      1. One could compute raw-score predictions by converting the predictor variable raw score to a Z score, making the prediction using the Z-score prediction rule, and then converting the predicted Z score to a raw score.
      2. However, by combining the formulas for converting raw scores to Z scores and vice versa into the Z-score prediction formula, you get a set of formulas that use the means and standard deviations and allow you to end up with a formula that makes direct predictions from raw scores to raw scores. (You do not need to explain the specific arithmetic of this, just the principle.)
      3. Give the specific raw-score prediction formula for your data.
V. Draw the regression line.
   A. Procedure.
      1. Draw and label the axes of the graph, as for a scatter diagram.
      2. Pick a low value of the predictor variable, compute (using the prediction rule) the corresponding predicted value on the dependent variable, and mark the point on the graph.
      3. Do the same thing again using a high value of the predictor variable.
      4. Draw a line that passes through the two marks.
      5. Pick an intermediate value of the predictor variable, find the corresponding dot, and check that it falls on the regression line you have drawn.
   B. Explanation.
      1. The regression line is a graphic representation of the prediction model.
      2. The line represents the predicted scores on the dependent variable.
VI. Find the proportionate reduction in error.
   A. Procedure.
      1. Compute for each dependent variable score the corresponding raw-score predicted value for that score (using the prediction model).
      2. Find the sum of squared errors using the prediction model to predict.
         a. Find the error for each score: Error $= Y - \hat{Y}$.
         b. Square each error.
         c. Sum the squared errors: $SS_E = \sum (Y - \hat{Y})^2$.
      3. Find the sum of squared errors using the mean to predict.
         a. Find the error for each score: Error $= Y - M_Y$.
         b. Square each error.
         c. Sum the squared errors: $SS_T = \sum (Y - M_Y)^2$.
      4. Find the proportionate reduction in error: Proportionate reduction in error $= (SS_T - SS_E) / SS_T$.
      5. Cross-check: Proportionate reduction in error $= r^2$.
   B. Explanation.
      1. The accuracy of predictions using a prediction model can be estimated by considering how much error would result using the prediction model to predict the scores used to compute the correlation coefficient in the first place.
      2. Error is the actual score minus the predicted score. Give an example of one of the scores in your data. (Also note that error can be seen graphically as the vertical distance between the dot for a subject's actual score and the regression line.)
      3. We use squared error (in part because positive and negative errors would otherwise offset each other).
      4. Proportionate reduction in error is the most common way to think about the accuracy of a prediction model.
      5. Proportionate reduction in error is a comparison of squared errors indicating how much better predictions are likely to be using a particular model than making predictions without the model.
      6. Predicting without the model is the same as using the mean to predict.
      7. Thus, we compare squared error using the prediction rule to squared error using the mean to predict.
      8. The proportionate reduction is the reduction from using the prediction model (squared error using the mean to predict minus squared error using the prediction rule) as a proportion of (that is, divided by) the squared error without the model (the squared error using the mean to predict). (Give all the figures from the data in your sample as you explain this.)
      9. This gives a proportion between 0% and 100%. (Below is some insight as to why.)
         a. If the prediction model is no improvement, error using the two methods (the prediction rule and predicting from the mean) is equal; thus there is a 0% reduction and a 0% proportionate reduction.
         b. If the prediction model predicts perfectly, there is no error using the prediction rule; thus the reduction in error (error using the mean minus error using the prediction model) is the same as the error using the mean to predict. When divided by the error using the mean to predict, this gives a 100% reduction in error.
         c. In your data (probably), the reduction is in between, since there is some improvement but not a perfect prediction.
      10. Also note that the extent to which the regression line is different from horizontal (the line for predicting from the mean) represents the proportionate reduction in error.
      11. It turns out that the proportionate reduction in error is the same as the square of the correlation coefficient. This serves as a check on the computations. (Give the equivalent numbers.)

Chapter Self-Tests

Multiple-Choice Questions

1. In the equation, predicted $Z_Y = (\beta)(Z_X)$, the symbol $Z_X$ stands for the
   a. known Z score of the predictor variable.
   b. standardized regression coefficient.
   c. predicted value of the Z score for the dependent variable.
   d. regression constant.

2. Suppose that there is a .52 correlation between performance on the midterm exam and performance on the final exam. If a person's midterm exam score is 3 standard deviations above the mean (Z score = +3), then what is the person's predicted Z score on the final?
   a. 3/.52 = 5.8.
   b. .52 + 3 = 3.52.
   c. .52/3 = .17.
   d. (.52)(3) = 1.56.

3. Suppose that every time a person's score goes down 2 points on a depression scale it is associated with a decrease of 3 points on predicted amount of insomnia. In this example the 3 is the
   a. proportionate reduction in error.
   b. regression constant.
   c. raw-score regression coefficient.
   d. standardized regression constant.

4. On a scatter diagram, a horizontal line whose height is the mean of the dependent variable represents
   a. the regression line when r = 1.
   b. the squared error of estimate line.
   c. predictions using the mean as the predictor.
   d. the regression line when r = -1.

5-6 A study is done of three students during finals week, which involves predicting tension from number of finals to be taken. The subjects' scores on the tension questionnaire were 12, 8, and 4. Their respective predicted scores, using a bivariate prediction rule from the data, were 10, 8, and 6.

5. Which of the following is the correct computation for $SS_T$?

6. Which of the following is the correct computation for $SS_E$?

7. What does the proportionate reduction in error tell you?
   a. The reduction in error when predicting from the mean versus when predicting from the raw scores.
   b. The amount of error when predicting from the mean.
   c. The amount of error when predicting from the raw scores.
   d. How much of an advantage it is to use the prediction model to make a prediction over predicting from the mean.

8. The proportionate reduction in error equals
   a. the correlation coefficient squared.
   b. the regression coefficient.
   c. the regression constant.
   d. the raw-score regression coefficient squared.

9. What is the raw-score formula for multiple regression with two predictor variables?
   a. Predicted $Y = a + (b_1)(X_1) + (b_2)(X_2)$.
   b. Predicted $Y = a^2 + [(b_1)(X_1) + (b_2)(X_2)]^2$.
   c. Predicted $Y = a - [(b_1 + X_1) + (b_2 + X_2)]$.
   d. Predicted $Y = a + (b_1 + b_2)/(X_1 + X_2)$.

10. How are prediction models described in research articles?
   a. Prediction models, both bivariate and multiple regression, are rarely cited in research articles; in most cases, only the simple correlations are reported.
   b. Both bivariate and multiple regression prediction models are often reported in research articles, usually in the form of a table.
   c. Bivariate prediction models are rarely cited; however, multiple regression models are commonly reported using a table that includes the betas, the regression constant, and $R^2$, as well as other statistics.
   d. Bivariate prediction models are often reported in research articles, using a table that includes the betas, regression constant, and $R^2$, as well as other statistics, whereas multiple regression models are rarely cited.

Fill-In Questions

1. Bivariate prediction is also called bivariate ________.
2. In the bivariate prediction model for your data, a = 15 and b = 3. For an individual whose score on X is 8, the predicted score for Y is ________.
3. When making a prediction using the raw-score regression equation, each subject's predicted score is the raw-score regression coefficient times his or her score on the predictor variable, plus the ________ (do not give a symbol).
4. The ________ of the regression line corresponds to the raw-score regression coefficient.
5. ________ is the person's actual score minus the person's predicted score.
6. If $SS_T = 80$ and $SS_E = 60$, $r^2 =$ ________.
7. The proportionate reduction in error is also called ________. (Do not give a symbol.)
8. ________ describes the situation when there are correlations among the predictor variables in multiple regression.

9-10 In a multiple prediction situation with two predictor variables, it has been found that $\beta_1 = .3$, $\beta_2 = -.4$, $a = -16.2$, $b_1 = 2.5$, and $b_2 = -16$.

9. A particular subject's Z score on the first predictor variable is -1.5, and this subject's Z score on the second predictor variable is .8. This subject's predicted Z score on the dependent variable is ________.
10. A particular subject's raw score on the first predictor variable is 14, and this subject's raw score on the second predictor variable is 2. This subject's predicted raw score on the dependent variable is ________.

Problems and Essays

1. A researcher is interested in predicting how well people who participate in an outdoor training program will do (ratings by instructor at end of program on a 100-point scale from 0 to 100) from the number of hours they exercise each week. In a pilot study of five people, these results were obtained.

   Person    Hours Exercised Per Week    Performance in Training Program
   Tested        X         Z_X                 Y         Z_Y
     1           3        -.59                45        -.54
     2          11        1.76                90        1.47
     3           6         .29                53        -.18
     4           1       -1.17                25       -1.43
     5           4        -.29                72         .67

The correlation was found to be .87. Based on the above information, (a) give the Z-score prediction formula for predicting performance in the training program based on number of hours of exercise per week; (b) compute the proportionate reduction in squared error based on the scores in this sample (do this based on making actual predictions for each score--show your work for the entire process); (c) draw a diagram showing the regression line; (d) using the bivariate prediction rule, predict the performance in the training program for a person who has exercised 8 hours per week; and (e) explain what you have done to a person who has never taken a course in statistics.

2. A psychologist is interested in predicting number of days to recover from a particular illness, based on number of stressful events the person experienced in the preceding month. The scores for four people studied are shown here (all data are fictional):

   Person    Number of Stressful Events    Number of Days to Recover
   Tested        X         Z_X                 Y         Z_Y
     A           2        -.395                5        -.63
     B           0       -1.17                 4       -1.26
     C           3         0                   7         .63
     D           7        1.57                 8        1.26

The correlation was found to be .93. Based on the above information, (a) give the Z-score prediction formula for predicting number of days to recover from number of stressful events; (b) compute the proportionate reduction in squared error based on the scores in this sample (do this based on making actual predictions for each score--show your work for the entire process); (c) draw a diagram showing the regression line; (d) using the bivariate prediction rule, predict the number of days to recover for a person who has experienced 5 stressful events in the previous month; and (e) explain what you have done to a person who has never taken a course in statistics.

3. Based on insurance company statistics, a health psychologist computed the following multiple regression equation for predicting how long a person can expect to live based on a study of women in a particular industrial nation (fictional data):

   Predicted Years = 75 - (.1)(pounds overweight) - (4)(packs of cigarettes smoked per day) + (.9)(hours exercised per week) + (3)(number of grandparents who lived past 80)

(a) Compute the predicted life expectancy for a woman from this nation who is 15 pounds overweight, does not smoke, exercises 2 hours a week, and has one grandparent who lived past 80. (b) Compute the predicted life expectancy for another woman from this nation who is not at all overweight, smokes two packs of cigarettes a day, exercises 1 hour a week, and has no grandparents who lived past 80. (c) Explain what you have done (including why the predictions for the two women are different) to a person who has never had a course in statistics.

Chapter 5
Some Ingredients for Inferential Statistics: The Normal Curve, Probability, and Population versus Sample

Learning Objectives

To understand, including being able to conduct any necessary computations:
   The shape of the normal distribution.
   The 50%-34%-14% approximations for normal curve areas.
   The normal curve table.
   Converting Z and raw scores to percentages of cases in a normal distribution.
   Converting percentages of cases in a normal distribution to Z and raw scores.
   Long-run relative-frequency interpretation of probability.
   Subjective interpretation of probability.
   Calculating probabilities.
   The normal curve as a probability distribution.
   Sample and population.
   Random and nonrandom methods of sampling.
   Statistical terminology and symbols regarding samples and populations.

Chapter Outline

I. The Normal Distribution
   A. The distributions of many variables that psychologists measure follow a unimodal, roughly symmetrical, bell-shaped distribution.
   B. These bell-shaped histograms or frequency polygons approximate a precise and important mathematical distribution called the normal distribution, or more simply, the normal curve.
   C. Why the normal curve is so common in nature.
      1. The score on any particular variable can be thought of as influenced by a large number of essentially random factors, which on the average balance out to a middle value, with equal but decreasing numbers of cases balancing out above and below the middle value.
      2. This produces a unimodal, symmetrical distribution.
      3. It can be shown mathematically that in the long run, if the influences are truly random, a precise normal curve will result.
   D. Because this shape is standard, there is a known percentage of cases below or above any particular point.
   E. Approximate percentages of cases in a normal curve for major demarcations.
      1. Because the distribution is symmetrical, exactly 50% of the cases fall below the mean and exactly 50% above the mean.
      2. Approximately 34% of the cases fall between the mean and one standard deviation from the mean in each direction.
      3. Approximately 14% of the cases fall between one and two standard deviations from the mean in each direction.
      4. These 50%, 34%, and 14% approximations are useful practical rules.
      5. Knowing these percentages, if you know a distribution on a variable is normal, you can determine what percentage of cases fall between, above, or below various whole number Z scores.
      6. Knowing these percentages also permits you to approximate a person's number of standard deviations from the mean from their percentage in relation to other people in their distribution.
   F. The normal curve table and Z scores.
      1. Because the normal curve is exactly defined, it is also possible to compute the exact percentage of cases between any two Z scores.
      2. Statisticians have created helpful tables of the normal curve that give the percentage of cases between the mean (a Z score of 0) and any other Z score.
         (Table B-1 in the text is a normal curve table.)
      3. This table can be used to compute the percentage of cases from Z scores or from raw scores (by converting them to Z scores).
      4. This table can be used to compute a Z score or raw score (by converting from a Z score) from percentage of cases.

II. Probability
   A. Scientific research does not permit determining the truth or falsity of theories or applied procedures.
   B. Inferential statistics can be applied to results of research to permit probabilistic conclusions about theories or applied procedures.
   C. Probability is a large and controversial topic, but there are only a few key ideas you need to know to understand basic inferential statistical procedures.
   D. The long-run relative-frequency interpretation of probability. Probability is the long-run, expected relative frequency of a particular outcome.
      1. An outcome is the result of an experiment (or virtually any event, such as a coin coming up heads or it raining tomorrow).
      2. Frequency means how many times something occurs.
      3. Relative frequency means the number of times something occurs relative to the number of times it could have occurred.
      4. Long-run relative frequency is what you would expect to get, in the long run, if you were to repeat the experiment many times.
   E. Subjective interpretation of probability: How certain one is that a particular thing will happen.
   F. Calculating a probability: The number of possible successful outcomes divided by the number of all possible outcomes.
   G. The range of probabilities.
      1. Something that has no chance of happening has a probability of 0.
      2. Something that is certain to happen has a probability of 1.
   H. Probability is usually symbolized by the letter p and expressed as equaling, greater than, or less than some fraction or percentage.
   I. The normal distribution can also be thought of as a probability distribution: The proportion of cases between any two Z scores is the same as the probability of selecting a case between those two Z scores.

III. Sample and Population
   A. A population is the entire set of things of interest.
   B. A sample is the subset of the population about which you actually have information.
   C. Why samples are studied (instead of populations).
      1. Usually it is not practical to study entire populations.
      2. The goal of science is to make generalizations or predictions about events beyond our reach, such as the behavior of entire populations.
   D. The general strategy of psychology research is to study a sample of individuals who are believed to be representative of the general population (or of some particular population of interest).
   E. At the minimum, researchers try to study people who at least do not differ from the general population in any systematic way which would be expected to matter for that topic of research.
   F. There are several methods of sampling.
      1. The ideal method of sampling is random selection: The researcher obtains a complete list of all the members of a population and randomly selects some number of them to study.
      2. Haphazard selection is quite different from true random selection and is likely to produce a sample that is a biased subset of the population as a whole.
      3. In psychology research it is rarely possible to employ true random sampling, but researchers try to study a sample that is not systematically unrepresentative of the population in any known way.
   G. Statistical terminology for samples and populations.
      1. The mean, variance, and standard deviation of a population are called population parameters.
         a. A population parameter is usually an unknown which is, at best, estimated from sample information.
         b. Population parameters are symbolized by Greek letters.
            i. Population mean = μ.
            ii. Population standard deviation = σ.
      2. The mean, variance, and standard deviation you calculate to describe a sample are sample statistics.
         a. A sample statistic is computed from known information.
         b. Generally, the symbols for sample statistics (which is what we have used so far) are ordinary letters.
            i. Sample mean = M.
            ii. Sample standard deviation = SD.

IV. Relation of Normal Curve, Probability, and Sample versus Population
   A. In most research situations the population parameters are unknown.
   B. However, the population distribution is often assumed to be approximately normal.
   C. Thus, researchers collect information from a sample in order to make probabilistic inferences about the parameters of a normally distributed population.
   D. This is illustrated in the logic of a two-group experiment.
      1. The experimental group is a sample intended to represent a population exposed to the experimental manipulation.
      2. The control group is a sample intended to represent a population not exposed to the experimental manipulation.
      3. The populations these samples represent may not even really exist (as no one other than the experimental subjects may have ever been exposed to the experimental manipulation).
      4. If they did exist it is presumed that the population would be normally distributed.
      5. The question of interest is a question about probability.
         a. Suppose in fact the means of the two populations (population parameters) are actually the same, and thus the experimental manipulation makes no difference.
         b. What then is the probability that the means of our two samples (sample statistics) could be as different as they actually are?

V. Controversies and Limitations
   A. Is the normal curve really so common?
      1. It is widely assumed that it is, and this assumption plays an important role in carrying out statistical analyses of research in psychology.
      2. One recent study indicates that the measures most commonly used in psychology often do not yield scores that are normally distributed.
      3. Still more recent research, however, suggests that the kinds of variations from normal that have been found may not create serious problems for the typical applications of statistical methods in psychology.
   B. Bayesian methods.
      1. These are based on the subjective interpretation of probability.
      2. Bayesians hold that science is about conducting research in order to adjust our pre-existing beliefs in light of evidence we collect.
      3. Bayesian methods are based on this principle.
      4. Critics of Bayesian methods argue that conclusions drawn from each study would depend too heavily on the subjective belief of the particular scientist conducting the study.
   C. The appropriateness of drawing conclusions from nonrandom samples.
      1. Samples used in psychology experiments are usually nonrandom (whoever is available) and small.
      2. Psychologists are fairly comfortable with this situation because they are mainly interested in the pattern of relationships among variables, which are thought to be fairly constant even if mean levels of variables vary greatly across populations.

VI. Normal Curves, Probabilities, Samples, and Populations as Described in Research Articles
   A. These topics, which are important fundamentals, are rarely found discussed explicitly in research articles (except articles about methods or statistics).
   B. The normal curve is sometimes mentioned when describing the distribution of scores on a particular variable.
   C. Probability is rarely discussed directly, except in the context of statistical significance (see Chapter 6 and beyond).
   D. The method of selecting the sample from the population is sometimes described, particularly if the study is a survey.

How to Determine the Percentage of Cases Above or Below a Particular Score Using a Normal Curve Table

I. If it is a raw score, convert it to a Z score: Z = (X - M)/SD.
II. Look up the Z score in the Z column and find the percentage in the adjacent % Mean to Z column.
III. For a percentage of cases above a particular Z score.
   A. If the Z score is positive, subtract this percentage from 50% (that is, 50% minus this percentage).
   B. If the Z score is negative, add 50% to it.
IV. For a percentage of cases below a particular Z score.
   A. If the Z score is positive, add 50% to this percentage.
   B. If the Z score is negative, subtract this percentage from 50% (that is, 50% minus this percentage).

How to Determine a Score from Knowing the Percentages of Cases Above or Below that Score Using a Normal Curve Table

I. For a situation in which there is a particular percentage of cases higher than the score.
   A. If the percentage is less than 50%.
      1. Subtract the percentage from 50% (that is, 50% minus the percentage).
      2. Look up the closest percentage to this difference in the % Mean to Z column and find the Z score in the adjacent Z column.
   B. If the percentage is greater than 50%.
      1. Subtract 50% from the percentage (that is, the percentage minus 50%).
      2. Look up the closest percentage to this difference in the % Mean to Z column and find the Z score in the adjacent Z column.
      3. Make this Z score negative (put a minus sign in front of it).
II. For a situation in which there is a particular percentage of cases lower than the score.
   A. If the percentage is less than 50%.
      1. Subtract the percentage from 50% (that is, 50% minus the percentage).
      2. Look up the closest percentage to this difference in the % Mean to Z column and find the Z score in the adjacent Z column.
      3. Make this Z score negative (put a minus sign in front of it).
   B. If the percentage is more than 50%.
      1. Subtract 50% from the percentage (that is, the percentage minus 50%).
      2. Look up the closest percentage to this difference in the % Mean to Z column and find the Z score in the adjacent Z column.
III. Convert the Z score to a raw score: X = (Z)(SD) + M.
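
The two table procedures above can also be sketched in code. What follows is a minimal Python sketch, not the text's own method: it assumes Python 3.8 or later and uses the standard library's statistics.NormalDist in place of a printed normal curve table, so it computes exact areas rather than the "closest percentage" in a table. The function names (pct_mean_to_z, pct_above, pct_below, score_from_pct_above, score_from_pct_below) and the example mean of 100 and standard deviation of 15 are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

# Standard normal curve (mean 0, SD 1); stands in for the printed table.
STD_NORMAL = NormalDist()


def pct_mean_to_z(z):
    """Percentage of cases between the mean and z (the '% Mean to Z' column)."""
    return abs(STD_NORMAL.cdf(z) - 0.5) * 100


def pct_above(x, m, sd):
    """Percentage of cases above raw score x (first procedure, step III)."""
    z = (x - m) / sd                      # step I: Z = (X - M)/SD
    table_pct = pct_mean_to_z(z)          # step II: table lookup
    return 50 - table_pct if z > 0 else 50 + table_pct


def pct_below(x, m, sd):
    """Percentage of cases below raw score x (first procedure, step IV)."""
    z = (x - m) / sd
    table_pct = pct_mean_to_z(z)
    return 50 + table_pct if z > 0 else 50 - table_pct


def _z_for_area_from_mean(pct_from_mean):
    """Positive Z score with pct_from_mean percent of cases between it and the mean."""
    return STD_NORMAL.inv_cdf(0.5 + pct_from_mean / 100)


def score_from_pct_above(pct, m, sd):
    """Raw score with pct percent of cases higher than it (0 < pct < 100)."""
    z = _z_for_area_from_mean(abs(50 - pct))
    if pct > 50:                          # more than half above: score is below the mean
        z = -z
    return z * sd + m                     # step III: X = (Z)(SD) + M


def score_from_pct_below(pct, m, sd):
    """Raw score with pct percent of cases lower than it (0 < pct < 100)."""
    z = _z_for_area_from_mean(abs(pct - 50))
    if pct < 50:                          # less than half below: score is below the mean
        z = -z
    return z * sd + m


# Hypothetical example: a test with mean 100 and standard deviation 15.
print(round(pct_above(115, 100, 15), 2))           # percentage scoring above 115
print(round(score_from_pct_above(5, 100, 15), 1))  # score needed to be in the top 5%
```

Because the sketch computes areas exactly, its results can differ slightly from answers worked with a printed table, where the closest tabled percentage is used.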

Outline for Writing Essays on the Logic and Computations for Determining a Percentage of Cases in a Normal Distribution from a Score or Vice Versa

The reason for your writing essay questions in the practice problems and tests is that this task develops and then demonstrates what matters so very much--your comprehension of the logic behind the computations. (It is also a place where those better at words than numbers can shine, and for those better at numbers to develop their skills at explaining in words.) Thus, to do well, be sure to do the following in each essay: (a) give the reasoning behind each step; (b) relate each step to the specifics of the content of the particular study you are analyzing; (c) state the various formulas in nontechnical language, because as you define each term you show you understand it (although once you have defined it in nontechnical language, you can use it from then on in the essay); (d) look back and be absolutely certain that you made it clear just why that formula or procedure was applied and why it is the way it is.

The outlines below are examples of ways to structure your essays. There are other completely correct ways to go about it. And this is an outline for an answer--you are to write the answer out in paragraph form. Examples of full essays are in the answers to Set I Practice Problems in the back of the text.

These essays are necessarily very long for you to write (and for others to grade). But this is the very best way to be sure you understand everything thoroughly. One shortcut you may see on a test is that you may be asked to write your answer for someone who understands statistics up to the point of the new material you are studying. You can choose to take the same shortcut in these practice problems (maybe writing for someone who understands right up to whatever point you yourself start being just a little unclear).
But every time you write for a person who has never had statistics at all, you review the logic behind the entire course. You engrain it in your mind. Over and over. The time is never wasted. It is an excellent way to study.

Determining a Percentage from a Score

I. If a raw score, convert it to a Z score (see outlines for essays on Z scores, mean, and standard deviation from Chapter 2 of this Study Guide).
II. Find the percentage of cases between this Z score and the mean.
   A. Procedure: Look up the Z score in the Z column and find the percentage in the adjacent % Mean to Z column.
   B. Explanation.
      1. The normal distribution.
         a. It is bell-shaped.
         b. It is a highly common distribution in psychology.
         c. This shape is so common because a score on anything is usually a kind of average of a number of influences which have random variation; the chance of getting many of these random influences all in the same extreme direction is low, making most cases (each case representing an average of random influences) fall near the middle.
         d. It can be shown mathematically that in the long run, if the influences are truly random, a precise normal curve will result.
      2. The normal curve table.
         a. Because this shape is mathematically defined, there is a known percentage of cases below or above any particular point.
         b. Statisticians have created tables which give the percentage of cases between the mean and any particular number of standard deviations (including fractions of standard deviations) from the mean--that is, between the mean and any positive Z score.
      3. Describe what the percentage is for your particular Z score.
III. To determine the percentage above your score.
   A. If the Z score is positive.
      1. Procedure: Subtract this percentage from 50%.
      2. Explanation.
         a. Since the Z score is positive (that is, it is above the mean), the amount above this score is the 50% above the mean less what is between this score and the mean (the percentage from the table).
         b. Give your actual difference.
         c. Illustrate with a picture of a normal curve showing your situation.
   B. If the Z score is negative.
      1. Procedure: Add 50% to this percentage.
      2. Explanation.
         a. Since the Z score is negative (that is, it is below the mean), the total above this score is what is between it and the mean (the percentage from the table) plus the 50% above the mean.
         b. Give your actual sum.
         c. Illustrate with a picture of a normal curve showing your situation.
IV. To determine the percentage below your score.
   A. If the Z score is positive.
      1. Procedure: Add 50% to this percentage.
      2. Explanation.
         a. Since the Z score is positive (that is, it is above the mean), the total below this score is the amount between it and the mean (the percentage from the table) plus the 50% below the mean.
         b. Give your actual total.
         c. Illustrate with a picture of a normal curve showing your situation.
   B. If the Z score is negative.
      1. Procedure: Subtract this percentage from 50%.
      2. Explanation.
         a. Since the Z score is negative (that is, it is below the mean), the amount below this score is the 50% below the mean less the percentage between the mean and this score (the amount computed earlier).
         b. Give your actual difference.
         c. Illustrate with a picture of a normal curve showing your situation.

Determining a Score from a Percentage

I. Explain the logic of the normal curve and a normal curve table.
   A. The normal distribution.
      1. It is bell-shaped.
      2. It is a highly common distribution in psychology.
      3. This shape is so common because a score on anything is usually a kind of average of a number of influences which have random variation; the chance of getting many of these random influences all in the same extreme direction is low, making most cases (each case representing an average of random influences) fall near the middle.
      4. It can be shown mathematically that in the long run, if the influences are truly random, a precise normal curve will result.
   B. Explain the logic of Z scores (see outlines for essays on Z scores, mean, and standard deviation from Chapter 2 of this Study Guide).
   C. The normal curve table.
      1. Because this shape is mathematically defined, there is a known percentage of cases below or above any particular point.
      2. Statisticians have created tables which give the percentage of cases between the mean and any particular number of standard deviations (including fractions of standard deviations) from the mean--that is, between the mean and any positive Z score.
II. For a situation in which there is a particular percentage of cases higher than the score.
   A. If the percentage is less than 50%.
      1. Procedure.
         a. Subtract the percentage from 50% (that is, 50% minus the percentage).
         b. Look up the closest percentage to this difference in the % Mean to Z column and find the Z score in the adjacent Z column.
      2. Explanation.
         a. The percentage between the mean and the score (the percentage that can be looked up on the table to find the corresponding Z score) is what remains between the percentage above the score and 50% (the total above the mean).
         b. Give your figures.
         c. Illustrate with a picture of a normal curve showing your situation.
   B. If the percentage is greater than 50%.
      1. Procedure.
         a. Subtract 50% from the percentage (that is, the percentage minus 50%).
         b. Look up the closest percentage to this difference in the % Mean to Z column and find the Z score in the adjacent Z column.
         c. Make this Z score negative (put a minus sign in front of it).
      2. Explanation.
         a. The percentage between the mean and the score (the percentage that can be looked up on the table to find the corresponding Z score) is what remains after subtracting out the 50% above the mean.
         b. Since there are more than 50% above this score, the score must be below the mean, and hence a negative Z score.
         c. Give your figures.
         d. Illustrate with a picture of a normal curve showing your situation.
III. For a situation in which there is a particular percentage of cases lower than the score.
   A. If the percentage is less than 50%.
      1. Procedure.
         a. Subtract the percentage from 50%.
         b. Look up the closest percentage to this difference in the % Mean to Z column and find the Z score in the adjacent Z column.
         c. Make this Z score negative (put a minus sign in front of it).
      2. Explanation.
         a. The percentage between the mean and the score (the percentage that can be looked up on the table to find the corresponding Z score) is what remains after subtracting it from the total of 50% below the mean.
         b. Since there are less than 50% below this score, the score must be below the mean and hence a negative Z score.
         c. Give your figures.
         d. Illustrate with a picture of a normal curve showing your situation.
   B. If the percentage is more than 50%.
      1. Procedure.
         a. Subtract 50% from the percentage (that is, the percentage minus 50%).
         b. Look up the closest percentage to this difference in the % Mean to Z column and find the Z score in the adjacent Z column.
      2. Explanation.
         a. The percentage between the mean and the score (the percentage that can be looked up on the table to find the corresponding Z score) is what remains above the mean after subtracting out the 50% below the mean.
         b. Give your figures.
         c. Illustrate with a picture of a normal curve showing your situation.
IV. Convert the Z score to a raw score.
   A. Procedure: X = (Z)(SD) + M.
   B. Explanation: See outline for essay on converting a Z score to a raw score in Chapter 2 of this Study Guide.

Chapter Self-Tests

Multiple-Choice Questions

1. A normal curve is
   a. bimodal and slightly skewed to the right.
   b. unimodal and symmetrical.
   c. unimodal and slightly skewed to the left.
   d. bimodal and roughly symmetrical.

2. The mean score on a depression scale is 10 and the standard deviation is 3. The distribution is normal. Using the approximation rules for normal curves, how many people would get a score between 10 and 16?
   a. 50%.
   b. 34%.
   c. 34% + 14% = 48%.
   d. 34% + 34% = 68%.

3. A person received a test score that was in the top 30% of all the cases. Using the normal curve table, what was this person's Z score?
   a. .52.
   b. .84.
   c. 5.03.
   d. 5.34.

4. A person has a creativity score of 6.8, which equals a Z score of +2.4. What is the percentage of cases above this score?
   a. 49.18% - 50% = -.82%.
   b. 100% - 49.18% = 50.82%.
   c. 50% + 49.18% = 99.18%.
   d. 50% - 49.18% = .82%.

5. What is the Z score a person would have to receive to be in the top 2% of their class?

6. Approximately what Z score would a person have to have to be in the bottom 16% of their group?
   a. -2.
   b. -1.
   c. 0.
   d. +1.

7. How do you calculate probability?
   a. The number of all possible outcomes divided by the number of possible successful outcomes.
   b. The number of all possible outcomes multiplied by the number of possible successful outcomes.
   c. The number of possible successful outcomes divided by the number of possible outcomes.
   d. The number of all possible outcomes minus the number of possible successful outcomes.

8. What is the difference between random selection and haphazard selection?
   a. There is no difference--they mean the same thing.
   b. In random selection you take whoever is available, whereas in haphazard selection you choose people in a way that systematically avoids a regular rotation.
   c. In random selection you obtain a complete list of all members of a population and randomly select some of them, whereas in a haphazard selection you select whoever is available.
   d. In random selection you just choose who happens to be at the top of the list of the members in the population, whereas in haphazard selection you take volunteers.

9. Which of the following statements is most accurate about population parameters?
   a. They are essential to making any statements about probability.
   b. They are rarely known.
   c. They are usually smaller than the sample parameters.
   d. They are essential and cannot be estimated from the sample information.

10. What is the "Bayesian" approach?
   a. It says that science is about conducting research in order to adjust our pre-existing beliefs in light of new evidence we collect.
   b. It says that it is better not to make any assumptions about prior beliefs; instead one should just look at the evidence as it is.
   c. It says that one should avoid using theory when conducting scientific research.
   d. It emphasizes that studies should not be conducted by people who have a biased opinion.

Fill-In Questions

1. In a normal curve approximately ____ percent of the cases fall between the mean and one standard deviation above the mean.
2. In a normal curve approximately 14% of the cases fall between a Z score of ____ and a Z score of -2.
3. The percentage of cases between any two Z scores on a normal curve can be exactly determined based on a formula for the normal curve or by using a(n) ____.
4. A person says that the chances he will get a new job are about 80%--meaning that on a scale of 0% to 100% this feels like how likely it is. This is an example of the ____ interpretation of probability.
5. The normal curve can be thought of as a frequency distribution or as a ____ distribution.
6. To pick 15 people for a study of faculty opinion at a particular college, a student gives a questionnaire to each faculty member who arrives at the faculty club on a particular evening. This is an example of ____ selection.
7. A characteristic of a population, such as its mean, is called a(n) ____.
8. A characteristic of a sample, such as its standard deviation, is called a(n) ____.
10. ____ is the branch of statistics which draws conclusions about populations based on information in samples.

Problems and Essays

1. Suppose a test of musical ability has a normal distribution with a mean of 50 and a standard deviation of 5. Approximately what percentage of people are (a) above 55, (b) below 40, and (c) above 45? (Use the normal curve approximation rules.) (d) Explain your answers to a person who has never had a course in statistics.

2. Suppose the number of clients seen in any given week by full-time psychotherapists in a particular city is normally distributed with a mean of 15.3 and a standard deviation of 2.8. (a) A therapist who is in the top 5% of number of clients seen would be seeing at least how many? (b) What is the most a therapist could be seeing and still be in the bottom 10%? (c) Explain your answers to a person who has never had a course in statistics.

3. Thirty students in a particular elementary school classroom include 8 girls who are of an ethnic minority, 4 boys of an ethnic minority, 7 girls who are not of an ethnic minority, and 11 boys not of an ethnic minority. A student is selected at random to represent the class. (a) What is the probability that student will be a girl? (b) What is the probability the student will be of an ethnic minority?

4. A health psychologist plans to conduct a mail survey of a sample of doctors in a particular U.S. state to ask about their attitudes toward drug advertisements. What would be the best way to go about selecting the sample of doctors to study? Explain what you would do and why to a person who is unfamiliar with research methods or statistics.

Note. There is no section on using SPSS or MYSTAT with this chapter because none of the procedures covered are easily implemented using standard computerized statistical packages.

Chapter 6
Introduction to Hypothesis Testing

Learning Objectives

To understand, including being able to carry out any necessary procedures or computations:
   The core logic of hypothesis testing.
   The populations involved in hypothesis testing (Populations 1 and 2).
   The research hypothesis.
   The null hypothesis.
   The comparison distribution.
   Determining the cutoff score on the comparison distribution.
   Significance and conventional levels of significance.
   When to reject the null hypothesis and the implications of this decision.
   When not to reject the null hypothesis and the implications of this decision.
   The steps of hypothesis testing and their relation to the core logic.
   Directional and nondirectional hypotheses.
   One-tailed versus two-tailed tests.
   How results of studies using hypothesis-testing procedures are reported in research articles.

Chapter Outline

I. The Core Logic of Hypothesis Testing
   A. The key idea is that we test (and hopefully reject) the notion that the experimental manipulation made no difference. If the notion of no difference can be rejected, this supports the idea it does make a difference.
   B. Put another way, we draw conclusions by evaluating the probability of getting our research results if the opposite of what we are predicting were true.
   C. This double-negative, roundabout kind of logic is awkward, but necessary.

II. The Steps of Hypothesis Testing (as applied to the situation in which a single individual is exposed to the experimental manipulation and compared to a known population distribution of people not so exposed)
   A. Reframe the question into a research hypothesis and a null hypothesis about populations. (Step 1.)
      1. Research is conducted using samples to test hypotheses about populations.
      2. One population (Population 1) is the people who are exposed to the experimental manipulation.
         a. This population does not usually exist (except for the one person in the sample being studied), but is a logical construction of the group to whom the results of the experiment might be applied.
         b. The characteristics of the distribution of this population (μ, σ, shape) are unknown.
      3. The other population (Population 2) is the people who have not been exposed to the experimental manipulation.
         a. This is a real population (usually the population at large of people of the same category as those in the sample, but these people were not exposed to the experimental manipulation).
         b. The characteristics of the distribution of this population are known (for example, from previous research).
      4. The research hypothesis is a statement about a predicted difference between populations.
         a. Typically the difference predicted is that the mean of one population is different (or more specifically, higher or lower) than the mean of the other.
         b. The prediction is usually based on theory or practical experience.
      5. The null hypothesis is a statement about a relation between populations which represents the crucial opposite of the research hypothesis.
         a. It usually is a prediction of no difference (or that if there is a difference it is in the direction opposite to what is predicted).
         b. It has this name because it predicts a "null" or nondifference.
      6. The research hypothesis and the null hypothesis are opposites and mutually exclusive.
         a. It is this oppositeness which is at the heart of the hypothesis-testing process.
         b. Because the null hypothesis is so important to the logic of the process, the research hypothesis is often called the "alternative hypothesis."
   B. Determine the characteristics of the comparison distribution. (Step 2.)
      1. In terms of hypothesis-testing language, the crucial question is as follows: Given a particular sample value, what is the probability of obtaining it if the null hypothesis is true?
      2. To determine this probability, we need to know the characteristics of the distribution the score would come from if the null hypothesis is true; this is necessary to permit us to determine the likelihood of getting a particular extreme score from this distribution.
      3. This crucial distribution is called the comparison distribution.
      4. If the null hypothesis is true, both populations are the same and the score in Population 1 comes from a distribution with the same characteristics as that of Population 2.
      5. Thus, the comparison distribution (in the situation considered in this chapter) has the characteristics of Population 2 (the known population of individuals not exposed to the experimental manipulation).
   C. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected. (Step 3.)
      1. Before making an observation, researchers consider what kind of observation would be sufficiently extreme to reject the null hypothesis.
      2. Researchers do not usually use an actual number of units on the direct scale of measurement of the comparison distribution; instead they state how extreme a score should be in terms of a Z score on this distribution (and the associated probability of getting a Z score that extreme on this distribution).
      3. Very small percentages (probabilities of getting this extreme a score) are taken as the cutoff.
      4. Since the comparison distribution is ordinarily a normal curve, the cutoff is found in a normal curve table.
      5. The cutoff percentage is called the "level of significance."
      6. Conventional levels of significance used by psychology researchers are 5% (the .05 significance level) or 1% (the .01 significance level).
      7. When a sample value is so extreme that the null hypothesis is rejected, the result is said to be statistically significant.
   D. Determine the sample's score on the comparison distribution. (Step 4.)
      1. This is the result of the actual experiment or observation.
      2. The raw-score result is converted to a Z score on the scale of the comparison distribution to make it comparable to the cutoff Z score.
   E. Compare the scores obtained in Steps 3 and 4 to determine whether or not to reject the null hypothesis. (Step 5.)
      1. This step is entirely mechanical.
      2. If the actual sample's Z score (from Step 4) is more extreme than the cutoff Z score (from Step 3), the null hypothesis is rejected.
         a. Thus, the research hypothesis is supported.
         b. However, the research hypothesis is not "proven" or shown to be "true"--no pattern of results can prove a hypothesis based on research data; it can only support or fail to support a particular hypothesis.
    3. If the actual sample's Z score (from Step 4) is not more extreme than the cutoff Z score (from Step 3), the null hypothesis is not rejected.
      a. Thus, the experiment is inconclusive.
      b. However, we do not say the null hypothesis is supported (and certainly not that it is proven or true).
      c. The null hypothesis could be false even though the study does not succeed in rejecting it; for example, we could fail to reject it because the effect was too weak to show up significantly in the study.
III. One-Tailed and Two-Tailed Hypothesis Tests
  A. A directional hypothesis is used when there is a specific predicted direction of effect (such as predicting an increase or predicting a decrease).
    1. The research hypothesis is that Population 1's mean is higher (or lower, if that is the prediction) than Population 2's mean.
    2. The null hypothesis is that Population 1's mean is not higher (or lower, if that is the prediction) than Population 2's mean. (That is, presuming higher was predicted, the null hypothesis is true if Population 1's mean is the same as or lower than Population 2's.)
    3. Significance testing is carried out as follows (for simplicity, points a and b below assume a higher score was predicted and the 5% significance level is used).
      a. To reject the null hypothesis, the obtained score has to be in a region of the comparison distribution that is in its upper 5%.
      b. That is, it has to be in the area on one tail only, and thus this is called a one-tailed test.
  B. A nondirectional hypothesis is used when the researcher predicts one population will be different from the other, without specifying whether they will be different by Population 1 having higher or lower scores.
    1. The research hypothesis is that Population 1's mean is different from Population 2's mean.
    2. The null hypothesis is that Population 1's mean is not different.
    3. Significance testing is carried out as follows (for simplicity, points a through c below assume the 5% significance level is used).
      a. To reject the null hypothesis, the obtained score has to be in a region of the comparison distribution that is in either the upper 2.5% or the lower 2.5%, making a total of 5% of the area in which the null hypothesis could be rejected.
      b. That is, it can be in either tail of the distribution, and thus this is called a two-tailed test.
      c. Using a two-tailed test requires a more extreme cutoff than a one-tailed test for the same situation.
  C. When to use one-tailed versus two-tailed tests.
    1. It is "easier" to reject the null hypothesis with a one-tailed test, easier in the sense that a sample result need not be so extreme before an experimental result is significant.
    2. But there is a price: If the result is extreme in the other direction, no matter how extreme, the null hypothesis cannot be rejected.
    3. In principle you plan to use a one-tailed test when you have a clearly directional hypothesis and a two-tailed test when you have a clearly nondirectional hypothesis.
    4. In practice, the situation is not so simple.
      a. Even when a theory clearly predicts a particular result, we sometimes find that the result is just the opposite of what we expected and that this reverse of what we expected may actually be more interesting.
      b. For this reason, by using one-tailed tests we run the risk of having to ignore possibly important results.
    5. Thus, there is debate as to whether one-tailed tests should be used, even when there is a clearly directional hypothesis.
    6. To be safe, many researchers use two-tailed tests both when there are nondirectional hypotheses and also when there are directional hypotheses.
    7. In most psychology articles, unless the researcher specifically notes that a one-tailed test was used, it is usually assumed that it was a two-tailed test.
    8. In most cases the final conclusion is not really affected by whether a one- or two-tailed test is used (the result is either extreme enough to reject the null hypothesis either way, or not extreme enough to reject the null hypothesis either way).
    9. If a result is so close that it matters which method is used, results should be interpreted cautiously pending further research.
IV. Controversies and Limitations: More on one- versus two-tailed tests.
  A. If a study is exploratory (where there is no clear basis for making a directional prediction), everyone agrees that two-tailed tests should be used.
  B. But when there is a basis for a directional prediction, if a one-tailed test is used there is the risk of getting a strong result in the opposite direction which would then have to be ignored.
  C. A possible compromise procedure is to do a fractional-tailed test, in which the 5% is divided up as 4% in the predicted direction and 1% in the nonpredicted direction.
V. Hypothesis Tests As Reported in Research Articles
  A. Hypothesis tests are usually reported in the context of one of the specific statistical procedures covered in later chapters.
  B. For each result of interest, the article usually gives the following information.
    1. Whether the result was statistically significant.
    2. The name of the specific technique used in determining the probabilities (these techniques are covered in later chapters).
    3. An indication of the significance level, such as "p < .05" or "p < .01."
      a. "p < .05" means that the probability of these results if the null hypothesis were true is less than .05 (5%).
      b. If a result is close but did not reach the significance level chosen, it may be reported anyway as a "near significant trend," with "p < .10," for example.
      c. If the result is not significant, sometimes the actual p level will be given (for example, "p = .27"), or the abbreviation "ns," for "not significant," will be used.
    4. If a one-tailed test was used, that will usually also be noted. (Otherwise assume a two-tailed test was used.)
  C. Sometimes the results of hypothesis testing are shown simply as starred results in a table; a result with a star has attained significance (at a level given in a footnote) and one without has not.
  D. The steps of hypothesis testing, or which is the null and which is the research hypothesis, are rarely made explicit.

How to Test an Hypothesis
(When the sample consists of one individual and the distribution of the population not exposed to the experimental manipulation is known.)

I. Reframe the question into a research hypothesis and a null hypothesis about populations. (Step 1.)
  A. Identify the two populations.
    1. Population 1 is people like those studied who have been exposed to the experimental manipulation.
    2. Population 2 is people like those studied but who have not been exposed to the experimental manipulation (this is usually people of the category studied from the general public).
  B. State the research hypothesis.
    1. Decide whether this will be directional or nondirectional.
    2. State it in terms of the two populations (that the mean of one will be higher than, lower than, or different from the mean of the other).
  C. State the null hypothesis: Populations 1 and 2 are the same; the manipulation lacks impact.
  D. Check that the research and null hypotheses are consistent (both framed in relation to a directional or a nondirectional prediction).
II. Determine the characteristics of the comparison distribution. (Step 2.)
  A. This will be the distribution of Population 2.
  B. Note its $\mu$, $\sigma^2$, and shape (which will usually be normal).
III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected. (Step 3.)
  A. Decide on the significance level (1% or 5%).
  B. Determine the percentage of cases between the mean and where the appropriate percentage begins on the normal curve.
    1. If a one-tailed test, this is 50% minus the significance level.
    2. If a two-tailed test, this is 50% minus 1/2 the significance level.
  C. Look up the Z corresponding to this percentage in the "% Mean to Z" column of the normal curve table; this is the cutoff Z.
IV. Determine the sample's score on the comparison distribution. (Step 4.)
  A. Conduct the study and note the score of the individual (or note the result as given to you in the problem).
  B. Convert the raw-score result to a Z score on the comparison distribution: $Z = (X - \mu)/\sigma$.
V. Compare the scores obtained in Steps 3 and 4 to determine whether or not to reject the null hypothesis. (Step 5.)
  A. If the actual sample's Z score (from Step 4) is more extreme than the cutoff Z score (from Step 3):
    1. The null hypothesis is rejected.
    2. Thus, the research hypothesis is supported.
  B. If the actual sample's Z score (from Step 4) is not more extreme than the cutoff Z score (from Step 3):
    1. The null hypothesis is not rejected.
    2. Thus, the experiment is inconclusive.

Outline for Writing Essays for Hypothesis-Testing Problems Involving a Single Sample of One Subject and a Known Population

The reason you are asked to write essays in the practice problems and tests is that this task develops and then demonstrates what matters so very much: your comprehension of the logic behind the computations. (It is also a place where those better at words than numbers can shine, and where those better at numbers can develop their skills at explaining in words.)
Thus, to do well, be sure to do the following in each essay: (a) give the reasoning behind each step; (b) relate each step to the specifics of the content of the particular study you are analyzing; (c) state the various formulas in nontechnical language, because as you define each term you show you understand it (although once you have defined it in nontechnical language, you can use it from then on in the essay); (d) look back and be absolutely certain that you made it clear just why that formula or procedure was applied and why it is the way it is.

The outlines below are examples of ways to structure your essays. There are other completely correct ways to go about it. And this is an outline for an answer; you are to write the answer out in paragraph form. Examples of full essays are in the answers to Set I Practice Problems in the back of the text.

These essays are necessarily very long for you to write (and for others to grade). But this is the very best way to be sure you understand everything thoroughly. One shortcut you may see on a test is that you may be asked to write your answer for someone who understands statistics up to the point of the new material you are studying. You can choose to take the same shortcut in these practice problems (maybe writing for someone who understands right up to whatever point you yourself start being just a little unclear). But every time you write for a person who has never had statistics at all, you review the logic behind the entire course. You engrain it in your mind. Over and over. The time is never wasted. It is an excellent way to study.

I. Reframe the question into a research hypothesis and a null hypothesis about populations. (Step 1 of hypothesis testing.)
  A. State in ordinary language the hypothesis-testing issue: Does the score of the person studied represent a higher (or lower or different) score than would be expected if this person had just been a randomly selected example of people in general? That is, does this person represent a different group of people from people in general?
  B. Explain the language (to make the rest of the essay easier to write by not having to repeat long explanations), focusing on the meaning of each term in the concrete example of the study at hand.
    1. Populations.
    2. Research hypothesis.
    3. Null hypothesis.
    4. Rejecting the null hypothesis to provide support for the research hypothesis.
II. Determine the characteristics of the comparison distribution. (Step 2 of hypothesis testing.)
  A. Explain the principle that the comparison distribution is the distribution (pattern of spread of the scores) of the population which did not undergo the experimental manipulation; this is the distribution from which we would expect our case to be a random sample if the null hypothesis were true.
  B. Explicitly identify the characteristics of your comparison distribution, explaining what each characteristic means.
    1. Its mean (explain that this is the arithmetic average).
    2. Its standard deviation.
      a. This is a standard measure of how spread out it is.
      b. It is roughly the average amount scores vary from the mean.
      c. Strictly speaking, it is the square root of the average of the squares of the amount that each score differs from the mean.
    3. Its shape.
      a. Usually it is a normal curve.
      b. Describe the shape (or draw it).
      c. Note that this is a highly common shape for distributions; the percentage of cases above any given point (as measured in standard deviations from the mean) is available in tables.
III. Determine the cutoff score on the comparison distribution at which the null hypothesis should be rejected. (Step 3 of hypothesis testing.)
  A. Before you figure out how extreme your particular score is on this distribution, you want to know how extreme it would have to be to decide it was too unlikely that it could be just a randomly drawn case from this comparison distribution.
  B. Since this is a normal curve, you can use a table to tell you how many standard deviations from the mean your score would have to be to be in the top so many percent.
  C. Note that the number of standard deviations from the mean is called a Z score. (By explaining this term, you make writing simpler later on.)
  D. To use these tables, you have to decide the kind of situation you have; there are two considerations.
    1. Are you interested in the chances of getting a score this extreme in only one direction (such as only higher than for people in general) or in both? (Explain which is appropriate for your study.)
    2. Just how unlikely would the extremeness of a particular group's average have to be? The standard figure used in psychology is less likely than 5% (though 1% is sometimes used to be especially safe). (Say which you are using in your study; if no figure is stated in the problem and no special reason is given for using one or the other, the general rule is to use 5%.)
  E. With a little manipulation of numbers you can then look up the percentage in the table and find the Z score corresponding to that percentage.
IV. Determine the score of your sample on the comparison distribution. (Step 4 of hypothesis testing.)
  A. At this point the study would be conducted and the score of the individual obtained.
  B. The next step is to find where your actual score would fall on the comparison distribution, in terms of a Z score; that is, how many standard deviations it is above or below the mean on this distribution.
  C. State this Z score.
V. Compare the scores obtained in Steps 3 and 4 to decide whether to reject the null hypothesis. (Step 5 of hypothesis testing.)
  A. State whether your score (from Step 4) does or does not exceed the cutoff (from Step 3).
  B. If your score exceeds the cutoff:
    1. State that you can reject the null hypothesis.
    2. State that by elimination, the research hypothesis is thus supported.
    3. State in words what it means that the research hypothesis is supported (that is, the study shows that the particular experimental manipulation appears to make a difference in the particular thing being measured).
  C. If your score does not exceed the cutoff:
    1. State that you cannot reject the null hypothesis.
    2. State that the experiment is inconclusive.
    3. State in words what it means that the study is inconclusive (that is, the study did not yield results which give a clear indication of whether or not the particular experimental manipulation appears to make a difference in the particular thing being measured).
    4. Explicitly note that even though the research hypothesis was not supported in this study, this is not evidence that it is false; it is quite possible that it is true but has only a small effect, which was not sufficient to produce a score extreme enough to yield a significant result in this study.

Chapter Self-Tests

Multiple-Choice Questions

1. Which of the following statements is most accurate about hypothesis testing?
  a. It is a nonsystematic way of using your intuition to draw possible hypotheses about the result of a study.
  b. It is a systematic procedure for determining whether the results of an experiment provide support for a particular theory.
  c. It is a systematic procedure for disproving or, more importantly, proving your theory.
  d. It is an unimportant procedure that is rarely used anymore because of other mathematical procedures that have been found to be more efficient.
2. Suppose that a researcher wants to know if college students drink more coffee than people in general of college-student age. What would the research hypothesis be in this case?
  a. People over college-student age will drink less coffee than the students will.
  b. There will be no difference between the two populations.
  c. College students do not drink more coffee than people in general of college-student age.
  d. College students drink more coffee than people in general of college-student age.
3. Which of the following is true about the comparison distribution?
  a. It is another name for hypothesis testing.
  b. It is another name for Population 1.
  c. It represents the situation in which the null hypothesis is true.
  d. It represents the situation in which the research hypothesis is true.
4. What does it mean when a researcher chooses the cutoff on the comparison distribution to be .01?
  a. It means that to reject the null hypothesis the result must be higher than a Z score of .01.
  b. It means that to reject the null hypothesis there must be less than a 1% chance that the result would have happened by chance if the null hypothesis were true.
  c. It means that if the result is larger than .01 (such as .02 or .03), it is "statistically significant."
  d. It means that the research hypothesis is definitely true if the sample's score on the comparison distribution is less extreme than the cutoff that corresponds to the most extreme .1% of that comparison distribution.
5. What can be concluded if the null hypothesis is rejected?
  a. The research hypothesis is supported.
  b. The research hypothesis is true and the null hypothesis is false.
  c. The null hypothesis is true and the research hypothesis is false.
  d. The null hypothesis should have been based on a directional hypothesis.
6. Suppose a researcher wants to know if there is a gender difference in the number of dreams people remember. The results of such a study would be analyzed using
  a. a one-tailed test because only one issue is discussed: dreams.
  b. a one-tailed test because there is only one interaction, which is between gender and number of dreams.
  c. a two-tailed test because there is no predicted direction of the difference; the men could remember more or the women could.
  d. a two-tailed test because there are two variables involved: dreams and gender.
7. What should you do if you want to use a 5% significance level on a two-tailed test?
  a. Reject the null hypothesis if the sample is so extreme that it is in either the top 5% or the bottom 5% of the comparison distribution.
  b. Reject the null hypothesis if the sample is so extreme that it is in either the top 2.5% or the bottom 2.5% of the comparison distribution.
  c. Use a comparison distribution that is not a normal curve, such as a Poisson distribution.
  d. Use a comparison distribution that is a normal curve, but which has a standard deviation twice as large as you would use for a one-tailed test.
8. What does it mean if the research hypothesis is clearly directional?
  a. You use a one-tailed test.
  b. You use a two-tailed test.
  c. It is fairly clear that the null hypothesis will be supported so long as the comparison distribution follows a normal curve.
  d. It is fairly clear that the research hypothesis will be supported so long as the comparison distribution follows a normal curve.
9. Researchers are reluctant to use one-tailed tests because
  a. using a one-tailed test dramatically reduces the chance of getting a significant result, even if your research hypothesis is true (that is, even if the null hypothesis is false).
  b. research in psychology rarely involves studies which have a basis for predicting a particular direction of result.
  c. when using a one-tailed test, if a result comes out opposite to that which is predicted, it cannot be considered significant no matter how extreme it is.
  d. all of the above.
10. If a research report describes a result and notes "p < .05," this means
  a. the result is not statistically significant at the .05 level.
  b. the sample score falls in either the upper 5% or the lower 5% of the comparison distribution (making in reality a 10% chance of getting this result by chance).
  c. there is a 95% chance that the research hypothesis is true.
  d. the chances of getting this result if the null hypothesis is true are less than 5%.

Fill-In Questions

1. ______ is a procedure of inferential statistics in which you draw conclusions about hypotheses based on information in samples.
2-5. A study is conducted to test whether people who have taken a growth hormone are taller (that is, whether their height is greater) than people in general.
2. Population 1 is people who have taken the growth hormone. Population 2 is ______.
3. What is the research hypothesis? (State in terms of Populations 1 and 2.) ______
4. What is the null hypothesis? (State in terms of Populations 1 and 2.) ______
5. The comparison distribution is the distribution of ______.
6. In any study the comparison distribution represents the distribution of sample scores you would expect if the ______ is true.
7. In psychology conventional levels of significance are ______ and ______.
8. If the cutoff Z score is -1.96 and the score of your sample is -2.13, what should you conclude? ______
9. It is not correct to plan on using a one-tailed test, but if the result comes out in the opposite direction, to then apply a two-tailed test. For example, if the researcher were using the 5% significance level, using this plan would make the total probability of rejecting the null hypothesis by chance actually equal not to 5% but to ______%.
10. A study is done with a sample of one case. The general population (Population 2) has a mean of 10 and a standard deviation of 2. The cutoff Z score for significance in this study is 1.64. The raw score of the sample is 13. What should you conclude? ______

Problems and Essays

1. A researcher was interested in whether "mom's old cure," a glass of warm milk before bedtime, actually facilitates falling asleep. His previous research indicated that an unassisted subject falls asleep in a laboratory situation after an average of 27 minutes with a standard deviation of 7 minutes. He had a test subject drink a glass of warm milk and then measured the amount of time it took for the subject to fall asleep. It took the subject 14 minutes. (These are all fictional data.) (a) Based on these data, did the subject fall asleep significantly faster? (Use the .05 significance level.) (b) Explain your conclusion and procedure to a person who has never had a course in statistics.
2. The government reports that the average plane arrives 10 minutes late, with a standard deviation of 1.5 minutes. Your experience, however, is that a particular airline typically arrives later than other airlines. To test this, you go to the airport and check the arrival of a randomly selected flight from this company. The flight arrives 13.25 minutes late. (These are all fictional data.) (a) Based on these data, what should you conclude about this company's timeliness compared to airlines in general? (Use the .05 significance level.) (b) Explain your conclusion and procedure to a person who has never had a course in statistics.
3. Can drugs affect memory? A psychologist interested in this question administered a new drug to a group of students in order to see if their ability to immediately recall information was affected in any way. The psychologist had been using nonsense syllables in her studies and found that the average subject was able to immediately recall 7 items with a standard deviation of 2. The first data available from the drug study indicated that the subject had been able to recall only 4 items. (a) Was this finding significant at the .05 level? (b) Explain your conclusion and procedure to a person who has never had a course in statistics.
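For checking hand computations on problems of this kind, Steps 3 through 5 of the single-score procedure can be sketched in Python. This is only an illustrative sketch, not part of the text's procedure: the function name `single_score_z_test`, its parameters, and the example numbers (population mean 50, standard deviation 10, score 72) are hypothetical, and the cutoff is taken from Python's `statistics.NormalDist` rather than from a printed normal curve table, so it may differ from the table value in the third decimal place.

```python
from statistics import NormalDist


def single_score_z_test(score, pop_mean, pop_sd, alpha=0.05, tail="upper"):
    """Test one score against a known, normally distributed population.

    tail is "upper" or "lower" for a one-tailed test, "two" for a two-tailed test.
    Returns (cutoff_z, sample_z, reject_null).
    """
    std_normal = NormalDist()  # standard normal curve: mean 0, SD 1

    # Step 3: cutoff Z score on the comparison distribution (Population 2).
    if tail == "upper":
        cutoff = std_normal.inv_cdf(1 - alpha)      # all of alpha in the upper tail
    elif tail == "lower":
        cutoff = std_normal.inv_cdf(alpha)          # all of alpha in the lower tail
    elif tail == "two":
        cutoff = std_normal.inv_cdf(1 - alpha / 2)  # alpha split across both tails
    else:
        raise ValueError("tail must be 'upper', 'lower', or 'two'")

    # Step 4: the sample's Z score, Z = (X - mu) / sigma.
    z = (score - pop_mean) / pop_sd

    # Step 5: is the sample's Z more extreme than the cutoff?
    if tail == "upper":
        reject = z > cutoff
    elif tail == "lower":
        reject = z < cutoff
    else:
        reject = abs(z) > cutoff
    return cutoff, z, reject


# Hypothetical example: predicted higher score, .05 level, one-tailed.
cutoff, z, reject = single_score_z_test(score=72, pop_mean=50, pop_sd=10)
print(round(cutoff, 2), round(z, 2), reject)
```

A `True` result corresponds to rejecting the null hypothesis (the research hypothesis is supported); `False` means the study is inconclusive, not that the null hypothesis is shown to be true.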
4. A social psychologist was interested in whether teenagers who had played a lot of video games as children were able to learn to drive more quickly than the average. He obtained previous data regarding the amount of time (in hours) that it took the average teenager to learn to drive. Then he selected one high school student who reported having spent many hours playing video games and measured the amount of time it took for her to learn how to drive. Her Z score (Z = 1.23) was not significant at the .05 level. Explain these results and the steps of hypothesis testing to someone unfamiliar with statistics.

Note. There is no section on using SPSS or MYSTAT with this chapter because none of the procedures covered are easily implemented using standard computerized statistical packages. (The material in this chapter is mainly preparation for carrying out procedures that are widely used, however, and for which the computer can be very helpful.)

Chapter 7
Hypothesis Tests with Means of Samples

Learning Objectives

To understand, including being able to carry out any necessary procedures or computations:
- Why the distribution of means is the appropriate comparison distribution when using a sample of more than one case.
- How you would construct a distribution of means.
- The mean of the distribution of means, including why it equals the mean of the population of individual cases.
- The variance of the distribution of means.
- Why the variance of the distribution of means is less than the variance of the population of individual cases.
- Why the larger the sample size, the smaller the variance of the distribution of means.
- The standard deviation of the distribution of means.
- The shape of a distribution of means, including why it tends to be unimodal and symmetrical.
- The conditions under which the distribution of means is or closely approximates a normal curve.
- Conducting a hypothesis test with a sample of more than one case and a known population distribution.
- Type I and Type II error and the basis for conventional levels of significance.
- The possible correct and erroneous decisions in hypothesis testing.
- Problems with using norms in standardized tests.
- The use of the standard error in reporting research results.

Chapter Outline

I. The Distribution of Means as a Comparison Distribution
  A. When testing hypotheses with a sample of more than one case, the comparison distribution is not simply the distribution of Population 2 (the general population, people not exposed to the experimental manipulation).
    1. The score of interest in your sample in this case is the mean of the group of scores.
    2. But the distribution of Population 2 is a distribution of individual cases.
    3. Thus using Population 2's distribution as the comparison distribution would be a mismatch.
  B. The appropriate comparison distribution in this case is a distribution of means, the distribution of all possible means of samples of the size of your sample.
  C. It helps to have an intuitive understanding of how one would construct such a distribution.
    1. Select a random sample of cases of the given size (that is, of the given number of cases) from the population and compute its mean.
    2. Select another random sample of this size from the population and compute its mean.
    3. Repeat this process a very large number of times.
    4. Make a distribution of these means.
    5. Note that this procedure is only to explain the idea, and would be too much work and unnecessary in practice.
II. Characteristics of a Distribution of Means
  A. There is an exact mathematical relation of a distribution of means to the population of individual cases the samples are drawn from. This means that the characteristics of the distribution of means can be determined directly from knowledge of the characteristics of the population and the size of the samples involved.
  B. The mean of a distribution of means ($\mu_M$).
    1. Rule: The mean of the distribution of means is the same as the mean of the population of individual cases from which the samples are taken.
    2. Formula: $\mu_M = \mu$.
    3. Explanation.
      a. Each sample is based on randomly selected values from the population.
      b. Thus, sometimes the mean of a sample will be higher and sometimes lower than the mean of the whole population of individuals.
      c. There is no reason for these to average out higher or lower than the original population mean.
      d. Thus the average of these means (the mean of the distribution of means) should in the long run equal the population mean.
  C. The variance of a distribution of means ($\sigma_M^2$).
    1. Principle: The distribution of means will be less spread out than the population of individual cases from which the samples are taken.
    2. Explanation.
      a. Any one score, even an extreme score, has some chance of being selected in a random sample.
      b. However, the chance is less of two extreme scores being selected in the same random sample, particularly since in order to create an extreme sample mean they would have to be two scores which were extreme in the same direction.
      c. Thus, there is a moderating effect of numbers: In any one sample, the deviants tend to be balanced out by middle cases or by deviants in the opposite direction, making each sample tend towards the middle and away from extreme values.
      d. With fewer extreme values for the means, the variance of the means is less.
    3. Principle: The more cases in each sample, the less spread out is the distribution of means of that sample size. With a larger number of cases in each sample, it is even harder for extreme cases in that sample not to be balanced out by middle cases or extremes in the other direction in the same sample.
    4. Rule: The variance of a distribution of means is the variance of the distribution of the population of individual cases divided by the number of cases in the samples being selected.
5. Formula: $\sigma_M^2 = \sigma^2 / N$. (N is the number of cases in each sample.) D. The standard deviation of a distribution of means ($\sigma_M$). 1. Rule: The standard deviation of the distribution of means is the square root of the variance of the distribution of means. 2. Formula: $\sigma_M = \sqrt{\sigma_M^2} = \sqrt{\sigma^2 / N}$. 3. Special name: the standard error of the mean or the standard error, for short. E. The shape of a distribution of means. 1. It tends to be unimodal. This is due to the same basic process of extremes balancing each other out that we noted in the discussion of the variance: middle values are more likely and extreme values less likely. 2. It tends to be symmetrical for the same reason: since skew is caused primarily by extreme scores, if there are fewer extreme scores, there is less skew. 3. As the number of subjects in each sample gets larger, the distribution of means of all possible samples of that number of subjects is a better and better approximation to the normal curve. 4. With samples of 30 or more each, even with a quite nonnormal population of individual cases, the approximation of the distribution of means to a normal curve is so close that the percentages in the normal curve table will be extremely accurate. 5. Whenever the population distribution of individual cases is normal, a distribution of means, of whatever sample size, will always be normal. III. Hypothesis Testing Involving a Distribution of Means A. The distribution of means is the comparison distribution to which the sample mean can be compared in order to see how likely it is that such a sample mean could have been selected if the null hypothesis is true. B. It is the characteristics of the distribution of means that must be determined in Step 2 of the hypothesis testing process. C. It is the location of your sample on the comparison distribution that must be determined in Step 4 of hypothesis testing.
1. You are now finding a Z score of a sample mean on a distribution of means (instead of the Z score of a single subject on a distribution of a population of single subjects). 2. Thus, the formula is $Z = (M - \mu) / \sigma_M$. D. Other than using the distribution of means as the comparison distribution and locating the mean of your sample on this distribution, the process of hypothesis testing is exactly the same as in Chapter 6 (which focused on hypothesis testing involving a sample of just one subject). IV. Levels of Significance A. Type I error. 1. The significance level is the level of probability (such as 5%) up to which you would say your result could have occurred just by chance; with a result less likely than that level of probability you would reject the null hypothesis and conclude your research hypothesis has been supported. 2. The significance level can also be thought of as the risk or probability you accept of mistakenly rejecting the null hypothesis. 3. Mistakenly rejecting the null hypothesis is called a Type I error. 4. Thus, the significance level is the probability of making a Type I error. 5. Type I errors are of serious concern to scientists, who could construct entire theories and research programs - let alone practical applications - based on a result which in fact is fallacious. 6. Thus, the conservative approach is to set a very stringent significance level so that the results have to be quite extreme to reject the null hypothesis. B. Type II error. 1. If you set a very stringent significance level, you may fail to reject the null hypothesis - that is, you may fail to decide the research hypothesis is demonstrated by the evidence of the sample - when in fact the research hypothesis is true. 2. Mistakenly failing to reject the null hypothesis is called a Type II error. 3. Type II errors concern scientists, and especially those interested in applications of psychological knowledge, because a Type II error could mean that a good theory or useful practical procedure is not used. C.
Minimizing the chance of making one kind of error increases the chance of making the other. D. The conventional 5% and 1% significance levels represent a compromise between these offsetting risks. E. There are two kinds of possible correct decisions and two kinds of possible erroneous decisions you can make in hypothesis testing. (These are hypothetical possibilities useful for understanding the logic of setting significance levels, but we never actually know whether the research hypothesis is true or false.) 1. The research hypothesis is actually true and the hypothesis-testing procedure results in rejecting the null hypothesis (a correct decision). 2. The research hypothesis is actually false and the hypothesis-testing procedure results in rejecting the null hypothesis (a Type I error). 3. The research hypothesis is actually true and the hypothesis-testing procedure results in failing to reject the null hypothesis (a Type II error). 4. The research hypothesis is actually false and the hypothesis-testing procedure results in failing to reject the null hypothesis (a correct decision). V. Controversies and Limitations: Population Norms A. The known population means and standard deviations on standardized psychological tests are called norms. B. In practice norms are actually based not on the entire population but on very large samples that are thought to be reasonably representative of the larger population. C. However, these samples, by averaging all types of people together, may over- or underrepresent subgroups (such as ethnic minorities) for whom the test or the testing procedure may have a different meaning. D. This problem is especially acute when ability or intelligence tests are used to compare various groups and conclusions are drawn (incorrectly) about differences in abilities or intelligence. VI. Topics of this Chapter as Reported in Research Articles A.
Research situations in which there is a known population mean and standard deviation are quite rare in psychology, so they seldom appear in research articles; the main reason we have asked you to learn about this situation is that it is a necessary building block to understanding hypothesis testing in the more common research situations. B. When such hypothesis tests are reported, the procedure may be described as a "Z test." C. Researchers will sometimes report one statistic discussed in this chapter, the standard deviation of the distribution of means. 1. This is done as an indication of the amount of variation that might be expected among means of samples of a given size from this population. 2. In this context it is usually identified as the "standard error" or abbreviated as SE. 3. Often the lines that go above and below the tops of the bars in a bar graph refer to standard error (instead of standard deviation). Formulas I. Variance of a distribution of means ($\sigma_M^2$) Formula in words: The variance of the distribution of the population of individual cases divided by the number of individual cases in each sample. Formula in symbols: $\sigma_M^2 = \sigma^2 / N$ (7-1) $\sigma^2$ is the variance of the population of individual cases. N is the number of individual cases in each sample. II. Standard deviation of a distribution of means, standard error ($\sigma_M$) Formula in words: The square root of the variance of the distribution of means. Formula in symbols: $\sigma_M = \sqrt{\sigma_M^2}$ (7-2) or $\sigma_M = \sigma / \sqrt{N}$ (7-3) III. Location of the sample mean on the distribution of means (Z) Formula in words: Deviation of the mean of the sample from the mean of the known population (same as mean of the distribution of means), divided by the standard deviation of the distribution of means. Formula in symbols: $Z = (M - \mu) / \sigma_M$ M is the mean of the sample. $\mu$ is the mean of the known population (same as the mean of the distribution of means).
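The three formulas above can be checked with a short calculation. The following is a minimal Python sketch, not part of the text; the function names and the numbers (a population mean of 100, a population standard deviation of 15, a sample of 25 cases with a mean of 106) are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math


def variance_of_means(pop_variance: float, n: int) -> float:
    """Formula 7-1: population variance divided by the cases per sample."""
    return pop_variance / n


def standard_error(pop_sd: float, n: int) -> float:
    """Formula 7-2: square root of the variance of the distribution of means."""
    return math.sqrt(variance_of_means(pop_sd ** 2, n))


def z_of_sample_mean(sample_mean: float, pop_mean: float,
                     pop_sd: float, n: int) -> float:
    """Location of the sample mean on the distribution of means."""
    return (sample_mean - pop_mean) / standard_error(pop_sd, n)


# Hypothetical values for illustration only (not taken from the text).
POP_MEAN, POP_SD, N, SAMPLE_MEAN = 100.0, 15.0, 25, 106.0

var_m = variance_of_means(POP_SD ** 2, N)   # 225 / 25
se = standard_error(POP_SD, N)              # square root of var_m
z = z_of_sample_mean(SAMPLE_MEAN, POP_MEAN, POP_SD, N)

print(f"variance of means = {var_m}, standard error = {se}, Z = {z}")
```

With these assumed numbers the standard error is one fifth of the population standard deviation, because the sample has 25 cases and the square root of 25 is 5. Formula 7-3 gives the same standard error directly as 15 divided by 5.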
How to Test an Hypothesis Involving a Single Sample of More than One Subject and a Known Population I. Reframe the question into a research hypothesis and a null hypothesis about populations. (Step 1.) A. Identify the two populations. 1. Population 1 is people like those studied who have been exposed to the experimental manipulation. 2. Population 2 is people like those studied but who have not been exposed to the experimental manipulation. (This is usually people from the general public, of the category studied.) B. State the research hypothesis. 1. Decide whether this will be directional or nondirectional. 2. State in terms of the two populations (that the mean of one will be higher, lower, or the same as the other). C. State the null hypothesis: Populations 1 and 2 are the same; the manipulation had no impact. D. Check that the research and null hypotheses are consistent (in terms of both being in relation to a directional or nondirectional prediction). II. Determine the characteristics of the comparison distribution. (Step 2.) A. This will be a distribution of means of samples of the number of subjects in the sample being studied. B. Its mean is the same as the mean of the population of individual cases from which the samples are taken: $\mu_M = \mu$. C. Its standard deviation is the square root of the result of dividing the variance of the distribution of the population of individual cases by the number of cases in the samples being selected: $\sigma_M = \sqrt{\sigma^2 / N}$. (N is the number of cases in each sample.) D. Its shape. 1. Normal if the population is normal. 2. Very close to normal if N is greater than 30, regardless of population shape. 3. Otherwise, unimodal and symmetrical, but not normal. III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected. (Step 3.) A. Decide on the significance level (1% or 5%). B.
Determine the percentage of cases between the mean and where the appropriate percentage begins on the normal curve. 1. If a one-tailed test, this is 50% minus the significance level. 2. If a two-tailed test, this is 50% minus 1/2 the significance level. C. Look up the Z corresponding to this percentage in the % Mean to Z column in the normal table; this is the cutoff Z. IV. Determine the sample's score on the comparison distribution. (Step 4.) A. Conduct the study and compute the mean of the scores in the sample studied (or note the result as given to you in the problem). B. Convert the raw score result to a Z score on the comparison distribution (the distribution of means): $Z = (M - \mu) / \sigma_M$. V. Compare the scores obtained in Steps 3 and 4 to determine whether or not to reject the null hypothesis. (Step 5.) A. If the actual sample's Z score (from Step 4) is more extreme than the cutoff Z score (from Step 3). 1. The null hypothesis is rejected. 2. Thus, the research hypothesis is supported. B. If the actual sample's Z score (from Step 4) is not more extreme than the cutoff Z score (from Step 3). 1. The null hypothesis is not rejected. 2. Thus, the experiment is inconclusive. Outline for Writing Essays for Hypothesis Testing Problems Involving a Single Sample of More than One Subject and a Known Population The reason for your writing essay questions in the practice problems and tests is that this task develops and then demonstrates what matters so very much: your comprehension of the logic behind the computations. (It is also a place for those of you who are better at words than numbers to shine. And for those better at numbers to develop their skills at explaining in words.)
Thus, to do well you need to be sure to do the following in each essay: (a) give the reasoning behind each step; (b) relate each step to the specifics of the content of the particular study you are analyzing; (c) state the various formulas in nontechnical language, because as you define each term you show you understand it (although once you have defined it in nontechnical language, you can use it from then on in the essay); (d) look back and be absolutely certain that you made it clear just why that formula or procedure was applied and why it is the way it is. The outlines below are examples of ways to structure your essays. There are other completely correct ways to go about it. And this is an outline for an answer; you must write the answer out in paragraph form. Examples of full essays are in the answers to Set I Practice Problems in the back of the text. These essays are necessarily very long for you to write (and for others to grade). But this is the very best way to be sure you understand everything thoroughly. One shortcut you may see on a test is to write your answer for someone who understands statistics up to the point of the new material you are studying. You can choose to take the same shortcut in these practice problems (maybe writing for someone who understands right up to whatever point you yourself start being just a little unclear). But every time you write for a person who has never had statistics at all, you review the logic behind the entire course. You engrain it in your mind. Over and over. The time is never wasted. It is an excellent way to study. I. Reframe the question into a research hypothesis and a null hypothesis about populations. (Step 1 of hypothesis testing.) A.
State in ordinary language the hypothesis testing issue: Does the average of the scores of the group of persons studied represent a higher (or lower or different) mean than would be expected if this group of persons had just been a randomly selected example of people in general - that is, does this set of people studied represent a different group of people from people in general? B. Explain language (to make rest of essay easier to write by not having to repeat long explanations each time), focusing on the meaning of each term in the concrete example of the study at hand. 1. Populations. 2. Sample. 3. Mean. 4. Research hypothesis. 5. Null hypothesis. 6. Rejecting the null hypothesis to provide support for the research hypothesis. II. Determine the characteristics of the comparison distribution. (Step 2 of hypothesis testing.) A. Explain principle that the comparison distribution is the distribution (pattern of spread of the means of scores) that represents what we would expect if the null hypothesis were true and our particular mean were just randomly sampled from this population of means. B. Note that because we are interested in the mean of a sample of more than one case, we have to compare our actual mean to a distribution not of individual cases but of means. C. Give an intuitive understanding of how one might construct a distribution of means for samples of a given size from a particular population. 1. Select a random sample of the given size (number of subjects) from the population and compute its mean. 2. Select another random sample of this size from the population and compute its mean. 3. Repeat this process a very large number of times. 4. Make a distribution of these means. 5. Note that this procedure is only to explain the idea, and would be too much work and unnecessary in practice. D.
There is an exact mathematical relation of a distribution of means to the population the means are drawn from, so that in practice the characteristics of the distribution of means can be determined directly from knowledge of the characteristics of the population and the size of the samples involved. E. The mean of a distribution of means. 1. It is the same as the mean (average) of the scores in the known population of individual cases. 2. State what it is in the particular problem you are working on. 3. Explanation. a. Each sample is based on randomly selected values from the population. b. Thus, sometimes the mean of a sample will be higher and sometimes lower than the mean of the whole population of individuals. c. There is no reason for these to average out higher or lower than the original population mean. d. Thus the average of these means (the mean of the distribution of means) should in the long run equal the population mean. F. The standard deviation of a distribution of means. 1. The distribution of means will be less spread out than the population of individual cases from which the samples are taken. a. Any one score, even an extreme score, has some chance of being selected in a random sample. b. However, the chance is less of several extreme scores being selected in the same random sample, particularly since in order to create an extreme sample mean they would all have to be scores which were extreme in the same direction. c. Thus, there is a moderating effect of numbers. In any one sample, the deviants tend to be balanced out by many more middle cases or by deviants in the opposite direction, making each sample tend towards the middle and away from extreme values. d. With fewer extreme values for the means, the variation among the means is less. 2. The more cases in each sample, the less spread out is the distribution of means of that sample size.
With a larger number of cases in each sample, it is even harder for extreme cases in that sample not to be balanced out by middle cases or extremes in the other direction in the same sample. 3. Explain the idea of standard deviation. a. It is a standard measure of how spread out a distribution is. b. It is roughly the average amount scores vary from the mean. c. Exactly speaking, it is the square root of the average of the squares of the amount that each score differs from the mean. 4. The standard deviation of the distribution of means is found by a formula that divides the average of squared deviations from the mean of the population by the number of subjects in the sample (thus making it smaller in proportion to the number of subjects in the sample), and taking the square root of this result. 5. State what it is in the particular problem you are working on, describing the steps of computation. G. The shape of a distribution of means. 1. If your N is greater than 30. a. The distribution of means will be approximately normal. b. Explain that a normal curve is a bell-shaped distribution that is very common in psychology. c. The distribution tends to be normal due to the same basic process of extremes balancing each other out that we noted in the discussion of the standard deviation: middle values are more likely and extreme values less likely. 2. If the population of individual cases in your situation is normal. a. The distribution of means will be approximately normal. b. This is because the distribution of means will have nothing to distort it from looking like the distribution of individual cases. III. Determine the cutoff score on the comparison distribution at which the null hypothesis should be rejected. (Step 3 of hypothesis testing.) A.
Before you figure out how extreme your particular sample's mean is on this distribution of means, you want to know how extreme it would have to be to decide it was too unlikely that it could have been a randomly drawn mean from this comparison distribution of means. B. Since in this problem the comparison distribution is a normal curve, you can use a table to tell you how many standard deviations from the mean your score would have to be to be in the top so many percent. C. Note that the number of standard deviations from the mean is called a Z score. (By explaining this term, you make the writing simpler later on.) D. To use these tables you have to decide the kind of situation you have; there are two considerations. 1. Are you interested in the chances of getting this extreme of a mean that is extreme in only one direction (such as only higher than for people in general) or in both? (Explain which is appropriate for your study.) 2. Just how unlikely would the extremeness of a particular mean have to be? The standard figure used in psychology is less likely than 5% (though 1% is sometimes used to be especially safe). (Say which you are using in your study; if no figure is stated in the problem and no special reason given for using one or the other, the general rule is to use 5%.) E. With a little manipulation of numbers you can then look up the percentage in the table and find the Z score corresponding to that percentage. F. State the cutoff for your particular problem. IV. Determine the score of your sample on the comparison distribution. (Step 4 of hypothesis testing.) A. At this point the study would be conducted and the mean score of the sample studied obtained. (State what it is.) B. The next step is to find where your actual sample's mean would fall on the comparison distribution, in terms of a Z score. C. State this Z score. V. Compare the scores obtained in Steps 3 and 4 to decide whether to reject the null hypothesis. (Step 5 of hypothesis testing.) A.
State whether your mean (from Step 4) does or does not exceed the cutoff (from Step 3). B. If your mean exceeds the cutoff. 1. State that you can reject the null hypothesis. 2. State that by elimination, the research hypothesis is thus supported. 3. State in words what it means that the research hypothesis is supported (that is, the study shows that the particular experimental manipulation appears to make a difference in the particular thing being measured). C. If your score does not exceed the cutoff. 1. State that you cannot reject the null hypothesis. 2. State that the experiment is inconclusive. 3. State in words what it means that the study is inconclusive (that is, the study did not yield results which give a clear indication of whether or not the particular experimental manipulation appears to make a difference in the particular thing being measured). 4. Explicitly note that even though the research hypothesis was not supported in this study, this is not evidence that it is false; it is quite possible that it is true but it has only a small effect which was not sufficient to produce a mean extreme enough to yield a significant result in this study. Chapter Self-Tests Multiple-Choice Questions 1. When testing an hypothesis in which the group studied is a mean of a sample of scores, the proper comparison distribution is a. the original population from which the sample was taken. b. the population that would exist if the null hypothesis were true. c. the distribution of all possible means of the sample size (N) from the known population. d. the distribution of the individual scores from which the sample mean was calculated. 2. The mean for the general population on a particular standardized test is known to be 56 with a standard deviation of 8. What would be the mean of the distribution of means for a study involving this test? 3.
The variance of a distribution of means is calculated by a. multiplying the variance of the population of individual cases by the number of subjects in each sample. b. dividing the variance of the population of individual cases by the number of subjects in each sample. c. subtracting the sample's variance from the population variance. d. taking the square root of the variance of the population of individual cases. 4. In comparison to the original population data, the shape of the distribution of means is a. a bimodal configuration. b. a rectangular shape. c. more spread out than the original population. d. less spread out than the original population. 5. When calculating the Z score of your sample's mean on a distribution of means, you should a. estimate the variance of the comparison distribution by taking the square root of the sample's variance. b. treat the sample mean like a single score and find its Z score on the distribution of means. c. divide the variance of the distribution of means by the mean of the sample. d. take the square root of the differences between the two means. 6-9. An experimenter was interested in the relation of social class to motivation, using a standard test of motivation (for the general population, the mean is 90 and the standard deviation is 10). Based on a theory she had constructed, she expected that those in the upper class would score lower. She administered the test to 20 members of an upper-class community. Her obtained sample mean on the test was 87. 6. In this example, the null hypothesis would be a. upper-class communities have higher motivation than the population in general. b. the mean of the population the sample represents is no lower than the mean of the general population. c. the sample variance will equal the population variance. d. there will be no difference between the population and the distribution of means. 7. The characteristics of the comparison distribution are 8.
The cutoff Z score (using the .05 level) is a. ±1.64. b. +1.96. c. -1.64. d. -2.33. 9. If the sample mean's score on the comparison distribution is more extreme than the cutoff, what should the researcher conclude? a. Upper class people are more motivated. b. Upper class people are less motivated. c. Upper class people are neither more nor less motivated, but have approximately the same level of motivation. d. The results are inconclusive. 10. If you use the .01 significance level instead of the .05 significance level, you increase the chance of a. incorrectly supporting the research hypothesis. b. incorrectly rejecting the null hypothesis. c. a Type II error. d. a Type I error. Fill-In Questions 1. When a study is being conducted in which a group of people are studied to see if they represent a group that is different from the general population, the comparison distribution is a(n) ______. 2. The three characteristics of a comparison distribution are its mean, variance (or standard deviation), and ______. 3-4. $\mu = 40.8$, $\sigma = 8$, and there are 16 people in the sample being studied, which has a mean of 44.9. 5. In a different study, $\mu_M = 107$, $\sigma_M = 3$. If a group of 40 people studied has a mean of 101, what is their Z score on the comparison distribution? 6. If a distribution of means comes from a nonnormal population, its distribution is nevertheless considered normal for all practical purposes as long as the sample size is at least ______. 7. The standard error of the mean is the same as ______. 8. If the outcome of carrying out a hypothesis testing procedure is to reject the null hypothesis, this can be either a correct decision or a Type ______ error. 9. A(n) ______ is a standardized figure (such as the mean or standard deviation) for the general population based on testing large numbers of people on some particular test.
10. It is rare that psychologists actually carry out hypothesis testing procedures involving a single sample and a population whose mean and variance are known (the kind of hypothesis tests we are considering in this chapter). But when they do carry out such a procedure and report the result in a research article, it is called a(n) ______. Problems and Essays 1. A psychologist interested in the effects of music administers a measure of logical reasoning ability to 5 subjects (randomly selected from the general population of adults) while they are listening to soothing music. Their scores on the test were 68, 39, 55, 73, and 80. Previous research using this test indicates that in the general population the distribution is normal, with $\mu = 48$ and $\sigma = 12$. (a) Do people do better when listening to soothing music (use the .01 level)? (b) Explain what you have done and your conclusions to a person who has never had a course in statistics. 2. A learning psychologist was interested in whether providing a stationary pattern of small lights on the ceiling and walls of a darkened laboratory would decrease the time it took for rats to learn a maze. (She was wondering if the rats would be able to use the stationary cues to aid in their navigation.) From her previous research she knew that on the average it took a rat 38 trials, with a standard deviation of 6, to learn a particular maze, and that the distribution is normal. She then tested a group of 10 rats in the changed laboratory and found their average number of trials to learn the maze was 36. (a) Was this difference significant at the .05 level? (b) Explain what you have done and your conclusions to a person who has never had a course in statistics. 3. For each of the following hypothetical studies, explain what a Type I and Type II error would be, and what each would mean.
(a) A study comparing the mean test score for a class using a new learning technique (expected to improve test scores) with the average score on the same test. (b) A study testing whether chimpanzees injected with a new drug learn to use symbolic communication faster than the normal chimp. 4. A school psychologist was wondering how the preschoolers at the school where he worked compared to other preschoolers who have taken a standardized problem-solving test. He therefore administered the test to all the preschool children and found that their average was significantly higher than the mean for the general population (the Z score was +1.97) at the .05 level. How would the psychologist interpret these findings and report them to a group of people unfamiliar with statistics? Note. There is no section on using SPSS or MYSTAT with this chapter because none of the procedures covered are easily implemented using standard computerized statistical packages. (The material in this chapter is mainly preparation for carrying out procedures that are widely used, however, and for which the computer can be very helpful.) Chapter 8 Statistical Power and Effect Size Learning Objectives To understand, including being able to carry out any necessary procedures or computations: What is statistical power. Alpha, beta, power, and possible correct and erroneous outcomes of hypothesis testing. Calculating power for a study with a single sample and a known population. Power tables. The influences on power. The relation to power of effect size and each of its components (difference between means and population standard deviation). How to determine the predicted mean of a population exposed to the experimental manipulation. The computation of effect size and its uses. Effect size conventions. Relation of sample size to power. Relation of significance level to power. Relation of using one- versus two-tailed tests to power. Role of power when designing a study.
Ways to increase the power of a planned study (and the advantages and disadvantages of each). Role of power in evaluating results of a study (for both when a study is and is not significant). Issues regarding the possibility of "proving" the null hypothesis. Issues regarding the comparative advantage of emphasizing effect size versus statistical significance in interpreting the result of a completed study. Meta-analysis and its advantages and disadvantages over narrative reviews of the literature. How power and effect size are discussed in research articles. Chapter Outline I. What Is Statistical Power? A. The probability that a study will yield significant results if the research hypothesis is true. B. In contrast, if the research hypothesis is not true, one would not want to get significant results (that would be a Type I error). C. Also in contrast, even if the research hypothesis is true, the study may not necessarily give significant results: the particular sample that happens to be selected from the population studied may not turn out to be extreme enough to provide a clear case for rejecting the null hypothesis (this would be a Type II error). D. Alpha, beta, and power. 1. Alpha ($\alpha$) is the probability of a Type I error (it is the same as the significance level, usually .01 or .05). 2. Beta ($\beta$) is the probability of a Type II error. 3. Power is the probability of not making a Type II error; thus power = $1 - \beta$. E. Review of possible correct and erroneous decisions in hypothesis testing. 1. The research hypothesis is actually true and the hypothesis-testing procedure results in rejecting the null hypothesis (a correct decision). Probability of this outcome = power. 2. The research hypothesis is actually false and the hypothesis-testing procedure results in rejecting the null hypothesis (a Type I error). Probability of this outcome = $\alpha$.
3. The research hypothesis is actually true and the hypothesis-testing procedure results in failing to reject the null hypothesis (a Type II error). Probability of this outcome = $\beta$. 4. The research hypothesis is actually false and the hypothesis-testing procedure results in failing to reject the null hypothesis (a correct decision). Probability of this outcome = $1 - \alpha$. II. Calculating Statistical Power A. The procedures considered below are for studies involving the mean of an actual sample compared to a known population, and where the distributions of means can be assumed to be normal and to come from populations with the same standard deviation. B. Logic of computing power. 1. The cutoff on the comparison distribution (determined in the steps of hypothesis testing) is the score or point at which, if a mean of the actual sample were greater than it, this would be grounds for rejecting the null hypothesis. 2. This cutoff is usually stated as a Z score. 3. But one can determine the raw-score equivalent to this Z score - that is, the raw score at which, if a mean of the actual sample were greater than it, this would be grounds for rejecting the null hypothesis. 4. Now consider the distribution of means predicted by the research hypothesis (this is something we have not considered before). a. It will have a different mean than the comparison distribution; for example, if a higher score is predicted by the research hypothesis, the mean of this distribution will be higher than that of the comparison distribution. b. To compute power one must make an explicit prediction of the mean of this distribution. 5. Power is the percentage of cases on the distribution of means based on the research hypothesis that fall above the raw-score cutoff value, which was originally computed using the comparison distribution (which is based, of course, on the null hypothesis). This same raw-score cutoff point is thus positioned in quite different places on the two distributions.
    6. Note that the computation of power has nothing to do with the actual outcome of the experiment. In fact, it is ordinarily computed in advance of actually conducting the study, to help determine whether the study has enough power to be conducted or whether the procedures of the proposed study should first be adjusted in some way to make it more powerful.
  C. Systematic steps of computing power.
    1. Gather the needed information.
      a. The mean ($\mu_M$) and standard deviation ($\sigma_M$) of the comparison distribution.
      b. The predicted mean of the population that receives the experimental intervention (how to do this is discussed later in the chapter).
    2. In the usual way, find the cutoff (which is always found in tables and is in Z-score terms) needed on the comparison distribution in order to reject the null hypothesis. This is the same as you do in the first three steps of hypothesis testing.
    3. Convert this Z score to a raw score using the mean and standard deviation (still of the comparison distribution): raw cutoff = $(Z)(\sigma_M) + \mu_M$.
    4. Determine the Z score corresponding to this raw-score cutoff on the distribution of means for the population that receives the experimental manipulation. That is, Z = (raw cutoff - mean of predicted distribution) / $\sigma_M$.
    5. Using the normal curve table, determine the probability of getting a score more extreme than this new Z score; this is power.
    6. Beta is the remaining probability (1 - power).
  D. Power tables.
    1. The procedures required for more complicated hypothesis-testing situations (the situations covered in the remainder of this text) follow the same logic but involve considerable additional work to carry out by hand, nor is it usually feasible to do it by computer.
    2. However, statisticians have prepared power tables that greatly simplify the process.
    3. The fundamental logic on which these tables are based is exactly what you have learned here, and using the tables requires exactly the same information that is needed to compute power directly.
    4. In later chapters of this text, whenever you learn a new hypothesis-testing procedure, you will also be furnished with power tables (and instructions on how to use them). (An index of these tables is provided in Appendix B of the text.)

III. Influences on Power
  A. Primary influences.
    1. Effect size, which has two elements.
      a. One element is the magnitude of the difference between the comparison and the predicted means.
      b. The other element is the standard deviation of the populations of individual cases.
    2. Sample size.
  B. Secondary influences.
    1. Level of statistical significance ($\alpha$).
    2. Two-tailed versus one-tailed tests.
    3. Type of hypothesis-testing procedure used.

IV. Effect Size
  A. It can be thought of as the amount the two population distributions do not overlap.
  B. The larger the effect size, the greater the power.
  C. The larger the difference between the two means (and thus the more offset the two distributions are from each other, minimizing their overlap), the greater the effect size.
  D. How to determine the predicted mean of the distribution based on the research hypothesis.
    1. Estimate it from previous similar research.
    2. Calculate it from a precise theory.
    3. Determine the smallest difference that would be practically or theoretically interesting (method of the minimum meaningful difference).
  E. The smaller the population standard deviation (and thus the less overlap because each distribution is narrower), the greater the effect size.
  F. One measure of effect size (d) is the difference between the two means divided by the population standard deviation: $d = (\mu_1 - \mu_2) / \sigma$.
  G. The general importance of effect size.
    1. Note that the computation of effect size does not use the standard deviation of the distribution of means ($\sigma_M$), but that of the original population of individual cases ($\sigma$).
    2. Dividing the mean difference by the standard deviation of the population of individual cases standardizes the difference in the same way that a Z score gives us a standard metric for comparison to other scores, even other scores on different scales.
    3. Because it provides a standard metric for comparison, especially by using the standard deviation of the population of individual cases, we bypass the dissimilarity from study to study of different sample sizes, making comparison even easier and effect size even more of a standard metric.
    4. Thus, knowing the effect size of a study permits us to compare results with effect sizes found in other studies, even other studies using different sample sizes.
    5. Equally important, knowing effect size can permit us to compare studies using different measures, which may have scales with quite different means and variances.
    6. Even within a particular study, we can apply our general knowledge of what is a small or large effect size.
  H. Effect size conventions: Developed by Cohen, based on what is typically found in psychology research.
    1. Small effect size.
      a. About 85% overlap of the two populations.
      b. d = .2: That is, the predicted mean is about two-tenths of a standard deviation higher than the mean of the known population.
      c. An example is the difference in height between 15- and 16-year-old girls.
    2. Medium effect size.
      a. About 67% overlap of the two populations.
      b. d = .5: That is, the predicted mean is about half a standard deviation higher than the mean of the known population.
      c. An example is the difference in height between 14- and 18-year-old girls.
    3. Large effect size.
      a. About 53% overlap of the two populations.
      b. d = .8: That is, the predicted mean is about eight-tenths of a standard deviation higher than the mean of the known population.
      c. An example is the difference in height between 13- and 18-year-old girls.
  I. If you know the effect size (for example, based on Cohen's conventions) and the population standard deviation, it is possible to compute the expected mean difference by solving the effect size formula for ($\mu_1 - \mu_2$).

V. Sample Size
  A. The larger the sample size, the greater the power.
  B. This is because the variance of the distribution of means is based on the population variance divided by the sample size: the larger the sample size, the smaller the variance, and the smaller the variance, the less overlap of the distributions of means.
  C. Determining the needed number of subjects to attain a given level of power.
    1. One reason the influence of sample size on power is so very important is that the number of subjects is something the researcher can often control prior to the experiment.
    2. The number of subjects needed for a given level of power can be found by turning the steps of computing power on their head.
      a. Begin with a desired level of power (often 80% is used).
      b. Then calculate how many subjects are needed to get that level of power.
    3. In practice researchers use special tables for this purpose (subsequent chapters of the text provide such tables for each new hypothesis-testing procedure).

VI. Other Influences on Power
  A. Significance level ($\alpha$).
    1. The less stringent the significance level (for example, .05 versus .01), the more power.
    2. This is because the cutoff for a less stringent significance level will not be as extreme.
  B. Two-tailed versus one-tailed tests.
    1. One-tailed tests have more power (for results in the predicted direction).
    2. This is because the cutoff in the predicted direction is less extreme (since all the $\alpha$ percentage is at that end, instead of being divided in half).
  C. Type of hypothesis-testing procedure.
    1. Sometimes a researcher has a choice of more than one statistical procedure to apply to a given set of results, and each has its own power.
    2. Sometimes there may be more than one type of research design available, such as between-subjects versus within-subjects (see Appendix A), and each of these also has its own effect on power.

VII. Role of Power When Designing a Study
  A. If a researcher checks and finds that the power of a planned experiment is low, it is clear that, even if the research hypothesis is true, this study is not likely to yield significant results in support of that research hypothesis. Thus the researcher must seek practical ways to modify the study to increase the power to an acceptable level.
  B. What is an acceptable level of power?
    1. 80% is a widely used convention.
    2. If a study is very difficult or costly to conduct, a researcher might want even higher levels (such as 90% or even 95%) before undertaking the project.
    3. If a study is very easy and inexpensive to conduct, a researcher might be willing to take a chance with a somewhat lower level of power (such as 60% or 70%).
    4. The acceptable level of power also depends on how difficult and costly it is to increase power.

VIII. How to Increase the Power of a Planned Study
  A. Increasing the expected difference between population means.
    1. If the original prediction is the most accurate available, arbitrarily changing it would undermine the accuracy of the power calculation.
    2. However, it is sometimes possible to change the way the experiment is being conducted (for example, by increasing the intensity of the experimental manipulation) so that the researcher would have reason to expect a larger mean difference.
    3. Disadvantages of this approach.
      a. Can be difficult or costly to implement.
      b. Can create circumstances in implementing the experimental treatment that are unrepresentative of those to which the results are intended to be generalized.
  B. Decreasing the population standard deviation.
    1. Conduct the study using a population that is less diverse than the one originally planned; however, this limits the scope of the population to which the results can be generalized.
    2. Use conditions of testing that are more constant (such as controlled laboratory conditions) and measures that are more precise; this is a highly recommended approach.
  C. Increasing sample size.
    1. The most commonly used, straightforward way to increase power.
    2. In some cases, however, there may be limits to the number of subjects available or great costs in recruiting or testing additional numbers.
  D. Using a less stringent significance level; however, this increases the risk of a Type I error, and thus should be used cautiously.
  E. Using a one-tailed test; however, one runs the various risks discussed in Chapter 6 with one-tailed tests, most notably the possibility of having to deal with opposite-to-predicted results.
  F. Using a more sensitive hypothesis-testing procedure; when choices are available (and there are no offsetting disadvantages), one should always use the procedure that gives greatest power.

IX. Role of Power in Evaluating Results of a Study
  A. When a result is significant.
    1. Statistical significance is a necessary prerequisite to considering a result as either theoretically or practically important.
    2. For a result to be practically important, however, in addition to statistical significance, it should be of a reasonable effect size.
    3. It is easy for a study with a very small effect size, having little practical importance, to still come out significant if the study has reasonable power due to other factors, especially a large sample size.
    4. But if the sample was small, you can assume a significant result is probably also practically important.
    5. When comparing two studies, the effect sizes and not the significance levels obtained should be compared (since the significance levels could be due to different sample sizes and not to different underlying effects in the populations).
  B. When a result is not significant.
    1. If the power of the study was low.
      a. Failing to get a significant result is especially inconclusive.
      b. The nonsignificant outcome could be because the research hypothesis was false, or it could be because the research hypothesis was true but the study had too little power to come out significant.
    2. If the power of the study was high.
      a. Failing to get a significant result suggests more strongly that the research hypothesis (as specified with a specific mean difference) is false.
      b. This does not mean that all versions of the research hypothesis are false (it is possible that the experimental manipulation does make a difference, but a much smaller one than was hypothesized when computing power).

X. Controversies and Limitations
  A. Can you "prove" the null hypothesis?
    1. The traditional principle is that when the results are not strong enough to reject the null hypothesis, the implication is that the results are inconclusive, not that the null hypothesis is true.
    2. This is because, no matter how much power an experiment has, it is always possible that there is a real effect.
    3. Thus, in general, it is best to avoid conducting studies in which your prediction is that the populations do not differ.
    4. However, you can use statistical results to provide a convincing case that whatever difference exists must be very, very small.
      a. Specify a specific small amount of difference.
      b. Use that small difference as the predicted effect size.
      c. Conduct a study that has very high power in spite of that small effect size.
      d. If the study fails to achieve significance even with this amount of power, then you have what amounts to significance, at a level of 1 minus your power, for the claim that any difference in the populations is smaller than the small difference you initially specified.
    5. This procedure is rarely practical because of the very large number of subjects needed for a high-power experiment with a small effect size.
  B. Effect size versus statistical significance.
    1. Some psychologists argue that significance tests are misleading for several reasons.
      a. They are highly influenced by sample size, so that a large sample can produce significance with an unimportant underlying effect and a small sample can fail to give significance even with a large underlying effect.
      b. They give an all-or-none outcome based on an arbitrary $\alpha$.
    2. These opponents of current procedure instead offer several reasons for emphasizing effect size.
      a. It has neither of the above disadvantages.
      b. It directly indicates the importance of the underlying effect.
      c. It permits direct comparison among (and the accumulation of results across) studies.
    3. However, there are also counterarguments in favor of significance tests.
      a. If a result is not significant, it should not be taken seriously regardless of its effect size, since we are not sure the effect did not just arise by chance.
      b. There are times when even a very small effect size is important.
      c. In theoretically oriented research, usually what matters most is our confidence in the pattern of results being consistent with a theory and not the effect size of those results (which may depend largely on the details of the setup of the experiment).
    4. In general, psychologists in applied areas should and usually do give more emphasis to effect size (but still use significance tests as well), while psychologists in more theoretical areas tend to rely mainly on significance, though occasionally making use of effect size to compare results across studies.
  C. Meta-analysis is a statistical method for combining the results of independent studies.
    1. It is used primarily in articles that review the experiments conducted in a particular area of research.
    2. A meta-analytic review of the literature involves two steps.
      a. One first locates all the studies conducted on a given topic.
      b. One then statistically combines the results (usually using effect sizes) of these studies.
    3. Reviews of the research literature using meta-analysis are an alternative to the traditional "narrative" literature-review article that describes and evaluates each study and then attempts to draw some overall conclusion.
    4. The number of articles using meta-analysis has increased dramatically in recent years.
    5. They are most common in the more applied areas of psychology.
    6. Proponents argue that meta-analysis is a substantial improvement over the traditional approach to reviewing literature for three main reasons.
      a. It is more objective.
      b. It demands greater rigor of the reviewer.
      c. It gives precise information never before available (such as average effect size over many studies or differences in effect size for studies using different methods or populations).
    7. Proponents of narrative reviews complain that meta-analysis is too mechanical.
      a. It tends to give equal weight to well-done and poorly done studies.
      b. It is highly dependent on what studies the researcher was able to locate (this is especially problematic since studies that do not show significant effects are rarely published, biasing the available studies towards those with significant effects).
    8. Proponents of meta-analysis offer counterarguments.
      a. In meta-analysis, there are statistical techniques for taking into account the reviewers' evaluation of how well a study was conducted, and these techniques are superior to the subjective, impressionistic method of the usual literature review.
      b. Narrative reviews are also likely to be overinfluenced by those studies that are available.
    9. Currently, both types of reviews continue to be published in major psychology journals, but meta-analysis seems to be on the upswing.

XI. Power and Effect Size as Discussed in Research Articles
  A. Power is not often mentioned directly in research articles (its greater role is in the planning of research and in interpreting research results).
  B. Power is occasionally mentioned in the context of justifying the number of subjects used in a study.
  C. Authors are also likely to mention effect size when comparing results of studies or parts of studies (or in meta-analysis articles).

Formula

Effect size (d)
Formula in words: The difference between the hypothesized and known population means, divided by the population standard deviation.
Formula in symbols: $d = (\mu_1 - \mu_2) / \sigma$   (8-1)
$\mu_1$ is the mean of Population 1 (the hypothesized mean for the population that is exposed to the experimental manipulation).
$\mu_2$ is the mean of Population 2 (which is also the mean of the comparison distribution).
$\sigma$ is the standard deviation of Population 2 (and assumed to be the standard deviation of both populations).

How to Compute Power for a Study Involving a Single Sample of More than One Subject and a Known Population

I. Gather the needed information.
  A. The mean ($\mu_M$, same as $\mu_2$) and standard deviation ($\sigma_M$) of the comparison distribution (these can be computed from $\mu_2$ and $\sigma$ or $\sigma^2$, which are usually given as part of the problem).
  B. The predicted mean of the population that receives the experimental intervention can be determined in one of several ways:
    1. It may be given in the problem.
    2. One can determine the minimum meaningful difference.
    3. One can use theory or previous research.
    4. One can use the effect size conventions.
      a. Determine whether effect size is small (.2), medium (.5), or large (.8).
      b. Substitute this effect size (.2, .5, or .8), the mean of the known population ($\mu_2$), and the standard deviation of the known population ($\sigma$) into the formula for effect size, $d = (\mu_1 - \mu_2) / \sigma$, and solve for $\mu_1$.
  C. Draw a picture of the two distributions of means, one above the other, with the means of each at the appropriate places.
II. Determine the cutoff point, in raw-score terms, needed on the comparison distribution in order to reject the null hypothesis.
  A. Find the cutoff in Z-score terms (that is, carry out Steps 1 to 3 of the usual hypothesis-testing procedure).
    1. Identify the characteristics of the comparison distribution.
      a. Its mean is the same as the mean of the population of individual cases from which the samples are taken: $\mu_M = \mu_2$.
      b. Its standard deviation is the square root of the result of dividing the variance of the distribution of the population of individual cases by the number of cases in the samples being selected: $\sigma_M = \sqrt{\sigma^2 / N}$. (N is the number of cases in each sample.)
      c. Its shape must be normal or approximately normal (which will be the case if either the population is normal or the sample size is greater than 30), or this problem can't be done.
    2. Decide on the significance level (1% or 5%).
    3. Decide whether this is a one-tailed or two-tailed test.
    4. Determine the percentage of cases between the mean and where the appropriate percentage begins on the normal curve.
      a. If a one-tailed test, this is 50% minus the significance level.
      b. If a two-tailed test, this is 50% minus 1/2 the significance level.
    5. Look up the Z corresponding to this percentage in the "% Mean to Z" column in the normal table; this is the cutoff Z.
  B. Convert this to a raw score (or to two raw scores if a two-tailed test is used) using the mean and standard deviation of this distribution: raw cutoff = $(Z)(\sigma_M) + \mu_M$.
  C. Draw in the raw cutoff on Population 2's distribution of means; label the area more extreme than it as alpha ($\alpha$).
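As a rough illustration of Steps I and II above, the following Python sketch solves the effect size formula for $\mu_1$, computes $\sigma_M$, and converts a one-tailed cutoff Z to a raw score. It is only a sketch under stated assumptions: the numbers ($\mu_2$ = 100, $\sigma$ = 15, N = 25, d = .5) are hypothetical, the function names are invented for this example, and `NormalDist().inv_cdf` from the Python standard library stands in for the printed normal curve table, so the cutoff Z is not rounded the way a table value would be.

```python
import math
from statistics import NormalDist


def predicted_mean(d, mu_2, sigma):
    """Solve d = (mu_1 - mu_2) / sigma for mu_1 (Step I.B.4.b)."""
    return mu_2 + d * sigma


def sd_of_means(sigma, n):
    """Standard deviation of the distribution of means: sqrt(sigma^2 / N)."""
    return math.sqrt(sigma ** 2 / n)


def raw_cutoff_upper(mu_m, sigma_m, alpha=0.05):
    """Raw-score cutoff for a one-tailed test predicting a HIGHER mean.

    Assumption: the cutoff Z is taken from the inverse normal CDF
    instead of being looked up in the normal curve table.
    """
    z = NormalDist().inv_cdf(1 - alpha)
    return z * sigma_m + mu_m


# Hypothetical example values (not taken from the text).
mu_2, sigma, n, d = 100, 15, 25, 0.5

mu_1 = predicted_mean(d, mu_2, sigma)      # 107.5
sigma_m = sd_of_means(sigma, n)            # 3.0
cutoff = raw_cutoff_upper(mu_2, sigma_m)   # about 104.93

print(mu_1, sigma_m, round(cutoff, 2))
```

For a two-tailed test the same conversion would be done twice, once for each cutoff, with half of the significance level in each tail.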
III. Determine the Z score corresponding to this raw-score cutoff on the distribution of means for the population that receives the experimental manipulation.
  A. Z = (raw cutoff - mean of predicted distribution) / $\sigma_M$.
  B. If a two-tailed test, be sure to do this for both cutoffs.
  C. Draw in the cutoff on Population 1's distribution of means; label the area beyond it in the predicted direction as power.
IV. Using the normal curve table, determine the probability of getting a score more extreme than that Z score.

Outline for Writing Essays for Computing Power for Studies Involving a Single Sample of More than One Subject and a Known Population

The reason for your writing essay questions in the practice problems and tests is that this task develops and then demonstrates what matters so very much: your comprehension of the logic behind the computations. (It is also a place where those better at words than numbers can shine, and for those better at numbers to develop their skills at explaining in words.) Thus, to do well, be sure to do the following in each essay: (a) give the reasoning behind each step; (b) relate each step to the specifics of the content of the particular study you are analyzing; (c) state the various formulas in nontechnical language, because as you define each term you show you understand it (although once you have defined it in nontechnical language, you can use it from then on in the essay); (d) look back and be absolutely certain that you made it clear just why that formula or procedure was applied and why it is the way it is.

The outlines below are examples of ways to structure your essays. There are other completely correct ways to go about it. And this is an outline for an answer; you are to write the answer out in paragraph form. Examples of full essays are in the answers to Set I Practice Problems in the back of the text. These essays are necessarily very long for you to write (and for others to grade).
But this is the very best way to be sure you understand everything thoroughly. One shortcut you may see on a test is that you may be asked to write your answer for someone who understands statistics up to the point of the new material you are studying. You can choose to take the same shortcut in these practice problems (maybe writing for someone who understands right up to whatever point you yourself start being just a little unclear). But every time you write for a person who has never had statistics at all, you review the logic behind the entire course. You ingrain it in your mind. Over and over. The time is never wasted. It is an excellent way to study.

I. Explain the logic of hypothesis testing and associated terminology (as per the outline for essays in Chapter 7 of this Study Guide), focusing on the particular study you are working on.
II. Explain the concept of power: the probability of getting significant results if the research hypothesis is true.
III. Computation of power.
  A. State the Z score cutoff on the comparison distribution (determined in the steps of hypothesis testing, as you have explained in describing those steps), being sure to explain how it was arrived at.
  B. Convert the cutoff Z to a raw score and draw a picture of the distribution showing the raw-score cutoff and the area beyond it (shading this area and labeling it "area in which the result would be significant").
  C. Explain the idea of a distribution of means predicted by the research hypothesis (that is, the distribution expected for the group exposed to the experimental manipulation).
    1. Note that it will have a different mean than the comparison distribution and state what your prediction is.
    2. Describe your basis for predicting that mean.
      a. Given in the problem.
      b. Based on theory or previous research.
      c. Based on minimum meaningful effect size.
    3. Note that we assume it will have the same standard deviation and will also be a normal curve.
    4. Draw a diagram of the predicted distribution, drawing it above the comparison distribution, with the scale of numbers matching and in the same location across the page as that of the distribution of means below (forcing the center of the curve you draw above to be offset).
  D. Mark the raw-score cutoff on the distribution of means for the population exposed to the experimental manipulation.
  E. Power is the area more extreme in the predicted direction than this cutoff. (Shade this area in the upper curve, and label it power.)
  F. Since this upper distribution is a normal curve, and the proportion of cases above any point on a normal curve is known and available on a table, this proportion can be computed.
    1. Find the Z-score equivalent to the raw-score cutoff using the mean and standard deviation for the distribution of means based on the research hypothesis. (This is necessary since the normal curve table uses Z scores.)
    2. Using the normal curve table, find the percentage of cases more extreme than this Z score.
    3. State your percentage and mark it on the graph.
IV. Indicate whether this level of power would be sufficient to conduct the study or whether modifications should be made to increase power. (Use 80% as a benchmark.)

Chapter Self-Tests

Multiple-Choice Questions

1. Statistical power can be defined as
  a. the effect that the result of the study will have on the area of applied psychology.
  b. the probability of rejecting the null hypothesis if in fact the null hypothesis is true.
  c. having a large enough effect size in order to always get a significant result.
  d. the probability that the study will yield a significant result if the research hypothesis is true.
2. You can calculate power by
  a. comparing the population distribution to a distribution made up of samples drawn from the rejection region in the predicted direction.
  b. determining the probability of a Type I error and subtracting it from 1.
  c. creating a hypothetical comparison distribution which would be false if the null hypothesis were true and plotting the sample mean on this distribution.
  d. calculating the area of the distribution expected under the research hypothesis that corresponds to the rejection region on the distribution expected under the null hypothesis.
3. In real-life research today, when psychology researchers determine power they usually use
  a. a standard computer program.
  b. a computational formula.
  c. a power table.
  d. the computational procedure described in this chapter.
4. If in a planned experiment the population distribution expected under the research hypothesis and the known population have almost no overlap at all, the planned experiment has
  a. a large effect size.
  b. a moderate effect size.
  c. a small effect size.
  d. no statistical importance.
5. How does the number of subjects affect power?
  a. By allowing the experimenter to remove results that are not extreme and so to significantly decrease variance.
  b. By reducing the amount of variance in each of the distributions of means, and thereby further separating these two distributions.
  c. By adding the effect of extreme scores to the population variance.
  d. By limiting the difference between the sample and population means.
6. When the standard deviation of the population of individual cases is low, this makes power
  a. low.
  b. high.
  c. irrelevant; the chances of getting a significant result are almost nil.
  d. none of the above; the standard deviation of the population of individual cases has nothing to do with power.
7. Two similar studies each investigated the effectiveness of a different job training program. If you wanted to compare the effectiveness of the two programs based on the results of these studies, it would be best to compare their
  a. power.
  b. effect size.
  c. significance levels.
  d. Z scores on the comparison distribution.
8. What effect does using a .05 level of significance instead of a .01 level have on power?
  a. It increases power.
  b. It decreases power.
  c. It has no effect on power.
  d. It increases power if a one-tailed test is used but has no effect on power if a two-tailed test is used.
9. Consider the relationship of power to a nonsignificant result. If the power of the experiment was high,
  a. a nonsignificant result does not give us any indication regarding the research hypothesis.
  b. the researcher has probably spent a great deal of effort in altering the effect size.
  c. it is likely that there is only a small effect size, if any.
  d. it is likely that there was an error in calculating power.
10. Research designed to test an abstract theory is MORE likely than applied research to rely exclusively on
  a. effect size.
  b. statistical significance.
  c. the minimum meaningful difference.
  d. Cohen's kappa for estimating effect size.

Fill-In Questions

1. ________ is the probability of rejecting the null hypothesis when in fact it is true.
2. ________ is the probability of failing to reject the null hypothesis when in fact it is false.
3. In a particular planned study, the cutoff Z score for significance has a raw score of 16 on the comparison distribution (that is, scores of 16 or higher would be sufficient to reject the null hypothesis). The distribution of means for Population 1 is predicted by the researcher to have a mean of 19 and a standard deviation of 3. What is the power of this study? (Use the normal curve approximation rules.) ________
4. In another planned study, the cutoff Z score for significance has a raw score of 138 on the comparison distribution. The distribution of means for Population 1 is predicted by the researcher to have a mean of 152 and a standard deviation of 7. What is the power of this study? (Use the normal curve approximation rules.) ________
5. To determine the power of a planned study of a new procedure for increasing typing speed, a researcher uses the ________, figuring that unless the procedure being studied increases average typing speed by at least 10 words per minute, it is not worth using.
6. d is a symbolfor 7. If the predicted mean of the known population is 9, the hypothesized mean for the population exposed to the experimental manipulation is 15, and the standard deviation of the population of individual cases is 12, accordingto Cohen's conventions, this is a effect size. 8. Using more accurate measurement in a study increases power by its direct effect in reducing 9. Increasing the number of subjectsin a study increasespower by reducing 10. is a procedure used to combine results of studies reported in many separate research studies. ChapterEight Problems and Essays 1. A consumer psychologist is planning a study of children's buying habits of candy. At a particular summer camp where this psychologist has previously done research, it is known that the amount spent on candy at the camp store by the children during a one-week period is roughly normally distributed with a mean of $3.80 and a standard deviation of $1.50. (The camp store is the only source of candy anywhere near the camp and records of all purchases are automatically kept because children have an account at the store that they use rather than actually canying money.) In this study 25 randomly selectedchildren at the camp will be given a speciallecture on the effects of candy on teeth and general health. Then how much candy they buy at the camp store during the following week will be analyzed from the store records. The researchers are realistic and expect that wch a !mare wi!! h~ve,2t best, OI?!~ z mcdest effect. Bgt laless the effect is zt least ZE averagereduction of $.25, such lectures would be consideredineffective. (a) Assuming the researcher will use the .05 significancelevel, what is the power of this study? (b) Explain your answerto a person who has never had a course in statistics. (c) What is the predicted effect size? (d) Describe four things the researchers could do to try to increase the power of the study and say why each might work. 2. 
A psychologist is interested in whether memory is affected by sadness. On the particular memory task the researcher is planning to use, it is known that scores of college students are approximately normally distributed with a mean of 68 and a standard deviation of 12. In this study the researcher plans to test a group of 30 students after they have just viewed a very sad movie. Based on a theory that people will concentrate on the memory task in order not to think about the sad movie, the researcher predicts that this group will score about 6 points higher than the usual subjects (that is, that they will score about 74 on the average).
   (a) Assuming the researcher will use the .01 significance level, what is the power of this study?
   (b) Explain your answer to a person who has never had a course in statistics.
   (c) What is the predicted effect size?
   (d) Describe four things the researchers could do to try to increase the power of the study and say why each might work.
3. A study is reported in which baseball players are compared to the general public on how much they like chewing gum. The result was that the null hypothesis was not rejected; no significant difference was found. In this study 500 baseball players were tested, and in general the power was very high. What should you conclude from all this about whether baseball players like chewing gum more than people in general? Why?
4. A study reports that a particular kind of training workshop reduces burnout among school counselors. This was a large study, involving thousands of school counselors, and the study employed very accurate measures and generally had very high power. The result was that the null hypothesis was rejected at the .05 level of significance (one-tailed). What should you conclude from all this about the impact of the workshop? Why?

Note. There is no section on using SPSS or MYSTAT with this chapter because in general computers are not used in the computation of power.
Instead, power is ascertained from tables (which will be provided in the text for the procedures covered in Chapters 9 through 14).

Chapter 9
The t Test for Dependent Means

Learning Objectives
To understand, including being able to carry out any necessary procedures or computations:
- Estimated population variance based on scores in the sample.
- Purpose of the t distribution and how it is different from a normal curve.
- The t table.
- The t test for a single sample and a known population mean.
- The t test for dependent means.
- The normal-population-distribution assumption for the t test for dependent means and the conditions in which it is safe and not safe to violate it.
- Effect size for a study using a t test for dependent means, including whether it is a small, medium, or large effect, based on Cohen's conventions.
- Power table for the t test for dependent means.
- Number-of-subjects table for the t test for dependent means.
- t tests for dependent means as reported in psychology research articles.
- Limitations of a pretest-posttest design.

Chapter Outline
I. The t Test for a Single Sample
  A. Hypothesis testing with a single sample and a population for which the mean is known but not the variance works the same way as you learned in Chapter 7, except that you estimate the population variance (for Step 2) and you use a different table to determine the cutoff point (Step 3).
  B. Estimating population variance from the sample information.
    1. Since a sample represents its population, its variance is representative of the population's variance.
    2. However, the variance of a random sample, on the average, will be slightly smaller than the variance of the population from which that sample is taken. (That is, the sample's variance is a biased estimator of the population's variance.)
    3. To compute an unbiased estimate of the population variance, divide the sum of squared deviations in the sample by the number of scores in the sample minus one.
    4. The formula is \(S^2 = \sum (X - M)^2 / (N - 1) = SS / (N - 1)\).
    5. The number you divide by in computing the estimated population variance (\(N - 1\)) is the degrees of freedom. Thus \(S^2 = SS / df\).
    6. Once you have \(S^2\), you compute the standard deviation of the comparison distribution in the usual way, except for using \(S^2\) instead of \(\sigma^2\). That is, \(S_M^2 = S^2 / N\); \(S_M = \sqrt{S_M^2}\).
  C. Shape of the comparison distribution when using an estimated population variance: The t distribution.
    1. When carrying out the hypothesis testing process using an estimated population variance, there is less true information and more room for error.
    2. Thus, extreme scores are more likely to occur in the distribution of means than would be found in a normal curve. (And the smaller the N, the more likely.)
    3. The appropriate comparison distribution follows instead a mathematically defined curve called a t distribution.
    4. t distributions differ according to the degrees of freedom used when figuring \(S^2\).
    5. The more degrees of freedom on which the t distribution is based, the closer it is to a normal curve.
  D. Determining the cutoff sample score for rejecting the null hypothesis: Using the t table.
    1. Appendix B of this book (and most statistics books) gives a simplified table of t distributions which includes only the crucial cutoff scores.
    2. To use the t table you need to know the degrees of freedom, the significance level, and whether it is a one- or two-tailed test.
  E. Determining the score of your sample mean on the comparison distribution: The t score.
    1. Step 4 of the hypothesis testing process, determining the score of your sample's mean on the comparison distribution, is done exactly the same way with a t test.
    2. However, the resulting score is called a t score instead of a Z score.
    3. Thus, \(t = (M - \mu) / S_M\).
II. The t Test for Dependent Means
  A.
A more common situation for a t test involves studies in which there are two scores for each of several subjects, often a before score and an after score, or scores obtained when each subject is tested under two different circumstances.
    1. These are called repeated-measures or within-subjects designs.
    2. The hypothesis testing procedure is called a t test for dependent means.
  B. A t test for dependent means is conducted in exactly the same way as the t test for a single sample (above), except you use difference scores and you assume the population mean (of difference scores) is zero.
  C. Difference scores:
    1. A difference score is computed by subtracting, for each subject, one score from the other (for example, the before score from the after score).
    2. Using difference scores converts two sets of scores into one.
    3. Once the difference score has been computed for each subject, the entire hypothesis testing procedure is carried out using difference scores.
  D. The population of difference scores (Population 2, the one to which the population represented by your sample will be compared) is ordinarily assumed to have a mean of zero. (This makes sense because we are comparing our sample's population to one in which there is, on the average, no difference.)
III. Assumptions of the t Test
  A. The comparison distribution will be a t distribution only if the population of individual cases (or difference scores, if conducting a t test for dependent means) from which we drew our sample follows a normal curve.
  B. Otherwise the appropriate comparison distribution will follow some unknown, other shape.
  C. Thus, a normal population distribution is an assumption of the t test.
  D. Unfortunately, it is rarely possible to tell whether the population is normal based on the information in your sample.
  E. Fortunately, results are reasonably accurate even when the population distribution is fairly far from normal. (The t test is said to be robust over moderate violations of the assumption of a normal population distribution.)
  F.
Thus, psychologists use the t test so long as neither of the following is the case:
    1. There is reason to expect a very large discrepancy from normal.
    2. The population is highly skewed and a one-tailed test is being used.
IV. Effect Size and Power for the t Test for Dependent Means
  A. Effect size:
    1. Effect size for the t test for dependent means is computed in the same way as we did in Chapter 8: \(d = (\mu_1 - \mu_2) / \sigma\).
    2. Since the mean of Population 2 is ordinarily assumed to be zero and the standard deviation (\(\sigma\)) is that of the population of difference scores, the formula reduces to \(d = \mu_1 / \sigma\), with both terms relating to difference scores.
    3. Note: To compute effect size you divide by \(\sigma\) (or its estimate \(S\)), and not by \(\sigma_M\) (or \(S_M\)).
    4. Cohen's rules of thumb for effect sizes are the same as for the situation in Chapter 8: A small effect size is .20, a medium effect size is .50, and a large effect size is .80.
  B. Power:
    1. The text provides a table (Table 9.8) that gives the approximate power for the .05 significance level for small, medium, and large effect sizes and one- or two-tailed tests.
    2. This power table is especially useful when interpreting the practical importance of a nonsignificant result in a published study.
  C. Planning sample size: The text provides a table (Table 9.9) that gives the approximate number of subjects needed to achieve 80% power for estimated small, medium, and large effect sizes using one- and two-tailed tests for the .05 significance level. (Eighty percent is a common figure used by researchers for the minimum power needed to make it worth conducting a study.)
  D. Studies using difference scores often have considerably larger effect sizes for a given amount of expected difference than other kinds of research designs.
V. Controversies and Limitations
  A. The main controversies about the t test have to do with its relative advantages and disadvantages in comparison to various alternatives. These alternatives are discussed in Chapter 15.
  B.
Research designs in which the same subjects are tested before and after some experimental intervention, without any kind of control group that does not undergo the same procedure, are weak research designs. (Even if such a study produces a significant difference, it leaves many alternative explanations for why that difference occurred.)
VI. How t Tests for Dependent Means Are Described in Research Articles
  A. Research articles may describe t tests in the text following a standard format. For example: t(24) = 2.8, p < .05.
  B. Alternatively, they may present the means of the different groups in a table, often using stars to indicate the level of significance for each comparison.

Formulas
I. Unbiased estimate of the population variance (\(S^2\)) for a t test for a single sample or a t test for dependent means
  Formula in words: Estimated population variance is the sum of squared deviations of the scores in the sample from the sample's mean, divided by the degrees of freedom (the number of cases in the sample minus one).
  Formula in symbols: \(S^2 = \sum (X - M)^2 / (N - 1) = SS / (N - 1) = SS / df\)
  \(\sum (X - M)^2\) or \(SS\) is the sum of squared deviations from the mean of the sample.
  \(N\) is the number of scores in the sample (or number of pairs of scores).
  \(df\) is the degrees of freedom.
II. Degrees of freedom (\(df\)) for a t test for a single sample or a t test for dependent means
  Formula in words: Degrees of freedom are the number of scores (or difference scores) minus one.
  Formula in symbols: \(df = N - 1\)
III. Estimated population standard deviation (\(S\))
  Formula in words: Estimated population standard deviation is the square root of the estimated population variance.
  Formula in symbols: \(S = \sqrt{S^2}\)
IV. Variance of the distribution of means based on an estimated population variance (\(S_M^2\))
  Formula in words: Variance of the distribution of means based on an estimated population variance is the estimated population variance divided by the number of scores (or difference scores) in the sample.
  Formula in symbols: \(S_M^2 = S^2 / N\)
  \(S^2\) is the estimated variance of the population of individual scores (or of difference scores in a t test for dependent means).
  \(N\) is the number of scores (or difference scores) in the sample.
V. Standard deviation of the distribution of means based on an estimated population variance (\(S_M\))
  Formula in words: Standard deviation of the distribution of means based on an estimated population variance is the square root of the variance of the distribution of means based on an estimated population variance.
  Formula in symbols: \(S_M = \sqrt{S_M^2}\)
VI. t score for a t test for a single sample or a t test for dependent means
  Formula in words: t score is the difference between the mean of the sample and the known population mean, divided by the standard deviation of the distribution of means based on an estimated population variance.
  Formula in symbols: \(t = (M - \mu) / S_M\)
  \(M\) is the mean of the sample of scores (or the mean of the sample of difference scores).
  \(\mu\) is the mean of the population to which the sample's population is being compared (Population 2). In the case of a t test for dependent means, \(\mu\) is usually assumed to be 0.
VII. Effect size (\(d\)) for a t test for a single sample or for a t test for dependent means
  Formula in words: A standard measure of effect size (\(d\)) is the difference between the hypothesized and known population means, divided by the population standard deviation.
  Formula in symbols: \(d = (\mu_1 - \mu_2) / \sigma\)
  \(\mu_1\) is the hypothesized (or, if the study is completed, actual) mean of the population which the sample represents. (In a t test for dependent means, \(\mu_1\) is a mean of difference scores.)
  \(\mu_2\) is the known mean of the population of scores (or difference scores) to which the sample's population is to be compared. (In a t test for dependent means, \(\mu_2\) is usually considered to be 0.)
  \(\sigma\) is the standard deviation of the scores (or difference scores) in the population. It may be estimated as \(S\).
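
The formulas above can be traced through in a few lines of code. The following is a minimal sketch in Python, not part of the text: the function name `dependent_means_t`, its arguments, and the dictionary keys are hypothetical, and it assumes difference scores are figured as after minus before. The effect size it returns is only an estimate, since it substitutes the sample mean difference for \(\mu_1\) and \(S\) for \(\sigma\).

```python
from math import sqrt


def dependent_means_t(before, after, mu=0.0):
    """Sketch of Formulas I-VII for a t test for dependent means.

    `before`, `after`, and `mu` are illustrative names. `mu` is the mean of
    the comparison population of difference scores, ordinarily assumed 0.
    """
    if len(before) != len(after):
        raise ValueError("each subject needs exactly two scores")
    # Convert each subject's two scores into one difference score.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two subjects are needed (df = N - 1)")

    m = sum(diffs) / n                     # mean of the difference scores
    ss = sum((x - m) ** 2 for x in diffs)  # sum of squared deviations (SS)
    if ss == 0:
        raise ValueError("all difference scores are identical; t is undefined")

    df = n - 1                             # Formula II
    s2 = ss / df                           # Formula I: estimated population variance
    s = sqrt(s2)                           # Formula III
    s2_m = s2 / n                          # Formula IV
    s_m = sqrt(s2_m)                       # Formula V
    t = (m - mu) / s_m                     # Formula VI
    d_est = (m - mu) / s                   # Formula VII, with M and S as estimates
    return {"M": m, "SS": ss, "df": df, "S2": s2, "S": s,
            "S2_M": s2_m, "S_M": s_m, "t": t, "d": d_est}


# Made-up scores for five subjects, for illustration only.
result = dependent_means_t(before=[10, 12, 9, 14, 11], after=[13, 14, 10, 17, 12])
print(result)
```

The resulting t score would still have to be compared with the cutoff from a t table for the same degrees of freedom; the sketch does not look up cutoffs.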
VIII. Alternative, simplified formula for effect size (\(d\)) for a t test for dependent means when \(\mu_2\) is considered to be 0
  Formula in words: A standard measure of effect size (\(d\)) for the t test for dependent means can be computed by taking the hypothesized population mean difference score, divided by the standard deviation of the population of difference scores.
  Formula in symbols: \(d = \mu_1 / \sigma\)

How to Conduct a t Test for a Single Sample (Based on Table 9.3 in the Text)
I. Reframe the question into a null and a research hypothesis about populations.
II. Determine the characteristics of the comparison distribution:
  A. The mean is the same as the known population mean.
  B. The standard deviation is computed as follows:
    1. Compute estimated population variance: \(S^2 = SS / df\).
    2. Compute variance of the distribution of means: \(S_M^2 = S^2 / N\).
    3. Compute standard deviation: \(S_M = \sqrt{S_M^2}\).
  C. Shape will be a t distribution with \(N - 1\) degrees of freedom.
III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.
  A. Determine the degrees of freedom, desired significance level, and whether to use a one-tailed or two-tailed test.
  B. Look up the appropriate cutoff on a t table.
IV. Determine the score of your sample's mean on the comparison distribution: \(t = (M - \mu) / S_M\).
V. Compare the scores from III and IV to decide whether or not to reject the null hypothesis.

How to Conduct a t Test for Dependent Means (Based on Table 9.7 in the Text)
I. Reframe the question into a null and a research hypothesis about populations.
II. Determine the characteristics of the comparison distribution:
  A. Convert each subject's two scores into difference scores. Carry out the remaining steps using these difference scores.
  B. Compute the mean of the difference scores.
  C. Assume the population mean is zero: \(\mu = 0\).
  D. Compute the estimated population variance of difference scores: \(S^2 = SS / df\).
  E. Compute the variance of the distribution of means of difference scores: \(S_M^2 = S^2 / N\).
  F. Compute the standard deviation of the distribution of means of difference scores: \(S_M = \sqrt{S_M^2}\).
  G. Note that it will be a t distribution with \(df = N - 1\).
III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.
  A. Determine desired significance level and whether to use a one-tailed or two-tailed test.
  B. Look up the appropriate cutoff on a t table.
IV. Determine the score of your sample on the comparison distribution: \(t = (M - \mu) / S_M\).
V. Compare the scores from III and IV to decide whether or not to reject the null hypothesis.

Outlines for Writing Essays
The reason for your writing essay questions in the practice problems and tests is that this task develops and then demonstrates what matters so very much--your comprehension of the logic behind the computations. (It is also a place where those better at words than numbers can shine, and where those better at numbers can develop their skills at explaining in words.) Thus, to do well, be sure to do the following in each essay: (a) give the reasoning behind each step; (b) relate each step to the specifics of the content of the particular study you are analyzing; (c) state the various formulas in nontechnical language, because as you define each term you show you understand it (although once you have defined it in nontechnical language, you can use it from then on in the essay); (d) look back and be absolutely certain that you made it clear just why that formula or procedure was applied and why it is the way it is.
The outlines below are examples of ways to structure your essays. There are other completely correct ways to go about it. And this is an outline for an answer--you are to write the answer out in paragraph form. Examples of full essays are in the answers to Set I Practice Problems in the back of the text.
These essays are necessarily very long for you to write (and for others to grade).
But this is the very best way to be sure you understand everything thoroughly. One shortcut you may see on a test is that you may be asked to write your answer for someone who understands statistics up to the point of the new material you are studying. You can choose to take the same shortcut in these practice problems (maybe writing for someone who understands right up to whatever point you yourself start being just a little unclear). But every time you write for a person who has never had statistics at all, you review the logic behind the entire course. You engrain it in your mind. Over and over. The time is never wasted. It is an excellent way to study.

t Test for a Single Sample (in which each subject is measured once)
I. Reframe the question into a research hypothesis and a null hypothesis about populations. (Step 1 of hypothesis testing.)
  A. State in ordinary language the hypothesis testing issue: Does the average of the scores of the group of persons studied represent a higher (or lower or different) mean than would be expected if this group of persons had just been a randomly selected example of people in general--that is, does this set of people studied represent a different group of people from people in general?
  B. Explain language (to make the rest of the essay easier to write by not having to repeat long explanations), focusing on the meaning of each term in the concrete example of the study at hand.
    1. Populations.
    2. Sample.
    3. Mean.
    4. Research hypothesis.
    5. Null hypothesis.
    6. Rejecting the null hypothesis to provide support for the research hypothesis.
II. Determine the characteristics of the comparison distribution. (Step 2 of hypothesis testing.)
  A. Explain the principle that the comparison distribution is the distribution (pattern of spread of the means of scores) that represents what we would expect if the null hypothesis were true and our particular mean were just randomly sampled from this population of means.
  B.
Note that because we are interested in the mean of a sample of more than one case, we have to compare our actual mean to a distribution not of individual cases but of means.
  C. Give an intuitive understanding of how one might construct a distribution of means for samples of a given size from a particular population.
    1. Select a random sample of the given size (number of subjects) from the population and compute its mean.
    2. Select another random sample of this size from the population and compute its mean.
    3. Repeat this process a very large number of times.
    4. Make a distribution of these means.
    5. Note that this procedure is only to explain the idea and would be too much work and unnecessary in practice.
  D. There is an exact mathematical relation of a distribution of means to the population the means are drawn from, so that in practice the characteristics of the distribution of means can be determined directly from knowledge of the characteristics of the population and the size of the samples involved.
  E. The mean of a distribution of means.
    1. It is the same as the mean (average) of the scores in the known population of individual cases.
    2. State what it is in the particular problem you are working on.
    3. Explanation.
      a. Each sample is based on randomly selected values from the population.
      b. Thus, sometimes the mean of a sample will be higher and sometimes lower than the mean of the whole population of individuals.
      c. There is no reason for these to average out higher or lower than the original population mean.
      d. Thus the average of these means (the mean of the distribution of means) should in the long run equal the population mean.
  F. The standard deviation of a distribution of means.
    1. A distribution of means will be less spread out than the population of individual cases from which the samples are taken.
      a. Any one score, even an extreme score, has some chance of being selected in a random sample.
      b. However, the chance is less of several extreme scores being selected in the same random sample, particularly since in order to create an extreme sample mean they would have to be scores which were extreme in the same direction.
      c. Thus, there is a moderating effect of numbers. In any one sample, the deviants tend to be balanced out by middle cases or by deviants in the opposite direction, making each sample tend towards the middle and away from extreme values.
      d. With fewer extreme values for the means, the variation among the means is less.
    2. The more cases in each sample, the less spread out is the distribution of means of that sample size: With a larger number of cases in each sample, it is even harder for extreme cases in that sample not to be balanced out by middle cases or extremes in the other direction in the same sample.
    3. Explain the idea of standard deviation.
      a. It is a standard measure of how spread out a distribution is.
      b. Roughly, it is the average amount scores vary from the mean.
      c. Exactly speaking, it is the square root of the average of the squares of the amount that each score differs from the mean.
    4. The standard deviation of the distribution of means is found by a formula that divides the average of squared deviations from the mean of the population by the number of subjects in the sample (thus making it smaller in proportion to the number of subjects in the sample) and then takes the square root of this result.
    5. Computing this standard deviation requires knowing the variation in the population, and this is not known. But it can be estimated.
      a. Whatever the distribution your particular scores come from, it is reasonable to assume that the variation among your particular group is representative of the variation in that larger distribution of scores.
      b. A sample's variation is on the average slightly less than that of the population it comes from because it is less likely to include scores that are far from its mean.
      c.
Thus, a special adjustment is made that exactly corrects for this: Instead of taking the average of the squared deviations--the sum of squared deviations divided by the number of subjects--one instead divides the sum of squared deviations by one less than the number of subjects in the sample.
    6. Describe the steps of computing the estimated population variance and the standard deviation of the distribution of means for your example, stating the final result.
  G. The shape of the distribution of means.
    1. The distribution tends to be bell-shaped, with most cases falling near the middle and fewer at the extremes, due to the same basic process of extremes balancing each other out that we noted in the discussion of the standard deviation--middle values are more likely and extreme values less likely.
    2. Specifically, it can be shown that it will follow a precise shape called a t distribution.
    3. Actually there are different t distributions according to the amount of information that goes into estimating the variation in the distribution from the sample, that is, according to the number you divide by in making the estimate (the number of subjects minus one).
    4. Also, the shape of the comparison distribution is only a precise t distribution if the population of individual scores follows a precise shape called a normal curve (also bell-shaped) that is widely found in nature. Note that in the problem you are told that the distribution of the population is a normal curve so that this condition is met in your case.
III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected. (Step 3 of hypothesis testing.)
  A. Before you figure out how extreme your particular sample's mean is on this distribution of means, you want to know how extreme it would have to be to decide it was too unlikely that it could have been a randomly drawn mean from this comparison distribution of means.
  B.
Since the shape of the comparison distribution follows a mathematically defined formula, you can use a table to tell you how many standard deviations from the mean your score would have to be to be in the top so many percent.
  C. Note that the number of standard deviations from the mean on this t distribution is called a t score. (Explaining this term makes writing simpler later on.)
  D. To use these tables, you have to decide the kind of situation you have; there are two considerations.
    1. Are you interested in the chances of getting this extreme a mean that is extreme in only one direction (such as only higher than for people in general) or in both? (Explain which is appropriate for your study.)
    2. Just how unlikely would the extremeness of a particular mean have to be? The standard figure used in psychology is less likely than 5% (though 1% is sometimes used to be especially safe). (Say which you are using in your study--if no figure is stated in the problem and no special reason given for using one or the other, the general rule is to use 5%.)
  E. State the cutoff for your particular problem.
IV. Determine the score of your sample on the comparison distribution. (Step 4 of hypothesis testing.)
  A. The next step is to find where your actual sample's mean would fall on the comparison distribution, in terms of a t score.
  B. State this t score.
V. Compare the scores in 3 and 4 to decide whether or not to reject the null hypothesis. (Step 5 of hypothesis testing.)
  A. State whether your mean (from Step 4) does or does not exceed the cutoff (from Step 3).
  B. If your mean exceeds the cutoff.
    1. State that you can reject the null hypothesis.
    2. State that by elimination, the research hypothesis is thus supported.
    3. State in words what it means that the research hypothesis is supported (that is, the study shows that the particular experimental manipulation appears to make a difference in the particular thing being measured).
  C. If your score does not exceed the cutoff.
    1. State that you cannot reject the null hypothesis.
    2. State that the experiment is inconclusive.
    3. State in words what it means that the study is inconclusive (that is, the study did not yield results which give a clear indication of whether or not the particular experimental manipulation appears to make a difference in the particular variable being measured).
    4. Explicitly note that even though the research hypothesis was not supported in this study, this is not evidence that it is false--it is quite possible that it is true but it has only a small effect which was not sufficient to produce a mean extreme enough to yield a significant result in this study.

t Test for Dependent Means
I. Reframe the question into a research hypothesis and a null hypothesis about populations. (Step 1 of hypothesis testing.)
  A. Note that the entire problem involves difference (or change) scores.
  B. State in ordinary language the hypothesis testing issue: Does the average of the difference scores of the group of persons studied represent a higher (or lower or different) mean amount of difference than would be expected if this group of persons had just been a randomly selected sample of people in general in whom there is no difference?
  C. Explain language (to make the rest of the essay easier to write by not having to repeat long explanations), focusing on the meaning of each term in the concrete example of the study at hand.
    1. Populations.
    2. Sample.
    3. Mean.
    4. Research hypothesis.
    5. Null hypothesis.
    6. Rejecting the null hypothesis to provide support for the research hypothesis.
II. Determine the characteristics of the comparison distribution. (Step 2 of hypothesis testing.)
  A. Explain the principle that the comparison distribution is the distribution (pattern of spread of the means of difference scores) that represents what we would expect if the null hypothesis were true and our particular mean were just randomly sampled from this population of means of difference scores.
  B.
Note that because we are interested in the mean of a sample of more than one case, we have to compare our actual mean to a distribution not of individual cases but of means.
  C. Give an intuitive understanding of how one might construct a distribution of means for samples of a given size from a particular population.
    1. Select a random sample of the given size (number of subjects) from the population and compute its mean.
    2. Select another random sample of this size from the population and compute its mean.
    3. Repeat this process a very large number of times.
    4. Make a distribution of these means.
    5. Note that this procedure is only to explain the idea and would be too much work and unnecessary in practice.
  D. There is an exact mathematical relation of a distribution of means to the population the means are drawn from, so that in practice the characteristics of the distribution of means can be determined directly from knowledge of the characteristics of the population and the size of the samples involved.
  E. The mean of a distribution of means.
    1. It is the same as the mean (average) of the scores in the known population of individual cases--which is presumed to have a mean of zero because it represents a population in which there is on the average no difference, and no difference means zero difference.
    2. Explain why the mean of the distribution of means is the same as the mean of the population of individual cases.
      a. Each sample is based on randomly selected values from the population.
      b. Thus, sometimes the mean of a sample will be higher and sometimes lower than the mean of the whole population of individuals.
      c. There is no reason for these to average out higher or lower than the original population mean.
      d. Thus the average of these means (the mean of the distribution of means) should in the long run equal the population mean (which is zero).
  F. The standard deviation of a distribution of means.
    1. A distribution of means will be less spread out than the population of individual cases from which the samples are taken.
      a. Any one difference score, even an extreme score, has some chance of being selected in a random sample.
      b. However, the chance is less of several extreme difference scores being selected in the same random sample, particularly since in order to create an extreme sample mean they would have to be difference scores which were extreme in the same direction.
      c. Thus, there is a moderating effect of numbers. In any one sample, the deviants tend to be balanced out by middle cases or by deviants in the opposite direction, making each sample tend towards the middle and away from extreme values.
      d. With fewer extreme values for the means, the variation among the means is less.
    2. The more cases in each sample, the less spread out is the distribution of means of that sample size: With a larger number of cases in each sample, it is even harder for extreme cases in that sample not to be balanced out by middle cases or extremes in the other direction in the same sample.
    3. Explain the idea of standard deviation.
      a. It is a standard measure of how spread out a distribution is.
      b. Roughly, it is the average amount difference scores vary from the mean.
      c. Exactly speaking, it is the square root of the average of the squares of the amount that each difference score differs from the mean of the difference scores.
    4. The standard deviation of the distribution of means is found by a formula that divides the average of squared deviations from the mean of the population by the number of subjects in the sample (thus making it smaller in proportion to the number of subjects in the sample) and then takes the square root of this result.
    5. Computing this standard deviation requires knowing the amount of variation in the population, and this is not known. But it can be estimated.
      a.
Whatever the distribution your particular difference scores come from, it is reasonable to assume that the variation among your particular group is representative of the variation in the larger distribution of difference scores.
      b. A sample's variation is on the average slightly less than that of the population it comes from because it is less likely to include difference scores that are far from its mean.
      c. Thus, a special adjustment is made that exactly corrects for this: Instead of taking the average of the squared deviations--the sum of squared deviations divided by the number of subjects--one instead divides the sum of squared deviations by one less than the number of subjects in the sample.
    6. Describe the steps of computing the estimated population variance and the standard deviation of the distribution of means for your example, stating the final result.
  G. The shape of the distribution of means.
    1. The distribution tends to be bell-shaped, with most cases falling near the middle and fewer at the extremes, due to the same basic process of extremes balancing each other out that we noted in the discussion of the standard deviation--middle values are more likely and extreme values less likely.
    2. Specifically, it can be shown that it will follow a precise shape called a t distribution.
    3. Actually there are different t distributions according to the amount of information that goes into estimating the variation in the distribution from the sample, the number you divide by in making the estimate (the number of subjects minus one).
    4. Also, the shape of the comparison distribution is only a precise t distribution if the population of individual difference scores follows a precise shape called a normal curve (also bell-shaped) that is widely found in nature. Note that in the problem you are told that the distribution of the population is a normal curve so that this condition is met in your case.
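The computations described under F above (estimating the population variance from the sample, then finding the standard deviation of the distribution of means) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the five difference scores are hypothetical, made up to keep the arithmetic easy to follow, and are not taken from the text.

```python
import math


def distribution_of_means_from_differences(difference_scores):
    """Estimate the comparison distribution for a t test for dependent means.

    Returns the mean difference, the degrees of freedom, the estimated
    population variance (sum of squared deviations divided by N - 1), and
    the standard deviation of the distribution of means for samples of size N.
    """
    n = len(difference_scores)
    if n < 2:
        raise ValueError("need at least two difference scores")
    mean_difference = sum(difference_scores) / n
    # Sum of squared deviations of each difference score from the sample mean.
    sum_of_squares = sum((d - mean_difference) ** 2 for d in difference_scores)
    degrees_of_freedom = n - 1
    # Dividing by N - 1 rather than N adjusts for the sample's tendency
    # to vary slightly less than the population it comes from.
    estimated_population_variance = sum_of_squares / degrees_of_freedom
    # The variance of the distribution of means shrinks in proportion to N.
    variance_of_means = estimated_population_variance / n
    return {
        "mean_difference": mean_difference,
        "df": degrees_of_freedom,
        "estimated_population_variance": estimated_population_variance,
        "sd_of_means": math.sqrt(variance_of_means),
    }


# Hypothetical difference scores (for example, after minus before) for five subjects.
example = distribution_of_means_from_differences([2, -1, 3, 0, 1])
print(example)
```

For these made-up scores the sum of squared deviations is 10, so the estimated population variance is 10/4 = 2.5, and the standard deviation of the distribution of means is the square root of 2.5/5, or about 0.71. The degrees of freedom (4) identify which t distribution applies.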
III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected. (Step 3 of hypothesis testing.)
  A. Before you figure out how extreme your particular sample's mean difference score is on this distribution of means of difference scores, you want to know how extreme it would have to be to decide it was too unlikely that it could have been a randomly drawn mean from this comparison distribution of means.
  B. Since the shape of the comparison distribution follows a mathematically defined formula, you can use a table to tell you how many standard deviations from its mean (of zero) your difference score would have to be in order to be in the top so many percent.
  C. Note that the number of standard deviations from the mean on this t distribution is called a t score. (Explaining this term makes writing simpler later on.)
  D. To use these tables you have to decide the kind of situation you have; there are two considerations.
    1. Are you interested in the chances of getting this extreme a mean that is extreme in only one direction (such as only higher than for people in general) or in both? (Explain which is appropriate for your study.)
    2. Just how unlikely would the extremeness of a particular mean have to be? The standard figure used in psychology is less likely than 5% (though 1% is sometimes used to be especially safe). (Say which you are using in your study--if no figure is stated in the problem and no special reason given for using one or the other, the general rule is to use 5%.)
  E. State the cutoff for your particular problem.
IV. Determine the score of your sample on the comparison distribution. (Step 4 of hypothesis testing.)
  A. The next step is to find where your actual sample's mean difference score would fall on the comparison distribution, in terms of a t score.
  B. State this t score.
V. Compare the scores in 3 and 4 to decide whether or not to reject the null hypothesis. (Step 5 of hypothesis testing.)
  A. State whether your mean (from Step 4) does or does not exceed the cutoff (from Step 3).
  B. If your mean exceeds the cutoff.
    1. State that you can reject the null hypothesis.
    2. State that, by elimination, the research hypothesis is thus supported.
    3. State in words what it means that the research hypothesis is supported (that is, the study shows that the particular experimental manipulation appears to make a difference in the particular thing being measured).
  C. If your score does not exceed the cutoff.
    1. State that you cannot reject the null hypothesis.
    2. State that the experiment is inconclusive.
    3. State in words what it means that the study is inconclusive (that is, the study did not yield results which give a clear indication of whether or not the particular experimental manipulation appears to make a difference in the particular thing being measured).
    4. Explicitly note that even though the research hypothesis was not supported in this study, this is not evidence that it is false--it is quite possible that it is true but it has only a small effect which was not sufficient to produce a mean extreme enough to yield a significant result in this study.

Chapter Self-Tests

Multiple-Choice Questions

1. In the formula SS/(N-1), "N-1" is known as
  a. the transformation coefficient.
  b. the transformed denominator.
  c. the degrees of freedom.
  d. the denominator transformation term (DTT).
2. When estimating the variance of a population from the sample, you divide by the sample size minus one, because using the sample size directly
  a. does not correct for squaring the deviations.
  b. underestimates the population variance.
  c. fails to take into account the sample size.
  d. creates too little "bias."
3. When testing the null hypothesis for a study with a single sample with 10 scores and an unknown population variance, the cutoff score on the comparison distribution will be ______ the cutoff on a normal curve.
  a. more extreme than
  b. the same as
  c.
less extreme than
  d. either more or less extreme than or the same as (depending on the population variance)
4. In a study of memory, an experimenter found that 10 subjects who used imagery to learn a list of 20 words remembered on average 18.4 words. If the sum of squared deviations from the sample's mean is 6, what is the estimated population variance?
  a. 1.6/18.4 = 0.09.
  b. 6/9 = 0.67.
  c. 18.4/10 = 1.84.
  d. There are not enough data to estimate the population variance.
5. If a study yields a t score of 2.46, and the cutoff t score was 2.36, should you reject the null hypothesis? Why or why not?
  a. Yes, because the computed t score is more extreme than the cutoff score.
  b. No, because the computed t score is more extreme than the cutoff score.
  c. No, because the computed t score is too close to the cutoff t score to be significant.
  d. This cannot be determined without knowing the degrees of freedom.
6. A group of students take the SAT, then take an SAT prep class, then take the SAT again. To test the null hypothesis that there is no difference in students' SAT scores from before to after taking the prep class, which population mean would be used?
  a. Zero.
  b. The original (before) SAT score of the test group.
  c. The second (after) SAT score of the test group.
  d. The national mean SAT score.
7. A counselor claims that after attending three sessions with her, clients score higher on a Satisfaction With Life scale than they do before counseling. Her null hypothesis is that there is no difference in clients' scores after counseling. If the cutoff t score is 2.0 and the standard deviation of the comparison distribution is 1.5, by how many points do clients' scores have to change in order to justify rejecting the null hypothesis?
  a. 1.5.
  b. 2.0.
  c. 2.5.
  d. 3.0.
8. Which of the following is an assumption you must make before you can use the t test?
  a. The sample is normally distributed.
  b. The sample is skewed.
  c. The population is normally distributed.
  d. The population is skewed.
9. Suppose you are doing research on the general intelligence of people with eidetic memory (that is, people who have exact visual memory), and subjects are so hard to come by that you are limited to 10 subjects. If in fact there is a medium effect size in the population, and you are testing your hypothesis with a t test for dependent means, two-tailed, at the .05 level, what is the power of this study? Refer to Table 9.8 in the text for your answer.
10. A research article reports results of a study using a t test for dependent means as "t(16) = 2.67, p < .05." This means
  a. the result is not significant.
  b. there were 16 subjects.
  c. the t score was 16.
  d. the t score was 2.67.

Fill-In Questions

1. When estimating the population variance from the sample variance, the amount of information in the sample that is free to vary is called the ______.
2. When conducting a t test for a single sample, in the formula $(M - \mu)/S_M$, $\mu$ is the ______.
3. When figuring the variance of the distribution of means in a t test problem, you divide the estimated population variance by ______.
4. If an assumption which is necessary to conduct a statistical analysis can be violated without seriously jeopardizing the results of the analysis, the test is said to be ______ over that assumption.
5. With a t test for dependent means, an effect size of ______ is considered to be large.
6. Studies using difference scores often have ______ effect sizes than other types of studies.
7. A research design in which subjects are tested and then re-tested is weak unless the group studied is compared to a(n) ______.
8. A t distribution differs from a normal distribution in that the tails contain a ______ proportion of the cases.
9. To find the score at which the null hypothesis will be rejected, you look on a table of t distributions.
But first you need to know whether it is a one- or two-tailed test, the degrees of freedom in the sample, and the ______.
10. With sample sizes of 30 or more, a t distribution becomes almost indistinguishable from a(n) ______.

Problems and Essays

1. A psychologist is interested in whether people who live in noisy parts of a city have worse hearing. To test this, she administers hearing tests to six randomly selected healthy eighteen-year-olds who have grown up in one of the noisiest parts of the city. Their scores on the hearing test are 16, 14, 18, 18, 20, and 16. This test was designed so that healthy 18-year-olds should score 20. (The variance is not known.) What should the psychologist conclude? Explain your computations and the logic of what you have done to a person who has never had a course in statistics.

2. A cognitive psychologist theorizes that people will be able to form sentences with pleasant words quicker than with unpleasant words. To test her hypothesis, she creates a list of equal numbers of pleasant and unpleasant words, trying to select words of approximately equal difficulty, putting the words in random order in the list. Subjects are then asked to form a sentence using each word, as soon as the word is shown to them on a computer screen. A special timing device records how long it takes from the time the word appears on the screen until the subject starts speaking the sentence. Five subjects are used, and the average time it takes each subject to do the pleasant and unpleasant words are shown in the table below. Is the time it takes subjects to come up with a sentence different for the two kinds of words? Explain your procedures and the logic of what you have done to a person who has never had a course in statistics.

Mean Reaction Time (in Seconds)
Subject   Pleasant   Unpleasant
1         .44        .51
2         .33        .51
3         .60        .74
4         .59        .77
5         .68        .55

3. A developmental psychologist has created a special exercise program intended to improve hand-eye coordination of toddlers.
As a first test of the effectiveness of this set of exercises, he arranges for a group of 38 toddlers to participate in the program, testing their hand-eye coordination on a standard test before and after several weeks using the program. In the report of the results of this study, he writes: "Mean scores for the 38 toddlers increased from 61.32 to 68.93, t(37) = 3.21, p < .01, one-tailed." Explain this result, including the underlying computations that went into it, to a person who has never had a course in statistics.

4. An instructor of a speed-reading course claimed that students could increase their reading speed without lowering their reading comprehension. A skeptical student decided to test this. She tested a group of 9 volunteers on their comprehension of a story before and after the class. She also noted their reading speeds, which an analysis showed to be significantly greater after the class, t(8) = 3.5, p < .01. Their scores on the comprehension tests are listed below. Do the data support the claim of the instructor, that reading comprehension does not decrease?

Subjects' Reading Comprehension Scores
Subject   Before   After
1         87       85
2         84       82
3         75       71
4         80       78
5         97       91
6         92       91
7         88       89
8         67       71
9         70       70

Explain to a person who has never had a course in statistics: (a) the meaning of the reading speed result and (b) how you computed (including its logic) and what you found, in the analysis of the comprehension scores.

Using SPSS/PC+ Studentware Plus with this Chapter

If you are using SPSS for the first time, before proceeding with the material in this section, read the Appendix on Getting Started and the Basics of Using SPSS/PC+ Studentware Plus. You can use SPSS to carry out a t test for dependent means. (It is also possible to carry out a t test for a single sample using SPSS, but as noted in the chapter, this procedure is rarely used in practice and was covered mainly as a step towards introducing you to the more widely used t tests.)
You should work through the example, following the procedures step by step. Then look over the description of the general principles involved and try the procedures on your own for some of the problems listed in the Suggestions for Additional Practice. Finally, you may want to try the suggestions for using the computer to deepen your understanding.

I. Example
  A. Data: Hand-eye coordination scores for nine surgeons, each measured under both quiet and noisy conditions (fictional data), from the example in the text. The scores for the surgeons, under quiet conditions first, are 18, 12; 21, 21; 19, 16; 21, 16; 17, 19; 20, 19; 18, 16; 16, 17; and 20, 16.
  B. Follow the instructions in the SPSS Appendix for starting up SPSS and be sure the cursor is in the Scratch Pad window.
  C. Enter the data as follows.
    1. Type DATA LIST FREE / QUIET NOISY. and press Enter to go to the next line.
    2. Type BEGIN DATA. and press Enter to go to the next line.
    3. Type 18, a space, and 12 (the first surgeon's scores for the quiet and noisy conditions, respectively). Then press Enter to continue to the next line.
    4. Type 21 21 (the quiet score and the noisy score for the second surgeon) and press Enter.
    5. Type the scores for the remaining subjects, on each line one subject's quiet and noisy score, in that order.
    6. Type END DATA. and press Enter to move to the next line.
  D. Carry out a t test for dependent means as follows.
    1. Type T-TEST PAIRS QUIET NOISY. and press Enter. The screen should now appear as shown in Figure SG9-1.
       [Figure SG9-1]
    2. Tell SPSS to carry out your instructions--use the arrow keys to move to the start of the first line (DATA LIST ...), press F10, and press Enter. The result should look like Figure SG9-2.

Chapter 10
The t Test for Independent Means

Learning Objectives
To understand, including being able to conduct any necessary computations:
  The principle of the distribution of differences between means.
  The mean of the distribution of differences between means.
  The variance of the distribution of differences between means.
  The shape of the distribution of differences between means.
  The steps in conducting a t test for independent means.
  Assumptions for the t test for independent means (and the conditions under which it is safe to violate them).
  Effect size of a study using the t test for independent means.
  Power of a study using the t test for independent means.
  The problem of too many t tests.
  How results of studies using t tests for independent means are reported in research articles.

Chapter Outline

I. The t Test for Independent Means
  A. Used in studies with two samples.
  B. Follows the usual steps of hypothesis testing.
  C. The only difference from a t test for dependent means is that the comparison distribution is a distribution of differences between means and the t score is based on this distribution.
II. The Distribution of Differences Between Means
  A. It is the comparison distribution in a t test for independent means.
  B. It can be understood as being constructed in four steps.
    1. Construct a distribution of means for each of the two populations.
    2. Randomly select one mean from each distribution of means.
    3. Subtract one from the other.
    4. Repeat a large number of times to create a distribution of these differences.
  C. If the null hypothesis is true, its mean is zero--since the two populations, and hence the two distributions of means, have the same mean, differences between means randomly drawn from these should average out to zero.
  D. Its variance.
    1. A single overall estimate is made of the variance of the populations, since it is assumed the population variances are equal.
      a. If sample sizes are equal, this pooled population variance estimate is the average of the estimates for the two populations: $S_P^2 = (S_1^2 + S_2^2)/2$.
      b. If sample sizes are different, the pooled population variance estimate is based on an average which first weights each estimate by the degrees of freedom on which it is based: $S_P^2 = [df_1/(df_1+df_2)][S_1^2] + [df_2/(df_1+df_2)][S_2^2]$.
    2. The variance of each distribution of means is the pooled estimate divided by the corresponding sample size: $S_{M_1}^2 = S_P^2/N_1$ and $S_{M_2}^2 = S_P^2/N_2$.
    3. The variance of the distribution of differences between means is the sum of the variances of the two distributions of means (because the variance of each contributes to its variance): $S_{DIF}^2 = S_{M_1}^2 + S_{M_2}^2$.
    4. The standard deviation of the distribution of differences between means is the square root of its variance: $S_{DIF} = \sqrt{S_{DIF}^2}$.
  E. Its shape.
    1. A t distribution (because we are using estimated population variances).
    2. Its degrees of freedom are the sum of the degrees of freedom for the two samples (because each contributes to the pooled estimate of the variance): $df_T = df_1 + df_2$.
  F. The t score on this distribution is the difference between the two sample means divided by the standard deviation of this distribution: $t = (M_1 - M_2)/S_{DIF}$.
III. Assumptions of the t Test for Independent Means
  A. Normal population distributions.
    1. Violation of this assumption is a problem mainly if the two populations are thought to have dramatically skewed distributions, and in opposite directions.
    2. The test is especially robust to violations if a two-tailed test is used and if the sample sizes are not extremely small.
  B. Equal population variances.
    1. The test is fairly robust to even fairly substantial differences in the population variances if there are equal numbers in the two groups.
    2. If the two estimated population variances are quite different, and if the samples have different numbers of cases, a modification of the usual t test procedure (not covered in this text) is sometimes used.
  C. It is difficult in practice to determine whether the assumptions hold.
  D.
Some procedures that can be applied when the assumptions are clearly violated are described in Chapter 15.
IV. Effect Size and Power for the t Test for Independent Means
  A. Effect size.
    1. $d = (\mu_1 - \mu_2)/\sigma$.
    2. For a completed study, estimate effect size as $d = (M_1 - M_2)/S_P$.
    3. Cohen's convention: Small d = .20; medium d = .50; large d = .80. (Same as for t test for dependent means.)
  B. Power.
    1. Table 10.6 in the text gives approximate power for the .05 significance level for small, medium, or large effect sizes and one- or two-tailed tests.
    2. Power when sample sizes are not equal.
      a. For any given number of subjects, power is greatest when subjects are divided into two equal groups.
      b. The power is equivalent to a study with equal sample sizes in which each of those equal sizes is the harmonic mean of the actual unequal sample sizes: $N' = [(2)(N_1)(N_2)]/[N_1+N_2]$.
  C. Planning sample size: Table 10.7 in the text gives the approximate number of subjects needed to achieve 80% power for estimated small, medium, and large effect sizes using one- and two-tailed tests, all using the .05 significance level.
V. Controversies and Limitations: Too Many t Tests
  A. If you conduct a large number of t tests, the chance of any one of them coming out significant at, say, the 5% level is really greater than 5%.
  B. The controversy is about how to deal with this problem.
  C. Solutions are complicated in most research situations for two reasons.
    1. Each test is not independent of the others.
    2. Some tests are more important than others.
VI. How t Tests for Independent Means are Described in Research Articles
  A. They are usually reported in the article text by giving the two sample means (and sometimes the SDs), followed by the standard format for any t test--for example: t(38) = 4.72, p < .01.
  B. Sometimes they are reported in tables which give the means (and sometimes SDs), using stars to indicate significance levels.

Formulas

I.
Pooled estimate of population variance ($S_P^2$)
  Formula in words: Weighted average of estimates for each population; that is, each estimate contributes the proportion to the overall estimate that its degrees of freedom are to the total degrees of freedom.
  Formula in symbols: $S_P^2 = [df_1/(df_1+df_2)][S_1^2] + [df_2/(df_1+df_2)][S_2^2]$
    $S_1^2$ is the unbiased estimate of the variance of Population 1.
    $S_2^2$ is the unbiased estimate of the variance of Population 2.
    Note: Each $S^2 = \sum (X - M)^2/df = SS/df$ (see Chapter 9).
    $df_1$ is the degrees of freedom for the sample from Population 1.
    $df_2$ is the degrees of freedom for the sample from Population 2.
    Note: Each $df = N - 1$.
II. Variance of each population's distribution of means ($S_{M_1}^2$ and $S_{M_2}^2$)
  Formula in words: The pooled population variance estimate divided by the corresponding sample size.
  Formulas in symbols: $S_{M_1}^2 = S_P^2/N_1$ and $S_{M_2}^2 = S_P^2/N_2$
    $N_1$ is the number of subjects in the sample representing Population 1.
    $N_2$ is the number of subjects in the sample representing Population 2.
III. Variance of the distribution of differences between means ($S_{DIF}^2$)
  Formula in words: Sum of the variances of the two distributions of means.
  Formula in symbols: $S_{DIF}^2 = S_{M_1}^2 + S_{M_2}^2$
IV. Standard deviation of the distribution of differences between means ($S_{DIF}$)
  Formula in words: The square root of the variance of the distribution of differences between means.
  Formula in symbols: $S_{DIF} = \sqrt{S_{DIF}^2}$
V. Degrees of freedom for a t test for independent means ($df_T$)
  Formula in words: Sum of the degrees of freedom in the two samples.
  Formula in symbols: $df_T = df_1 + df_2$
VI. t score for a t test for independent means
  Formula in words: The difference between the two sample means divided by the standard deviation of the distribution of differences between means.
  Formula in symbols: $t = (M_1 - M_2)/S_{DIF}$
    $M_1$ is the mean of the sample representing Population 1.
    $M_2$ is the mean of the sample representing Population 2.
VII.
Effect size for a t test for independent means (d)
  Formula in words: The hypothesized difference between the population means divided by the population standard deviation.
  Formula in symbols: $d = (\mu_1 - \mu_2)/\sigma$
    $\mu_1$ is the mean of Population 1.
    $\mu_2$ is the mean of Population 2.
    $\sigma$ is the standard deviation of each of the populations.
VIII. Estimated effect size for a t test for independent means (d) for a completed study
  Formula in words: The observed difference between the sample means divided by the pooled estimate of the population standard deviation.
  Formula in symbols: $d = (M_1 - M_2)/S_P$
    $S_P$ is the pooled estimate of the population standard deviation ($S_P = \sqrt{S_P^2}$).
IX. Harmonic mean sample size ($N'$)
  Formula in words: Twice the product of the two sample sizes divided by the sum of the sample sizes.
  Formula in symbols: $N' = [(2)(N_1)(N_2)]/[N_1+N_2]$

How to Conduct a t Test for Independent Means (Based on Table 10.4 in the Text)

I. Reframe the question into a null and a research hypothesis about populations.
II. Determine the characteristics of the comparison distribution.
  A. Its mean will be 0.
  B. Its standard deviation is computed as follows.
    1. Compute estimated population variances based on each sample (that is, compute two estimates).
    2. Compute pooled estimate of population variance: $S_P^2 = [df_1/(df_1+df_2)][S_1^2] + [df_2/(df_1+df_2)][S_2^2]$. Note: $df_1 = N_1 - 1$ and $df_2 = N_2 - 1$.
    3. Compute variance of each distribution of means: $S_{M_1}^2 = S_P^2/N_1$ and $S_{M_2}^2 = S_P^2/N_2$.
    4. Compute variance of distribution of differences between means: $S_{DIF}^2 = S_{M_1}^2 + S_{M_2}^2$.
    5. Compute standard deviation of distribution of differences between means: $S_{DIF} = \sqrt{S_{DIF}^2}$.
  C. Determine its shape: It will be a t distribution with $df_T$ degrees of freedom ($df_T = df_1 + df_2$).
III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.
  A. Determine the degrees of freedom ($df_T$), desired significance level, and whether to use a one- or two-tailed test.
  B.
Look up the appropriate cutoff on a t table; if the exact df is not given, use the next lower df listed.
IV. Determine the score of your sample on the comparison distribution: $t = (M_1 - M_2)/S_{DIF}$.
V. Compare the scores in 3 and 4 to decide whether or not to reject the null hypothesis.

Outline for Writing Essays for Hypothesis Testing Problems Involving the Means of Two Independent Samples (t Test for Independent Means)

The reason for your writing essay questions in the practice problems and tests is that this task develops and then demonstrates what matters so very much--your comprehension of the logic behind the computations. (It is also a place for those of you who are better at words than numbers to shine. And for those better at numbers to develop their skills at explaining in words.)

Thus, to do well you need to be sure to do the following in each essay: (a) give the reasoning behind each step; (b) relate each step to the specifics of the content of the particular study you are analyzing; (c) state the various formulas in nontechnical language, because as you define each term you show you understand it (although once you have defined it in nontechnical language, you can use it from then on in the essay); (d) look back and be absolutely certain that you made it clear just why that formula or procedure was applied and why it is the way it is.

The outlines below are examples of ways to structure your essays. There are other completely correct ways to go about it. And this is an outline for an answer--you must write the answer out in paragraph form. Examples of full essays are in the answers to Set I Practice Problems in the back of the text.

These essays are necessarily very long for you to write (and for others to grade). But this is the very best way to be sure you understand everything thoroughly. One shortcut you may see on a test is to write your answer for someone who understands statistics up to the point of the new material you are studying.
You can choose to take the same shortcut in these practice problems (maybe writing for someone who understands right up to whatever point you yourself start being just a little unclear). But every time you write for a person who has never had statistics at all, you review the logic behind the entire course. You engrain it in your mind. Over and over. The time is never wasted. It is an excellent way to study.

I. Reframe the question into a research hypothesis and a null hypothesis about populations. (Step 1 of hypothesis testing.)
  A. State in ordinary language the hypothesis testing issue: Do the two groups of persons studied represent larger groups or "populations" of people whose averages are different?
  B. Explain language (to make rest of essay easier to write by not having to repeat long explanations each time), focusing on the meaning of each term in the concrete example of the study at hand.
    1. Populations.
    2. Sample.
    3. Mean.
    4. Research hypothesis.
    5. Null hypothesis.
    6. Rejecting the null hypothesis to provide support for the research hypothesis.
II. Determine the characteristics of the comparison distribution. (Step 2 of hypothesis testing.)
  A. Explain principle that the comparison distribution is the distribution (pattern of spread of differences between means of scores of two groups) that represents what we would expect if the null hypothesis were true and our particular two groups each consist of randomly selected scores from populations that have the same mean.
  B. Thus, the comparison distribution is the distribution of differences between means of two groups where the two groups really are not from two different populations, but the two populations are one and the same.
  C. Note that because we are working with a difference between means of two groups, we have to compare our actual difference between the means of two groups to a distribution not of individual cases, but of differences between means of two groups.
  D.
Give an intuitive understanding of how one might construct a distribution of differences between means of two groups.
    1. Create a distribution of means based on the population the first group represents.
      a. Select a random sample of the size (number of subjects) of your first group from the population it represents and compute its mean.
      b. Select another random sample of this size from this population and compute its mean.
      c. Repeat this process a very large number of times.
      d. Make a distribution of these means.
    2. Create a distribution of means based on the population the second group represents.
    3. Select one mean from each distribution of means and find the difference (subtract the one from the other).
    4. Repeat this process many times.
    5. Make a distribution of these differences.
    6. Note that this procedure is only to explain the idea, and would be too much work and unnecessary in practice.
  E. There is an exact mathematical relation of a distribution of differences between means to the population the means are drawn from, so that in practice the characteristics of the distribution of means can be determined directly from knowledge of the characteristics of the population and the size of the samples involved.
  F. Since we are still assuming the null hypothesis is true, the mean of a distribution of differences between means is zero because the distributions of means from each population will have the same mean, and differences between means taken from them should average out to zero.
  G. The standard deviation of a distribution of differences between means is a measure of the amount of spread or variation in the differences. (Define variance and standard deviation in lay language.) This standard deviation is computed in steps.
    1. Estimate the variation in each of the populations which each group represents.
      a.
Whatever the distribution a particular group's scores come from, it is reasonable to assume that the variation among the scores in your particular group is representative of the variation in that larger distribution of scores.
      b. A sample's variation is on the average slightly less than the population it comes from because it is less likely to include scores that are far from its mean.
      c. Thus, a special adjustment is made that exactly corrects for this: Instead of taking the average of the squared deviations--the sum of squared deviations divided by the number of subjects--one instead divides the sum of squared deviations by one less than the number of subjects in the sample.
      d. Describe the computations (and state results) for your two groups.
    2. Average the estimates to get a more accurate pooled estimate.
      a. Normally we assume the two populations have the same amount of variation.
      b. The averaging is done so as to give weight to each estimate in proportion to the information it contributes (which is the number of cases minus one).
      c. Describe the computations and state the results.
    3. Compute the variance of each distribution of means.
      a. The spread of each distribution of means will be less spread out than the population of individual cases from which the samples are taken because of the following reasoning.
        i. Any one score, even an extreme score, has some chance of being selected in a random sample.
        ii. However, the chance is less of very many extreme scores being selected in the same random sample (what is required to create an extreme sample mean), particularly since scores would have to be extreme in the same direction.
        iii. Thus, there is a moderating effect of numbers: In any one sample the deviants tend to be balanced out by middle cases or by deviants in the opposite direction, making each sample tend towards the middle and away from extreme values.
        iv. With fewer extreme values for the means, the variation among the means is less.
      b. The more cases in each sample, the less spread out is the distribution of means of that sample size: With a larger number of cases in each sample, it is even harder for extreme cases in that sample not to be balanced out by middle cases or extremes in the other direction in the same sample.
      c. The variance of each distribution of means is found by a formula that divides the estimated population variance by the number of subjects in the group representing it (thus making it smaller in proportion to the number of subjects in the group).
      d. Describe the computations and state the results.
    4. Find the standard deviation of the distribution of differences between means.
      a. The variance in each distribution of means contributes to the variance in the differences between the means.
      b. The variance of the distribution of differences between means, in fact, comes out to the sum of the variances of the two distributions of means. Its standard deviation is the square root of this result.
      c. Describe the computations and state the results.
  H. The shape of the distribution of differences between means.
    1. The distribution tends to be bell-shaped, with most cases falling near the middle and fewer at the extremes, due to the same basic process of extremes balancing each other out that we noted in the discussion of the standard deviation: middle values are more likely and extreme values less likely.
    2. Specifically, it can be shown that it will follow a precise shape called a t distribution.
    3. Actually there are different t distributions according to the amount of information that goes into estimating the variation in the distribution from the sample, which is the sum of the numbers you divide by in making the estimates (one less than the number of subjects in the first group plus one less than the number of subjects in the second group).
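The computations described in steps G.1 through G.4 above, together with the degrees of freedom mentioned in H.3, can be traced in a short Python sketch. This is only an illustration and not part of the original outline: the function names are hypothetical, and the scores are the job skills data (special program versus standard program) used in the SPSS example later in this chapter.

```python
from math import sqrt


def unbiased_variance(scores):
    """Estimate a population variance from a sample: SS divided by (N - 1)."""
    mean = sum(scores) / len(scores)
    sum_of_squares = sum((x - mean) ** 2 for x in scores)
    return sum_of_squares / (len(scores) - 1)


def independent_means_t(group1, group2):
    """Follow the outline's steps for a t test for independent means."""
    n1, n2 = len(group1), len(group2)
    df1, df2 = n1 - 1, n2 - 1
    df_total = df1 + df2  # picks out which t distribution to use

    # Step 1: estimate each population's variance from its sample.
    s2_1 = unbiased_variance(group1)
    s2_2 = unbiased_variance(group2)

    # Step 2: pooled estimate, weighting each by its share of the df.
    pooled = (df1 / df_total) * s2_1 + (df2 / df_total) * s2_2

    # Step 3: variance of each distribution of means.
    var_m1 = pooled / n1
    var_m2 = pooled / n2

    # Step 4: the variances add; the standard deviation is the square root.
    s_dif = sqrt(var_m1 + var_m2)

    mean1 = sum(group1) / n1
    mean2 = sum(group2) / n2
    t = (mean1 - mean2) / s_dif
    return {"pooled_variance": pooled, "s_dif": s_dif,
            "df_total": df_total, "t": t}


special = [6, 4, 9, 7, 7, 3, 6]
standard = [6, 1, 5, 3, 1, 1, 4]
print(independent_means_t(special, standard))
```

With these scores the sketch gives a t of about 2.75 with 12 degrees of freedom. Whether that exceeds the cutoff still has to be decided from a t table, as described in the steps that follow.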
    4. But the shape of the comparison distribution is only a precise t distribution if both populations of individual scores follow a precise shape called a normal curve (also bell-shaped) that is widely found in nature. Note that in the problem you are told that the distributions of the populations are normal curves, so this condition is met in your case.
III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected. (Step 3 of hypothesis testing.)
  A. Before you figure out how extreme the particular difference between the means of your two groups is on this distribution of differences between means, you want to know how extreme your difference would have to be to decide it was too unlikely that it could have been a randomly drawn mean from this comparison distribution.
  B. Since the shape of the comparison distribution follows a mathematically defined formula, you can use a table to tell you how many standard deviations from the mean your score would have to be in order to be in the top so many percent.
  C. Note that the number of standard deviations from the mean on this t distribution is called a t score. (Explaining this term makes writing simpler as you go along.)
  D. To use these tables you have to decide the kind of situation you have, and there are two considerations.
    1. Are you interested in the chances of getting a difference between means this extreme in only one direction (such as only higher for one group than the other) or in either direction? (Explain which is appropriate for your study.)
    2. Just how unlikely would the extremeness of a particular mean have to be? The standard figure used in psychology is less likely than 5% (though 1% is sometimes used to be especially safe). (Say which you are using in your study. If no figure is stated in the problem and no special reason is given for using one or the other, the general rule is to use 5%.)
  E.
     State the cutoff for your particular problem.
IV. Determine the score of your sample on the comparison distribution. (Step 4 of hypothesis testing.)
  A. Find where the actual difference between your two groups' means would fall on the comparison distribution, in terms of a t score.
  B. State this t score.
V. Compare the scores obtained in Steps 3 and 4 to decide whether to reject the null hypothesis. (Step 5 of hypothesis testing.)
  A. State whether your difference between means (from Step 4) does or does not exceed the cutoff (from Step 3).
  B. If your difference between means exceeds the cutoff, you write out the following.
    1. You can reject the null hypothesis.
    2. By elimination, the research hypothesis is thus supported.
    3. Say what it means that the research hypothesis is supported. (That is, the study shows that the particular experimental manipulation appears to make a difference in the particular thing being measured, or that people in general of the kind represented by one of your groups are probably really different or have been changed on the thing being measured from people in general of the kind represented by your other group.)
  C. If your score does not exceed the cutoff, write out these points.
    1. You cannot reject the null hypothesis.
    2. The experiment is inconclusive.
    3. Say what it means that the study is inconclusive. (That is, the study did not yield results which give a clear indication of whether or not the particular experimental manipulation appears to make a difference in the particular thing being measured; or that it is not clear based on this study whether people in general of the kind represented by one of your groups are really different or have been changed on the thing being measured from people in general of the kind represented by your other group.)
    4. Explicitly note that even though the research hypothesis was not supported in this study, this is not evidence that it is false. It is quite possibly true, but the thing studied has only a small effect, not sufficient to produce a mean extreme enough to yield a significant result in this study.

Chapter Self-Tests

Multiple-Choice Questions

1. A distinguishing feature of the t test for independent means is
  a. dependent populations are treated as if they are unrelated.
  b. the difference between the means of two independent samples is evaluated.
  c. variance is not used in this procedure.
  d. the variance of the parent populations is unrelated to the variance of the samples.
2. When conducting a t test for independent means using a two-tailed test, the null hypothesis typically states that
  a. the mean of Population 1 is the same as the mean of Population 2.
  b. the mean of Population 1 is different from the mean of Population 2.
  c. the variance of Population 1 is less than or the same as the variance of Population 2.
  d. the variance of Population 1 is different from the variance of Population 2.
3. In a t test for independent means, because there are two samples we end up with two estimates of the population variance. If the sample sizes are different, the two estimates are combined by
  a. directly averaging the two estimates into one number.
  b. finding a weighted average.
  c. pooling the raw data of each sample, then finding the variance of the new super sample.
  d. finding the difference of the two estimates (that is, the estimate for Population 1 minus the estimate for Population 2), and using a special table to look up the new estimate based on that difference.
4. The distribution of differences between means has a mean of
  a. the population variance divided by N.
  b. the pooled mean of the sample means.
  c. 0.
  d. 1.
5. The variance of a distribution of differences between means is
  a. the smaller of the two variances of the two distributions of means.
  b. the larger of the two variances of the two distributions of means.
  c. the average of the variances of the two distributions of means.
  d. the sum of the variances of the two distributions of means.
6. In the formula $t = (M_1 - M_2)/S_{DIF}$, the "Ms" refer to
  a. the means of the two populations.
  b. the means of the distribution of means.
  c. the hypothesized means of the two populations.
  d. the means of the two samples.
7. A study compares the scores of men and women on a scale measuring how important beauty is to the subject. If there are 20 women and 15 men, what proportion of the pooled estimate of the population variance will come from men?
  a. 15/20 = 75%
  b. (15-1)/(20-1) = 14/19 = 74%
  c. 50%
  d. (15-1)/((20-1)+(15-1)) = 14/33 = 42%
8. When conducting a t test for independent means, if the assumption of normality is seriously violated, you should
  a. not be concerned because the t test for independent means is highly robust even under extreme violations of the assumptions.
  b. use a procedure other than the t test for independent means.
  c. proceed, but interpret your results with caution.
  d. proceed ONLY if the population variances are not skewed.
9. In a study with 30 subjects total (divided into a control group and an experimental group), which of the following cases would be the most powerful?
  a. The experimental group has 20 subjects and the control group has 10.
  b. Both groups have 15 subjects.
  c. The control group has 20 subjects and the experimental group has 10.
  d. The control group has 29 subjects and the experimental group has 1 subject.
10. Which of the following is part of the process of computing a t test for independent means?
  a. Each sample's standard deviation is divided by its sample size to find the standard deviation of its population's distribution of means.
  b. The population variance, which is known, is used to find the variance of the two samples.
  c.
     The population variance is estimated, then that estimate is used to find the variance of each of the distributions of means.
  d. An estimate of the population mean, based on pooled sample means, is translated into a t score, and then compared to a t distribution.

Fill-In Questions

1. In a study of the effects of a particular drug on creativity, subjects were evaluated while taking part in a creative task. During the task 10 subjects were under the influence of the drug and 10 subjects were not. A t test for ______ would be conducted to analyze the data.
2. "$S_P^2 = (S_1^2 + S_2^2)/2$" is a formula used for the pooled estimate of the population variance when sample sizes are ______.
3. The formula, "$S_P^2/N_2$," is used to compute the variance of Sample 2's ______.
4. When the variances of the distribution of means for both samples are added together, the result is the variance of ______.
5. In the formula "$t = (M_1 - M_2)/S_{DIF}$," $S_{DIF}$ is the ______ of the distribution of differences between means.
6. & 7. The assumptions for a t test for independent means require that the populations are both ______ and have the same ______.
8. When computing the effect size (d) of a completed experiment, based on the actual observed data, the difference between the observed means is divided by ______.
9. There is ______ (more, less, about equal) power in a study in which there are 5 subjects in Sample 1 and 15 in Sample 2 than in a study with 10 subjects in each sample.
10. A study with 15 subjects in one condition and 10 in the other, using a t test for independent means, yielded a t of 3.21, which was significant at the .01 significance level, one-tailed. Write these results in the standard format (using appropriate symbols, etc.) as they would be reported in the text of a research article.

Problems and Essays

1. As a senior thesis a psychology major examines the effects of self-defense training on self-confidence. (This is a new program and it is not clear whether it will increase self-confidence, decrease it, or make no difference.)
Five of ten volunteers are randomly selected to receive self-defense training. The other five receive no special training. At the end of the training period, all subjects complete a self-confidence questionnaire. (a) Is there a difference in self-confidence between the two groups, according to the data below (use the .01 significance level)? (b) Explain your analysis to a person who has never had a course in statistics.

Self-Confidence Scores for Subjects Who Do and Do Not Receive Self-Defense Training
TRAINING   NO TRAINING
15 14 18 16 14 19 17 18 15 13

2. A social psychologist conducted a study of whether she could produce a placebo effect on intelligence. (A "placebo" is an inactive drug or a fake treatment. A "placebo effect" occurs when a subject reacts to a placebo as if it were a real drug or treatment.) She randomly divided seven subjects into two groups. All seven were given pills (known to have no true effect) to take at the start of the experiment in which they were told that they would first be given some questionnaires, including an intelligence test, for background information, and then would undergo some physiological testing to measure effects of the "vitamin" on levels of red blood cells. Three of the subjects were randomly assigned to be told that these pills would take an hour to have any effect, and if they noticed anything at all from them even then, it would be some tingling in the feet. The other four were told that this vitamin has been found to enhance alertness and mental agility during the first hour and then to have no special effect except possibly some tingling in the feet. The table below shows the scores on the intelligence test. (a) Based on these data, is intelligence test performance increased by a placebo pill? (Use the .05 significance level.) (b) Explain your conclusion and procedure to a person who has never had a course in statistics.

Intelligence Scores of Placebo Group and Control Group
PLACEBO   CONTROL
85 89 97 76 105 99 74

3. Do people who are health-conscious get better grades? To address this question, a researcher first assessed the degree of health consciousness (using a questionnaire) of a group of college students. Of those, the top 15 and the bottom 15 were selected, forming the High Health-Conscious (HHC) group and the Low Health-Conscious (LHC) group, respectively. The researcher reported: "The HHC group was not found to have a significantly higher GPA than the LHC group (HHC M = 3.2, SD = .33; LHC M = 3.0, SD = .68; t(28) = 1.06)." Explain and interpret these results to a person who is not familiar with statistics.

4. A personality psychologist is interested in whether introverts and extroverts differ in their sensitivity to light. Forty introverts and forty extroverts are each measured in a series of visual tasks and the results are shown in the table below. (Each is a measure of sensitivity, with higher numbers indicating greater sensitivity.) Explain and interpret these results to a person not familiar with statistics.

Means of Introverts and Extroverts on Perceptual Sensitivity Measures

                     Introverts    Extroverts    t
Light Intensity-A    4.16          13.21         0.78
Light Intensity-B    1422.12       1238.63       4.21**
Color Contrast       107.16        94.16         1.91*
Adaptation-A         .0024         .0013         2.68**
Adaptation-B         [illegible]   [illegible]   -1.07

Using SPSS/PC+ Studentware Plus with this Chapter

If you are using SPSS for the first time, before proceeding with the material in this section read the Appendix on Getting Started and the Basics of Using SPSS/PC+ Studentware Plus. You can use SPSS to carry out a t test for independent means. You should work through the example, following the procedures step by step. Then look over the description of the general principles involved and try the procedures on your own for some of the problems listed in the Suggestions for Additional Practice.
Finally, you may want to try the suggestions for using the computer to deepen your understanding.

Example
  A. Data: Employee performance after several months on the job of seven subjects randomly assigned to a special job skills program and seven subjects randomly assigned to the standard job skills program (fictional data), from the example in the text. The scores for those receiving the special program are 6, 4, 9, 7, 7, 3, and 6. The scores for those receiving the standard program are 6, 1, 5, 3, 1, 1, and 4.
  B. Follow the instructions in the SPSS Appendix for starting up SPSS and be sure the cursor is in the Scratch Pad window.
  C. Enter the data as follows.
    1. Type DATA LIST FREE / PROGRAM PERFORM. and press Enter to go to the next line.
    2. Type BEGIN DATA. and press Enter to go to the next line.
    3. Type 1, a space, and a 6; then press Enter. The 1 stands for the special program, and the 6 stands for the performance score of the first subject in that special program. (This system is used because SPSS assumes that all scores on the same line are for the same subject; thus, you do not lay out the data as a column for each condition.)
    4. Type 1 4 (the program and performance score for the second subject) and press Enter.
    5. Type the program and performance scores for the remaining subjects in the special program.
    6. Type 2 6 (the 2 is being used to stand for the standard program, the 6 is for the performance score of the first subject in the standard program).
    7. Type the scores of the remaining subjects.
    8. Type END DATA. and press Enter to move to the next line.
  D. Carry out the t test for independent means as follows.
    1. Type T-TEST GROUPS PROGRAM(1,2) / VARIABLE PERFORM. and press Enter. Figures SG10-1 and SG10-2 show the entire set of typed lines.

Chapter 11
Introduction to the Analysis of Variance

Learning Objectives
To understand, including being able to conduct any necessary computations:
  When it is appropriate to use an analysis of variance.
  The within-group estimate of the population variance.
  The between-group estimate of the population variance.
  The F ratio.
  The F distribution and using an F table.
  The steps in conducting a one-way analysis of variance.
  Assumptions for the analysis of variance (and the conditions under which it is safe to violate them).
  Effect size (f) of a one-way analysis of variance.
  Power of an experiment using a one-way analysis of variance.
  Effects on power of matching by groups.
  How results of studies using analysis of variance are reported in research articles.

Chapter Outline
I. Basic Logic of the Analysis of Variance
  A. Used in studies with three or more samples.
  B. Follows the usual steps of hypothesis testing.
  C. The null hypothesis is that the three or more populations being compared all have the same mean.
  D. Population variance can be estimated by averaging variance estimates from all samples. This is called the within-group estimate of the population variance.
  E. Population variance can also be estimated based on variation among means of samples. This is called a between-group estimate of the population variance.
    1. When the null hypothesis is true.
      a. All populations have the same mean.
      b. Any variation among means of your particular samples thus can only represent variation among individual scores in the populations.
      c. Thus, variance of individual scores can be estimated from variation among samples.
    2. When the research hypothesis is true.
      a. The populations have different means.
      b. Any variation among means of your particular samples thus can represent both variation among individual scores in the populations plus variation between the population means.
      c. When a between-group estimate is computed in this case, it reflects both sources of variation.
  F. Decisions regarding the null and research hypotheses involve comparing the within-group and between-group estimates of the population variance.
    1. When the null hypothesis is true, the two should be about the same.
    2. When the research hypothesis is true, the between-group estimate should be larger than the within-group estimate.
    3. In terms of proportions (between-group divided by within-group), the proportion should be about 1 when the null hypothesis is true and greater than 1 when it is not.
    4. This proportion is called the F ratio.
  G. The F distribution and the F table.
    1. Statisticians have developed the mathematics of an F distribution: the probabilities of getting Fs of various sizes under the null hypothesis.
    2. Tables are available indicating how extreme the F ratio calculated from your samples has to be in order to reject the null hypothesis at various standard significance levels.
    3. You can think of F in terms of an analogy to the signal-to-noise ratio in engineering, in which the signal is like the variation among means and the noise like the variation of individual scores.
II. Analysis of Variance Procedures
  A. Within-group population variance estimate.
    1. Compute population variance estimates from the data in each sample: $S^2 = SS/df$.
    2. Since all populations are assumed to have equal variances, if sample sizes are equal then these estimates can be combined by a straightforward averaging.
    3. This is called the within-group variance or mean squares within and is symbolized as $S_W^2$ or $MS_W$.
  B. Between-group population variance estimate.
    1. Estimate the variance of the distribution of means, based on the means of your particular samples: Treating each mean as a number, apply the usual formula for estimated population variance.
    2. Extrapolate from the estimated variance of the distribution of means to an estimated variance of the population of individual scores: Multiply by the size of each sample (just the opposite of the dividing you do when going from a population of individual cases to a distribution of means).
    3. This is called the between-group variance or mean squares between and is symbolized as $S_B^2$ or $MS_B$.
  C.
     The F ratio is the between-group variance estimate divided by the within-group variance estimate.
  D. The F distribution.
    1. Think of it as constructed by taking one random sample from each of several populations with the same mean, computing an F ratio, repeating this process many times, and constructing a distribution of these F ratios.
    2. In practice, there is an exact mathematical F distribution.
    3. The F distribution is not symmetrical, but is positively skewed because it is a ratio of variances (which must always be positive).
    4. Using the F table requires a numerator degrees of freedom (number of groups minus 1) and a denominator degrees of freedom (sum of degrees of freedom over all the groups, where each group's degrees of freedom is the number of subjects in that group minus one).
III. Hypothesis Testing with the Analysis of Variance: Follows same standard five steps.
IV. Assumptions in the Analysis of Variance
  A. Same as in t test: normal populations with equal variance.
  B. Also as with the t test, the analysis of variance of the kind considered in this chapter is generally robust to moderate violations.
  C. Violation of normality is a problem when there is reason to believe populations are strongly skewed in different directions or your sample size is quite small.
  D. Violation of equal variances is a problem when the largest variance estimate of any group is 4 or 5 times that of the smallest.
  E. If assumptions are seriously violated, alternative procedures (described in Chapter 15) are available.
V. Effect Size and Power for the Analysis of Variance
  A. Effect size.
    1. $f = \sigma_M/\sigma$.
    2. For a completed study, estimate effect size as $f = S_M/S_W$.
    3. Cohen's conventions: Small f = .10; medium f = .25; large f = .40.
    4. f can also be computed for a published study which provides only the F and the number of subjects in each group: $f = \sqrt{F}/\sqrt{n}$.
  B. Power.
    1. Main determinants of power are effect size, sample size, significance level, and number of groups.
    2. Table 11.7 in the text gives approximate power for small, medium, and large effect sizes, for 3, 4, or 5 groups, all using the .05 significance level.
  C. Planning sample size: Table 11.8 in the text gives the approximate number of subjects needed to achieve 80% power for estimated small, medium, and large effect sizes, for 3, 4, or 5 groups, all using the .05 significance level.
VI. Controversy: Random Assignment Versus Systematic Selection
  A. Matching groups of subjects prior to assignment to groups artificially reduces the natural variation among samples but not the variation within groups. Thus, on the average the F will be reduced and the power is lower.
  B. Recent researchers have noted, however, that under certain conditions (particularly where there are large group differences) power is actually increased by this kind of matching.
VII. How Analyses of Variance are Described in Research Articles: They are usually reported in the article text by giving the sample means (and sometimes the SDs), followed by a standard format for reporting Fs, for example, F(3,67) = 5.21, p < .01. The first number in parentheses is the numerator degrees of freedom, the second number is the denominator degrees of freedom.

Formulas

I. Within-group estimate of the population variance when sample sizes are equal ($S_W^2$ or $MS_W$)
  Formula in words: Average of population variance estimates computed from each sample.
  Formula in symbols: $S_W^2$ or $MS_W = (S_1^2 + S_2^2 + \ldots + S_{Last}^2)/N_G$
    $S_1^2$ is the unbiased estimate of the variance of Population 1.
    $S_2^2$ is the unbiased estimate of the variance of Population 2.
    $S_{Last}^2$ is the unbiased estimate of the variance of the population corresponding to the last group.
    Note: Each $S^2 = \sum (X - M)^2/df = SS/df$ (see Chapter 9). The ellipsis ("...") refers to where you are supposed to fill in the corresponding figures for the populations between 2 and the last.
    $N_G$ is the number of groups.
II.
    Variance of the distribution of means estimated from sample means ($S_M^2$)
  Formula in words: Sum of squared deviations of each sample's mean from the overall mean of all subjects, divided by the degrees of freedom for this estimate (the number of groups minus one).
  Formula in symbols: $S_M^2 = \sum (M - GM)^2/df_B$
    $M$ is the mean of each sample.
    $GM$ is the grand mean, the overall mean of all scores (also, when sample sizes are equal, the mean of the means: $GM = \sum M/N_G$).
    $df_B$ is the degrees of freedom in the between-group estimate, the number of groups minus 1 (that is, $df_B = N_G - 1$).
III. Between-group estimate of the population variance when sample sizes are equal ($S_B^2$ or $MS_B$)
  Formula in words: Variance of the distribution of means times the sample size.
  Formula in symbols: $S_B^2$ or $MS_B = (S_M^2)(n)$
    $n$ is the number of scores in each sample.
IV. The F ratio
  Formula in words: The between-group estimate of the population variance divided by the within-group estimate of the population variance.
  Formula in symbols: $F = S_B^2/S_W^2$ or $F = MS_B/MS_W$
V. Numerator degrees of freedom or between-groups degrees of freedom ($df_B$)
  Formula in words: The number of groups minus 1.
  Formula in symbols: $df_B = N_G - 1$
VI. Denominator degrees of freedom or within-groups degrees of freedom ($df_W$)
  Formula in words: The sum of the degrees of freedom for all of the groups.
  Formula in symbols: $df_W = df_1 + df_2 + \ldots + df_{Last}$
    $df_1$ is the degrees of freedom for the first sample.
    $df_2$ is the degrees of freedom for the second sample.
    $df_{Last}$ is the degrees of freedom for the last sample.
    Note: Each $df = N - 1$ (see Chapter 9).
VII. Effect size for analysis of variance (f)
  Formula in words: The hypothesized standard deviation of the distribution of means divided by the hypothesized standard deviation of the population of individual scores.
  Formula in symbols: $f = \sigma_M/\sigma$
    $\sigma_M$ is the standard deviation of the distribution of means.
    $\sigma$ is the standard deviation of each of the populations (which are assumed to all have the same standard deviation).
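
The formulas for the two variance estimates, the F ratio, and the two degrees of freedom can be traced in a short Python sketch. This is an illustration only, assuming equal sample sizes: the function name is hypothetical and the three groups of scores are made-up numbers, not data from the text.

```python
def one_way_anova_equal_n(groups):
    """Compute the pieces of a one-way analysis of variance (equal sample sizes)."""
    n_groups = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal sample sizes")

    means = [sum(g) / n for g in groups]
    grand_mean = sum(means) / n_groups  # mean of the means (valid for equal n)

    # Within-group estimate: average of the unbiased variance estimates.
    variances = [sum((x - m) ** 2 for x in g) / (n - 1)
                 for g, m in zip(groups, means)]
    s2_within = sum(variances) / n_groups

    # Between-group estimate: variance of the means, then multiplied by n.
    df_between = n_groups - 1
    s2_m = sum((m - grand_mean) ** 2 for m in means) / df_between
    s2_between = s2_m * n

    # Denominator df: sum of (n - 1) over all groups.
    df_within = n_groups * (n - 1)

    return {"S2_within": s2_within, "S2_M": s2_m, "S2_between": s2_between,
            "F": s2_between / s2_within,
            "df_between": df_between, "df_within": df_within}


# Hypothetical scores for three groups of four subjects each.
result = one_way_anova_equal_n([[8, 10, 12, 10], [4, 6, 8, 6], [6, 8, 10, 8]])
print(result)
```

For these made-up scores the F ratio comes out to about 6.0 with 2 and 9 degrees of freedom; the cutoff it must exceed would still be looked up in an F table.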
VIII. Estimated effect size for analysis of variance (f) for a completed study when calculations are available
  Formula in words: The computed estimate of the standard deviation of the distribution of means divided by the computed within-group estimate of the standard deviation of the population of individual scores.
  Formula in symbols: $f = S_M/S_W$
    $S_M$ is the estimated standard deviation of the distribution of means.
    $S_W$ is the within-group estimate of the standard deviation of each of the populations of individual scores.
IX. Estimated effect size for analysis of variance (f) for a completed study when only the F ratio and size of groups are available
  Formula in words: The square root of the F ratio divided by the square root of the number of subjects in each group.
  Formula in symbols: $f = \sqrt{F}/\sqrt{n}$

How to Conduct a One-Way Analysis of Variance (with Equal Sample Sizes) (Based on Table 11-6 in the Text)

I. Reframe the question into a research hypothesis and a null hypothesis about populations.
II. Determine the characteristics of the comparison distribution.
  A. The comparison distribution will be an F distribution.
  B. The numerator degrees of freedom is the number of groups minus 1: $df_B = N_G - 1$.
  C. The denominator degrees of freedom is the sum of the degrees of freedom in each group (the number of cases in the group minus 1): $df_W = df_1 + df_2 + \ldots + df_{Last}$.
III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.
  A. Determine the desired significance level.
  B. Look up the appropriate cutoff on an F table, using the degrees of freedom calculated above.
IV. Determine the score of your sample on the comparison distribution. (This will be an F ratio.)
  A. Compute the between-groups population variance estimate ($S_B^2$ or $MS_B$).
    1. Compute the means of each group.
    2. Compute a variance estimate based on the means of the groups: $S_M^2 = \sum (M - GM)^2/df_B$.
    3. Convert this estimate of the variance of a distribution of means to an estimate of the variance of a population of individual scores by multiplying by the number of cases in each group: $S_B^2$ or $MS_B = (S_M^2)(n)$.
  B. Compute the within-groups population variance estimate ($S_W^2$ or $MS_W$).
    1. Compute population variance estimates based on each group's scores: For each group, $S^2 = SS/df$.
    2. Average these variance estimates: $S_W^2$ or $MS_W = (S_1^2 + S_2^2 + \ldots + S_{Last}^2)/N_G$.
  C. Compute the F ratio: $F = S_B^2/S_W^2$ (or $F = MS_B/MS_W$).
V. Compare the scores in Steps III and IV to decide whether or not to reject the null hypothesis.

Outline for Writing Essays on the Logic and Computations for Conducting a One-Way Analysis of Variance (with Equal Sample Sizes)

The reason for your writing essay questions in the practice problems and tests is that this task develops and then demonstrates what matters so very much: your comprehension of the logic behind the computations. (It is also a place where those better at words than numbers can shine, and where those better at numbers can develop their skills at explaining in words.) Thus, to do well, be sure to do the following in each essay: (a) give the reasoning behind each step; (b) relate each step to the specifics of the content of the particular study you are analyzing; (c) state the various formulas in nontechnical language, because as you define each term you show you understand it (although once you have defined it in nontechnical language, you can use it from then on in the essay); (d) look back and be absolutely certain that you made it clear just why that formula or procedure was applied and why it is the way it is.

The outlines below are examples of ways to structure your essays. There are other completely correct ways to go about it. And this is an outline for an answer; you are to write the answer out in paragraph form. Examples of full essays are in the answers to Set I Practice Problems in the back of the text.
These essays are necessarily very long for you to write (and for others to grade). But this is the very best way to be sure you understand everything thoroughly. One shortcut you may see on a test is that you may be asked to write your answer for someone who understands statistics up to the point of the new material you are studying. You can choose to take the same shortcut in these practice problems (maybe writing for someone who understands right up to whatever point you yourself start being just a little unclear). But every time you write for a person who has never had statistics at all, you review the logic behind the entire course. You engrain it in your mind. Over and over. The time is never wasted. It is an excellent way to study.

I. Reframe the question into a null and a research hypothesis about populations. (Step 1 of hypothesis testing.) State in ordinary language the hypothesis testing issue.
  A. The interest in these groups is as representatives, or "samples," of larger groups, or "populations," of particular types of individuals (such as those exposed to various experimental manipulations).
  B. Thus you construct a scenario in which the populations do not differ, then do computations based on that scenario to see how likely it is such populations would produce samples of scores whose averages are as different from each other as are the averages of the particular samples in this study.
II. Determine the characteristics of the comparison distribution. (Step 2 of hypothesis testing.)
  A. Explain logic of overall approach.
    1. We make and compare two estimates of the variation within these populations (which are assumed to be the same).
    2. If the scenario of no difference is true, then an estimate based on the variation among the averages of the samples should give the same result as an estimate based on the average variation within each sample.
3. But if the scenario is false (and the population averages differ), then this will increase the estimate based on the differences among the averages of the samples but will not affect the estimate based on the variation within them.
4. Thus, if the scenario of no difference is true, the ratio of the two estimates (one based on differences divided by one based on variation within) should be about 1. If the scenario is false, the ratio should be greater than 1.
B. Explain F distribution.
1. Statisticians have determined the probability of getting samples which produce ratios of different sizes under the conditions in which the scenario of no difference is true.
2. The probabilities depend on how many groups there are, and how many subjects within each group.
III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected. (Step 3 of hypothesis testing.)
A. Procedure: Determine significance level and look up cutoff on an F table.
B. Explanation.
1. Begin by figuring out how large this ratio of the two variation estimates would have to be in order to decide that the probability was so low that it is unlikely that the scenario of no difference could be true.
2. There are standard tables that indicate the size of these ratios associated with various low probabilities.
3. To use these tables you have to decide the kind of situation you have. There are two considerations.
a. How many groups and how many subjects in each group. (State the numbers for your study.)
b. Just how unlikely would a ratio have to be to decide the whole scenario on which these tables are based (the scenario of no difference) should be rejected? The standard figure used in psychology is less likely than 5% (though 1% is sometimes used to be especially safe). (Say which you are using in your study; if no figure is stated in the problem and no special reason given for using one or the other, the general rule is to use 5%.)
4.
State the cutoff F ratio for your situation.
IV. Determine the score of your sample on the comparison distribution. (Step 4 of hypothesis testing.)
A. Estimate the populations' variances based on scores within the samples.
1. Procedure: Compute $S^2 = SS/df$ for each, then $S_W^2 = (S_1^2 + S_2^2 + \dots + S_{Last}^2)/N_G$.
2. Explanation.
a. The variation in a sample ought to be representative of the population it comes from.
b. State variance formula in lay terms, noting a reason for squaring: to eliminate signs that would cancel each other out.
c. State unbiased variance formula in lay terms, noting a reason for dividing by N-1 instead of N: to adjust for the tendency of sample variance to be smaller than population variance.
d. In doing this kind of problem, we assume that all populations have equal variation (unless we have reason to think otherwise).
e. Thus we can average the estimates of the variation of the populations to get a better overall estimate.
B. Estimate the variance of the distribution of means.
1. Procedure: $S_M^2 = \sum (M - GM)^2 / df_B$.
2. Explanation.
a. Purpose: Intermediate step to computing the population's variation based on the variation among averages of samples.
b. Think of a population of averages of samples taken at random from a population.
c. The variation among the averages of these samples ought to reflect the variation among the individual cases in the population (the more variation in one, the more variation in the other).
d. If the scenario of no difference among population averages is true, then taking one sample from each population is the same as taking the samples all from the same population.
e. Thus you can make an estimate of the variation in this population based on the variation among the averages of your samples.
f. Remind the reader of the variance formula and unbiased variance formula already described.
C. Estimate the population's variance based on variation among the averages of the samples:
1. Procedure: $S_B^2 = S_M^2 \times n$.
2. Explanation.
a.
The variation in a distribution of averages is less than in a distribution of individual scores (note that the exact relation is in proportion to the number of scores in each sample).
b. Thus to estimate a population's variation based on the variation in a distribution of averages of samples taken from it, you multiply by the size of each sample.
D. Compute the ratio of the two variance estimates.
1. Procedure: $F = S_B^2/S_W^2$.
2. Explanation and purpose: This is where you compute the crucial ratio of variation estimates for your particular samples by dividing the estimate based on variation between groups by the estimate based on the variation within.
V. Compare the scores in Steps 3 and 4 to decide whether or not to reject the null hypothesis. (Step 5 of hypothesis testing.)
A. Note whether or not your F ratio exceeds the cutoff and draw the appropriate conclusion.
1. Reject the null hypothesis: The variation among the averages of your particular samples is so great that it seems unlikely that their populations are the same. So they seem to be different and the null hypothesis seems untrue.
2. Fail to reject the null hypothesis: The result is inconclusive. On the one hand, these results were not extreme enough to persuade you that the variation was due to the populations being different. On the other hand, it is still possible that they really are different, but because of the people who happened to be selected to be in your samples from the populations, this difference did not show up.
B. Be sure to state your conclusion in terms of your particular measures and situation, so it is clear to a lay person just what the real bottom line of the study is.

Chapter Self-Tests

Multiple-Choice Questions

1. When conducting an analysis of variance,
a. the null hypothesis is that the populations have the same means.
b. the sample variances are assumed to be the same.
c. population variances must differ by no more than 1 SD.
d. preliminary t tests are often conducted between the different populations.

2. Which of the following is true about the within-group estimate of the population variance in analysis of variance:
a. it is unaffected by whether or not the null hypothesis is true.
b. it reflects the variance caused by experimental conditions.
c. if the research hypothesis is true, it is larger than the true variance.
d. its size is a reliable indicator of effect size.

3. When calculating an analysis of variance, if the research hypothesis is in fact true, then
a. the F ratio will always be significant.
b. the between-group variance estimate is likely to be bigger than the within-group variance estimate.
c. the within-group variance estimate is likely to be bigger than the between-group estimate.
d. small sample sizes are sufficient to detect small differences among the variance of sample means.

4. Suppose IQ was tested with children divided into three groups according to their parents' parenting styles. Within the group of children of each parenting style, the IQ scores would
a. be equal, because within-group variance is always 0.
b. be equal, because the IQs of different children within a parenting style grouping are more alike than they are different.
c. vary, due to differences among the parenting styles.
d. vary, reflecting variation normally found within the population of children of each parenting style.

5. When conducting an analysis of variance with groups of equal size, the overall estimate of the population variance, based on the variation within groups,
a. can only be calculated when the variances of the samples are assumed to be equal.
b. is used to determine the between-group variance.
c. is estimated by averaging the individual estimates within each group.
d. is exactly the square root of the sample size minus one.

6. Which of the following is the correct formula for S,2?

7.
If there are 5 cases in each sample, and the estimated variance of the distribution of means is 8, then $MS_B$ (or $S_B^2$) is
a. 8/5 = 1.6.
b. 8 × 5 = 40.
c. $8^2/5$ = 12.8.
d. 8/(5-1) = 2.

8. One characteristic of the F distribution is that
a. there is an inherent bias toward increased positive findings for an alpha level of .01.
b. the degrees of freedom of the F ratio's numerator solely determines which F distribution is used as a comparison distribution.
c. its range is $-1$ to $+\infty$.
d. it is positively skewed (the long tail to the right).

9. For the analysis of variance, effect size (f) is determined by
a. the degrees of freedom of the numerator of the F ratio divided by the degrees of freedom of the denominator.
b. dividing the population standard deviation by the number of groups.
c. dividing the difference between the means of the two groups with the most different means by the estimated population standard deviation.
d. dividing the standard deviation of the distribution of means by the standard deviation within populations.

10. After conducting an analysis of variance, if a researcher wanted to present his findings for publication and he had used 8 subjects in each group, what should the "____" be in "F(3, ____) = 4.93, p < .05"?
a. 8.
b. 11.
c. 28.
d. 32.

Fill-In Questions

1. When conducting an analysis of variance, the null hypothesis is that population means are ____.
2. When conducting an analysis of variance, the ____-group estimate of the population variance should always be pretty accurate, regardless of whether the null hypothesis is true or not.
3. When conducting an analysis of variance and the null hypothesis is true, then the ratio of the between-group variance estimate to the within-group variance estimate should be about ____.
4. In the signal-to-noise analogy used to clarify analysis of variance, the between-group variance is likened to ____.
5. In a one-way analysis of variance, "$\sum (M - GM)^2/df_B$" is the formula for ____.
6.
In a one-way analysis of variance the between-group variance is computed by first finding the estimate of the variance of the ____, then multiplying this estimate by the sample size.
7. When conducting an analysis of variance, one assumption that must be met is that the variance is the same in each ____.
8. When calculating the effect size (f) for a one-way analysis of variance, the standard deviation of the distribution of means ($\sigma_M$ or $S_M$) is divided by ____.
9. According to Cohen's conventions for effect size of one-way analysis of variance, a small effect size (f) is ____.
10. Although there are circumstances where the reverse is true, in general when subjects are selected so that the averages of the groups come out the same (on such variables as intelligence, age, etc.), the power of such a design is ____ ("greater than," "less than," or "the same as") the power of a study in which the subjects are just randomly assigned to groups regardless of their scores on these variables.

Problems and Essays

1. A social psychologist interested in the effects of media violence arranged to have 15 children watch a popular children's television program that included a lot of violence. Five were randomly assigned to watch the show on a small-screen television set, five on a standard-screen set, and five on a large-screen set. Immediately after, each child was left in a room with various toys, and the number of seconds of play (during a short observation period) with violent toys was systematically recorded. (a) Do the data below (which are fictional) suggest that the size of screen on which children watch a violent show makes any difference in the subsequent amount of play with violent toys? (Use the .05 significance level.) (b) Explain your analysis to a person who has never had a course in statistics.

Time Playing with Violent Toys After Watching a Violent Program on Televisions of Different-Sized Screens

SMALL   STANDARD   LARGE
65      76         45
68      54         58
59      59         56
54      60         64
57      52         48

2.
A researcher was interested in how different groups perceive psychologists. She asked college students, psychologists, lawyers, and people in the general public how important they thought psychologists were to the current United States society (1 = Not important, 10 = Very important). (a) Do the data below (which are fictional) suggest that the different groups have differing opinions? (Use the .01 significance level.) (b) Explain your analysis to a person who has never had a course in statistics.

Importance of Psychologists According to College Students, Psychologists, Lawyers, and General Public

COLLEGE   PSYCHOLOGISTS   LAWYERS   PUBLIC
4         6               5         4
3         7               6         4
6         5               7         5

3. A health psychologist wanted to examine the relation of social support provided by people in different categories of relationship to the ability of a person to cope with a serious illness. The researcher identified 80 women, all of about the same age and suffering from the same serious illness, 20 of whom during the illness had had contact only with their husband; 20, only with their children; 20, only with a close woman friend; and 20, only with their parents. The (fictional) mean reported level of coping was 4.21 (S = 3.0) for the husband-only group, 3.81 (S = 2.0) for the children-only group, 5.29 (S = 4.0) for the friend-only group, and 2.16 (S = 3.0) for the parent-only group. (a) According to these data, is the type of relationship with the person who provides social support associated with different amounts of coping? (Use the .05 level.) (b) Explain your analysis to a person who has never had a course in statistics.

4. A personality psychologist hypothesized that working adults in the general population, college students, and high school students would differ in their levels of satisfaction with life. She administered a Satisfaction With Life Index to 20 subjects in each group.
She reported her results as follows: "The means were 6.52 (SD = 37) for working adults, 5.47 (SD = 1.13) for college students, and 4.80 (SD = 1.87) for the high school students. The difference was significant, F(2, 57) = 8.04, p < .01." Explain and interpret these results to a person who has never had a course in statistics.

Using SPSS/PC+ Studentware Plus with This Chapter

If you are using SPSS for the first time, before proceeding with the material in this section read the Appendix on Getting Started and the Basics of Using SPSS/PC+ Studentware Plus.

You can use SPSS to carry out the kind of analysis of variance described in this chapter (as well as more advanced analyses of variance). But SPSS requires the raw data to conduct the analysis of variance; it is not set up to handle a problem in which you already have the means and standard deviations.

You should work through the example, following our procedures step by step. Then look over the description of the general principles involved and try the procedures on your own for some of the problems listed in the Suggestions for Additional Practice. Finally, you may want to try the suggestions for using the computer to deepen your understanding.

Example

A. Data: Ratings of a defendant's guilt by subjects randomly assigned to groups that receive either information that the defendant has a criminal record, information that the defendant has a clean record, or no information about the defendant's criminal record (fictional data), from the example in the text. The ratings for those in the Criminal Record group are 10, 7, 5, 10, and 8; for those in the Clean Record group, 5, 1, 3, 7, and 4; and for those in the No Information group, 4, 6, 9, 3, and 3.
B. Follow the instructions in the SPSS Appendix for starting up SPSS and be sure the cursor is in the Scratch Pad window.
C. Enter the data as follows.
1. Type DATA LIST FREE / INFOTYPE GUILT. and press Enter to go to the next line.
2. Type BEGIN DATA.
and press Enter to go to the next line.
3. Type 1, a space, and a 10; then press Enter. The 1 stands for the Criminal Record group (which is arbitrarily labeled 1), and the 10 is the rating of guilt given by the first subject in the Criminal Record group.
4. Type the data in the same way for each of the remaining subjects, using 1 for those in the Criminal Record group, 2 for those in the Clean Record group, and 3 for those in the No Information group.
5. Type END DATA. and press Enter to move to the next line.
D. Carry out the analysis of variance.
1. Type ONEWAY GUILT BY INFOTYPE(1,3) / STATISTIC 1. and press Enter. Figures SG11-1 and SG11-2 show the entire set of typed lines.

DATA LIST FREE / INFOTYPE GUILT.
BEGIN DATA.
1 10
1 7
1 5
1 10
1 8
2 5
Figure SG11-1

Chapter 12
The Structural Model in the Analysis of Variance

Learning Objectives

To understand, including being able to conduct any necessary computations:
The structural model of dividing each score's deviation from the grand mean into its deviation from its group's mean and its group's mean's deviation from the grand mean.
The within-group estimate of the population variance determined using the structural model approach.
The between-group estimate of the population variance determined using the structural model approach.
Analysis of variance tables.
The relation of the structural model approach to the Chapter 11 method.
The analysis of variance with unequal sample sizes.
Planned comparisons and the Bonferroni procedure.
Post hoc comparisons.
Proportion of variance accounted for ($R^2$) as a measure of effect size in analysis of variance, including its relation to f.
Uses of planned comparisons, including linear contrasts, versus overall, diffuse F tests.
How multiple comparisons are reported in research articles.

Chapter Outline

I. Principles of the Structural Model
A. Dividing up the overall deviation of each score from the grand mean into two parts:
1. Deviation of the score from the mean of its group.
2. Deviation of the mean of the score's group from the grand mean.
B. The sum of squared deviations of each score from the grand mean equals the sum of the squared deviations of each score from its group's mean plus the sum of the squared deviations of each score's group's mean from the grand mean.
C. Dividing each sum of squared deviations by the appropriate degrees of freedom gives the population variance estimates.
1. The between-groups population variance estimate ($S_B^2$ or $MS_B$) is the sum of squared deviations of each score's group's mean from the grand mean ($SS_B$) divided by the degrees of freedom on which it is based ($df_B$, the number of groups minus 1).
2. The within-groups population variance estimate ($S_W^2$ or $MS_W$) is the sum of squared deviations of each score from its group's mean ($SS_W$) divided by the total degrees of freedom on which this is based ($df_W$, which is the sum of the degrees of freedom for all the groups).
D. Logic of analysis of variance.
1. If the null hypothesis is true, the division of the overall deviation into two parts should be random, so the two population variance estimates produce an F ratio of about 1.
2. If the research hypothesis is true, the deviations of the group means from the grand mean should be greater than the deviations of the scores from their group's mean, so the two population variance estimates produce an F ratio greater than 1.
II. Analysis of Variance Tables
A. These tables are used to show analysis of variance results and are based on the structural model approach.
B. The columns give the following information.
1. Source (type of variance estimate/deviation score).
2. SS (sum of squared deviations).
3. df (degrees of freedom).
4. MS (mean squares, that is, population variance estimates).
5. F ratio.
C. Each row refers to one of the variance estimates: between, within, and total.
III. Analysis of Variance with Unequal Sized Groups
A. Cannot be done with the Chapter 11 method.
B.
The structural-model approach automatically makes the necessary adjustments for unequal sample sizes.
IV. Multiple Comparisons
A. The overall analysis of variance does not test which specific population means are different from which.
B. Multiple comparisons are procedures for significance testing for comparisons among specific population means.
C. A major problem is keeping overall probability of falsely rejecting any null hypothesis at an acceptable level while testing many comparisons.
D. Planned comparisons.
1. These are a subset of all possible comparisons that the researcher specifies in advance of the study.
2. A common approach to planned comparisons, the Bonferroni procedure, uses a more stringent significance level for each comparison, so that the overall chance of any one of the comparisons being significant is still reasonably low.
E. Post hoc comparisons.
1. These are all possible comparisons among groups to explore all possible differences.
2. Various procedures are used for post hoc comparisons which attempt to keep overall chance of falsely rejecting the null hypothesis low while maintaining adequate power.
V. Assumptions for an Analysis of Variance with Unequal Sample Sizes
A. Same as with equal sample sizes.
B. Less robust to violations of equal population variances (than with equal sample sizes).
VI. Effect Size: The Proportion of Variance Accounted for ($R^2$)
A. Sum of squared deviations of each score's group's mean from the grand mean, divided by sum of squared deviations of each score from the grand mean.
B. Minimum 0, maximum 1. Square root is a correlation.
C. Same as $R^2$ in multiple regression.
D. More familiar indicator of effect size to most researchers than f.
E. Effect size conventions for $R^2$:
1. Small = .01.
2. Medium = .06.
3. Large = .14.
VII. Controversy: Overall F Versus Targeted Planned Comparisons
A. A controversial recommendation is to ignore the overall analysis of variance results in favor of specific planned comparisons.
B.
Linear contrasts are often used in this context. They test a more complex particular predicted relationship, such as one that specifies the pattern of means expected for several groups.
VIII. Multiple Comparisons as Described in Research Articles
A. Planned comparisons and linear contrasts are usually described directly.
B. A common procedure with post hoc comparisons is to report means in tables with subscripted letters such that those having the same letter are not significantly different from each other.

Formulas

I. Between-group estimate of the population variance ($S_B^2$ or $MS_B$)
Formula in words: Sum of squared deviations of each score's group's mean from the grand mean, divided by the degrees of freedom on which it is based (the number of groups minus 1).
Formulas in symbols: $S_B^2 = \sum (M - GM)^2 / df_B$ or $MS_B = SS_B/df_B$ (12-2)
$M$ is the mean of each sample.
$GM$ is the grand mean, the overall mean of all scores.
$df_B$ is the degrees of freedom in the between-group estimate, the number of groups minus 1 (that is, $df_B = N_G - 1$).
$SS_B$ is the sum of squared deviations of each score's group's mean from the grand mean: $\sum (M - GM)^2$.

II. Within-group estimate of the population variance ($S_W^2$ or $MS_W$)
Formula in words: Sum of squared deviations of each score from its group's mean, divided by the total degrees of freedom on which this is based (the number of scores in each group minus 1, summed over all groups).
Formula in symbols: $S_W^2 = \sum (X - M)^2 / df_W$ or $MS_W = SS_W/df_W$ (12-3)
$X$ is each score.
$df_W$ is the degrees of freedom in the within-group estimate, the number of scores in each group minus 1, summed over all groups: $df_W = df_1 + df_2 + \dots + df_{Last}$.
$SS_W$ is the sum of squared deviations of each score from the mean of its group: $\sum (X - M)^2$.

III. Proportion of variance accounted for ($R^2$)
Formula in words: Sum of squared deviations of each score's group's mean from the grand mean, divided by the sum of squared deviations of each score from the grand mean.
Formula in symbols: $R^2 = SS_B/SS_T$ (12-4)
$SS_T$ is the sum of squared deviations of each score from the grand mean: $\sum (X - GM)^2$.
Alternate formula for computing $R^2$ using F and degrees of freedom reported in a research article: $R^2 = (F)(df_B)/([F][df_B] + df_W)$. (12-5)

How to Conduct an Analysis of Variance Using the Structural Model Based Method (Based on Table 12-5 in the Text)

I. Reframe the question into a research hypothesis and a null hypothesis about the populations.
II. Determine the characteristics of the comparison distribution.
A. The comparison distribution will be an F distribution.
B. The numerator degrees of freedom is the number of groups minus 1: $df_B = N_G - 1$.
C. The denominator degrees of freedom is the sum of the degrees of freedom in each group (the number of cases in the group minus 1): $df_W = df_1 + df_2 + \dots + df_{Last}$.
D. Check the accuracy of your computations by making sure that $df_B$ and $df_W$ sum to $df_T$ (which is the total number of cases minus 1).
III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.
A. Determine the desired significance level.
B. Look up the appropriate cutoff in an F table.
IV. Determine the score of the sample on the comparison distribution. (This will be an F ratio.)
A. Compute the mean of each group and the grand mean of all scores.
B. Compute the following deviations for each score.
1. Its deviation from the grand mean ($X - GM$).
2. Its deviation from its group's mean ($X - M$).
3. Its group's mean's deviation from the grand mean ($M - GM$).
C. Square each of these deviation scores.
D. Compute the sums of each of these three types of squared deviation scores ($SS_T$, $SS_W$, and $SS_B$).
E. Check the accuracy of your computations by making sure that $SS_W + SS_B = SS_T$.
F. Compute the between-group variance estimate: $SS_B/df_B$.
G. Compute the within-group variance estimate: $SS_W/df_W$.
H. Compute the F ratio: $F = S_B^2/S_W^2$.
V.
Compare the scores obtained in Steps 3 and 4 to decide whether to reject the null hypothesis.

Analysis of Variance Table and Symbols for the One-Way Analysis of Variance

Source    SS       df       MS                    F
Between   $SS_B$   $df_B$   $MS_B$ (or $S_B^2$)   F
Within    $SS_W$   $df_W$   $MS_W$ (or $S_W^2$)
Total     $SS_T$   $df_T$

Formulas for Each Section of the Analysis of Variance Table

Source    SS                    df                                 MS            F
Between   $\sum (M - GM)^2$     $N_G - 1$                          $SS_B/df_B$   $MS_B/MS_W$
Within    $\sum (X - M)^2$      $df_1 + df_2 + \dots + df_{Last}$  $SS_W/df_W$
Total     $\sum (X - GM)^2$     $N - 1$                            $SS_T/df_T$

Definitions of Basic Symbols
$\sum$ is the usual sum sign, which here refers to adding up the appropriate numbers for all cases.
$M$ is the mean of a score's group.
$GM$ is the grand mean.
$N_G$ is the number of groups.
$X$ is each score.
$N$ is the total number of cases in the study.

Outline for Writing Essays on the Logic and Computations for Conducting a One-Way Analysis of Variance Using the Structural Model Approach

The reason for your writing essay questions in the practice problems and tests is that this task develops and then demonstrates what matters so very much: your comprehension of the logic behind the computations. (It is also a place where those better at words than numbers can shine, and where those better at numbers can develop their skills at explaining in words.) Thus, to do well, be sure to do the following in each essay: (a) give the reasoning behind each step; (b) relate each step to the specifics of the content of the particular study you are analyzing; (c) state the various formulas in nontechnical language, because as you define each term you show you understand it (although once you have defined it in nontechnical language, you can use it from then on in the essay); (d) look back and be absolutely certain that you made it clear just why that formula or procedure was applied and why it is the way it is. The outlines below are examples of ways to structure your essays. There are other completely correct ways to go about it. And this is an outline for an answer; you are to write the answer out in paragraph form.
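As a numeric companion to the analysis of variance table and formulas given earlier in this chapter, the following Python sketch fills in the table entries under the structural model. It is an illustration under stated assumptions, not the text's own procedure: the function names `structural_anova` and `r2_from_f` are hypothetical, and the data are the fictional guilt ratings from the Chapter 11 SPSS example. Because each group's size enters the between-group sum directly, the same code also handles unequal sample sizes.

```python
def structural_anova(groups):
    """Fill in a one-way analysis of variance table (structural model).

    Hypothetical helper written for illustration; not from the text.
    """
    scores = [x for g in groups for x in g]
    grand_mean = sum(scores) / len(scores)

    ss_within = 0.0
    ss_between = 0.0
    for g in groups:
        m = sum(g) / len(g)
        # Each score's squared deviation from its own group's mean.
        ss_within += sum((x - m) ** 2 for x in g)
        # Each score's group's mean's squared deviation from the grand mean
        # (the same value for every score in the group, hence len(g) times).
        ss_between += len(g) * (m - grand_mean) ** 2
    # Each score's squared deviation from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in scores)

    df_between = len(groups) - 1
    df_within = sum(len(g) - 1 for g in groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "SS_B": ss_between, "SS_W": ss_within, "SS_T": ss_total,
        "df_B": df_between, "df_W": df_within, "df_T": len(scores) - 1,
        "MS_B": ms_between, "MS_W": ms_within,
        "F": ms_between / ms_within,
        "R2": ss_between / ss_total,
    }


def r2_from_f(f, df_between, df_within):
    """Formula 12-5: R squared from a reported F and its degrees of freedom."""
    return (f * df_between) / (f * df_between + df_within)


guilt_ratings = [
    [10, 7, 5, 10, 8],  # Criminal Record group
    [5, 1, 3, 7, 4],    # Clean Record group
    [4, 6, 9, 3, 3],    # No Information group
]
table = structural_anova(guilt_ratings)
for name, value in table.items():
    print(f"{name}: {value:.3f}")
```

For these ratings the two accuracy checks from the procedure hold ($SS_W + SS_B = SS_T$ and $df_B + df_W = df_T$), the F ratio matches the Chapter 11 method on the same data, and `r2_from_f(table["F"], 2, 12)` returns the same proportion of variance as $SS_B/SS_T$.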
Examples of full essays are in the answers to Set I Practice Problems in the back of the text. These essays are necessarily very long for you to write (and for others to grade). But this is the very best way to be sure you understand everything thoroughly. One shortcut you may see on a test is that you may be asked to write your answer for someone who understands statistics up to the point of the new material you are studying. You can choose to take the same shortcut in these practice problems (maybe writing for someone who understands right up to whatever point you yourself start being just a little unclear). But every time you write for a person who has never had statistics at all, you review the logic behind the entire course. You engrain it in your mind. Over and over. The time is never wasted. It is an excellent way to study.

I. Reframe the question into a research hypothesis and a null hypothesis about populations. (Step 1 of hypothesis testing.) State in ordinary language the hypothesis testing issue.
A. The interest in these groups is as representatives, or "samples," of larger groups, or "populations," of particular types of individuals (such as those exposed to various experimental manipulations).
B. Thus, you construct a scenario in which the populations do not differ and then do computations based on that scenario to see how likely it is such populations would produce samples of scores whose averages are as different from each other as are the averages of the particular samples in this study.
II. Determine the characteristics of the comparison distribution. (Step 2 of hypothesis testing.)
A. Explain logic of overall approach.
1. We make and compare two estimates of the variance within these populations (which are assumed to be the same).
2. If the scenario of no difference is true, then an estimate based on the variation among the averages of the samples should give the same result as an estimate based on the average variation within each sample.
3. But if the scenario is false (and the population averages differ), then this will increase the estimate based on the differences among the averages of the samples but will not affect the estimate based on the variation within them.
4. Thus, if the scenario of no difference is true, the ratio of the two estimates (one based on differences divided by one based on variation within) should be about 1. If the scenario is false, the ratio should be greater than 1.
B. Explain F distribution.
1. Statisticians have determined the probability of getting samples which produce ratios of different sizes under the conditions in which the scenario of no difference is true.
2. The probabilities depend on how many groups there are and how many subjects.
III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected. (Step 3 of hypothesis testing.)
A. Procedure: Determine significance level and look up cutoff on an F table.
B. Explanation.
1. Begin by figuring out how large this ratio of the two variation estimates would have to be in order to decide that the probability was so low that it is unlikely that the scenario of no difference could be true.
2. There are standard tables that indicate the size of these ratios associated with various low probabilities.
3. To use these tables you have to decide the kind of situation you have. There are two considerations.
a. How many groups and how many subjects overall. (State the numbers for your study.)
b. Just how unlikely would a ratio have to be to decide the whole scenario on which these tables are based (the scenario of no difference) should be rejected? The standard figure used in psychology is less likely than 5% (though 1% is sometimes used to be especially safe).
(Say which you are using in your study; if no figure is stated in the problem and no special reason given for using one or the other, the general rule is to use 5%.)
4. State the cutoff F ratio for your situation.
IV. Determine the score of your sample on the comparison distribution. (Step 4 of hypothesis testing.)
A. Estimate the populations' variation based on scores within the samples.
1. Divide each score's deviation from the overall average into two components: its deviation from its group's average and its group's average's deviation from the overall average. These two deviations provide the basis for making the two estimates of the overall variation in the populations.
2. The estimate based on the variation of each score from its group's average is influenced only by variation within each group. It is computed by squaring each such deviation (to eliminate signs) and finding a kind of average of these squared deviations. (An ordinary average would underestimate the population variation, so instead one divides the total by the number of scores in each group minus 1, summed over all groups.)
3. The estimate based on the variation of each score's group's average from the overall average is influenced both by the variation within each of the groups and any variation between groups. It is computed by squaring each score's group's average's deviation from the overall average and finding a kind of average of these squared deviations (in this case, you divide by the number of groups minus 1; again, this is an adjustment for estimating the population variation from information in samples).
B. Compute the ratio of these two estimates of the population variation.
V. Compare the scores obtained in Steps 3 and 4 to decide whether to reject the null hypothesis. (Step 5 of hypothesis testing.)
A. Note whether your F ratio exceeds the cutoff and draw the appropriate conclusion.
    1. Reject the null hypothesis: The variation among the averages of your particular samples is so great that it seems unlikely that their populations are the same. So they seem to be different and the null hypothesis seems untrue.
    2. Fail to reject the null hypothesis: The result is inconclusive. On the one hand, these results were not extreme enough to persuade you that the variation was due to the populations being different. On the other hand, it is still possible that they really are different, but because of the people who happened to be selected to be in your samples from the populations, this difference did not show up.
  B. Be sure to state your conclusion in terms of your particular measures and situation, so it is clear to a lay person just what the real bottom line of the study is.

Chapter Self-Tests

Multiple-Choice Questions

1. All of the following are advantages of understanding the structural model of the analysis of variance, EXCEPT
   a. unequal sample sizes can be easily dealt with.
   b. deeper insights into the underlying logic of the analysis of variance can be obtained.
   c. the results of the analysis are more accurate even when using equal sample sizes.
   d. it will be easier to understand the results laid out in computer printouts.
2. In the structural model of the analysis of variance, the deviation of a score from the grand mean is divided into
   a. an F distribution and a distribution of deviations from the F distribution.
   b. the Master Contributing Factor and the Lesser Contributing Factor.
   c. the within-group median and the between-group median.
   d. the deviation of the score from the mean of its group and the deviation of the mean of its group from the grand mean.
4. The within-group population variance estimate is all of the following, EXCEPT
   a. $\sum (X - GM)^2 + \sum (M - GM)^2$.
   b. the sum of squared deviations of each score from its group mean, divided by the total degrees of freedom for the groups.
   c. $SS_W / df_W$.
   d. $MS_W$.
5. In an analysis of variance, when the null hypothesis is rejected, then
   a. the groups with the two most extreme means are likely to represent populations with different means, although that is not guaranteed.
   b. the groups with the two most extreme means always come from populations with two different means, but the other groups may or may not come from populations with different means.
   c. all of the groups come from populations with different means.
   d. all of the groups come from populations with the same mean.
6. In an analysis of variance in which you conduct four planned comparisons, each at the .01 level, if you make no special adjustments, what is the approximate overall chance that at least one of the comparisons will be significant by chance?
   a. .0025.
   b. .01.
   c. .04.
   d. .20.
7. Compared to post-hoc comparisons, planned comparisons
   a. never have more power.
   b. almost always have less power.
   c. almost always have more power.
   d. have more power unless the planned comparison is a linear contrast.
8. When conducting an analysis of variance with unequal sample sizes,
   a. violations of equal population variances are more serious than when there are equal sample sizes.
   b. violations of equal population variances are less serious than when there are equal sample sizes (provided the assumption of normal population variances is met).
   c. the assumption of normal population distributions does not apply.
   d. the assumption of equal population variances does not apply.
9. The degree to which variation in the dependent variable is explained by the independent variable ($R^2$)
   a. indicates the degree to which the dependent variable causes an effect in the independent variable.
   b. is determined by the Bonferroni procedure.
   c. is the same as the degree to which a subject's particular score is related to which group the subject is in.
   d. depends mainly on the size of the samples.
10. If a study yields 100 as a sum of squared deviations of scores from the grand mean and 25 as a sum of squared deviations of scores from their group means, then the proportion of variance accounted for is
   a. 100/(100 - 25) = 1.3.
   b. (100 - 25)/100 = .75.
   c. 25/100 = .25.
   d. 100/25 = 4.0.

Fill-In Questions

2-4. For questions 2-4, refer to the following data:

   GROUP A:  X   $(X - GM)^2$   $(X - M)^2$   $(M - GM)^2$
   GROUP B:  X   $(X - GM)^2$   $(X - M)^2$   $(M - GM)^2$

5. Fill in the missing numbers:

   Source     SS     df     MS     F
   Between    40      4     ____   ____
   Within     50     25     ____
   Total      ____   ____

6. In the ____, a specific planned comparison, the total significance level is kept low by dividing alpha equally between the comparisons.
7. In ____, all possible comparisons have to be considered in evaluating the chance of getting any one of them significant.
8. For $R^2$, ____ is a medium effect size.

9-10. A psychology professor studied the effects of coffee on first-year college students. All of the students in his class volunteered for his study, and he randomly divided them into three groups: a Caffeine group, which drank two cups of regular coffee; a Decaf group, which drank two cups of decaffeinated coffee (without knowing it was decaffeinated); and a No-Coffee group. Immediately afterwards a pop quiz was given. The professor recorded the quiz scores and also the students' responses to a measure of nervousness. Use the results below to answer questions 9 and 10.

   MEASURE        NO COFFEE   CAFFEINE   DECAF    F(2, 57)
   Quiz Score       6.39        8.21      7.11     5.75**
   Nervousness     68.04       90.03     92.06     3.48*

   *p < .05   **p < .01
   Note: Within each row, means with different subscripts differed at the .05 level, based on the Newman-Keuls test.

9. The Quiz Score was significantly different over all groups, at the ____ significance level.
10. The Caffeine group's nervousness was significantly different from the ____ group.

Problems and Essays

1. A clinical psychologist had developed a new way of measuring depression and tested it by administering it to three groups of subjects. Three subjects who had been diagnosed as having severe depression scored 8.7, 9.6, and 7.5; three subjects who had been diagnosed as having mild depression scored 6.6, 8.5, and 8.4; and four subjects from the general public scored 2.4, 4.7, 3.9, and 4.1. (a) Based on these data, is there a significant difference among the three groups on scores on the new measure of depression (use the .01 level)? (b) Explain your conclusion and procedure to a person who has never had a course in statistics.
2. A perceptual psychologist was studying figure-ground reversal. Does the amount of working with imagery in daily life affect the rate of figure-ground reversal? Specifically, do art, math, and literature students reverse figure and ground at different rates on ambiguous pictures? Math students reported reversals at 2.6, 3.0, and 4.7 seconds; literature students at 4.3, 3.8, and 2.9 seconds; and art students at 2.9, 3.2, 1.0, 1.8, and 2.1 seconds. (a) Based on these data, is there a significant difference in speed of figure-ground reversal for students in these three majors? (Use the .05 level.) (b) Explain your conclusion and procedure to a person who has never had a course in statistics.
3. A developmental psychologist devised an experiment to test the learning skills of children. All children performed a task, but some of the children received no information about the task (the "None" condition), some were told a subsequent task would depend on what they learned in the first task (the "Important" condition), and some were specifically told they would not need to know the information later (the "Useless" condition). Two studies were conducted, one on five year olds and the other on eight year olds. Below are the results. Answer questions 3 and 4 with these results in mind.
   Mean Scores on Task Performance as a Function of Information Condition, for 5 Year Olds and 8 Year Olds

                Information Received
   Age     None     Useless    Important    F(2, 31)
   5       44.33     40.33       44.32       4.39*
   8       63.80     52.08       75.61      18.65**

   *p < .05   **p < .01
   NOTE: In each row, means with different subscripts are different at the .05 level, Tukey's HSD test.

   Based on these data, describe the effect of the type of information received on the task performance for five year olds to a person who is not familiar with statistics. In your description include a discussion of the logic of the structural model of the analysis of variance and of multiple comparisons.
4. Using the eight year olds as an example, explain the underlying logic of the structural model of the analysis of variance and calculate $R^2$, describing its underlying meaning to a person who is not familiar with statistics.

Chapter 13
Factorial Analysis of Variance

Learning Objectives

To understand, including being able to conduct any necessary computations:
- Basic logic of factorial designs.
- Terminology for describing factorial designs.
- Recognizing and interpreting interaction and main effects from a table of means.
- Graphing and interpreting graphs of results of factorial design studies.
- Application of the structural model to the two-way analysis of variance.
- Assumptions for factorial analysis of variance.
- Proportion of variance accounted for ($R^2$) in the two-way analysis of variance.
- Power in the two-way analysis of variance.
- Extensions of two-way analysis of variance to analyses involving more than two dimensions, unequal sample sizes, and repeated measures.
- How results of a two-way analysis of variance are presented in research articles.

Chapter Outline

I. Basic Logic of Factorial Designs and Interaction Effects
  A. A factorial design is a study in which the influence of two or more independent variables is studied at once by constructing groupings that include every combination of the levels of these variables.
  B. A factorial design with two independent variables can be diagrammed as a two-dimensional chart with the levels of one variable arrayed across the rows and the levels of the other variable arrayed across the columns.
  C. A factorial design is more efficient because it permits using all subjects in the study to test hypotheses for each independent variable.
  D. A factorial design permits the researcher to test interaction effects, the situation in which the influence of one independent variable differs according to the level of the other independent variable.
II. Terminology of Factorial Designs
  A. The number of independent variables is referred to as the number of dimensions or "ways" (e.g., a "two-way factorial design").
  B. A main effect is an effect of one variable averaged across the levels of the other independent variable(s).
  C. In a two-way design there are two possible main effects and one possible interaction effect.
  D. A factorial design can be characterized by the number of levels in each independent variable (e.g., "a 2 × 3 factorial design").
  E. A cell mean is the mean of scores in a particular combination of levels of the independent variables.
  F. A marginal mean is the mean of scores in a particular level of one of the independent variables.
    1. A row mean is the marginal mean for one of the levels of the independent variable whose levels are arrayed vertically in the diagram of the factorial design.
    2. A column mean is the marginal mean for one of the levels of the independent variable whose levels are arrayed horizontally in the diagram of the factorial design.
  G. Differences in marginal means for a particular independent variable indicate a main effect.
  H. One must inspect the cell means to identify an interaction effect.
III. Recognizing and Interpreting Interaction Effects
  A. An interaction effect is described verbally in terms of different effects of each independent variable according to the level of the other independent variable.
  B. In a two-way factorial design study, interaction effects are identified numerically from the pattern of cell means.
    1. An interaction arises when the pattern of differences across columns is not the same in each row.
    2. Equivalently, an interaction arises when the pattern of differences across rows is not the same in each column.
  C. Graphing results of a two-way factorial design study.
    1. The vertical axis of the graph represents the values of the dependent variable.
    2. The levels of one independent variable are arrayed across the horizontal axis of the graph.
    3. For each level of the other independent variable, a line is put in the graph connecting dots representing the cell means for its combination with each level of the first independent variable.
  D. Interpreting a graph of the results of a two-way factorial design study.
    1. An interaction is indicated by the lines not being parallel.
    2. A main effect for the independent variable whose levels are represented by different lines is indicated by the lines having different average heights.
    3. A main effect for the independent variable whose levels are arrayed across the horizontal axis is indicated by the lines, on average, sloping up or down rather than being flat.
  E. Any combination of main and interaction effects is possible.
IV. Basic Logic of the Two-Way Analysis of Variance
  A. The two-way analysis of variance is the statistical procedure used to test hypotheses about main and interaction effects in a two-way factorial design study.
  B. In a two-way analysis of variance there are three F ratios (one for each main effect and one for the interaction effect).
  C. Each F ratio represents a between-group variance estimate for its corresponding main or interaction effect, divided in each case by the same within-group variance estimate, which is based on the variation within each cell.
  D. Between-group variance estimates for main effects.
    1. For the row independent variable, based on the variation among the row means.
    2. For the column independent variable, based on the variation among the column means.
  E. Between-group variance estimate for interaction effect is based on the variation among combinations of cells other than those in the same columns or rows.
    1. In a 2 × 2 design, these are the combination of the two cells in one diagonal versus the combination of the two cells in the other diagonal.
    2. In a design with three or more levels on either dimension, there is more than one combination of cells to be considered.
V. Structural Model for the Two-Way Analysis of Variance
  A. Each score's deviation from the grand mean is divided into four components:
    1. Score's deviation from the mean of its cell (used for computing the within-group variance estimate).
    2. Deviation of the score's row's mean from the grand mean (used for computing the row variable's between-group variance estimate).
    3. Deviation of the score's column's mean from the grand mean (used for computing the column variable's between-group variance estimate).
    4. A remaining deviation (used for computing the interaction between-group variance estimate).
  B. Each variance estimate is computed by squaring each deviation, summing them over all subjects, and dividing by the appropriate degrees of freedom.
  C. Computing degrees of freedom.
    1. Df for each main effect: Number of levels of the variable minus 1.
    2. Df for interaction effect: Number of cells, minus df for row main effect, minus df for column main effect, minus 1.
    3. Df for within-group variance estimate: Sum of degrees of freedom over all cells (for each cell, df is the cell's number of cases minus 1).
  D. The table for a two-way analysis of variance is like that for a one-way analysis (as in Chapter 12), except there is a row for each of the three between-group effects.
  E. The hypothesis testing procedure is the same as with a one-way analysis of variance, except there are research and null hypotheses to be tested for each main effect and the interaction effect.
VI. Assumptions in the Two-Way Analysis of Variance
  A. Same as with one-way analysis of variance.
  B. Assumptions apply to the populations corresponding to each cell.
VII. Effect Size and Power in the Two-Way Analysis of Variance
  A. The proportion of variance accounted for ($R^2$) or f can be computed for each main effect and the interaction effect.
  B. $R^2$ is computed for each effect as the SS for that effect ($SS_R$, $SS_C$, or $SS_I$) divided by the portion of $SS_T$ remaining after subtracting out the SS for each of the other two effects.
  C. $R^2$ for each effect can also be computed directly from Fs and degrees of freedom given in a published research report.
  D. Power (and corresponding power and sample size tables) is influenced by number of levels of the effect and number of levels of the effect with which it is crossed.
VIII. Extensions and Special Cases of the Factorial Analysis of Variance
  A. Three-way and higher factorial designs are a straightforward extension of two-way logic and procedures.
  B. Unequal numbers of subjects in the cells.
    1. Using standard procedures (as in this chapter) gives incorrect results.
    2. The preferred procedure, least-squares analysis of variance (available on most computer programs), equalizes the influence of the different cells on the main and interaction effect computation.
  C. Repeated-measures analysis of variance.
    1. Arises when one or more of the independent variables represents different measures on the same individuals (such as the same test given to the same subjects before, during, and after some procedure).
    2. Sometimes a repeated-measures variable is crossed with an ordinary between-subjects variable.
    3. Requires special procedures.
    4. There is some controversy over the appropriate procedures to use because the more traditional approach requires rigid assumptions that are often violated in practice.
IX. Controversy: How to Think About Interaction Effects
  A. Traditionally psychologists have often interpreted interaction effects by inspecting the pattern of cell means.
  B. Technically, the pattern of the interaction effect per se represents the pattern of cell means only after removing the effects on the cell means of row and column differences.
X. Factorial Analysis of Variance Results as Described in Research Articles
  A. Cell means and Fs are given in tables or text.
  B. If it is a complex analysis, a partial analysis-of-variance table may be given.
  C. Graphs showing the pattern of cell means are often provided when there is a significant interaction effect.

Formulas

I. Between-group estimates of the population variance for rows ($S_R^2$ or $MS_R$)
  Formula in words: Sum of squared deviations of each score's row's mean from the grand mean, divided by the degrees of freedom on which it is based (the number of rows minus 1).
  Formulas in symbols: $S_R^2 = \sum (M_R - GM)^2 / df_R$ or $MS_R = SS_R / df_R$   (13-1)
  $M_R$ is the mean of each row.
  $GM$ is the grand mean, the overall mean of all scores.
  $df_R$ is the degrees of freedom in the between-group estimate for rows. It is the number of rows minus 1: $df_R = N_R - 1$.   (13-8)
  $SS_R$ is the sum of squared deviations of each score's row's mean from the grand mean: $\sum (M_R - GM)^2$.

II. Between-group estimates of the population variance for columns ($S_C^2$ or $MS_C$)
  Formula in words: Sum of squared deviations of each score's column's mean from the grand mean, divided by the degrees of freedom on which it is based (the number of columns minus 1).
  Formulas in symbols: $S_C^2 = \sum (M_C - GM)^2 / df_C$ or $MS_C = SS_C / df_C$   (13-2)
  $M_C$ is the mean of each column.
  $df_C$ is the degrees of freedom in the between-group estimate for columns. It is the number of columns minus 1: $df_C = N_C - 1$.   (13-9)
  $SS_C$ is the sum of squared deviations of each score's column's mean from the grand mean: $\sum (M_C - GM)^2$.

III. Between-group estimates of the population variance for the interaction ($S_I^2$ or $MS_I$)
  Formula in words: Sum of squares of each score's remaining deviation from the grand mean (after subtracting out its deviation from its cell's mean, its row's mean's deviation from the grand mean, and its column's mean's deviation from the grand mean), divided by the degrees of freedom on which it is based (the number of cells minus the number of degrees of freedom for rows, minus the number of degrees of freedom for columns, minus 1).
  Formulas in symbols: $S_I^2 = \sum [(X - GM) - (X - M) - (M_R - GM) - (M_C - GM)]^2 / df_I$ or $MS_I = SS_I / df_I$   (13-3)
  $X$ is each score.
  $df_I$ is the degrees of freedom in the between-group estimate for interaction. It is the number of cells minus the number of degrees of freedom for rows, minus the number of degrees of freedom for columns, minus 1: $df_I = N_{Cells} - df_C - df_R - 1$, where $N_{Cells}$ is the number of cells.   (13-10)

IV. Within-group estimate of the population variance ($S_W^2$ or $MS_W$)
  Formula in words: Sum of squared deviations of each score from its cell's mean, divided by the total degrees of freedom on which this is based (the number of scores in each cell minus 1, summed over all cells).
  Formula in symbols: $S_W^2 = \sum (X - M)^2 / df_W$ or $MS_W = SS_W / df_W$   (13-4)
  $df_W$ is the degrees of freedom in the within-group estimate. It is the number of scores in each cell minus 1, summed over all cells: $df_W = df_1 + df_2 + \ldots + df_{Last}$.   (13-11)
  $SS_W$ is the sum of squared deviations of each score from the mean of its cell: $\sum (X - M)^2$.

V. F ratios for row main effect ($F_R$), column main effect ($F_C$), and interaction effect ($F_I$)
  Formula in words: Population variance estimate based on the particular main or interaction effect divided by the within-group population variance estimate.
  Formulas in symbols:
    $F_R = S_R^2 / S_W^2$ or $F_R = MS_R / MS_W$   (13-5)
    $F_C = S_C^2 / S_W^2$ or $F_C = MS_C / MS_W$   (13-6)
    $F_I = S_I^2 / S_W^2$ or $F_I = MS_I / MS_W$   (13-7)

VI. Proportion of variance accounted for by row main effect ($R_R^2$), column main effect ($R_C^2$), and interaction effect ($R_I^2$)
  Formula in words: Sum of squared deviations of each score's row, column, or row-column-combination mean from the grand mean, divided by the sum of squared deviations of each score from the grand mean after removing sums of squared deviations for the other main or interaction effects.
  Formulas in symbols:
    $R_R^2 = SS_R / (SS_T - SS_C - SS_I)$   (13-12)
    $R_C^2 = SS_C / (SS_T - SS_R - SS_I)$   (13-13)
    $R_I^2 = SS_I / (SS_T - SS_C - SS_R)$   (13-14)
  $SS_T$ is the sum of squared deviations of each score from the grand mean: $\sum (X - GM)^2$.
  Alternate formulas for computing $R^2$ using F and degrees of freedom reported in a research article:
    $R_R^2 = (F_R)(df_R) / ([F_R][df_R] + df_W)$   (13-15)
    $R_C^2 = (F_C)(df_C) / ([F_C][df_C] + df_W)$   (13-16)
    $R_I^2 = (F_I)(df_I) / ([F_I][df_I] + df_W)$   (13-17)

How to Conduct a Two-Way Analysis of Variance
(Based on Tables 13-13 and 13-14 in the Text)

I. Reframe the question into a research hypothesis and a null hypothesis about the populations for each main effect and the interaction effect.
II. Determine the characteristics of the comparison distributions.
  A. The comparison distributions will be F distributions with denominator degrees of freedom equal to the sum of the degrees of freedom in each of the cells (the number of cases in the cell minus 1): $df_W = df_1 + df_2 + \ldots + df_{Last}$.
  B. The numerator degrees of freedom for the F distributions vary:
    1. For the columns main effect it is the number of columns minus 1: $df_C = N_C - 1$.
    2. For the rows main effect it is the number of rows minus 1: $df_R = N_R - 1$.
    3. For the interaction effect it is the number of cells minus the degrees of freedom for columns, minus the degrees of freedom for rows, minus 1: $df_I = N_{Cells} - df_C - df_R - 1$.
  C. Check the accuracy of your computations by making sure that all of the degrees of freedom add up to the total degrees of freedom: $df_T = df_W + df_C + df_R + df_I$.
III. Determine the cutoff sample scores on the comparison distributions at which each null hypothesis should be rejected.
  A. Determine the desired significance levels.
  B. Look up the appropriate cutoffs in an F table.
IV. Determine the score of the sample on each comparison distribution. (These will be F ratios.)
  A. Compute the mean of each cell, row, and column, and the grand mean of all scores.
  B. Compute the following deviations for each score.
    1. Its deviation from the grand mean: $X - GM$.
    2. Its deviation from its cell's mean: $X - M$.
    3. Its row's mean's deviation from the grand mean: $M_R - GM$.
    4. Its column's mean's deviation from the grand mean: $M_C - GM$.
    5. Its deviation from the grand mean minus all the other deviations: Interaction deviation = $(X - GM) - (X - M) - (M_R - GM) - (M_C - GM)$. (Be sure to compute this deviation using unsquared deviations and to pay close attention to signs.)
  C. Square each of these deviation scores.
  D. Compute the sums of each of these five types of squared deviation scores ($SS_T$, $SS_W$, $SS_R$, $SS_C$, and $SS_I$).
  E. Check the accuracy of your computations by making sure that the sum of squared deviations based on each score's deviation from the grand mean equals the sum of all the other sums of squared deviations: $SS_T = SS_W + SS_R + SS_C + SS_I$.
  F. Compute the between-group variance estimate for each main and interaction effect ($MS_C$ or $S_C^2 = SS_C / df_C$; $MS_R$ or $S_R^2 = SS_R / df_R$; $MS_I$ or $S_I^2 = SS_I / df_I$).
  G. Compute the within-group variance estimate ($MS_W$ or $S_W^2 = SS_W / df_W$).
  H. Compute the F ratios for each main and interaction effect ($F_C = S_C^2 / S_W^2$ or $MS_C / MS_W$; $F_R = S_R^2 / S_W^2$ or $MS_R / MS_W$; $F_I = S_I^2 / S_W^2$ or $MS_I / MS_W$).
V. Compare the scores obtained in Steps 3 (III) and 4 (IV) to decide whether to reject the null hypotheses.
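
The computational part of this procedure (Step IV, with the checks in II.C and IV.E) can be sketched in Python. This is only an illustration under stated assumptions, not a procedure taken from the text: the scores are made up, the names `cells` and `two_way_anova` are hypothetical, and the sketch assumes the same number of scores in every cell (with unequal cell sizes the standard procedure is not appropriate, as noted in the chapter outline).

```python
# Hypothetical 2 x 2 data set, three scores per cell (made up for illustration).
cells = {
    ("row1", "col1"): [1, 2, 3],
    ("row1", "col2"): [3, 4, 5],
    ("row2", "col1"): [5, 6, 7],
    ("row2", "col2"): [3, 4, 5],
}


def mean(values):
    return sum(values) / len(values)


def two_way_anova(cells):
    """Structural-model two-way ANOVA; assumes equal cell sizes."""
    rows = sorted({r for r, _ in cells})
    cols = sorted({c for _, c in cells})
    all_scores = [x for scores in cells.values() for x in scores]

    gm = mean(all_scores)                               # grand mean (GM)
    cell_mean = {k: mean(v) for k, v in cells.items()}  # M for each cell
    row_mean = {r: mean([x for (rr, _), v in cells.items() if rr == r for x in v])
                for r in rows}                          # M_R
    col_mean = {c: mean([x for (_, cc), v in cells.items() if cc == c for x in v])
                for c in cols}                          # M_C

    # Steps IV.B to IV.D: deviations for every score, squared and summed.
    ss = {"T": 0.0, "W": 0.0, "R": 0.0, "C": 0.0, "I": 0.0}
    for (r, c), scores in cells.items():
        for x in scores:
            dev_t = x - gm                       # X - GM
            dev_w = x - cell_mean[(r, c)]        # X - M
            dev_r = row_mean[r] - gm             # M_R - GM
            dev_c = col_mean[c] - gm             # M_C - GM
            dev_i = dev_t - dev_w - dev_r - dev_c  # interaction (unsquared)
            for key, dev in zip("TWRCI", (dev_t, dev_w, dev_r, dev_c, dev_i)):
                ss[key] += dev ** 2

    # Degrees of freedom, following formulas (13-8) to (13-11).
    df = {"R": len(rows) - 1, "C": len(cols) - 1}
    df["I"] = len(cells) - df["C"] - df["R"] - 1
    df["W"] = sum(len(v) - 1 for v in cells.values())
    df["T"] = len(all_scores) - 1

    # Steps IV.F to IV.H: variance estimates and F ratios.
    ms = {k: ss[k] / df[k] for k in "RCIW"}
    f = {k: ms[k] / ms["W"] for k in "RCI"}
    return ss, df, ms, f


ss, df, ms, f = two_way_anova(cells)
print("SS:", ss)
print("df:", df)
print("F ratios:", f)
```

For these made-up scores the accuracy checks hold: the total sum of squares (32) equals 8 + 12 + 0 + 12, and the total degrees of freedom (11) equal 8 + 1 + 1 + 1. The F ratios would still have to be compared with cutoffs from an F table, as in Steps III and V.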

Analysis of Variance Table and Symbols for a Two-Way Analysis of Variance

   Source          SS        df        MS                     F
   Between
     Columns       $SS_C$    $df_C$    $MS_C$ (or $S_C^2$)    $F_C$
     Rows          $SS_R$    $df_R$    $MS_R$ (or $S_R^2$)    $F_R$
     Interaction   $SS_I$    $df_I$    $MS_I$ (or $S_I^2$)    $F_I$
   Within          $SS_W$    $df_W$    $MS_W$ (or $S_W^2$)
   Total           $SS_T$    $df_T$

Formulas for Each Section of the Analysis of Variance Table

   Source          SS                                                          df                                   MS              F
   Between
     Columns       $\sum (M_C - GM)^2$                                         $N_C - 1$                            $SS_C / df_C$   $MS_C / MS_W$
     Rows          $\sum (M_R - GM)^2$                                         $N_R - 1$                            $SS_R / df_R$   $MS_R / MS_W$
     Interaction   $\sum [(X - GM) - (X - M) - (M_R - GM) - (M_C - GM)]^2$     $N_{Cells} - df_C - df_R - 1$        $SS_I / df_I$   $MS_I / MS_W$
   Within          $\sum (X - M)^2$                                            $df_1 + df_2 + \ldots + df_{Last}$   $SS_W / df_W$
   Total           $\sum (X - GM)^2$                                           $N - 1$

Definitions of Basic Symbols

   $\sum$ is the usual sum sign, which here refers to adding up the appropriate numbers for all cases (not all cells)
   $M$ is the mean of a score's cell
   $M_R$ is the mean of a score's row
   $M_C$ is the mean of a score's column
   $GM$ is the grand mean of all scores
   $N_{Cells}$ is the number of cells
   $N_R$ is the number of rows
   $N_C$ is the number of columns
   $X$ is each score
   $N$ is the total number of cases in the study

Outline for Writing Essays on the Logic and Computations for Conducting a Two-Way Analysis of Variance

The reason for your writing essays in the practice problems and tests is that this task develops and then demonstrates what matters most: your comprehension of the logic behind the computations. (It is also a place where those better at words than numbers can shine, and where those better at numbers can develop their skills at explaining in words.)
Thus, to do well, be sure to do the following in each essay: (a) give the reasoning behind each step; (b) relate each step to the specifics of the content of the particular study you are analyzing; (c) state the various formulas in nontechnical language, because as you define each term you show you understand it (although once you have defined it in nontechnical language, you can use it from then on in the essay); (d) look back and be absolutely certain that you made it clear just why that formula or procedure was applied and why it is the way it is.

The outlines below are examples of ways to structure your essays. There are other completely correct ways to go about it. And this is an outline for an answer; you are to write the answer out in paragraph form. Examples of full essays are in the answers to Set I Practice Problems in the back of the text.

These essays are necessarily very long for you to write (and for others to grade). But this is the very best way to be sure you understand everything thoroughly. One shortcut you may see on a test is that you may be asked to write your answer for someone who understands statistics up to the point of the new material you are studying. You can choose to take the same shortcut in these practice problems (maybe writing for someone who understands right up to whatever point you yourself start being just a little unclear). But every time you write for a person who has never had statistics at all, you review the logic behind the entire course. You engrain it in your mind. Over and over. The time is never wasted. It is an excellent way to study.

I. Reframe the question into a research hypothesis and a null hypothesis about populations. (Step 1 of hypothesis testing.)
  A. Make a chart and explain the logic of the factorial design setup.
  B. State in ordinary language the basic logic of hypothesis testing.
    1. The interest in these groups is as representatives, or "samples," of larger groups, or "populations," of particular types of individuals (such as those exposed to various experimental manipulations).
    2. Thus, you construct a scenario in which the populations do not differ and then do computations based on that scenario to see how likely it is such populations would produce samples of scores whose averages are as different from each other as are the averages of the particular samples in this study.
  C. In this case, there are three research hypotheses and three corresponding null hypotheses.
    1. Explain the different ways of combining groups for row and column main effects.
    2. Explain the logic of interaction effects as patterns of differences across columns varying according to the level of the row variable.
II. Determine the characteristics of the comparison distribution. (Step 2 of hypothesis testing.)
  A. Explain the logic of the overall approach for testing each null hypothesis.
    1. We make and compare two estimates of the variance within these populations (which are assumed to be the same).
    2. If the scenario of no difference is true, then an estimate based on the variation among the averages of the samples (as grouped for the particular hypothesis) should give the same result as an estimate based on the average variation within each sample (in this case, within each cell).
    3. But if the scenario is false (and the population averages differ), then this will increase the estimate based on the differences among the averages of the samples but will not affect the estimate based on the variation within them.
    4. Thus, if the scenario of no difference is true, the ratio of the two estimates (the one based on variation among averages divided by the one based on variation within each cell) should be about 1. If the scenario is false, the ratio should be greater than 1.
  B. Explain the F distribution.
    1. Statisticians have determined the probability of getting samples that produce ratios of different sizes under the conditions in which the scenario of no difference is true.
    2. The probabilities depend on how many groupings are involved in a particular comparison and how many subjects are within each cell.
III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected. (Step 3 of hypothesis testing.)
  A. Procedure: Determine the significance level and look up the cutoff on an F table (using the appropriate numerator and denominator df).
  B. Explanation.
    1. The next step for each hypothesis is to figure out how large this ratio of the two variation estimates would have to be in order to decide that the probability was so low that it is unlikely that the scenario of no difference could be true.
    2. There are standard tables that indicate the size of these ratios associated with various low probabilities.
    3. To use these tables you have to decide the kind of situation you have. There are two considerations.
      a. How many groupings and how many subjects in each cell. (State the numbers for your study.)
      b. Just how unlikely would a ratio have to be to decide that the whole scenario on which these tables are based (the scenario of no difference) should be rejected? The standard figure used in psychology is less likely than 5% (though 1% is sometimes used to be especially safe). (Say which you are using in your study for each hypothesis. If no figure is stated in the problem and no special reason is given for using one or the other, the general rule is to use 5%.)
    4. State the cutoff F ratios for your situation.
IV. For each hypothesis determine the score of your sample on the comparison distribution. (Step 4 of hypothesis testing.)
  A. Estimate the variation in the populations based on the various ways of grouping the scores.
1.Divide each score's deviation from the overall average into four components (each of which provides a basis for making an estimate of the population variance). a. Its deviation from its cell's average (basis of the estimate of the variation in the population using the variation within each cell). b. Its row's average's deviation from the overall average (basis of the estimate of the variation in the population using the variation among the rows). c. Its column's average's deviation from the overall average (basis of the estimate of the variation in the population using the variation among the columns). d.The remaining deviation of the score from the overall average after subtracting out each of the above (basis of the estimate of the variation in the population using the interactionof row and column effects). 2.The estimate based on the variation of each score from its cell's average is influenced only by variation within each cell. a. It is computed by squaring each such deviation (to eliminate signs) and finding a specialkind of average. b. An ordinary average would slightly underestimate the population variation because a sample is less likely to have extremes from its average than is the population it represents. Thus, instead of dividing the total by the number of scores in each group, one divides by the number of scores in each group minus 1. Chapter Thirteen 3. The estimates based on the variation of each score's row's, column's, or remaining deviation (for the interaction) from the overall average is influenced both by the variation among scores overall plus any systematic variation between rows, columns, or interaction groupings, respectively. a. Each such estimate (row, column, interaction) is computed by squaring each score's deviation of the appropriate type (row, column, interaction) and finding a special kind of average. b. 
In this case, the special kind of average involves dividing the total of the squared deviations by the number of groupings minus one.
B. Compute the ratios of each estimate of population variation based on row, column, or interaction deviations to the within-group variance estimate. State each of these F ratios for your study.
V. Compare the scores obtained in Steps 3 and 4 of the hypothesis testing process to decide whether or not to reject each null hypothesis. (Step 5 of hypothesis testing.)
A. For each of your F ratios, note whether it exceeds the cutoff and draw the appropriate conclusion.
1. Reject the null hypothesis: The variation among the averages of your particular samples (as divided in this way into rows, columns, or row-column combinations) is so great that it seems unlikely that their populations are the same. So they seem to be different and the null hypothesis seems untrue.
2. Fail to reject the null hypothesis: The result is inconclusive. On the one hand, these results were not extreme enough to persuade you that the variation among the averages of your particular samples (as divided in this way into rows, columns, or row-column combinations) was due to the populations being different. On the other hand, it is still possible that they really are different, but because of the people who happened to be selected to be in your samples from the populations, this difference did not show up.
B. Be sure to state your conclusions in terms of your particular measures and situation, so it is clear to a lay person just what the real bottom line of the study is.

Chapter Self-Tests
Multiple-Choice Questions
1. Factorial designs
a. have replaced all other group analyses in modern psychology.
b. are preferred by physiological psychologists, but social psychologists prefer a series of t-tests.
c. construct groupings which include every combination of the levels of the independent variables.
d. yield less accurate results than analysis of variance.
2.
An interaction effect
a. occurs when the combined influence of two independent variables could not be predicted by knowing about the influence of each separately.
b. represents the sum of the individual influences of two independent variables and can be computed from knowing each effect separately.
c. represents the concordance of the dependent variables.
d. represents the lack of concordance of the dependent variables.
3. A 3 X 5 X 2 factorial design has
a. 30 possible main effects.
b. 30 possible interaction effects.
c. one independent variable with three levels, one with five levels, and one with two levels.
d. one independent variable with three levels and one with five levels, and one dependent variable with two levels.
4. To identify whether there are main effects in a factorial design, you look at
a. interaction effects.
b. cell means.
c. column means and row means.
d. t tests.
5. Comparing the pattern of cell means across one row to the pattern of cell means across another row is a method of
a. avoiding lengthy computations.
b. identifying interaction effects.
c. checking on whether assumptions of equal population variances have been met.
d. computing marginal means.
6. A psychologist assigned people to recall words after either a short or long delay, and with or without an incentive (a payment for each word recalled). The mean numbers recalled were as follows:

              Incentive
Delay       Yes     No     Mean
Short         8     12      10
Long          6      2       4
Mean          7      7

Assuming all differences are significant,
a. both the main effects and the interaction effect are significant.
b. both the main effects but not the interaction effect are significant.
c. the main effect for delay and the interaction are significant, but not the main effect for incentive.
d. the main effect for incentive and the interaction are significant, but not the main effect for delay.
7. When the cell means of a 2 X 2 factorial design study are graphed,
a.
a main effect on the variable whose levels are across the horizontal axis is shown by the lines being parallel.
b. a main effect for the variable whose levels are represented by the different lines is shown by the lines crossing.
c. an interaction effect is shown by the average of the two lines having different heights.
d. an interaction effect is shown by the two lines not being parallel.
8. Which of the following deviations is the basis for the between-group estimate for the interaction effect?
a. X - MI.
b. GM - MI - MC.
c. (X - GM) - (MC - GM) - (MR - GM).
d. (X - GM) - (X - M) - (MC - GM) - (MR - GM).
9. The formula for $R_C^2$ is
10. Which of the following situations would produce an incorrect result if you applied the methods of this chapter?
a. A design in which there are unequal numbers of cases in the cells.
b. A design in which there are more rows than columns.
c. A design in which there are more columns than rows.
d. A design in which the number of rows exceeds the number of columns by a factor of four or greater.

Fill-In Questions
1. A study found that a particular style of teaching increased learning for middle class students but decreased learning for lower class students. This finding is an example of a(n) ______ effect.
2. A study on non-U.S. Canadian immigrants considered the effects of region of origin (Asia, Latin America, Europe, the Middle East, or Other) and the reason for immigrating (Political, Economic, or Other) on the subjects' satisfaction with the country. This study is a two-way factorial design, specifically a(n) ______ X ______ factorial design.
3. In a graph of the results of an analysis of variance, the vertical axis shows speed at completing a task on the computer while the horizontal axis represents different levels of experience using the computer. Two lines are drawn on the graph, one for each of two conditions: familiar task and unfamiliar task.
If the lines are ______, then the effect of different levels of experience on speed on the task is the same regardless of whether the person is doing a familiar or unfamiliar task.
4. In the top row of a table of cell means, there was a 10, 12, and 14. In the bottom row, there was a 14, 12, and 10. This pattern indicates that there was a(n) ______.
5. In a two-way analysis of variance, there are three F ratios because there are three different ______-group estimates of the population variance.
6. Using the structural-model approach to compute a two-way analysis of variance requires dividing the deviation of each score's mean from the mean into four components.
8. When calculating for the row effect, it is important to subtract out the proportion of variance accounted for by the column and interaction effects.
9. In a factorial analysis of variance, the power of a main effect is influenced by its effect size, the number of subjects, the number of levels of the variable for that main effect, and the ______.
10. An analysis of variance was conducted in which each subject completed a task 6 times, under each combination of three different temperature conditions and two different task complexities. The analysis of these results would require a ______ analysis of variance.

Problems and Essays
1. (a) Create a study using a 2 X 2 design and fill in a table of cell means so that there are two main effects but no interaction effects. (b) Make a graph showing these results. (c) Explain the pattern of the cell means you created.
2. A school district superintendent wanted to know how well the four high schools in her district were teaching students in regular and gifted classes. She conducted a small study in which she randomly selected students from the two kinds of classes at each of the four schools and asked them to take a standardized performance test. (a) Create a table showing all the cells for such a study and make up means for the cells so that an interaction effect exists.
(b) Make a graph showing these results. (c) Interpret the pattern of the cell means.
3. Assuming all differences are significant, for the following outcome (a) make a graph, (b) determine marginal means, and (c) indicate which of the effects (main and interaction) are significant.

                                  Levels of Independent Variable I
                                    A      B      C
Levels of Independent    SHORT      6      6      6
Variable II              LONG       4      6      8

4. A group of first-year students at a small private college were asked to rate how important school was to the rest of their lives. The subjects were selected to include three who came from each combination of low and high income parents and three different ethnic groups. The data are below. (a) Analyze these data using a factorial analysis of variance, testing the Ethnicity main effect at the .05 significance level and the Parent's Income main effect and the interaction effect at the .01 significance level. (Be sure to include the effect size.) (b) Explain your conclusions and procedures to a person who has never had a course in statistics.

Importance of College to First-Year College Students of Varying Ethnicity and Levels of Parent's Income

               Parent's Income
Ethnicity      Low     High
Group A          6       2
                 5       3
                 5       2
Group B          3       5
                 2       6
                 4       6
Group C          2       6
                 1       9
                 1       8

5. A study examined the effects of being under-, over-, or normal weight and living on the west or east coast on subjects' comfort with their body weight. A two-way analysis of variance found a main effect for region, F(1, 28) = 5.91, p < .05, but not for body weight, F(2, 28) = .41, p > .05. In addition, an interaction effect was found, F(2, 28) = 8.59, p < .01. (a) Compute the proportion of variance accounted for by each of the effects. (b) Explain the meaning of these results to a person who has never had a course in statistics.
6.
A study reports the following results: A two-way analysis of variance was conducted to determine the pattern of influence of being an early or late riser and the amount of TV watched (high, medium, or low) on the degree to which the subjects believed the world was a safe place. Assumptions of normality and homogeneity of variance appeared to have been met. Analyses indicated that there was a significant main effect in the amount of TV watched, F(2, 24) = 53.49, p < .001, but not in the waking patterns, F(1, 24) = 2.93, ns (see table for cell means). However, a significant interaction effect was found, F(2, 24) = 10.28, p < .001, such that....

Subjects' Mean Perceptions of the Degree to Which the World Is a Safe Place as a Function of Amount of TV Watched and Being an Early or a Late Riser

          Waking Pattern
TV        Early     Late
low       20.2      35.8
med       33.4      38.2
high      59.8      50.8

(a) Make a graph showing the results, (b) explain the meaning of the pattern of cell means (including marginal means which you have to calculate) and the significant results, and (c) discuss the interpretation of non-significant results.

Chapter 14
Chi-Square Tests

Learning Objectives
To understand, including being able to conduct any necessary computations:
• Categorical or nominal variables.
• Chi-square statistic.
• Expected frequencies in a chi-square test of goodness of fit.
• Chi-square distribution and using a chi-square table.
• Degrees of freedom for the chi-square distribution for the chi-square test of goodness of fit.
• The steps in conducting a chi-square test of goodness of fit.
• Contingency tables.
• Expected frequencies for a contingency table in a chi-square test of independence.
• Degrees of freedom for the chi-square distribution for the chi-square test of independence.
• The steps in conducting a chi-square test of independence.
• Assumptions for chi-square tests.
• Effect size ($\phi$ and Cramer's $\phi$) for a chi-square test of independence.
• Power and needed sample size for a study using a chi-square test of independence.
• Issues regarding minimum sample size for a chi-square test.
• How results of studies using chi-square tests are reported in research articles.

Chapter Outline
I. Categorical (Nominal) Variables
II. Chi-Square Test for Goodness of Fit
A. Tests the probability that a distribution of observed frequencies in various categories could have arisen from a population with a hypothesized distribution of frequencies in these categories.
B. The chi-square statistic reflects the degree of divergence between these observed and expected frequencies.
1. Discrepancy in a category is observed frequency minus expected (based on the proportion of the total observed sample that would be expected to be in this category given the hypothesized population distribution).
2. The discrepancy in each category is squared, in part to eliminate the problem of signs.
3. The squared discrepancy in each category is divided by the expected frequency to keep these discrepancies in proportion to the number of cases that would have been expected.
4. The chi-square statistic is the sum, over the categories, of each squared discrepancy divided by its expected frequency.
C. If samples are randomly taken from a population and a chi-square statistic computed on each, these chi-squares follow a mathematically defined distribution (the chi-square distribution).
1. The distribution is skewed with the long tail to the right.
2. The distribution's exact shape depends on the degrees of freedom.
3. Table B-4 (in Appendix B of the text) gives the cutoff chi-squares for various significance levels and degrees of freedom.
D. In a chi-square test for goodness of fit, the degrees of freedom are the number of categories minus 1.
E. The steps of hypothesis testing are otherwise the same as we have been using all along.
III. Chi-Square Test for Independence
A.
Tests the probability that a sample in which people are measured on two categories could have come from a population in which the distribution of frequencies over categories on one variable is independent of the distribution of frequencies over categories on a second variable.
B. Data are usually displayed on a contingency table, a two-dimensional breakdown in which the columns represent the categories of one variable, the rows represent the categories of the other variable, and the number in each cell is the frequency for the combination of categories that cell represents.
C. If the two variables are independent, then the expected frequency for a given cell is the proportion that its row's observed frequency is of the total observed frequency, times the observed frequency for its column. (That is, if the two variables are independent, then the distribution of the frequencies among the cells in any particular column should be in proportion to the distribution of frequencies among the rows overall.)
D. The chi-square statistic is computed using the observed and expected frequencies in each cell.
E. The degrees of freedom for a chi-square test for independence are the number of rows minus 1 times the number of columns minus 1. (This is the number of cells whose frequency data are needed in order to fill in all the other cells' frequency data, assuming the column and row frequencies are known.)
F. The steps of hypothesis testing are otherwise the same as we have been using all along.
IV. Assumption for Chi-Square Tests
A. Chi-square tests are not limited by the kinds of assumptions required for the t test and analysis of variance.
B. They do require that each observed case be independent of all other cases.
V. Effect Size and Power for Chi-Square Tests for Independence
A. Estimated effect size for a test of a 2 X 2 contingency table ($\phi$).
1. $\phi = \sqrt{\chi^2 / N}$.
2. $\phi$ has the same range, meaning, and conventions for small, medium, and large effect sizes as the correlation coefficient (r), but is always positive.
B. Estimated effect size for a test of a contingency table larger than 2 X 2 (Cramer's $\phi$).
1. Cramer's $\phi = \sqrt{\chi^2 / [(N)(df_S)]}$, where $df_S$ is the degrees of freedom corresponding to the smaller dimension of the contingency table.
2. Cramer's $\phi$ is interpreted approximately as a correlation coefficient, but the corresponding conventions for small, medium, and large effect sizes depend on the value of $df_S$ (see Table 14-11 in the text).
C. Power:
1. Main determinants of power are effect size, sample size, and degrees of freedom.
2. Table 14-12 in the text gives approximate power for various sample sizes using the .05 significance level for small, medium, or large effect sizes, over 1 through 4 degrees of freedom.
D. Planning sample size: Table 14-13 in the text gives the approximate number of subjects needed to achieve 80% power for estimated small, medium, and large effect sizes for 1 through 4 degrees of freedom using the .05 significance level.
VI. Controversy: Minimum Expected Frequency
A. Traditionally statisticians recommended a minimum expected frequency of 5 in any cell or category.
B. Recent work suggests that from the point of view of protecting against Type I error, even an expected frequency of 1 in a cell or category is acceptable, so long as the total number of subjects is at least five times the number of cells or categories.
C. However, low expected cell frequencies can substantially reduce power.
VII. Chi-Square Tests as Reported in Research Articles
A. All observed frequencies are usually reported.
B. Chi-square test results follow a standard format, for example, "$\chi^2(2, N = 531) = 28.35, p < .001$."

Formulas
I. Chi-square statistic ($\chi^2$)
Formula in words: Sum, over all the categories or cells, of the squared difference between observed and expected frequencies divided by the expected frequency.
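The outline above describes the chi-square statistic, the expected frequencies for a contingency table, the degrees of freedom, and the effect size in words. As a minimal sketch (not taken from the text; the function names and the 2 X 2 table of frequencies are made up for illustration), these computations might look like this in Python:

```python
import math


def expected_frequencies(observed):
    """Expected cell counts under independence: (row total / N) * column total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[(r / n) * c for c in col_totals] for r in row_totals]


def chi_square(observed, expected):
    """Sum over categories or cells of (O - E)^2 / E; both arguments are flat lists."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def independence_test(observed):
    """Return (chi-square, degrees of freedom, effect size) for a contingency table."""
    expected = expected_frequencies(observed)
    flat_o = [o for row in observed for o in row]
    flat_e = [e for row in expected for e in row]
    chi2 = chi_square(flat_o, flat_e)
    n_rows, n_cols = len(observed), len(observed[0])
    df = (n_rows - 1) * (n_cols - 1)
    # df for the smaller dimension; equals 1 for a 2 X 2 table, so the
    # same expression gives phi (2 X 2) or Cramer's phi (larger tables).
    df_smaller = min(n_rows, n_cols) - 1
    effect = math.sqrt(chi2 / (sum(flat_o) * df_smaller))
    return chi2, df, effect


# Hypothetical observed frequencies (illustration only, not data from the text).
table = [[30, 10],
         [10, 30]]
chi2, df, phi = independence_test(table)
print(chi2, df, phi)

# Goodness of fit with made-up counts: expected frequencies are supplied directly.
gof = chi_square([20, 30], [25, 25])
print(gof)
```

With this made-up table every expected frequency is 20, so each cell contributes 100/20 = 5 and the chi-square is 20 with 1 degree of freedom; the effect size is the square root of 20/80, or .5. The cutoff itself would still be looked up in a chi-square table, as the outline describes.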
Formula in symbols: $\chi^2 = \sum \frac{(O - E)^2}{E}$ (14-1)
O is the observed frequency for a category or cell.
E is the expected frequency for a category or cell.
II. Expected frequency of a cell in a contingency table (E)
Formula in words: The proportion that a cell's row's observed frequency is of the total observed frequency, times the observed frequency for its column.
Formula in symbols: $E = \left(\frac{R}{N}\right)(C)$ (14-2)
R is the number of cases observed in this cell's row.
N is the number of cases total.
C is the number of cases observed in this cell's column.
III. Degrees of freedom (df) for a chi-square test for independence
Formula in words: The number of columns minus one, times the number of rows minus one.
Formula in symbols: $df = (N_C - 1)(N_R - 1)$ (14-3)
$N_C$ is the number of columns.
$N_R$ is the number of rows.
IV. Estimated effect size for a test of a 2 X 2 contingency table ($\phi$).
Formula in words: Square root of the result of dividing the computed chi-square statistic by the number of cases in the sample.
Formula in symbols: $\phi = \sqrt{\chi^2 / N}$ (14-4)
V. Estimated effect size for a test of a contingency table larger than 2 X 2 (Cramer's $\phi$).
Formula in words: Square root of the result of dividing the computed chi-square statistic by the product of the number of cases in the sample times the degrees of freedom in the smaller dimension of the contingency table.
Formula in symbols: Cramer's $\phi = \sqrt{\chi^2 / [(N)(df_S)]}$ (14-5)
$df_S$ is the degrees of freedom corresponding to the smaller dimension of the contingency table.

How to Conduct a Chi-Square Test for Goodness of Fit
I. Reframe the question into a research hypothesis and a null hypothesis about populations.
A. Populations.
1. Population 1 are people like those in the study.
2. Population 2 are people who have the hypothesized distribution over categories.
B. Hypotheses.
1. Research: Two populations have different distributions of cases over categories.
2. Null: Two populations have the same distributions of cases over categories.
II.
Determine the characteristics of the comparison distribution:
A. The comparison distribution will be a chi-square distribution.
B. Its degrees of freedom are the number of categories minus 1.
III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.
A. Determine the desired significance level.
B. Look up the appropriate cutoff on a chi-square table, using the degrees of freedom calculated above.
IV. Determine the score of your sample on the comparison distribution. (This will be a chi-square statistic.)
A. Determine the actual, observed frequencies in each category.
B. Determine the expected frequencies in each category (multiply the proportion this category is expected to have times the total number of cases in the sample).
C. In each category compute observed minus expected and square this difference.
D. Divide each squared difference by the expected frequency for its category.
E. Add up the results of Step D over the different categories.
V. Compare the scores in III and IV to decide whether or not to reject the null hypothesis.

How to Conduct a Chi-Square Test for Independence
I. Reframe the question into a research hypothesis and a null hypothesis about populations.
A. Populations.
1. Population 1 are people like those in the study.
2. Population 2 are people whose distribution of cases over categories on the first variable is independent of the distribution of cases over categories for the second variable.
B. Hypotheses.
1. Research: Two populations are different.
2. Null: Two populations are the same.
II. Determine the characteristics of the comparison distribution.
A. The comparison distribution will be a chi-square distribution.
B. Its degrees of freedom are the number of rows (the number of categories in one of the variables) minus 1 times the number of columns (the number of categories in the other variable) minus 1.
III.
Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.
A. Determine the desired significance level.
B. Look up the appropriate cutoff on a chi-square table, using the degrees of freedom calculated above.
IV. Determine the score of your sample on the comparison distribution. (This will be a chi-square statistic.)
A. Set up a two-dimensional contingency table, placing the observed frequencies in each cell.
B. Determine the expected frequencies in each cell.
1. Find the marginal totals for each column and row.
2. Find the overall total.
3. For each cell multiply the proportion of cases its row represents of the total (the row total divided by the overall total) times the number of cases in its column.
C. For each cell compute observed minus expected and square this difference.
D. Divide each squared difference by the expected frequency for its cell.
E. Add up the results of Step D over all the cells.
V. Compare the scores in III and IV to decide whether or not to reject the null hypothesis.

Outline for Writing Essays on the Logic and Computations for Conducting Chi-Square Tests
The reason for your writing essay questions in the practice problems and tests is that this task develops and then demonstrates what matters so very much: your comprehension of the logic behind the computations. (It is also a place where those better at words than numbers can shine, and for those better at numbers to develop their skills at explaining in words.)
Thus, to do well, be sure to do the following in each essay: (a) give the reasoning behind each step; (b) relate each step to the specifics of the content of the particular study you are analyzing; (c) state the various formulas in nontechnical language, because as you define each term you show you understand it (although once you have defined it in nontechnical language, you can use it from then on in the essay); (d) look back and be absolutely certain that you made it clear just why that formula or procedure was applied and why it is the way it is.
The outlines below are examples of ways to structure your essays. There are other completely correct ways to go about it. And this is an outline for an answer; you are to write the answer out in paragraph form. Examples of full essays are in the answers to Set I Practice Problems in the back of the text.
These essays are necessarily very long for you to write (and for others to grade). But this is the very best way to be sure you understand everything thoroughly. One short cut you may see on a test is that you may be asked to write your answer for someone who understands statistics up to the point of the new material you are studying. You can choose to take the same short cut in these practice problems (maybe writing for someone who understands right up to whatever point you yourself start being just a little unclear). But every time you write for a person who has never had statistics at all, you review the logic behind the entire course. You engrain it in your mind. Over and over. The time is never wasted. It is an excellent way to study.

Chi-Square Test for Goodness of Fit
I. Reframe the question into a null and a research hypothesis about populations.
A. Introduce the situation.
1. State the given (observed) distribution of cases over categories.
2. State the expected proportional distribution of cases over categories.
3. Note that there is a discrepancy.
B.
State in ordinary language the hypothesis testing issue.
1. Is this discrepancy so large that we can reject the hypothesis that our observed cases (our sample) represent a world in which the expected distribution is true generally (that is, in the population)?
2. Thus, you construct a scenario (the null hypothesis) in which the observed distribution is a random sample from a population like that which is expected and see how likely it is under this scenario that just by chance you could have obtained a sample with a discrepancy as large as you actually have.
II. Determine the characteristics of the comparison distribution.
A. This step involves figuring out the probabilities of getting different degrees of discrepancy by chance (assuming the null hypothesis is true).
B. The distribution of chance discrepancies (assuming the null hypothesis is true) for a particular way of measuring discrepancy (called chi-square) is called a chi-square distribution.
C. The chi-square distribution is mathematically defined and depends only on the number of categories involved (technically, on the number of categories minus 1).
III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.
A. In this step one figures out how large an actual discrepancy would have to be in order to decide that the probability of getting such a discrepancy under the null hypothesis is so low that this whole scenario of the null hypothesis being true could be confidently rejected.
B. There are standard tables that indicate the size of these discrepancies associated with various low probabilities.
C. Just how unlikely would a discrepancy have to be to decide the whole scenario on which these tables are based (the scenario of no difference) should be rejected? The standard figure used in psychology is less likely than 5% (though 1% is sometimes used to be especially safe).
(Say which you are using in your study; if no figure is stated in the problem and no special reason given for using one or the other, the general rule is to use 5%.)
D. Once this decision is made, the cutoff level of discrepancy can be determined from the table. (State the level for your situation.)
IV. Determine the score of your sample on the comparison distribution.
A. Compute the degree of discrepancy for your actual situation.
B. This requires computing a number called a chi-square, which reflects the degree of divergence between observed and expected frequencies over the categories. It is computed in four steps (which you should describe for your example).
1. For each category find the discrepancy between observed and expected in terms of actual scores. (That is, for the expected, multiply the proportion expected times the number of cases in your sample.)
2. Square this discrepancy. (This eliminates the problem of some discrepancies being positive and some negative.)
3. Divide the squared discrepancy in each category by the expected frequency. (This keeps these discrepancies in proportion to the number of cases that would have been expected.)
4. The chi-square statistic is the sum, over the categories, of each squared discrepancy divided by its expected frequency.
V. Compare the scores in Steps III and IV to decide whether or not to reject the null hypothesis.
A. Note whether or not your chi-square exceeds the cutoff and draw the appropriate conclusion. Either:
1. Reject the null hypothesis: The distribution of cases over categories is so discrepant from what you would expect if your sample represents a population with a distribution like that hypothesized that you reject this scenario.
2. Fail to reject the null hypothesis: The result is inconclusive. On the one hand, these results were not extreme enough to persuade you that the discrepancy was not due to chance.
On the other hand, it is still possible that your sample really does represent a population whose proportional distribution over categories is different from what was expected, but because of the people who happened to be selected to be in your sample from the population, this discrepancy did not show up strongly enough.
B. Be sure to state your conclusion in terms of your particular measures and situation, so it is clear to a lay person just what the real bottom line of the study is.

Chi-Square Test for Independence
I. Reframe the question into a research hypothesis and a null hypothesis about populations.
A. Introduce the situation.
1. Make (if it is not given) a contingency table of the observed data and describe it.
2. Explain the notion of independence: That the distribution of cases over categories on one variable is unrelated to the distribution of cases over categories on the other variable.
3. Compute the expected frequencies for the cells in your contingency table under the assumption of independence. (The expected number in each cell is the number of cases in its column multiplied by the proportion of the overall total that its row represents.)
4. Note the discrepancy between observed and expected.
B. State in ordinary language the hypothesis testing issue.
1. The hypothesis testing question is whether this discrepancy is so large that we can reject the hypothesis that our observed cases (our sample) represent a world in which the expected distribution of independence is generally true (that is, in the population).
2. Thus, you construct a scenario (the null hypothesis) in which the observed distribution is a random sample from a population in which the distributions of cases over categories are independent and see how likely it is that you could have gotten a discrepancy as large as you actually have under this scenario just by chance.
II. Determine the characteristics of the comparison distribution.
A.
This step involves figuring out the probabilities of getting different degrees of discrepancy by chance under the null hypothesis.
B. The distribution of chance discrepancies under the null hypothesis for a particular way of measuring discrepancy (called chi-square) is called a chi-square distribution.
C. The chi-square distribution is mathematically defined and depends only on the number of expected cell frequencies that are free to take on any possible value once the overall numbers in each row and column are set. (This is figured out by multiplying the number of rows minus 1 times the number of columns minus 1.)
III. Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected.
A. In this step one figures out how large an actual discrepancy would have to be in order to decide that the probability of getting such a discrepancy under the null hypothesis is so low that this whole scenario of the null hypothesis being true could be confidently rejected.
B. There are standard tables that indicate the size of these discrepancies associated with various low probabilities.
C. Just how unlikely would a discrepancy have to be to decide the whole scenario on which these tables are based (the scenario of no difference) should be rejected? The standard figure used in psychology is less likely than 5% (though 1% is sometimes used to be especially safe). (Say which you are using in your study; if no figure is stated in the problem and no special reason given for using one or the other, the general rule is to use 5%.)
D. Once this decision is made, the cutoff level of discrepancy can be determined from the table. (State the level for your situation.)
IV. Determine the score of your sample on the comparison distribution.
A. Compute the degree of discrepancy for your actual situation.
B.
This requires computing a number called a chi-square, which reflects the degree of divergence between observed and expected frequencies over the categories. It is computed in four steps (describe these steps in terms of your example). 1.For each cell find the discrepancy between the number observed and expected. 2. Square this discrepancy (this eliminates the problem of some discrepancies being positive and some negative). 3.Divide the discrepancy in each cell by the expected frequency. (This keeps these discrepanciesin proportionto the number of casesthat would have been expected.) 4.The chi-square statistic is the sum, over the categories, of each squared discrepancy divided by its expectedfrequency. V. Compare the scores in Steps I11 and IV to decidewhether or not to reject the null hypothesis. A. Note whether or not your chi-square exceeds the cutoff and draw the appropriate conclusion. Either: 1.Reject the null hypothesis: The distributions of cases over categories in the two variables is so discrepant from what you would expect if your sample represents a population in which the distributions over the two variables are unrelated to each other that you reject this scenario. 2.Fail to reject the null hypothesis: The result is inconclusive. On the one hand, these results were not extreme enough to persuade you that the discrepancy from the two variables being unrelated was due to chance. On the other hand, it is still possible that your sample really does represent a population in which the variables are related, but because of the people who happened to be selected to be in your samples from the population, this discrepancydid not show up stronglyenough. B. Be sure to state your conclusion in terms of your particular measures and situation, so it is clear to a lay person just what the real bottom line of the study is. Chapter Fourteen VI. Compute effect size and evaluate any nonsignificant results in terms of power. A. 
The degree of association between the distributions of cases over categories for the two variables can be indexed by a number that ranges from 0 (no association) to 1(perfect association). B. This number is the square root of the result of dividing the computed chi- square by the number of cases (or if greater than a 2 X 2 table, the division is by the number of cases times one less than the number of rows or columns in the smaller side of the table). C. Give the result for your study and compare it to Cohen's conventions for small, medium and large effect sizes, discussing it as an indication of the degree of association in relation to what is typical in psychology research. D. If you get a nonsignificant result, computepower (using the table). 1.Explain concept of power as probability of deciding that the population distributions are not independenton the basis of a study with this many subjects, given that there is a true associationof a given size in the population. 2. Compute power twice, once for a small and once for a large effect size in the population and discuss implications for the likelihood of there actually being an effect of a small and large size in the population. Chapter Self-Tests Multiple-ChoiceQuestions 1. A variable such as a person's nationalityis usually considered to be a.rank-order. b. quantitative. c. nominal. d.fractional. 2. The chi-square statistic is the sum, over all categories or cells, of the following calculation made within each category or cell: a.the difference between the squaredexpected frequency and the squared observedfrequency. b.the product of the expected frequencytimes the total number of cases observed in all categories or cells. c.the squared difference between the observed and expected frequency, divided by the expected frequency. d.the difference between observed and expected frequencies,divided by the expected frequency. 316 Chapter Fourteen 3. 
In a chi-square test for goodness of fit, the research hypothesis is that a,the population distribution of means fits the expected distribution of means. b.the population distribution of means is different between the two populations. c. the distribution of cases over categories differs between Population 1and Population 2. d.the distribution of cases over categories is the same for Population 1 and Population 2. 4. "Independence" in the chi-squaretest for independencerefers to a situation in which a. knowing a score's category on one variable gives no information about its category on the other variable. b. observed frequencies equal twice the expected frequencies. I c. the independent variable is truly causal and not merely predictive. d. if any relation exists between the two variables, either could be the cause of the other, but there are no third variables that might explain this relation. 5. A formula for determining the number of cases expected in any one cell of a chi-square contingency I table is 6. The number of degrees of freedom in a chi-square test for independence is computed from a formula that requires knowing only a,the number of subjects. b.the number of subjects and the number of cells. c. the number of rows and the number of columns. d.the number of rows, columns, and subjects. 7. The one important assumption for the types of chi-squaretests you learned in this book is that a. the expected frequency of cases must not be larger than the observed frequency. b.the populations distributionsmust be negatively skewed (or at least not positively skewed). c. each observed case must be unrelated to all others (that is, no two scores should come from the same subject). d.populations of each variable must be distributednormally. 8. The degree of association of a chi-square test of independence for a 2 X 2 contingency table is indicated by a.J: b. d(x2/7V). c. Cronbach's T. d.RIN. Chapter Fourteen 9. 
Power in a chi-squaretest for independence depends on all of the followingEXCEPT a. effect size. b.whether a one- or two-tailed test is used. c. sample size. d.degrees of freedom. 10.In a chi-square test of independence, it is now generally thought that it is acceptable to have a low expected frequency (below 5) in a cell provided that a.the expected frequenciesare all positive. b.the total number of subjectsis at least five times the number of cells. c.there are more subjectsthan cells. d.there are at least 30 subjects. Fill-In Questions 1. inventedthe chi-square test. 2. You have taken a random sample of people at your college and asked each which of three fast-food chains he or she prefers. There is a tendency for one chain to be picked more often. To examine whether this preference would be likely to hold in the general population which this sample represents, you would conduct 3. The formula for computingchi-square is C[(O-E)ZI 1. 4. In the chi-squaretest for goodness of fit, the degrees of freedom are 5. If there is no relationship between the variables in a contingency table, they are said to be of each other. 6. In a chi-square test for independence,the expected frequency of a particular cell equals the percent of total observed cases in the cell's times the number of observed cases in the cell's column. 7. Becausethe chi-squaretest does not require that the parent populationsbe ,it is called a "nonparametric"or "distribution-free"test. 8. Completethe following formula: Cramer's 4 = Chapter Fourteen 9. The power of a study testing hypotheses using the chi-square test of independence is determined by significancelevel, degrees of freedom, effect size and 10.An organizationalpsychologistconsultingto a hotel chain conducteda survey of people's preference for smoking or nonsmoking rooms and whether the person was on business or nonbusiness travel. 
The psychologist's report included the following table:

   Observed (and Expected) Frequencies for Room Preference and Travel Type for 80 Surveyed Patrons

                       Room Preference
   Travel          Smoking     Non         Total    Percent
   Business        30 (33)     30 (27)     60       75
   Nonbusiness     14 (11)      6 (9)      20       25
   Total           44          36          80

   a. Of business travelers, ________% preferred smoking rooms.
   b. Of the nonbusiness travelers, ________% preferred smoking rooms.
   c. The phi coefficient for this study was ________.

Problems and Essays

1. Design a five-category study to be analyzed by the chi-square test for goodness of fit. (a) Make up data, (b) analyze it (use the .05 significance level), and (c) describe the underlying logic of the analysis and interpret your "findings" to a person who has never had a course in statistics.

2. An industrial psychologist working for a particular manufacturing company was concerned that one of its four plants might have consistently more worker grievances than the others. Records were kept for a month, during which Plant A had 15 grievances; Plant B, 34; Plant C, 17; and Plant D, 18. (a) Do these data (which are fictional) suggest that the four plants are different in how many grievances arise? (Use the .05 significance level.) (b) Explain your analysis to a person who has never had a course in statistics.

3. A survey is conducted of the weight of newborns (recorded as below average, average, and above average) and depression of mothers during pregnancy (rated as severe, mild, or not depressed). (a) Do the data (which are fictional) in the following table suggest birthweight is related to mother's depression during pregnancy? (Use the .05 significance level.) (b) Compute the effect size and indicate whether it is large, medium, or small. (c) Explain your analyses to a person who has never had a course in statistics.

                      Level of Mother's Depression During Pregnancy
   Birthweight        Severe     Mild     Not Depressed
   below average        8          5           1
   average              3          8          12
   above average        9          7           7

4. Graduate students in the humanities, social sciences, and natural sciences were surveyed and found to listen to three different types of music when they studied. The relation of field of study to type of music yielded a significant association, χ²(4, N = 60) = 14.32, p < .05. (a) Compute a measure of association and indicate whether it is a large, medium, or small effect. (b) Explain the results of the study (including the effect-size result) to a person who has never had a course in statistics.

Using SPSS/PC+ Studentware Plus with this Chapter

If you are using SPSS for the first time, before proceeding with the material in this section read the Appendix on Getting Started and the Basics of Using SPSS/PC+ Studentware Plus.

You can use SPSS to carry out a chi-square test for independence and the associated measure of effect size (φ or Cramer's φ). You should work through the example, following the procedures step by step. Then look over the description of the general principles involved and try the procedures on your own for some of the problems listed in the Suggestions for Additional Practice. Finally, you may want to try the suggestions for using the computer to deepen your understanding.

Chapter 15
Strategies When Population Distributions Are Not Normal: Data Transformations, Rank-Order Tests, and Computer-Intensive Methods

Learning Objectives

To understand, including being able to conduct any necessary computations:
■ Assumptions for the major parametric statistical tests.
■ Implications of violating assumptions.
■ Recognizing violations of the normality assumption.
■ Rationale of data transformations.
■ Procedures of major data transformations.
■ Rationale of rank-order tests.
■ Applications of rank-order tests.
■ Rank-order transformations followed by standard parametric tests.
■ Power of an experiment using a one-way analysis of variance.
■ Equal-interval and ordinal levels of measurement.
■ Randomization tests and approximate randomization tests.
■ Relative advantages and disadvantages of data transformations, rank-order tests, and computer-intensive methods.

Chapter Outline

I. Assumptions in the Standard Hypothesis-Testing Procedures
   A. Most require meeting two assumptions.
      1. Populations have normal distributions.
      2. Populations have equal variances.
   B. Violation of assumptions can increase or decrease Type I and Type II error, often in unpredictable ways.
   C. Recognizing violations of assumptions.
      1. Difficult to do with only sample data.
      2. Extreme skewness, kurtosis, or outliers in samples suggest the population distribution is not normal.

II. Data Transformations
   A. Application of some regular mathematical procedure to each score (such as taking the square root of each).
   B. Done to make scores in the sample follow a normal distribution in the hope that they will then represent a population that is normally distributed.
   C. Justified when the underlying meaning of the intervals between scores is arbitrary.
   D. Major types of transformations:
      1. Square root.
      2. Log.
      3. Inverse.
   E. Transformations do not change the order of scores.
   F. Carried out by trial and error until a transformation creates a distribution in the sample that appears normal.
   G. After the transformation, the ordinary parametric test is applied.

III. Rank-Order Methods
   A. Involve converting scores to ranks.
   B. Distribution-free and nonparametric.
   C. Rank-order tests are available that correspond to each major parametric method (see Table 15-7 in text).
   D. Rank-order tests operate based on a known distribution of any set of ranks (rectangular) and involve precise computations of the probability of getting a pattern of ranks as extreme as that observed in the study.
   E. Recently some statisticians have recommended applying ordinary parametric tests after making a rank-order transformation.
   F. Rank-order tests are particularly appropriate with data measured at the rank-order (ordinal) level.
      1. Rank-order measurement has less information than the standard equal-interval measurement.
      2. Some psychologists argue that our typical measures are not truly equal-interval, so that we should always transform to ranks.

IV. Computer-Intensive Methods
   A. Randomization test.
      1. Computes the probability that a particular organization of the data (such as the division of scores into an experimental and a control group) represents a difference or association that is very unlikely in light of all possible organizations of the data.
      2. Procedure.
         a. Compute the difference or association for each possible organization of the data.
         b. Rank-order the outcomes.
         c. Locate your actual difference or association and determine if it is in the top 5% (or 1%).
      3. Not practical for situations with sample sizes of the magnitude used in much psychology research.
   B. An approximate randomization test is practical. It computes differences or associations for a randomly selected large number (say, 1000) of the possible organizations of the data and bases the probability on these results.

V. Comparison of Methods
   A. Data transformation.
      1. Advantage is that it permits the use of familiar and sophisticated parametric techniques.
      2. Disadvantages.
         a. No transformation may work to make the data meet assumptions.
         b. May distort meaning of scores.
   B. Rank-order tests.
      1. Advantages.
         a. Can always be applied.
         b. Particularly suitable for rank-order data.
         c. Underlying logic is very simple and direct.
      2. Disadvantages.
         a. Less commonly understood than standard parametric methods.
         b. In many complex situations no standard rank-order methods have been developed.
         c. May distort meaning of scores.
      3. In the past, ease of computation by hand was an advantage.
   C. Computer-intensive methods.
      1. Advantages.
         a. Underlying logic is very simple and direct.
         b. Can be applied to almost any situation, even ones for which no standard test has been invented.
      2. Disadvantages.
         a. New, so that the cautions and limitations are not well worked out.
         b. New, so that standard statistical software packages do not include them.
   D. The relative advantages and disadvantages of these procedures in terms of power are not known for the circumstances in which they are most likely to be applied.

VI. Procedures Used When Populations Appear Nonnormal, as Described in Research Articles
   A. Data transformations are typically described at the start of the Results section, with a description of the distribution of the data that were transformed.
   B. Rank-order tests are described in the same way as other tests, often giving a Z statistic (for the normal approximation of the distribution of the underlying rank-order statistic).
   C. Computer-intensive methods are not yet common in journal articles, so that when they do appear they are described in considerable detail.

How to Conduct Hypothesis Tests When Populations Appear Nonnormal

I. Examine the data to see if the distributions suggest a nonnormal population distribution, then decide which method to use.
   A. If data transformation does not distort the underlying meaning of the data and there is a transformation that makes the data meet the assumptions, this method is appropriate.
   B. If the data are rank-order scores or are most reasonably considered as ranks, and if a rank-order test is available, this method is particularly appropriate.
   C. If the data would be distorted by transformation or by putting them in rank order, if assumptions cannot be met in these ways, or if no existing statistical test applies to the situation, then computer-intensive methods are appropriate.

II. Carry out one of the three methods:
   A. Data transformation.
      1. Examine the sample data to estimate the kind and degree of nonnormal shape of the population distribution.
      2. Apply the transformation that seems to offer the best correction.
         a. Square root for a moderately skewed distribution.
         b. Log for a highly skewed distribution.
         c. Inverse for a very highly skewed distribution.
      3. Examine the transformed sample distribution; if it still clearly suggests a nonnormal population distribution, try a different transformation.
      4. Once an appropriate distribution has been created, carry out a standard parametric hypothesis test using the transformed scores.
   B. Rank-order test.
      1. Transform all scores to ranks, ignoring which group the subject is in (however, if computing a correlation, rank each variable's scores separately), giving average ranks for ties.
      2. Carry out the hypothesis test in one of these ways.
         a. Use one of the standard nonparametric tests (for which you have not learned the procedures in this text).
         b. Carry out a standard parametric hypothesis test using the ranks instead of scores.
   C. Computer-intensive method (a randomization test, applied when there are a small number of subjects).
      1. Create all possible divisions of the scores into groups of the sizes in your sample (if a comparison of means of groups) or create all possible ways of matching up the scores represented by the two variables (if a correlation).
      2. Compute the mean difference or correlation for each of the possible combinations.
      3. Rank-order the results.
      4. Determine whether the result for the particular combination that represents your sample is in the most extreme 5% (or 1%) of all the results.

Outline for Writing Essays on the Logic and Computations for Conducting Hypothesis Tests When Populations Appear Nonnormal

The reason for your writing essay questions in the practice problems and tests is that this task develops and then demonstrates what matters so very much: your comprehension of the logic behind the computations. (It is also a place where those better at words than numbers can shine, and for those better at numbers to develop their skills at explaining in words.)
Thus, to do well, be sure to do the following in each essay: (a) give the reasoning behind each step; (b) relate each step to the specifics of the content of the particular study you are analyzing; (c) state the various formulas in nontechnical language, because as you define each term you show you understand it (although once you have defined it in nontechnical language, you can use it from then on in the essay); (d) look back and be absolutely certain that you made it clear just why that formula or procedure was applied and why it is the way it is.

The outlines below are examples of ways to structure your essays. There are other completely correct ways to go about it. And this is an outline for an answer; you are to write the answer out in paragraph form. Examples of full essays are in the answers to Set I Practice Problems in the back of the text.

These essays are necessarily very long for you to write (and for others to grade). But this is the very best way to be sure you understand everything thoroughly. One shortcut you may see on a test is that you may be asked to write your answer for someone who understands statistics up to the point of the new material you are studying. You can choose to take the same shortcut in these practice problems (maybe writing for someone who understands right up to whatever point you yourself start being just a little unclear). But every time you write for a person who has never had statistics at all, you review the logic behind the entire course. You engrain it in your mind. Over and over. The time is never wasted. It is an excellent way to study.

I. Explain that ordinarily one could use a standard statistical procedure to resolve the issue (test the hypothesis raised by the essay question).
   A. Name the procedure that would be appropriate (such as a t test or analysis of variance).
   B. However, explain that these standard procedures require that certain conditions be met to use them.
   C. One of these conditions is that the distribution of scores in the larger groups (populations) that your data are supposed to represent must follow a bell-shaped pattern known as a normal curve.
   D. However, in the data at hand, the scores do not seem to come from populations distributed in the shape of a normal curve. (Explain what leads you to this conclusion.)
   E. Thus one of several alternatives has to be used.

II. If data transformation is selected, explain it.
   A. The purpose is to make the data more likely to be representative of a normally distributed population.
   B. Data transformation is acceptable when the underlying meaning of the intervals between scores is arbitrary and the transformation does not change the order of the scores.
   C. Deciding which transformation to use depends on which will make the sample data most closely follow a normal curve; summarize the trial-and-error process you carried out in working the problem.
   D. Once the data are transformed, one can carry out the normal hypothesis test.
   E. Describe the logic and computations in the steps of the appropriate hypothesis-testing procedure you apply.

III. If a rank-order method is selected, explain it.
   A. This method is used in three situations.
      1. The data suggest a nonnormal population, and transforming to ranks creates a situation with a known distribution. This is acceptable when the underlying meaning of the values of the variable is arbitrary.
      2. The meaning of the intervals between values of the variables is inconsistent; converting to ranks, while reducing the amount of information, leaves only that information which one can be confident is accurate.
      3. The data are in the form of ranks to begin with, and standard methods assume equal-interval measurement.
   B. Describe the transformation into ranks, noting how any ties are handled.
   C. Explain that you will then carry out a normal hypothesis test using the ranked data.
   D. Describe the logic and computations in the steps of the appropriate hypothesis-testing procedure you apply.
   E. Note that statisticians have found that although ranks do not have a normal distribution, using the standard statistical procedures (parametric tests) with ranked data gives approximately accurate results.

IV. If a randomization test is selected, explain it.
   A. This method is used in three situations.
      1. The data suggest a nonnormal population (or other violation of assumptions; this procedure makes no assumptions about the distribution of the population).
      2. No known statistical test exists for this hypothesis-testing situation.
      3. The researcher prefers this method because its basic logic is less complex and it does not require making any assumptions at all about the populations.
   B. Describe the basic logic as it applies to your situation: The procedure requires that you determine the probability that the particular division of scores into groups (or the particular pairing of scores for the correlation) in your sample could have arisen as a chance division.
      1. Identify and compute the difference (or correlation) for all possible groupings (or pairings) of scores.
      2. Order the outcomes from smallest to largest.
      3. Determine the proportion of outcomes that are higher and lower than the outcome corresponding to the organization of your actual sample.
   C. Having explained all this, describe your actual steps of computation and results following the above procedure.

Chapter Self-Tests

Multiple-Choice Questions

1. When it comes to normality, in practice most researchers assume
   a. the population is NOT normally distributed, until normality is verified by special procedures.
   b. the population is normally distributed unless the sample's histogram is drastically nonnormal.
   c. the sample is normally distributed unless the population's histogram is drastically nonnormal.
   d. the sample is NOT normally distributed, until normality is verified by special procedures.

2. A square-root transformation is often used when the data are
   a. bimodally skewed.
   b. positively skewed.
   c. negatively skewed.
   d. normally distributed.

3. Data transformations are justified by all of the following arguments, EXCEPT
   a. the transformed data might better represent a population that is normally distributed.
   b. if there is not an inherent meaning in a score's number (as in the case of most psychological scales), transformations give a reflection of reality that is at least as accurate as the original picture.
   c. transformation makes the hypothesis-testing procedure more stringent by increasing sample sizes.
   d. after transformations, scores that were higher are still higher (that is, the order of the scores is unchanged).

4. "Nonparametric tests" use data which
   a. are transformed using antilogs.
   b. do not require estimating population parameters.
   c. are transformed using logs.
   d. are normally distributed.

5. Which of the following rank-order tests corresponds to a t test for dependent means?
   a. Wilcoxon signed-rank test.
   b. Wilcoxon rank-sum test.
   c. Kruskal-Wallis H test.
   d. Spearman rho.

6. If you convert scores to ranks and then carry out an ordinary analysis of variance,
   a. the results will be seriously distorted because the distribution of scores is not normal.
   b. the scores must first be converted to logs before the rank transformation.
   c. the results will be quite accurate because ranks are normally distributed.
   d. the results will be a good approximation if the F table is used to determine significance and quite accurate if special tables are used.

7. If a measure is equal-interval, then a difference between the scores of 2 and 3 is about the same as the difference between the scores of
   a. 2.4 and 2.6.
   b. 1 and 1.5.
   c. 4 and 9.
   d. 10 and 11.

8. When conducting a randomization test for a difference between two groups,
   a. all scores are first ranked from highest to lowest, ignoring which group they are in.
   b. a t test is computed and then compared to the t value cutoff obtained from large numbers of random data sets with the same population characteristics as your actual samples, generated by computer using a "Monte Carlo" method.
   c. two-tailed tests are always used.
   d. all scores are randomly divided into every possible combination of two groups and the difference in means is calculated for each combination.

9. All of the following are advantages of rank-order tests, EXCEPT
   a. they are particularly useful when the data are not clearly equal-interval.
   b. the logic is simple and direct, requiring no elaborate construction of hypothetical distributions and estimated parameters.
   c. there are many more such tests available than standard parametric tests, so that it is more likely that one of these can be used than a standard parametric test.
   d. they are more familiar to readers of research in psychology than are computer-intensive methods.

10. All of the following are true about computer-intensive methods, EXCEPT
   a. they do not require either of the two main assumptions of parametric tests.
   b. they are available with most standard computer statistical programs.
   c. they have a direct logic, bypassing the process of constructing estimated population distributions, distributions of means, etc.
   d. they can often be applied when there is no existing standard test.

Fill-In Questions

1. Data transformations are used when the distribution of the population is thought to be ________.
2. A single score that has a big effect on the mean of a group and is therefore likely to distort significance tests comparing that group to other groups is called a(n) ________.
3. A log transformation has a similar but weaker effect than a(n) ________ transformation.
4. Because there is no need to estimate population values, rank-order tests are called ________.
5. The parametric version of the Wilcoxon rank-sum test is the ________.
6. The null hypothesis of a(n) ________ test is that the two populations have the same median.
7. When conducting a rank-order test with a large sample, a Z score is computed which is compared to a(n) ________.
8. If a measure is ________, then the meaning of the difference between scores of 27 and 29 is the same as that of 5 and 7.
9. In a(n) ________ test, you (or a computer) actually calculate every possible allocation of scores into two groups, and then, for each of the allocations, find the difference between the means of the two groups.
10. ________ do not require either normal population distributions or equal variances when testing hypotheses and can be applied even to situations in which no ordinary statistical tests exist.

Problems and Essays

1. Describe the distribution of the following scores:
   (a) 160 182 189 1934654
   (b) 4.8 5.4 5.8 5.9 6.3 6.9

2. In a study of the effect on attraction of expecting to be liked, ten subjects were randomly assigned to meet a stranger under conditions in which they did or did not expect the stranger to like them. This was followed by a short interaction, after which the subjects indicated how attracted they were to the person as a friend. Here are the scores:
   Did not expect to be liked: 77, 83, 88, 91, 98
   Did expect to be liked: 46, 57, 58, 66, 99
   (a) Conduct a t test for independent means using the raw scores (use the .05 level, two-tailed). (b) Conduct another t test using square-root transformed scores. (c) Discuss the reasons for using the transformation and the implications of the difference in results of these two methods.

3. Do adults who changed elementary schools over five times in their childhood have a different number of "good friends" than those who attended only one school? A small sample was drawn, and the frequent movers had 3, 1, 9, 13, and 6 good friends, while those who had not moved had 1, 4, 5, 2, and 1 good friends. (a) Conduct the appropriate standard parametric hypothesis-testing procedure, but using ranks. (Use the .05 significance level.) (b) Explain your analysis to a person who has never had a course in statistics.

4. In a particular week an environmentally conscious sanitation engineer noticed that at the three small apartment complexes where she left a variety of recycling containers (for plastics, clear glass, colored glass, cardboard, etc.), the residents recycled 0.5, 1.1, and 6.1 pounds of material. However, at three other apartment complexes, where she left only one container for all types of recyclables, the residents recycled 3.4, 2.1, and 7.0 pounds. (a) Using the following chart (which includes the entire set of ways one can organize six scores into two equal-sized groups), conduct a randomization test to see whether the apartment complexes with one container recycled more than the complexes with several containers (use the .05 level). (b) Explain your analysis to a person who has never had a course in statistics.

5. A memory researcher compared the number of symbols forgotten over one week under two different kinds of interference. In the results section the researcher reported "the Gonzales-Scott interference condition produced significantly greater interference than did the Janoff interference condition, based on a Mann-Whitney U test, Z = 3.21, p < .01." (a) Explain why the researcher might have used the Mann-Whitney U test instead of an ordinary t test for independent means. (b) Explain the meaning of the Z (in a general way; you need not describe how it was computed) in the context of the study. (c) Suggest one other alternative approach the researcher could have used.

Using SPSS/PC+ Studentware Plus with this Chapter

This chapter assumes you are familiar with the basics of using SPSS and have worked examples from previous chapters.

You can use SPSS to carry out data transformations. (SPSS can also carry out certain rank-order tests. However, like most standard statistics programs, SPSS does not carry out any of the computer-intensive procedures.) You should work through the example, following the procedures step by step.
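Since SPSS does not carry out the computer-intensive procedures, a randomization test has to be run by other means. What follows is a minimal sketch in Python, which is not part of the SPSS procedure described here. It assumes two small independent groups and a one-tailed comparison of means; the function name `randomization_test` and the scores are hypothetical, chosen only for illustration. It follows the chapter's outline: form every possible division of the scores into groups of the sample's sizes, compute the mean difference for each division, and find the proportion of divisions whose difference is at least as large as the one actually observed.

```python
from itertools import combinations
from statistics import mean


def randomization_test(group_a, group_b):
    """Exact one-tailed randomization test of mean(group_b) - mean(group_a).

    Returns the observed difference, the number of possible divisions,
    and the proportion of divisions with a difference at least as large
    as the observed one.
    """
    scores = list(group_a) + list(group_b)
    positions = range(len(scores))
    observed = mean(group_b) - mean(group_a)

    differences = []
    # Each combination picks which positions play the role of group_b;
    # the remaining positions play the role of group_a.
    for chosen in combinations(positions, len(group_b)):
        chosen_set = set(chosen)
        b = [scores[i] for i in positions if i in chosen_set]
        a = [scores[i] for i in positions if i not in chosen_set]
        differences.append(mean(b) - mean(a))

    # Small tolerance so floating-point rounding cannot exclude the
    # division that corresponds to the actual sample.
    as_extreme = sum(1 for d in differences if d >= observed - 1e-12)
    return observed, len(differences), as_extreme / len(differences)


# Hypothetical scores, for illustration only.
control = [2, 3, 4]
treated = [5, 6, 7]
observed, n_divisions, proportion = randomization_test(control, treated)
print(observed, n_divisions, proportion)
```

With three scores per group there are only 20 possible divisions, so the smallest proportion such a test can yield is 1/20 = .05. This is one reason the exact version is limited to very small samples, while the approximate version samples from the possible divisions instead of listing them all.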
Then look over the description of the general principles involved and try the procedures on your own for some of the problems listed in the Suggestions for Additional Practice. Finally, you may want to try the suggestions for using the computer to deepen your understanding and explore some of the advanced SPSS procedures, the rank-order tests.
I. Example
A. Data: Number of books read in the past year by four children who are not, and four children who are, highly sensitive. These are fictional data from a text example. For the children who are not highly sensitive, the numbers of books read are 0, 3, 10, and 22; for the highly sensitive children, 17, 36, 45, and 75.
B. Follow the instructions in the SPSS Appendix for starting up SPSS and be sure the cursor is in the Scratch Pad window.
C. Enter the data as follows.
1. Type DATA LIST FREE / HIGHSENS BOOKS. and press Enter to go to the next line.
2. Type BEGIN DATA. and press Enter to go to the next line.
3. Type one line per subject, using a 1 for not highly sensitive and a 2 for highly sensitive.
4. Type END DATA. and press Enter to move to the next line.
D. Carry out the square root transformation: type COMPUTE SQRBOOKS = SQRT(BOOKS). and press Enter.
E. Carry out the t test for independent means, using the square-root transformed variable, as follows.
1. Type T-TEST GROUPS HIGHSENS(1,2) / VARIABLE SQRBOOKS. and press Enter.
Figures SG15-1 and SG15-2 show the entire set of typed lines.
Chapter 16
Integrating What You Have Learned: The General Linear Model
Learning Objectives
To understand, including being able to conduct any necessary computations:
■ The hierarchical relation among the four major parametric methods.
■ The general linear model.
■ Equivalence of bivariate regression and correlation and multivariate regression/correlation with one independent variable.
■ Equivalence of the t test for independent means and a two-group analysis of variance.
■ Equivalence of the t test for independent means and a significance test of a correlation coefficient in which the predictor variable is a two-level numerical variable.
■ Graphic interpretation of the above point.
■ Equivalence of the analysis of variance for two groups and the significance test of the correlation coefficient in which the predictor variable is a two-level numerical variable.
■ Equivalence of the analysis of variance for three or more groups and multiple regression/correlation in which the predictor variables are two-level numerical variables.
■ How psychologists choose among mathematically equivalent methods.
■ Assumptions for the four major parametric methods.
■ Criteria for judging an observed relationship as causal.
Chapter Outline
I. Multiple Regression/Correlation (Review)
A. Yields a systematic rule for predicting values of a dependent variable.
B. In raw score form the rule is stated this way: Ŷ = a + (b₁)(X₁) + (b₂)(X₂) + (b₃)(X₃)
C. The correlation between the set of independent variables and the dependent variable is called a multiple correlation (R).
D. R² is the proportionate reduction in squared error gained by using the multiple-regression prediction rule compared to simply predicting the dependent variable from its mean.
E. R and R² can be tested against the null hypothesis that the population value is 0.
II. The General Linear Model
A. A person's score on a dependent variable is conceived as the sum of three influences.
1. Some fixed influence that will be the same for all individuals.
2. Influences of variables we have measured, on which people have different scores.
3. Other influences not measured (error).
B. Stated as a formula, it looks like this: Y = a + (b₁)(X₁) + (b₂)(X₂) + (b₃)(X₃) + e
C. This formula is like a multiple regression formula.
1. Influence 1 corresponds to the a in a raw-score multiple regression prediction rule.
2. Influence 2 corresponds to the bs (b₁, b₂, etc.) and Xs (X₁, X₂, etc.) in a multiple regression equation.
D. But in some ways it is not the same as a multiple regression formula.
1. The formula is for the actual, not the predicted, score on the dependent variable.
2. The formula includes Influence 3, which is what accounts for the errors in prediction.
E. It is called a linear model because the equation does not include any squared or higher power terms.
III. Bivariate Regression and Correlation in Relation to Multiple Regression: Bivariate regression is the special case of multiple regression/correlation in which there is only one predictor variable.
IV. The t Test for Independent Means as a Special Case of the Analysis of Variance
A. The t test for independent means is equivalent to the analysis of variance when the analysis is of two groups.
B. Both t and F can be understood as ratios of signal to noise. (Numerators are based on difference or variation among means of groups; denominators of both are based on variation within groups.)
C. Below are the details about this equivalency.
1. The numerators are based on differences or variations among means of groups.
2. The denominator of t is partly based on pooling the estimates of variances from data within each group; the denominator of F is the pooled estimate of variances from data within each group.
3. The denominator of t involves dividing by the number of subjects in each group; the numerator of F (when using the method for equal sample sizes of Chapter 11) involves multiplying by the number of subjects in each group.
4. For analyses conducted on the same data (comparing means of two groups):
a. The t score is exactly the square root of the F.
b. The degrees of freedom for the t are the same as the denominator degrees of freedom for the F.
c. The cutoff t equals the square root of the cutoff F.
V. The t Test for Independent Means as a Special Case of the Significance Test of the Correlation Coefficient
A. The following points are true of the significance test of the correlation coefficient.
1. It tests the null hypothesis that in the population there is a correlation of 0.
2. The comparison distribution is a t distribution with degrees of freedom equal to the number of subjects minus 2.
3. The score on the comparison distribution is a t score, with t = (r)(√[N − 2]) / √(1 − r²).
B. A comparison of means of groups (the focus of a t test for independent means) can be thought of as a nominal variable with two levels.
C. Representing a nominal variable with two levels as a numerical variable, using any two arbitrary numbers to stand for the two levels, permits one to conduct a correlation between that variable and the dependent variable.
D. The t test for independent means is equivalent to the significance test of a correlation coefficient in which one of the variables has only two levels. That is, analyses conducted on the same data yield equivalent results.
1. Both methods give the same t.
2. Both methods give the same degrees of freedom.
3. Both methods yield the same cutoff t score and the same conclusions regarding significance.
E. Creating a scatter diagram for the two-group situation further illustrates the relation.
1. The mean of each group is the same as the predicted score (from the regression equation) for each of the levels of the two-level numerical variable.
2. The variation between the two means is equivalent to the slope of the regression line: the greater this is, the more likely the t is significant.
3. The variation within each of the groups is equivalent to the spread around each predicted value in the scatter diagram: the smaller this is, the more likely the t is significant.
VI. The Analysis of Variance as a Special Case of the Significance Test in Multiple Regression/Correlation
A. The analysis of variance for two groups is equivalent to the significance test of a bivariate correlation in which the predictor variable has two levels.
B.
This link is seen most clearly in comparing the computation of the proportion of variance accounted for (R²) in analysis of variance using the structural model approach and the proportionate reduction in error (r²) in bivariate regression.
1. The within-group sum of squares (SSW) in analysis of variance is the sum of the squared differences of each score from its group's mean; the sum of squares for error (SSE) in regression is the sum of the squared differences of each score from the predicted score (which, when there are only two levels of the predictor variable, is the same as the mean on each level of that predictor variable). Thus, SSW = SSE.
2. The total sum of squared error (SST) in analysis of variance is the sum of the squared differences of each score from the grand mean; the total sum of squared error (SST) in regression is the sum of the squared difference of each score from the mean of the dependent variable (which is the same number as the grand mean in analysis of variance). Thus, SST in analysis of variance is equal to SST in regression.
3. The between-group sum of squares in analysis of variance can be computed as SST − SSW (this is because SST = SSW + SSB); the reduction in squared error in regression is SST − SSE. Thus, SSB in analysis of variance equals the reduction in squared error in regression.
4. In analysis of variance, R² = SSB / SST; in regression, r² = reduction in squared error / SST. Since SSB = reduction in squared error, and SST is the same in both, R² in analysis of variance equals r² in regression.
5. If one put the sums of squares in a regression analysis into an analysis of variance table and computed an F, it would yield the same F as the analysis of variance.
C. The analysis of variance for more than two groups is equivalent to the significance test of a multiple regression/correlation in which the predictor variables each have two levels and there are one fewer such predictor variables than there are groups.
1.
A comparison of means of several groups can be thought of as a nominal variable with as many levels as there are groups.
2. Such a many-leveled nominal variable cannot be made into a single numerical variable because the numerical levels assigned would imply specific ordered and quantitative relations among the levels.
3. Such a many-leveled nominal variable can be made into a set of two-level numerical variables such that each represents being in or not being in one of the groups. (This is called nominal coding.)
4. It takes one less two-leveled variable than there are groups because the last group is represented by a case not being in any of the preceding groups.
D. The proportion of variance accounted for (R²) in an analysis of variance for more than two groups is equivalent to the proportionate reduction in error (R²) of a multiple regression/correlation in which the predictor variables each have two levels, and there is one fewer of the predictor variables than there are groups.
VII. Choice of Test when Results Would Be Equivalent
A. It is based in part on tradition and what people are used to.
B. It is based in part on people confusing a correlational research design with a correlational statistic.
VIII. Assumptions: All four methods share the same assumptions.
IX. Controversy: Causality (from Baumrind's analysis).
A. One view of causality is the regularity theory.
1. It is associated with the philosophers Hume and Mill.
2. It considers X to be a cause of Y if three conditions are met.
a. X and Y are regularly associated.
b. X precedes Y.
c. There are no third causes that precede X that might cause both X and Y.
3. In psychology research this idea of regular association is indicated by a significant correlation.
4. In psychology research the requirement of no third causes is handled in one of two ways.
a. Ideally there is random assignment to levels of X.
b. As a makeshift when random assignment is not possible, groups are equated by statistical methods that look for third causes among variables measured as part of the research.
B. Another view of causality is the generative theory.
1. It is associated with the philosophers Aristotle, Aquinas, and Kant.
2. It requires everything required for the regularity view.
3. In addition, it requires that the researcher be able to identify a plausible explanation for the way in which X influences Y.
Outline for Writing Essays on the Equivalence of Various Methods
The essays for this chapter involve explaining the equivalences of two different ways of carrying out computations for the same data. What is important in this case is to lay out and explain, in a step by step manner, each of the linkages involved. Your explanations should explicate the reasoning for why two numbers are equivalent; it is not enough just to point to the equality. The outlines below are examples of ways to structure your essays. There are other completely correct ways to go about it. And this is an outline for an answer; you are to write the answer out in paragraph form. Examples of full essays are in the answers to Set I Practice Problems in the back of the text.
t Test as a Special Case of Analysis of Variance
I. Carry out the computations for the data using the two methods (use the method for equal sample sizes introduced in Chapter 11).
II. Make a chart showing equivalencies, similar to the main section of Table 16-2 in the text.
III. Explain each parallel.
A. The degrees of freedom for the t are the same as the denominator degrees of freedom for the F, because in both cases the estimate of the population variance based on the variation within the groups is based on this number of degrees of freedom.
B. The cutoff t equals the square root of the cutoff F because the procedures are mathematically equivalent, except that the resulting t is the square root of the resulting F.
C.
Numerators of t and F are both based on differences among means of groups, because in both cases the statistic should be larger if there is greater difference among the groups (which is what is being tested).
D. The denominator of t is partly based on pooling estimates of variances from data within each group; the denominator of F is a pooled estimate of variances from data within each group (note that S²P = S²W). These are the same because in both cases we want to divide, or reduce the effect of the difference between groups, in proportion to the amount of variation, or noise, within each group.
E. The denominator of t involves dividing by the number of subjects in each group; the numerator of F involves multiplying by the number of subjects in each group. These divisions or multiplications are done in both cases to take into account the fact that we are dealing with means of groups, which do not vary as much as individual cases (explain this in more depth following explanations in Chapters 7, 10, and 11).
t Test for Independent Means as a Special Case of the Significance Test for the Correlation Coefficient
I. Carry out the computations for the data using the two methods.
II. Explain the equivalence of a two-category numerical variable in the correlation to the two-category nominal variable in the t test.
A. The difference between groups on a dependent variable equals the association of the variable that represents what the groups differ on with the dependent variable.
B. Thus, a difference in a t test for independent means is the same as an association between its independent and dependent variable.
C.
If you substitute a two-level numerical variable for a two-level nominal variable (giving each group one number), the correlation with this variable is not affected (except for sign) by the two numbers you pick. This is because in correlation everything is converted to Z scores, and with Z scores a variable with only two numbers always comes out to the same two Z scores (if there are equal numbers of cases in the two groups, the Z scores are all +1s and −1s).
D. Thus, the resulting correlation accurately represents the association between the original nominal variable and the dependent variable.
III. Explain each parallel in the significance test.
A. The null hypothesis for the t test is that there is no difference between populations on the dependent variable and, for the correlation, that there is a 0 association. These are equivalent because both are hypothesizing no relation between the independent and dependent variable.
B. Both are tested using a t distribution (this is because in both cases you are working with populations whose variance is not known).
C. State that the degrees of freedom are the same in both cases (you needn't explain this).
D. The computations lead to the same t score.
1. This is because they are testing the same thing.
2. Another way to understand this is that both calculations are based on the same comparison.
a. A t test for independent means is larger when there is more difference between the group means and less variation within each group.
b. A correlation is larger (as is the t computed from the correlation coefficient) when there is a maximum difference between the scores at each level of the independent variable (that is, if there are two groups, and there is maximum difference between the means of the two groups) and minimum variation among the scores at each level of the independent variable (that is, if each level is a group, there is minimum variation within groups).
Illustrate this with a scatter diagram for your data.
Proportion of Variance Accounted for in Analysis of Variance (R²) for Two Groups as a Special Case of Proportionate Reduction in Error in Regression (r²)
I. Carry out the computations for the data using the two methods (for the analysis of variance, be sure to use the structural model approach from Chapter 12).
II. Explain the equivalence of a two-category numerical variable in the correlation to the two-category nominal variable in the analysis of variance.
A. The difference between groups on a dependent variable equals the association of the variable that represents what the groups differ on with the dependent variable.
B. Thus, a difference in group means in an analysis of variance is the same as an association between its independent and dependent variables.
C. If you substitute a two-level numerical variable for a two-level nominal variable (giving each group one number), the correlation with this variable is not affected (except for sign) by the two numbers you pick. This is because in correlation everything is converted to Z scores, and with Z scores a variable with only two numbers always comes out to the same two Z scores (if equal numbers in the groups, +1 and −1).
D. Thus the resulting correlation accurately represents the association between the original nominal variable and the dependent variable.
E. Thus the association between the independent and dependent variable in analysis of variance, measured by R², refers to the same thing as the association in correlation and regression, which can be assessed as r².
III. Explain each parallel in the computations.
A. SSW = SSE.
1. The within-group sum of squares (SSW) in analysis of variance is the sum of the squared differences of each score from its group's mean.
2. The sum of squares for error (SSE) in regression is the sum of the squared differences of each score from the predicted score (which, when there are only two levels of the predictor variable, is the same as the mean on each level of that predictor variable).
3. This equivalence arises because in both cases the error or noise is the variation within groups or levels of the independent variable.
B. SST is the same in both analyses.
1. The total sum of squared error (SST) in analysis of variance is the sum of the squared differences of each score from the grand mean.
2. The total sum of squared error (SST) in regression is the sum of the squared difference of each score from the mean of the dependent variable (which is the same number as the grand mean in analysis of variance).
3. This equivalence arises because in both cases the baseline error or variance to be reduced or accounted for is the total variation of each score from the overall mean of all the scores.
C. SSB = reduction in squared error.
1. The between-group sum of squares in analysis of variance can be computed as SST − SSW (this is because SST = SSW + SSB).
2. The reduction in squared error in regression is SST − SSE.
3. This equivalence arises because in both cases the amount of variance accounted for (or error reduced) is the total error less the amount that remains even after dividing subjects into groups (or predicting their scores based on their group membership).
D. R² in analysis of variance equals r² in regression.
1. In analysis of variance, R² = SSB / SST.
2. In regression, r² = reduction in squared error (which is the same as SSB) / SST.
3. This equivalence arises because proportion of variance accounted for and proportionate reduction in error are both about how much knowing the value of the independent variable improves your ability to predict the score on the dependent variable.
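The parallels in this outline can be checked numerically. What follows is a minimal sketch in Python (used here only for illustration; it is not SPSS and not part of the text). The scores, the 1/2 group codes, and all function and variable names are hypothetical choices made for this example.

```python
# Hypothetical scores for two groups (NOT taken from the practice problems).
group1 = [7.0, 9.0, 8.0, 6.0]
group2 = [3.0, 5.0, 4.0, 6.0]


def mean(values):
    return sum(values) / len(values)


def anova_sums_of_squares(groups):
    """Structural model approach: return (SS_total, SS_within, SS_between)."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    ss_total = sum((x - grand_mean) ** 2 for x in scores)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return ss_total, ss_within, ss_total - ss_within


def regression_sums_of_squares(x, y):
    """Bivariate regression of y on x: return (SS_total, SS_error)."""
    mx, my = mean(x), mean(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    predicted = [a + b * xi for xi in x]
    ss_total = sum((yi - my) ** 2 for yi in y)
    ss_error = sum((yi - p) ** 2 for yi, p in zip(y, predicted))
    return ss_total, ss_error


# Two-level numerical variable: any two arbitrary numbers stand for the groups.
codes = [1.0] * len(group1) + [2.0] * len(group2)
scores = group1 + group2

ss_t, ss_w, ss_b = anova_sums_of_squares([group1, group2])
ss_t_reg, ss_e = regression_sums_of_squares(codes, scores)

r2_anova = ss_b / ss_t                       # proportion of variance accounted for
r2_regression = (ss_t_reg - ss_e) / ss_t_reg  # proportionate reduction in error

print("SS_W =", ss_w, " SS_E =", ss_e)
print("R2 (ANOVA) =", r2_anova, " r2 (regression) =", r2_regression)
```

Changing the codes from 1 and 2 to any other pair of distinct numbers should leave both results unchanged, which is the point made about arbitrary two-level coding.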
Proportion of Variance Accounted for in Analysis of Variance (R²) for More than Two Groups as a Special Case of Proportionate Reduction in Error in Multiple Regression (R²)
I. Do not attempt to carry out the computations; this has not been covered in this course.
II. Explain the logic (without computations) of the points above about the equivalence of R² in analysis of variance with two groups and r² in bivariate regression.
III. Add an explanation of the logic of nominal coding to convert an analysis of variance problem to a multiple regression problem.
A. A comparison of means of several groups can be thought of as a nominal variable with as many levels as there are groups.
B. Such a many-leveled nominal variable cannot be made into a single numerical variable because the numerical levels assigned would imply specific ordered and quantitative relations among the levels.
C. Such a many-leveled nominal variable can be made into a set of two-level numerical variables such that each represents being in or not being in one of the groups. (This is called nominal coding.)
D. It takes one less two-leveled variable than there are groups because the last group is represented by a case not being in any of the preceding groups.
E. Since the information included in this set of two-leveled numerical variables is the same as the information in the original many-leveled nominal variable, the analysis of variance using the nominal variable yields exactly the same result as the multiple regression using the set of two-leveled numerical variables.
Chapter Self-Tests
Multiple-Choice Questions
1. The analysis of variance is a special case of
a. chi-square tests for independence.
b. t tests.
c. multiple regression/correlation.
d. bivariate correlation and regression.
2. A person's score on a particular dependent variable can be conceived of as the sum of all of the following influences, EXCEPT
a. other measured variables on which people have different scores.
b. other influences not measured.
c. some fixed influence that will be the same for all individuals.
d. other variables which have no correlation with a person's score.
3. If you are comparing the means of two groups to test if they are different, you can use all of the following EXCEPT a(n)
a. t test for independent means.
b. t test for dependent means.
c. bivariate correlation.
d. analysis of variance.
5. The t test for independent means is a special case of the bivariate correlation, in which the predictor variable of the correlation has
a. rank-order values.
b. continuous values.
c. exactly two values.
d. integrated determinants.
6. On a scatter diagram representing a correlation in which one of the variables has only two levels, the pattern which will produce the largest correlation is one on which the difference between the averages of the scores at the two levels is _____ and the variation among the scores at each of the levels is _____.
a. small; moderate.
b. large; moderate.
c. small; large.
d. large; small.
7. When using the bivariate prediction rule, the sum of squared errors computed in a bivariate correlation is the same as _____ in the analysis of variance.
a. SSB.
b. SS,.
c. ss,.
d. R².
8. Suppose an experiment is conducted in which subjects are randomly assigned to read a news story of one of three types (humorous, serious, or disturbing) and are then measured on their interest in the story. If you were to set up the analysis as a multiple regression predicting scores on interest, how many two-level numerical variables would be required?
a. 1.
b. 2.
c. 3.
d. 4.
9. Students identified themselves as predominantly lighthearted, studious, unfocused, or pragmatic. This variable was nominally coded, with three variables, so that Variable A was a 1 if the subject was lighthearted, but 0 if not. Variable B was 1 if studious, 0 if not. Variable C was 1 if unfocused, 0 if not. If a subject scored 0 on the three variables, how had she identified herself?
a. lighthearted.
b. pragmatic.
c. unfocused.
d. studious.
10. An organizational psychologist measures self-esteem when people begin working for the company (X) and job satisfaction (Y) one year later, finding a strong, significant correlation. Which of the criteria for causality (according to the "regularity" theory) does this study FAIL to meet?
a. X and Y are regularly associated.
b. X precedes Y.
c. there are no other causes that precede X that might cause X and Y.
d. none of the above are unmet; that is, the study meets all three.
Fill-In Questions
1. Bivariate correlation is a special instance of _____.
2. In the formula, "Y = a + (b₁)(X₁) + (b₂)(X₂) + (b₃)(X₃) + e," the symbol(s) representing the fixed influence that applies equally to all individuals is(are) _____.
3. The t score and the F ratio are both fractions in which the numerator is influenced by _____ and the denominator by _____.
4. In a t test for independent means, the number of subjects influences the t score mainly by making the denominator smaller, because one divides a key aspect of the denominator (the pooled population variance estimate) by the number of subjects in each group. In one method of conducting an analysis of variance, the corresponding adjustment in the computation of the F ratio involves making the _____ larger, because one _____ ("multiplies" or "divides") the population variance estimate (pooled from the estimates within each group) by the number of subjects in each group.
5. When there are only two groups, the square root of the F ratio equals _____.
6. The _____ is a special case of the correlation coefficient: the case in which the predictor variable has only two values.
7. _____ is comparable to analysis of variance, except that the former has numerical predictor variables, while analysis of variance has nominal predictor variables.
8. SS, in the analysis of variance is the same as _____ in regression.
9.
Suppose a nominal variable, major, was nominally coded as follows: Humanities (yes=1, no=0), Social Science (yes=1, no=0), Natural Sciences (yes=1, no=0), Arts (yes=1, no=0), and Other (not coded). What would be the scores of a person whose major was Natural Science?
10. According to Diana Baumrind, the two common ways of understanding causality are the "regularity" theory of causality and the "_____" theory of causality.
Problems and Essays
1. For the following data (a) compute a t test for independent means, (b) compute an analysis of variance, (c) make a two-column chart of the major computations in which the parallel computations are laid out next to each other, and (d) explain each of the parallels to a person who understands both the t test for independent means and the analysis of variance, but is unfamiliar with their relationship.
Group 1 Group 2
30 25 18 12 38 31 16 22 21 15
2. For the following data (a) compute a t test for independent means; (b) compute a correlation coefficient; (c) using the formula t = (r)(√[N − 2]) / √(1 − r²), determine the significance of the correlation coefficient; (d) explain how it is possible to compute a correlation coefficient for data set up as two groups; and (e) make a scatter diagram of the data and use it to discuss the parallels between the t test for independent means and the correlation coefficient and its significance test. (Both of your explanations, d and e, should assume the reader is familiar with the t test for independent means and the correlation coefficient, but not with their relationship.)
Group 1 Group 2
2 0 3 3 1 1 1 1 3 0
3.
For the following data (a) compute an analysis of variance (based on the raw scores) using the structural model approach; (b) compute the proportion of variance accounted for (R²) from this analysis; (c) compute the correlation coefficient; (d) determine the raw-score prediction rule; (e) make predictions for each score on the dependent variable using this prediction rule and use this to compute, step by step, the proportionate reduction in error (r²); (f) explain how it is possible to compute a correlation coefficient for data set up as two groups; and (g) list and explain the reasons for each of the parallels in the computations of R² and r². (Both of your explanations, f and g, should assume the reader is familiar with analysis of variance, structural model approach, including the computation and meaning of R², as well as with bivariate regression and the computation and meaning of r², but is unfamiliar with their relationship.)
Group 1 Group 2
204 138 215 189 166 121 181 172
4. A fictional study was done in which ability to detect a very faint light was measured under highly standardized conditions, comparing four different strains of pigeons (four of each strain). Using the following data, (a) create a nominal coding scheme and make a chart showing the codes and the dependent variable score for each pigeon, and (b) explain how this procedure would facilitate conducting a multiple regression on the data and how the results of that regression would be the same as for an analysis of variance for the same data.
Pigeon  Strain  Light Detection Level
1       S       .014
2       G       .058
3       S       .028
4       R       .101
5       C       .024
6       G       .077
7       R       .088
8       C       .023
9       S       .019
10      S       .010
11      R       .075
12      C       .043
13      C       .028
14      R       .098
15      G       .071
16      G       .083
Using SPSS/PC+ Studentware Plus with this Chapter
The material in this section assumes you are already familiar with using SPSS for correlation, regression, the t test for independent means, and the one-way analysis of variance (as described in the Using SPSS sections of Chapters 3, 4, 10, and 11). You can use SPSS to carry out the various procedures on the same data set, in order to see the relations among them. You should work through the example, following the procedures step by step. Then try the procedures on your own for some of the problems listed in the Suggestions for Additional Practice. Finally, you may want to try the suggestions for using the computer to deepen your understanding.
I. Example
A. Data: Use the data entered in Chapter 10 (filename is CH10XMPL) for the job-program experiment. (To bring up the file, press F3, press Enter, type CH10XMPL, and press Enter.)
B. Compute the t test again (to have the output for reference). The result is shown in Figure SG16-1.
Figure SG16-1 (SPSS output: t-tests for independent samples of PROGRAM; the per-group rows for number of cases, mean, SD, and SE of mean are not recoverable)
Mean Difference = 3.0000
Levene's Test for Equality of Variances: F = .226, P = .643
t-test for Equality of Means:
Variances  t-value  df     2-Tail Sig  SE of Diff  95% CI for Diff
Equal      2.75     12     .018        1.091       (.622, 5.378)
Unequal    2.75     11.98  .018        1.091       (.622, 5.378)
C. Compute an analysis of variance for the same data by replacing the T-TEST line with ONEWAY PERFORM BY PROGRAM(1,2). The result is shown in Figure SG16-2.
D. Compare this result to the results in Tables 16-2 and 16-4 in the text.
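The same relations can also be checked outside SPSS. The following is a minimal sketch in Python (an illustration only, not SPSS syntax and not the procedure from the text). The scores are hypothetical rather than the CH10XMPL data, and the function names are made up for this example. It computes the t for independent means, the one-way F, and the t obtained from the correlation with a two-level code.

```python
import math

# Hypothetical scores for two equal-sized groups (NOT the CH10XMPL data).
group1 = [7.0, 9.0, 8.0, 6.0]
group2 = [3.0, 5.0, 4.0, 6.0]


def mean(values):
    return sum(values) / len(values)


def t_independent(g1, g2):
    """t test for independent means with a pooled variance estimate."""
    m1, m2 = mean(g1), mean(g2)
    df = len(g1) + len(g2) - 2
    ss1 = sum((x - m1) ** 2 for x in g1)
    ss2 = sum((x - m2) ** 2 for x in g2)
    s2_pooled = (ss1 + ss2) / df
    s_difference = math.sqrt(s2_pooled / len(g1) + s2_pooled / len(g2))
    return (m1 - m2) / s_difference, df


def f_oneway(groups):
    """One-way analysis of variance: return (F, df_between, df_within)."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


def t_from_correlation(x, y):
    """Significance test of r: t = r * sqrt(N - 2) / sqrt(1 - r^2)."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2), n - 2


t, df_t = t_independent(group1, group2)
f, df_b, df_w = f_oneway([group1, group2])
codes = [1.0] * len(group1) + [2.0] * len(group2)  # arbitrary two-level codes
t_r, df_r = t_from_correlation(codes, group1 + group2)

print("t =", t, "df =", df_t)
print("F =", f, "df =", df_b, df_w, " sqrt(F) =", math.sqrt(f))
print("t from r =", t_r, "df =", df_r)
```

The t from the correlation can differ from the t for independent means in sign only, depending on which group receives the larger code; the absolute values and the degrees of freedom should agree.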
Chapter 17
Making Sense of Advanced Statistical Procedures in Research Articles
Learning Objectives
To understand each of these statistical techniques in a general way, so that you can recognize it in a research article, understand why it was done and what the results reported in an article mean, and make sense of the key terminology:
■ Hierarchical and stepwise multiple regression.
■ Partial correlation.
■ Reliability.
■ Factor analysis.
■ Causal modeling, including path analysis and latent variable modeling.
■ Analysis of covariance.
■ Multivariate analysis of variance and covariance.
■ Issues in the controversy over whether statistics should be controversial.
■ The overall relation among different statistical methods.
■ How to read results involving unfamiliar statistical techniques.
Chapter Outline
I. The Two Main Categories of Advanced Statistical Techniques
A. Those that focus on associations among variables (these are variations and extensions of correlation and regression).
1. Hierarchical and stepwise multiple regression.
2. Partial correlation.
3. Reliability.
4. Factor analysis.
5. Causal modeling.
B. Those that focus on differences among groups (these are variations and extensions of analysis of variance).
1. Analysis of covariance.
2. Multivariate analysis of variance and covariance.
II. Brief Review of Multiple Correlation and Regression
A. Multiple correlation is about the association of one dependent variable with the combination of two or more predictor variables.
B. Multiple regression is about predicting a dependent variable based on two or more predictor variables.
1. A multiple-regression prediction rule includes a set of regression coefficients, one to be multiplied by each predictor variable.
2. The sum of these multiplications is the predicted value on the dependent variable.
3. When working with Z scores, the regression coefficients are standardized regression coefficients, called beta weights (βs).
    4. When working with raw scores, raw-score regression coefficients (bs) are multiplied by the raw score of each predictor variable, and a particular number (the raw-score regression constant, a) is also added in.
  C. Hypothesis testing: Significance can be computed both for each regression coefficient and for the overall prediction rule.
III. Hierarchical and Stepwise Multiple Regression
  A. Both methods examine the influence of several predictor variables in a sequential fashion.
  B. Hierarchical multiple regression follows a hypothesized order.
    1. Procedure: It computes the correlation of the first predictor variable with the dependent variable, then how much is added to the overall multiple correlation by including the second-most-important predictor variable, and then perhaps how much more is added by including a third predictor variable, and so on.
    2. It is a recent procedure with no standard way of describing it or setting up a table.
    3. It requires a specific theoretical basis for the order in which the regression is carried out.
  C. Stepwise multiple regression is used in exploratory studies: it determines which predictor variables, of the many that have been measured, usefully contribute to the prediction.
    1. Procedure.
      a. It computes the correlation of each predictor variable with the dependent variable and identifies the one with the highest correlation. (If none of these correlations is significant, the procedure stops.)
      b. It next computes the multiple correlation of each of the other variables in combination with the one just identified as having the highest correlation by itself, to see which combination produces the highest multiple correlation. (If none of these multiple correlations adds significantly to the predictability over just using the first variable, the procedure stops.)
      c.
         It next computes the multiple correlation of each of the remaining variables in combination with the highest two, to see which combination produces the highest multiple correlation. (If none of these multiple correlations adds significantly to the predictability over just the first two, the procedure stops.)
      d. The process continues in this way until all variables are included in the prediction rule or adding the next best variable does not add significantly to the predictability.
    2. Caution: The prediction formula that results is the optimal small set of variables for predicting the dependent variable, as determined from the sample studied; when tried with a new sample, somewhat different combinations often result.
IV. Partial Correlation
  A. This is the degree of association between two variables, over and above the influence of one or more other variables.
  B. A variable over and above the influence of which the partial correlation is computed is said to be held constant, partialed out, or controlled for. (These terms are interchangeable.)
  C. The partial correlation coefficient is interpreted like an ordinary bivariate correlation, except one should remember that some third variable is being controlled for.
  D. You can think of a partial correlation as the average of the correlations between two variables, each correlation computed among just those subjects at each level of the variable being controlled for.
  E. Partial correlation is often used to help sort out alternative explanations in a correlational study.
    1. If the correlation between two variables drops dramatically or is eliminated when a third variable is partialed out, it suggests that the third variable was behind the correlation.
    2. If the correlation between two variables is largely unaffected when a third variable is partialed out, it suggests that the third variable is not behind the correlation.
V. Reliability Coefficients
  A.
     Reliability, the accuracy and consistency of a measure, is the extent to which, if you were to give the same measure again to the same person under the same circumstances, you would obtain the same result.
  B. Reports of computations of reliability of a measurement are very common in research articles.
  C. Test-retest reliability.
    1. This is the correlation between the scores of the same people who take a measure twice.
    2. It is often an impractical or inappropriate approach to reliability, since having taken the test once could influence the second taking.
  D. Reliability as internal consistency.
    1. Split-half reliability is the correlation between two halves of the same test.
    2. Cronbach's alpha can be thought of as the correlation between scores on two halves of a test, averaging such computations for all possible divisions of the test into halves.
    3. KR-20 (Kuder-Richardson-20) is a special case of Cronbach's alpha designed for tests that include only dichotomous items.
  E. In general a test should have a reliability of at least .7, and preferably closer to .9, to be considered useful.
  F. If a measure has low reliability, it tends to reduce the correlation between it and any other variable. (This can be adjusted for in bivariate correlation using a correction for attenuation.)
VI. Factor Analysis
  A. This is a widely used procedure applied when a researcher has measured people on a large number of variables.
  B. It identifies groupings of variables (called factors) such that those within each group correlate with each other but not with variables in other groupings.
  C. The correlation between a variable and a factor is called the variable's factor loading on that factor.
  D. A widely used convention is to consider a variable part of a factor on which it has a loading of .3 or greater.
  E. Researchers usually give each factor a name based on the variables that load highly on it. (These names can be misleading.)
  F.
     Tables of the results of factor analyses usually give the factor loadings for each factor and also often give the percentage of variance that the factor as a whole accounts for in the entire set of original variables.
VII. Causal Modeling
  A. Path analysis.
    1. This procedure focuses on a diagram with arrows connecting the variables, indicating the hypothesized pattern of causal relations among them.
    2. It is based on the correlations among the variables; for each arrow the researcher can compute a path coefficient.
      a. A path coefficient indicates the extent to which the variable at the start of the arrow is associated with the variable at the end of the arrow, after controlling for all variables that also point to this variable.
      b. A path coefficient is the same as a beta in multiple correlation/regression, with the variable at the start of the arrow as a predictor variable, the variable at the end of the arrow as the dependent variable, and all the other variables in the path diagram which also point to the variable at the end of the arrow as other predictor variables in the regression equation.
    3. A variable having no arrows to it within the path diagram is an exogenous variable.
    4. A variable that has arrows to it within the diagram is called an endogenous variable.
    5. A path diagram may explicitly emphasize that there are unknown variables influencing an endogenous variable by putting an arrow to it with a blank stem, or with the letter E (for error) at the stem of the arrow.
    6. A path analysis is considered to provide support for the hypothesized causal pattern if the path coefficients for the major arrows are all significant and in the predicted directions.
  B. Latent variable models.
    1. This procedure is also widely known as structural equation modeling or LISREL.
    2. It is an extension of path analysis with several advantages.
      a. It produces an overall measure of how well the model fits the data.
      b. It includes a significance test, but the null hypothesis is that the model fits.
      c.
         It permits modeling of latent variables (as assessed by a set of manifest variables).
    3. The path diagram.
      a. Manifest variables are shown in squares.
      b. Latent variables are shown in circles.
      c. The measurement model, the relation of the manifest to the latent variables they assess, usually involves arrows from each latent variable to its associated manifest variables.
      d. The causal model usually involves arrows showing the relations among the latent variables.
    4. Limitations.
      a. A well-fitting model that is not a significantly bad fit is still only one of the possible models that could fit the data.
      b. It shares limitations of all methods ultimately based on the correlation coefficient.
        i. Association does not demonstrate direction of causality.
        ii. It only takes into account linear relationships.
        iii. Results are distorted by restriction in range.
VIII. Analysis of Covariance (ANCOVA)
  A. This procedure is the same as an ordinary ANOVA, except that one or more variables are partialed out.
  B. A variable partialed out is called a covariate.
  C. The rest of the results are interpreted like any other analysis of variance.
  D. The analysis of covariance is generally used in one of two situations.
    1. One situation is the analysis of a random-assignment experiment in which some nuisance variable is partialed out.
    2. The other situation is in a study in which it is not possible to employ random assignment, and variables on which groups may differ are partialed out. (This use is more controversial.)
  E. ANCOVA assumes that the correlation between the covariate and the dependent variable is the same in all the cells.
IX. Multivariate Analysis of Variance (MANOVA) and Multivariate Analysis of Covariance (MANCOVA)
  A. Multivariate statistical techniques involve more than one dependent variable.
  B. The most widely used multivariate techniques are MANOVA and MANCOVA.
  C. MANOVA is simply an analysis of variance in which there is more than one dependent variable.
  D.
     MANOVA tests each main and interaction effect of the independent variables on the combination of dependent variables.
  E. A significant effect in MANOVA, which could be due to any one of the dependent variables, is usually followed up by a series of ordinary "univariate" analyses of variance on each dependent variable separately.
  F. MANCOVA is a MANOVA in which one or more variables are partialed out of the analysis.
X. Overview of Statistical Techniques Considered: Most of the techniques covered in this course can be understood as representing the various combinations of the following possibilities.
  A. Association versus difference test.
  B. One versus many independent variables.
  C. One versus many dependent variables.
  D. Whether or not any variables are controlled.
XI. Controversies: Should Statistics Be Controversial?
  A. Statistics is usually taught today as if there were no controversies.
  B. What is usually taught today is a wedding of the once opposed Fisher and Neyman-Pearson viewpoints.
  C. Recently, some psychologists have argued that these viewpoints have been misunderstood and misused as a result of being blended.
  D. Especially decried is the heavy emphasis on the null hypothesis, alpha, and p < .05 as a rigid cutoff point and sole determinant of the worth of a piece of research.
XII. How to Read Results Involving Unfamiliar Statistical Techniques
  A. Even well-seasoned researchers periodically encounter unfamiliar statistical methods in research articles.
  B. In these cases, you can usually figure out the basic idea.
    1. Usually there will be a p level given, and just what pattern of results is being considered significant can be discerned from the context.
    2. Usually there will be some indication of effect size (degree of association or the size of the difference).
Chart of Major Statistical Techniques (From Table 17-1 in the text)

Association or Difference | Number of Independent Variables | Number of Dependent Variables | Any Variables Controlled? | Name of Technique
Association | 1 | 1 | No | Bivariate Correlation & Regression
Association | Any number |  | No | Multiple Regression (Including Hierarchical & Stepwise)
Association | 1 | 1 | Yes | Partial Correlation
Association | Many, not differentiated |  |  | Reliability Coefficient, Factor Analysis
Association | Many, with specified causal patterns |  |  | Path Analysis, Latent Variable Modeling
Difference | 1 |  |  | Analysis of Variance
Difference | Any number |  |  | Analysis of Variance
Difference | Any number |  | Yes | Analysis of Covariance
Difference | Any number | Any number |  | Multivariate Analysis of Variance
Difference | Any number | Any number | Yes | Multivariate Analysis of Covariance

Chapter Self-Tests

Multiple-Choice Questions

1. Suppose a researcher wants to know what the correlation between one predictor variable and the dependent variable will be, and then how much is added by adding another predictor variable, and then perhaps even a third predictor variable. The sequence is planned in advance based on theory. What is this called?
  a. Multivariate analysis of variance.
  b. Time-series analysis.
  c. Hierarchical multiple regression.
  d. Canonical correlation analysis.

2-3. Suppose that you want to know the relation between stress and study habits, over and above the fact that people tend to study less on Friday and the weekends.

2. What is this called?
  a. Stepwise regression.
  b. Partial correlation.
  c. Factor loading.
  d. Factor analysis.

3. In this particular case, what is being done?
  a. A Cohen's Kappa is being computed ignoring day of the week.
  b. Study habit is held constant.
  c. Day of the week is being controlled for.
  d. Kendall's Tau is being applied hierarchically (that is, first with and then without including day of the week in the analysis).

4. Which of the following is the best example of test-retest reliability?
  a.
     The responses of half the items on a test are correlated with the responses on the other half of the test.
  b. A group of people are given a test, but half are given one test and half are given a different but similar test, and their scores on the two tests are correlated.
  c. A group of people are given a test on one occasion and then later the same people are given another very similar test, and the two sets of scores are correlated.
  d. A group of people are given the same test on two separate occasions and the scores are correlated.

5. Computing reliability by correlating scores on two halves of a test raises the problem of which way to split the items in half. With Cronbach's alpha, in effect
  a. the halves are split by comparing odd and even numbered items.
  b. the test is simply split in half by top half and bottom half.
  c. it is split randomly.
  d. all possible splits are done and then the results are averaged.

6. Suppose that a researcher wants to look at the relations among a large number of variables measured in a study. She wants to know how the variables clump together and which variables don't seem to correlate. What procedure would be best for her to use?
  a. Factor analysis.
  b. Hierarchical multiple regression.
  c. Stepwise multiple regression.
  d. Partial correlation.

7. In a path analysis arrows are drawn between variables. An exogenous variable is
  a. a variable that is completely isolated with no arrows going away or toward it.
  b. a variable that is at the start of a causal chain, having no arrows to it within the path diagram.
  c. a variable that only has arrows going to it within the path diagram.
  d. a variable that has arrows going both to it and away from it within the path diagram.

8. Which of the following is NOT an advantage of latent variable modeling over ordinary path analysis?
  a. The computer calculates an overall measure of how well the model fits the data.
  b. It includes latent variables in the analysis.
  c.
     It rules out the possibility that any other pattern might create a better path diagram.
  d. A kind of significance test can be computed.

9. How is an analysis of covariance (ANCOVA) different from an analysis of variance (ANOVA)?
  a. ANCOVA allows you to control for the effect of an unwanted variable, whereas an ANOVA does not.
  b. ANOVA allows you to control for the effects of an unwanted variable, whereas an ANCOVA does not.
  c. ANCOVA allows you to use a factorial design, whereas an ANOVA does not.
  d. ANOVA allows you to use a factorial design, whereas an ANCOVA does not.

10. How are MANOVAs and MANCOVAs different from ANOVAs and ANCOVAs?
  a. They allow you to use two or more predictor variables, whereas ANOVAs and ANCOVAs do not.
  b. They allow you to partial out variables, whereas ANOVAs and ANCOVAs do not.
  c. They are more accurate than ANOVAs and ANCOVAs, but you can only use one predictor variable.
  d. They allow you to use more than one dependent variable, whereas ANOVAs and ANCOVAs do not.

Fill-In Questions

1. ________ is an exploratory technique in which the researcher is trying to find the best small set of predictor variables for some dependent variable based on results of a study that measured a large number of predictor variables.
2. Kuder-Richardson-20 (KR-20) is like Cronbach's alpha except it is used when the items in the measure are all ________.
3. A ________ is the correlation between a variable and its factor.
4. In an ordinary path analysis, each path coefficient is like a ________ in multiple regression.
5. In one type of causal modeling, some of the variables included in the model are not actually measured, but can be included because they are considered to be the cause of variables that are measured in the study. These variables that are not measured are called ________ variables.
6. LISREL is a computer program that is widely used for a type of causal modeling known as ________.
7. The variable held constant in a partial correlation is analogous to a ________ in an analysis of covariance.
8.
   A researcher is planning a study comparing the effects of three kinds of psychotherapy on depression. Depression will be measured in each subject by both a behavioral and a questionnaire measure. In addition, the researcher wants to control for initial differences in expectations of benefits prior to therapy. The appropriate statistical technique for the entire analysis is a(n) ________.
9. Multiple-regression techniques and causal modeling techniques are examples of methods that focus on association, while analysis of variance and multivariate analysis of variance are examples of techniques that focus on ________.
10. The Neyman-Pearson view of statistics was in opposition to the view held by ________ (who was also the inventor of the analysis of variance).

Essays

1. A study is conducted which examines the influence of various factors on success in graduate school. Explain the (fictional) results, as shown in the following table, to a person who has never had a course in statistics.

Hierarchical Multiple Regression
(Dependent Variable = Reported Success in Graduate School)

Predictor Variable      R² for All Variables Entered   Increment in R²
Social Class                      .04                       .04
Undergraduate Record              .15*                      .11*
Social Skills                     .18**                     .03
Desire to Succeed                 .25**                     .07*

2. A study was conducted in which women rated themselves on their practice of seven health behaviors. Explain the (fictional) results, as shown in the following table, to a person who has never had a course in statistics.

Factor Analysis
                              Factor Loadings
Variable                    Factor 1   Factor 2
Eats Adequate Fiber            .73        .13
Controls Fat Intake            .68        .21
Controls Sugar Intake          .71        .14
Tooth Flossing                -.03        .53
Daily Exercise                 .23        .49
Wears Seat Belts               .06        .38
Breast Self-Exams              .25        .38

3. A (fictional) study is conducted comparing achievement-test scores of high school students of five different ethnic groups.
   The researchers report their results as follows:

   Although previous studies have shown differences among these ethnic groups on this achievement test, the present study, which used parental income and language skills as covariates, did not find any reliable difference; the analysis of covariance was not significant, F(4, 248) = 1.63. This nonsignificant finding is especially impressive in light of the large sample size employed in our study.

   Explain this result (including discussing issues of power regarding a null hypothesis result) to a person who has never had a course in statistics.

4. A study was conducted which examined marital happiness among four groups of married adults: women with no children, mothers of a newborn infant, men with no children, and fathers of a newborn infant. This created a 2 X 2 design (parental status X gender). Marital happiness was measured using three variables: a standard marital happiness questionnaire, number of positive words included in a story written in an experimental setting about the spouse, and ratings of the subject's marital happiness by the subject's closest friend. The (fictional) results of the study were reported as follows:

   A multivariate analysis of variance (MANOVA) yielded a main effect for gender, Wilks' Lambda F(1, 418) = 14.31, p < .01, and an interaction effect, Wilks' Lambda F(1, 418) = 9.38, p < .01. The main effect for parental status was not significant, Wilks' Lambda F(1, 418) = 2.13. Follow-up univariate analyses were significant only for the questionnaire measure. For the gender main effect, women were less satisfied with their marriage than men, F(1, 361) = 16.33, p < .01. The interaction effect was also significant, F(1, 361) = 8.14, p < .01.
   Based on a post-hoc analysis (Newman-Keuls), the pattern of means associated with this interaction suggests that, for women, being a parent of a newborn is associated with less marital happiness, whereas for men there is little difference in marital happiness between those who are and are not fathers of newborn infants.

   Explain this result to a person who has never had a course in statistics.

Using SPSS/PC+ Studentware Plus with this Chapter

The material in this section assumes you are already familiar with using SPSS from previous chapters. Because SPSS/PC+ Studentware Plus is intended for beginning statistics students, it includes only a few of the advanced procedures covered in this chapter.

One of the advanced procedures considered in Chapter 17 of the text that is included in the student version of SPSS is analysis of covariance. Thus, as an illustration, there is an example for you to try. However, please note that in practice it is important when conducting an analysis of covariance first to check whether you have met the special assumption that the correlation of the dependent variable with the covariate should be the same in each cell. The procedure for making this check can be done in SPSS, but making sense of it involves statistical issues well beyond what it is reasonable to cover in a beginning course.

I. Example Data for Analysis of Covariance
  A. The data to be used are an extension of the fictional experiment comparing the effects of information about a previous criminal record on rated guilt of a defendant in a mock trial.
  B. In this example, the idea is to reduce extraneous variance by controlling for scores on a questionnaire, given to all subjects prior to the study, which measured their general attitude towards defendants in criminal trials (how likely they feel it is that a person in that situation is guilty).
  C.
     The scores on this additional variable, in the order that subjects' data were entered in Chapter 11, are 8, 5, 2, 7, 6, 7, 3, 5, 7, 4, 5, 5, 8, 4, and 3.
II. Enter the Data Lines for the Analysis
  A. If you completed the SPSS example for Chapter 11, you should have the basic data saved in a file called CH11XMPL. Call up that file now: press F3, press Enter, type CH11XMPL, and press Enter again.
  B. When the file appears, make the following changes.
    1. Add the variable INITATT to the end of the DATA LIST line, so it now reads DATA LIST FREE / INFOTYPE GUILT INITATT.
    2. For each subject, add the score for this new variable after the other two numbers already entered. (Thus, the first data line should be 1 10 8; the second data line, 1 7 5; etc.)
III. Carry Out the Analysis
  A. Replace the ONEWAY line with ANOVA GUILT BY INFOTYPE(1,3) WITH INITATT. Your typed lines should appear as shown in Figures SG17-1 and SG17-2.

Answers to Self-Test Problems

Note. Answers to problems and essays include numerical results only. For examples of answers to essay sections, see the answers to Set I Practice Problems at the back of the text.

Chapter 1

Multiple-choice: 1-b, 2-d, 3-a, 4-c, 5-b, 6-c, 7-a, 8-b, 9-a, 10-c

Fill-Ins:
1. values, intervals
2. grouped frequency table
3. interval, interval size
4. histogram
5. bimodal
6. rectangular
7. positively, skewed to the right
8. ceiling effect
9. normal curve
10. kurtosis

1. A grouped frequency table is preferred over an ordinary frequency table when the scores range over a great many different values. Using an ordinary frequency table in such a situation would fail to give a simple description because there would be so many different values to look at, while the grouped frequency table in this situation gives a more readily grasped summary of the pattern of scores.

2. a.
   Interval   f
   70-79      5
   60-69      2
   50-59      1
   40-49      2
   30-39      0
   20-29      2
   10-19      8
    0-9       4
   c. Bimodal.
      (Kurtotic is also correct, and one could argue that it is slightly positively skewed, that is, skewed to the right.)

3. a.
   Interval   f
   40-44      1
   35-39      1
   30-34      0
   25-29      2
   20-24      6
   15-19      9
   10-14     12
    5-9       8
    0-4       4
   c. Shape: Unimodal, skewed to the right (the long tail is on the side of the high scores).

4. A floor effect is when most of the scores are near the bottom of the scale because it is not possible to get a lower score on this measure. For example, if you were to measure the number of words spelled wrong on a second-grade spelling test completed by fifth graders, most of the fifth graders would spell none of the words wrong, so the scores would pile up at zero.

Chapter 2

Multiple-choice: 1-b, 2-d, 3-a, 4-b, 5-b, 6-d, 7-a, 8-c, 9-b, 10-d

Fill-Ins:
1. mean
2. central tendency
3. 5
4. mode
5. median
6. outlier
7. SD² = Σ(X-M)²/N; SD² = SS/N
8. the mean
9. standard deviation
10. Z ~(X-M)/N

1. M = ΣX/N = 40/8 = 5
   SS = (3.1-5)² + (3.8-5)² + (4-5)² + (4.5-5)² + (4.5-5)² + (5.4-5)² + (6-5)² + (8.7-5)² = 21.4
   SD² = SS/N = 21.4/8 = 2.68; SD = √SD² = √2.68 = 1.64

2. Z for sales aptitude: Z = (X-M)/SD = (50-40)/4 = 10/4 = 2.5
   Z for education aptitude: Z = (95-80)/20 = 15/20 = .75
   Greater aptitude in relation to others: Sales.

3. Raw score for Mary: X = (Z)(SD) + M = (1.23)(6.5) + 42 = 50
   Raw score for Susan: (-.62)(6.5) + 42 = 38

Chapter 3

Multiple-choice: 1-a, 2-d, 3-c, 4-d, 5-b, 6-d, 7-d, 8-c, 9-b, 10-c

Fill-Ins:
1. independent
2. tiredness
3. curvilinear, a curvilinear correlation
4. negative, a negative correlation
5. positive
6. -1
7. significant
8. 9
9. lower
10. binomial effect size display

1. a. [Scatter diagram: Cholesterol on the vertical axis (0 to 300), Eggs per Day on the horizontal axis (0 to 6); fill in dots.]
   b. General pattern: positive linear correlation
   c.
      Eggs/Day         Cholesterol
      Raw     Z        Raw     Z        ZxZy
       2      0        210     .33       0
       0    -1.07      100   -1.47      1.57
       1     -.53      180    -.16       .09
       5     1.60      270    1.31      2.09
   d. Proportionate reduction in error = r² = .94² = .88
   e.
      Example answer: Eating eggs could cause higher cholesterol; people who have high cholesterol may choose to eat more eggs; some third factor, such as upbringing or genetic influences, may have the effects of making people have high cholesterol and also a liking for eggs.

2. a. [Scatter diagram: Rated Willingness to Help Campaign on the vertical axis (0 to 10), Size of Gang on the horizontal axis (0 to 200).]
   b. Negative linear correlation.
   c.
      Size of Gang     Rated Willingness to Help Campaign
      Raw     Z        Raw     Z        ZxZy
       24   -1.2       10     1.6      -1.92
      106     .4        4     -.5       -.20
       42    -.9        7      .5       -.45
       70    -.3        6      .2       -.06
       90     .1        5     -.2       -.02
      178    1.9        1    -1.6      -3.04
   e. Example answer: Being in a bigger gang causes leaders to be less willing to help; leaders who are willing to help attract more members; the kind of gang that attracts leaders who are willing to help also attracts large numbers of members.

3. b. There may be a restriction in range.

Chapter 4

Multiple Choice: 1-a, 2-d, 3-c, 4-c, 5-c, 6-d, 7-d, 8-a, 9-a, 10-c

Fill-Ins:
1. regression
2. 39
3. constant, regression constant
4. slope
5. error
6. -25
7. proportion of variance accounted for
8. multicollinearity
9. -.72
10. -13.2

1a. Predicted Z for performance = .87 X Z for hours of exercise per week.
 b.
             Hours Exercise   Performance in     Prediction              Prediction from
   Person    Per Week         Training Program   from Mean               Bivariate Rule
   Tested    X      Zx        Y      Zy          Zy    Error   Error²    Zy     Error   Error²
     1        3    -.59       45    -.54         0      .54     .29     -.51     .03     .00
     2       11    1.76       90    1.47         0    -1.47    2.16     1.53     .06     .00
     3        6     .29       53    -.18         0      .18     .03      .25     .43     .18
     4        1   -1.17       25   -1.43         0     1.43    2.04    -1.02     .41     .17
     5        4    -.29       72     .67         0     -.67     .45     -.25    -.92     .85
             M = 5            M = 57                   SST = 4.97               SSE = 1.20
             SD = 3.41        SD = 22.35

   Proportionate Reduction in Error = (SST - SSE)/SST = (4.97 - 1.20)/4.97 = 3.77/4.97 = .76
   Check: r = .87; r² = .87² = .76; same as figured above.
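The check in part b (that the proportionate reduction in error equals r squared) can be reproduced with a short Python sketch using the X and Y scores from the table. The variable names are only illustrative, and because the table rounds each Z score to two decimals, SST and SSE computed here differ slightly from the 4.97 and 1.20 shown.

```python
from statistics import mean, pstdev

# Scores from the table: hours of exercise per week (X)
# and performance in the training program (Y).
hours = [3, 11, 6, 1, 4]
performance = [45, 90, 53, 25, 72]


def z_scores(scores):
    """Z scores using the population SD (SS divided by N)."""
    m, sd = mean(scores), pstdev(scores)
    return [(x - m) / sd for x in scores]


zx = z_scores(hours)
zy = z_scores(performance)

# Correlation coefficient: average of the cross-products of Z scores.
r = sum(a * b for a, b in zip(zx, zy)) / len(zx)

# Squared error predicting from the mean (predicted Zy is always 0).
sst = sum(b ** 2 for b in zy)

# Squared error predicting from the bivariate rule (predicted Zy = r * Zx).
sse = sum((b - r * a) ** 2 for a, b in zip(zx, zy))

pre = (sst - sse) / sst
print(f"r = {r:.2f}, r squared = {r * r:.2f}, "
      f"proportionate reduction in error = {pre:.2f}")
```

Run on these scores, r rounds to .87 and both r squared and the proportionate reduction in error round to .76, in line with the check above.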
 c. Two predicted Y values (for arbitrary Xs):
   X = 0: Z = (X-M)/SD = (0-5)/3.41 = -1.47; Predicted Zy = (.87)(-1.47) = -1.28
   Predicted Y = (Z)(SD) + M = (-1.28)(22.35) + 57 = 28.39
   X = 10: Z = (10-5)/3.41 = 1.47; Predicted Zy = (.87)(1.47) = 1.28
   Predicted Y = (1.28)(22.35) + 57 = 85.61
   [Graph: regression line drawn through the two predicted points; horizontal axis marked 0 to 12.]
 d. 8 hours: Z = (X-M)/SD = (8-5)/3.41 = 3/3.41 = .88
   Predicted Zy = (r)(Zx) = (.87)(.88) = .77
   Raw Predicted Performance (Y) = (.77)(22.35) + 57 = 17.21 + 57 = 74.21

2a. Predicted Z for days to recover = .93 X Z for number of stressful events in past month

             Number of
             Stressful        Days to            Prediction              Prediction from
   Person    Events           Recover            from Mean               Bivariate Rule
   Tested    X      Zx        Y      Zy          Zy    Error   Error²    Zy     Error   Error²
     A        2    -.39        5    -.63         0     -.63     .40     -.36     .24     .06
     B        0   -1.17        4   -1.26         0    -1.26    1.59    -1.09     .17     .03
     C        3     0          7     .63         0      .63     .40      0      -.63     .40
     D        7    1.57        8    1.26         0     1.26    1.59     1.46     .20     .04
             M = 3            M = 6                    SST = 3.98               SSE = .53
             SD = 2.55        SD = 1.58

 b. Proportionate Reduction in Error = (SST - SSE)/SST = (3.98 - .53)/3.98 = 3.45/3.98 = .87
   Check: r = .93; r² = .86; same (within rounding error) as figured above.
 c. 5 Stressful Events: Z = (X-M)/SD = (5-3)/2.55 = 2/2.55 = .78
   Predicted Zy = (r)(Zx) = (.93)(.78) = .73
   Predicted Raw Days to Recover (Y) = (Predicted Zy)(SDy) + My = (.73)(1.58) + 6 = 1.15 + 6 = 7.15

3a. Predicted Years = 75 - (.1)(15) - (4)(0) + (.9)(2) + (3)(1) = 75 - 1.5 - 0 + 1.8 + 3 = 78.3
 b. Predicted Years = 75 - (0)(15) - (4)(2) + (.9)(1) + (3)(0) = 75 - 0 - 8 + .9 + 0 = 67.9

Chapter 5

Multiple Choice: 1-b, 2-c, 3-a, 4-d, 5-d, 6-b, 7-c, 8-c, 9-b, 10-a

Fill-Ins:
3. normal curve table, table
4. subjective
5. probability
6. haphazard
7. parameter
8. statistic
9. μ, mu
10. inferential

1a. Z = (X-M)/SD = (55-50)/5 = 5/5 = 1. Approximation is that 34% are between mean and Z = 1. 50% are above mean, thus 16% (50% - 34%) are above 55.
 b. Z = (40-50)/5 = -2. 34% + 14% = 48% are between mean and 2 SDs from mean. Thus, 2% (50% - 48%) are below 40.
 c. Z = (45-50)/5 = -1. 34% are between mean and 1 SD below mean. 50% are above mean. Thus total above 45 is 84% (50% + 34%).
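The 34%/14%/2% figures used in parts a to c are rounded approximations. As a rough cross-check, the exact normal curve areas can be computed with Python's standard library; this is a sketch for comparison only, with the mean of 50 and SD of 5 taken from the answers above.

```python
from statistics import NormalDist

# Normal curve with M = 50 and SD = 5, as in parts a to c.
scores = NormalDist(mu=50, sigma=5)

pct_above_55 = (1 - scores.cdf(55)) * 100   # part a: Z = +1
pct_below_40 = scores.cdf(40) * 100         # part b: Z = -2
pct_above_45 = (1 - scores.cdf(45)) * 100   # part c: Z = -1

print(f"above 55: {pct_above_55:.2f}%")
print(f"below 40: {pct_below_40:.2f}%")
print(f"above 45: {pct_above_45:.2f}%")
```

The exact areas come out close to the approximations (about 15.87%, 2.28%, and 84.13%), so the rounded rule gives the same answers to the nearest percent.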
NOTE: It is easier to solve such problems if you draw pictures of the normal curve and the areas involved.

2a. From normal curve table, top 5% (45% between mean and Z) begins at 1.64 (or you can use 1.65). Thus, corresponding raw score is X = (Z)(SD) + M = (1.64)(2.8) + 15.3 = 4.59 + 15.3 = 19.89.
 b. From normal curve table, bottom 10% (40% between mean and Z) begins at -1.28. Thus, corresponding raw score = (-1.28)(2.8) + 15.3 = -3.58 + 15.3 = 11.72.

Chapter 6

Multiple Choice: 1-b, 2-d, 3-c, 4-b, 5-a, 6-c, 7-b, 8-a, 9-c, 10-d

Fill-Ins:
1. hypothesis testing
2. people in general
3. Population 1 is taller than Population 2
4. Population 1 is not taller than Population 2
5. Population 2, people in general
6. null hypothesis
7. .01 and .05, 1% and 5%
8. reject the null hypothesis, the research hypothesis is supported
9. 7.5%
10. do not reject the null hypothesis, the study is inconclusive

1. Cutoff (.05 level, one-tailed) = -1.65. Z score on comparison distribution of time to fall asleep is -1.86. Reject the null hypothesis that time to fall asleep is unaffected by a glass of warm milk.

2. Cutoff (.05 level, one-tailed) = +1.65. Z score on comparison distribution of bus times is 2.17. Reject the null hypothesis that the planes from this company arrive no later than planes of airlines in general.

3. Cutoff (.05 level, two-tailed) = ±1.96. Z score on comparison distribution of number of items recalled is -1.5. Do not reject the null hypothesis; the study is inconclusive.

Chapter 7

Multiple-Choice: 1-c, 2-c, 3-b, 4-d, 5-b, 6-b, 7-a, 8-c, 9-b, 10-c

Fill-Ins:
1. distribution of means, distribution of means of all possible samples of a given size from the population, sampling distribution of the mean
2. shape
3. 40.8
4. 2
5. -2
6. 30
7. the standard deviation of the distribution of means, σM
8. 1
9. norm
10. Z test

1. Cutoff (.01 level, one-tailed) = 2.33
   Distribution of means: μ = 48; σM² = σ²/N = 144/5 = 28.8, σM = 5.37; shape = normal.
M = 63; Z score for test group: Z= (63-48)15.37= 1515.37= 2.79 Conclusion: Reject the null hypothesis. 2. Cutoff (.O5 level, one-tailed) = -1.64. Distribution of means: p=38; oMZ=oZ/N=36/10=3.6,0 ~ 1 . 9 0 ;shape=normal. Z score for test group: Z= (36-38)/3.6= -211.90 = -1.05 Conclusion: Do NOT reject the null hypothesis. 3. a. A Type I error would be to concludefrom the hypothesistesting process that the data supportthe contentionthat the new learningtechniqueworks when in fact it does not. A Type I1 error would be if the hypothesis testing process led to an inconclusive result (failureto reject the null hypothesis) about whetherthe learning techniqueworks, when in fact it does work. b.A Type I error would be to conclude from the hypothesis testing process that the data supportthe contention that the new drug speeds learning of symbolic communicationfor chimps,when in fact the new drug does not speed such learning. A Type I1 error would be if the hypothesistesting process led to an inconclusiveresult (failure to reject the null hypothesis) about whether the new drug speeds learning,when in fact the new drug does speed the learning. Answers Fill-Ins: , ! I 1, a,alpha, significance level 2. D,beta, 1- power 3. 84% 4. 98%I 5. minimum meaningful effect size 6. effect size 7. medium 8. o, standard deviation of the population of individual cases, oZ,variance of the distribution of 1 individual cases I I 9. o,, standard deviation of the distributionof means, o$, variance of the distribution of means, I 10.meta-analysis Chapter 8 la. 1-1, = 3.80-.25 = 3.55; y2=3,80;o=1.50; oM=d(1.502125)= .30. Z score cutoff on the Population 2 distributionof means = -1.65. Raw score cutoff on the Population2 distribution of means = (-1.65)(.30)+3.80 = 3.30. Corresponding Z score on the Population 1 distributionof means = (3.30-3.55)/.30 = -23 Power (using normal curve table to determine area below -33) = 50%-29.67% = 20.33%. c. Effect size = (3.55-3.80)/1.50 = -.17; small effect size. d. 
Example answers: 1.Give a stronger lecture-and thus provide a basis for increasingthe expected difference between known and hypothesized means. 2.Use only children of a particular age-and thus hopefully reduce the population variation. 3.Use more accurate measurement of consumptionof candy (perhaps use records over a two-week period)-and thus hopefully further reduce the population variance. 4.Use more subjects-and thus reduce the variance of the distributions of means. I I Multiple Choice: 1-d, 2-d, 3-c, 4-a, 5-b, 6-b, 7-b, 8-a, 9-c, 10-b 2a. p, = 74; 0=12; G ~ =d(122130) = 2.19. Z score cutoff on the Population2 distribution of means = 2.33. Raw score cutoff on the Population 2 distribution of means = (2.33)(2.19)+68= 73.10. CorrespondingZ score on the Population 1distribution of means = (73.10-74)/2.19= -.41. Power (using normal curve table to determine area above -.41) = 50%+15.91% = 65.91%. c. Effect size = (74-68)/12 = .5; medium effect size. d. Example answers: 1.Use an even sadder movie if possible-and thus provide a basis for increasingthe expected difference between known and expected mean. 2.Use a more accuratememory test if one is available(one that has a lower amount of variation in the general population)-and thus furtherreduce the population variance. 3.Use more subjects-and thus reduce the variance of the distributionsof means. 4.Use the .05 significancelevel-and thus make the cutoff on the Population 2 distributionof means not so extreme. 3. Conclusion is that the research hypothesis(that ball players like to chew gum more than general public) is probably false and the null hypothesis (that ball players liking of gum is no different from that of the general public) is probably true. Explanation: With high power, it is easy tc.reject the null hypothesis if it is false. Thus, had it been false, it probably would have been rejected. Since it was not rejected, it probably is not false. 4. 
Conclusion:Result could be due to a very small effect size so that it may not be of practical importance. Explanation: With very high power, it would have been easy to reject the null hypothesis even if the true effect were very small. Chapter 9 Multiple-choice: 1-c,2-b, 3-a, 4-b, 5-a, 6-a, 7-d, 8-c, 9-c, 10-d Fill Ins: 1. degrees of freedom 2, known population mean 3. N, sample size, number of cases in the sample 4. robust 5. .8 6. larger 7. control group 8. greater, larger 9. level of significance,alpha level 10.normal distribution.normal curve Answers 1. t test for a single sample t needed (df-5),p < .05, 1-tailed= -2.015 M = 17;S=4.4; S,- .85; t = -3.53 Reject null hypothesis. 2. t test for dependent means t needed (df = 4),p < .05, I-tailed = 2.132 Difference scores = .07, .18, .14, .18, -.13 M = .088; S = .017;S,= .003;t = 26.08 Reject null hypothesis. 4. t needed (df = 8),p < .05, 1-tailed= -1.860 Difference scores = -2, -2, -4, -2, -6, -1, +1, +4,O M=-1.33; S =8.25; SM=.96; t=-1.39 Do NOT reject null hypothesis. Chapter 10 Multiple-choice: 1-b,2-a, 3-b, 4-c, 5-d, 6-d, 7-d, 8-b, 9-b, 10-c Fill Ins: 1. independentmeans 2. equal 3. distributionof means 4. the distributionof differencesbetween means 5. standard deviation 6. normally distributed 7. variance 8. Sp(pooled estimate of the standarddeviationof the populations) 9. less 10.t(23)=3.21,p < -01,one-tailed. Answers 1. t-test for independentmeans;t needed (df-8, p<.01,2-tailed) =h3.356. TRAINING:N=5, df-4, M=15.8, S=2.7; NO TRAINING:N=5, df-4, M=16, S=6.5. S,2=[(4/8)(2.7)]+[(4/8)(6.5)]=4.6;SM2=4.6/5=.92;Sm2=.92;SDE2=.92+.92=134; SDE= 1.36. t=(15.8-16)/1.36=-.15. Do NOT reject null hypothesis. The results of the study are inconclusiveas to whether self-defense training enhances self-confidence.. . 2. t-test for independentmeans; t needed (df-5,p<.05, 1-tailed)= 2.015. PLACEBO: N=4, df-3, M=90.3, S=184.9; CONTROL:N=3, df-2, M=88, S=133. 
Sp2=[(3/5)(184.9)]+[(2/5)(133)]=164.15; SM2=164.15/4=41;Sm2=164.15/3=54.7; sDF2=41+54.7=95.7;SDE=9.78. t=(90.3-88)/9.78=.24. Do NOT reject the null hypothesis. The results of the study are inconclusiveas to whether this placebo procedure can produce an increase in intelligence. Chapter 11 Multiple-choice: 1-a,2-a, 3-b, 4-d, 5-c, 6-a, 7-b, 8-d, 9-d, 10-c Fill Ins: 1. equal (the same) 2. within 3. one 4. the signal 5. SM2,variance of the distributionof means 6. distribution of means 7. population 8. the population standarddeviation (oor Sw). 9. .10 10.less than 1. Fneeded (df-2,12;p < .05) = 3.89 Small:M=60.6,S=33.3; Standard:M=60.2, S=89.2; Large:M=54.2, S=59.2. GM=58.3; SM2=12.9;SB2= 64.3; Sw2= 60.6;F = 1.06. Do not reject the null hypothesis. Answers 2. F needed (df=3,8;p < .01) = 7.59 A' COLLEGE: M=4.33, S=2.33; PSYCHOLOGISTS:M=6, S = l ; LAWYERS: M=6, S = l ; PUBLIC: M=4.33, S=.33. GM=5.17;SM2=.93;SB2= 2.80; Sw2= 1.17;F= 2.39. Do not reject null hypothesis. 3. Fneeded (df=3,76;p<.01) = 4.06 (using figure for df = 3,75). GMs3.87; SM2=1.70;SB2=34;Sw2=3;F=11.33 Reject the null hypothesis;the researchhypothesis is supported. Chapter 12 Multiple-choice: I-c, 2-d, 3-a, 4-a, 5-a, 6-c, 7-c, 8-a, 9-c, 10-b 1. SS,, (M-GM)2,the sum of each scores' group's squared deviationsfrom the grand mean 2. 89+105= 194 3. 14+60= 74 4. 75+45 = 120 5. Source SS df MS F Between 40 4 I0 5 Within 50 25 2 Total 90 29 6. Bonferroni procedure 7. post-hoc comparisons,a posterioricomparisons 8. .06 9. .01 10.No Coffee 1. Fneeded (df-2,7; p < .01)= 9.55 Analysis of Variance Table Source SS d f MS F Between 48.2 2 24.1 22.9 Within 7.4 7 1.1 Total 55.6 9 Decision: Reject null hypothesis. Answers 2. Fneeded (dp2,8; pC.05) = 4.46 Analysis of Variance Table Source SS df MS F Between 5.06 2 2.53 3.09 Within 6.59 8 .82 Total 11.65 10 Decision: Do not reject null hypothesis. Chapter 13 Multiple-choice: 1-c,2-a, 3-c, 4-c, 5-b, 6-c, 7-d, 8-d, 9-c, 10-a Fill-Ins: 1. interaction 2. 5 X 3 , 3 X 5 3. 
parallel 4. interaction effect 5. between 6. grand mean, overall mean 7. Nc,,,,, number of cells 8. RZ,the proportion of variance accounted for, effect size 9. number of levels of the variable with which it is crossed 10.repeated-measures, within-subjects 3. a . 8 1 *Short Level of IndependentVariable I1 7 1 6 1 o -Long Level of IndependentVariable I1 5 I 4 1 0 *********drawin lines********** 3 1 2 1 1 1 0 1 - v v v A B C Independent Variable I b. Levels of Independent Variable I Answers , A B C Levels ofI Independent ) SHORT 6 6 6 6 Variable I1 } LONG 4 6 8 6 c. Main effects for independent Variable I and interaction effect. 4. Fneeded for main effect of ethnicity (df-2,12; p<.05) = 3.89 Fneeded for main effect of parent's income (dp1,12;p<.Ol) = 9.33 Fneeded for interaction effect (dp2,12;p<.01) = 6.93 Analysis of Variance Table Source SS df MS F reject? R2I Between-Groups 1 Ethnicity 1.44 2 .72 .93 no -13 I Parent's Income 18.00 1 18.00 23.14 yes .66 I Interaction 66.33 2 33.17 42.64 yes .88 I Within-Group 9.33 12 .78 Chapter 14 1 Multiple-choice: 1-c, 2-c, 3-b, 4-a, 5-d, 6-c, 7-c, 8-b, 9-b, 10-b Fill Ins: 1. Karl Pearson 2. chi-square test for goodness of fit 3. E 4. The number of categories minus 1 5. independent 6. row 7. normal, normal distributions,normally distributed 8. 4 { x2 ([IVl[d'I) 1 9. sample size; number of cases 10. a. 30160 = 50%; b. 14/44= 32%; c. $2.42180) = .17 Answers 2. X2needed (df-3,p < .05) = 7.815 0 E 0-E (0-E)' (0-E)'/E PlantA 15 21 -6 36 1.71 Plant B 34 21 13 169 8.05 PlantC 17 21 -4 16 .76 PlantD 18 21 -3 9 .43 -- - 84 60 0 X2=10.95 Decision: Reject null hypothesis. 3. X2needed (df-4, p < .05) = 9.488 Level of Mother's Depression During Pregnancy Birthweight Severe Mild b it Depressed I I I Total Percent below (4.7) (4.7) (4.7) average 8 5 1 14 23.3 I I 1 I I I average ( 7-71 (7.7) (7.7) 3 8 12 23 38.3 I I I I I above (7.7) (7.7) (7.7) average 9 7 7 23 38.3 TOTAL 20 20 20 60 99.9 XZ= 10.76 Decision: Reject null hypothesis. 
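The chi-square arithmetic in the Chapter 14 answers can be checked with a few lines of code. The sketch below recomputes the goodness-of-fit statistic for the four plants in answer 2 (observed 15, 34, 17, 18 against an expected 21 each) and the Cramer's φ figure for answer 3 from the printed χ² of 10.76, N of 60, and smaller-side df of 2. The function names are illustrative, not taken from the study guide.

```python
from math import sqrt


def chi_square(observed, expected):
    """Sum over categories of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def cramers_phi(chi2, n, df_smaller):
    """Cramer's phi: square root of chi-square over (N times smaller-side df)."""
    return sqrt(chi2 / (n * df_smaller))


# Answer 2: equal expected frequencies across the four plants.
observed = [15, 34, 17, 18]
expected = [sum(observed) / len(observed)] * len(observed)  # 21 each
chi2 = chi_square(observed, expected)

# Answer 3: effect size from the printed chi-square value.
phi = cramers_phi(10.76, 60, 2)

if __name__ == "__main__":
    print(f"chi-square = {chi2:.2f} (cutoff 7.815)")
    print(f"Cramer's phi = {phi:.2f}")
```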
Cramer's φ = √{10.76/([60][2])} = .30, medium effect size.
4. Cramer's φ = .35, medium effect.

Chapter 15
Multiple Choice: 1-b, 2-b, 3-c, 4-b, 5-a, 6-d, 7-d, 8-d, 9-c, 10-b
Fill-Ins:
1. non-normal; skewed; kurtotic
2. outlier
3. inverse
4. nonparametric tests; distribution-free tests
5. t-test for independent means
6. rank-order
7. normal curve, normal distribution
8. equal-interval
9. randomization
10. computer-intensive methods; randomization tests; approximate randomization tests

1. A. skewed right (due to an outlier)
B. approximately normal
2a. t needed (df = 8, 1-tailed, p < .05) = 2.306
        Did Not Expect    Did Expect
M       87.4              65.2
S       63.3              407.7
SP² = 235.5
SM²     47.1              47.1
SDIF² = 94.2; SDIF = 9.71
t = (87.4-65.2)/9.71 = 2.29
Did not expect to be liked: 77, 83, 88, 91, 98
Did expect to be liked: 46, 57, 58, 66, 99
Do not reject null hypothesis.
b. SQUARE ROOT TRANSFORMED DATA
        Did Not Expect    Did Expect
        8.77              6.78
        9.11              7.55
        9.38              7.62
        9.54              8.12
        9.90              9.95
M       9.34              8.00
S        .18              1.41
SP² = .80
SM²      .16               .16
SDIF² = .32; SDIF = .57
t = (9.34-8.00)/.57 = 2.35
Reject null hypothesis.
3. t needed (df = 8, two-tailed, p < .05) = 2.306
        Movers            Non-Movers
        Raw    Rank       Raw    Rank
         3      5          1      2
         1      2          4      6
         9      9          5      7
        13     10          2      4
         6      8          1      2
M              6.80              4.20
S             10.70              5.20
SP² = 6.00
SM²            1.20              1.20
SDIF² = 2.40; SDIF = 1.55
t = (6.8-4.2)/1.55 = 1.68
Do not reject null hypothesis.
4. Mean difference for our distribution: -1.6
(20)(.05) = 1, score must be the most extreme score
Mean differences in order of the layout in the problem are as follows:
-1.6 -3.4 -4.27 -1 -.07 -.93 2.33 -2.73 .53 -.33
1.6 3.4 4.27 1 .07 .93 -2.33 2.73 -.53 .33
Mean differences in order from smallest to largest:
-4.27 -3.4 -2.73 -2.33 -1.6 -1 -.93 -.53 -.33 -.07 .07 .33 .53 .93 1 1.6 2.33 2.73 3.4 4.27
Actual mean is fifth highest. Therefore, do not reject null hypothesis.

Chapter 16
Multiple Choice: 1-c, 2-d, 3-b, 4-a, 5-c, 6-d, 7-c, 8-b, 9-b, 10-c
Fill-Ins:
1. multiple regression/correlation
2. a
3. difference between (or variation among) group means; variation within groups
4. numerator; multiplies
5. the t score
6. t-test for independent means, analysis of variance with two groups
7. multiple regression/correlation
8. ss,
9. 0, 0, 1, 0; Humanities = 0, Social Science = 0, Natural Science = 1, Arts = 0
10. generative

1. t-test for independent means
df = 8
needed t (p < .05) = 2.306
SP² = 71.65
SM1² = 71.65/5 = 14.33; SM2² = 14.33 (Denominator divided by n = 5)
SDIF² = 14.33+14.33 = 28.66; SDIF = √28.66 = 5.35
t = 3.6/5.35 = .68
Do not reject null hypothesis.
Analysis of Variance
df = 1, 8
needed F (p < .05) = 5.32; √5.32 = 2.307
SW² = 71.65
SM² = 6.48; SB² = (6.48)(5) = 32.4 (Numerator multiplied by n = 5)
F = 32.4/71.65 = .45; √.45 = .67
Do not reject null hypothesis.
2. t-test for independent means
df = (5-1)+(5-1) = 8
needed t (p < .05) = 2.306
Computations:
S1² = 1; S2² = 1.5; SP² = 1.25
SM1² = 1.25/5 = .25; SM2² = .25; SDIF² = .25+.25 = .5; SDIF = .71
t = 1/.71 = 1.41
Do not reject null hypothesis.
Correlation coefficient
df = 10-2 = 8
needed t (p < .05) = 2.306
Computations:
Code X as 1 or 0; SDX = .5; MX = .5
MY = 1.5; SDY = 1.12
Σ(ZXZY) = 4.47; r = 4.47/10 = .45
t = (.45)(√8)/√(1-.45²) = 1.43
Do not reject null hypothesis.
3. Analysis of Variance
Source     SS        df    MS        F
Between    2664.5     1    2664.5    3.67
Within     4359       6     726.5
Total      7023.5
Correlation coefficient
Code X as 1 or 0; SDX = .5; MX = .5
MY = 1.5; SDY = 1.12
Σ(ZXZY) = -4.91; r = -4.91/8 = -.61
Ŷ = 228 - (36.5)(X)
Subject    X    Y    MY    (Y-MY)    (Y-MY)²    (Y-Ŷ)    (Y-Ŷ)²
4.
            Strain
Pigeon    S or Not    G or Not    R or Not    Light Detection Level
 1        1           0           0           .014
 2        0           1           0           .058
 3        1           0           0           .028
 4        0           0           1           .101
 5        0           0           0           .024
 6        0           1           0           .077
 7        0           0           1           .088
 8        0           0           0           .023
 9        1           0           0           .019
10        1           0           0           .010
11        0           0           1           .075
12        0           0           0           .043
13        0           0           0           .028
14        0           0           1           .098
15        0           1           0           .071
16        0           1           0           .083

Chapter 17
Multiple Choice: 1-c, 2-b, 3-c, 4-d, 5-d, 6-a, 7-b, 8-c, 9-a, 10-d
Fill-Ins:
1. stepwise multiple regression, stepwise regression
2. dichotomous, having two values
3. factor loading
4. beta, standardized regression coefficient
5. latent
6. latent variable causal modeling, structural equation modeling
7. covariate
8. analysis of covariance, ANCOVA
9. differences
10. Fisher

Nominal variable: A variable with values that are categories, with no numeric relation; same as a categorical variable.
Chi-square statistic (χ²): A statistic that reflects the overall lack of fit between the expected and observed frequencies; the sum, over all categories or cells, of the squared difference between observed and expected frequencies divided by the expected frequency.
Chi-square test for goodness of fit: A hypothesis-testing procedure that examines how well an observed frequency distribution of a categorical variable fits some expected pattern of frequencies.
Contingency table: A two-dimensional chart showing frequencies in each combination of categories of two categorical variables.
Expected frequency: In a chi-square test, the number of cases in a category or cell expected if the null hypothesis were true.
Independence: The situation of no systematic relationship between two variables.
Observed frequency: In a chi-square test, the number of cases in a category or cell actually obtained in the study.
Chi-square test for independence: A hypothesis-testing procedure that examines whether the distribution of frequencies over the categories of one categorical variable is unrelated to the distribution of frequencies over the categories of another categorical variable.

Marginal frequency: In chi-square, the frequency in a row or column of a contingency table.
Rank-order test: A hypothesis-testing procedure which makes use of rank-ordered data.
Phi coefficient (φ): A measure of association between two dichotomous categorical variables.
Nonparametric test: A hypothesis-testing procedure making no assumptions about population parameters; approximately the same as a distribution-free test.
Cramer's phi (Cramer's φ): A measure of association between two categorical variables that is applicable regardless of the number of levels of the two variables.
Distribution-free test: A hypothesis-testing procedure making no assumptions about population distributions; approximately the same as a nonparametric test.
Data transformation: The application of one of several mathematical procedures (e.g., taking the square root, log, inverse) to each score in a sample in order to make the sample distribution closer to normal.
Parametric test: An ordinary hypothesis-testing procedure which makes assumptions about the shape and other parameters of the populations.

Levels of measurement: Types of underlying numerical information provided by a measure, such as equal-interval, rank-order, and nominal (categorical).
Randomization test: A hypothesis-testing procedure that considers every possible reorganization of the data in the sample in order to determine if the organization of the actual sample data was unlikely to occur by chance.
Approximate randomization tests: Approximation to a randomization test in which a computer creates a randomly-determined large number of the possible reorganizations of the data.
Equal-interval measurement: Measurement in which the difference between any two scores represents an equal amount of difference in the underlying thing being measured.
Ordinal measurement: Measurement in which the scores are ranks.
General linear model: A general formula that is the basis of most of the statistical methods covered in this text.
Least-squares model: The usual method of determining the optimal values of regression coefficients as those that produce the least squared error between predicted and actual values.
Nominal coding: Converting a nominal predictor variable in an analysis of variance into several two-level numeric variables.

Hierarchical multiple regression: A procedure in which predictor variables are added in a planned sequential fashion to examine the contribution of each over and above those already included.
Controlling for: Removing the influence of a variable from the association among the other variables; same as partialing out.
Stepwise multiple regression: An exploratory procedure which identifies the best subset of potential predictor variables.
Reliability: The degree of consistency of a measure.
Partial correlation coefficient: The correlation between two variables, over and above the influence of one or more other variables.
Test-retest reliability: The correlation between scores obtained on a measure at two different testings of the same people.
Partialing out: Removing the influence of a variable from the association among the other variables; same as controlling for.
Split-half reliability: Correlation of the scores from items representing two halves of a test.

Cronbach's alpha: A widely used index of a measure's reliability in terms of the correlations among the items.
Path coefficient: The degree of relation associated with an arrow in a path analysis (including latent variable models).
Factor analysis: An exploratory statistical procedure that identifies groupings of variables (factors) correlating maximally with each other and minimally with other variables.
Causal analysis: A procedure, such as path analysis or latent variable modeling, that analyzes correlations among a group of variables in terms of a predicted pattern of causal relations among them.
Factor loading: The correlation of a variable with a factor.
Exogenous variable: A variable in a path analysis (including latent variable models) that is at the start of a causal chain, having no arrows to it within the path diagram.
Endogenous variable: A variable in a path analysis (including latent variable models) that has arrows to it.
Path analysis: A method of analyzing the correlations among a group of variables in terms of a predicted pattern of causal relations.

Latent variable modeling: A sophisticated type of path analysis that involves latent (unmeasured) variables, permits a kind of significance test, and provides measures of the overall fit of the data to the hypothesized causal pattern.
Analysis of covariance: An analysis of variance which is conducted after first adjusting the variables to control for the effect of one or more unwanted additional variables.
Structural equation modeling: Same as latent variable modeling.
Covariate: A variable controlled for in an analysis of covariance.
Measurement model: In latent variable modeling, the set of causal paths between latent and measured variables.
Causal model: In latent variable modeling, the set of causal paths between latent variables.
Multivariate analysis of variance: An analysis of variance in which there is more than one dependent variable.
Multivariate analysis of covariance: An analysis of covariance in which there is more than one dependent variable.

Assumption: A condition, such as a population having a normal distribution, required for carrying out a particular hypothesis-testing procedure.
Weighted average: An average in which the scores being averaged do not have equal influence on the total.
Robustness: The extent to which a particular hypothesis-testing procedure is reasonably accurate even when its assumptions are violated.
Pooled estimate of the population variance: An average of the estimates of the population variance from two samples, each estimate weighted by the proportion of its degrees of freedom of the total degrees of freedom.
Variance of a distribution of differences between means: It equals the sum of the variances of the distributions of means corresponding to each of two samples.
t test for independent means: Hypothesis-testing procedure in which there are two separate groups of subjects whose scores are independent of each other and in which the population variance is not known.
Distribution of differences between means: The distribution of all possible differences between means of two samples; the comparison distribution in a t test for independent means.
Harmonic mean: A special kind of average which is more influenced by smaller scores.

Analysis of variance: A hypothesis-testing procedure for studies involving two or more groups.
F distribution: A mathematically defined curve describing the comparison distribution used in an analysis of variance; the distribution of F ratios when the null hypothesis is true.
Between-group population variance estimate: In analysis of variance, the estimate of the variance of the population distribution of individual cases based on the variation among the means of the groups studied.
Structural model: A way of understanding ANOVA as a division of the deviation of each score from the overall mean into parts corresponding to its deviation from its group's mean and its group's mean's deviation from the overall mean.
Within-group population variance estimate: In analysis of variance, the estimate of the variance of the distribution of the population of individual cases based on the variation among the scores within each of the groups studied.
Analysis of variance table: A chart showing the major elements in computing an analysis of variance using the structural-model approach.
F ratio: In analysis of variance, the ratio of the between-group population variance estimate to within-group population variance estimate.
Multiple comparisons: Procedures for examining the differences among particular means in the context of an overall analysis of variance.
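To make the F ratio definition concrete, here is a minimal sketch that rebuilds the figures reported in the Chapter 11 answer to problem 1 (group means 60.6, 60.2, 54.2 and group variance estimates 33.3, 89.2, 59.2). It assumes equal group sizes of 5, which is inferred from the reported df of 2 and 12 rather than stated directly, and the function name `f_ratio` is illustrative.

```python
from statistics import mean, variance


def f_ratio(group_means, group_variances, n_per_group):
    """Return (between estimate, within estimate, F) for equal-sized groups."""
    # Variance of the distribution of means, estimated from the group means
    # (statistics.variance divides by the number of groups minus 1).
    s2_m = variance(group_means)
    # Between-group estimate: multiply by the number of cases per group.
    s2_between = s2_m * n_per_group
    # Within-group estimate: average of the groups' variance estimates.
    s2_within = mean(group_variances)
    return s2_between, s2_within, s2_between / s2_within


s2_between, s2_within, f = f_ratio([60.6, 60.2, 54.2], [33.3, 89.2, 59.2], 5)

if __name__ == "__main__":
    print(f"between = {s2_between:.1f}, within = {s2_within:.1f}, F = {f:.2f}")
```

The result can then be compared with the F needed for the stated degrees of freedom (3.89 in that answer) to reach the decision.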
Planned comparisons Linear contrast Bonferroni procedure Factorial design Post-hoc comparisons Interaction effect I Proportion of variance accounted for Main effect I A special kind of planned comparison, that is like a correlation, in which, for each subject, one variable is the predicted influence of the group the subject is in and the other variable is the score on what is being measured. 12 t- A way of organizing a study in which the influence of two or more variables is studied at once by constructing groupings which include every combination of the levels of the variables. 12 I- Situations in factorial analysis of variance in which the combination of variables has a special effect which you could not predict from knowing about the effects of each of the two variables separately. 13 t- Difference between groups on one dimension of a factorial design (sometimes used only for significant differences). Multiple comparisons in which the particular means to be compared were designated in advance. A multiple-comparison procedure in which the total alpha percentage is divided among the set of comparisons so that each is tested at a more stringent significance level. 12 -I Multiple comparisons among particular means which were not designated in advance, but are being conducted as part of an exploratory analysis after the study is completed. 12 An indicator of effect size in analysis of variance; same as the proportionate reduction in error in multiple regression. Cell mean Least-squares analysis of variance I Collapsing over factors One-way analysis of variance I Dimension Two-way analysis of variance I Marginal mean ~epeated-measuresanalysis of I variance i The recommended approach to The mean of a particular factorial analysis of variance when combination of levels of the the number of subjects in the cells independent variables in a factorial are not all equal. design. 
A procedure in a factorial analysis Analysis of variance in which there of variance in which one of the is only one independent variable. dimensions (independent variables) is ignored, reducing the overall analysis to one less dimension. 13 13 I- __I In a factorial design, one of the Analysis of variance for a two-way independent variables crossed with factorial design. another independent variable. t-- In a factorial design, the mean Analysis of variance in which all score for all the subjects at a the levels of the independent particular level of one of the variable(s) are measured within the independent variables; row mean or same subjects. column mean. 13 i 13 I Hypothesis testing Cutoff sample score Research hypothesis Statistical significance I Null hypothesis Level of significance (or) I I Comparison distribution Conventional levels of significance A systematic procedure for The point on the comparison determining whether results of an distribution which, if the sample experiment provide support for a score reaches or exceeds it, the null particular theory or practical hypothesis will be rejected. innovation thought to be applicable to a population. 6 6 t- -I An outcome of -hypothesistesting in which the null hypothesis is A statement about the predicted rejected. relation between populations. The probability of obtaining statistical significanceif the null hypothesis is actually true. A statement that there is no difference between 'the populations. The distribution representing the The levels of significancewidely situation if the null hypothesis is used in psychology @ < .05 and p true and to which you compare < .01). your sample. 
Directional hypothesis Distribution of means One-tailed test Variance of a distribution of means I Nondirectional hypothesis Type I error Two-tailed test Type I1 error A distribution of all the possible A research hypothesis predicting a means of samples of a given size particular direction of difference from a particular population. between populations. The variance of the population The hypothesis-testing procedure divided by the number of cases in for a directional hypothesis. each sample. Rejecting the null hypothesis when A research hypothesis that does not in fact it is true. predict a particular direction of difference between populations. Failing to reject the null hypothesis The hypothesis-testing procedure when in fact it is false. for a nondirectional hypothesis. Norms Beta Z test Effect size Statistical power Effect-size conventions Alpha I The probability of a Type I1 error. The separation (lack of overlap) between or among populations due to the independent variable. Known population parameters on standardized tests which serve as standards of comparison for any individual who takes the test. A hypothesis-testing procedure in which there is a single sample and the population variance is known. The probability that the study will Conventions about what to consider yield a significant result if the a small, medium, and large effect research hypothesis is true. size. A statistical method for combining The probability of a Type I error; the results of independent studies, same as significance level. usually focusing on effect sizes. t test Repeated-measures design I I Unbiased estimate of the population variance I Within-subject design I Degrees of freedom t test for dependent means I I t distribution Difference score I A research strategy in which each subject is tested more than once; same as within-subject design. A research strategy in which each subject is tested more than once; same as repeated-measures design. 
A hypothesis-testing procedure in which the population variance is unknown.
An estimate of the population variance, based on sample scores, which is equally likely to over- or underestimate the true population variance.
A hypothesis-testing procedure in which there are two scores for each subject and the population variance is not known.
The number of scores free to vary when estimating a population parameter.
The difference between a subject's score on one testing and the same subject's score on another testing.
A mathematically defined curve describing the comparison distribution used in a t test.

Independent variable: A variable that is considered to be a cause (or, in regression, any predictor variable).
Positive linear correlation: A relation between two variables in which high scores on one go with high scores on the other, mediums with mediums, and lows with lows.
Dependent variable: A variable that is considered to be an effect (or, in regression, any variable that is predicted about).
Negative linear correlation: A relation between two variables in which high scores on one go with low scores on the other, mediums with mediums, and lows with highs.
Predictor variable: A variable that is used as a basis for estimating scores of individuals on another variable.
Linear correlation: A relation between two variables which shows up on a scatter diagram as the dots roughly following a straight line.
Scatter diagram: A graphic display of the pattern of relationship between two variables.
Curvilinear correlation: A relation between two variables which shows up on a scatter diagram as the dots roughly following a systematic pattern that is not a straight line.

The situation in which a correlation is computed when only a limited range of the possible values on one variable is included in the group studied.
A relation between two variables which shows up on a scatter diagram as the dots exactly following a straight line; a correlation of r equals 1 or -1.
A statistical procedure that computes the correlation between two variables that would be expected if both variables were measured with perfect reliability.
A table showing the relation between two variables in which each variable is divided in half at the median and the percentage of cases in each of the four combinations is shown.
The average of the cross-products of Z scores of two variables.
The extent to which the results of a study would be unlikely if in fact there were no association or difference in the populations the measured scores represent.
A table in which the variables are named on the top and along the side, and the correlations among them are all shown.
The measure of association between variables that is used when comparing associations obtained in different studies or with different variables; r².

Proportionate reduction in error: The reduction in squared error using a bivariate or multiple regression prediction rule over the squared error using the mean to predict, expressed as a proportion of the squared error when using the mean to predict.
Regression line: A line on a graph representing the predicted values of the dependent variable for each value of the independent variable.
Bivariate prediction: The prediction of scores on one variable based on scores of one other variable.
Error (in prediction): The actual score minus the predicted score.
Regression coefficient: The number multiplied by a person's score on the independent variable as part of a formula for predicting scores on the dependent variable.
Multiple regression: The prediction of scores on one variable based on scores of two or more other variables.
Regression constant: In a bivariate or multiple prediction model using raw scores, a particular fixed number added into the prediction.
Multicollinearity: The situation in multiple regression in which the predictor variables are correlated with each other.

Probability (p): The expected relative frequency of a particular outcome.
Random selection: A procedure of selecting a sample of individuals to study which uses truly random procedures.
Subjective interpretation of probability: Understanding probability as the degree of one's certainty that a particular outcome will occur.
Haphazard selection: A procedure of selecting a sample of individuals to study by taking whoever is available or happens to be first on a list.
Population: The scores of the entire group of subjects to which a researcher intends the results of a study to apply.
Parameter: A descriptive statistic for a population.
Sample: The scores of the particular set of subjects studied that are intended to represent the scores in some larger population.
Statistic: A descriptive statistic computed from the data about a particular sample of scores.

Mode: The value with the greatest frequency in a distribution.
Standard deviation: The square root of the average of the squared deviations from the mean; roughly the average amount scores in a distribution vary from the mean.
Median: If you line up all the scores from highest to lowest, the middle score.
Definitional formula: The equation directly displaying the meaning of the procedure it symbolizes.
Outlier: A score with an extreme (very high or very low) value in relation to the rest of the scores in the distribution.
Computational formula: An equation mathematically equivalent to the definitional formula that is easier to use for hand computation.
Variance: The average of the squared deviations from the mean.
Z score: The number of standard deviations a score is above (or below, if it is negative) the mean in its distribution.

Symmetrical distribution: A distribution in which the patterns of frequencies on the left and right sides are mirror images of each other.
Normal curve: A specific, mathematically defined, bell-shaped frequency distribution which is symmetrical and unimodal.
Skewness: The extent to which a frequency distribution has the preponderance of cases on one side of the middle.
Kurtosis: The extent to which a frequency distribution is too peaked and pinched together or too flat and spread out, in comparison to the normal curve.
Floor effect: The situation in which many scores pile up at the low end because it is not possible to have any lower score.
Mean: The arithmetic average of a group of scores; the sum of the scores divided by the number of scores.
Ceiling effect: The situation in which many scores pile up at the high end because it is not possible to have a higher score.
Central tendency: The typical or most representative value of a group of scores.

Rectangular distribution: A frequency distribution in which all values have approximately the same frequency.
Frequency distribution: The pattern of frequencies over the various values; what a frequency table, histogram, or frequency polygon describes.
Interval size: In a grouped frequency table, the difference between the start of one interval and the start of the next.
Unimodal distribution: A frequency distribution with one value clearly having a larger frequency than any other.
Histogram: A bar-like graph of a distribution in which the values are along the horizontal axis and the height of each bar corresponds to the frequency of that value.
Bimodal distribution: A frequency distribution with two approximately equal frequencies, each clearly larger than any of the others.
Frequency polygon: A type of line graph of a distribution in which the values are along the horizontal axis and the height of each point connected by the lines corresponds to the frequency of that value.
Multimodal distribution: A frequency distribution with two or more approximately equal frequencies, each clearly larger than any of the others.
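
Several of the definitions on these cards are really formulas in words: variance, standard deviation, Z score, the correlation as the average of the cross-products of Z scores, error in prediction, and proportionate reduction in error. The Python sketch below is one possible way to check them on a small set of made-up scores; it is an illustration, not a procedure taken from the cards. The function names and the data are hypothetical, and the least-squares formulas for the regression coefficient and regression constant are the standard ones, which the cards do not state.

```python
import math

def mean(scores):
    # Sum of the scores divided by the number of scores.
    return sum(scores) / len(scores)

def variance(scores):
    # Average of the squared deviations from the mean (divides by N, as in the card).
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores) / len(scores)

def standard_deviation(scores):
    # Square root of the variance.
    return math.sqrt(variance(scores))

def z_scores(scores):
    # Number of standard deviations each score is above or below the mean.
    m, sd = mean(scores), standard_deviation(scores)
    return [(x - m) / sd for x in scores]

def correlation(xs, ys):
    # Average of the cross-products of Z scores.
    return mean([zx * zy for zx, zy in zip(z_scores(xs), z_scores(ys))])

def bivariate_prediction_rule(xs, ys):
    # Assumption: standard least-squares formulas (not given on the cards).
    b = correlation(xs, ys) * standard_deviation(ys) / standard_deviation(xs)
    a = mean(ys) - b * mean(xs)
    return a, b  # regression constant, regression coefficient

def proportionate_reduction_in_error(xs, ys):
    a, b = bivariate_prediction_rule(xs, ys)
    # Error is the actual score minus the predicted score.
    ss_error = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Squared error when the mean is used to predict every score.
    ss_total = sum((y - mean(ys)) ** 2 for y in ys)
    return (ss_total - ss_error) / ss_total

# Hypothetical scores for five subjects on two variables.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = correlation(xs, ys)
pre = proportionate_reduction_in_error(xs, ys)
print(r, pre)
```

With a single predictor, the proportionate reduction in error should come out equal to the squared correlation, which is the link between these two cards.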
Descriptive statistics: Procedures for summarizing or otherwise making more comprehensible a set of scores.
Value: A possible number or category that a score can have.
Inferential statistics: Procedures for drawing conclusions based on, but going beyond, the scores actually collected in a research study.
Score: A particular subject's value on a variable.
Frequency table: A listing of the number of subjects receiving each of the possible values that the scores on the variable being measured can take.
Grouped frequency table: A frequency table in which the number of subjects is indicated for each interval of values.
Variable: A characteristic that can take on different values.
Interval: In a grouped frequency table, each of a specified-sized grouping of values for which frequencies are reported.
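
The frequency table and grouped frequency table defined above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: scores are whole numbers, the function names and the sample scores are hypothetical, and each interval starts at a multiple of the interval size, as the chapter outline recommends.

```python
from collections import Counter

def frequency_table(scores):
    # List every possible value from highest to lowest with its frequency.
    # Assumption: scores are integers, so every value in the range is listed.
    counts = Counter(scores)
    return [(value, counts[value])
            for value in range(max(scores), min(scores) - 1, -1)]

def grouped_frequency_table(scores, interval_size):
    # Each interval starts at a multiple of the interval size, so intervals
    # cannot overlap. Floor division finds the start of a score's interval.
    counts = Counter((s // interval_size) * interval_size for s in scores)
    lowest = (min(scores) // interval_size) * interval_size
    highest = (max(scores) // interval_size) * interval_size
    return [((start, start + interval_size - 1), counts[start])
            for start in range(highest, lowest - 1, -interval_size)]

# Hypothetical scores for ten subjects.
scores = [7, 8, 8, 5, 9, 7, 8, 3, 6, 7]

for value, frequency in frequency_table(scores):
    print(value, frequency)

for (low, high), frequency in grouped_frequency_table(scores, 2):
    print(f"{low}-{high}", frequency)
```

Listing values with a frequency of zero (here, the value 4) keeps the gaps in the distribution visible, which matters when the table is later drawn as a histogram or frequency polygon.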