CHAPTER 7
The Logic of Sampling

CHAPTER OVERVIEW
Now you'll see how social scientists can select a few people for study and discover things that apply to hundreds of millions of people not studied.

Introduction
A Brief History of Sampling
President Alf Landon
President Thomas E. Dewey
Two Types of Sampling Methods
Nonprobability Sampling
Reliance on Available Subjects
Purposive or Judgmental Sampling
Snowball Sampling
Quota Sampling
Selecting Informants
The Theory and Logic of Probability Sampling
Conscious and Subconscious Sampling Bias
Representativeness and Probability of Selection
Random Selection
Probability Theory, Sampling Distributions, and Estimates of Sampling Error
Populations and Sampling Frames
Review of Populations and Sampling Frames
Types of Sampling Designs
Simple Random Sampling
Systematic Sampling
Stratified Sampling
Implicit Stratification in Systematic Sampling
Illustration: Sampling University Students
Multistage Cluster Sampling
Multistage Designs and Sampling Error
Stratification in Multistage Cluster Sampling
Probability Proportionate to Size (PPS) Sampling
Disproportionate Sampling and Weighting
Probability Sampling in Review
The Ethics of Sampling

Introduction

One of the most visible uses of survey sampling lies in the political polling that is subsequently tested by election results. Whereas some people doubt the accuracy of sample surveys, others complain that political polls take all the suspense out of campaigns by foretelling the result. Going into the 2008 presidential elections, pollsters were in agreement as to who would win, in contrast to their experiences in 2000 and 2004, which were closely contested races. Table 7-1 reports polls conducted during the few days preceding the election. Despite some variations, the overall picture they present is amazingly consistent and pretty well matches the election results.
Now, how many interviews do you suppose it took each of these pollsters to come within a couple of percentage points in estimating the behavior of more than 131 million voters? Often fewer than 2,000! In this chapter, we're going to find out how social researchers can pull off such wizardry.

TABLE 7-1 Election-Eve Polls Reporting Presidential Voting Plans, 2008

Poll                          Date Ended   Obama (%)   McCain (%)
Fox                           Nov 2        54          46
NBC/WSJ                       Nov 2        54          46
Marist College                Nov 2        55          45
Harris Interactive            Nov 3        54          46
Reuters/C-SPAN/Zogby          Nov 3        56          44
ARG                           Nov 3        54          46
Rasmussen                     Nov 3        53          47
IBD/TIPP                      Nov 3        54          46
DailyKos.com/Research 2000    Nov 3        53          47
GWU                           Nov 3        53          47
Marist College                Nov 3        55          45
Actual vote                   Nov 4        54          46

Sources: Poll data are adapted from data presented at Pollster.com (http://www.pollster.com/polls/us/08-us-pres-ge-mvo.php) on January 29, 2009. The official election results are from the Federal Election Commission (http://www.fec.gov/pubrec/fe2008/2008presgeresults.pdf) on the same date. For simplicity, since there were no undecideds in the official results and each of the third-party candidates received less than one percent of the vote, I've apportioned the undecided and other votes according to the percentages saying they were voting for Obama or McCain.

In the 2012 presidential election, the preelection polls again clustered closely around the actual popular votes for Barack Obama and Mitt Romney. Most correctly predicted the president would win reelection in a close race. Of course, the president is not elected by the nation's overall popular vote, but by the Electoral College, determined by how the votes go in the individual states. Former sports statistician Nate Silver conducted a meta-analysis of the many polls by a large number of polling firms and correctly predicted the 2012 outcomes in all the states and hence in the Electoral College (Terdiman 2012).

For another powerful illustration of the potency of sampling, look at this graphic portrayal of then President George W.
Bush's approval ratings prior to and following the September 11, 2001, terrorist attack on the United States (see Figure 7-1). The data reported by several different polling agencies describe the same pattern.

Political polling, like other forms of social research, rests on observations. But neither pollsters nor other social researchers can observe everything that might be relevant to their interests. A critical part of social research, then, is deciding what to observe and what not to. If you want to study voters, for example, which voters should you study?

The process of selecting observations is called sampling. Although sampling can mean any procedure for selecting units of observation (for example, interviewing every tenth passerby on a busy street), the key to generalizing from a sample to a larger population is probability sampling, which involves the important idea of random selection.

Much of this chapter is devoted to the logic and skills of probability sampling. This topic is more rigorous and precise than some of the other topics in this book. Whereas social research as a whole is both art and science, sampling leans toward science. Although this subject is somewhat technical, the basic logic of sampling is not difficult to understand. In fact, the logical neatness of this topic can make it easier to comprehend than, say, conceptualization.

[Figure 7-1: plot of presidential approval (percent, on an axis running from 40 to 100) by date across 2001 and 2002, with the periods before and after the September 11th attack marked. Polls plotted: ABC/Post, Bloomberg, CNN/Time, CBS, Fox, Gallup, Harris, IBD/CSM, Zogby, Ipsos-Reid, NBC/WSJ, Newsweek, Pew, AmResGp.]

FIGURE 7-1 Bush Approval: Raw Poll Data. This graph demonstrates how independent polls produce the same picture of reality.
This also shows the impact of a national crisis on the president's popularity: in this case, the September 11 terrorist attack and then President George W. Bush's popularity. Source: From drlimerick.com (http://www.pollkatz.homestead.com/files/MyHTML2.gif).

Although probability sampling is central to social research today, we'll take some time to examine a variety of nonprobability methods as well. These methods have their own logic and can provide useful samples for social inquiry.

Before we discuss the two major types of sampling, I'll introduce you to some basic ideas by way of a brief history of sampling. As you'll see, the pollsters who correctly predicted the election in 2008 did so in part because researchers had learned to avoid some pitfalls that earlier pollsters had fallen into.

A Brief History of Sampling

Sampling in social research has developed hand in hand with political polling. This is the case, no doubt, because political polling is one of the few opportunities social researchers have to discover the accuracy of their estimates. On election day, they find out how well or how poorly they did.

President Alf Landon

President Alf Landon? Who's he? Did you sleep through an entire presidency in your U.S. history class? No, but Alf Landon would have been president if a famous poll conducted by the Literary Digest had proved to be accurate. The Literary Digest was a popular newsmagazine published between 1890 and 1938. In 1916, Digest editors mailed postcards to people in six states, asking them whom they were planning to vote for in the presidential campaign between Woodrow Wilson and Charles Evans Hughes. Names were selected for the poll from telephone directories and automobile registration lists. Based on the postcards sent back, the Digest correctly predicted that Wilson would be elected. In the elections that followed, the Literary Digest expanded the size of its poll and made correct predictions in 1920, 1924, 1928, and 1932.
In 1936, the Digest conducted its most ambitious poll: Ten million ballots were sent to people listed in telephone directories and on lists of automobile owners. Over 2 million people responded, giving the Republican contender, Alf Landon, a stunning 57 to 43 percent landslide over the incumbent, President Franklin Roosevelt. The editors modestly cautioned,

  We make no claim to infallibility. We did not coin the phrase "uncanny accuracy" which has been so freely applied to our Polls. We know only too well the limitations of every straw vote, however enormous the sample gathered, however scientific the method. It would be a miracle if every State of the forty-eight behaved on Election Day exactly as forecast by the Poll. (Literary Digest 1936a: 6)

Two weeks later, the Digest editors knew the limitations of straw polls even better: The voters gave Roosevelt a second term in office by the largest landslide in history, with 61 percent of the vote. Landon won only 8 electoral votes to Roosevelt's 523.

The editors were puzzled by their unfortunate turn of luck. A part of the problem surely lay in the 22 percent return rate garnered by the poll. The editors asked,

  Why did only one in five voters in Chicago to whom the Digest sent ballots take the trouble to reply? And why was there a preponderance of Republicans in the one-fifth that did reply? . . . We were getting better cooperation in what we have always regarded as a public service from Republicans than we were getting from Democrats. Do Republicans live nearer to mailboxes? Do Democrats generally disapprove of straw polls? (Literary Digest 1936b: 7)

Actually, there was a better explanation: what is technically called the sampling frame used by the Digest. In this case, the sampling frame consisted of telephone subscribers and automobile owners.
In the context of 1936, this design selected a disproportionately wealthy sample of the voting population, especially coming on the tail end of the worst economic depression in the nation's history. The sample effectively excluded poor people, and the poor voted predominantly for Roosevelt's New Deal recovery program. The Digest's poll may or may not have correctly represented the voting intentions of telephone subscribers and automobile owners. Unfortunately for the editors, it decidedly did not represent the voting intentions of the population as a whole.

President Thomas E. Dewey

The 1936 election also saw the emergence of a young pollster whose name would become synonymous with public opinion. In contrast to the Literary Digest, George Gallup correctly predicted that Roosevelt would beat Landon. Gallup's success in 1936 hinged on his use of something called quota sampling, which we'll look at more closely later in the chapter. For now, it's enough to know that quota sampling is based on a knowledge of the characteristics of the population being sampled: what proportion are men, what proportion are women, what proportions are of various incomes, ages, and so on. Quota sampling selects people to match a set of these characteristics: the right number of poor, white, rural men; the right number of rich, African American, urban women; and so on. The quotas are based on those variables most relevant to the study. In the case of Gallup's poll, the sample selection was based on levels of income; the selection procedure ensured the right proportion of respondents at each income level.

Gallup and his American Institute of Public Opinion used quota sampling to good effect in 1936, 1940, and 1944, correctly picking the presidential winner each of those years. Then, in 1948, Gallup and most political pollsters suffered the embarrassment of picking Governor Thomas Dewey of New York over the incumbent, President Harry Truman.
The pollsters' embarrassing miscue continued right up to election night. A famous photograph shows a jubilant Truman (whose followers' battle cry was "Give 'em hell, Harry!") holding aloft a newspaper with the banner headline "Dewey Defeats Truman."

[Photograph: Based on early political polls that showed Dewey leading Truman, the Chicago Tribune sought to scoop the competition with this unfortunate headline.]

Several factors accounted for the pollsters' failure in 1948. First, most pollsters stopped polling in early October despite a steady trend toward Truman during the campaign. In addition, many voters were undecided throughout the campaign, and these went disproportionately for Truman when they stepped into the voting booth.

More important, Gallup's failure rested on the unrepresentativeness of his samples. Quota sampling, which had been effective in earlier years, was Gallup's undoing in 1948. This technique requires that the researcher know something about the total population (of voters in this instance). For national political polls, such information came primarily from census data. By 1948, however, World War II had produced a massive movement from the country to cities, radically changing the character of the U.S. population from what the 1940 census showed, and Gallup relied on 1940 census data. City dwellers, moreover, tended to vote Democratic; hence, the overrepresentation of rural voters in his poll had the effect of underestimating the number of Democratic votes.

Two Types of Sampling Methods

By 1948, some academic researchers had already been experimenting with a form of sampling based on probability theory. This technique involves the selection of a "random sample" from a list containing the names of everyone in the population being sampled. By and large, the probability-sampling methods used in 1948 were far more accurate than quota-sampling techniques.
nonprobability sampling: Any technique in which samples are selected in some way not suggested by probability theory. Examples include reliance on available subjects as well as purposive (judgmental), quota, and snowball sampling.

Today, probability sampling remains the primary method of selecting large, representative samples for social research, including national political polls. At the same time, probability sampling can be impossible or inappropriate in many research situations. Accordingly, before turning to the logic and techniques of probability sampling, we'll first take a look at techniques for nonprobability sampling and how they're used in social research.

Nonprobability Sampling

Social research is often conducted in situations that do not permit the kinds of probability samples used in large-scale social surveys. Suppose you wanted to study homelessness: There is no list of all homeless individuals, nor are you likely to create such a list. Moreover, as you'll see, there are times when probability sampling wouldn't be appropriate even if it were possible. Many such situations call for nonprobability sampling.

In this section, we'll examine four types of nonprobability sampling: reliance on available subjects, purposive (judgmental) sampling, snowball sampling, and quota sampling. We'll conclude with a brief discussion of techniques for obtaining information about social groups through the use of informants.

Reliance on Available Subjects

Relying on available subjects, such as stopping people at a street corner or some other location, is sometimes called "convenience" or "haphazard" sampling. This is a common method for journalists in their "person-on-the-street" interviews, but it is an extremely risky sampling method for social research. Clearly, this method does not permit any control over the representativeness of a sample.
It's justified only if the researcher wants to study the characteristics of people passing the sampling point at specified times or if less-risky sampling methods are not feasible. Even when this method is justified on grounds of feasibility, researchers must exercise great caution in generalizing from their data. Also, they should alert readers to the risks associated with this method.

University researchers frequently conduct surveys among the students enrolled in large lecture classes. The ease and frugality of such a method explain its popularity, but it seldom produces data of any general value. It may be useful for pretesting a questionnaire, but such a sampling method should not be used for a study purportedly describing students as a whole.

Consider this report on the sampling design in an examination of knowledge and opinions about nutrition and cancer among medical students and family physicians:

  The fourth-year medical students of the University of Minnesota Medical School in Minneapolis comprised the student population in this study. The physician population consisted of all physicians attending a "Family Practice Review and Update" course sponsored by the University of Minnesota Department of Continuing Medical Education. (Cooper-Stephenson and Theologides 1981: 472)

After all is said and done, what will the results of this study represent? The data do not provide a meaningful comparison of medical students and family physicians in the United States or even in Minnesota. Who were the physicians who attended the course? We can guess that they were probably more concerned about their continuing education than other physicians were, but we can't say for sure. Although such studies can provide useful insights, we must take care not to overgeneralize from them.
Purposive or Judgmental Sampling

Sometimes it's appropriate to select a sample on the basis of knowledge of a population, its elements, and the purpose of the study. This type of sampling is called purposive or judgmental sampling. In the initial design of a questionnaire, for example, you might wish to select the widest variety of respondents to test the broad applicability of questions. Although the study findings would not represent any meaningful population, the test run might effectively uncover any peculiar defects in your questionnaire. This situation would be considered a pretest, however, rather than a final study.

In some instances, you may wish to study a small subset of a larger population in which many members of the subset are easily identified, but the enumeration of them all would be nearly impossible. For example, you might want to study the leadership of a student protest movement; many of the leaders are easily visible, but it would not be feasible to define and sample all the leaders. In studying all or a sample of the most visible leaders, you may collect data sufficient for your purposes.

Or let's say you want to compare left-wing and right-wing students. Because you may not be able to enumerate and sample from all such students, you might decide to sample the memberships of left- and right-leaning groups, such as the Green Party and the Tea Party. Although such a sample design would not provide a good description of either left-wing or right-wing students as a whole, it might suffice for general comparative purposes.

Field researchers are often particularly interested in studying deviant cases (cases that don't fit into fairly regular patterns of attitudes and behaviors) in order to improve their understanding of the more-regular pattern.
For example, you might gain important insights into the nature of school spirit, as exhibited at a pep rally, by interviewing people who did not appear to be caught up in the emotions of the crowd or by interviewing students who did not attend the rally at all. Selecting deviant cases for study is another example of purposive study.

In qualitative research projects, the sampling of subjects may evolve as the structure of the situation being studied becomes clearer and certain types of subjects seem more central to understanding than others do. Let's say you're conducting an interview study among the members of a radical political group on campus. You may initially focus on friendship networks as a vehicle for the spread of group membership and participation. In the course of your analysis of the earlier interviews, you may find several references to interactions with faculty members in one of the social science departments. As a consequence, you may expand your sample to include faculty in that department and other students that they interact with. This is called "theoretical sampling," because the evolving theoretical understanding of the subject directs the sampling in certain directions.

purposive (judgmental) sampling: A type of nonprobability sampling in which the units to be observed are selected on the basis of the researcher's judgment about which ones will be the most useful or representative.

Snowball Sampling

Another nonprobability sampling technique, which some consider to be a form of accidental sampling, is called snowball sampling. This procedure is appropriate when the members of a special population are difficult to locate, such as homeless individuals, migrant workers, or undocumented immigrants.
In snowball sampling, the researcher collects data on the few members of the target population he or she can locate, then asks those individuals to provide the information needed to locate other members of that population whom they happen to know. "Snowball" refers to the process of accumulation as each located subject suggests other subjects. Because this procedure also results in samples with questionable representativeness, it's used primarily for exploratory purposes. Sometimes, the term chain referral is used in reference to snowball sampling and other, similar techniques in which the sample unfolds and grows from an initial selection.

Suppose you wish to learn a community organization's pattern of recruitment over time. You might begin by interviewing fairly recent recruits, asking them who introduced them to the group. You might then interview the people named, asking them who introduced them to the group. You might then interview those people named, asking, in part, who introduced them. Or, in studying a loosely structured political group, you might ask one of the participants who he or she believes to be the most influential members of the group. You might interview those people and, in the course of the interviews, ask who they believe to be the most influential. In each of these examples, your sample would "snowball" as each of your interviewees suggested other people to interview.

snowball sampling: A nonprobability sampling method, often employed in field research, whereby each person interviewed may be asked to suggest additional people for interviewing.

quota sampling: A type of nonprobability sampling in which units are selected into a sample on the basis of prespecified characteristics, so that the total sample will have the same distribution of characteristics assumed to exist in the population being studied.

Examples of this technique in social science research abound.
Karen Farquharson (2005) provides a detailed discussion of how she used snowball sampling to discover a network of tobacco policy makers in Australia: both those at the core of the network and those on the periphery. Kath Browne (2005) used snowballing through social networks to develop a sample of nonheterosexual women in a small town in the United Kingdom. She reports that her own membership in such networks greatly facilitated this type of sampling, and that potential subjects in the study were more likely to trust her than to trust heterosexual researchers.

In more general, theoretical terms, Chaim Noy argues that the process of selecting a snowball sample reveals important aspects of the populations being sampled, uncovering "the dynamics of natural and organic social networks" (2008: 329). Do the people you interview know others like themselves? Are they willing to identify those people to researchers? Thus, snowball sampling can be more than a simple technique for finding people to study. It can be a revealing part of the inquiry.

Quota Sampling

Quota sampling is the method that helped George Gallup avoid disaster in 1936 and set up the disaster of 1948. Like probability sampling, quota sampling addresses the issue of representativeness, although the two methods approach the issue quite differently.

Quota sampling begins with a matrix, or table, describing the characteristics of the target population. Depending on your research purposes, you may need to know what proportion of the population is male and what proportion female, as well as knowing what proportions of each gender fall into various age categories, educational levels, ethnic groups, and so forth. In establishing a national quota sample, you might need to know what proportion of the national population is urban, eastern, male, under 25, white, working class, and the like, and all the possible combinations of these attributes.
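As a rough illustration of the quota matrix just described, and of the cell weighting that quota sampling relies on, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the two variables (gender and age group), the population proportions, and the function names quota_targets and cell_weights are hypothetical and do not come from any real census or from the studies discussed in this chapter.

```python
from collections import Counter

# Hypothetical quota matrix: each cell is a combination of attributes,
# and each value is that cell's assumed share of the target population.
# These proportions are invented for illustration only.
population_share = {
    ("female", "under 25"): 0.10,
    ("female", "25 and over"): 0.40,
    ("male", "under 25"): 0.12,
    ("male", "25 and over"): 0.38,
}


def quota_targets(shares, sample_size):
    """Return the number of interviews to seek in each cell of the matrix."""
    return {cell: round(share * sample_size) for cell, share in shares.items()}


def cell_weights(shares, respondents):
    """Return one weight per cell: population share / achieved sample share.

    `respondents` is a list with one cell label per person interviewed.
    Cells with no respondents are left out, since they cannot be weighted.
    """
    counts = Counter(respondents)
    n = len(respondents)
    return {
        cell: shares[cell] / (counts[cell] / n)
        for cell in shares
        if counts[cell] > 0
    }


# Usage: plan a 200-person quota sample, then weight a sample that came
# back slightly off target (too many young women, too few older women).
targets = quota_targets(population_share, 200)
achieved = (
    [("female", "under 25")] * 25
    + [("female", "25 and over")] * 75
    + [("male", "under 25")] * 24
    + [("male", "25 and over")] * 76
)
weights = cell_weights(population_share, achieved)
for cell in population_share:
    print(cell, "target:", targets[cell], "weight:", round(weights[cell], 3))
```

The weights only correct the mix of cells. They do nothing about who was picked within a cell, and they are only as good as the assumed population shares, which is why an out-of-date matrix can mislead a quota sample.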
Once you've created such a matrix and assigned a relative proportion to each cell in the matrix, you proceed to collect data from people having all the characteristics of a given cell. You then assign to all the people in a given cell a weight appropriate to their portion of the total population. When all the sample elements are so weighted, the overall data should provide a reasonable representation of the total population.

Although quota sampling resembles probability sampling, it has several inherent problems. First, the quota frame (the proportions that different cells represent) must be accurate, and it's often difficult to get up-to-date information for this purpose. The Gallup failure to predict Truman as the presidential victor in 1948 was due partly to this problem. Second, the selection of sample elements within a given cell may be biased even though its proportion of the population is accurately estimated. Instructed to interview five people who meet a given, complex set of characteristics, an interviewer may still avoid people living at the top of seven-story walk-ups, having particularly run-down homes, or owning vicious dogs.

In recent years, attempts have been made to combine probability- and quota-sampling methods, but the effectiveness of this effort remains to be seen. At present, you would be advised to treat quota sampling warily if your purpose is statistical description.

At the same time, the logic of quota sampling can sometimes be applied usefully to a field research project. In the study of a formal group, for example, you might wish to interview both leaders and nonleaders. In studying a student political organization, you might want to interview radical, moderate, and conservative members of that group. You may be able to achieve sufficient representativeness in such cases by using quota sampling to ensure that you interview both men and women, both younger and older people, and so forth.

J.
Michael Brick (2011), in pondering the future of survey sampling, suggests the possibility of a rebirth for quota sampling. Perhaps it is a workable solution to the problem of representativeness that bedevils falling response rates and online surveys.

Selecting Informants

When field research involves the researcher's attempt to understand some social setting (a juvenile gang or local neighborhood, for example), much of that understanding will come from a collaboration with some members of the group being studied. Whereas social researchers speak of respondents as people who provide information about themselves, allowing the researcher to construct a composite picture of the group those respondents represent, an informant is a member of the group who can talk directly about the group per se.

Especially important to anthropologists, informants are important to other social researchers as well. If you wanted to learn about informal social networks in a local public-housing project, for example, you would do well to locate individuals who could understand what you were looking for and help you find it.

When Jeffrey Johnson (1990) set out to study a salmon-fishing community in North Carolina, he used several criteria to evaluate potential informants. Did their positions allow them to interact regularly with other members of the camp, for example, or were they isolated? (In this case, he found that the carpenter had a wider range of interactions than the boat captain did.) Was their information about the camp pretty much limited to their specific jobs, or did it cover many aspects of the operation? These and other criteria helped determine how useful the potential informants might be.

Usually, you'll want to select informants somewhat typical of the groups you're studying. Otherwise, their observations and opinions may be misleading.
Interviewing only physicians will not give you a well-rounded view of how a community medical clinic is working, for example. Along the same lines, an anthropologist who interviews only men in a society where women are sheltered from outsiders will get a biased view. Similarly, although informants fluent in English are convenient for English-speaking researchers from the United States, they do not typify the members of many societies nor even many subgroups within English-speaking countries.

informant: Someone who is well versed in the social phenomenon that you wish to study and who is willing to tell you what he or she knows about it. Not to be confused with a respondent.

Simply because they're the ones willing to work with outside investigators, informants will almost always be somewhat "marginal" or atypical within their group. Sometimes this is obvious. Other times, however, you'll learn about their marginality only in the course of your research.

In Jeffrey Johnson's study, the county agent identified one fisherman who seemed squarely in the mainstream of the community. Moreover, he was cooperative and helpful to Johnson's research. The more Johnson worked with the fisherman, however, the more he found the man to be a marginal member of the fishing community.

  First, he was a Yankee in a southern town. Second, he had a pension from the Navy [so he was not seen as a "serious fisherman" by others in the community]. . . . Third, he was a major Republican activist in a mostly Democratic village. Finally, he kept his boat in an isolated anchorage, far from the community harbor. (1990: 56)

Informants' marginality may not only bias the view you get, but their marginal status may also limit their access (and hence yours) to the different sectors of the community you wish to study.
These comments should give you some sense of the concerns involved in nonprobability sampling, typically used in qualitative research projects. I conclude with the following injunction:

  Your overall goal is to collect the richest possible data. By rich data, we mean a wide and diverse range of information collected over a relatively prolonged period of time in a persistent and systematic manner. Ideally, such data enable you to grasp the meanings associated with the actions of those you are studying and to understand the contexts in which those actions are embedded. (Lofland et al. 2006: 15)

In other words, nonprobability sampling does have its uses, particularly in qualitative research projects. But researchers must take care to acknowledge the limitations of nonprobability sampling, especially regarding accurate and precise representations of populations. This point will become clearer as we discuss the logic and techniques of probability sampling.

probability sampling: The general term for samples selected in accord with probability theory, typically involving some random-selection mechanism. Specific types of probability sampling include EPSEM, PPS, simple random sampling, and systematic sampling.

The Theory and Logic of Probability Sampling

However appropriate to some research purposes, nonprobability-sampling methods cannot guarantee that the sample we observed is representative of the whole population. When researchers want precise, statistical descriptions of large populations (for example, the percentage of the population that is unemployed, that plans to vote for Candidate X, or that feels a rape victim should have the right to an abortion), they turn to probability sampling. All large-scale surveys use probability-sampling methods.

Although the application of probability sampling involves some sophisticated use of statistics, the basic logic of probability sampling is not difficult to understand.
If all members of a population were identical in all respects (all demographic characteristics, attitudes, experiences, behaviors, and so on), there would be no need for careful sampling procedures. In this extreme case of perfect homogeneity, in fact, any single case would suffice as a sample to study characteristics of the whole population. In fact, of course, the human beings who compose any real population are quite heterogeneous, varying in many ways. Figure 7-2 offers a simplified illustration of a heterogeneous population: The 100 members of this small population differ by gender and race. We'll use this hypothetical micropopulation to illustrate various aspects of probability sampling.

[Figure 7-2: bar chart of the population by gender and race, with bars for White women, African American women, White men, and African American men]
FIGURE 7-2 A Population of 100 Folks. Typically, sampling aims to reflect the characteristics and dynamics of large populations. For the purpose of some simple illustrations, let's assume our total population only has 100 members. © Cengage Learning®

The fundamental idea behind probability sampling is this: To provide useful descriptions of the total population, a sample of individuals from a population must contain essentially the same variations that exist in the population. This isn't as simple as it might seem, however. Let's take a minute to look at some of the ways researchers might go astray. Then, we'll see how probability sampling provides an efficient method for selecting a sample that should adequately reflect variations that exist in the population.

Conscious and Subconscious Sampling Bias

At first glance, it may look as though sampling is pretty straightforward. To select a sample of 100 university students, you might simply interview the first 100 students you find walking around campus.
This kind of sampling method is often used by untrained researchers, but it runs a high risk of introducing biases into the samples. In connection with sampling, bias simply means that those selected are neither typical nor representative of the larger populations they have been chosen from. This kind of bias does not have to be intentional. In fact, it is virtually inevitable when you pick people by the seat of your pants.

Figure 7-3 illustrates what can happen when researchers simply select people who are convenient for study. Although women are only 50 percent of our micropopulation, the people closest to the researcher (in the lower right corner) happen to be 70 percent women, and although the population is 12 percent African American, none was selected into the sample.

[Figure 7-3: the micropopulation drawn as individual figures, with the researcher in the lower right corner selecting the people nearest at hand]
FIGURE 7-3 A Sample of Convenience: Easy, but Not Representative. Simply selecting and observing those people who are most readily at hand is the simplest method, perhaps, but it's unlikely to provide a sample that accurately reflects the total population. © Cengage Learning®

Beyond the risks inherent in simply studying people who are convenient, other problems can arise. To begin with, the researcher's personal leanings may affect the sample to the point where it does not truly represent the student population. Suppose you're a little intimidated by students who look particularly "cool," feeling they might ridicule your research effort. You might consciously or subconsciously avoid interviewing such people. Or, you might feel that the attitudes of "super-straight-looking" students would be irrelevant to your research purposes and so you avoid interviewing them.
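Returning to the convenience sample described above, the size of the mismatch can be put into numbers. The short Python sketch below is only an illustration: the group counts (44, 6, 44, 6) and the sample size of 10 are assumptions chosen to match the percentages quoted in the text, and the helper names `share` and `bias` are hypothetical.

```python
# Assumed micropopulation of 100 people. The counts are an assumption chosen to
# match "50 percent women" and "12 percent African American" in the text.
population = (
    [("woman", "White")] * 44
    + [("woman", "African American")] * 6
    + [("man", "White")] * 44
    + [("man", "African American")] * 6
)

# Hypothetical convenience sample of 10 nearby people: 70 percent women and
# no African Americans, as in the description of the convenience sample.
convenience_sample = [("woman", "White")] * 7 + [("man", "White")] * 3


def share(group, index, value):
    """Proportion of `group` whose attribute at position `index` equals `value`."""
    return sum(1 for member in group if member[index] == value) / len(group)


def bias(sample, pop, index, value):
    """Sample share minus population share for one characteristic."""
    return share(sample, index, value) - share(pop, index, value)


women_gap = bias(convenience_sample, population, 0, "woman")
african_american_gap = bias(convenience_sample, population, 1, "African American")

print(f"Women: {women_gap:+.2f}")
print(f"African American: {african_american_gap:+.2f}")
```

A positive value means the group is overrepresented in the sample, and a negative value means it is underrepresented. Under these assumed counts, women are overrepresented by 20 percentage points and African Americans are underrepresented by 12.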
Even if you sought to interview a "balanced" group of students, you wouldn't know the exact proportions of different types of students making up such a balance, and you wouldn't always be able to identify the different types just by watching them walk by. Even if you made a conscientious effort to interview, say, every tenth student entering the university library, you could not be sure of a representative sample, because different types of students visit the library with different frequencies. Your sample would overrepresent students who visit the library more often than others do.

The possibilities for inadvertent sampling bias are endless and not always obvious. Fortunately, many techniques can help us avoid bias.

Representativeness and Probability of Selection

Although the term representativeness has no precise, scientific meaning, it carries a common-sense meaning that makes it useful here. For our purpose, a sample is representative of the population from which it is selected if the aggregate characteristics of the sample closely approximate those same aggregate characteristics in the population. If, for example, the population contains 50 percent women, then a sample must contain "close to" 50 percent women to be representative. Later, we'll discuss "how close" in detail. Note that samples need not be representative in all respects; representativeness is limited to those characteristics that are relevant to the substantive interests of the study. However, you may not know in advance which characteristics are relevant.

representativeness That quality of a sample of having the same distribution of characteristics as the population from which it was selected. By implication, descriptions and explanations derived from an analysis of the sample may be assumed to represent similar ones in the population. Representativeness is enhanced by probability sampling and provides for generalizability and the use of inferential statistics.

EPSEM (equal probability of selection method) A sample design in which each member of a population has the same chance of being selected into the sample.

A basic principle of probability sampling is that a sample will be representative of the population from which it is selected if all members of the population have an equal chance of being selected in the sample. (We'll see shortly that the size of the sample selected also affects the degree of representativeness.) Samples that have this quality are often labeled EPSEM samples (EPSEM stands for "equal probability of selection method"). Later, we'll discuss variations of this principle, which forms the basis of probability sampling.

Moving beyond this basic principle, we must realize that samples, even carefully selected EPSEM samples, seldom if ever perfectly represent the populations from which they are drawn. Nevertheless, probability sampling offers two special advantages.

First, probability samples, although never perfectly representative, are typically more representative than other types of samples, because the biases previously discussed are avoided. In practice, a probability sample is more likely than a nonprobability sample to be representative of the population from which it is drawn.

Second, and more important, probability theory permits us to estimate the accuracy or representativeness of the sample. Conceivably, an uninformed researcher might, through wholly haphazard means, select a sample that nearly perfectly represents the larger population. The odds are against doing so, however, and we would be unable to estimate the likelihood that he or she has achieved representativeness. The probability sampler, on the other hand, can provide an accurate estimate of success or failure. We'll shortly see exactly how this estimate can be achieved.
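The EPSEM idea described above can be checked empirically. The following Python sketch, offered as an illustration and not as a standard procedure, draws many samples of 10 from a population of 100 without replacement and records how often each member is picked. The function name `inclusion_frequencies`, the number of draws, and the seed are all assumptions made for this example.

```python
import random


def inclusion_frequencies(population_size, sample_size, draws, seed=0):
    """Estimate each member's chance of selection by repeating a
    without-replacement random draw `draws` times."""
    rng = random.Random(seed)           # seeded so the run is repeatable
    counts = [0] * population_size
    members = range(population_size)    # members are labeled 0 .. population_size - 1
    for _ in range(draws):
        for member in rng.sample(members, sample_size):
            counts[member] += 1
    return [count / draws for count in counts]


freqs = inclusion_frequencies(population_size=100, sample_size=10, draws=20_000)

expected = 10 / 100  # under equal-probability selection, each member's chance is n / N
print(f"expected chance of selection: {expected:.2f}")
print(f"lowest observed frequency:  {min(freqs):.3f}")
print(f"highest observed frequency: {max(freqs):.3f}")
```

If the design really gives every member the same chance, the lowest and highest observed frequencies should both sit close to the expected value of 0.10, differing only by chance fluctuation.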
I've said that probability sampling ensures that samples are representative of the population we wish to study. As we'll see in a moment, probability sampling rests on the use of a random-selection procedure. To develop this idea, though, we need to give more-precise meaning to two important terms: element and population.*

*I would like to acknowledge a debt to Leslie Kish and his excellent textbook Survey Sampling. Although I've modified some of the conventions used by Kish, his presentation is easily the most important source of this discussion.

An element is that unit about which information is collected and that provides the basis of analysis. Typically, in survey research, elements are people or certain types of people. However, other kinds of units can constitute the elements for social research: Families, social clubs, or corporations might be the elements of a study. In a given study, elements are often the same as units of analysis, though the former are used in sample selection and the latter in data analysis.

Up to now we've used the term population to mean the group or collection that we're interested in generalizing about. More formally, a population is the theoretically specified aggregation of study elements. Whereas the vague term Americans might be the target for a study, the delineation of the population would include the definition of the element Americans (for example, citizenship, residence) and the time referent for the study (Americans as of when?). Translating the abstract "adult New Yorkers" into a workable population would require a specification of the age defining adult and the boundaries of New York. Specifying the term college student would include a consideration of full- and part-time students, degree candidates and nondegree candidates, undergraduate and graduate students, and so forth.

A study population is that aggregation of elements from which the sample is actually selected.
As a practical matter, researchers are seldom in a position to guarantee that every element meeting the theoretical definitions laid down actually has a chance of being selected in the sample. Even where lists of elements exist for sampling purposes, the lists are usually somewhat incomplete. Some students are always inadvertently omitted from student rosters. Some telephone subscribers request that their names and numbers be unlisted.

Often, researchers decide to limit their study populations more severely than indicated in the preceding examples. National polling firms may limit their national samples to the 48 adjacent states, omitting Alaska and Hawaii for practical reasons. A researcher wishing to sample psychology professors may limit the study population to those in psychology departments, omitting those in other departments. Whenever the population under examination is altered in such fashions, you must make the revisions clear to your readers.

Random Selection

With these definitions in hand, we can define the ultimate purpose of sampling: to select a set of elements from a population in such a way that descriptions of those elements accurately portray the total population from which the elements are selected. Probability sampling enhances the likelihood of accomplishing this aim and also provides methods for estimating the degree of probable success.

Random selection is the key to this process. In random selection, each element has an equal chance of selection independent of any other event in the selection process. Flipping a coin is the most frequently cited example: Provided that the coin is perfect (that is, not biased in terms of coming up heads or tails), the "selection" of a head or a tail is independent of previous selections of heads or tails. No matter how many heads turn up in a row, the chance that the next flip will produce "heads" is exactly 50-50.
Rolling a perfect set of dice is another example.

Such images of random selection, although useful, seldom apply directly to sampling methods in social research. More typically, social researchers use tables of random numbers or computer programs that provide a random selection of sampling units. A sampling unit is that element or set of elements considered for selection in some stage of sampling. A little later, we'll see how computers are used to select random telephone numbers for interviewing, a technique called random-digit dialing.

element That unit of which a population is composed and which is selected in a sample. Distinguished from units of analysis, which are used in data analysis.

population The theoretically specified aggregation of the elements in a study.

study population That aggregation of elements from which a sample is actually selected.

random selection A sampling method in which each element has an equal chance of selection independent of any other event in the selection process.

sampling unit That element or set of elements considered for selection in some stage of sampling.

The reasons for using random-selection methods are twofold. First, this procedure serves as a check on conscious or unconscious bias on the part of the researcher. The researcher who selects cases on an intuitive basis might very well select cases that would support his or her research expectations or hypotheses. Random selection erases this danger. More importantly, random selection offers access to the body of probability theory, which provides the basis for estimating the characteristics of the population as well as estimating the accuracy of samples. Let's now examine probability theory in greater detail.
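As a rough illustration of how a computer program can provide a random selection of sampling units, here is a minimal Python sketch. It assumes a complete list of units is available; the roster of 500 placeholder names, the sample size of 25, the seed, and the function name `select_random_sample` are all hypothetical and are not drawn from any study discussed here.

```python
import random


def select_random_sample(sampling_units, sample_size, seed=None):
    """Select `sample_size` distinct units so that every unit on the list has
    the same chance of selection, whatever its position on the list."""
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return rng.sample(sampling_units, sample_size)


# Hypothetical list of sampling units; the names are placeholders.
roster = [f"student_{i:03d}" for i in range(1, 501)]

sample = select_random_sample(roster, sample_size=25, seed=42)
print(sample[:5])  # first few selected units
```

Because the program, not the researcher, decides who is chosen, the researcher's expectations cannot steer the selection. Recording the seed also lets someone else reproduce the same sample from the same list.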
Probability Theory, Sampling Distributions, and Estimates of Sampling Error

Probability theory is a branch of mathematics that provides the tools researchers need to devise sampling techniques that produce representative samples and to analyze the results of their sampling statistically. More formally, probability theory provides the basis for estimating the parameters of a population. A parameter is the summary description of a given variable in a population. The mean income of all families in a city is a parameter; so is the age distribution of the city's population. When researchers generalize from a sample, they're using sample observations to estimate population parameters. Probability theory enables them both to make these estimates and to judge how likely the estimates are to represent the actual parameters in the population accurately. For example, probability theory allows pollsters to infer from a sample of 2,000 voters how a population of 100 million voters is likely to vote, and to specify exactly what the probable margin of error of the estimates is.

parameter The summary description of a given variable in a population.

Probability theory accomplishes these seemingly magical feats by way of the concept of sampling distributions. A single sample selected from a population will give an estimate of the population parameter. Other samples would give the same or slightly different estimates. Probability theory tells us about the distribution of estimates that would be produced by a large number of such samples. To see how this works, we'll look at two examples of sampling distributions, beginning with a simple example in which our population consists of just ten cases, then moving on to a case of percentages that allows a clear illustration of probable margin of error.

The Sampling Distribution of Ten Cases

Suppose there are ten people in a group, and each has a certain amount of money in his or her pocket.
To simplify, let's assume that one person has no money, another has one dollar, another has two dollars, and so forth up to the person with nine dollars. Figure 7-4 presents the population of ten people.*

*I want to thank Hanan Selvin for suggesting this method of introducing probability sampling.

[Figure 7-4: drawing of the ten people and the money each holds]
FIGURE 7-4 A Population of 10 People with $0-$9. Let's simplify matters even more now by imagining a population of only 10 people with differing amounts of money in their pockets, ranging from $0 to $9. © Cengage Learning®

Our task is to determine the average amount of money one person has: specifically, the mean number of dollars. If you simply add up the money shown in Figure 7-4, you'll find that the total is $45, so the mean is $4.50. Our purpose in the rest of this exercise is to estimate that mean without actually observing all ten individuals. We'll do that by selecting random samples from the population and using the means of those samples to estimate the mean of the whole population.

To start, suppose we were to select, at random, a sample of only one person from the ten. Our ten possible samples thus consist of the ten cases shown in Figure 7-4. The ten dots shown on the graph in Figure 7-5 represent these ten samples. Because we're taking samples of only one, they also represent the "means" we would get as estimates of the population. The distribution of the dots on the graph is called the sampling distribution. Obviously, it wouldn't be a very good idea to select a sample of only one, because the chances are great that we'll miss the true mean of $4.50 by quite a bit.

Now suppose we take a sample of two. As shown in Figure 7-6, increasing the sample size improves our estimations. There are now 45 possible samples: [$0 $1], [$0 $2], . . . [$7 $8], [$8 $9]. Moreover, some of those samples produce the same means. For example, [$0 $6], [$1 $5], and [$2 $4] all produce means of $3. In Figure 7-6, the three dots shown above the $3 mean represent those three samples.

Moreover, the 45 samples are not evenly distributed, as they were when the sample size was only one. Rather, they're somewhat clustered around the true value of $4.50. Only two possible samples deviate by as much as $4 from the true value ([$0 $1] and [$8 $9]), whereas five of the samples would give the true estimate of $4.50; another eight samples miss the mark by only 50 cents (plus or minus).

Now suppose we select even larger samples. What do you think that will do to our estimates of the mean? Figure 7-7 presents the sampling distributions of samples of 3, 4, 5, and 6.

The progression of sampling distributions is clear. Every increase in sample size improves the distribution of estimates of the mean. The

[Figure 7-5: dot plot; horizontal axis "Estimate of mean (Sample size = 1)" running from $0 to $9; true mean = $4.50 marked]
FIGURE 7-5 The Sampling Distribution of Samples of 1. In this simple example, the mean amount of money these people have is $4.50 ($45/10). If we picked 10 different samples of 1 person each, our "estimates" of the mean would range all across the board. © Cengage Learning®

[Figure 7-6: dot plot; horizontal axis "Estimate of mean (Sample size = 2)" running from $0 to $9; true mean = $4.50 marked]
FIGURE 7-6 The Sampling Distribution of Samples of 2. By merely increasing our sample size to 2, we get possible samples that provide somewhat better estimates of the mean. We couldn't get either $0 or $9, and the estimates are beginning to cluster around the true value of the mean: $4.50. © Cengage Learning®

[Figure 7-7, panels "a. Samples of 3" and "b. Samples of 4": dot plots; horizontal axis "Estimate of mean (Sample size = 3)" for panel a; true mean = $4.50 marked]