simplified. Evidence-based images emerge from the simplification of truth tables in the form of configurations of conditions that differentiate subsets of cases.

In many ways the comparative approach lies halfway between the qualitative approach and the quantitative approach. The qualitative approach seeks in-depth knowledge of a relatively small number of cases. When the focus is on commonalities, it often narrows its scope to smaller sets of cases as it seeks to clarify their similarities. The comparative approach usually addresses more cases because of its emphasis on diversity, and it is applied to sets of cases that are clearly bounded in time and space. As Chapter 6 shows, the quantitative study of covariation seeks broad familiarity with a large number of cases and most often views them as generic, interchangeable observations.

Using Quantitative Methods to Study Covariation

Introduction

The starting point of quantitative analysis is the idea that the best route to understanding basic patterns and relationships is to examine phenomena across many cases. Focusing on any single case or on a small number of cases might give a very distorted picture. Looking across many cases makes it possible to average out the peculiarities of individual cases and to construct a picture of social life that is purified of phenomena that are specific to any case or to a small group of cases. Only the general pattern remains.

Quantitative researchers construct images by showing the covariation between two or more features or attributes (variables) across many cases. Suppose a researcher were to demonstrate in a study of the top 500 corporations that those offering better retirement benefits tend to pay lower wages. The image that emerges is that corporations make trade-offs between retirement benefits and pay, with some corporations investing in long-term commitments to workers (retirement benefits) and some emphasizing short-term payoffs (wages and salaries).
Evidence-based images such as these are general because they describe patterns across many cases, and they are parsimonious: only a few attributes or variables are involved (pay and retirement benefits).

Images that are constructed from broad patterns of covariation are considered general because they condense evidence on many cases. The greater the number of cases, the more general the pattern. A quantitative researcher might construct a general image of political radicalism that links degree of radicalism to some other individual-level attribute, such as degree of insulation from popular culture, and use survey data on thousands of people (including people who are politically inert) to document the connection. Qualitative researchers studying this same question would go about the task very differently. The images they construct are detailed and specific, and they use methods that enhance rather than condense evidence. Using a qualitative approach, a researcher might construct an image of how political radicals nurture their radical commitments by studying the daily lives of twenty radicals in depth.

These two images of radicalism, one by a qualitative researcher and one by a quantitative researcher, might or might not contradict. Even if they did not contradict each other, the two images still would be very different in degree of detail and complexity. Quantitative researchers sacrifice in-depth knowledge of each individual case in order to achieve an understanding of broad patterns of covariation across many cases.

Quantitative researchers often use the term correlation to describe a pattern of covariation between two measurable variables. In the previous example, degree of radicalism and degree of insulation from popular culture are correlated such that more radical people tend to be more insulated.
They also sometimes describe a correlation between two variables as a relationship, which should not be confused with the more conventional use of the term relationship to describe social bonds (for example, two lovers have a relationship). Again using the previous example, there is a relationship between degree of radicalism and degree of insulation.

Usually, attributes of cases that can be linked in this way are understood as variables because they are phenomena that vary by level or degree. There are cases with high values of a variable (for example, more than eighteen years of education on the variable "educational attainment"), cases with moderate values (say, twelve years of education), and cases with low values (only a few years of education). Some variables (called independent or causal variables) may be defined as causes, and others (called dependent or outcome variables) may be defined as effects in a given analysis. The dependent variable is the phenomenon the investigator wishes to explain; independent variables are the factors that are used to account for the variation in the dependent variable. A dependent variable in one analysis (for example, Gross National Product per capita in a study that seeks to explain why some countries are poor and others rich) may appear as an independent variable in the next (for example, as a causal variable that explains why people in some countries have a higher life expectancy than people in other countries).

The Goals of Quantitative Research

Because the quantitative approach favors general features across many cases, it is especially well suited for several of the basic goals of social research. These include the goals of identifying general patterns and relationships, testing theories, and making predictions.
These three goals all dictate examination of many cases (the more, the better) and favor a dialogue of ideas and evidence that centers on how attributes of cases (variables) are linked to each other.

Identifying General Patterns and Relationships

One of the primary goals of social research is to identify general relationships. For a relationship to be general, it must be observed across many cases. In quantitative research this is understood not as observing the same exact phenomenon in each and every case, but as observing an association between two or more phenomena across many cases. When a social researcher claims that poorer countries tend to have higher rates of homicide, he or she in essence is stating that there is a general correspondence between a country's wealth and its rate of homicide such that richer countries tend to have lower homicide rates and poorer countries tend to have higher rates. (The United States is a striking exception to this general relationship.)

Identifying general patterns and relationships is important because they offer important clues about causation. It is obviously not true that if two variables are related across many cases, then one necessarily causes the other. If we found that shoe size and income were related, we would not argue that big feet cause high incomes. However, when variables are systematically related, it is important to consider the possibility that one may cause the other. Alternatively, the two correlated variables both may be the effects of some third, unidentified variable.

An example: In the United States over most of the twentieth century, the more industrial states have tended to offer stronger support for liberal Democratic candidates. This general pattern connects an independent variable, percentage of the state's adult population employed in industry, to a dependent variable, percentage of a state's electorate voting for liberal Democratic candidates.
A causal relationship can be inferred from the correlation between these two variables: Conditions associated with having a lot of industry (such as urbanization, unionization, and so on) generate a preference for the liberal candidates among the people affected by these conditions. The explanation of liberal voting based on this evidence thus may emphasize the impact of industrial conditions on people's interests and the translation of these interests to a preference for liberal candidates. The causal images behind correlations are central to the representations of social life that quantitative researchers construct.

Generally, quantitative social researchers identify causation with explanation. Once the causes of a phenomenon have been identified, it has been explained. The usual sequence is:

1. a pattern of covariation is identified and the strength of the correlation is assessed,
2. causation may be inferred from the correlation, and, if so,
3. an explanation is built up from the inferred causal relationship.

Another way of understanding this is simply to say that quantitative social researchers construct images by examining patterns of covariation among variables and inferring causation from these broad patterns.

While quantitative researchers often construct explanations and images from the broad patterns that they observe (like the rough correlation between income levels and educational levels) and relate these evidence-based images to their ideas about social life, they also test ideas drawn directly from social theories. Recall from Part I of this book that all social researchers are involved in long-standing, abstract conversations about social life. Social researchers use this body of thought whenever they construct images, but they also seek to advance this body of thought and to construct formal tests of ideas drawn from it.
Testing an idea is different from using an idea to help make sense of some pattern in a set of data or body of evidence that already has been collected. When an idea is tested, it is first used to construct an image that is based on the ideas themselves, not the evidence. The researcher constructs a theoretical image. Researchers use these theoretically based images to derive testable propositions (also called hypotheses) about evidence that has not yet been examined. Once examined, the evidence either supports or refutes the proposition (see Chapter 1).

This formal assessment of hypotheses helps social scientists determine which ideas are most useful for understanding social life. An idea that consistently fails to win support in these formal tests will eventually be dropped from the pool of ideas that social scientists use. Ideas that consistently receive support are retained.

One theoretical image in the study of social inequality is the idea that advanced societies are achievement oriented, meaning they reward performance, while less advanced societies are ascription oriented, meaning they reward people for who they are (for example, their family's social status). Thus, in an achievement-oriented society, a person of great ability from a low-status, impoverished background should nevertheless be successful. By contrast, in an ascription-oriented society, people born into high-status families will be successful, regardless of their talents.

These are theoretical images. There is no society that is totally achievement oriented, nor is there any society that is totally ascription oriented. However, these theoretical images have implications for inequality in the United States, which is generally considered to be an advanced society (despite its absurdly high homicide rate). Has the United States become more achievement oriented over the last forty years?
Is it easier today for a talented person from a low-status, impoverished background to succeed than it was in the 1950s? The theoretical images just described link the ascendance of the achievement orientation to societal advancement, suggesting that over the last forty years it should have become easier in the United States for a talented person from a low-status background to get ahead.

Thus, the testable proposition is that evidence on "social mobility" (the study of who gets ahead) should support the idea that achievement has become more important and ascription less important in U.S. society. The increased importance of achievement criteria might be discernible in the strength of the relationship between educational achievement and subsequent income. Is the correlation between these two variables stronger in 1994 than it was in 1954? The decreased importance of ascription might be visible in the strength of the relationship between race and income. Is being black less of a liability in 1994 than it was in 1954? Of course, it would be possible to examine the effects of a variety of achievement and ascription variables on income over the last forty years (and at various points within this span of time) because there have been many surveys conducted over this period with data relevant to the proposition.

The quantitative approach is very useful for testing theoretical ideas and images such as these. Notice that these ideas are general (they are relevant to many cases) and parsimonious (they concern the operation of only a few causal variables). When theoretical ideas are relevant to many cases, like ideas about ascription versus achievement, we have more confidence in a test when it includes a very large number and a wide range of cases.

Making Predictions

Another goal of social research that mandates examination of large numbers of cases is making predictions.
In order to be able to make predictions it is important to have as many cases as possible and to have a variety of cases. When predictions are based on many cases, researchers have the largest possible data base at their disposal and are capable of making the most accurate predictions.

For example, to predict whether middle-aged, middle class, white, Southern males will favor the Republican candidate in the next presidential election, it is necessary to know how people with this combination of characteristics generally vote in presidential elections. Do they always favor Republican candidates? Do they vote differently when the Democratic candidate is a Southerner? When issues related to national defense are important, are they more enthusiastic in their support for the Republican candidate? Clearly, the greater the volume of evidence on the political behavior of males in this category, the more precise the prediction for a future election.

Having a lot of evidence makes it easier to forecast future behavior. Knowledge of general patterns also helps. Suppose a researcher wants to predict the political behavior of middle-aged, middle class, Southern white males in an election that pits a Democratic candidate from the South against a Republican candidate who favors greater military spending. Suppose further that this particular combination of candidate characteristics has never occurred before. How can social scientists extrapolate when one condition (Democratic candidate from the South) decreases this group's support for the Republican candidate, while the other (a pro-military posture) increases its support?

Accumulated knowledge of general patterns helps in these situations.
If research shows that, in general, the personal characteristics of a candidate (for example, being a Southerner) matter more to voters than the positions a candidate takes (for example, being pro-military), then the prediction would be that the Southern factor should outweigh the military factor.

Knowledge of general patterns helps social researchers sharpen their predictions by providing important clues about how to weight factors accurately, even in the face of many unknowns and great uncertainty. Because it is well suited for the production and accumulation of knowledge about general patterns, the variable-based approach offers a solid basis for making such predictions.

Contrasts with Qualitative and Comparative Research

When social researchers construct images from evidence, they may use any number of cases. Qualitative researchers typically use a small number of cases (from one to several handfuls); comparative researchers use a moderate number; and quantitative researchers use many (sometimes thousands). The images that qualitative researchers construct are detailed and in depth; the images that quantitative researchers construct are based on general patterns of variation across many, many cases. These general images link variation in one attribute of cases to variation in other attributes. The patterns of covariation between two or more such variables across many cases provide the basic raw material for the images that quantitative researchers construct.

The quantitative strategy favors generality. A quantitative researcher might show that there is a link between variation in income levels and variation in educational levels in a large sample of U.S. adults. This pattern of covariation evokes a general image of how people in the United States get ahead.
If income levels covary more closely with educational levels than they do with other individual-level attributes (such as age, race, marital status, and so on), then it appears that success in the educational system is the key to subsequent material well-being. This image of how income differences arise in U.S. society is very different from one that links differences in income levels to differences in other attributes such as skin color. A key question in the application of the quantitative approach is the strength of the correlation of different causal variables, like educational level and skin color, to dependent variables, like income.

The quantitative approach prizes not only generality, but also parsimony: using as few variables as possible to explain as much as possible. In a study of income levels, for example, the main concern of the quantitative researcher would be to identify the individual-level attributes with the strongest correlation with income levels. Is it educational levels? Is it age? Is it parents' income? Is it skin color? Which variables have the strongest links with differences in income? By identifying the variables with the strongest correlations, quantitative researchers pinpoint key causal factors and use these to construct parsimonious images.

Parsimony and generality go together in quantitative research. Images that are general also tend to be parsimonious. It is clear that parsimony is not a key concern of the qualitative approach. Qualitative researchers believe that in order to represent subjects properly, they must be studied in depth, to uncover nuances and subtleties. Comparative researchers lie halfway in between on the issues of parsimony and generality. Rather than focus on patterns that are general across as many cases as possible (the primary concern of the quantitative approach), comparative researchers focus on diversity, on configurations of similarities and differences within a specific set of cases.
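The idea of comparing how strongly several individual-level attributes correlate with income can be illustrated with a minimal sketch. The figures below are invented purely for illustration (eight hypothetical adults), the variable names are assumptions, and the Pearson coefficient is used here as one common way to express the strength of a correlation; the text itself does not prescribe a particular statistic.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data on eight adults (invented, not from any survey).
income = [18, 22, 30, 35, 41, 50, 58, 70]  # thousands of dollars

# Candidate independent variables, one value per person, in the same order.
candidates = {
    "years_of_education": [9, 10, 12, 12, 14, 16, 16, 18],
    "age": [25, 52, 31, 60, 38, 45, 29, 50],
    "parents_income": [20, 15, 28, 40, 30, 45, 35, 60],
}

def rank_by_correlation(outcome, variables):
    """Order variables by the absolute strength of their correlation with the outcome."""
    scored = [(name, pearson_r(values, outcome)) for name, values in variables.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)

ranking = rank_by_correlation(income, candidates)
for name, r in ranking:
    print(f"{name}: r = {r:.2f}")
```

In this made-up sample, the attribute at the top of the ranking would be the one a parsimony-minded researcher treats as the key factor. A strong coefficient by itself still says nothing about whether the link is causal.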
This difference between quantitative and comparative research is subtle but important. A parsimonious image that links attributes across many cases assumes that all cases are more or less the same in how they came to be the way they are. The person with low education and low income is, in this view, the reverse image of the person with high education and high income. They are two sides of a single coin.

The comparative approach, by contrast, focuses on diversity: how different causes combine in complex and sometimes contradictory ways to produce different outcomes. Thus, instead of focusing on attributes that covary with differences in income levels, like educational levels, the comparative researcher might focus on the diverse ways people achieve material success, with and without education, and contrast these with the diverse ways they fail to achieve success. From a comparative perspective, it is not a question of which attributes covary most closely with income levels, but of the different paths to achieving material success.

Of course, the comparative approach is best suited for the study of a moderate number of cases, not for the study of income differences across thousands of cases. Like the qualitative approach, the comparative approach values knowledge of individual cases. The important point in this contrast between the quantitative approach and the comparative approach is the difference between looking for variables that seem to be systematically linked to each other across many cases (a central concern of the quantitative approach) and examining patterns of diversity (a major objective of the comparative approach).

The Process of Quantitative Research

The quantitative approach is the most structured of the three research strategies examined in this book. Its structured nature follows in part from the fact that it is well suited for testing theories.
Whenever researchers test theories, they must exercise a great deal of caution in how they conduct their tests so that they do not rig their results in advance. Human beings are reactive creatures. There is a large body of research showing that when people are interviewed, their responses are shaped in part by the personal characteristics of the interviewer (such as whether the interviewer is male or female). If they know what a social scientist is trying to prove, they may try to undermine the study, or they may become overcompliant. Tests in any scientific field that are not conducted carefully cannot be trusted.

The more structured nature of quantitative research also follows from its emphasis on variables. Variables are the building blocks of the images that quantitative researchers construct. But before researchers have variables that they can connect through correlations, they must be able to specify their cases as members of a meaningful set, and they must be able to specify the aspects of their cases that are relevant to examine as variables. In short, much about the research tends to be fixed at the outset of the quantitative investigation.

This orientation contrasts sharply with those of the other two strategies. In qualitative research, investigators often do not decide what their case is a "case of" until they write up their results for publication (see Chapter 4). In the comparative approach, researchers assume that their cases are very diverse in how they came to be the way they are, and investigators often conclude their research by differentiating distinct types of cases (see Chapter 5). Of course, quantitative researchers are quite capable of differentiating types of cases, but their primary focus is on relating variables across all the cases they have data on.
Cases and variables can be fixed at the outset of a study, as they tend to be in quantitative research, only if the study is well grounded in an analytic frame. Thus, analytic frames play a very important part in quantitative research. Researchers use analytic frames to articulate theoretical ideas about social life (see Chapter 3). Frames specify the cases relevant to a theory and delineate their major features. The importance of frames to quantitative research can be seen most clearly in research that seeks to test theories. Once a theory has been translated into an analytic frame, specific propositions (or testable hypotheses) about how variables are thought to be related to each other can be stated. Researchers can then develop measures of the relevant variables, collect data, and use correlational techniques to assess the links among relevant variables. Relationships among variables either refute or support theoretically based images.

A theory of job satisfaction may emphasize the match between a person's skills and talents, on the one hand, and the nature of the tasks he or she is required to perform, on the other. The basic theoretical idea is that people are happiest in their work when their job requires them to do things they are good at. Work that does not suit an employee makes the employee feel frustrated and dissatisfied, even useless. These theoretical ideas can be expressed in a frame that details employee and job characteristics relevant to job satisfaction.

To test the idea that job satisfaction is greatest when skills and duties are well matched, it would be necessary to elaborate this frame in advance of data collection. Of course, researchers should not remain ignorant of their research subjects before testing a theory. They should learn all that they can.
The point is simply that the data used to test a theory is not the same as the evidence the researcher uses in developing or refining the hypothesis to be tested. To do this would be to rig the results of the test in a way that would confirm the researcher's ideas.

The frame becomes more or less fixed once theory testing is initiated. The job satisfaction frame is fixed on employees as cases, job satisfaction as the dependent variable, and the match between employee and job characteristics as independent variables. When a frame is fixed, the images that can be constructed from evidence are constrained. When the goal is to test theory, the images that can be constructed are further constrained by the hypothesis. In the job satisfaction example, if the researcher finds that the employees who are well matched in terms of skills and duties are not the ones with the highest levels of job satisfaction, then the image constructed from the evidence rejects the theoretically based frame.

Even when quantitative researchers are not testing theories, the images that they can construct from evidence are still constrained by their frames. In order to examine relationships among variables, it is necessary first to define relevant cases and variables. The examination of relationships among variables usually cannot begin until after all the evidence has been collected. Furthermore, the evidence that is collected must be in a form appropriate for quantitative analysis. There must be many cases, all more or less comparable to each other, and they must have data on all, or at least most, of the relevant variables. Thus, quantitative research implements frames directly, as guides to data collection, telling researchers which variables to measure.

From Analytic Frame to Data Matrix

In quantitative research the collection of evidence is seen as a process of filling in the data table (or data matrix) defined by the analytic frame.
(An example of a small data matrix is presented in Table 6.1.) In the study of job satisfaction, the data on a single employee would fill one row of the data matrix, and there would be as many rows as employees. The columns of the data matrix would be the different employee and job characteristics relevant to the analysis. Thus, in quantitative research the data matrix mirrors the analytic frame.

The researcher would not fill in this matrix with data on just anyone. In a study of job satisfaction, for example, the researcher would probably want to collect data on all the employees of a particular factory or firm. (Of course, if the firm or factory were very large, the researcher would probably collect a systematic, random sample of its employees.) In order to construct a good test of the theory, the researcher would choose a work setting with many different kinds of jobs and with employees possessing many different kinds of skills. This combination would provide a good setting for testing the idea that matching skills with duties is important for job satisfaction. If the researcher chose a work setting where everyone did more or less the same thing and had more or less the same skills, then it would not be an appropriate setting for testing the idea that matching skills with duties matters.

Thus, quantitative researchers exercise considerable care when selecting the cases to be used for testing a particular theory. The cases must be relevant to the theory, and they must vary in ways that allow the theory to be tested. When a theory is relevant to very large numbers (for example, all adults in the United States), the quantitative researcher uses a random sample of such cases (for example, every 10,000th person listed in the census). When it is not possible to use a national sample, the researcher may sample the people in a single city or region that is representative of the population as a whole.
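The layout of a data matrix (one row per case, one column per variable) and the idea of taking every k-th case from a list can be sketched in a few lines. Everything here is an assumption made for illustration: the column names, the invented workforce of 1,000 employees, and the sampling interval of 50 are not from the text, and the values are random placeholders rather than real measurements.

```python
import random

# Columns of the data matrix, as a hypothetical analytic frame might define them.
COLUMNS = ["employee_id", "skill_level", "job_demand", "satisfaction"]

def build_matrix(records):
    """Return one row per case, keeping only the columns named in the frame."""
    return [[record[column] for column in COLUMNS] for record in records]

def systematic_sample(cases, step, seed=0):
    """Take every `step`-th case, starting from a randomly chosen offset."""
    start = random.Random(seed).randrange(step)
    return cases[start::step]

# Invented workforce; the scores are placeholders with no empirical meaning.
rng = random.Random(42)
workforce = [
    {
        "employee_id": i,
        "skill_level": rng.randint(1, 5),
        "job_demand": rng.randint(1, 5),
        "satisfaction": rng.randint(1, 10),
    }
    for i in range(1000)
]

sample = systematic_sample(workforce, step=50)
matrix = build_matrix(sample)
print(len(matrix), "rows x", len(matrix[0]), "columns")
```

Each row of `matrix` corresponds to one sampled employee and each column to one characteristic named in the frame, which is the sense in which the data matrix mirrors the analytic frame.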
Of course, not all social theories are about variation among individuals. Sometimes they are about other basic units: firms, families, factories, organizations, gangs, neighborhoods, cities, households, bureaucracies, even whole countries. In most quantitative research, cases are common, generic units like these. This preference for generic units follows from its emphasis on constructing broad, parsimonious images that reflect general patterns.

Measuring Variables

Quantitative researchers also exercise great care in developing measures of their variables. In the study of job satisfaction, the measurement of the dependent variable is critically important to the study as a whole. How should it be measured? Is it enough simply to ask employees to rate their degree of satisfaction with their jobs? Can employees be trusted to give honest and accurate assessments or will they worry that management is looking over their shoulders? Should the researcher also examine personnel files? Is this legal? Is it ethical? What about records on absenteeism? Is absenteeism a good measure of job dissatisfaction? What about asking supervisors to give their ratings of the people who work under them?

Not surprisingly, there is an immense literature on the problems of measuring job satisfaction, and comparably large literatures exist on the measurement of most of the many variables that interest social scientists. Even variables that seem straightforward are difficult to measure with precision, and controversies abound. What does years of education measure? Knowledge? Job-relevant skills? Time spent in classrooms?

For example, it is clear that nations differ in wealth. Gross National Product in U.S. dollars per capita (GNP per capita) is a conventional measure of national wealth. However, GNP per capita has important liabilities. Some are technical. In order to get all countries on the same yardstick, their currencies must be converted to U.S. dollars.
But the relevant exchange rates for making these conversions fluctuate daily. Thus, the rankings of countries on GNP per capita fluctuate daily. But wealth differences between countries are thought to be relatively long standing; differences induced by short-term exchange rate fluctuations are artificial.

A more serious problem: Some countries have a high GNP per capita but do not seem wealthy because most of their citizens do not live well. In the mid-1970s, for example, the GNP per capita of many oil-exporting countries skyrocketed, but living conditions in these countries were not as good as those of some poorer, non-oil-exporting countries. Thus, it is possible, at least in the short run of a decade or so, to have a high GNP per capita and relatively poor living conditions, which contradicts the idea of GNP per capita as a measure of national wealth.

A still more serious problem: Some countries have great income inequality, with a substantial class of very rich people, many poor people, and few in between. These countries may appear to be much better off than they are because on the average, which is what GNP per capita captures, conditions seem OK. But the reality may be one of widespread suffering in the face of extreme riches.

The issue of using appropriate measures is known as the problem of validity (see also Chapter 1). Do data collection and measurement procedures work the way social researchers claim? One way to assess validity is to check the correlations among alternative measures that, according to the ideas that motivate the study, should covary. For example, a researcher may believe that years of education is a valid measure of general knowledge and could assess this by administering a test of general knowledge to a large group of people representative of the population to be surveyed.
If their scores on this test correlate strongly with their years of education, then the researcher would be justified in treating years of education in the survey of the larger population as a measure of general knowledge.

Researchers are also concerned about the reliability of their measures. Reliability generally concerns how much randomness there is in a particular measure (quantitative researchers refer to this as random error). For example, day-to-day exchange rate fluctuations produce randomness in the GNP per capita in U.S. dollars. The calculation of GNP per capita in U.S. dollars changes every time exchange rates change. Thus, GNP per capita calculated one day will not correlate perfectly with GNP per capita calculated the next, even though the estimates of the goods and services produced by each country are unchanged.

Consider an example closer to home: When employees are asked how satisfied they are with their jobs, their answers may reflect what happened that day or over the last few days. Ask them again in a month, and their answers may reflect what's happening then. Thus, when the measurements of job satisfaction taken one month apart are correlated, asking the same people the same question, the relationship may be weak because of the randomness induced by different surrounding events.

Researchers have developed a variety of ways to counteract unreliability. In research on job satisfaction, they might ask many questions that get at many different aspects of job satisfaction and use these together to develop a broad measure (for example, by adding the responses to form a total score for each person). More than likely, employees' responses to many of the questions will not change over one month. Thus, by adding together the responses to many related questions on job satisfaction, the researcher might develop a measure that is more reliable.
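The reliability argument above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a procedure taken from the text: each hypothetical employee has a stable "true" satisfaction level, and every answer to every question adds independent random error to it. The function names, the number of employees, the number of questions, and the size of the error are all invented for illustration.

```python
import random


def pearson_r(xs, ys):
    """Pearson's r computed as the average product of Z scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sd_x = (sum((x - mean_x) ** 2 for x in xs) / n) ** 0.5
    sd_y = (sum((y - mean_y) ** 2 for y in ys) / n) ** 0.5
    return sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
               for x, y in zip(xs, ys)) / n


def simulate_retest(n_people=2000, n_items=10, seed=0):
    """Compare month-to-month correlations for one question versus a total score.

    Assumption (hypothetical model): answer = stable satisfaction + random error.
    """
    rng = random.Random(seed)
    true_level = [rng.gauss(0, 1) for _ in range(n_people)]

    def run_survey():
        # One list of item answers per person; fresh random error each time.
        return [[t + rng.gauss(0, 1) for _ in range(n_items)] for t in true_level]

    month1, month2 = run_survey(), run_survey()

    # Same single question asked one month apart.
    single = pearson_r([p[0] for p in month1], [p[0] for p in month2])
    # Total score (sum of all questions) one month apart.
    total = pearson_r([sum(p) for p in month1], [sum(p) for p in month2])
    return single, total


single_r, total_r = simulate_retest()
print(f"single question, one month apart: r = {single_r:.2f}")
print(f"total score, one month apart:     r = {total_r:.2f}")
```

Under these assumptions the total score correlates with itself across the two months much more strongly than any single question does, because the random error in individual answers tends to cancel when the answers are added.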
Measurement is one of the most difficult and most important tasks facing the quantitative researcher because so much depends on accurate measurement. If a correlation is weak, say between job satisfaction and a measure of the match between employees' skills and duties, is it because the theory is wrong or because the measures are bad? Is the measure of job satisfaction accurate? Is the measure of skills adequate? Is the measure of the match of employees' skills and duties properly conceived and executed? In the quantitative approach, there is no way to know for sure why a correlation that is expected to be strong comes out weak. Because researchers usually hold fast to their theories, they often blame their measures and complain about the difficulty of measuring social phenomena with precision.

Examining Correlations and Testing Theories

The examination of correlations among variables is the core of the quantitative approach, but quantitative researchers must travel a great distance before they can compute a single correlation. They must translate their theoretical ideas into analytic frames. They must choose appropriate cases. If there are many, many such cases, they must devise a sampling strategy. They must develop valid, reliable measures of all their variables. If the goal of the investigation is to test theory, they must also articulate the proposition to be tested and take great care in measuring the variables central to the proposition. And they must fill in the data matrix defined by their analytic frames, the cases they have selected, and the measures they have devised.

After all this preparation, the computation of correlations may seem anticlimactic. In qualitative research, the investigator engages ideas in every stage of the research, refining and clarifying categories and concepts as new evidence is gathered (see Chapter 4).
In comparative research, a similar process of linking ideas and evidence occurs in the construction of truth tables (see Chapter 5). In quantitative research investigators must know a lot in advance of data collection. They must learn as much as they can about the theories they want to test, about their cases, and about how to measure their variables before they collect the data that will be used to test their theories. Thus, the examination of relationships among variables (the technique quantitative researchers use to construct evidence-based images) is near the end of a very long journey.

When quantitative researchers test theories, the key question is whether or not the correlations follow patterns consistent with the ideas that motivated the study. Sometimes this assessment involves the correlation between a single independent variable and a single dependent variable. In the study of job satisfaction: How strong is the correlation between job satisfaction and the degree to which employees' skills and duties are matched? Sometimes testing a theory involves comparing the strength of a correlation in different times or settings: Is educational level more strongly linked to income level in 1994 than it was in 1954? Sometimes testing involves comparing the correlations of several independent variables with one or more dependent variables: Is the effect of race on income stronger or weaker than the effect of education on income? Did the pattern change between 1954 and 1994?

What do researchers do when correlations do not support their theories? Sometimes, they simply report that the evidence does not support their theory. In other words, they report that they attempted to construct an evidence-based image consistent with some theory, but were unable to do so, suggesting that the theory is wrong. In general, however, the audiences for social science expect social life to be represented in some way in a research report.
They do not expect a report of a failed attempt to construct a representation. Such reports should be more common than they are because the logic of theory testing (that is, the effort to figure out which ideas are best supported by evidence) indicates that negative findings (that is, failed representations) are very important.

More often, if the initial test of a hypothesis fails, researchers examine their evidence closely to see if there is support for their theory under specific conditions. After finding a weak correlation between job satisfaction and the degree to which employees' skills and duties are matched, a researcher might consider the possibility that other factors need to be considered. Perhaps employees who have been with the firm the longest are more satisfied, regardless of how well their skills are matched to their duties. This factor would need to be taken into account when examining the relation between job satisfaction and the match of skills and duties. Generally, researchers try to use their general knowledge of their cases and their theoretical understanding to anticipate refinements like these before they collect their data. They may also specify additional hypotheses in advance as a way to anticipate such failures.

Using Quantitative Methods

An Introduction to Quantitative Methods

Quantitative methods focus directly on relationships among variables, especially the effects of causal or independent variables on outcome or dependent variables. Another way to think about the quantitative approach is to see the level of the dependent variable (for example, variation across countries in life expectancy) as something that depends on the level of other variables (for example, variation across countries in nutrition).
The strength of the correlation between the independent and the dependent variable provides evidence in favor of or against the idea that two variables are causally connected or linked in some other way.

The exact degree to which two variables correlate can be determined by computing a correlation coefficient. The most common correlation coefficient is known as Pearson's r and is the main focus of this discussion. If the correlation is substantial and the implied cause-effect sequence makes sense, then the cause (the independent variable) is said to "explain variation" in the effect (the dependent variable).

If cities in the United States with lower unemployment rates also tend to have lower crime rates, then these two features of cities, unemployment rates and crime rates, go together; they correlate. Generally, social scientists would argue that the unemployment rate (the independent variable) explains variation across cities in the crime rate (the dependent variable). The general pattern of covariation in this hypothetical example is high unemployment rates with high crime rates, moderate unemployment rates with moderate crime rates, and low unemployment rates with low crime rates, as depicted with hypothetical data on cities in Figure 6.1. In this figure, the correlation is described as a positive correlation because high unemployment rates go with high crime rates and low unemployment rates go with low crime rates.

[Figure 6.1. Plot of Crime Rate with Rate of Unemployment Showing Positive Correlation. Vertical axis: crime rate; horizontal axis: unemployment rate.]

Some general patterns of covariation display negative correlations. If people who work in less bureaucratic settings display, on the average, more job satisfaction than people who work in more bureaucratic settings, then these two things, job satisfaction and degree of bureaucratization of work, are negatively correlated.
This pattern can be depicted in a plot of employee data, as in Figure 6.2, which presents hypothetical evidence conforming to the stated pattern. According to the diagram, bureaucratization explains variation in job satisfaction because job satisfaction is high when people work in settings that are less bureaucratized, and vice versa.

[Figure 6.2. Plot of Job Satisfaction and Bureaucratization of Work Showing Negative Correlation. Vertical axis: job satisfaction; horizontal axis: bureaucratization of work.]

In both examples, features of cases, called variables, are observed not in the context of individual cases, but across many cases. It is the pattern across many cases that defines the relation between the two features, not how the two features fit together or relate in individual cases. In the example of the positive correlation just described, it may be that one of the cities combining high unemployment and high crime rates had a recent, dramatic increase in unemployment coupled with a decrease in its crime rate, the opposite of the general pattern across cities. (If this city's crime declined from a very high level to a merely high level, it would still appear in the high unemployment, high crime rate portion of Figure 6.1.) What happened in one case over time cannot be addressed in the correlation across many cities at a single point in time. What matters is the general pattern: Do the cities with the highest unemployment rates also have the highest crime rates? In other words, the analysis of the relation between unemployment and crime in this example proceeds across cities, not within individual cities over time.

The correlation coefficient provides a way to make a direct, quantitative evaluation of the degree to which phenomena (for example, unemployment rates and crime rates) covary across cases (such as cities in the United States). The Pearson correlation coefficient itself varies between -1.00 and +1.00.
A value of -1.00 indicates a perfect negative correlation; a value of +1.00 indicates a perfect positive correlation; and a value of 0 indicates no correlation. Sometimes a finding of no correlation is important because social researchers may have strong reasons to believe that a correlation should exist. The finding of no correlation may challenge widely accepted ideas.

It is sometimes difficult to specify what value constitutes a "strong" correlation. People tend to be relatively unpredictable. Thus, some researchers consider an individual-level correlation strong if it is greater than .3 (or more negative than -.3). For whole countries, by contrast, a correlation of .3 is considered weak because many features of countries tend to be highly correlated (for example, average wealth, life expectancy, literacy, level of industrialization, rate of car ownership, and so on). When assessing the strength of correlations, it is important to consider the nature of the data used in the computation.

Computing Correlation Coefficients

The hand calculation of a correlation coefficient is time consuming but straightforward. Usually, computers are used to compute correlation coefficients such as Pearson's r. The calculation of Pearson's r is illustrated in the appendix to this book in order to show the underlying logic of the coefficient. Remember, the goal of the computation is to assess the degree to which the values (or scores) of two variables covary across many cases, in either a positive or a negative direction. In other words, do the cases with high values on the independent variable tend to have high values on the dependent variable? Do the cases with low values on the independent variable tend to have low values on the dependent variable? If so, then a strong positive correlation exists.
If high values on the independent variable tend to be associated with low values on the dependent variable, and vice versa, then a strong negative correlation exists. If there is no pattern of covariation between two variables, then there is no correlation between them.

The key to calculating a correlation coefficient is to convert the scores on two variables to Z scores, as explained in the appendix. Z scores standardize variables so that they all have the same mean or average value (0) and the same degree of variation. Table 6.1 reports data on two variables for forty countries: the average number of calories consumed per person each day (the independent variable) and life expectancy (the dependent variable). These two variables can be used to test the simple idea that in countries where nutrition is better (as reflected in more calories consumed per person) people tend to live longer (as indicated in a longer life expectancy). Table 6.1 also reports the Z scores for these two variables for all forty cases.

Table 6.1 Calculating the Correlation between Calorie Consumption and Life Expectancy

Country       Life         Calorie       Life Expectancy   Calorie Consumption
              Expectancy   Consumption   Z Scores          Z Scores
Niger         45           2432          -2.04              -.70
Ethiopia      47           1749          -1.85             -1.92
Mali          47           2074          -1.85             -1.34
Uganda        48           2344          -1.75              -.86
Senegal       48           2350          -1.75              -.85
Sudan         50           2208          -1.55             -1.10
Ghana         54           1759          -1.17             -1.90
Kenya         58           2060           -.78             -1.37
Zimbabwe      58           2132           -.78             -1.24
Botswana      59           2201           -.68             -1.11
Indonesia     60           2579           -.58              -.44
Morocco       61           2915           -.49               .16
Peru          61           2246           -.49             -1.03

Remaining countries in the table (names only; their values are missing here): Philippines, Thailand, Turkey, Syria, Brazil, Colombia, Paraguay, Mexico, S. Korea, Malaysia, Hungary, Poland, Chile, Jamaica, Ireland, United States, Greece, Australia, Spain, Italy, Netherlands, France, Canada, Sweden, Norway, Switzerland, Japan.
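The Z-score conversion described above, and the averaging of Z-score products that turns it into Pearson's r, can be sketched in a few lines of Python. This is a minimal sketch rather than the appendix's own worked calculation. It assumes the standard deviation is computed by dividing by the number of cases, so that the simple average of the products equals r. The five data points are hypothetical and are not taken from Table 6.1.

```python
def z_scores(values):
    """Standardize scores: subtract the mean, divide by the standard deviation.

    Assumption: the standard deviation divides by the number of cases (N),
    so the resulting Z scores have mean 0 and standard deviation 1.
    """
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]


def pearson_r(xs, ys):
    """Average, over all cases, of the products of paired Z scores."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


# Hypothetical values for five imaginary countries (not from Table 6.1).
calories = [1800, 2200, 2600, 3000, 3400]
life_expectancy = [48, 55, 63, 70, 76]

r = pearson_r(calories, life_expectancy)
print("Z scores for calories:", [round(z, 2) for z in z_scores(calories)])
print("Pearson's r:", round(r, 3))
```

Cases below the mean receive negative Z scores and cases above it receive positive ones. In this made-up example low values pair with low and high with high, so the products are mostly positive and r comes out close to +1.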
Notice that countries with high scores on life expectancy have positive scores on life expectancy Z scores, and countries with low scores on life expectancy have negative scores on life expectancy Z scores. The same is true for calorie consumption. When the Z scores for two variables are multiplied, the products indicate a lot about the correlation. If high scores on one variable correspond to high scores on the other, and low scores on one correspond to low scores on the other, then the products of the Z scores will usually be positive, indicating a positive correlation. However, if low scores on one variable generally correspond to high scores on the other, and vice versa, then the products of the Z scores generally will be negative, indicating a negative correlation.

As the appendix illustrates, when the products of pairs of Z scores for two variables are averaged over all the cases, the number that results is Pearson's correlation coefficient, a number which varies between -1.00 (perfect negative correlation) and +1.00 (perfect positive correlation). The correlation between life expectancy and calorie consumption for the forty countries in Table 6.1 is .802, a strong positive correlation. The strong covariation between these two variables is clear from simply examining the table because the countries are sorted according to their values on life expectancy. The calculation of the correlation coefficient provides a direct, quantitative assessment of the degree to which the two measures covary.

The most basic use of correlation coefficients is to assess the strength of the relation between two variables. The correlation between calorie consumption and life expectancy is strong (r = .802), suggesting that an important key to longer life expectancy is nutrition. But there are many other uses of correlations. Most of these involve the comparison of competing causes, as indicated in the strength of correlations. Consider the correlations reported in Table 6.2.
The table shows all the correlations among four variables: three independent variables (calorie consumption, GNP per capita, and doctors per capita) and one dependent variable (life expectancy). (Notice that a variable correlates perfectly with itself, as shown by the values of 1.000 in Table 6.2.) GNP per capita is a rough measure of the wealth of a country. Doctors per capita is a rough measure of the availability of medical care.

Table 6.2 A Correlation Matrix with Three Independent Variables and a Dependent Variable

                        Dependent Variable   Independent Variables
                        Life                 Calorie        GNP per          Doctors
                        Expectancy           Consumption    Capita (US$)     per Capita
Life expectancy         1.000                 .802           .651             .721
Calorie consumption      .802                1.000           .848             .321
GNP per capita (US$)     .651                 .848          1.000             .671
Doctors per capita       .721                 .321           .671            1.000

The first column shows the correlations of the three independent variables with the dependent variable. Calorie consumption is the most strongly correlated with life expectancy (r = .802), followed by doctors per capita (r = .721), followed by GNP per capita (r = .651). Is it possible to conclude from this evidence that all that really matters for life expectancy is calorie consumption? In other words, if the goal is to understand the variation in life expectancy across countries, is knowing nutrition levels enough? Is it reasonable to ignore the correlations with GNP per capita and doctors per capita?

In order to answer a question like this, it is not enough simply to identify the independent variable with the strongest correlation with the dependent variable. It is also necessary to examine the correlations among the independent variables. Consider first the correlation between calorie consumption and GNP per capita. It is strong (r = .848), suggesting that countries with the best nutrition are also the richest.
Given that (1) these two independent variables are strongly correlated and (2) calorie consumption has a stronger correlation with life expectancy than does GNP per capita (r = .802 versus .651), it is reasonable to conclude that the link between calorie consumption and life expectancy is more fundamental than the link between GNP per capita and life expectancy. In short, richer countries have better nutrition, but it is good nutrition that causes greater life expectancy, not wealth per se.

What about doctors per capita? The correlation between doctors per capita and calorie consumption is positive, but not strong (r = .321). Thus, in some countries nutrition may not be good, but good health care is available, while in other countries, the opposite may be the case. In other words, doctors per capita and calorie consumption are not closely linked across countries in the same way that GNP per capita and calorie consumption are. Thus, the correlation between doctors per capita and life expectancy, the dependent variable, is relatively independent of and separate from the correlation between calorie consumption and life expectancy. Even though the correlation between doctors per capita and life expectancy (r = .721) is not as strong as the correlation between calorie consumption and life expectancy (r = .802), it is an important correlation. The pattern of correlations in Table 6.2 indicates that both doctors per capita and calorie consumption affect life expectancy.

A lot can be learned from looking at a correlation matrix like the one in Table 6.2. However, some quantitative studies examine many independent and dependent variables. Quantitative researchers use advanced statistical techniques such as multiple regression analysis to disentangle correlations among independent variables and assess their separate effects on dependent variables.
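As a rough illustration of what disentangling means, the sketch below applies the standard first-order partial correlation formula to the coefficients reported in Table 6.2. This is a simpler stand-in for multiple regression, not the technique the text names, and it is not a calculation the chapter itself performs. The function and variable names are hypothetical; only the three-decimal correlations come from the table.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing what each shares with z.

    Standard first-order partial correlation formula.
    """
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Correlations taken from Table 6.2.
R_LIFE_CALORIES = 0.802
R_LIFE_GNP = 0.651
R_LIFE_DOCTORS = 0.721
R_CALORIES_GNP = 0.848
R_CALORIES_DOCTORS = 0.321

# Life expectancy and GNP per capita, holding calorie consumption constant.
gnp_net_of_calories = partial_r(R_LIFE_GNP, R_LIFE_CALORIES, R_CALORIES_GNP)

# Life expectancy and calorie consumption, holding GNP per capita constant.
calories_net_of_gnp = partial_r(R_LIFE_CALORIES, R_LIFE_GNP, R_CALORIES_GNP)

# Life expectancy and doctors per capita, holding calorie consumption constant.
doctors_net_of_calories = partial_r(
    R_LIFE_DOCTORS, R_LIFE_CALORIES, R_CALORIES_DOCTORS
)

print("GNP, net of calories:    ", round(gnp_net_of_calories, 2))
print("Calories, net of GNP:    ", round(calories_net_of_gnp, 2))
print("Doctors, net of calories:", round(doctors_net_of_calories, 2))
```

With these inputs, the link between GNP per capita and life expectancy falls to near zero once calorie consumption is held constant, while the links for calorie consumption and for doctors per capita stay substantial. That matches the reading of Table 6.2 given in the text.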
Quantitative researchers also use exploratory data analysis techniques (EDA; see Tukey 1977) to go beyond broad patterns of covariation to identify sets of cases that deviate from these broad patterns or to uncover very subtle patterns. Sometimes these techniques can be used to identify complex patterns of causation that are specific to subsets of cases included in a study (Leamer 1978). These advanced statistical techniques are very powerful, and they further the primary goals of this approach: assessing general patterns (including their limits), making projections about the future, and evaluating broad theories.

Conclusion

Quantitative methods are best suited for addressing differences across a large number of cases. These methods focus especially on the covariation between attributes that vary by level, usually across many cases. If two features of cases vary together in a systematic way, they are said to correlate. Correlation is important because it may suggest that a causal or some other kind of important relation exists between the two features that are linked. Quantitative methods provide a direct way to implement a researcher's interest in general patterns, and quantitative researchers believe that these patterns of covariation provide important clues about social life.

In many ways, the quantitative approach appears to be the most scientific of the three approaches presented in this book. It favors generality and parsimony. It uses generic units such as individuals, families, states, cities, and countries. It can be used to assess broad relationships across countless cases. It condenses evidence to simple coefficients, using mathematical procedures. It can be used to test broad theoretical arguments and to make projections about the future. In short, it imitates many of the features and practices of hard sciences such as physics and chemistry.
While the quantitative approach does have many of the features of a hard science, it would be a mistake to portray this approach as something radically different from the other two strategies. All social research engages theoretical ideas and analytic frames, at least indirectly. All social research involves constructing images from evidence, usually lots of it. And all social researchers construct images by connecting social phenomena.