5 Designing Qualitative Studies

The First Evaluation

The young people gathered around Halcolm. "Tell us again, Teacher of Many Things, about the first evaluation."

"The first evaluation," he began, "was conducted a long, long time ago, in ancient Babylon when Nebuchadnezzar was king. Nebuchadnezzar had just conquered Jerusalem in the third year of the reign of Jehoiakim, King of Judah. Now Nebuchadnezzar was a shrewd ruler. He decided to bring carefully selected children of Israel into the palace for special training so that they might be more easily integrated into Chaldean culture. This special program was the forerunner of the compensatory education programs that would become so popular in the 20th century. The three-year program was royally funded with special allocations and scholarships provided by Nebuchadnezzar. The ancient text from the Great Book records that

the king spake unto Ashpenaz the master of his eunuchs that he should bring certain of the children of Israel, and of the king's seed, and of the princes; children in whom was no blemish, but well-favored and skillful in all wisdom, and cunning in knowledge, and understanding science, and such as had ability in them to stand in the king's palace, and whom they might teach the learning and the tongue of the Chaldeans. And the king appointed them a daily provision of the king's meat, and of the wine which he drank; so nourishing them three years, that at the end thereof they might stand before the king. (Daniel 1:3-5)

"Now this program had scarcely been established when the program director, Ashpenaz, who happened also to be prince of the eunuchs, found himself faced with a student rebellion led by a radical named Daniel, who decided for religious reasons that he would not consume the king's meat and wine. This created a serious problem for the director. If Daniel and his coconspirators did not eat their dormitory food, they might fare poorly in the program and endanger not only future program funding but also the program director's head! The Great Book says:

But Daniel purposed in his heart that he would not defile himself with the portion of the king's meat, nor with the wine which he drank; therefore he requested of the prince of the eunuchs that he might not defile himself.
And the prince of the eunuchs said unto Daniel, I fear my lord the king, who hath appointed your meat and your drink; for why should he see your faces worse liking than the children which are of your sort? Then shall ye make me endanger my head to the king. (Daniel 1:8, 10)

"At this point, Daniel proposed history's first educational experiment and program evaluation. He and three friends (Hananiah, Mishael, and Azariah) would be placed on a strict vegetarian diet for ten days, while other students continued on the king's rich diet of meat and wine. At the end of ten days the program director would inspect the treatment group for any signs of physical deterioration and judge the productivity of Daniel's alternative diet plan. Daniel proposed the experiment thusly:

Prove thy servants, I beseech thee, ten days; and let them give us pulse to eat, and water to drink. Then let our countenances be looked upon before thee, and the countenance of the children that eat of the portion of the king's meat: and as thou seest, deal with thy servants. So he consented to them in this matter, and proved them ten days. (Daniel 1:12-14)

"During the ten days of waiting Ashpenaz had a terrible time. He couldn't sleep, he had no appetite, and he had trouble working because he was preoccupied worrying about how the evaluation would turn out. He had a lot at stake. Besides, in those days they hadn't quite worked out the proper division of labor, so he had to play the roles of both program director and evaluator. You see...."

The young listeners interrupted Halcolm. They sensed that he was about to launch into a sermon on the origins of the division of labor when they still wanted to hear the end of the story about the origins of evaluation. "How did it turn out?" they asked. "Did Daniel end up looking better or worse from the new diet? Did Ashpenaz lose his head?"

"Patience, patience," Halcolm pleaded. "Ashpenaz had no reason to worry. The results were quite amazing.
The Great Book says that at the end of ten days their countenances appeared fairer and fatter in flesh than all the children which did eat the portion of the king's meat.

Thus, Melzar took away the portion of their meat, and the wine that they should drink; and gave them pulse. As for these four children, God gave them knowledge and skill in all learning and wisdom; and Daniel had understanding in all visions and dreams. Now at the end of the days that the king had said he should bring them in, then the prince of the eunuchs brought them in before Nebuchadnezzar. And in all matters of wisdom and understanding, that the king inquired of them, he found them ten times better than all the magicians and astrologers that were in all his realm. (Daniel 1:15-18, 20)

"And that, my children, is the story of the first evaluation. Those were the good old days when evaluations really got used. Made quite a difference to Ashpenaz and Daniel. Now off with you—and see if you can do as well."

—From Halcolm's Evaluation Histories

A Meta-Evaluation

A meta-evaluation is an evaluation of an evaluation. A great deal can be learned about evaluation designs by conducting a meta-evaluation of history's first program evaluation. Let us imagine a panel of experts conducting a rigorous critique of this evaluation of Babylon's compensatory education program for Israeli students:

1. Small sample size (N = 4).
2. Selectivity bias because recruitment into the program was done by "creaming," that is, only the best prospects among the children of Israel were brought into the program.
3. Sampling bias because students were self-selected into the treatment group (diet of pulse and water).
4. Failure to clearly specify and control the nature of the treatment, thus allowing for the possibility of treatment contamination, because we don't know what other things, aside from a change in diet, either group was involved in that might have explained the outcomes observed.
5. Possibility of interaction effects between the diet and the students' belief system (i.e., potential Hawthorne and halo effects).
6. Outcome criteria vague: Just what is "countenance"?
7. Outcome measurement poorly operationalized and nonstandardized.
8. Single observer with deep personal involvement in the program introduces possibility of selective perception and bias in the observations.
9. Validity and reliability data are not reported for the instruments used to measure the final, summative outcome ("he found them ten times better than all the magicians and astrologers").
10. Possible reactive effects from the students' knowledge that they were being evaluated.

Despite all of these threats to internal validity, not to mention external validity, the information generated by the evaluations appears to have been used. The 10-day formative evaluation was used to make a major decision about the program, namely, to change the diet for Daniel and his friends. The end-of-program summative evaluation conducted by the king was used to judge the program a success. (Daniel did place first in his class.) Indeed, it would be difficult to find a more exemplary model for the uses of evaluation in making educational policy decisions than this "first evaluation" conducted under the auspices of Nebuchadnezzar so many years ago. This case study is an exemplar of evaluation research having an immediate, decisive, and lasting impact on an educational program.
Modern evaluation researchers, flailing away in seemingly futile efforts to affect contemporary governmental decisions, can be forgiven a certain nostalgia for the "good old days" in Babylon when evaluation really made a difference.

But should the results have been used? Given the apparent weakness of the evaluation design, was it appropriate to make a major program decision on the basis of data generated by such a seemingly weak research design? I would argue that not only was use impressive in this case, it was also appropriate because the research design was exemplary. Yes, exemplary, because the study was set up in such a way as to provide precisely the information needed by the program director to make the decision he needed to make. Certainly, it is a poor research design to study the relationship between nutrition and educational achievement. It is even a poor design to decide if all students should be placed on a vegetarian diet. But those were not the issues. The question the director faced was whether to place four specific students on a special diet at their request. The information he needed concerned the consequences of that specific change and only that specific change. He showed no interest in generalizing the results beyond those four students, and he showed no interest in convincing others that the measures he made were valid and reliable. Only he and Daniel had to trust the measures used, and so data collection (observation of countenance) was done in such a way as to be meaningful and credible to the primary intended evaluation users, namely, Ashpenaz and Daniel. If any bias existed in his observations, given what he had at stake, the bias would have operated against a demonstration of positive outcomes rather than in favor of such outcomes. While there are hints of whimsy in the suggestion that this first evaluation was exemplary, I do not mean to be completely facetious.
I am serious in suggesting that the Babylonian example is an exemplar of utilization-focused evaluation. It contains and illustrates all the factors modern evaluation researchers have verified as critical from studies of utilization (Patton 1997a). The decision makers who were to use findings generated by the evaluation were clearly identified and deeply involved in every stage of the evaluation process. The evaluation question was carefully focused on needed information that could be used in the making of a specific decision. The evaluation methods and design were appropriately matched to the evaluation question. The results were understandable, credible, and relevant. Feedback was immediate and utilization was decisive. Few modern evaluations can meet the high standards for evaluation set by Ashpenaz and Daniel more than 3,000 years ago.

This chapter discusses some ways in which research designs can be appropriately matched to evaluation questions in an attempt to emulate the exemplary match between evaluation problem and research design achieved in the Babylonian evaluation. As with previous chapters, I shall emphasize the importance of being both strategic and practical in creating evaluation designs. Being strategic begins with being clear about the purpose of the intended research or evaluation.

Clarity About Purpose: A Typology

Purpose is the controlling force in research. Decisions about design, measurement, analysis, and reporting all flow from purpose. Therefore, the first step in a research process is getting clear about purpose. The centrality of purpose in making methods decisions becomes evident from examining alternative purposes along a continuum from theory to action:

1. Basic research: To contribute to fundamental knowledge and theory
2. Applied research: To illuminate a societal concern
3. Summative evaluation: To determine program effectiveness
4. Formative evaluation: To improve a program
5. Action research: To solve a specific problem

Basic and applied researchers publish in scholarly journals, where their audience is other researchers who will judge their contributions using disciplinary standards of rigor, validity, and theoretical import. In contrast, evaluators and action researchers publish reports for specific stakeholders who will use the results to make decisions, improve programs, and solve problems. Standards for judging quality vary among these five different types of research. Expectations and audiences are different. Reporting and dissemination approaches are different. Because of these differences, the researcher must be clear at the beginning about which purpose has priority. No single study can serve all of these different purposes and audiences equally well. With clarity about purpose and primary audience, the researcher can go on to make specific design, data-gathering, and analysis decisions to meet the priority purpose and address the intended audience.

In the Babylonian example, the purpose was simply to find out if a vegetarian diet would negatively affect the healthy appearances (countenances) of four participants—not why their countenances appeared healthy or not (a causal question), but whether the dietary change would affect countenance (a descriptive question). The design, therefore, was appropriately simple to yield descriptive data for the purpose of making a minor program adjustment. No contribution to general knowledge. No testing or development of theory. No generalizations. No scholarly publication. No elaborate report on methods. Just find out what would happen to inform a single decision about a possible program change. The participants in the program were involved in the study; indeed, the idea of putting the diet to an empirical test originated with Daniel. In short, we have a very nice example of simple formative evaluation.
The king's examination of program participants at the end of three years was quite different. We might infer that the king was judging the overall value of the program. Did it accomplish his objectives? Should it be continued? Could the outcomes he observed be attributed to the program? This is the kind of research we have come to call summative evaluation—summing up judgments about a program to make a major decision about its value, whether it should be continued, and whether the demonstrated model can or should be generalized to and replicated for other participants or in other places.

Now imagine that researchers from the University of Babylon wanted to study the diet as a manifestation of culture in order to develop a theory about the role of diet in transmitting culture. Their sample, their data collection, their questions, the duration of fieldwork, and their presentation of results would all be quite different from the formative evaluation undertaken by Ashpenaz and Daniel. The university study would have taken much longer than 10 days and might have yielded empirical generalizations and contributions to theory, yet would not have helped Ashpenaz make his simple decision. On the other hand, we might surmise that University of Babylon scholars would have scoffed at an evaluation done in 10 days, even a formative one.

Different purposes. Different criteria for judging the research contribution. Different methods. Different audiences. Different kinds of research. These are examples of how purpose can vary. In the next section, I shall present a more formal framework for distinguishing these five different research purposes and examine in more depth the implications of varying purposes for making design decisions. Previous chapters have presented the nature and strategies of qualitative inquiry, philosophical and theoretical foundations, and practical applications.
In effect, the reader has been presented with a large array of options, alternatives, and variations. How does one sort it all out to decide what to do in a specific study? The answer is to get clear about purpose. The framework that follows is meant to facilitate achieving this necessary clarity about purpose while also illustrating how one can organize a mass of observations into some coherent typology—a major analytical tool of qualitative inquiry. The sections that follow examine each type of research: basic research, applied research, summative evaluation research, formative evaluation, and action research.

Basic Research

The purpose of basic research is knowledge for the sake of knowledge. Researchers engaged in basic research want to understand how the world operates. They are interested in investigating a phenomenon to get at the nature of reality with regard to that phenomenon. The basic researcher's purpose is to understand and explain. Basic researchers typically work within specific disciplines, such as physics, biology, psychology, economics, geography, and sociology. The questions and problems they study emerge from traditions within those disciplines. Each discipline is organized around attention to basic questions, and the research within that discipline derives from concern about those basic questions. Exhibit 5.1 presents examples of fundamental questions in several disciplines. The fundamental questions undergirding each discipline flow from the basic concerns and traditions of that discipline. Researchers working within any specific disciplinary tradition strive to make a contribution to knowledge in that discipline and thereby contribute to answering the fundamental questions of the discipline. The most prestigious contribution to knowledge takes the form of a theory that explains the phenomenon under investigation. Basic researchers work to generate new theories or test existing theories.
Doctoral students are typically expected to make theoretical contributions in their dissertations. Theories encapsulate the knowledge of a discipline. Basic researchers are interested in formulating and testing theoretical constructs and propositions that ideally generalize across time and space. The most powerful kinds of findings in basic science are those findings that are universal, such as Boyle's law in physics that the volume of a gas at constant temperature varies inversely with the pressure exerted on it. Basic researchers, then, are searching for fundamental patterns of the universe, the earth, nature, society, and human beings. For example, biologists have discovered that "changes in DNA are inherited, but changes in proteins (specifically, in their amino acid sequence) are not . . . perhaps the only universal truth biologists have" (Smith 2000:43). Social science, to date, is markedly short of "universal truths." Nevertheless, generalizations across time and space remain the Holy Grail of basic research and theory.

The findings of basic research are published in scholarly books, journals, and dissertations. Each discipline has its own traditions, norms, and rules for deciding what constitutes valid research in that discipline. To be published in the major journals of any particular discipline, scientists must engage in the kind of research that is valued by the researchers in that disciplinary tradition. Chapter 3 reviewed theoretical traditions closely associated with qualitative inquiry, for example, ethnography and phenomenology.
Qualitative inquiry also contributes to basic research through inductive theory development, a prominent example being the "grounded theory" approach of Glaser and Strauss (1967), essentially an inductive strategy for generating and confirming theory that emerges from close involvement and direct contact with the empirical world. Basic qualitative research typically requires a relatively lengthy and intensive period of fieldwork. The rigor of field techniques will be subject to peer review. Particular attention must be given to the accuracy, validity, and integrity of the results.

EXHIBIT 5.1 Fundamental Disciplinary Questions

Anthropology: What is the nature of culture? How does culture emerge? How is it transmitted? What are the functions of culture?

Psychology: Why do individuals behave as they do? How do human beings behave, think, feel, and know? What is normal and abnormal in human development and behavior?

Sociology: What holds groups and societies together? How do various forms of social organization emerge and what are their functions? What are the structures and processes of human social organizations?

Political science: What is the nature of power? How is power organized, created, distributed, and used?

Economics: How do societies and groups generate and distribute scarce resources? How are goods and services produced and distributed? What is the nature of wealth?

Geography: What is the nature of and variations in the earth's surface and atmosphere? How do various forms of life emerge in and relate to variations in the earth's surface? What is the relationship between the physical characteristics of an area and the activities that take place in that area?

Biology: What is the nature of life? What are variations in the forms of life? How have life forms emerged and how do they change?
An example of interdisciplinary theory development comes from work in basic economic anthropology studying craft commercialization and the product differentiation that ordinarily accompanies increased craft sales. Artisans in emerging markets, such as those in rural Mexico, typically cultivate and develop specialties in an attempt to establish a niche for themselves in a new economic environment. Chibnik's basic research on commercial wood carving in Oaxaca has led to the theory that market segmentation resembles the later stages of product life cycles described in the business literature and is somewhat analogous to the proliferation of equilibrium species in mature or climax stages of ecological successions. Chibnik examined both market demands and the initiative of artisans and found that local artisans do not have total freedom in their attempts to create market niches since they are restricted by their abilities and the labor and capital they can mobilize. This is a classic example of interdisciplinary theory generation and testing bridging economics, ethnology, and ecology.

Applied Research

Applied researchers work on human and societal problems. In the example just cited, had Chibnik examined the problem of creating new markets for rural artisans and offered possible solutions for increased marketing, the work would have constituted applied rather than basic research. The purpose of applied research is to contribute knowledge that will help people understand the nature of a problem in order to intervene, thereby allowing human beings to more effectively control their environment. While in basic research the source of questions is the traditions within a scholarly discipline, in applied research the source of questions is in the problems and concerns experienced by people and articulated by policymakers. Applied researchers are often guided by the findings, understandings, and explanations of basic research.
They conduct studies that test applications of basic theory and disciplinary knowledge to real-world problems and experiences. The results of applied research are published in journals that specialize in applied research within the traditions of a problem area or a discipline. Societal concerns have given rise to a variety of new fields that are interdisciplinary in nature. These emerging fields reflect the long-standing criticism by policymakers that universities have departments but society has problems. Applied interdisciplinary fields are especially problem oriented rather than discipline oriented. For example, work on environmental studies often involves researchers from a number of disciplines. In agricultural research, the field of integrated pest management (IPM) includes researchers from entomology, agronomy, agricultural economics, and horticulture. Fields of interdisciplinary research in the social sciences include gerontology, criminal justice studies, women's studies, and family research. Exhibit 5.2 offers examples of applied interdisciplinary research questions for economic anthropology, social psychology, political geography, and educational and organizational development. Notice the difference between these questions and the kinds of questions asked by basic researchers in Exhibit 5.1. Applied researchers are trying to understand how to deal with a significant societal problem, while basic researchers are trying to understand and explain the basic nature of some phenomenon. Applied qualitative researchers are able to bring their personal insights and experiences into any recommendations that may emerge because they get especially close to the problems under study during fieldwork. Audiences for applied research are typically policymakers, directors and managers of intervention-oriented organizations, and professionals working on problems.
Timelines for applied research depend a great deal on the timeliness and urgency of the problem being researched. A good example of applied research is Emerging Drug Problems, a work sponsored by the U.S. General Accounting Office (1998) that examined new street drugs, recent research on addiction, and alternatives for public policy. In contrast to basic researchers, who ultimately seek to generalize across time and space, applied research findings typically are limited to a specific time, place, and condition. For example, a researcher studying the nature of family problems in the 1980s would not expect those problems to be the same as those experienced by families in the 1880s. While the research might include making such comparisons, applied researchers understand that problems emerge within particular time and space boundaries.

EXHIBIT 5.2 Sample Interdisciplinary Applied Research Questions

Applied economic anthropology: How can the prosperous economy of an isolated, small minority group be preserved when that group encounters new competition from the encroaching global economy?

Applied social psychology: How can a group within a large organization develop cohesion and identity within the mission and values of its parent structure and culture?

Applied political geography: How can people of previously isolated towns, each with its own history of local governance, come together to share power and engage in joint decision making at a regional level?

Applied educational and organizational development: How can students from different neighborhoods with varied ethnic and racial backgrounds be integrated in a new magnet school?

Evaluation Research: Summative and Formative

Once solutions to problems are identified, policies and programs are designed to intervene in society and bring about change. It is hoped that the intervention and changes will be effective in helping to solve problems.
However, the effectiveness of any given human intervention is a matter subject to study. Thus, the next point on the theory-to-action research continuum is the conduct of evaluation and policy research to test the effectiveness of specific solutions and human interventions. While applied research seeks to understand societal problems and identify potential solutions, evaluations examine and judge the processes and outcomes of attempted solutions. Evaluators study programs, policies, personnel, organizations, and products. Evaluation research can be conducted on virtually any explicit attempt to solve problems or bring about planned change. As illustrated in the Daniel story of history's "first evaluation" that opened this chapter, evaluators distinguish two quite different purposes for evaluation: (1) summative evaluations that judge overall effectiveness to inform major decisions about whether a program should continue and (2) formative evaluations that aim to improve programs.

Summative evaluations serve the purpose of rendering an overall judgment about the effectiveness of a program, policy, or product for the purpose of saying that the evaluand (thing being evaluated) is or is not effective and, therefore, should or should not be continued, and has or does not have the potential of being generalizable to other situations. A summative decision implies a summing-up judgment or a summit (from the mountaintop) decision, for example, to expand a pilot program to new sites or move it from temporary (pilot test or demonstration) funding to more permanent funding, or it may lead to program or policy termination. Summative evaluations seldom rely entirely, or even primarily, on qualitative data and naturalistic inquiry because of decision makers' interest in measuring standardized outcomes, having controlled comparisons, and making judgments about effectiveness from relatively larger samples with statistical pre-post and follow-up results. Qualitative data in summative evaluations typically add depth, detail, and nuance to quantitative findings, rendering insights through illuminative case studies and examining individualized outcomes and issues of quality or excellence—applications discussed in Chapter 4. Harkreader and Henry (2000) have provided an excellent discussion of the challenges of rendering summative judgments about merit and worth; they use as their example comparative quantitative performance data from Georgia schools to assess a democratic reform initiative. Fetterman (2000b) shows how qualitative data can be the primary basis for a summative evaluation. His evaluation of STEP, a 12-month teacher education program in the Stanford University School of Education, included fieldwork immersion in the program, open-ended interviews with all students, focus groups, observations of classrooms, interviews with faculty, digital photography of classroom activities, and qualitative analysis of curricular materials, as well as a variety of surveys and outcome measures. The summative evaluations of a democratic reform initiative in Georgia and of Stanford's STEP program both followed and built on extensive formative evaluation work, the purpose of which we now examine.

Formative evaluations, in contrast to summative ones, serve the purpose of improving a specific program, policy, group of staff (in a personnel evaluation), or product. Formative evaluations aim at forming (shaping) the thing being studied. No attempt is made in a formative evaluation to generalize findings beyond the setting in which the evaluation takes place.
Formative evaluations rely heavily on process studies, implementation evaluations, case studies, and evaluability assessments (see Chapter 4). Formative evaluations often rely heavily, even primarily, on qualitative methods. Findings are context specific.

Although formative and summative remain the most basic and classic distinctions in evaluation, other evaluation purposes have emerged in recent years in "a world larger than formative and summative" (Patton 1996b). New purposes include ongoing "developmental evaluation" for program and organizational development and learning (Patton 1994; Preskill and Torres 1999); empowering local groups through evaluation participation (Fetterman 2000a; Patton 1997b); and using the processes of evaluation (process use) to build staff capacity for data-based decision making and continuous improvement (Patton 1997a:87-113, 1998). For our analysis here, however, these and related approaches to evaluation share the general purpose of improvement and can be included within the broad category of formative research along our theory-to-action continuum. In addition, some evaluation studies are now designed to generate generalizable knowledge about effective practices across different projects or programs based on cluster evaluations, lessons learned, "better" practices, and meta-analyses (Patton 1997a:70-75). This knowledge-generating approach to evaluation research, to the extent that it aims to discover general principles of effective practice rather than render judgment about the merit or worth of a specific intervention, falls roughly within the category "applied research" in this theory-to-action continuum.
However, the emergence and increased importance of knowledge-generating evaluations illustrate why these five categories (basic, applied, summative, formative, and action research) cannot be thought of as fixed or exhaustive; rather, this typology provides general guidance to major formations in the research landscape without charting every hill and valley in that varied and complex territory that research has become.

Action-Oriented, Problem-Solving Research

The final category along the theory-to-action continuum is action research. Action research aims at solving specific problems within a program, organization, or community. Action research explicitly and purposefully becomes part of the change process by engaging the people in the program or organization in studying their own problems in order to solve those problems (Whyte 1969). As a result, the distinction between research and action becomes quite blurred and the research methods tend to be less systematic, more informal, and quite specific to the problem, people, and organization for which the research is undertaken.

Both formative evaluation and action research focus on specific programs at specific points in time. There is no intention, typically, to generalize beyond those specific settings. The difference between formative evaluation and action research centers on the extent to which the research is systematic, the different kinds of problems studied, and the extent to which there is a special role for the researcher as distinct from the people being researched. In formative evaluation, there is a formal design and the data are collected and/or analyzed, at least in part, by an evaluator. Formative evaluation focuses on ways of improving the effectiveness of a program, a policy, an organization, a product, or a staff unit.
In action research, by way of contrast, design and data collection tend to be more informal, the people in the situation are often directly involved in gathering the information and then studying themselves, and the results are used internally to attack specific problems within a program, organization, or community. While action research may be used as part of an overall organizational or community development process, it most typically focuses on specific problems and issues within the organization or community rather than on the overall effectiveness of an entire program or organization. Thus, along this theory-to-action continuum, action research has the narrowest focus.

The findings of formative evaluation and action research are seldom disseminated beyond the immediate program or organization within which the study takes place. In many instances, there may not even be a full written research report. Publication and dissemination of findings are more likely to be through briefings, staff discussions, and oral communications. Summaries of findings and recommendations will be distributed for discussion, but the formality of the reporting and the nature of the research publications are quite different from those in basic, applied, or even summative evaluation research.

An example of action research comes from a small rural community in the Midwest in which the town board needed to decide what to do with a dilapidated building in a public park. They got a high school class to put together a simple telephone survey to solicit ideas about what to do with the building. They also conducted a few focus groups in local churches. The results showed that the townspeople preferred to fix up the building and restore it as a community center rather than tear it down. The action research process took about a month.
Based on the findings, a local committee was formed to seek volunteers and funds for the restoration, thereby solving the town board's problem of what to do with the building—an example of action-oriented, problem-solving research.

INACTION RESEARCH

Educational researcher Ronald Gentile (1994) noticed the increasing popularity and importance of action research and wondered about the value of all the work involved: "While there are some advantages of action research, the disadvantages are that the researcher must collect a lot of data, carefully observing both the behaviors of interest and the conditions under which they occur. Following that, one has to score the data, a process that sometimes requires inventing ways to categorize them, analyze them, and draw inferences that are appropriate for the sample, design, and so forth. The problem with action research, in other words, is that it requires too much action. Fortunately, I discovered an alternative to action research that is probably best labeled 'inaction' research" (p. 30).

Three examples of inaction research:

- Statistics in service of the id—"The invention or use of statistics to support some preconceived belief or entrepreneurial motive" (p. 30).
- Scholarship as inaction research—Making up quotations where the researcher knows what the interviewee really meant to say but didn't say quite right, so the inaction research helps out by making up the needed quotation; also, "inventing new terminology that, by definition, has no historical usage," so no literature review is needed because there is no literature (p. 31).
- Happiness quotients—Exposing people to a program or product and then following up with a one-page questionnaire asking how well they liked the program or product, thereby giving the appearance of having done an evaluation without all the cost, inconvenience, and difficulties of conducting real fieldwork (p. 31).

The Purpose of Purpose Distinctions

It is important to understand variations in purpose along this theory-to-action continuum because different purposes typically lead to different ways of conceptualizing problems, different designs, different types of data gathering, and different ways of publicizing and disseminating findings. These are only partly issues of scholarship. Politics, paradigms, and values are also part of the landscape. Researchers engaged in inquiry at various points along the continuum can have very strong opinions and feelings about researchers at other points along the continuum, sometimes generating opposing opinions and strong emotions. Basic and applied researchers, for example, would often dispute even calling formative and action research by the name research. The standards that basic researchers apply to what they would consider good research exclude even some applied research because it may not manifest the conceptual clarity and theoretical rigor in real-world situations that basic researchers value. Formative and action researchers, on the other hand, may attack basic research for being esoteric and irrelevant.

Debates about the meaningfulness, rigor, significance, and relevance of various approaches to research are regular features of university life. On the whole, within universities and among scholars, the status hierarchy in science attributes the highest status to basic research, secondary status to applied research, little status to summative evaluation research, and virtually no status to formative and action research.
The status hierarchy is reversed in real-world settings, where people with problems attribute the greatest significance to action and formative research that can help them solve their problems in a timely way and attach the least importance to basic research, which they consider remote and largely irrelevant to what they are doing on a day-to-day basis.

The distinctions along the continuum are not only distinctions about purpose and how one conducts research, but they also involve the issue of what one calls what one does. In other words, a person conducting basic research for the purpose of contributing to theory within a discipline may find it helpful to call that work applied research to get certain kinds of funding. Summative evaluation researchers may describe what they are doing as formative evaluation to make their work more acceptable to program staff resistant to being studied. On the other hand, applied researchers may call what they are doing basic research to increase its acceptability among scholars. In short, there are no clear lines dividing the points along the continuum. Part of what determines where a particular kind of research falls along the continuum is how the researcher describes what is being done and its purpose. Different reviewers of the same piece of research might well use a different label to describe it. What is important for our purposes is that researchers understand the implications of these distinctions, the choices involved, and the implications of those choices for both the kind of research undertaken and the researcher's status as a professional within various social groups. Exhibit 5.3 summarizes some of the major differences among the different kinds of research.

Examples of Types of Research Questions: A Family Research Example

To further clarify these distinctions, it may be helpful to take a particular issue and look at how it would be approached for each type of research.
For illustrative purposes, let's examine the different kinds of questions that can be asked about families for different research purposes. All of the research questions in Exhibit 5.4 focus on families, but the purpose and focus of each type of research are quite different.

EXHIBIT 5.3  A Typology of Research Purposes

Basic research
  Purpose: Knowledge as an end in itself; discover truth
  Focus of research: Questions deemed important by one's discipline or personal intellectual interest
  Desired results: Contribution to theory
  Desired level of generalization: Across time and space (ideal)
  Key assumptions: The world is patterned; those patterns are knowable and explainable.
  Publication mode: Major refereed scholarly journals in one's discipline, scholarly books
  Standard for judging: Rigor of research; universality and verifiability of theory

Applied research
  Purpose: Understand the nature and sources of human and societal problems
  Focus of research: Questions deemed important by society
  Desired results: Contributions to theories that can be used to formulate problem-solving programs and interventions
  Desired level of generalization: Within as general a time and space as possible, but clearly limited application context
  Key assumptions: Human and societal problems can be understood and solved with knowledge.
  Publication mode: Specialized academic journals, applied research journals within disciplines, interdisciplinary problem-focused journals
  Standard for judging: Rigor and theoretical insight into the problem

Summative evaluation
  Purpose: Determine effectiveness of human interventions and actions (programs, policies, personnel, products)
  Focus of research: Goals of the intervention
  Desired results: Judgments and generalizations about effective types of interventions and the conditions under which those efforts are effective
  Desired level of generalization: All interventions with similar goals
  Key assumptions: What works one place under specified conditions should work elsewhere.
  Publication mode: Evaluation reports for program funders and policymakers, specialized journals
  Standard for judging: Generalizability to future efforts and to other programs and policy issues

Formative evaluation
  Purpose: Improve an intervention: a program, policy, organization, or product
  Focus of research: Strengths and weaknesses of the specific program, policy, product, or personnel being studied
  Desired results: Recommendations for improvements
  Desired level of generalization: Limited to specific setting studied
  Key assumptions: People can and will use information to improve what they're doing.
  Publication mode: Oral briefings; conferences; internal report; limited circulation to similar programs, other evaluators
  Standard for judging: Usefulness to and actual use by intended users in the setting studied

Action research
  Purpose: Solve problems in a program, organization, or community
  Focus of research: Organization and community problems
  Desired results: Immediate action; solving problems as quickly as possible
  Desired level of generalization: Here and now
  Key assumptions: People in a setting can solve problems by studying themselves.
  Publication mode: Interpersonal interactions among research participants; informal, unpublished
  Standard for judging: Feelings about the process among research participants; feasibility of the solution generated

EXHIBIT 5.4  Family Research Example: Research Questions Matched to Research Category

Basic research: What are variations in types of families and what functions do those variations serve?
Applied research: What is the divorce rate among different kinds of families in the United States and what explains different rates of divorce among different groups?
Summative evaluation: What is the overall effectiveness of a publicly funded educational program that teaches family members communication skills, where the desired program outcomes are enhanced communications among family members, a greater sense of satisfaction with family life, effective parenting practices, and reduced risk of divorce?
Formative evaluation: How can a program teaching family communication skills be improved? What are the program's strengths and weaknesses? What do participants like and dislike?
Action research: A self-study in a particular organization (e.g., church, neighborhood center) to figure out what activities would be attractive to families with children of different ages, to solve the problem of low participation in family activities.

With clarity about purpose, it is possible to turn to consideration of specific design alternatives and strategies. Clarity about purpose helps in making decisions about critical trade-offs in research designs, our next topic.

Critical Trade-Offs in Design

Purposes, strategies, and trade-offs—these themes go together. A discussion of design strategies and trade-offs is necessitated by the fact that there are no perfect research designs. There are always trade-offs. Limited resources, limited time, and limits on the human ability to grasp the complex nature of social reality necessitate trade-offs. The very first trade-offs come in framing the research or evaluation questions to be studied. The problem here is to determine the extent to which it is desirable to study one or a few questions in great depth or to study many questions but in less depth—the "boundary problem" in naturalistic inquiry (Guba 1978). Once a potential set of inquiry questions has been generated, it is necessary to begin the process of prioritizing those questions to decide which of them ought to be pursued. For example, for an evaluation, should all parts of the program be studied or only certain parts? Should all clients be interviewed or only some subset of clients? Should the evaluator aim at describing all program processes or only certain selected processes in depth? Should all outcomes be examined or only certain outcomes of particular interest to inform a pending decision?
These are questions that are discussed and negotiated with intended users of the evaluation. In basic research, these kinds of questions are resolved by the nature of the theoretical contribution to be made. In dissertation research, the doctoral committee provides guidance on focusing. And always there are fundamental constraints of time and resources.

Converging on focused priorities typically proves more difficult than the challenge of generating potential questions at the beginning of a study or evaluation. Doctoral students can be especially adept at avoiding focus, conceiving instead sweeping, comprehensive studies that make the whole world their fieldwork oyster. In evaluations, once involved users begin to take seriously the notion that they can learn from finding out whether what they think is being accomplished by a program is what is really being accomplished, they soon generate a long list of things they'd like to find out. The evaluation facilitator's role is to help them move from a rather extensive list of potential questions to a much shorter list of realistically possible questions and finally to a focused list of essential and necessary questions.

Review of relevant literature can also bring focus to a study. What is already known? Unknown? What are the cutting-edge theoretical issues? Yet, reviewing the literature can present a quandary in qualitative inquiry because it may bias the researcher's thinking and reduce openness to whatever emerges in the field. Thus, sometimes a literature review may not take place until after data collection. Alternatively, the literature review may go on simultaneously with fieldwork, permitting a creative interplay among the processes of data collection, literature review, and researcher introspection (Marshall and Rossman 1989:38-40).
As with other qualitative design issues, trade-offs appear at every turn, for there are decided advantages and disadvantages to reviewing the literature before, during, or after fieldwork—or on a continual basis throughout the study.

A specific example of possible variations in focus will illustrate the kinds of trade-offs involved in designing a study. Suppose some educators are interested in studying how a school program affects the social development of school-age children. They want to know how the interactions of children with others in the school setting contribute to the development of social skills. They believe that those social skills will be different for different children, and they are not sure of the range of social interactions that may occur, so they are interested in a qualitative inquiry that will capture variations in program experience and relate those experiences to individualized outcomes. What, then, are the trade-offs in determining the final focus?

We begin with the fact that any given child has social interactions with a great many people. The first problem in focusing, then, is to determine how much of the social reality experienced by children we should attempt to study. In a narrowly focused study, we might select one particular set of interactions and limit our study to that set—for example, the social interactions between teachers and children. Broadening the focus somewhat, we might decide to look only at those interactions that occur in the classroom, thereby increasing the scope of the study to include interactions not only between teacher and child but also among peers in the classroom and between any volunteers and visitors to the classroom and the children.
Broadening the scope of the study still more, we might decide to look at all the social relationships that children experience in schools; in this case we would go beyond the classroom to look at interactions with other personnel in the school—for example, the librarian, school counselors, special subject teachers, the custodian, and the school administrative staff. Broadening the scope of the study still further, the educators might decide that it is important to look at the social relationships children experience at home as well as at school so as to understand better how children's experiences are affected by both settings, so we would include in our design interactions with parents, siblings, and others in the home. Finally, one might look at the social relationships experienced throughout the full range of societal contacts that children have, including church, clubs, and even mass media contacts.

A case could be made for the importance and value of any of these approaches, from the narrowest focus, looking only at student-teacher interactions, to the broadest inquiry, looking at students' full, complex social world. Now let's add the real-world constraint of limited resources—say, $50,000 and three months—to conduct the study. At some level, any of these research endeavors could be undertaken for $50,000. But it