Designing Household Survey Questionnaires for Developing Countries
Lessons from 15 years of the Living Standards Measurement Study
Volume one

Edited by Margaret Grosh and Paul Glewwe
The World Bank

“Household surveys are essential for the analysis of most policy issues. This book has carefully assessed recent experience and developed today’s best-practice technique for household surveys. Indeed, much of this technique was developed and pioneered by the authors. This book is clear, systematic, and well structured. It is also wise and scholarly. It will be indispensable to anyone involved in carrying out or analyzing household surveys, and thus it is required reading for all those who wish to take evidence seriously when they think about policy.”
Nicholas Stern, senior vice president, Development Economics and chief economist, the World Bank

“This book is an ambitious undertaking, but it quickly exceeded my expectations. It has many strengths: …comprehensiveness, …emphasis on practical application, …and a sense of balance. For both my domestic and international survey research, this volume will serve as a valued reference tool that I will consult regularly.”
David R. Williams, professor of sociology and senior research scientist, Survey Research Center, University of Michigan

“This is a comprehensive guide to planning household surveys on a range of socioeconomic topics in developing countries. It is authoritative, clear, and balanced. The work is a valuable addition to the library of any survey statistician or data analyst concerned with socioeconomic surveys in the developing world.”
William Seltzer, former head, United Nations Statistical Office

Household survey data are essential for assessing the impact of development policy on the lives of the poor. Yet for many countries household survey data are incomplete, unreliable, or out of date. This handbook is a comprehensive treatise on the design of multitopic household surveys in developing countries.
It draws on 15 years of experience from the World Bank’s Living Standards Measurement Study surveys and other household surveys conducted in developing countries. The handbook covers key topics in the design of household surveys, with many suggestions for customizing surveys to local circumstances and improving data quality. Detailed draft questionnaires are provided in written and electronic format to help users customize surveys.

This handbook serves several audiences:
• Survey planners from national statistical and planning agencies, universities, think tanks, consulting firms and international organizations.
• Those working on either multitopic or topic-specific surveys.
• Data users, who will benefit from understanding the challenges, choices, and tradeoffs involved in data collection.

Copyright © 2000 The International Bank for Reconstruction and Development/THE WORLD BANK
1818 H Street, N.W.
Washington, D.C. 20433, U.S.A.

All rights reserved
Manufactured in the United States of America
First printing May 2000

The findings, interpretations, and conclusions expressed in this paper are entirely those of the author(s) and should not be attributed in any manner to the World Bank, to its affiliated organizations, or to members of its Board of Executive Directors or the countries they represent. The World Bank does not guarantee the accuracy of the data included in this publication and accepts no responsibility for any consequence of their use. The boundaries, colors, denominations, and other information shown on any map in this volume do not imply on the part of the World Bank Group any judgment on the legal status of any territory or the endorsement or acceptance of such boundaries.
The material in this publication is copyrighted. The World Bank encourages dissemination of its work and will normally grant permission promptly. Permission to photocopy items for internal or personal use, for the internal or personal use of specific clients, or for educational classroom use, is granted by the World Bank, provided that the appropriate fee is paid directly to Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, U.S.A., telephone 978-750-8400, fax 978-750-4470. Please contact the Copyright Clearance Center before photocopying items. For permission to reprint individual articles or chapters, please fax your request with complete information to the Republication Department, Copyright Clearance Center, fax 978-750-4470. All other queries on rights and licenses should be addressed to the World Bank at the address above or faxed to 202-522-2422.

ISBN: 0-19-521595-8

Library of Congress Cataloging-in-Publication Data has been applied for.

Part 1 Survey Design

1 Introduction
Margaret Grosh and Paul Glewwe

Accurate, up-to-date, and relevant data from household surveys are essential for governments to make sound economic and social policy decisions. Governments need these data to measure and monitor poverty, employment and unemployment, school enrollment, health and nutritional status, housing conditions, and other dimensions of living standards. They need the data to determine whether schools, health clinics, agriculture extension services, roads, electric power, and other basic services are reaching the poor and other disadvantaged groups. And analysts need household survey data to model economic behavior and thus provide answers to such important policy questions as: How would changes in food subsidies affect the population’s nutritional status? Would increasing fees for public schools reduce school enrollment, and how much revenue would be raised by such fee increases?
Who would participate in a new labor-intensive public works program, and what would be the net benefit for participants? How would changes in the price of fertilizer affect farmers’ production of different crops?

One way to collect the data needed to answer these questions is to conduct separate household surveys on each topic, that is, to conduct a labor force (employment) survey, a health survey, a housing survey, and so forth. Alternatively, data on many different topics can be collected in a single survey. Such a “multi-topic” household survey, which has many advantages, is the type of survey considered in this book.

Household surveys are not a new invention. Stigler (1954) points out that systematic collection of data from households began over 200 years ago. The first known efforts were the collection of family budgets in England by Davies (1795) and Eden (1797). In the 1800s similar data were collected in Saxony, Prussia, Belgium, the United States, and undoubtedly other places as well. The motivation for much of this research was to focus public attention on the plight of the poor. By the mid-1800s, generalizations about household behavior were being drawn from these data. For example, Ducpetiaux’s 1855 study of 200 Belgian households was used by Ernst Engel to derive his classic law that the fraction of a household’s budget devoted to food falls as income rises.

The statistical theory that supports modern survey methods was developed in the 1920s. This led to the establishment of high-caliber nationwide surveys in many countries, especially after World War II. Developing countries also participated in this phenomenon; for example, India’s annual National Sample Survey began in 1950. With the advent of modern computing, and especially the appearance of powerful personal computers, the collection and analysis of household survey data has expanded rapidly in both developed and developing countries.
(See Deaton 1997 for a brief review of household surveys in the 20th century.)

Since 1970 several major international programs have been organized to support the collection of household survey data in developing countries. Among the largest such programs have been the United Nations Household Survey Capability Program, the World Fertility Surveys (which later became the Demographic and Health Surveys), and the World Bank’s Living Standards Measurement Study (LSMS) survey program. Other organizations, including the International Food Policy Research Institute, the RAND Corporation, and Cornell University, have also carried out household surveys in developing countries. Some U.N. organizations regularly participate in single-topic household surveys in developing countries, such as employment surveys done in collaboration with the International Labour Office. And two regional survey programs have been strongly influenced by, and indeed have grown directly out of, the World Bank’s LSMS program. The first of these, the Social Dimensions of Adjustment (SDA) program for Sub-Saharan Africa, was supported by a consortium of agencies and administered by the World Bank. The second and more recent regional survey program, the Improving Surveys of Living Conditions program for Latin America, is sponsored jointly by the Inter-American Development Bank, the World Bank, and the Economic Commission for Latin America. (The Spanish name for this program is Mejoramiento de las Encuestas de Condiciones de Vida; it is often referred to by its Spanish acronym, MECOVI.) The surveys done under the LSMS, SDA, and MECOVI programs are all multi-topic surveys.
Because of these and other efforts, household survey data are now much more widely available than they were 10 or 20 years ago. World Bank statistics on the extent of poverty in 1985 were based on data from only 22 of 86 developing countries. Although these 22 countries accounted for 76 percent of the population of the 86 countries (Ravallion, Datt, and van de Walle 1991), it is significant that at that time no reliable data existed for three-fourths of the developing countries. Similar calculations currently underway are based on data from about 70 of 100 developing and transition countries, covering about 88 percent of the total population of the countries. Data for more than one point in time are now available for 50 countries. Coverage has grown the most in the region where it was lowest, Sub-Saharan Africa. In the 1985 calculation only 6 percent of Sub-Saharan Africa’s population was represented, while recent estimates cover 66 percent of this population (Ravallion and Chen 1998). Finally, the time lag between collection and dissemination of the data is getting smaller. In the 1985 World Development Report the average lag was 11 years, so the average survey date was 1974. Now the lag is only five years (Ravallion and Chen 1997).

The surge in the collection of household survey data in developing countries has greatly increased the demand for knowledge on how best to design and implement such surveys. Moreover, the growing number of surveys provides a vast amount of experience from which to draw lessons. Yet until now it has often been difficult for those planning a new survey, especially one in a developing country, to find out about the experiences of previous household surveys: what was tried, the factors that influenced decisionmaking, what worked, and, most importantly, what did not work.
The formal literature is scattered across disciplines (statistics, economics, sociology, psychology) and often contained in conference proceedings or government document series that are not widely indexed and are seldom available outside the country where they were written. An additional limitation is that a considerable amount of the formal literature pertains to surveys in industrialized countries. While much can be learned from such literature, it is still unclear how well the literature applies to settings with lower literacy rates, different income levels and employment and consumption patterns, and differing factors that affect the social interaction of the interview.

Much of the experience of surveying in developing countries is poorly documented. Statistical institutes in developing countries have little money or staff to devote to experimentation or research; their mandate is production and their resources are few. Articles published in the formal academic literature that use the data from these surveys typically provide only a brief description of the data used. They may contain some hints about whether the data collection methods worked, but almost by definition, data collection efforts that failed usually do not lead to academic publications. Household survey questionnaires and their associated statistical abstracts (reports) contain both implicit and explicit information, but they are sometimes available only in the country in which they were administered. Moreover, statistical abstracts tend to minimize any problems that may have been associated with a survey because the statistical agencies that produce these abstracts do not want to publicize a survey’s shortcomings. In principle, the most useful information for the designers of future surveys would be the internal memoranda and informal notes of the agencies and people involved in designing and implementing past surveys.
However, these are rarely filed and seldom preserved after a survey is completed, much less systematically made available to people outside the agency.

The Objective and Audience for this Book

The objective of this book is to provide detailed advice on how to design multi-topic household surveys, based on the experience of past household surveys. This book will help individuals and organizations that are planning a comprehensive, multi-topic survey to define the objectives of their survey, identify the data needed to analyze those objectives, and draft questionnaires that will collect such data. These tasks are not easy, because designing such a survey for a given country (or an area within a country) usually involves a host of tradeoffs among different objectives. This book aims to help survey designers evaluate these tradeoffs, set realistic objectives, and design a survey that best fulfills those objectives.

This book was written with several target audiences in mind. The primary audience consists of the people most likely to carry out household surveys similar to the ones discussed in the book: the staff of the national statistical agencies and planning agencies responsible for their countries’ household surveys. A second audience consists of individuals or groups in consulting firms or international aid agencies that advise governments on the design of household surveys.
A third audience is composed of researchers or research agencies that plan to field a survey to pursue their own research objectives. A fourth audience consists of individuals or groups working on a survey intended to evaluate or monitor the impact of a development project in a particular country, either a nationwide project or a project limited to a small part of the country. A fifth audience is composed of people working on a single-topic survey, because the book can provide them with guidance on how to collect “background” information from households (information on, for example, a household’s composition, basic characteristics, and level of welfare). Finally, this book will assist researchers who use household survey data produced by others, because it will help them understand the challenges, possible options, and tradeoffs involved in data collection. Such an understanding will allow these researchers to interpret household survey data more accurately and use these data more fully.

The recommendations in this book apply to a broad range of multi-topic household surveys, reflecting the authors’ expectation that future surveys in developing countries will be increasingly diverse in their purposes and content. The book provides survey designers with a wide range of options from which they can pick and choose according to both the purpose of their survey and the prevailing circumstances in the country studied. Future household surveys will, and should, evolve in ways that are hard to foresee. Thus this book should be regarded as a starting point for planning new surveys rather than as an exhaustive treatise on the way to design all future household surveys.

This book assumes that the survey designer has already decided to implement a multi-topic household survey, as opposed to a census, a qualitative study, or a single-purpose survey. Nevertheless, several chapters in this book compare the advantages and disadvantages of different data sources for studying certain topics.
In addition, Chapter 25 provides a thorough discussion of qualitative data collection methods.

The Experience on Which This Book Is Based

Much of this book is based on the experience of the World Bank’s Living Standards Measurement Study (LSMS) program (Box 1.1), one of several recent international efforts to expand the pool of data on poverty and living standards in developing countries. The World Bank established the LSMS program in 1980 to explore ways of increasing the accuracy, timeliness, and policy relevance of household survey data collected in developing countries. Because the first LSMS surveys were designed by the World Bank for research purposes, there was little variation in these surveys’ design and implementation. However, by the late 1990s LSMS surveys had been carried out in a wide range of low- and middle-income countries, with the involvement of many different national agencies and international organizations. Over time LSMS surveys have become increasingly customized to fit specific country circumstances, including policy issues, social and economic characteristics, and local household survey traditions. Each survey has also inevitably reflected the interests (and prejudices) of the individuals planning it.

The LSMS program has had its share of successes. Most importantly it has shown the feasibility of collecting comprehensive household survey data in developing countries. Since the first LSMS survey in 1985, LSMS surveys have been implemented in about 30 developing countries (Table 1.1). In some of these countries the original LSMS survey prototype was implemented in its entirety. In other countries this prototype was significantly altered to suit local circumstances. In still other countries it was used as a guide to redesign surveys that already existed.
LSMS surveys were also the starting point for SDA surveys, which have been implemented in about 20 Sub-Saharan African countries, and for the MECOVI program now in progress in eight Latin American countries.

The increase in the number of LSMS surveys and other household surveys has substantially expanded the stock of data that can be used to study poverty and, more broadly, economic and social development in developing countries. In every country where an LSMS survey has been done, the data have been used to measure and analyze poverty by the government, an international development agency, or both working together. In several countries LSMS data have directly influenced specific government policy decisions (see Box 1.2). Data from LSMS and similar surveys have also been used in hundreds of studies of developing countries, helping to extend what is known about poverty, household decisionmaking, and the impact of economic and social policy changes on household welfare. Many of these studies have been presented at conferences and published in books or academic journals, and have thereby shaped thinking about these issues far beyond the countries in which the data were collected.

Box 1.1 An Introduction to LSMS Surveys

The overall objective of LSMS surveys is to measure and study the determinants of living standards in developing countries, especially the living standards of the poor. To accomplish this objective, LSMS surveys must collect data on many aspects of living standards, on the choices that households make, and on the economic and social environment in which household members live. Much of the analysis undertaken using LSMS surveys attempts to investigate the determinants of living standards, which requires more sophisticated analytical methods than simple descriptive tables.

LSMS surveys have several characteristics that distinguish them from other surveys. One of the most important is that they use several questionnaires to collect information about many different aspects of household welfare and behavior. These consist of a household questionnaire, a community questionnaire, a price questionnaire, and, in some cases, a facilities questionnaire. (For more details on the questionnaires see Box 1.4.)

Another characteristic of LSMS surveys is that they typically have nationally representative, but relatively small, samples, usually between 2,000 and 5,000 households. This will yield fairly accurate descriptive statistics for the country as a whole and for large subareas (such as rural and urban areas or a few agroclimatic zones), but usually not for political jurisdictions (such as states or provinces). The surveys’ sample sizes are generally adequate for the regression methods often used for policy analysis of LSMS survey data.

Because of the complexity of most LSMS surveys, these surveys have rigorous quality control procedures to ensure that the data they gather are of high quality. These procedures, which are generally difficult to implement on larger samples, usually include several key elements. Both the survey’s fieldwork and its data entry are decentralized, and the people who carry out these tasks are strictly supervised. Interviewers receive extensive training (usually for about four weeks) prior to the survey. In the field, information is gathered not by asking one person all the questions about the household and its members but through a series of “mini-interviews,” with each adult responding for himself or herself. This procedure minimizes any errors caused by respondent fatigue or by the use of proxy respondents. The interviewers make multiple visits to households to find any members who were not home during the interviewer’s earlier visits, which also reduces the need to use proxy respondents. There is one supervisor for every two or three interviewers. The supervisors must revisit a significant percentage (often 25 percent) of the sampled households to check on the accuracy of the interviewer’s data. They must directly observe some interviews, and they must review each questionnaire in detail. Supervisors’ performance of these procedures is documented, and the supervisors are in turn supervised by staff from the central office of the statistical agency.

Data entry and editing are done as soon as each interview is over, either in the local field office or by a data entry operator who travels to households with the team of interviewers. As data are entered into the computer, a data entry program carries out a large number of quality checks to detect responses that are out of range or inconsistent with the other data from the questionnaire. Any problems this program detects can be verified or corrected in a subsequent visit to the household by the interviewer.

Despite these successes, several challenges remain for LSMS surveys and other multi-topic household surveys. First and most obviously, many developing countries still have inadequate household survey data. This is true even for some of the countries that have recently fielded new surveys, including LSMS surveys. Ideally, all governments should collect data on a regular, ongoing basis in order to monitor poverty trends over time. However, survey efforts are still sporadic in many developing countries today, and many surveys have serious deficiencies such as limited questionnaires, samples that exclude rural areas, and long delays in processing the data after completing the fieldwork.

Second, improvements are needed in the process of adapting the LSMS approach to countries that have not yet implemented LSMS-type surveys. It has been difficult for people working on a survey in one country to learn from the experience of other countries that have carried out LSMS and other multi-topic surveys.
Mid-level staff in government statistical agencies know the details of why particular choices were made and know how well the choices worked, but they rarely meet with their counterparts in other countries. A small pool of World Bank staff and consultants also know many of these details and have been in contact with many of the people developing new surveys in particular countries. However, until now, they have shared their knowledge of past surveys mostly on an informal basis, one person or one country at a time. And since the teams that are assigned to work on each specific survey are usually small, these teams start the survey development process with detailed knowledge of some of the topics to be covered by the survey questionnaires but less detailed knowledge of others.

Table 1.1 LSMS Surveys

Country                                       Year of first survey  Repeated, or to be repeated?  Number of households in sample
Albania                                       1996                  No                            1,500
Algeria                                       1995                  No                            5,900
Armenia                                       1996                  No                            4,920
Azerbaijan                                    1995                  No                            2,016
Bolivia                                       1989                  Yes                           4,330–9,160
Brazil                                        1996                  No                            5,000
Bulgaria                                      1995                  Yes                           2,000
Cambodia                                      1997                  Yes                           6,010
China (Hebei and Liaoning only)               1995                  No                            800
Côte d’Ivoire                                 1985                  Yes                           1,600
Ecuador                                       1994                  Yes                           4,500
Ghana                                         1987/88               Yes                           3,200
Guyana                                        1992/93               No                            1,800
Jamaica                                       1988                  Yes                           2,000–4,400
Kazakhstan                                    1996                  No                            2,000
Kyrgyz Republic                               1994                  Yes                           2,100
Mauritania                                    1988                  Yes                           1,600
Morocco                                       1991                  Yes                           3,360–4,800
Nepal                                         1996                  No                            3,373
Nicaragua                                     1993                  Yes                           4,454
Pakistan                                      1991                  Yes                           4,800
Panama                                        1997                  Yes                           4,945
Paraguay                                      1997/98               Yes                           5,000
Peru                                          1985                  Yes                           1,500–3,623
Romania                                       1994/95               Yes                           31,200
South Africa                                  1993                  No                            8,850
Tajikistan                                    1999                  No                            2,000
Tanzania (Kagera)                             1991                  No                            800
Tanzania (Human Resource Development Survey)  1993                  No                            5,200
Tunisia                                       1995/96               No                            3,800
Turkmenistan                                  1997                  No                            2,350
Vietnam                                       1992/93               Yes                           4,800–6,000

Source: LSMS data bank.

Third, the data gathered from some parts of LSMS survey questionnaires have been disappointing.
Two particularly difficult problems are how to measure household income from agriculture and nonagricultural self-employment and how to measure savings and financial assets.

Fourth, new issues have emerged since the first LSMS surveys were implemented. The economics profession has increasingly discounted the notion of the household as a unified decisionmaking body, trying instead to understand how goods, services, and power are allocated among the different members of a given household. In addition, there is growing interest in using qualitative and quantitative techniques in complementary ways, or even combining these techniques. And analysts increasingly use household survey data to address environmental issues.

Box 1.2 Using LSMS Data to Inform Government Policy Choices

LSMS household surveys are designed to collect data that can be used to study living standards and how living standards are affected by government policies. The following examples illustrate how some governments and donor agencies have used LSMS data to help make policy choices.

In 1989 the Jamaican government was considering whether it should eliminate subsidies for basic food items and use the funds saved to expand its food stamp program. While the government was making this decision, data from the Jamaican LSMS survey became available. Analysis of these data showed that most of the benefits from general price subsidies went to nonpoor households, while most of the benefits of the food stamp program went to the poor. This information helped the government decide to remove the subsidies on basic foodstuffs and expand its food stamp program. The government then commissioned further analysis of the LSMS data to find out how many families needed help in purchasing a minimum food basket, and how much help these families needed. The government used this information to choose new eligibility thresholds and benefit levels for the food stamp program. The Jamaican government has used its LSMS data in making many other decisions, such as whether to change kerosene subsidies and whether and how to subsidize medicines distributed through public health clinics. In addition, the government has used LSMS data to study the effects of raising user fees for public health care services. The Jamaican LSMS survey is conducted annually; the incidence of poverty is measured in each survey.

In South Africa the 1993 LSMS survey provided the first comprehensive, credible data set for the entire territory of South Africa, including the homelands. The survey was completed just before the first democratic elections were held in the country. The data were quickly put to extensive use both by the new government and by academic researchers. The first product, an extensive statistical abstract, was followed by a poverty profile prepared jointly by the World Bank and the government’s Ministry of Reconstruction and Development, then by other studies and reports. This body of work has helped to shift the national debate about poverty away from the nature and extent of poverty toward policy options for reducing poverty. For example, young women in rural areas were made eligible for public works employment programs after the data showed that these women were often needy and that they would be able to participate in such schemes since they had access to childcare. The survey data also revealed that the old age pension program was well targeted, which convinced the government not to modify that program but instead to consider reforming other programs that appeared to be less well targeted.

In 1998 the Government of the Kyrgyz Republic undertook a thorough assessment of the current and projected impact of its state pension reform. With the help of a World Bank team, the government analyzed data from the 1993 and 1996 Kyrgyz LSMS surveys to examine a range of policy alternatives. The survey data were used to show rates of participation in, contributions to, and receipts from pension programs by age cohort and by level of welfare. The data were particularly helpful to the government when it worked on setting a new level for the minimum pension, based on average earnings in the poorest quintile of the population. Forthcoming analytical work will include an assessment of household consumption and demand for utility services, and the formulation of a strategy to compensate the poorest for increases in utility prices.

How this Book Came to Be

In recognition of the continuing challenges for LSMS surveys, in the mid-1990s the World Bank initiated the multiyear research project that developed this book. (See Box 1.3 for a brief description of related initiatives.) The project was assigned three goals: to extend the range of policy issues that can be analyzed with LSMS data; to increase the reliability and accuracy of the surveys; and to make it easier to implement LSMS surveys, either by simplifying survey design or by providing more and better instructional materials on survey design and implementation. This book contributes to the achievement of all three goals and thus addresses the four challenges facing the LSMS that were described above.

Past LSMS surveys have typically consisted of a household questionnaire, a community questionnaire, and a price questionnaire; sometimes they have also included a school or health facility questionnaire. The household and community questionnaires are each composed of separate modules, sections of the questionnaire that focus on different topics (Box 1.4). This book reviews each module that has typically been a part of past LSMS surveys, and offers some interesting new additions.
The author or authors of each chapter of this book were chosen according to the following criteria: extensive research experience on the topic in question; experience in analyzing data on that topic using data from LSMS and non-LSMS surveys (both multitopic and single-topic); and experience in collecting data in developing countries. In order to ensure that experiences and perspectives from both LSMS and non-LSMS surveys were included, a concerted effort was made to include not only people who have long been associated with LSMS surveys but also people associated with other survey traditions.

The authors of the chapters that focus on specific modules have reviewed the relevant literature (both analytical literature and literature on survey experience), analyzed existing survey data, and, in the case of the consumption module, experimented with different methods of collecting data. Many authors have drawn lessons not only from LSMS experience but also from experience of other surveys, including the RAND Family Life surveys, the World Bank’s Social Dimensions of Adjustment (SDA) surveys, and the Demographic and Health Surveys (DHS). The authors of many chapters have reviewed a large number of single-topic surveys, including ones on housing, agriculture, water and sanitation, time use, and household income and expenditure.

Box 1.3 Other LSMS Products

A manual for planning and implementing the LSMS survey. When work began on this book about questionnaire design, work also began on a companion volume about planning and implementation: “A Manual for Planning and Implementing the Living Standards Measurement Study Survey” by Margaret Grosh and Juan Muñoz. The manual, completed in 1996, is intended for all people involved in planning and implementing an LSMS survey, including staff in planning agencies, statistical agencies, line ministries, academic institutions, and development agencies. The manual discusses such issues as sampling, fieldwork, data management, initial analysis, dissemination, and a host of planning and budgeting issues, in each case explaining the technical procedures and standards used in LSMS surveys. The manual is available in English, Spanish, and Russian.

The LSMS data bank. Data from LSMS surveys are now much more accessible than they were in the early years of the LSMS program in the late 1980s. The LSMS website, http://www.worldbank.org/lsms/lsmshome.html, contains a catalogue of the data sets that are available, the documentation for most surveys, and the data from some of the surveys. Data sets and documentation not yet available from the website are available by mail. Three factors have made it possible to increase the accessibility of LSMS data. First, a growing number of countries have adopted more open data access policies, and some have even given the World Bank permission to place their data on the LSMS website. Second, the LSMS team at the World Bank has thoroughly documented most of the surveys, whether working alone, working with managers of survey projects, or commissioning documentation work. Good documentation preserves institutional memory, lowers the cost to the Bank of disseminating data, and reduces startup costs for new users of LSMS data sets. Third, the LSMS team now has a full-time data manager, good technical support, and adequate space to stock an inventory of questionnaires, manuals for field staff, abstracts, and other useful documents from each country’s survey.

Other tools. The LSMS program periodically produces other tools for survey planners and analysts. The best way to keep abreast of these tools is to look on the “tools for managers of new surveys” and “tools for using household survey data” pages of the LSMS website. Readers of this book may be particularly interested in a paper by Deaton and Zaidi (1999) on how to construct consumption aggregates, which complements Chapter 5 in this book, and in a recent book by Deaton (1997) on analyzing household survey data, which brings together a large amount of statistical and econometric material relevant for policy analysis.

While this book was being written, two workshops were held that brought together all of the authors, as well as representatives from the various organizations that constitute the main audiences for this book. The participants in the first workshop were primarily data users (researchers and policy advisors). They were invited to ensure that the book had correctly identified the research and information needs of potential data users. A larger share of the participants in the second workshop were data producers (staff from national statistical agencies and representatives of organizations that provide technical assistance or funding to national statistical agencies). They were invited to ensure that the book addressed their concerns, a requirement for any successful survey. Prior to the first workshop each chapter was reviewed by an expert in the relevant field. Before the second workshop the draft manuscript as a whole was reviewed by several experts in analyzing and producing household survey data in developing countries.
After all of these people’s advice had been incorporated and a polished draft produced, the book was subject to another round of (anonymous) peer review and revisions. In addition, many of the draft chapters were given to people who were in the process of advising governments or survey institutions on the design of a multitopic household survey. This served as a limited field test for the book, and also confirmed that government planning and statistical agencies (and advisors of these agencies) had a genuine and pressing need for this book.

Nevertheless it must be recognized that because the draft modules presented in each chapter are based primarily on lessons from past surveys, few of them have been rigorously field tested in the exact form presented here. Thus extensive field testing must be done in each country implementing a new survey. Survey designers should consider this testing a vital part of their job after they have chosen a set of modules, modified these modules, and combined the modules into survey questionnaires. Each chapter contains a “cautionary advice” box that specifies how much the draft module has been changed from its design in previous LSMS surveys, how well similar modules have worked in the past, and which parts of the modules most need to be customized to fit specific country circumstances.

Box 1.4 Components of a Typical LSMS Survey

One distinguishing characteristic of LSMS surveys is that they are both multi-topic and multi-level: they use several questionnaires to study many different aspects of household welfare and behavior.

The largest LSMS questionnaire is the household questionnaire. The LSMS household questionnaire always collects detailed information to measure household consumption, which is the best monetary indicator of household welfare (see Chapter 5 for further discussion). The household questionnaire also collects information on income; transfer income and income from wage employment are collected in almost every LSMS survey, and many LSMS surveys also collect data on income from agriculture, household enterprises, and miscellaneous sources. LSMS household questionnaires always record information on a variety of other dimensions of welfare and on the use of social services; housing and related amenities such as water and sanitation; the level of education of adults, grade attainment and current enrollment rate of school-aged children; and vaccination histories and anthropometric (height and weight) measurements for children. A typical household questionnaire collects more information than this, in order to expand the range of living standards indicators that can be studied and to allow researchers to model the choices households make. The traditional list of modules included in a prototype LSMS survey includes: household roster, education, health, employment, migration, anthropometry, fertility, consumption, housing, agriculture, household enterprises, miscellaneous income, and savings and credit. Some of the information (consumption, housing quality, agricultural production) is collected only at the household level, but much of it (employment, education, health) is collected at the individual level.

The community questionnaire gathers information on local conditions common to all households living in the same community. Many of the conditions recorded can be directly influenced by government actions. The information covered typically includes the basic characteristics (including distance from the community) of nearby schools and health facilities, the existence and condition of local infrastructure (such as roads and public transportation), sources of fuel and water, availability of electricity, means of communication, and local agricultural conditions and practices.

A separate price questionnaire is used to record the prevailing prices of commonly purchased items in local shops and markets. In almost all countries prices vary considerably among regions; in order to compare the welfare levels of households that live in different regions one needs information on the prices that they face when purchasing goods and services. The community and price questionnaires are discussed in Chapter 13.

Finally, in some LSMS surveys special facility questionnaires are used to gather detailed information on schools or health facilities. These questionnaires are discussed in Chapters 7 and 8.

This book represents a major advance on three fronts. First, the book makes it easier for those working on new household surveys to learn from the wide range of LSMS and other survey experience. Second, the book makes it much easier to customize the design of questionnaires for new multi-topic surveys. Third, the material presented in the book deals with new policy questions, presents new analytical methods to address both new and long-standing policy issues, and provides new ways to reduce or avoid measurement problems.

How to Use this Book

The process of designing a comprehensive, multi-topic survey can be divided into five steps. First, survey planners must define the fundamental objectives of the survey and decide on the overall design of the survey in light of these objectives. Second, within this general framework the survey planners must choose which modules to include in the questionnaires, the objectives of each of these modules, and the approximate length of each module. Third, the planners must work out the precise design of each module, question by question, in light of the module’s specific objectives and approximate length. Fourth, the modules must be integrated with each other and combined into a complete set of draft questionnaires (household, community, price, and in some cases, facility). Fifth, the draft questionnaire should be translated (if applicable) and field tested.
Ideally, the five steps should be completed in chronological order. However, in practice, implementing any given step may reveal information that requires survey designers to rethink a previous step.

This book consists of three volumes. Volumes 1 and 2 contain all 26 chapters of this book. Volume 3 provides the draft questionnaires introduced by the chapters in Volumes 1 and 2. Volumes 1 and 2 are organized into four parts.

The first three chapters of Volume 1 constitute Part 1, which discusses the “big picture.” This includes decisions that must be made about the overall design of the survey and the modules to be used, as well as procedures for combining modules into questionnaires and questionnaires into a survey (or sequence of surveys). Chapter 2 starts by describing how to choose from among the three “classic” survey designs and how to select the modules to be included in the survey. Chapter 3 describes general procedures for designing each module, combining the modules into a well-integrated set of questionnaires, and translating and field testing the questionnaires.

The remaining chapters of Volume 1 form Part 2 of the book, and the first nine chapters of Volume 2 comprise Part 3. The chapters in Parts 2 and 3 discuss, in great detail, the individual modules that are the building blocks of any multi-topic household survey. Each chapter reviews the main policy issues pertinent to the subject matter of the module, identifies the data needed to analyze these issues, introduces one or more draft modules (which are presented in Volume 3), and provides annotated notes that explain the reasoning behind many of the details of each draft module. For most modules, two or three different versions are introduced, each of different length. Which module to use depends on the level of interest in the particular topic.
In addition, many of the chapters in Parts 2 and 3 discuss how to add or delete submodules within each module in order to provide a better fit with local circumstances and the specific focus of the survey.

Part 2 (Chapters 4–13), in Volume 1, introduces “core” modules that must be included in virtually all LSMS-type surveys: metadata, consumption, roster, education, health, employment, anthropometry, transfers and other nonlabor income, housing, and the community questionnaire. The modules on health and education come with draft questionnaires that can be used for gathering data from local schools and health care facilities. The collection of community-level data, including data on local prices, is discussed in Chapter 13, the final chapter of Part 2.

Part 3 (Chapters 14–22), in Volume 2, introduces modules that are optional: environmental issues, fertility, migration, total income, household enterprises, agriculture, savings, credit, and time use.

The last four chapters of Volume 2 constitute Part 4 of the book. These chapters contain material that is more methodological in nature. Chapter 23 discusses when and how to collect panel data (in other words, whether to interview the same households when doing a sequence of surveys, and how best to do so when this option is chosen). Chapter 24 reviews the issues involved in analyzing the allocation of resources and power within households, and summarizes the implications of this analysis for data collection. Chapter 25 summarizes how qualitative research methods can be used to complement the quantitative methods typically used in the design and analysis of multi-topic household surveys. This chapter stresses that qualitative methods can play a useful role in the design of multi-topic household surveys, especially in formulating questions and developing hypotheses for data analysis.
It is unfortunate that these methods have been neglected by most survey designers, who usually have quantitative backgrounds. Chapter 26 reviews the basic economic and econometric concepts that underpin many of the chapters in this book. Although many survey designers have an economics background, many others do not, and even some economists may benefit from a review of this material. The chapter begins by presenting the basic economic model of the household and goes on to discuss standard econometric techniques that have often been used in policy research on developing countries.

The reader should understand that the questionnaires provided in Volume 3 and discussed in Parts 2 and 3 are not polished or completed, and cannot be used immediately in any developing country. Final versions of the questionnaires for any country must be developed by the survey designers themselves. Survey designers must combine their own experience and expertise with the information in this book to design a country-specific questionnaire that will elicit the information needed to answer the most important policy questions of that country. This book is just a starting point. It provides survey designers with the lessons learned from past experience and with advice from experts who are familiar with both LSMS and other household surveys.

Part 1

All readers of this book should read all of Part 1, which in addition to this chapter includes Chapters 2 and 3.

CHAPTER 2: MAKING DECISIONS ON THE OVERALL DESIGN OF THE SURVEY. This chapter leads survey designers through the factors that need to be considered when determining the basic scope of the survey. It sketches three alternative designs for an LSMS-type survey: a full LSMS survey, a “core” (scaled-down) LSMS survey, and a core and rotating module design. It also suggests rules to help survey designers choose the design most appropriate for the circumstances that they face.
Chapter 2 also explicitly defines the “core” components that should be included in any LSMS-type survey.

CHAPTER 3: DESIGNING QUESTIONNAIRE MODULES AND ASSEMBLING THEM INTO SURVEY QUESTIONNAIRES. This chapter moves to a finer level of detail, providing general guidance on how to design individual modules and combine them into an integrated set of survey questionnaires. First each module must be customized to meet the specific objectives set out for it. Then the modules must be compared with each other to check for gaps and overlaps and to harmonize wording, codes, and recall periods. Next, survey designers must decide on the order of the modules and combine the modules into draft questionnaires. Finally, the questionnaires must be translated and field tested. Throughout this process, issues of questionnaire formatting will arise; Chapter 3 explains the principles and conventions used in formatting LSMS questionnaires.

Part 2

After survey designers have read Part 1 of this book and decided on the broad issues concerning the scope and design of the survey, they can begin the painstaking but crucially important task of designing the individual modules. Detailed advice on designing individual modules is given in Parts 2 and 3 of the book. Part 2 includes all modules that should be included in almost any LSMS-type survey, while the modules in Part 3 are optional. Almost all of the chapters in Parts 2 and 3 follow a similar outline. The first section reviews the current policy issues in developing countries for the topic or sector covered by the chapter. The second section explains what kinds of data are needed from household surveys to address these policy questions and also discusses any measurement issues. The third section introduces one or more versions of a draft module (the modules themselves are presented in Volume 3), and the fourth section provides notes that explain the reasoning behind, and important details of, each version of the draft module.
Chapter 4 on metadata and Chapter 5 on consumption are the first chapters in Part 2 because they contain a good deal of information on survey methods and issues of validity and measurement, information with broad implications for the subsequent chapters. Like Chapters 4 and 5, Chapters 6–13 cover “core” modules. None of these topics has to be covered in great detail, but it is recommended that at least the essential parts of each of these modules be included. For example, the fullest version of the health module introduced by Chapter 8 is so extensive that it should be included only in a survey specializing in health issues. Yet questions 10–38 of the short health module are an essential part of the core.

CHAPTER 4: METADATA—INFORMATION ABOUT EACH INTERVIEW AND QUESTIONNAIRE. “Metadata” are data about the survey itself, such as dates of interviews, identities of respondents, and time required for each interview. This topic has frequently been neglected in LSMS and other household surveys. Metadata are needed to guide the implementation of a survey, to help analysts interpret survey data, and to allow a quantitative evaluation of different survey procedures. Chapter 4 reviews the different kinds of metadata that should be collected as part of any multi-topic household survey, and provides guidance on how to collect them. If the recommendations made in this chapter had been adopted at the beginning of the LSMS program, the rest of this book and its companion manual (Grosh and Muñoz 1996) would have had a firmer empirical foundation for discussing the tradeoffs in survey design and implementation. For example, information on the time required to complete specific modules of varying lengths would have been very useful for determining the cost, in interview time, of expanding a module.

CHAPTER 5: CONSUMPTION.
This chapter differs from the others by focusing most of its attention on measurement issues, specifically on how to measure household consumption. The chapter draws on the literature on data collection, and also on data collection experiments that were part of the research for this book. One important conclusion of Chapter 5 is that measurement of consumption is highly sensitive to differences in methods, and that consumption measurement techniques within a given country must therefore be standardized over time. Researchers and consumers of household survey data need to be aware of the stringent comparability requirements that consumption data must satisfy before comparisons can be made across different surveys.

CHAPTER 6: HOUSEHOLD ROSTER. One of the fundamental decisions in any household survey is deciding who is and who is not a household member. Chapter 6 provides a basis for identifying all members of the household, and thus for determining the individuals from whom the survey will collect information. The recommendations in this chapter do not differ greatly from procedures used in past LSMS surveys. The chapter offers no new proposals on how to define a “household,” which is a difficult issue in almost any household survey. LSMS surveys have not made any significant contribution to this issue. The chapter confirms the usefulness of gathering household roster information on any children and parents of household members who do not live in the household, as well as of linking parents and children to each other when both are household members.

CHAPTER 7: EDUCATION. This chapter recommends only modest changes to the design of the education module that has been used in most previous LSMS surveys, because this design has worked quite well in the past. The education module (presented in Volume 3) collects information about the schooling of all household members, including highest grade attained, degrees obtained, and grades repeated.
Individuals currently in school are also asked about the type of school they attend (public or private), their recent attendance, and the amount of money the household spends on their schooling. Chapter 7 also introduces an expanded version of the education module, which is useful for the design of surveys that focus on education issues. This expanded module requires administering relatively short tests of cognitive skills to members of the household as well as collecting information about local schools through school and teacher questionnaires.

CHAPTER 8: HEALTH. The health data collected in previous LSMS surveys have been of limited usefulness for policy analysis. The draft health module introduced in Chapter 8 (and provided in Volume 3) has been dramatically revised from previous surveys. The new module consists of a series of submodules on self-reported health status, health-related behavior, child immunization, insurance coverage, health service utilization and cost, and knowledge of health providers. One key change is that information is collected on all visits made by household members to medical facilities during the reference period, not just the most recent visit. Another improvement is the collection of self-reported data on “activities of daily living,” data which cover the ability to climb stairs, carry heavy loads, or walk long distances. The chapter also presents an expanded version of the module that includes questions related to mental health and very detailed questions on the household’s health expenditures and utilization of health facilities. The expanded version of the module also collects interviewer-observed data on activities of daily living and on the cognitive functioning of household members.

CHAPTER 9: EMPLOYMENT. The collection of data on employment and labor force participation in past LSMS surveys has been fairly successful.
However, given the large size of this module, there is ample room for many small improvements. Several modifications are suggested to past designs. First, detailed information on household members’ work for household businesses or on the household farm is collected in the household business and agriculture modules, not in the employment module. Second, job history information is collected in a way that focuses on each individual’s employment five years before the time of the survey. Third, summary employment information is collected in a way that can accommodate individuals who have done many different kinds of work in the past 12 months.

CHAPTER 10: ANTHROPOMETRY. Nutritional status has always been one of the key nonmonetary indicators of welfare in LSMS surveys, especially for children; future surveys should continue to collect such data. Chapter 10 discusses the tradeoffs involved when anthropometric information is collected only for young children, rather than for all household members. The chapter recommends that in general anthropometric data should be collected for all household members, both adults and children. Chapter 10 also discusses the merits of collecting data on mid-upper arm circumference, which until now has been done in only one LSMS survey.

CHAPTER 11: TRANSFERS AND OTHER NONLABOR INCOME. Many households receive income unrelated to any of their members’ current work activities. Past LSMS surveys have usually collected data on this income in a module on transfers and other nonlabor income. Chapter 11 introduces an improved version of that module, capturing more information on both public and private transfers. For private transfers, the new module also collects more information about the donor household and about the purpose of the transfers.

CHAPTER 12: HOUSING. Past LSMS surveys collected data on housing to serve as indicators of “basic needs” and to derive the implicit consumption value (imputed rent) associated with owner-occupied housing.
Chapter 12 introduces two draft housing modules, a short module and a longer module. The short module collects data similar to those collected by the housing module used in previous LSMS surveys, albeit with several improvements. The longer module collects information that can be used to study a wide range of housing policy issues. Both modules are flexible in that they can gather data on complex water supply systems and on many different rental arrangements. Finally, in the new housing module questions are added that are appropriate to places with cold climates and well developed infrastructure; such questions will be particularly useful for surveys in the transition economies of Eastern Europe and the former Soviet Union.

CHAPTER 13: COMMUNITY AND PRICE DATA. Community questionnaires have been used in many past LSMS surveys to gather information on the economic environment in which households operate. Community characteristics that affect households’ economic environment can often be directly changed by government interventions. This chapter provides a much-needed discussion of how to define the “community” for which the information is to be collected and how to gather community information from a group of respondents. Finally, it introduces a longer and much more comprehensive community questionnaire (presented in Volume 3) than has been used in previous LSMS surveys. The design of this draft questionnaire is based on the experience of both LSMS surveys and the RAND Corporation’s Family Life Surveys, as well as on suggestions from many of the authors of the other chapters in this book.

Part 3

This part, consisting of Chapters 14–22, is in Volume 2. The chapters in Part 3 follow the same format as those in Part 2. Part 3 covers topics that are likely to be of interest, but one would never include all of them in any one survey. None of the topics in Part 3 are required for an LSMS-type survey.

CHAPTER 14: ENVIRONMENTAL ISSUES.
To date, very few LSMS surveys have collected data that can be used to examine environmental issues. The environmental module introduced in Chapter 14 (and presented in Volume 3) offers a series of submodules that can be used, as appropriate, in different settings. There are very brief submodules on environmental priorities in urban and rural areas, on attitudes and perceptions about urban air quality, and on discount rates. All of these submodules could be used in many surveys, even surveys that do not focus on the environment. The environmental module also includes lengthy submodules on water, sanitation, and fuel, to be included in surveys for which the use of these resources is of particular interest. There are also contingent valuation submodules that attempt to measure the extent to which households are willing to pay for improvements in urban air quality, the urban water supply, urban sanitation, or the rural water supply. The design of all of the above submodules is based on extensive experience from single-purpose surveys.

CHAPTER 15: FERTILITY. This chapter follows the same general approach used in many past LSMS surveys. The chapter introduces a short version of a fertility module that collects the data necessary to understand some general aspects of contraceptive use and to compile a maternity history that lists all births. Chapter 15 also introduces a standard version that includes a maternity history, a reproductive health submodule covering the previous three years, a longer section on contraceptive use, and a section on fertility preferences. Both the short and standard versions are presented in Volume 3. Whichever version of the module survey designers choose to use, the module should be administered to all women in the household of childbearing age. This departs from the past LSMS practice of interviewing only one randomly selected woman per household.

CHAPTER 16: MIGRATION.
Data on migration have been collected in many past LSMS surveys, but the amount of information collected has been quite small and the data have rarely been analyzed, despite significant interest in migration among researchers. This chapter introduces three different versions of the migration module: a short version, a standard version, and an expanded version. All three versions are presented in Volume 3. The standard and expanded versions are designed to collect much more detailed information than has been collected in the migration modules used in previous LSMS surveys. Including either the standard or the expanded draft migration module in a future survey will yield a rich data set that should prove very useful for comprehensive research on migration.

CHAPTER 17: SHOULD THE SURVEY MEASURE TOTAL HOUSEHOLD INCOME? Chapter 5 (and indeed this entire book) argues that consumption is the best monetary measure of welfare in multi-topic surveys. This implies that the consumption module is essential and must always be included. In contrast, the book takes the view that collecting the data needed to calculate total household income is optional, which implies that the household enterprise, agriculture, and savings modules (which collect much of the income data) can be substantially reduced or even omitted, depending on the level of interest in these topics. Chapter 17 reviews the advantages and disadvantages of collecting the data needed to calculate total household income, and describes the circumstances under which measuring total income should be an objective of a multitopic survey.

CHAPTER 18: HOUSEHOLD ENTERPRISES. Small businesses owned and operated by households are quite common in developing countries, yet it is difficult to collect accurate data on these income-generating activities.
Based on their extensive experience of analyzing data from past household surveys (both LSMS and others), the authors introduce three versions of the household enterprise module (presented in Volume 3). Survey designers should choose the version that best matches policymakers’ level of interest in household enterprise issues. In previous LSMS surveys, data on employment in these enterprises were collected in the employment module, but this chapter recommends that such information be collected in the household enterprise module; each of the modules introduced in this chapter collects such data. One consequence of this is that the standard version of this module is now longer than the version typically used in previous LSMS surveys, and the expanded version is even longer than the standard version.

CHAPTER 19: AGRICULTURE. Collecting accurate and comprehensive data on agricultural activities is difficult in any survey, and past LSMS surveys have experienced many problems in collecting such data. This chapter introduces short, standard, and expanded versions of the agricultural module (presented in Volume 3) that are very different from the agriculture modules used in previous LSMS surveys. In the standard and expanded versions, information on land owned and crops produced is gathered on a plot-by-plot basis, rather than at the level of the whole farm as was done in previous surveys. Information on household members’ work on their own farms is now obtained in the agriculture module, rather than in the employment module as in previous LSMS surveys. The short module is new, and is limited to collecting information on the households’ agricultural assets and on the total amounts of each crop produced by the household.

CHAPTER 20: SAVINGS. It is difficult to collect data on household savings because many households are reluctant to provide savings-related information.
Several previous LSMS surveys have collected a modest amount of data on savings, but these data have rarely been used in analysis due to doubts about their accuracy. Chapter 20 provides an extensive review of research on savings in developing countries, emphasizing the difficulties involved in doing such research. The two versions of the draft module introduced by this chapter (and presented in Volume 3) include several modest improvements to the module used in previous LSMS surveys. Neither of these versions is much longer than the savings modules of past surveys.

CHAPTER 21: CREDIT. This chapter emphasizes that to capture all of the sources and uses of credit in a way natural to respondents, questions on household credit use must be inserted in several of the survey’s modules. Such questions should be inserted in the housing, consumption, household enterprise, and agriculture modules, as well as in the community questionnaire and in a special credit module. In contrast with most past LSMS surveys, the draft credit module introduced by this chapter (and presented in Volume 3) gathers information at the level of the individual rather than at the level of the household.

CHAPTER 22: TIME USE. LSMS surveys have traditionally included neither comprehensive measures of time use nor modules dedicated to time use. This chapter discusses the experience of special time-use surveys, and uses this experience to formulate a special time-use module (presented in Volume 3). This module will be of particular interest to researchers concerned with intrahousehold issues. However, the draft module is lengthy and could crowd other modules out of the survey (since there is a limit to the amount of time households are willing to be interviewed). Further experience will be needed in implementing such a module as part of a multitopic survey and in analyzing the data collected before it becomes clear whether to routinely include time-use questions in LSMS and similar multitopic surveys.
Part 4

This part, in Volume 2, presents four chapters that discuss several general survey design issues. These chapters are useful for survey designers to read alongside the chapters in Parts 2 and 3 that interest them.

CHAPTER 23: RECOMMENDATIONS FOR COLLECTING PANEL DATA. This chapter reviews the advantages and disadvantages of collecting panel data in developing countries, along with past experience of collecting panel data. The chapter recommends that panel data be collected in most surveys, provided that in successive rounds the original sample of households is supplemented with a sample of households living in dwellings that have been constructed since the first survey. This is necessary to ensure that the sample remains nationally representative when each survey is implemented. Chapter 23 also recommends that information be collected from households in the first survey that will help interviewers find these households in subsequent surveys, even when it is not certain that later surveys will attempt to collect panel data.

CHAPTER 24: INTRAHOUSEHOLD ANALYSIS. The study of the allocation of resources and responsibilities within households has grown in the economic literature over the last few years, and such issues are increasingly arising in policy discussions. This chapter explains which kinds of data should be collected at the individual level rather than at the household level in order to support intrahousehold analysis; from this perspective the chapter provides a critique of the modules proposed in Parts 2 and 3. For modules deemed inadequate for intrahousehold analysis, the chapter proposes ways to modify them so that they better support such analysis. The chapter accepts that it is not feasible to collect individual-level data on all topics in an LSMS survey. Nevertheless, future LSMS surveys can be designed to support substantial intrahousehold analysis.
Much of the data collected in past LSMS surveys—on employment, health, education, anthropometrics, migration, and fertility—have long been collected at the individual level. And the draft agriculture, household enterprise, credit, and miscellaneous income modules presented in Parts 2 and 3 of this book recommend the collection of more individual-level data than were collected in previous LSMS surveys. In addition, the draft time-use module introduced in Chapter 22 is a new tool for gathering data that are crucial for intrahousehold analysis.

CHAPTER 25: QUALITATIVE DATA COLLECTION TECHNIQUES. Previous LSMS surveys have focused almost exclusively on collecting quantitative data, making very little use of qualitative data collection methods. This regrettable tendency probably reflects the quantitative backgrounds of most survey designers. Chapter 25 explains ways in which qualitative research methods can usefully and effectively complement quantitative data collection. The chapter concludes that qualitative methods should not be combined with quantitative methods into a single survey; instead, both methods should be used in separate but complementary data collection exercises. Quantitative surveys can benefit from qualitative methods in several ways. For example, qualitative research can be used to help survey designers formulate the exact wording of particular questions, and qualitative methods are useful for creating hypotheses about household behavior, which can then be tested using quantitative data.

CHAPTER 26: BASIC ECONOMIC MODELS AND ECONOMETRIC TOOLS. This chapter gives non-economists some basic information on economic models of household behavior, and reviews econometric methods commonly used in analyzing the policy questions discussed in this book. Chapter 26 is a useful reference for non-economists as they read other chapters.
A glossary at the end of Chapter 26 defines the economic and econometric terms used in many chapters of the book.

Note

The authors would like to express their gratitude to Jere Behrman, Lawrence Haddad, Courtney Harold, John Hoddinott, Alberto Martini, Raylynn Oliver, Kinnon Scott, and Diane Steele for comments on previous drafts of this chapter.

References

Davies, David. 1795. The Case of Labourers in Husbandry. Bath, United Kingdom: Printed by R. Cruttwell for G. G. and J. Robinson.

Deaton, Angus. 1997. The Analysis of Household Surveys: A Microeconometric Approach to Development Policy. Baltimore, Md.: Johns Hopkins University Press.

Deaton, Angus, and Salman Zaidi. 1999. “Guidelines for Constructing Consumption Aggregates for Welfare Analysis.” World Bank, Development Research Group, Washington, D.C.

Eden, Frederick. 1797. The State of the Poor. London: Davis.

Grosh, Margaret, and Juan Muñoz. 1996. A Manual for Planning and Implementing the Living Standards Measurement Study Survey. Living Standards Measurement Study Working Paper 126. Washington, D.C.: World Bank.

Ravallion, Martin, and Shaohua Chen. 1997. “What Can New Survey Data Tell Us about Changes in Distribution and Poverty?” World Bank Economic Review 11 (2): 357–82.

———. 1998. Personal conversation updating Ravallion and Chen 1997. November 2 and 3. Washington, D.C.

Ravallion, Martin, Gaurav Datt, and Dominique van de Walle. 1991. “Quantifying Absolute Poverty in the Developing World.” Review of Income and Wealth Series 37 (4).

2 Making Decisions on the Overall Design of the Survey

Margaret Grosh and Paul Glewwe

Comprehensive, multitopic household surveys such as LSMS surveys usually consist of three separate questionnaires: a household questionnaire, a community questionnaire, and a price questionnaire. Each questionnaire is composed of modules, sections that collect information on a specific topic.
Questionnaires and their modules can be combined in a variety of different ways to create a multitopic household survey. There is no single right way to combine modules and questionnaires into a survey; each way has advantages and disadvantages. The key is to choose a design that provides the best fit given the objectives of, and constraints on, the proposed survey. The starting point for designing the modules and questionnaires to be used in a multitopic survey is a set of policy questions. The overall objective of each survey is to collect the data needed to answer these questions.

There are five steps involved in survey design. The first step is to define the fundamental objectives of the survey and to choose an overall survey design that best fits these objectives. This is usually done not by a single individual but by a team of survey designers who consult extensively with a broad range of individuals and organizations interested in the survey. In choosing the overall design, the team must take into account several important factors including the capacity for collecting data within the country, the funding available, and the amount and quality of data available from other sources. The second step involves deciding which modules to include in the survey, specifying the objectives of each module, and proposing an approximate length for the modules. This step is needed because a survey that attempts to include all possible modules will be too large and complex to implement.
The third step is to work out, question by question, draft questionnaires for each module that will be included in the survey. This can be done by drawing on the detailed recommendations in the chapters in Parts 2 and 3 of this book, as well as the draft modules included in Volume 3. The fourth step is to compare the modules to each other to ensure that they are consistent and well integrated, and to combine them into draft household, community, and price questionnaires (in some cases omitting the community questionnaire). The fifth and final step is to translate and field test the draft questionnaires. Translation may not be necessary in some countries; field testing is always essential and must not be done quickly or superficially. The first two steps are discussed in this chapter. The third, fourth, and fifth steps are discussed in Chapter 3.

While these five steps should ideally be done in the order given above, in reality there is likely to be substantial movement backward and forward among the various steps. Some objectives originally set out for the survey may prove impossible to achieve given the constraints. And discussion of the detailed objectives for each module may cause the survey design team to reassess the overall objectives of the survey. In other words, it may be necessary to take one or two steps backward at some points in order to continue to move forward. This is to be expected and even encouraged. As more is learned about what can and cannot be done, survey designers are more likely to produce a survey design that meets their objectives—which may also have become more realistic. It is better to pare down the number of objectives in order to achieve some of them than to attempt to do too much and, as a result, achieve few or none of the original objectives.
The first four sections of this chapter cover the first step of the survey design process. The first section provides an overview of who should be involved in designing and assembling the questionnaires. The second section discusses the main factors that survey designers should take into account when choosing among survey design options. The third section outlines “core” elements that must be included in any LSMS or similar multitopic survey and reviews several classic survey designs, each of which supplements the core in a different way. The fourth section presents guidelines for choosing the survey design most appropriate for each of a range of different circumstances. The fifth and final section explains the second step of the survey process: choosing the modules to include in the survey, setting objectives for each module, and setting the approximate length of the modules.

Organizing a Survey Design Team

The most important factor ensuring the success of a multitopic household survey is the involvement of the right people in the process. Designing the survey questionnaires is much too large a task for one person. Instead, a team of experts must be involved, including members of the organization implementing the survey as well as research analysts from other institutions. If the team does not contain a sufficient diversity of experts, this can have negative repercussions for the data (Box 2.1). The design team must work together with policymakers and program managers to define the overall objectives of the survey and to settle on many details at each step of the survey design process.
Box 2.1 The Importance of a Well-Rounded Design Team

An effective survey design team must include researchers and policy analysts, policymakers, and staff from the organization implementing the survey. The problems that can arise when one or more of these groups is not involved in designing the questionnaires are illustrated by the experience of an LSMS survey implemented in Jamaica. The household questionnaire for the first Jamaican Survey of Living Conditions (implemented in 1988) was designed primarily by international experts who had little knowledge of Jamaican social programs. Although the household questionnaire was largely successful in accomplishing its analytical objectives, it had two serious flaws. First, although food subsidy policies were an important issue at the time, the consumption module did not clearly distinguish expenditures on key subsidized food items from expenditures on similar nonsubsidized items. This made it more difficult to study the incidence of food subsidies. Second, the questionnaire asked respondents about their receipt of food stamps during the previous month even though food stamps were provided only every two months. This made it difficult to identify which households had received food stamps, thus hindering the study of another issue important at the time. Fortunately, these flaws were identified and corrected in the following year’s household questionnaire.

Researchers and Policy Analysts

It is essential to involve researchers and policy analysts in questionnaire design. This book was written primarily by researchers and policy analysts, and much of the success of past LSMS surveys in supporting policy-relevant research is due to the fact that the surveys used questionnaires designed by people who would be actively involved in the analysis of the data. Researchers and policy analysts can ensure that the information collected in multitopic surveys is well suited for policy research.

The lead role in designing the questionnaires of an LSMS or similar multitopic household survey should be given to a small group of researchers and policy analysts who share two characteristics. First, they should know what issues are of most concern to the country’s policymakers. Second, they should have experience in using data from similar surveys to analyze these issues. The group of researchers and policy analysts should include members of the national planning agency, representatives from the national statistics agency, local academic researchers, and one or more people who have helped analyze or design multitopic surveys in other countries. The team must include individuals with extensive experience in implementing and analyzing other household surveys in the country in question. Ideally, local researchers and policy analysts should take primary responsibility for designing the survey, because they have an intimate knowledge of the country’s culture, economy, and society, and they are very familiar with existing programs and key policy issues. Local researchers and policy analysts are also likely to know about previous surveys done in the country that have covered some of the topics included in the new survey. And they will know which people and institutions should be consulted during the survey design process.

It may also be desirable to involve international researchers in the design of the questionnaire, especially in countries where local data analysts are not familiar with LSMS and other multitopic household surveys. International researchers can contribute their experience about what has and has not worked in surveys in other countries.1 Judicious use of the advice of both local and foreign experts will significantly improve survey design.
Past LSMS surveys have probably made insufficient use of the knowledge available from local researchers and policy analysts. Too often the involvement of local professionals has been limited to statisticians from the statistical agency (data producers) and thus failed to draw on the expertise of social policy researchers from the government or academia (data users). Statisticians may have only a limited knowledge of sectoral policy issues and programs. While they do have an important role to play, their input must be combined with input from data users to set priorities among the different possible objectives for policy research.

Policymakers

When defining the fundamental objectives of the survey, the team responsible for drafting questionnaires must seek extensive input from policymakers and program managers in the country being surveyed. The team’s initial discussions with policymakers should focus in broad terms on the most important issues to be covered, which will determine the relative size of the different modules in the questionnaires. After this round of discussions, further discussions should be held to identify the important issues within each sector. Since drafting the module or modules for each sector requires a substantial amount of knowledge about how specific programs work, technical experts in many program agencies must be consulted. These people should be consulted before the modules are created, and they should also be shown draft modules to elicit comments.

Unfortunately, in many previous LSMS surveys the survey design team did not give enough attention to communicating or consulting with policymakers. Policymakers, who are often unfamiliar with household surveys, may find it difficult to read complicated questionnaires or to imagine what analyses the resulting data could support.
One option is to show policymakers and program managers examples of the kinds of tables and analyses that could be produced using data from the questionnaire; these might be either hypothetical tables for the country of the survey or tables made in other countries using data from similar surveys. Another strategy is to show policymakers a report based on the first year’s data; this is an excellent way to obtain policymakers’ feedback on the design of follow-up surveys. A third strategy is simply to ask policymakers what they need to know to implement effective policies.

Data Producers

It is critical that the survey design team include staff from the organization implementing the survey. This should ensure that the questionnaires designed are workable. Often data collection can be greatly simplified by making minor changes in the layout or flow of a questionnaire, changes that do not diminish the questionnaire’s analytical content. Data producers are an excellent source of suggestions for such changes. They are usually also experienced in details of designing a questionnaire, such as questionnaire formatting. For all of the above reasons, the team members from the organization implementing the survey should help design, or comment on, every draft of the questionnaire.

It is also useful for the survey design team to solicit the input of experienced field supervisors, who will notice whether the instructions to the interviewer are clear, whether the skip codes are correct, and whether the format is consistent. There is of course a natural tension between data analysts, who want comprehensive information, and field supervisors, who are likely to see all of the disadvantages but few of the advantages of administering a lengthy, complex questionnaire. Each side must be prepared to make compromises and carefully listen to the other side’s point of view.
Factors for Deciding among Various Survey Designs

After the members of the survey design team have been selected, work can begin on designing the survey. The first task for team members is to review the factors that influence the overall design of the survey. This section discusses those factors in detail.

The appropriate design of a household survey or sequence of household surveys differs from country to country. The most important factors for determining the design of a proposed survey are: the kinds of policy issues the survey aims to address; the information available from existing surveys and other data sources; the country’s institutional capacity for collecting data; and the financial and other resources available for implementing the survey, including any constraints on how these resources can be used.

Policy Issues

The design of a household survey should reflect the policy issues it is intended to address. One way to classify policy issues is in terms of their subject matter, such as health, education, employment, or housing. Another way to classify policy issues is in terms of the kinds of data used to address them. The four most common kinds of household survey analysis used to address policy issues are: simple descriptive statistics on living standards; monitoring poverty and living standards over time; describing the incidence and coverage of government programs; and measuring the impacts of policies and programs on household behavior and welfare. This subsection reviews these four types of analysis and provides a practical example of how the information needed affects the design of the survey.

SIMPLE DESCRIPTIONS OF LIVING STANDARDS.
The most straightforward objective for a household survey is to describe the living standards of the population at one point in time, often with particular emphasis on the living standards of the poor. This can be done by using the data to tabulate means and frequencies of key variables. The results of these tabulations are often disseminated by the national statistical agency in the form of statistical abstracts (reports) that contain a large number of tables and a minimal amount of descriptive text. It is also possible to produce more structured descriptive analyses that supplement household survey data with information from other sources. Structural analysis of descriptive data can sometimes be used to draw conclusions about the likely impact of government policies on living standards. Examples of such analyses are the “poverty profiles” typically provided in the World Bank’s poverty reports.

In both types of descriptive analysis, the range of variables used to measure living standards can vary widely; variables may be used from virtually all of the survey modules or from only a small subset of modules. In general, most of the variables included in statistical abstracts and descriptive analyses come straight from the questionnaire (for example, percentage of households that have electricity) or require only a small amount of manipulation (for example, nutritional status as derived from weight and height data). Only one “complex” variable needs to be constructed: total household consumption. Other complex constructed variables, such as total income or net wealth, are used less often in simple descriptive presentations.

MONITORING POVERTY AND LIVING STANDARDS. The descriptive analyses discussed above focus on living standards at one point in time.
However, another important role of multitopic household surveys is to monitor how living standards and poverty change over time. When data are used for this purpose they must be comparable over time; for this to be the case, the data must be gathered using the same methods each time the survey is implemented. One aspect of such consistency concerns the design of the sample, which in each case must use the same definitions of basic concepts such as the distinction between urban and rural areas. A second requirement for comparability is that the questions defining variables of interest must remain the same each time the survey is administered. This is necessary because seemingly innocuous changes in the wording of questions can lead to serious comparability problems; for example, changing the recall period for food expenditures can make it impossible to compare estimates of poverty and inequality over time.

Another issue to consider when monitoring poverty and living standards over time is the frequency with which indicators of living standards must be monitored. Indicators that are fairly stable over short periods of time—such as fertility and adult literacy—need not be measured each time the survey is done. However, indicators that can change more quickly, such as consumption expenditure, children’s nutritional status, and employment status, should be measured every time the survey is implemented. Surveys that monitor poverty and living standards over time are typically fielded every year, although it is also possible to field them every two years or twice a year.

EXAMINING THE INCIDENCE AND COVERAGE OF GOVERNMENT PROGRAMS. Data from multitopic household surveys can also be used to measure the incidence and coverage of specific government programs. For example, data on the enrollment of household members in public schools are useful for investigating which children benefit from the provision of public schooling.
Household survey data can also be used to study participation in government assistance programs such as food stamps, cash assistance, and school meals. Another example is descriptive statistics on purchases of subsidized food items, which can be used to examine whether the benefits of specified food subsidies vary by households’ levels of income. The incidence and coverage of these different kinds of programs are easy to calculate and useful for policymakers to know. A moderate sample size (2,000 to 5,000 households) should be sufficient to evaluate programs that affect a large proportion of the population. Evaluating programs that serve a small proportion of the population usually requires using a much larger sample of households or including a disproportionately large number of target and beneficiary households in the sample.

ESTIMATING THE IMPACT OF POLICIES ON HOUSEHOLD BEHAVIOR AND WELFARE. Policymakers are often faced with questions that can be answered only by analyzing household behavior. Policymakers may want to know how changes in commodity taxes or subsidies would affect agricultural production or the consumption of basic food items. Answering such questions requires calculating price elasticities and thus modeling households’ production and consumption decisions. Such modeling requires data that go well beyond measurement of living standards indicators.
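As a rough illustration of what calculating a price elasticity involves, the sketch below fits a log-log demand relationship by ordinary least squares; in such a specification the slope on the log of price is the own-price elasticity. Everything in it is an assumption made for illustration: the function name, the variable names, and the price and quantity values are hypothetical and are not drawn from any LSMS survey. An actual analysis would use household-level data and would typically control for income, household composition, and the prices of other goods.

```python
import math


def log_log_elasticity(prices, quantities):
    """Return the OLS slope of ln(quantity) on ln(price).

    Under a constant-elasticity demand curve this slope is the
    own-price elasticity. Illustrative only: a single regressor,
    no controls, no standard errors.
    """
    if len(prices) != len(quantities) or len(prices) < 2:
        raise ValueError("need at least two matching price/quantity pairs")
    x = [math.log(p) for p in prices]
    y = [math.log(q) for q in quantities]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("prices must vary to identify an elasticity")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical data built from q = 100 * p**(-0.5), so the true
# elasticity is -0.5 by construction.
hypothetical_prices = [1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
hypothetical_quantities = [100.0 * p ** -0.5 for p in hypothetical_prices]

elasticity = log_log_elasticity(hypothetical_prices, hypothetical_quantities)
print(round(elasticity, 3))
```

Because the hypothetical quantities are generated without noise, the estimate recovers the assumed elasticity; with real survey data the estimate would carry sampling error, which is one reason such modeling places heavier demands on the data than simple descriptive tables do.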
In multitopic household surveys that attempt to model household behavior, each module that collects data on a behavior of interest is usually designed to gather information that can be used to estimate the impact of several different policy changes. The chapters in Parts 2 and 3 of this book discuss each module in great detail and provide many examples of issues that require the modeling of household behavior. The following questions give an idea of the range of policy issues that can be addressed: What is the impact of charging user fees at government health clinics on the use of those clinics by adults and by children? How can the government encourage parents to enroll their children in school? What are the impacts of women’s employment opportunities on their fertility? How do changes in prices brought about by structural adjustment programs affect the welfare and productivity of agricultural households?

EXAMPLE SCENARIO: THE INCLUSION OR EXCLUSION OF THE ANTHROPOMETRY MODULE. The decision about including an anthropometry module demonstrates how the analytical potential of a multitopic household survey is related to its content. (See Chapter 10 for detailed information on the collection of anthropometric data.) If an anthropometry module is not included, the survey is not useful for studying nutrition issues. However, by collecting limited anthropometric data such as the height and weight of children under five years of age, the survey will allow analysts to describe the extent and patterns of malnutrition in early childhood. If the country studied has adopted large-scale food distribution or subsidy programs, the data can also be used to assess how well these programs are targeted to undernourished children.
If the survey is repeated annually or every two years, it becomes possible to monitor changes in both the nutritional status of the population and the targeting of the program. Thus the data collected from a limited anthropometry module can address, at least partially, three of the four types of policy questions outlined above.

A full version of the anthropometry module would collect data on height and weight for all household members, not just children. Such data could be used not only to gauge adult health but also to analyze the impact of government policies on household welfare and behavior. Suppose policymakers want to predict the impact of food programs on children’s nutritional status. This requires estimating the determinants of child weight and height. Because heredity is so important, parental height and weight information are needed to estimate these relationships accurately; lack of data on parents’ weight and height could lead to estimates that suffer from omitted variable bias and thus do not accurately show the impact of the food programs on children’s nutritional status. In general, including the full version of the anthropometry module in the survey—measuring both adults and children—greatly expands the possibilities for examining the impact of government policies on household behavior and living standards.

Defining the objectives of a survey is often a less tidy process than the discussion so far has implied. Institutions, and people within institutions, may have varying objectives. Each sectoral ministry in a country is likely to be primarily interested in its own subject. The government as a whole may want the surveys to measure or monitor only a few indicators of welfare, while academics in the country’s universities and other research institutions may want the surveys to yield the detailed data needed to model household behavior.
If international agencies are financing the survey, they may have still another set of objectives. For example, they may wish to ensure that the data are comparable with similar data from other countries or that the data can be used to study issues of interest to the development community in general, even if these issues are not a high priority in the country of the survey. Whatever the objectives envisaged when the survey is first designed, it is likely that researchers will later use the data for other analytical purposes. The multiple (and sometimes competing) objectives of household surveys are to be expected and even encouraged, since each of the groups with a stake in these surveys has its own legitimate priorities. The task of survey designers is to accommodate the different objectives as much as possible without compromising the quality of the survey.

Other Information Available and Its Relation to Survey Objectives

No household survey takes place in a vacuum. In most countries there are several other household surveys that have gathered or will gather information on issues that the new multitopic survey is intended to cover. The extent to which data from these sources influence the design of the new questionnaire depends on the amount and type of data available and on the objectives of the new survey.

If the main objective of the new survey is to describe various aspects of the living standards of the population, it may seem that the topics already covered in other surveys need not be included in the new multitopic survey. For example, if the only goal of the new survey is to describe living standards and recent anthropometric data are already available from another survey, it may seem reasonable to drop anthropometric measurements from the new survey. However, there are two important advantages to collecting anthropometric data in the new survey.
First, collecting these data would make it possible to produce descriptive tables that show simple relationships between nutritional status, as revealed by anthropometric measurements, and other variables of interest, for example, household expenditure levels. Second, collecting anthropometric data in the new survey would ensure that the anthropometric data used the same definitions and classification schemes as other survey data, and thus could be used to draw effective comparisons. If the two surveys classified, say, education levels or rural and urban areas differently, this would make it difficult to present analyses from the two surveys side by side in ways that would be simple to interpret. Analysis based on combining results from separate surveys will usually be more difficult, and thus more prone to error, than analysis based on data that have all been collected in a single survey.

The case for collecting anthropometric data is even stronger if the purpose of a new survey is to investigate the impact of nutritional status on other socioeconomic outcomes (such as education, fertility, or labor force productivity). This objective implies that the survey must include an anthropometry module, even if recent information on nutritional status is available from other sources. To conduct these kinds of analyses, the variables of interest must all come from the same household survey.2

Although it is essential that data on key household-level variables come from the same households, it is often useful to supplement household survey information using data from a source other than a multitopic survey. In some cases, price data collected for generating consumer price indices can replace the price questionnaire typically used in LSMS and other multitopic household surveys (see Chapter 13). Other such alternative data sources are time-series data on weather and maps of soil quality and topography, all of which can be used for analyzing agricultural issues.
In health and education, further possibilities arise for matching household survey data with data from other sources; some countries collect data from health clinics and schools that may be matched with the communities covered in a household survey. However, survey designers should exercise caution when contemplating this approach. Although matching data from different sources appears simple, it is often very difficult in practice. Many of the chapters in Parts 2 and 3 of this book discuss the potential for matching household survey data to data from other sources.

An important question that often arises when planning a multitopic survey is whether such a survey can replace one or more existing surveys and thereby reduce total costs without any loss of information. A multitopic survey with an anthropometric module could replace a periodic anthropometric survey primarily intended to measure the extent of malnutrition. However, a multitopic survey cannot replace all other household surveys. Labor Force Surveys often require much larger sample sizes and more frequent data collection than would be appropriate for a multitopic survey. And specialized surveys, such as Demographic and Health Surveys and comprehensive farm management surveys, contain much more data on those topics than can usually be collected in LSMS and similar multitopic surveys. Still other surveys, such as farm surveys and small business surveys, have very different sampling frames since they are based on samples of farms or businesses rather than samples of households.

A final issue is whether survey designers should implement an entirely new survey, modify an ongoing survey, or find creative ways to analyze existing data. Two arguments support implementing a new survey. First, past surveys may not have been adequately documented, or access to their data may be restricted.
Second, inter-agency rivalry, arguments concerning ownership of survey data, and coordination problems when different surveys are carried out by different agencies may make it easier to begin a new survey instead of using existing data or modifying an existing survey. On the other hand, survey designers should at least consider trying to remedy these problems so that existing surveys can be used (perhaps with some modifications) to meet the designers’ data and policy objectives, thus avoiding any unnecessary duplication. Examples are given later in this chapter of countries in which an existing survey was modified to be more like an LSMS survey.

Institutional Capacity

Decisions about what kind of survey to implement also depend in part on the institutional capacity for collecting data in the country undertaking the survey. Because maintaining data quality becomes more difficult when surveys become more complex, this capacity should be carefully assessed when planning LSMS or similar multitopic surveys, which can be very complicated. In countries where the capacity to collect data is weak, it may be better to implement a limited multitopic survey yielding reliable data on a relatively small number of topics than an overly ambitious survey that could yield unreliable data on a wide variety of topics.

A survey containing 10 modules is easier to plan and implement than a survey containing 15 or 20 modules. The fewer the modules, the less time is needed by survey planners to contact different sectoral agencies and thus the less time is needed to build consensus. Also, smaller questionnaires require less time to design, and less time to carry out the fieldwork, enter the data, and manage the database.
However, other steps in developing and implementing a household survey, such as planning the sample design, do not vary with the size of the questionnaire. Therefore, a survey with a questionnaire half the size of the questionnaire for a full multitopic household survey will involve substantially more than half the effort required for a full survey.

Despite the complexities of full multitopic surveys, some very successful multitopic surveys have been carried out in countries with very limited data collection capacity. Several steps can be taken to overcome the problems posed by limited capacity. For LSMS surveys, international experts have been brought in to draw the sample, draft the questionnaires and interviewer manuals, and write the data entry program. Such experts initially substitute for government agency staff, but they can also train agency staff to take their place in future surveys. It is highly recommended that countries with limited capacity for collecting survey data use such expert assistance.

In countries with weak institutional capacity, serious consideration should also be given to improving that capacity; capacity building yields long-term benefits that gradually reduce the need to use international experts to help with data collection. Genuine capacity building takes time, money, and political and managerial effort. An international sampling expert may be able to design and draw a sample for a survey in a few days, but it will usually take him or her much longer to teach local staff how to do so. Training to build capacity requires significant resources beyond those already budgeted for a survey. Whether building a country’s data collection capacity is important enough to warrant committing these resources will vary from country to country. Where capacity building is deemed necessary, the survey’s work plan and budget must both be significantly enlarged.
If capacity building is a goal, a program of annual (or biennial) surveys will work better than a program for a single survey or for a sequence of surveys that take place every three to five years. An annual survey usually has a permanent allocation of skilled staff, staff time, and equipment. Even when the team works only part of the year on the multitopic survey, staff have a chance every year (or every two years) to use the skills that they have acquired in managing such a survey. And as the staff of the agency develop survey management skills, the need for technical assistance from international experts should diminish. When some staff members working on the survey leave, their replacements can learn their jobs from other staff members who have worked on earlier rounds of the survey. In addition, the continuity provided by an annual survey may make it easier to improve survey quality; if one year a problem arises in data collection or initial analysis, the people who deal with the problem are likely to be involved in planning the next survey and can better address the problem then.

In contrast, a survey carried out every four or five years may require new skills, staff, and equipment each time it is implemented. In the intervening period, many of the individuals who carried out the first survey may have moved on to other jobs either inside or outside the statistical agency. Those who remain may not have been involved in planning the previous survey, and the skills of those who were involved may have deteriorated over time. Vehicles and computers used in the first survey will have been allocated to other purposes, and some may have ceased to function altogether. Most importantly, much of the institutional memory about problems and potential solutions may have been lost.

A final note of caution is needed regarding institutional capacity.
Sometimes, even when a statistical agency has sufficient management and technical capacity to implement a complex multitopic survey, there may not be enough experienced supervisors, interviewers, or data entry operators. Lack of data entry operators is not a serious problem since they can be trained in a matter of weeks, and no previous experience is required. However, it takes longer to transform government staff with no household survey experience into competent interviewers. While interviewers may be trained in a month, it is not so easy to compensate for little or no interviewing experience. Experience is even more important in the case of supervisors. It may take years to overcome shortages of experienced interviewers and supervisors.

Constraints Imposed by Funding Sources

Surveys are always constrained by their funding. Most LSMS and similar multitopic household surveys receive some portion of their financing from sources other than the national budget, at least initially. As a result, they are subject to constraints associated with both the national budget and funds from external sources.3

The first and most obvious constraint imposed by the source of funding is the total amount of funds available. National budgets are often very restricted. Some external funding sources have upper limits for how much may be spent on a single project, and most have administrative procedures that grow in complexity as the size of a project increases. Also, the larger the survey budget, the more difficult it is for survey planners to justify using the money for the proposed survey rather than for some other purpose. Limitations on the size of the budget often constrain the size of the sample used in the survey and in some cases curtail the survey’s analytical depth and breadth.
Another potential constraint relates to the time period over which funds may be spent. Funding agencies may stipulate that a survey project be completed in only one or two years, even though a single full-scale survey can easily take three years or more to complete: 6 to 18 months to plan, a year for fieldwork, and 6 to 18 months for data dissemination and analysis. Moreover, chances of obtaining future funding can influence whether a proposed survey is carried out only once or is the first of a series of surveys. And funding limitations can affect such other aspects of the survey as the thoroughness of the survey designers’ work during the planning stage, whether the fieldwork is spread over a full year or concentrated into a period of a few weeks, and the amount of analytical work funded from the survey project’s budget.

Finally, many funding agencies also have rules on how survey funds can be spent. These rules may impose controls on: the percentage of funds spent in local or international currency; the balance between recurrent and investment costs; the amounts that can be spent on the salaries of local staff, survey equipment, and payment of international experts; the nationalities of such experts; and various aspects of budgeting, accounting, and procurement. Spending rules rarely influence big issues of survey design (such as survey duration, sample size, or questionnaire design), but they can affect many details of the structure and implementation of a project. Rules that prohibit the shifting of expenditures between items or between time periods may limit the ability of survey planners to deal with unanticipated problems.
For example, an additional international expert may be needed quickly, but may be difficult or impossible to obtain because hiring this expert was not included in the original budget. The end result can be a delay in the survey or a reduction in quality. As another example, if a survey vehicle is involved in an accident, fieldwork may be delayed if funds to repair or replace the vehicle cannot be made available promptly.

Summary

The analytical objectives of a survey, the availability of information from other sources, local institutional capacity, and constraints imposed by funding are all key factors that typically affect whether to perform a new survey and the form such a survey will take. Many other factors are also critical, including institutional inertia and rivalry and the compromises required to build a coalition to support and conduct a survey. However, it is difficult to provide general advice because these factors usually depend on the setting of the survey; survey planners must deal with these issues as best they can given the particular circumstances they face.

Classic Survey Designs

There are three basic ways to combine modules into questionnaires and combine those questionnaires into a survey or sequence of surveys: the full LSMS-type multitopic survey, the scaled-down LSMS-type survey, and the core and rotating module survey. All of these survey formats must include certain “core” components. This section outlines the core survey components, discusses the strengths and weaknesses of each of the three main survey types, and describes two other survey options.

The Core

Any LSMS-type multitopic survey must collect certain essential information about the household, its members, and the local community, including:
• A roster that lists, and collects basic information about, all household members.
• Detailed information on household consumption expenditures.
• Basic housing data such as type of dwelling, water source, type of toilet, and whether the dwelling has electricity.
• The education of all household members, including who is currently in school.
• The employment status of everyone of working age and, for those who are working, their occupation, the number of hours they worked during the previous seven days, and, if they are employees, their wage earnings.
• The receipt of money or in-kind assistance from key government or NGO programs.
• The use of social services and programs, such as government health facilities, schools, agricultural extension services, and social assistance programs.
• Basic information related to the design of the sample and the outcome of the household interview.
• Local prices of basic food and nonfood goods (unless price data are available from another source, or the country is so small and its markets so well integrated that there is very little regional price variation).

These components are referred to in this book as the essential core. In addition to the essential core, it is highly recommended that the following five types of information be collected in LSMS and similar multitopic surveys:
• Anthropometric measurements (height and weight) of children 0–5 years old (unless malnutrition is known to be negligible in the country).
• The immunization status of children 0–5 years old.
• Information on basic household assets such as durable goods, housing, land, and the capital equipment used for agricultural activities and nonagricultural household enterprises.
• Information on interhousehold transfers.
• Information on rental payments for those households that rent their dwellings.

In this book the set of modules formed by adding these five components to the essential core is referred to as the recommended core.
The essential core of an LSMS or similar multitopic survey collects the information needed to describe poverty and to monitor it over time. The recommended core adds some very basic child health information, along with information on assets, interhousehold transfers, and rental payments (the use of which will be explained below). Judgments about which data are part of the essential and recommended cores are based on the many years of experience that World Bank staff have had in using data from LSMS surveys to produce poverty profiles for a wide range of developing countries. Table 2.1 lists the components of both the essential and recommended cores of LSMS-type multitopic surveys. The paragraphs that follow describe each of these components in greater detail.

HOUSEHOLD ROSTER (ESSENTIAL). Virtually every household survey should begin by determining how many people belong to each household and collecting very basic information on each household member, including age (or date of birth), sex, nationality, relationship to the head of household, and marital status. Part A of the household roster introduced in Chapter 6 (and provided in Volume 3) collects such basic information, along with another piece of information that is less essential: questions that link each married individual to his or her spouse.

CONSUMPTION EXPENDITURES (ESSENTIAL). The experience of LSMS surveys and other household surveys strongly suggests that household consumption expenditures are the single most important indicator of household welfare that can be obtained from a household survey. (See Chapters 5 and 17 for further discussion on this point.) Chapter 5 describes how to collect data on consumption expenditures, stressing that there are no costless shortcuts for collecting such data.
In some circumstances it might be possible to omit questions on the ownership of durable goods and on transfers given to other households, but the rest of the consumption module is an essential part of the core and should not be reduced further. Data on household expenditures on education, health, and housing are collected in the core elements of those modules (discussed below) and need not be included in the consumption module. Consumption in the form of in-kind payments (such as meals, clothing, or transportation) from employers is best collected in the employment module.

Table 2.1 The Essential and Recommended Cores of LSMS-Type Multitopic Surveys

Module: Sections used

The Essential Core
Household Roster: All of Part A except questions 8 and 9
Consumption: All questions except transfers given to other households (Part D) and ownership of durable goods (Part E)
Housing: Questions A1–A7, B1–B5, C1–C3, and C13–C24 of the short module
Education: All questions in the short module
Employment: Questions A2–A13, B1, B2, B7–B10, D3, D4, and D8–D17
Transfers and Other Nonlabor Income: All of Part B1; see text for further discussion
Health: Questions 10–38 of the short module
Metadata: Household Identification and Control submodule; Questions 1–4 in Summary of Visits and Interviews submodule
Prices: 30–40 food items and 10–20 nonfood items
Credit: Questions 9–14 and 21–28 of the short module (on credit obtained from NGOs or government agencies)
Agriculture: All of Part F (use of agricultural extension services), which is the same for all modules

Additional components for the recommended core
Anthropometry: Entire module, for children 0–5 years old
Health: All of Part C (immunization)
Consumption: All of Parts D and E
Housing: Questions C7–C12 of the short module
Household Enterprises: Part G of the short module, questions 1–3
Agriculture: Parts A, B, and E of the short module
Transfers and Other Nonlabor Income: Questions on income from interhousehold transfers

Source: Authors’ recommendations.
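As an illustration of how Table 2.1 might be used during questionnaire planning, the sketch below encodes the table as two Python dictionaries and reports which essential-core modules are absent from a draft questionnaire. The module names and section descriptions come from the table; the variable names, the function `missing_modules`, and the sample draft list are hypothetical and are not part of the LSMS materials.

```python
# Table 2.1 as plain dictionaries (module name -> sections used).
# The layout and all identifiers here are illustrative assumptions.
ESSENTIAL_CORE = {
    "Household Roster": "All of Part A except questions 8 and 9",
    "Consumption": "All questions except Parts D and E",
    "Housing": "Questions A1-A7, B1-B5, C1-C3, and C13-C24 of the short module",
    "Education": "All questions in the short module",
    "Employment": "Questions A2-A13, B1, B2, B7-B10, D3, D4, and D8-D17",
    "Transfers and Other Nonlabor Income": "All of Part B1",
    "Health": "Questions 10-38 of the short module",
    "Metadata": "Household Identification and Control; Summary of Visits Q1-4",
    "Prices": "30-40 food items and 10-20 nonfood items",
    "Credit": "Questions 9-14 and 21-28 of the short module",
    "Agriculture": "All of Part F",
}

RECOMMENDED_ADDITIONS = {
    "Anthropometry": "Entire module, for children 0-5 years old",
    "Health": "All of Part C (immunization)",
    "Consumption": "All of Parts D and E",
    "Housing": "Questions C7-C12 of the short module",
    "Household Enterprises": "Part G of the short module, questions 1-3",
    "Agriculture": "Parts A, B, and E of the short module",
    "Transfers and Other Nonlabor Income": "Income from interhousehold transfers",
}


def missing_modules(planned, required=ESSENTIAL_CORE):
    """Return the required modules not in the planned list, in table order."""
    planned_set = set(planned)
    return [name for name in required if name not in planned_set]


# Hypothetical draft questionnaire that covers only six modules.
draft = ["Household Roster", "Consumption", "Housing",
         "Education", "Employment", "Health"]

for name in missing_modules(draft):
    print(f"Missing from essential core: {name} ({ESSENTIAL_CORE[name]})")
```

A check of this kind only confirms that a module is present by name; whether the specific questions listed in the table are included would still need to be verified against the questionnaire itself.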
HOUSING (ESSENTIAL). Housing characteristics, including the type of dwelling, the construction materials used, the number of rooms, the availability of electricity, the source of drinking water, and the type of toilet, are very basic indicators of living standards. They provide analysts and policymakers with information on a household’s standard of living that goes beyond consumption expenditures. Because housing information is simple to collect, it should be included in any LSMS-type survey.

The long version of the housing module introduced by Chapter 12 (and presented in Volume 3) collects substantially more housing data than are necessary for the essential core, and even the short version is longer than the essential core. Only the following questions from the short version of the housing module need to be included in the essential core: A1–A7, B1–B5, and B11–B21. Even some of these questions can be omitted in some countries. The questions on heating (B18–B21) can be removed for countries with warm climates, and the questions that distinguish between wet and dry seasons (B1–B3) can be simplified in countries where this distinction is not important.

Another part of the core data set, housing-related consumption expenditures (such as expenditures on electricity, water, and cooking fuel), is most conveniently collected in the housing module rather than in the consumption module. A final useful indicator of living standards is information on the ownership of the household’s dwelling. Questions C1–C3 in the short version of the housing module collect expenditure information, and questions C13–C24 collect ownership information.

EDUCATION (ESSENTIAL).
Education is both a determinant and a key indicator of living standards. The short version of the education module introduced in Chapter 7 (and presented in Volume 3) comprises all of the essential core questions on education. The only questions that might be omitted are the two questions on grade repetition. The short education module assesses education from several different angles, including school attainment, current enrollment, and education expenditures. Information on the school attainment of household members ages 5 and older is easy to collect and has many analytical uses (such as classifying households in terms of the education level of the head of the household). School enrollment among children, another key indicator of living standards, is also easy to collect. And it is usually more convenient to collect information on household education expenditures in the education module than in the consumption module.

EMPLOYMENT (ESSENTIAL). Basic employment information on household members of working age (7 and older in many countries) should be collected as part of the essential core of any LSMS-type survey. The most important source of income for poor people in developing countries is their labor; employment data, including information on unemployment, indicate how this labor is being used. Essential employment information includes each person’s occupation and the number of hours that he or she has worked in the previous seven days. While it would also be useful to gather data on the incomes of all employed household members, this is not easily done for the self-employed (see Chapter 17 for further discussion). However, income data should still be collected for employees even when such data cannot be collected for the self-employed, for two reasons. First, these partial data are useful for understanding which occupations pay well and which do not.
Second, since data are already needed from employees on in-kind benefits provided by their employers (in order to calculate consumption expenditures), it would seem strange to ask about those benefits without first asking about money income.

The short version of the employment module introduced by Chapter 9 (and presented in Volume 3) collects more information than the core of an LSMS-type survey requires. The following questions from the draft employment module constitute the essential core: A2–A13, B1, B2, B7–B10, D3, D4, and D8–D17. Job-specific information (questions in Parts B or D) should be collected both for the person’s main occupation and for any secondary occupation. (The main occupation is the job the respondent spent the most hours doing during the previous seven days.)

GOVERNMENT AND NGO TRANSFERS (ESSENTIAL). Many developing countries have programs that provide money or in-kind assistance to households. Some of these are government programs and some are run by nongovernmental organizations (NGOs). Examples of these programs include cash welfare payments, pensions, unemployment insurance, food stamps, food rations, school feeding programs, community soup kitchens, scholarships, and free or subsidized textbooks. While the range of programs is very wide, there are usually only a few sizable programs in any particular country. A key policy question that LSMS surveys can address is who benefits from these programs. However, only programs that reach a substantial fraction of the population can be studied with the relatively small sample sizes recommended for LSMS and similar multitopic surveys.
Questions about government and NGO transfer programs should not necessarily all be in the same module (a fact that makes this part of the core difficult to standardize). While questions about cash income fit best in the transfers and other nonlabor income module, questions on school feeding programs should probably be in the education module. However, Part B1 in the transfers and other nonlabor income module is a good place to start collecting this information.

SOCIAL SERVICES (ESSENTIAL). Related to programs that provide cash or in-kind assistance are programs that provide services. The most common examples of social services are public schools, public health services, agricultural extension services, credit programs, public work schemes, electricity supply, public water supply, and sewage systems. LSMS surveys and similar multitopic surveys should always collect some information on the use of social services, at least enough to measure variation in access to and utilization of such services across different socioeconomic groups.

As with direct assistance to households, the types of programs and the amount of detail needed to identify who benefits from them will vary among countries. School enrollment information is already collected in the core, as discussed above, although additional information may need to be collected on any school services that are available to some students and not to others, such as tuition waivers or after-school programs for disadvantaged students. Information on the use of public health services is also very important; such information is collected in questions 10–38 of the short version of the health module (introduced in Chapter 8). Data on the use of agricultural extension services are collected in Part F of all versions of the agriculture module introduced in Chapter 19.
Some countries have subsidized credit programs to assist poor households; information on these programs is collected in questions 9–14 and 21–28 of the short version of the credit module introduced in Chapter 21. It is possible to identify beneficiaries of public works programs by adding one or two questions to the employment module that ask whether an individual’s current employment is related to such a program. Finally, information on housing-related physical infrastructure services (such as water, sanitation, and electricity) is collected in the core of the housing module, as discussed above.

METADATA (ESSENTIAL). The last type of information that must be collected in the household questionnaire of any LSMS-type survey consists of basic data on where the household fits in the sample and on the outcome of the interview. This type of information, known as “metadata,” is discussed in Chapter 4. For the essential core, it is not necessary to collect all of the information covered in the metadata module. The essential metadata are the date of the interview or interviews, the identification (ID) codes for the household and its primary sampling unit,4 the ID codes of the interviewer and the other team members who collected, checked, or entered the data from that household, information on whether an interview actually took place (and if not, why it did not), and perhaps some data on the ethnic group and religion of the household. This information is collected in the metadata module, in the Household Identification and Control submodule and in questions 1–4 of the Summary of Visits and Interviews submodule.

PRICES (ESSENTIAL). Price information should be collected at the level of the community (the primary sampling unit) since all households in a given community face the same prices. How to collect price information is discussed in Chapter 13. The main task is to select the items for which price data will be collected.
While the exact items will vary across countries, prices should be collected for at least 30–40 of the most commonly consumed food items and 10–20 of the most commonly purchased nonfood items. In a few countries other sources of reliable price data may already exist for both urban and rural areas; if these data can be matched to the communities covered in the survey, there is no need to collect new price data. And in some small countries such as Jamaica, prices vary little among regions. In these cases, no price data need to be collected as long as national price data exist that show changes in prices over time.

ANTHROPOMETRIC MEASUREMENTS (RECOMMENDED). Anthropometric data, particularly on height and weight, should be collected for children 0–5 years old in almost every LSMS or similar household survey. Stunting (low height for age) and wasting (low weight for height) are common measures of children’s nutritional status; height and weight data are critical in countries where children are at risk of malnutrition. And collecting basic anthropometric data about children is simpler and more reliable than collecting other data on health status. The details of how to collect children’s anthropometric data are explained in Chapter 10.

Collecting height and weight data requires some effort. The data are collected using special equipment that is bulky and troublesome to carry around to each household. One consequence of this is that another individual is often added to each survey field team. If collecting children’s height and weight data were easier, such anthropometric measurements would have been classified as part of the essential core of any LSMS-type survey.

IMMUNIZATION (RECOMMENDED). Almost all LSMS and similar multitopic surveys should collect immunization records for children ages 0–5.
In recent years child immunization programs have dramatically reduced the incidence of several life-threatening childhood diseases in many developing countries, significantly reducing infant and child mortality rates. However, many countries still do not have 100 percent immunization coverage. Therefore, information on the extent of coverage and on where coverage is low is important for almost any analysis of living standards. In addition, since child immunization coverage can change dramatically over a year or two, it serves as a useful indicator of changes in the provision of government services during periods of economic or social instability. Child immunization information is collected in Part C of the health module introduced by Chapter 8.

ASSETS (RECOMMENDED). Information on household assets covers any consumer durable goods owned by the household, the value of owner-occupied housing, and the ownership of land and capital assets related to agricultural activities and household enterprises. There are several reasons for collecting these data. First, the possession of household durable goods such as radios, televisions, bicycles, motorcycles, and cars is a simple indicator of living standards. Second, the sum of the value of all these different household assets gives a rough (and admittedly incomplete) indicator of household wealth. Third, data on the ownership of land and on capital assets used in agricultural and nonagricultural enterprises indicate productive assets. Fourth, in some countries, particularly countries of the former Soviet Union, there is evidence that adding the consumption derived from durable goods and housing to total consumption can lead to substantial changes in the relative economic positions of different types of households.

Information on the ownership of consumer durable goods can be collected using Part E of the consumption module.
Data on the value of owner-occupied housing are collected in the short version of the housing module, using (at minimum) questions C1 and C11, with C3 and C12 providing alternative valuations. A short set of questions on the assets used in household enterprises is provided in Part G of the short version of the household enterprise module; only questions 1–3 are needed. Parts A, B, and E of the short agriculture module collect a modest amount of information on households’ land holdings, livestock, machinery, and other agricultural assets.

PRIVATE INTERHOUSEHOLD TRANSFERS (RECOMMENDED). Private interhousehold transfers, which are pervasive in many countries, are used by many households to cope with poverty and economic vulnerability. Transfers received are covered by the transfers and other nonlabor income module (introduced in Chapter 11) and transfers sent are covered by the consumption module (introduced in Chapter 5). At least the short versions of the private interhousehold transfer submodules should be used in virtually all surveys. Even in a relatively simple survey it may be worthwhile to use the standard version of the submodule on transfers received.

RENTAL PAYMENTS FOR HOUSING (RECOMMENDED). Estimates of the annual rental value of dwellings are needed to estimate the consumption value of housing for households that own these dwellings. In most countries such estimates can be calculated by estimating the relationship between basic housing characteristics, which are already part of the core, and the rental payments made by households that rent their dwellings. The key piece of information needed is the rental payments of households that rent. Questions C7–C12 in the short version of the housing module collect information on rental costs.

Full LSMS-type Survey

In practice, the essential core, and even the recommended core, will tap only a small part of the potential policy uses of an LSMS-type survey.
In most LSMS and other multitopic surveys, much more can and should be added to the questionnaires to gather information beyond what is collected in the core. This subsection and the two that follow it discuss different ways to add to the core by expanding modules and combining them to form a survey or sequence of surveys.

A full LSMS-type multitopic household survey can be formed by combining the short or standard versions of most of the modules in the household questionnaire with the corresponding parts of the community and price questionnaires. This produces a household survey similar in design to the original LSMS surveys first used in 1985, except that the modules presented in Volume 3 of this book (and described in Parts 2 and 3) include revisions based on 15 years of experience with LSMS and other household surveys. Because some of the standard versions of modules presented in Parts 2 and 3 are significantly larger than versions used in the original LSMS surveys, a household questionnaire including all of the standard modules would almost certainly be too large to be practical. Thus the household questionnaire of a full LSMS-type survey needs to be trimmed, either by replacing the standard versions of some modules with their short versions or by dropping some nonessential modules.

A well-designed full LSMS-type multitopic survey collects information that measures or otherwise describes:
• Household consumption.
• Household income.
• Key nonmonetary indicators of welfare such as nutritional and health status, education status, and housing conditions.
• Many aspects of household behavior, such as income-generating activities, human capital investments, fertility, and migration.
• The local economic environment (including prices and the availability of services).
• Participation in specific government programs such as food stamps programs, job training programs, and agricultural extension services.
Having all this information for a group of households makes it possible to describe many indicators of living standards, estimate the determinants of different dimensions of living standards and different types of household behavior, and estimate the relationships between dimensions of living standards and household behavior (such as the impact of children’s nutritional status on their school performance).

The full LSMS household questionnaire is long and complex. In almost all cases it is too long to be completed in a single visit by an interviewer to a household. Instead, an interviewer typically visits each household twice. All of the individual-specific modules (roster, education, health, employment, and migration) are administered in the first visit, sometimes with the addition of one or two household-level modules such as housing. The interviewer makes an appointment for a second visit, usually about two weeks later, to reinterview household members who are most knowledgeable about the other household-level modules (such as consumption, agriculture, and household enterprises).

To ensure that high-quality data are collected and to keep the budget within reasonable limits, the samples in full LSMS-type surveys are usually relatively small, between 2,000 and 5,000 households. Samples of this size are still large enough to provide accurate information on the nation as a whole, on rural and urban areas, and on a small number of geographic regions. However, such samples are not large enough to provide accurate statistics for each state, province, department, or district in a country. Even at the national level, they cannot provide precise information on phenomena that do not pertain to most households or individuals, such as post-secondary education or participation in a program used by only a small fraction of the population. See Grosh and Munoz (1996) for a more thorough discussion of sample size and sampling issues.
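The point about sample size can be illustrated with a rough calculation. The sketch below is not taken from the LSMS documentation: the design effect of 2, the 30 percent poverty rate, the 30 districts, and the 3 percent program participation rate are all illustrative assumptions, and the function name margin_of_error is hypothetical. It uses the standard approximation for the margin of error of a proportion, inflated by a design effect to reflect a clustered sample.

```python
import math

def margin_of_error(p, n, deff=2.0, z=1.96):
    """Approximate 95 percent margin of error for an estimated proportion.

    p    : assumed true proportion (for example, a poverty rate)
    n    : number of sampled households
    deff : design effect for a clustered sample (illustrative assumption)
    z    : normal critical value for the chosen confidence level
    """
    return z * math.sqrt(deff * p * (1.0 - p) / n)

sample_size = 3000   # within the 2,000 to 5,000 range cited in the text
n_districts = 30     # hypothetical number of districts of equal size

# Precision for the nation as a whole versus a single district.
national = margin_of_error(0.30, sample_size)
per_district = margin_of_error(0.30, sample_size // n_districts)

# A program reaching 3 percent of households yields few sample observations.
participants = round(0.03 * sample_size)
among_participants = margin_of_error(0.50, participants)

print(f"national:           +/- {national:.3f}")
print(f"single district:    +/- {per_district:.3f}")
print(f"program households: {participants} in sample, +/- {among_participants:.3f}")
```

Under these assumptions the national estimate is precise to within a few percentage points, while a single district or a small group of program participants is measured far less precisely, which is the pattern the paragraph above describes.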
In most cases it is not worthwhile to implement a full LSMS-type multitopic survey every year. Much of the analysis for which LSMS surveys are designed does not need to be repeated annually. For example, while it is important to understand the determinants of fertility, it is unlikely that these determinants change greatly from one year to the next. Sizable changes are likely to occur only over the course of several years, as economic conditions and people’s attitudes change. Another reason not to implement a full survey every year is that it is costly to administer such a comprehensive household questionnaire, and requires substantial work at each stage. Therefore, a full LSMS-type survey should be implemented only once every three to five years.

During 1985–99 the following countries implemented full-size LSMS surveys: Algeria, Brazil, Côte d’Ivoire, Ecuador, Ghana, the Kyrgyz Republic, Mauritania, Morocco, Nepal, Pakistan, Panama, Peru (in 1985–86, 1991, and 1994), Turkmenistan, and Vietnam.

Scaled-down LSMS-type Survey

A scaled-down LSMS-type survey can be constructed by omitting some modules from the household questionnaire of a full LSMS-type survey and by abridging other modules. Such a survey will still be a multitopic survey, but will cover fewer topics than a full-size survey would. Substantial reductions in the size of the household questionnaire may mean that the questionnaire can be completed in a single visit by the interviewer to the household, as compared to the two visits needed for a full LSMS-type multitopic survey.

The extent to which various modules should be reduced or eliminated will depend on which policy questions are most important in the country in question. However, there is a limit to how much the questionnaire can be cut. The essential core of an LSMS or similar multitopic survey, as described above, must remain.
In addition, the elements that are added to form the recommended core (data on anthropometrics, child immunization coverage, basic household assets, interhousehold transfers, and rental payments of households that rent their dwellings) should almost always be included. The community questionnaire may or may not be included in a scaled-down survey, but the price questionnaire should always be used, except in those rare cases in which fully adequate price data already exist or price variation across regions is negligible. Overall, the analytical objectives of a scaled-down LSMS-type survey are more modest than the objectives of a full-size survey.

One common way to abridge the questionnaire is to decide not to collect the data needed to measure total household income. Not measuring total household income allows survey designers to delete most of the agriculture and household enterprise modules, retaining only questions on agricultural extension services that are part of the essential core and questions on assets that are part of the recommended core. Yet the questions on wages from the employment module and the questions on public and private transfers from the transfers and other nonlabor income module should be retained, as they are part of the essential core of any LSMS-type survey.

Other questions that can be dropped are questions on any aspects of household behavior that are of little interest to policymakers. The savings, credit, fertility, and migration modules have often been deleted from previous LSMS surveys. Because the new time-use module is quite lengthy, it is also a candidate for omission, unless data on time use are of particular interest to policymakers. If analysts aim to measure use of social services but not to estimate the determinants of demand for them, survey designers could choose to use the short, rather than the standard, versions of the health and education modules.
An alternative way to obtain a scaled-down multitopic survey is to “scale up” an existing single-topic household survey, such as a labor force or household expenditure survey. In Romania, Latvia, and Bangladesh, new modules on the use of social services and programs were added to existing household income and expenditure surveys. In Guyana, households that had been interviewed in a previous income and expenditure survey were revisited to collect information on health, education, and anthropometrics; the separate data files were later merged for purposes of analysis. In Jamaica, households from the Labor Force Survey were revisited by interviewers who administered the Survey of Living Conditions; the two data files were later merged. In Paraguay, additional modules were added directly to the Labor Force Survey questionnaire.

Scaling down the household questionnaire of a full LSMS-type survey reduces the analytical potential of data collected, especially in parts of the questionnaire that are dropped or abridged. A reduced questionnaire produces fewer descriptive statistics on many dimensions of household welfare than would be possible using a full-size survey. Data from a scaled-down questionnaire can be used to analyze only a few of the determinants of living standards. And such data reductions substantially reduce the range of analytical methods that can be used.

A scaled-down LSMS-type survey can be implemented fairly often, perhaps annually or every other year. Such frequent implementation is desirable because one of the main uses of data from a scaled-down survey is to monitor changes in poverty and other dimensions of welfare over time. Also, the fact that a scaled-down survey collects less data on the determinants of household welfare and behavior than does a full-size survey means that implementing it frequently wastes fewer resources than would implementing a full LSMS-type survey every one or two years.
Another advantage of a scaled-down survey is that it is easier and less expensive to carry out than a full-size survey. Finally, a scaled-down survey can be carried out using somewhat larger samples than a full LSMS-type survey because it is subject to fewer managerial and budget constraints.

Scaled-down LSMS surveys have been carried out, with World Bank support, in Albania, Azerbaijan, Bolivia, Bulgaria, Pakistan (1995/96 and 1996/97), Peru (1990), and Tanzania.

Core and Rotating Module Design

The “core and rotating module” design for a multitopic household survey is an attempt to combine the advantages of full and scaled-down LSMS-type surveys. In this design, a scaled-down LSMS-type survey forms the “core,” while one or two modules are added or greatly expanded each time the survey is carried out. Modules that are added or expanded in any given year revert to their “core” size the following year, creating a module “rotation” scheme for the modules that go beyond the core. In most cases the survey is fielded annually, although it can also be a semiannual or biennial survey. The core that is repeated each time the survey is implemented must include the essential core described above, and in almost all cases it should include everything in the recommended core. In many cases the core of a core and rotating module design should collect additional information as well, in order to provide a more detailed picture of household welfare each time the survey is implemented.

An example of how to implement this approach would be to use only the core in the first year of the survey, in order to focus on making sure that the core works well.
In the second year the health module in the household questionnaire would be expanded to gather more detailed data on individuals’ health status and behavior, the kinds of health care sought, and the cost and quality of that health care. In addition, a health facility questionnaire could be added (see Chapter 8 for further details on health facility questionnaires). In the third year the health module would return to its original “core” size and a new subject, such as education or savings, would be given special emphasis. Expansion of any particular module might require making some additions to other modules in the survey to ensure that the analytical potential of the data collected in the expanded module could be fully exploited. Each chapter in Parts 2 and 3 of this book explains what data are needed from other modules to complement the data collected in the module covered by that chapter.

The core and rotating module design is a hybrid of a full LSMS-type multitopic survey and a frequently implemented scaled-down LSMS-type survey. Implementing a core and rotating module survey annually would allow for the same monitoring of poverty and welfare that is possible with data from an annual scaled-down survey. In addition, in each rotation of a particular module, this kind of survey would collect the data analysts need to study the determinants of household behavior for a specific topic; in other words, data comparable to what are collected in a full-size survey. It might even be possible to use data from the scaled-down modules to study topics that are not emphasized by the survey in a particular year.

The cost and sampling implications of the core and rotating module design lie somewhere in between those of a full-size LSMS-type survey and those of a scaled-down survey.
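The three-year example described above (core only in the first year, an expanded health module in the second, a different featured subject in the third) can be written down as a simple schedule. The following is a minimal sketch for planning purposes only; the function name rotation_schedule, the module names, and the labels "core" and "expanded" are hypothetical and do not come from the LSMS questionnaires.

```python
def rotation_schedule(core_modules, featured_by_year, start_year=1):
    """Return, for each survey round, the version of every module fielded.

    core_modules     : names of modules fielded at "core" size every round
    featured_by_year : one entry per round; the module expanded that round,
                       or None for a core-only round
    """
    schedule = []
    for offset, featured in enumerate(featured_by_year):
        # Every module starts each round at its core size ...
        versions = {module: "core" for module in core_modules}
        # ... and only the featured module (if any) is expanded that round.
        if featured is not None:
            versions[featured] = "expanded"
        schedule.append((start_year + offset, versions))
    return schedule

core = ["consumption", "health", "education", "savings"]
plan = rotation_schedule(core, [None, "health", "education"])

for year, versions in plan:
    expanded = [m for m, v in versions.items() if v == "expanded"]
    print(year, expanded or "core only")
```

Because each round is rebuilt from the core list, a module that was expanded in one year automatically reverts to its core size in the next, which mirrors the rotation described in the text.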
Perhaps of greatest concern in the core and rotating module design are the institutional arrangements for developing, implementing, and analyzing the special modules. While for both full-size and scaled-down LSMS-type surveys it is possible to put a lot of effort into the design of the first survey and give less attention to improving its design in subsequent years, implementing the core and rotating module design means that the questionnaire needs to be significantly modified each year, requiring much more attention from survey designers after the first year.

Indonesia’s SUSENAS is a long-standing example of a core and rotating module survey design. Jamaica’s Survey of Living Conditions, which began in 1988, was the first LSMS survey to adopt this approach. A new LSMS survey in Cambodia is just starting to develop such a system, as is the Bangladesh Household Expenditure Survey. (The Bangladesh Household Expenditure Survey is not usually regarded as an LSMS survey; however, it has adopted much of the LSMS methodology.)

Special Purpose Sample Designs

There are two other possible survey designs, both of which use special purpose samples (that is, samples that are not nationally representative). The first is a survey that samples a special population that is of particular interest for analytical or policy purposes (see note 5). An example of this is a sample of households within a single city that is used to study issues pertaining to that city, such as the housing market, the water supply system, or urban air pollution. Two LSMS surveys of this type have been performed: one in the Kagera region of Tanzania, focusing on areas with high prevalence of AIDS, and one in rural areas of Northeast China, focusing on the agricultural activities of rural households.

A second kind of special purpose survey is one in which the sample is drawn solely for purposes of program evaluation.
In this type of survey, a group is observed both before and after the benefits of a particular service or program are made available to this group. Alternatively, the sample may be composed of two groups, one consisting of the households that benefit from the service to be evaluated (the treatment group) and the other consisting of households that are similar to the first in every respect except that they do not benefit from the service (the control group).

These special purpose samples usually gather detailed data on the topic being studied, whether it is a specific sectoral issue (such as agriculture) or a program to be evaluated. There are so many ways to design such surveys that this book cannot hope to cover all of them. However, since special purpose surveys typically collect data on many general characteristics of the sampled households (such as size, composition, living standards, labor force status, and education), designers of this kind of survey can use the modules proposed in this book as a guide for collecting this supplemental information. The experience of past LSMS surveys has been used in designing special purpose surveys to evaluate the impact of educational reforms in El Salvador. And the Nicaragua LSMS survey included a special sub-sample designed to evaluate the impact of that country’s Social Investment Fund.

Matching Circumstances and Designs

This section provides some approximate rules of thumb for choosing among the three common survey design options discussed in the previous section. These recommendations should not be thought of as rigid or beyond debate; instead, they should be thought of as a starting point for making survey design decisions. This is the case for several reasons. First, the dividing lines between the three basic survey designs are flexible, as it is possible to develop “hybrid” surveys that merge characteristics from the different survey design options.
Second, individual countries may not fit neatly into the categories implicit in the rules. Third, surveys may have multiple analytical objectives. Finally, funding constraints are not explicitly considered here.

Survey planners should consider the following general “rules of thumb” when deciding what kind of survey to implement:
1. Countries with sufficient institutional capacity to implement a complex survey should use either a full LSMS-type multitopic survey every three to five years or a core and rotating module design; both options can serve a broader range of analytical objectives than can a sequence of scaled-down LSMS-type surveys.
2. If annual (or biannual) monitoring of living standards or poverty is the most important analytical objective, either a sequence of frequent scaled-down LSMS-type surveys or a core and rotating module survey should be adopted. In contrast, a full-size multitopic survey is inappropriate because cost and efficiency considerations imply that such surveys should be implemented only every three to five years.
3. No new survey is needed if the main objective is to provide periodic descriptive information (say, every three to five years) or to examine the coverage of government programs in countries where ample data are already available from other sources.
4. If the main objective is to gather periodic descriptive information or to examine the coverage of government programs, a core and rotating module design should not be chosen. Such a design would collect data much more frequently than is necessary.
5. If the main objective is to model household behavior, either a full LSMS-type survey or a core and rotating module survey should be chosen. A series of scaled-down surveys would be insufficient for modeling household behavior.
6. If the main objective is to model household behavior and very little other data are available, a full-size multitopic survey is preferable to a core and rotating module survey since the latter cannot supply detailed information on all topics until it has been in operation for several years. The core and rotating module design can be adopted after one or two full LSMS-type surveys have been carried out.
7. If the main objective is to model household behavior and a large amount of other data are available, the core and rotating module survey is preferable to a series of periodic full LSMS-type surveys because the core and rotating design allows poverty to be monitored more frequently over time.
8. If the institutional capacity in the country is limited and the survey aims either to monitor poverty and living standards annually or to provide descriptive information (including coverage of government programs) periodically, a scaled-down LSMS-type survey should be chosen. This survey may be either frequent (for annual monitoring) or periodic (for descriptive information every three to five years). The other options, full multitopic and core and rotating module, are too complex for countries with limited institutional capacity.

Table 2.2 summarizes the implications of these rules, showing which rules lead to which choices. Because countries with little institutional capacity cannot implement a full LSMS-type multitopic survey or a core and rotating module design on their own, they will not be able to collect data that are useful for analyzing household behavior unless their institutional capacity is either permanently improved or supplemented in the short run by using international experts. In addition, significant purchases of new equipment may be required in some countries.
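For readers who want to apply the rules mechanically, the cells of Table 2.2 can be transcribed into a lookup keyed on institutional capacity, availability of other data, and analytical objective. The sketch below is only a transcription aid under stated assumptions: the function name recommend_design and the short key labels ("sufficient", "limited", "ample", "describe", "monitor", "model") are hypothetical, and the result should be treated as a starting point in the same spirit as the rules themselves.

```python
# Keys: (institutional capacity, availability of other data, objective).
# Values: (recommended design, rule numbers cited in Table 2.2).
TABLE_2_2 = {
    ("sufficient", "limited", "describe"): ("Full LSMS-type survey", (1, 4)),
    ("sufficient", "limited", "monitor"): ("Core and rotating module", (1, 2)),
    ("sufficient", "limited", "model"): ("Full LSMS-type survey", (5, 6)),
    ("sufficient", "ample", "describe"): ("No new survey needed", (3,)),
    ("sufficient", "ample", "monitor"): ("Core and rotating module", (1, 2)),
    ("sufficient", "ample", "model"): ("Core and rotating module", (5, 7)),
    ("limited", "limited", "describe"): ("Periodic scaled-down LSMS-type survey", (8,)),
    ("limited", "limited", "monitor"): ("Frequent scaled-down LSMS-type survey", (8,)),
    ("limited", "limited", "model"): ("Full LSMS-type survey", (5, 6)),
    ("limited", "ample", "describe"): ("No new survey needed", (3,)),
    ("limited", "ample", "monitor"): ("Frequent scaled-down LSMS-type survey", (8,)),
    ("limited", "ample", "model"): ("Core and rotating module", (5, 7)),
}

def recommend_design(capacity, other_data, objective):
    """Look up the design suggested by Table 2.2 for one setting.

    Returns (design, rules, needs_international_experts). The last flag
    reflects the table's footnote for limited-capacity countries that
    want to model household behavior.
    """
    design, rules = TABLE_2_2[(capacity, other_data, objective)]
    needs_experts = capacity == "limited" and objective == "model"
    return design, rules, needs_experts

print(recommend_design("sufficient", "ample", "monitor"))
print(recommend_design("limited", "limited", "model"))
```

A lookup table is used instead of nested conditionals so that each cell can be checked directly against the printed table.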
Choosing the Modules, Defining Their Objectives, and Setting Their Size

Once the basic blueprint of the survey has been selected, survey designers must decide which modules to include in the household and community questionnaires (see note 6). Designers must also define specific objectives for each module and decide on each module’s approximate length. The procedures for these steps are discussed in this section. Because decisions about length and objectives ultimately depend on many country-specific details, specific recommendations cannot be provided for each possible scenario. Instead, some general guidelines and procedures are provided that should prove useful for completing this step efficiently and effectively.

Two general points must be made at the outset. First, the tasks of choosing modules, defining their objectives, and setting their approximate size are all closely related and thus must be done simultaneously rather than sequentially. The type of objectives and the number of objectives have considerable implications for the size of each module; more objectives, and more complex objectives, necessitate a larger module. Second, the objectives of each module should be consistent with the overall objectives of the survey, in terms of both the analytical objectives (describing living standards, monitoring poverty and living standards, examining the coverage of government programs, estimating the impact of policies) and the specific topics in which policymakers are interested. The overall objectives of the survey already provide some information on what the objectives of many of the modules will be.
Table 2.2 Recommended Survey Designs for Different Settings

                  Analytical objective
Availability of   Describing living standards     Monitoring living               Modeling household
other data        or examining program coverage   standards or poverty            behavior

Countries with sufficient institutional capacity
Limited           Full LSMS-type survey           Core and rotating module        Full LSMS-type survey
                  (Rule 1 + Rule 4)               (Rule 1 + Rule 2)               (Rule 5 + Rule 6)
Ample             No new survey needed            Core and rotating module        Core and rotating module
                  (Rule 3)                        (Rule 1 + Rule 2)               (Rule 5 + Rule 7)

Countries with limited institutional capacity
Limited           Periodic scaled-down LSMS-type  Frequent scaled-down LSMS-type  Full LSMS-type survey
                  survey (Rule 8)                 survey (Rule 8)                 (Rule 5 + Rule 6)a
Ample             No new survey needed            Frequent scaled-down LSMS-type  Core and rotating module
                  (Rule 3)                        survey (Rule 8)                 (Rule 5 + Rule 7)a

a. International experts must be hired to carry out key tasks.
Source: Authors’ recommendations.

Choosing modules

A good first step in choosing modules is to set the upper and lower limits of what can be included in a multitopic survey. The lower limit is the essential core discussed above; in almost all cases this lower limit should be expanded to include the additional elements that are in the recommended core. The upper limit will depend on country-specific circumstances such as the capacity of the statistical agency and the willingness of households to participate in lengthy interviews. It is never possible to include all of the modules in any one survey.

An important question to address relatively early when making decisions about modules is whether the survey will attempt to collect enough data to calculate total income. The advantages and disadvantages of collecting these data are discussed at length in Chapter 17.
Clearly, if survey designers decide to collect the data needed to calculate total income, the agriculture and household enterprise modules need to be included in the questionnaire (see note 7). If designers decide not to collect income data and there is little interest in these two modules, they can be dropped, except for the questions on use of agricultural extension services that are part of the essential core and the asset questions that are part of the recommended core. It will probably not be possible to collect total income data in a scaled-down survey because it is not feasible in a single visit to a household to collect the recommended core data plus the data from the agriculture and household enterprise modules and still have room to examine other topics. This implies that it is also difficult to collect total household income in a core and rotating module survey, except when the module featured is either the agriculture or the household enterprise module; even when one of these two modules is featured in such a survey, collecting total income data may not be feasible in some countries.

Two other specific decisions to make early in this step of the survey design process are whether to collect time-use data and whether to implement a large number of the detailed environment modules (see Chapters 22 and 14, respectively). The time-use module is very long, and as such should be thought of as an expanded module. If survey designers choose to include this module, they may have to omit several other short or standard modules. While it would be feasible to include the time-use module in a full-size LSMS-type survey, this module is probably too large to be included in a scaled-down LSMS-type survey and would work in a core and rotating module survey only if it were chosen as the topic emphasized in a particular year.
The same circumstances apply to the environmental module. The full set of these environmental submodules is equivalent to a very large expanded module, and for this reason it is difficult to imagine the full set used in a single survey. The contingent valuation modules should be used only when specific improvements in services (such as urban water supply, urban sanitation, urban air quality, or rural water supply) are being contemplated. Even a subset of the expanded environmental modules is likely to be equivalent to a large expanded module, especially if the water, sanitation, and fuel use modules are included. This being the case, it is feasible to include a large subset of the environmental modules in a full LSMS-type multitopic survey, but only a relatively small subset can be included in a scaled-down survey. In a core and rotating module survey a large subset of the environmental submodules can be used only if environmental topics are emphasized in that particular year; for all other years only a small subset would be feasible.

At this point, it is useful to give some general rules about how much room there is for modules in different kinds of multitopic surveys. For a full LSMS-type survey, the household questionnaire should be roughly large enough to include a mixture of about 15 standard or short modules. The number of modules that can be included in a scaled-down survey is probably closer to 8 or 10, most of which have to be short versions. A core and rotating module survey lies somewhere in between but is probably closer to the scaled-down survey if only one visit is made to the household. The modules chosen must in all cases include the components of the essential core; in almost all cases the modules should also include the additional components found in the recommended core.
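These rough limits can serve as a quick screen on a draft module list before any detailed design work begins. The following is a minimal sketch under stated assumptions: the function name check_module_list, the module names, and the version labels are hypothetical; the ceilings of 15 and 10 are taken loosely from the approximate figures above; and the list of required core components must be supplied by the survey designer.

```python
# Rough ceilings on the number of modules, taken from the approximate
# figures in the text; they are guides, not fixed limits.
APPROX_MODULE_CEILING = {"full": 15, "scaled_down": 10}

def check_module_list(design, modules, required=()):
    """Return a list of warnings for a draft module list.

    design   : "full" or "scaled_down"
    modules  : dict mapping module name to version
               ("short", "standard", or "expanded")
    required : names of core components the designer has decided must appear
    """
    warnings = []
    ceiling = APPROX_MODULE_CEILING[design]
    if len(modules) > ceiling:
        warnings.append(
            f"{len(modules)} modules exceeds the rough ceiling of {ceiling}"
        )
    missing = [name for name in required if name not in modules]
    if missing:
        warnings.append("missing core components: " + ", ".join(missing))
    if design == "scaled_down":
        # The text says most modules in a scaled-down survey must be short.
        short = sum(1 for version in modules.values() if version == "short")
        if short <= len(modules) / 2:
            warnings.append(
                "most modules in a scaled-down survey should be short versions"
            )
    return warnings

draft = {
    "roster": "short", "consumption": "standard", "housing": "short",
    "education": "short", "health": "short", "employment": "short",
}
print(check_module_list("scaled_down", draft,
                        required=["roster", "consumption", "housing"]))
```

An empty list means the draft passes these crude checks; whether the questionnaire is actually feasible still depends on field testing, as the chapter notes later.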
Using these starting points for what is feasible, the next task is to consult with policymakers at the highest level to get a detailed idea of which topics are of greatest interest to them (if this has not already been done). Policymakers need to specify which topics are of overriding concern, which are of moderate interest, which are of minor interest, and which are of little or no interest. Expanded modules, if they exist, should be used for topics of overriding interest (see note 8). Standard modules should be used for items of moderate interest. Short modules may be appropriate for items of minor interest. Items of little or no interest need not be covered in the survey unless they are part of the essential or recommended core.

The core and rotating module survey design is inherently more flexible than other classic designs; if the core and rotating survey is implemented annually, it can cover four or five topics in great detail over the same number of years by including the expanded version of one of these modules each year. Of course, survey designers still have to set priorities about which expanded module is included in the first year, which is included in the second year, and so on.

The above paragraphs provide survey designers with a scheme for generating a draft list of the modules to be included in the survey, their approximate length, and, to an extent, the objectives of each module. Needless to say, this draft list needs to be refined. This can be done by adding two new “ingredients” to the process: discussions with policymakers who specialize in particular topics or programs, and a careful reading of the chapters in Parts 2 and 3 of this book.
The task is to reconcile the specific policy questions raised by these more specialized policymakers with the feasibility of collecting data to analyze them (as discussed in detail in Parts 2 and 3 of this book) given the approximate sizes of each module as specified by high-level policymakers. This process is not simple and consequently involves a certain amount of iteration.

Unfortunately, policy issues raised by specialist policymakers often require more questions than can fit into a module of the size specified in the first draft of the modules. The choice at this point is between not including many of these policy issues in the survey and expanding the module containing these questions at the expense of other modules. A third alternative is expanding the relevant module without reducing the size of any other module, but the feasibility of this option is open to question and will not become clear until a draft questionnaire is field tested.

Given this situation and the uncertainty regarding what is feasible and what is not, survey designers should use the following procedure to reconcile the specific objectives of each module with any constraints on module size. First, designers should ask policymakers who specialize in a given topic to rank the policy issues in order of importance, so that the module can collect the data needed to analyze the most important policy issues despite the inevitable constraints.

Second, for each module, survey designers should match the policy issues raised by policymakers with the data required to analyze them, as laid out in each chapter of Parts 2 and 3. One way to do this is to choose the smallest version of each module that can address all of the relevant policy issues, and remove any questions in that module that are not needed to analyze these policy issues. If the module is still too long, questions needed only to address the least important policy issues are deleted.
This shorter module is checked again to see if it exceeds the provisional size limit. The general principle is that the most important policy questions are addressed first and additional issues are added until the module has reached the length that survey designers, in consultation with high-level policymakers, have set for it. Third, after this has been done for all modules, survey designers should prepare a list of issues they think can be covered by the survey and give this list to the high-level policymakers, who will decide whether they would like to change the amount of space allocated to each module. The survey designers should tell the policymakers about the tradeoffs involved, working with them to ensure that the issues policymakers deem most important are addressed. Ultimately, this process produces a list of modules to be included in the survey, the proposed length of each module to be included, and the specific objectives for the modules. This completes the second step of survey design. This step may need to be revisited later if results of the field test show that the questionnaires are too long or that there is room to expand the questionnaire.

Notes

The authors would like to express their gratitude to Jere Behrman, Lawrence Haddad, Courtney Harold, John Hoddinott, Alberto Martini, Raylynn Oliver, Kinnon Scott, and Salman Zaidi for comments on an earlier draft.

1. This book is designed to provide a thorough review of international experience. However, new experience and knowledge will continue to accumulate after the book has been published. Therefore, until a new book is written, any new international-level information is probably most easily obtained from international researchers.

2. If geographic areas, rather than households, are the unit of observation, it may be possible to merge data from different surveys.
However, this high level of aggregation yields less precise results, raises issues of aggregation bias, and generally requires surveys with very large sample sizes.

3. A variety of external sources have been used to fund past LSMS surveys. World Bank loans have partially financed several LSMS surveys. Grants from various bilateral development agencies (especially from the United States, Scandinavian countries, and Japan) and multilateral development agencies (particularly the United Nations Development Programme and the United Nations Children’s Fund) have wholly or partially financed a large share of LSMS surveys. In a few cases, grants from the World Bank research budget have supported LSMS surveys. Similar surveys, such as the World Bank’s SDA surveys, RAND’s Family Life Surveys, and a few other surveys in Africa, all receive a large share of their funding from external sources.

4. Most previous LSMS surveys have used two-stage sample designs. If a three-stage sample design is used, ID codes will be needed that identify both the primary and secondary sampling units of each household. An analogous comment applies to surveys that use four or more stages in their sample designs.

5. In large countries with federal systems, surveys can be performed for individual states. Such surveys usually have the same general purposes as national surveys, and have samples that are representative of the whole state.

6. Each module in the household questionnaire should also be included in the community questionnaire. See Chapter 13 for further discussion of the community questionnaire.

7. While all versions of the household enterprise module collect income information, only the standard and expanded versions of the agriculture module collect sufficient data for use in the measurement of total income.

8.
A full LSMS-type survey could accommodate two and possibly three expanded versions of modules; a scaled-down survey could accommodate at most one. Volume 3 presents expanded versions of the following modules: roster, education, health, employment, migration, environment, household enterprises, and agriculture. The time-use modules introduced in Chapter 22 should also be treated as expanded modules, and the same is even more true for the full set of environmental modules introduced in Chapter 14.

References

Grosh, Margaret, and Juan Muñoz. 1996. A Manual for Planning and Implementing the Living Standards Measurement Study Survey. Living Standards Measurement Study Working Paper 126. Washington, D.C.: World Bank.

3 Designing Modules and Assembling Them into Survey Questionnaires

Margaret Grosh, Paul Glewwe, and Juan Muñoz

Chapter 2 outlined the five-step process that survey designers should follow to design LSMS and similar multitopic surveys. It also provided detailed recommendations on how to undertake the first two steps, which are deciding on the overall design of the survey and deciding which modules to include in the survey questionnaire. This chapter discusses the last three steps of the five-step survey design process. The first section of this chapter describes the third step: drafting each module, question by question, to ensure that it will collect the data necessary to meet the module’s objectives (which were laid out in the second step). The second section guides survey designers through the fourth step: coordinating the different modules and combining them to create a consistent and comprehensive set of questionnaires. The third section explains the procedures for the last step: translating the questionnaires into local languages and conducting a field test. The fourth section discusses the formatting of the questionnaires, which is an extremely important but often neglected aspect of designing successful multitopic surveys.
Survey designers should refer to the material contained in the fourth section many times during the last three steps of the survey design process.

In practice, the survey design process rarely moves smoothly and sequentially from one step to the next. Instead, survey designers often find themselves moving backward and forward among the various steps. For example, if designers encounter difficulties when drafting a specific module, they may need to reconsider and modify their original objectives for that module. Developing survey questionnaires is an iterative process, and survey designers should expect to go through at least three or four drafts of each module. It is not unusual for the different versions of the drafts to add up to a stack of paper one foot (30 centimeters) high. Each major redraft of a module or questionnaire should be reviewed by all interested parties, not only the people involved in carrying out the fieldwork (the data producers) but also policymakers (who will make decisions based on the data), members of the research community (who will analyze the data), and the staff of any agencies financing or providing technical assistance to the survey. Eventually, what should emerge from the process is a well-designed set of questionnaires for a multitopic household survey.

Producing Draft Modules

The third step in survey design, producing draft modules for the household and community questionnaires, is one of the most time-consuming steps in the process. Detailed guidance on this step is provided in the chapters in Parts 2 and 3, so the discussion here will be general and relatively brief. Once the objectives for each module are finalized (at least tentatively), survey designers can begin to develop detailed draft modules for the household and community questionnaires.
Survey designers should use the draft “prototype” modules introduced by the chapters in Parts 2 and 3 (and presented in Volume 3) as their starting point. As explained in Chapter 2, survey designers will already have decided on the policy and analytical objectives of each module. They should now choose the shortest versions of the modules that will allow for analysis of the most important of these objectives; any questions not relevant to these objectives should be removed. If the resulting module is still too long, survey designers should remove any questions that are needed only for the analysis of the least important of the policy issues. This process should continue until the module meets the length constraint. In some cases the module may be shorter than expected, in which case a policy issue and its accompanying questions can be added. The general principle is that the most important policy issues should be addressed first, and additional ones should be included only if space allows.

This approach is a good start, but much more remains to be done. For some modules the information and guidance given in the relevant chapter in Parts 2 and 3 of the book may be incomplete. For example, the chapter may not address certain policy issues that are important in a given country or setting, in which case the designers of that survey will need to develop new modules or submodules. Even in these cases the information provided in the relevant chapter is usually a good base for developing such modules. However, if the designers intend to implement major innovations in their survey, they should seriously consider adding to the survey team a specialist with the relevant experience in both data collection and data analysis.

Once each draft module has been written out in its entirety, the next task is to verify that the design of each module reflects the economic and institutional structures of the country in question.
For example, the designers need to check whether common living arrangements are reflected in the definition of the household used in the household roster and in the housing and interhousehold transfer modules. They must also review all questions and response codes and, if necessary, modify them to reflect local institutions and terminology. For example, the transfers and other nonlabor income module discussed in Chapter 11 must explicitly refer to each public transfer program by name. The consumption module will need even more work; in particular, the lists of items selected must closely reflect items consumed in the country. The agricultural module will need careful attention, as this module must reflect the country’s landholding and cropping patterns. For many of the modules, survey designers may find it useful to collect some preliminary data using qualitative techniques, which may help them determine how best to design these modules to collect quantitative data. Chapter 25 provides a detailed discussion on how to collect qualitative data. Such data can be particularly useful in countries where successful quantitative surveys have never been done for the topic to be studied. A final general issue to consider when drafting modules is the role played by the fieldwork schedule. A prototypical full LSMS survey spreads fieldwork evenly over a 12-month period, for two reasons. First, this makes it possible to study or average out any seasonality effects. Second, and more importantly, surveys with this fieldwork schedule require a smaller number of survey field teams than do surveys that compress the fieldwork into a shorter period of time. This smaller number of teams reduces costs and allows for improved quality control. All of the interviewers can be trained together and thus to a uniform standard; in addition, the cost of training interviewers—which takes about four weeks—will be proportionately cheaper. 
Each interviewer will complete more interviews and thereby gain more experience. Finally, fewer computers and vehicles will be required.

Despite these advantages of a year-long survey period, many past LSMS surveys have compressed fieldwork into a period of just two or three months. This has often been done when there was pressure on the survey team to collect data for analysis as quickly as possible. In other cases interviewers may have been available for only a short period of time, or the organization funding the survey may have required that the project be completed in a relatively short amount of time. The fieldwork schedule can also be modified to accommodate analysis of certain topics. For example, analysis of some agricultural issues may require interviewers to make two or more visits to each household at different times during the year. Variations in the fieldwork plan may require changing the wording of some modules. This means that survey designers should ensure that the design of each module in the questionnaire is consistent with the fieldwork plan. When a survey is conducted over a relatively short period, such as a few weeks or months, careful attention must be given to the wording of questions concerning events that are seasonal in nature. Will school be out of session for a large portion of the survey period? If so, the education module may need to be changed to reflect this. In particular, questions referring to school activities during the previous week, such as the number of days that a child attended school or the number of hours of homework done by the child, would clearly be inapplicable. Also, questions about water supply during both wet and dry seasons should be reviewed to ensure that they reflect the circumstances of these seasons. The largest seasonal changes may need to be made to the agricultural module.
A detailed discussion of the implications of seasonality for that module is provided in Chapter 19. More substantial changes will be required if the household is to be visited more than once at different times of the year. In such cases it may be desirable to have the interviewer administer modules for which the answers are expected to vary by season (such as the consumption, agriculture, water, or time use modules) each time he or she visits the household. In contrast, the modules that are unlikely to be affected by seasonality, such as housing, education, fertility, or migration, probably need to be administered only once. Any modules that are to be administered more than once usually need to be modified, particularly with respect to their recall periods. For example, if the interviewer makes two household visits six months apart, the consumption module should be administered in both visits and should have a recall period of six months rather than one year. Also, the water module should ask only about the particular season (wet or dry) during which the interview is to be conducted.

The guidelines given in this chapter are general, since very detailed information is provided in Parts 2 and 3 of this book. Other information on adapting LSMS questionnaires to fit local circumstances can be found in Oliver (1997), which focuses on survey design in the countries of the former Soviet Union, as well as in Ainsworth and van der Gaag (1988). A good general reference publication for developing and designing household survey questionnaires is United Nations (1985). More recent general references are in Babbie (1990), Fink (1995), and Fowler (1993); although these books focus more on developed countries, much of the material they contain is also relevant for developing countries. A final point to bear in mind is that a good deal of attention must be given to correct and consistent formatting.
This is described in great detail in the fourth section of this chapter; survey designers should read that section very carefully before they begin designing any survey modules.

Integrating and Combining Modules to Create Complete Questionnaires

Once draft versions of each of the individual modules have been written, these drafts must be combined to form complete household and community questionnaires. Merely stapling the various modules together will not produce a well-designed questionnaire; much more work has to be done to ensure that the different modules fit well together. This section describes how to do this important task. It focuses primarily on making the modules of the household questionnaire consistent with each other. Similar, though less difficult, issues arise when integrating the modules of the community questionnaire; in most cases the approach to take for the community questionnaire can be inferred from the discussion of the household questionnaire. This section will also highlight particularly important points to consider when combining the household, community, and price questionnaires to form a comprehensive household survey.

Gaps and Overlaps

Survey designers must scrutinize and compare the different questionnaire modules for gaps and overlaps in the information that the modules collect. Analysts often need to combine data from different modules in the household questionnaire. Perhaps the most important example of this is the calculation of each household’s total consumption, which requires information not only from the consumption module but also from the education, health, employment, and housing modules, and from the water, sanitation, or fuel modules (see Chapter 14) if they are included as separate modules in the questionnaire (as opposed to using the housing module to collect information on these topics).
Likewise, income data are collected in the employment, agriculture, household enterprise, and miscellaneous income modules. It is important to check that a questionnaire includes the data needed to construct these and other complex variables.

Another example of this general issue is that survey designers often have a choice regarding the module in which to collect some kinds of information. For example, data on expenditures on fuel for cooking and heating can be collected in the consumption module, the housing module or, if it exists, the expanded fuel module. Questions on child immunization can be placed in the fertility module, the health module, or the anthropometry module. An argument can be made for choosing any of these options (see the pertinent chapters in Parts 2 and 3 of this book), but the essential point is to ensure that the information is collected at least once, and is collected twice only if there is a reason to do so.1 Appendix 3.1 provides a list of the most common types of gaps and overlaps to check.

In cases in which information could plausibly be collected in more than one part of the questionnaire, there may be no absolute right or wrong place to collect it. Rather, survey planners must take into account who the respondent is in each module, how well the best recall period for that information matches the recall periods of modules in which it might be collected, at what point in the interview the respondent might discuss the topic most naturally, and whether the topic is a sensitive one that should therefore be addressed near the end of the questionnaire (for reasons discussed further below).

The survey designers should also examine any overlaps among the household, community, and price questionnaires.
In general, the community and price questionnaires should collect information on any topic that varies only slightly from household to household within the primary sampling unit.2 While much of the information collected in the community and price questionnaires could be collected in the household questionnaire, it is better to collect it in the community questionnaire in order to shorten the length of each household’s interview. Collecting this information in the community questionnaire is also more efficient; why collect it for all households in a primary sampling unit (often 16 or 20 households) when it need be asked only once in the community questionnaire?

Some simple examples illustrate this point. The expanded water module contains questions about the price and quality of water from different potential water sources. If the primary sampling units are geographically compact, all of the households in each primary sampling unit are likely to have the same alternative water sources, implying that the water price and quality questions can be put in the community questionnaire (which should be administered in each primary sampling unit) rather than in the household questionnaire. On the other hand, if the primary sampling unit is not compact so that the households are widely dispersed, it is likely that some households will be nearest to, say, a particular spring or well while other households will be closer to other springs or wells. In such cases these questions about alternative water sources should remain in the household questionnaire.

Another example concerns the distance to schools and health facilities. In a compact primary sampling unit, the distance to the nearest school or health facility probably varies little among the households in the primary sampling unit. This means that information on the distances to schools and health facilities can be collected in the community questionnaire as opposed to the household questionnaire.
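For analysts, the practical consequence of moving a question to the community questionnaire is that its answers must later be attached to each household through the primary sampling unit code. The following is a minimal sketch of that step, not a prescribed procedure. The PSU codes, household numbers, and variable names (`km_to_school`, `km_to_clinic`) are hypothetical and chosen only for illustration.

```python
# Hypothetical data: community answers are recorded once per primary
# sampling unit (PSU); household records carry the PSU code they belong to.
community = {
    101: {"km_to_school": 1.5, "km_to_clinic": 4.0},
    102: {"km_to_school": 0.8, "km_to_clinic": 12.0},
}
households = [
    {"psu": 101, "hh": 1},
    {"psu": 101, "hh": 2},
    {"psu": 102, "hh": 1},
]


def attach_community_data(households, community):
    """Return household records extended with their PSU's community answers."""
    merged = []
    for record in households:
        psu_answers = community.get(record["psu"])
        if psu_answers is None:
            # A household without a matching community record signals an
            # ID coding problem that should be fixed, not silently ignored.
            raise KeyError(f"no community questionnaire for PSU {record['psu']}")
        merged.append({**record, **psu_answers})
    return merged


merged = attach_community_data(households, community)
for row in merged:
    print(row)
```

The sketch only works if every household record carries a reliable PSU code, which is why consistent ID codes across the household and community questionnaires matter.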
Length

The overall length of the household questionnaire must be manageable. In general, it is not feasible to include, say, the standard version of each module presented in Volume 3, even though past full LSMS surveys typically included 15 modules, many of which were similar to the standard versions in this book. There are several reasons why using all of the standard draft modules in this book is not feasible. First, this book introduces several new modules, including the time use module and several environmental modules. Second, some of the standard draft modules, such as those on health, migration, and household enterprises, are much longer than the modules on those topics that were used in previous LSMS surveys. Finally, in some of the chapters in this book (including Chapter 18 on household enterprises and Chapter 19 on agriculture) it is argued that collecting more detailed data will greatly increase their value for analytical purposes. Thus survey designers should not combine the standard versions of all of the modules presented in the book into a single household questionnaire. Instead, the short versions should be used for some modules, and in almost all cases at least one or two modules should be dropped.

Assessing whether a draft questionnaire is too long is not simply a matter of counting the pages or questions in it, since many questions, and sometimes even entire pages or modules, will apply only to some households. Moreover, in some cases adding questions does not lengthen the interview time because the respondent cannot avoid going through the thought process made explicit in these questions, which implies that a supposedly abbreviated set of questions will not reduce the time required to complete the interview. An example of this is the calculation of income derived from agricultural activities.
There are also several ways to implement long questionnaires that minimize the time required by (and the fatigue induced in) each survey respondent. These include conducting individual “mini-interviews” with each household member to collect all of the information needed from that individual at one time (which allows him or her to leave when questions are being asked of other household members); using the best-informed respondent for each household module; and dividing the interview into multiple visits (for example, going through all the individual-specific modules in one visit and returning on a different day to conduct the consumption module and other household-level modules). LSMS surveys use all of these techniques. Still, there is a limit to the amount of information that can be gathered from a single household.

How can survey designers determine whether a household questionnaire is too long? A rough idea of the effective length of the questionnaire in different circumstances can be obtained by calculating how many households will go through the different paths created by the skip patterns and how many questions will be asked for each possible path. An excellent example of this is provided in Chapter 18 on household enterprises, in Table 18.5. A more precise estimate of the time required to administer a household questionnaire can be obtained when similar surveys have already been done in the country or region studied. In this case, the designers of the new survey will be able to find out how long the interviews took in the previous survey, provided that the earlier survey collected metadata along the lines suggested in Chapter 4. If such information is not available, survey designers will need to rely on the field test, which is discussed in detail in the fourth section of this chapter. If field test interviews require many hours to complete and exhaust the cooperation and patience of households, this is an indication that the questionnaire is too long.
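The skip-pattern calculation described above can be sketched as a weighted average over paths. The shares and question counts below are invented for illustration and are not taken from Table 18.5; the function name `expected_questions` is likewise an assumption of this sketch.

```python
def expected_questions(paths):
    """Average number of questions asked per household.

    `paths` is a list of (share_of_households, questions_asked) pairs,
    one per route through the module's skip pattern.
    """
    total_share = sum(share for share, _ in paths)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("household shares must sum to 1")
    return sum(share * questions for share, questions in paths)


# Hypothetical paths through a household enterprise module.
paths = [
    (0.60, 3),   # no enterprise: screening questions only
    (0.30, 45),  # one enterprise
    (0.10, 85),  # two or more enterprises
]

mean_questions = expected_questions(paths)
longest_path = max(questions for _, questions in paths)

print(round(mean_questions, 1))  # 23.8
print(longest_path)              # 85
```

Reporting the longest path alongside the average is a deliberate choice: the average describes the typical burden, while the longest path shows what the most heavily affected households would face.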
At the same time, survey designers should realize that field test interviews normally require much more time than do similar interviews during an actual survey, because interviewers have little training or experience with the questionnaire at the time of the field test. In addition, the questionnaire used in the field test is not a final draft and thus is likely to contain some problems that will slow down the interviews. A handy rule of thumb is that interviews in the actual survey take only about half of the time that they take in the field test, and sometimes even less than that.

A general goal to aim for in the actual survey is that any given respondent should not be interviewed for more than one hour on a given day. Of course, people’s tolerance for being interviewed will vary from country to country, and this general guideline must be adapted to suit local conditions. Experience in LSMS surveys to date suggests that people’s tolerance for long interviews is lower in urban areas than in rural areas, lower among wealthy households than among poor ones, and lower in wealthier countries than in poorer ones.

Recall Periods

The recall periods proposed for each module introduced in this book are mostly those that the authors have deemed appropriate for that particular module. This can be a problem when analysts want to combine or compare data from several modules. For example, in many LSMS surveys the employment module uses a one-week recall period. Since most adults work, this yields a large number of observations, and the period of time is short enough to yield accurate answers to such basic questions as the number of hours worked and the payments received during this recall period. In contrast, the health module uses a four-week recall period.
This relatively long recall period is used because most people are not ill in any given week. The four-week recall period allows more observations of illness for a given sample size than would be obtained using a one-week recall period. Since illnesses are important events, respondents can be expected to remember many details of their episodes of illness during the past four weeks. However, if an analyst wants to study the impact of illness on earnings or work effort, these different recall periods will complicate the analysis. The analyst cannot tell whether the illness took place before or during the period for which the earnings and hours data were collected. This could be resolved either by adding questions to the health module to specify the days during the recall period on which the respondent was ill or by making the recall periods coincide, perhaps with a compromise of two weeks for both modules (bearing in mind the disadvantages in sector-specific analyses of using a recall period different from the “ideal” one for that module). Part of the job of integrating the draft modules is to determine and judge the tradeoffs being made, either confirming that they are acceptable or altering them until a more appealing tradeoff is reached.

Nomenclature and Coding Schemes

The questionnaire should be reviewed to check that wherever similar questions are asked, the nomenclature and coding schemes are the same. This should reduce coding errors and simplify data analysis. For example, many different modules allow the respondent to choose the time unit (for example, hour, day, week, or month) that they find most convenient when responding to questions regarding time or payments over time (such as wage rates, the length of time spent gathering firewood, and the length of time covered by a payment for water).
The code numbers for these time units should be the same throughout the entire questionnaire; in the draft modules presented in this book “day” is always coded as “3,” “week” is always coded as “4,” and so on for other units of time. Another example concerns the migration module, the transfers received page of the transfers and other nonlabor income module, and the transfer payments page of the consumption module. All have questions about where the migrant, donor, or recipient lives. The coding scheme that categorizes this information, whether it is the type of place (capital city, other urban area, rural area, or overseas) or the name of the place, should be uniform. Likewise, several modules include questions about the relationship between two individuals. It is usually a good idea for these questions to use the same codes that are used in the household roster module to indicate the relationship of each household member to the head of the household.

A particularly important task is to coordinate the coding of items in the consumption expenditure module with items in the price questionnaire. As explained in Chapter 2, price data are needed to generate regional and temporal price indices that enable comparison of real expenditures of households interviewed in different places and at different times. This is done by matching the prices collected in the price questionnaire with the consumption expenditure information gathered in the consumption module. If the items are not well matched, this task becomes more difficult, and the resulting price indices will be less accurate. In general, the goal should be a one-to-one correspondence between the items listed in the consumption module and the prices collected in the price questionnaire. For example, if questions are asked on two or three varieties of rice or wheat in the consumption module, a price for each variety should be collected in the price questionnaire.
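A simple way to check the one-to-one correspondence between the two item lists is to compare them as sets and report whatever appears in only one of them. The sketch below assumes that both questionnaires identify items by a shared code or label; the item names and the function `unmatched_items` are hypothetical.

```python
def unmatched_items(consumption_items, price_items):
    """Return (items lacking a price, priced items absent from consumption)."""
    consumption = set(consumption_items)
    prices = set(price_items)
    return sorted(consumption - prices), sorted(prices - consumption)


# Hypothetical item lists from the two draft questionnaires.
consumption_items = ["rice_local", "rice_imported", "wheat_flour", "cooking_oil"]
price_items = ["rice_local", "wheat_flour", "cooking_oil", "sugar"]

missing_price, missing_consumption = unmatched_items(consumption_items, price_items)

print(missing_price)        # ['rice_imported']
print(missing_consumption)  # ['sugar']
```

Both returned lists should be empty for food items; for nonfood items a nonempty list may be acceptable, since prices cannot be collected for every good.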
This should be relatively simple to do for almost all food items. Nonfood items are more difficult. It is usually not possible to obtain prices for durable goods because they often come in many varieties (for example, there are many kinds of bicycles or televisions). However, for nondurable items, prices can be obtained for well-defined examples. For example, there are many kinds of shirts, but if a specific widely purchased type of shirt can be defined, data on that type of shirt can be collected in the price questionnaire and used as an indicator of prices for all kinds of shirts. See Chapter 13 for a detailed discussion of the price questionnaire, including a list of suggested food and nonfood items to include in it.

Choosing the Order of the Modules in the Household Questionnaire

A final and very important question to address is the order of the modules in the household questionnaire.3 It is natural and convenient to arrange the modules in the order that they will be administered, so the key issue here is the order in which the modules will be administered and how this affects the physical design of the questionnaire.

To put this issue in context, consider the traditional fieldwork plan for a full LSMS survey. Each field team works in its assigned primary sampling units (communities) twice. The first time a team arrives in a primary sampling unit, it works there for about one week. The first half of the questionnaire, most of which usually consists of the individual-specific modules, is completed for each household. In addition, a short module is administered that asks which household members are best able to answer questions concerning the specific household-level modules (agriculture, household enterprises, consumption, and savings) that will be filled out when the team returns to the primary sampling unit about two weeks later. Figure 3.1 provides an example of such a module.
The field team works in a different primary sampling unit during the following week, while the data in the half-completed questionnaires from the first primary sampling unit are entered into a computer by a data entry operator (who does not travel with the team) using a data entry program. The data entry program checks the first half of the questionnaires for a wide range of errors and inconsistencies. (This is discussed more fully in Grosh and Muñoz 1996.) The team returns to the first primary sampling unit in the third week, administers the rest of the questionnaire (which mainly consists of household-level modules), and resolves any problems or inconsistencies found by the data entry program when the data from the first half of the questionnaire were entered.

In several recent LSMS surveys, two different procedures have been used in the fieldwork stage. One procedure is that the data entry operator travels with the field team. This option has become feasible with the advent of small laptop computers that can be powered by batteries, vehicle cigarette lighters, or solar panels. This allows the whole questionnaire to be filled out and checked using the data entry program during a single trip to the primary sampling unit. In addition, the second half of the questionnaire can be checked by the data entry program almost immediately, so that interviewers can return to the sampled households to resolve any problems detected by the program. The other procedure, used when a scaled-down LSMS survey is being implemented, is to complete all of the interviews in a single trip to the primary sampling unit and sometimes even in single visits to each household. This procedure will have a serious disadvantage if the data entry operator does not travel with the team, because none of the data can be checked in time to return to the households to resolve problems detected by the data entry program.
If the data entry operator travels with the team, there is little difference between this procedure and the former procedure, except that a full LSMS questionnaire will require more visits to each household during the sole trip to the primary sampling unit.

Given these different possible fieldwork plans, there are several basic principles about how to order the modules in the household questionnaire. The first principle is that any modules on topics that respondents might consider sensitive should be put at the end of the questionnaire. This gives the interviewer time to develop a rapport with the household members, which should increase the probability that they will answer questions on sensitive issues, and do so truthfully. It also means that if the respondent breaks off the interview in response to a sensitive question, only the data from that last module or modules are lost. Finally, by this point in the interview, any interested onlookers, such as family members and neighbors, may have wandered away, making it possible to administer the more sensitive portions of the questionnaire with greater privacy. Education, housing, migration, and in some cases health4 are usually good topics with which to open the interview, because people generally do not mind talking about these topics. In contrast, fertility, savings, credit, and transfers and other nonlabor income are among the most sensitive topics in the household questionnaire.

A second principle concerns bounded recall periods. In past LSMS surveys in which the interviewer made two visits two weeks apart to each household, some parts of the questionnaire used bounded recall periods; in other words, questions were asked such as “How much has your household spent on rice since my last visit?” As explained in Chapter 5, using bounded recall periods can increase the accuracy of the respondents’ answers.
Obviously, if bounded recall periods are used in certain modules, these modules must be administered in a second visit to the household and thus be included in the second half of the questionnaire. The two modules in Volume 3 that explicitly use bounded recall periods are those on consumption (Chapter 5) and household enterprises (Chapter 18).5

A third consideration is the selection of respondents. As explained above, several modules (including the consumption, agriculture, household enterprises, savings, housing, and environmental modules) collect much or all of their data at the household level, which means that the questions are answered by the household member most knowledgeable about that topic. With the exception of the housing module, these modules are quite lengthy. Thus for each of these modules it is usually best for the interviewer to ask which member would be the most appropriate respondent during the first visit to the household, and then make an appointment to interview that person at a later, more convenient time. In the traditional two-visit fieldwork plan, this implies that these modules, except perhaps housing, should be administered during the second visit and thus should be located in the second half of the questionnaire. However, if the team travels only once to the primary sampling unit, it is still feasible to make appointments for later in the day or for another day, which gives survey designers more flexibility in deciding where to place these modules in the questionnaire.

[Figure 3.1: Module for Choosing Respondents to Be Interviewed in the Second Half of the Questionnaire
RESPONDENT: THE PERSON BEST INFORMED OF THE ACTIVITIES OF THE HOUSEHOLD MEMBERS
FULL NAME OF THE RESPONDENT: ____ ID CODE: ____
1. Who shops for the food for your household? (NAME, ID CODE)
2. Who in your household knows most about the non-food expenses of the members of your household? (NAME, ID CODE)
3. Who in your household knows most about the miscellaneous income and transfers received from other households? (NAME, ID CODE)
4. Who in your household knows most about the savings in your household? (NAME, ID CODE)
5. During the past 12 months has any member of your household participated in agricultural production, forestry, or raising livestock? YES...1, NO...2 (→ 8)
6. Who is the person who knows most about all the agricultural and livestock activities of the members of your household? (NAME)
7. In addition to this person, who else in your household manages plots of land owned or rented in by the household? Who is responsible for plots that are rented out by the household? (NAME, ID CODE)
RESPONDENTS FOR SECOND HALF OF QUESTIONNAIRE (ID CODE)
8. Over the past 12 months, has anyone in your household operated any non-agricultural enterprise that produces goods or services (for example, artisan, metalworking, tailoring, repair work, and processing and selling your outputs from your own crops, if done regularly) or has anyone in your household owned a shop or operated a trading business? YES...1, NO...2 (→ NEXT MODULE)
9. What kind of enterprises does your household operate? PROBE TO DETERMINE INDUSTRIAL SECTOR IN WHICH ENTERPRISE OPERATES. (ENTERPRISE ID 1 to 5, FULL WRITTEN DESCRIPTION, CODE)
10. Who is most informed about and/or in charge of day-to-day operations of the enterprise? (NAME, ID CODE)]

A fourth principle relates to the logistics of data entry.
The individual-level modules include many more questions for which strict range and consistency checks can be built into the data entry program than do the modules on consumption, agriculture, and household enterprises.6 If the whole questionnaire is completed using the traditional LSMS fieldwork plan (two visits two weeks apart), all the individual-specific modules except the credit module should be administered during the first interview. (The credit module is probably too sensitive to be administered during the first interview.) This will allow the survey team to enter the data from these modules and to detect any apparent errors or inconsistencies that could then be resolved in the second interview. If the data entry operator travels around with the interview team, the data from the interviews can be checked in a matter of hours; thus where these modules appear in the order of the household questionnaire becomes less important for the purposes of data entry.

Given these principles, and some common sense, more specific advice can be given. Each household questionnaire should have the metadata module at the very front, since much of the information that module collects (such as whether the interviewer successfully located the household, the date of interview, and the language in which the interview was conducted) becomes apparent at the very beginning of the interview. The next module should be the household roster; this must be completed before any other module because it determines who is and who is not a household member, and thus determines the people to whom all the other modules will apply. However, at least one page of the household roster, the one with the names of all the household members, is usually placed further back in the questionnaire so that the names on that page can be seen during the administration of all individual-level modules. Thus the physical placement of this page will not reflect the time during the interview when it is filled in.
(For details see the discussion on the fold-out roster page in the fourth section of this chapter.)

After the household roster, it is useful to fill out the form on selecting household respondents shown in Figure 3.1; this form can be administered to the same person who answered the household roster questions (usually the head of household or the person most knowledgeable about other household members). It is useful to collect this information early because it can be used to save household members’ time by interviewing them sequentially using “mini-interviews.” That is, after the interviewer has administered the form that identifies the relevant respondents for the household-level modules, he or she should administer all of the modules that are clearly individual-specific (except the credit and fertility modules) to each household member, finishing all such modules with one member before interviewing another member. These are the education, health, employment, migration, and time use modules. Some household members will not need to be interviewed further, and thus their mini-interviews will consist of the interviewer administering only these modules. In contrast, other household members will also be the respondents for some of the household-level modules. For example, the respondent for the housing module can also provide answers for the questions in that module as part of his or her mini-interview. Using this method, the interviewer can obtain all of the information needed from each individual in a way that minimizes the use of respondents’ time; once a respondent finishes the mini-interview he or she can leave or start some activity without further interruption.

Within this group of individual-level modules, those on education and migration should be administered first since the information they collect is not very sensitive.
Some employment information can be sensitive, particularly questions concerning wages, so this should be one of the last of the individual-level modules to be completed, if not the very last. If the short health module is used, it can be put near the front. However, if the standard or expanded version is used, it should be placed toward the end of the individual-specific modules because of the sensitive nature of some of the questions in this version (see endnote 4).

Which modules should go near the end of the questionnaire? Because the three most sensitive modules are those that collect information on savings, credit, and transfers and other nonlabor income, these three modules should probably be put at the end of the questionnaire. Another potentially sensitive topic is fertility. In countries in which fertility is particularly sensitive, it should come immediately before the savings, credit, and transfers and other nonlabor income modules.

Where should the other modules go? If the traditional “two visits two weeks apart” interview system is used, the consumption and household enterprise modules should be in the second half of the questionnaire since these modules often use a bounded recall period, namely the time since the interviewer’s previous visit. Modules that are long and also need to be administered to specific respondents (the consumption, agriculture, household enterprise, and environment modules) should also be completed in the second visit. Finally, as discussed above, the housing module can be administered in the first interview because it is unlikely to contain any sensitive questions. If the two-visit system is not used, these modules can be put anywhere between the individual-specific modules and the more sensitive modules.
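The ordering advice above can be summarized as a ranking of modules into tiers. The following Python sketch is one illustrative reading of that advice, not an official LSMS rule: the tier numbers, the names `TIER`, `order_modules`, and `split_visits`, and the cut-off for the second visit are all assumptions. It also assumes that the standard health module is used and that fertility is treated as particularly sensitive.

```python
# Tier numbers are an illustrative reading of the ordering principles, not an official rule.
# Assumptions: the standard health module is used (so it follows education and migration),
# and fertility is treated as particularly sensitive.
TIER = {
    "metadata": 0,
    "household roster": 1,
    "respondent selection form": 2,  # the form shown in Figure 3.1
    "education": 3,
    "migration": 3,
    "housing": 3,
    "time use": 4,
    "health": 4,
    "employment": 5,
    "consumption": 6,
    "agriculture": 6,
    "household enterprises": 6,
    "environment": 6,
    "fertility": 7,
    "savings": 8,
    "credit": 8,
    "transfers and other nonlabor income": 8,
}
SECOND_VISIT_TIER = 6  # tiers at or above this go in the second half (two-visit plan)


def order_modules(modules):
    """Sort the chosen modules by tier, keeping input order within a tier."""
    unknown = [m for m in modules if m not in TIER]
    if unknown:
        raise ValueError(f"No tier assigned to: {unknown}")
    # sorted() is stable, so modules in the same tier keep their input order.
    return sorted(modules, key=TIER.__getitem__)


def split_visits(ordered):
    """Split an ordered module list into first-visit and second-visit halves."""
    first = [m for m in ordered if TIER[m] < SECOND_VISIT_TIER]
    second = [m for m in ordered if TIER[m] >= SECOND_VISIT_TIER]
    return first, second


chosen = ["credit", "consumption", "education", "household roster", "employment",
          "housing", "metadata", "savings", "respondent selection form", "fertility"]
ordered = order_modules(chosen)
first_half, second_half = split_visits(ordered)
print("First visit: ", first_half)
print("Second visit:", second_half)
```

When the two-visit system is not used, only `order_modules` is relevant, and the tiers for the household-level modules could be rearranged freely between the individual-specific and the sensitive modules.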
Finally, the goal of saving respondents’ time by conducting mini-interviews with each respondent (who can leave after his or her mini-interview is finished) is complicated by the fact that the household enterprise, agriculture, miscellaneous income, and credit modules consist of a mixture of household-level and individual-level questions. For example, in the standard and expanded versions of the agriculture module, individual household members are asked whether they have worked on specific plots of land. However, these questions cannot be asked until several other questions have been asked about the different plots of land owned and rented in by the household members, and such questions would be awkward to ask in a form as simple as the one shown in Figure 3.1. The best way to resolve this problem will depend on which modules and which versions of these modules are included in the household questionnaire, so it is difficult to provide general advice. However, one way to reduce the time burden on household members is to identify all of the people who still need to be interviewed after the “mini-interviews” are completed, as this will allow people to leave if they do not need to be interviewed further. Continuing the agriculture example, note that the form in Figure 3.1 identifies all of the household members who either manage or work on a plot of land. Household members who do not fit this description and who are not needed to complete any other household-level module can leave after their “mini-interview” is finished.

This completes discussion of the fourth step of integrating the draft modules and combining them into a complete set of questionnaires. The primary focus has been on the household questionnaire, since the community questionnaire is much smaller. (See Chapter 13 for a detailed discussion of the community questionnaire.)
Designers of prospective surveys can consult the questionnaires used in previous LSMS surveys by downloading them from the LSMS website, http://worldbank.org/lsms/lsmshome.html.

Translating and Field Testing the Draft Questionnaires

After the draft modules have been combined into a complete set of household, community, and price questionnaires, they need to be translated and field tested.7 The field test is particularly important because it is the last check on the design of the questionnaires before the survey is implemented.

Translation

It may be necessary to translate the questionnaires for three reasons, each of which has different implications for the design of the survey. The most common and most important reason is that respondents may speak a range of different languages. In many countries more than one language is spoken. In these countries quality control requires that a separate questionnaire be produced for each of the major languages spoken in the country, with every question written out verbatim. Scott and others (1988) demonstrated how this procedure greatly increases the accuracy of the data collected. They conducted an experiment designed to measure interviewer errors when the interviewer had to translate each question during the interview. For example, the interviewer may have had to use a questionnaire written in English to conduct an interview in Tagalog or Cebuano or a questionnaire written in French to conduct an interview in Baoule or Dioula. The interviewers’ error rates were two to four times higher when they translated questions during the interview than when they used questionnaires already written in the languages used by the respondents.
While the final versions of the questionnaires must be translated from the national (official) language to produce verbatim questionnaires in the other languages used in the country, the preliminary drafts of the questionnaire can be developed using only the national language. Ideally, the version of the questionnaire to be used for the field test should be translated into each of the languages that will have a final written version of the questionnaire. In practice, field tests are often done using only oral translations of the national language version of the questionnaire. Thus the wording in the local language interviews during the field test may not correspond exactly to the wording that will be used in the written translations of the final questionnaire. While this is an imperfect way to proceed, it is often a reasonable tradeoff given the high costs, in both time and in money, of field testing the questionnaires in each language.

After the final version of the household questionnaire has been translated into another language, the translation needs to be carefully checked. The best way to do this is to use “back translation.” That is, after the questionnaire is translated from the language in which it was developed into the languages in which it will be administered, someone should translate the versions in those languages back into the original language. After this “back translation” has been accomplished, the two versions in the first language should be checked. Where there is a discrepancy in wording or meaning between the two versions, the translation should be carefully checked. A person or group of people familiar with the purpose of the questions should do the first translation. The back translation should be done by someone who was not intimately involved in designing the questionnaire. Any ambiguities and errors must be noted and corrected in the translated version rather than being “fixed” only in the back translation version.
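When the original and back-translated questions are available electronically, a simple comparison can help decide which questions to review first. The sketch below is an assumption-laden triage aid, not a substitute for the careful human check described above: it compares the two versions word by word with Python's standard `difflib.SequenceMatcher`, and the threshold of 0.9, the question identifiers, and the question wording are all hypothetical. A high score does not guarantee that the meaning is preserved, and a low score may reflect harmless rewording.

```python
import re
from difflib import SequenceMatcher

THRESHOLD = 0.9  # hypothetical cut-off; it would need tuning on real questionnaires


def words(text):
    """Lowercase the text, drop punctuation, and split it into words."""
    return re.sub(r"[^\w\s]", "", text.lower()).split()


def flag_discrepancies(original, back_translated, threshold=THRESHOLD):
    """Return (question id, similarity) for questions whose back translation drifts."""
    flagged = []
    for qid, source_text in original.items():
        matcher = SequenceMatcher(None, words(source_text), words(back_translated[qid]))
        ratio = matcher.ratio()
        if ratio < threshold:
            flagged.append((qid, round(ratio, 2)))
    return flagged


# Hypothetical question wording, for illustration only.
original = {
    "Q1": "What is your age?",
    "Q2": "Have you been ill in the past four weeks?",
}
back_translated = {
    "Q1": "what is your age",
    "Q2": "Have you been to the doctor in the past four weeks?",
}

flagged = flag_discrepancies(original, back_translated)
print(flagged)
```

Any question flagged in this way should be corrected in the translated questionnaire itself, in line with the rule that errors must not be “fixed” only in the back translation.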
Most previous LSMS questionnaires were printed only in the national languages of the countries studied, so multilingual interviewers had to be employed to conduct interviews in the most commonly used local languages. Occasionally a few key questions or phrases were translated into the local languages and written down in the interviewers’ manual. In the case of the least common languages, local interpreters were used when none of the interviewers spoke the language. In this respect, while previous LSMS surveys have conformed to normal survey practice, they have not reached the cutting edge of quality control as defined by the World Fertility Surveys. The guidelines used in those surveys require that questionnaires be prepared in all languages used by more than 10 percent of the sample and that a minimum of 80 percent of the sample be interviewed using questionnaires written in the respondents’ native language. Future LSMS and similar multitopic surveys should make greater efforts to translate the household questionnaire into local languages.

When preparing these translations, the questionnaire should always be worded in the way that the language is commonly spoken, using relatively simple terms and avoiding academic or formal language. The gap between the spoken and written languages and the difficulty of striking a balance between simplicity and precision may be greater in local languages, especially ones that are not commonly used for reading and writing. The translators should therefore be especially careful to try to find an appropriate balance. Two examples illustrate the kind of problems that can occur. The question “¿Estuvo enferma en las últimas cuatro semanas?” literally asks, in Spanish, whether the respondent was sick in the past four weeks.
However, in spoken Spanish in Chile it could be understood as a polite euphemism for asking whether a woman has had a menstrual period in the last four weeks. An even more difficult problem in wording was revealed in the field test in Nepal. Apparently the most natural Nepali phrasing for “Have you been ill?” is closer to “Have you been to the doctor?” The change in meaning from what was intended appeared in the field test several times when respondents answered “No, I couldn’t afford to go,” clearly an inappropriate response to the question “Have you been ill?”

The second reason why the questionnaires may need to be translated is that sometimes the international experts working on the survey design team do not speak the national language well enough to design the questionnaires in that language. This happened in the case of the Vietnam LSMS questionnaires, which were developed jointly in English and Vietnamese. In contrast, the LSMS questionnaires used in Latin American countries have been drafted only in Spanish by teams of local and international experts who are fluent in that language. When translation is a necessary part of the development of the questionnaires, each draft of the questionnaire must be translated, which may require a substantial amount of money and can also increase the time needed for designing the questionnaires.

The third and final reason for translating questionnaires is to produce a questionnaire in one or more of the major international languages (English, Spanish, or French) in order to encourage the international research community to use these data in their policy analysis. Such translations need not be done until after the final questionnaire is developed, and back translations are not needed.
Field Testing

After draft versions of the household, community, and price questionnaires have been assembled and (if necessary) translated, they must be tested in the field. The field test is one of the most critical steps of the survey design process. The goal is to ensure that the questionnaires are capable of collecting the information that they are intended to collect. A field test should address the adequacy of the draft questionnaires at three levels:
• The Questionnaire as a Whole. Is the full range of required information collected? Is the information collected in different parts of the questionnaire consistent? Are any variables unintentionally double-counted?
• Individual Modules. Does the module collect the intended information? Have all major activities been accounted for? Are all major living arrangements, agricultural activities, and sources of in-kind and cash income accounted for? Are some questions missing? Are some questions redundant or irrelevant?
• Individual Questions. Is the wording clear? Do any questions allow for ambiguous responses? Are there multiple interpretations? Have all responses been anticipated and coded?

It is important for a field test to include households from all major socioeconomic groups. For example, a sample should include: rural and urban households; individuals employed in the formal sector, in the informal sector, and in agriculture; and farmers in each main agroecological region, in each production scheme (independent farming, renting, sharecropping, and cooperative farming), and so forth. The households should not be selected at random. Instead, different types should intentionally be included so that all of the various situations likely to be found during the survey are observed during the field test.
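Because the field test sample is chosen purposively rather than at random, it can be useful to confirm that every intended type of household has actually been included. The following Python sketch shows one way to do this, assuming each selected household is recorded with a few classifying attributes; the attribute names, the categories, and the function name `uncovered_categories` are hypothetical and would be replaced by the groups relevant to the country in question.

```python
# Categories the field test is meant to cover (illustrative, based on the examples above).
REQUIRED = {
    "area": {"urban", "rural"},
    "main_employment": {"formal", "informal", "agriculture"},
}

# Hypothetical households selected so far for the field test.
field_test_households = [
    {"id": 1, "area": "urban", "main_employment": "formal"},
    {"id": 2, "area": "urban", "main_employment": "informal"},
    {"id": 3, "area": "rural", "main_employment": "informal"},
]


def uncovered_categories(households, required):
    """Return, for each attribute, the required categories not yet represented."""
    gaps = {}
    for attribute, needed in required.items():
        seen = {household[attribute] for household in households}
        missing = needed - seen
        if missing:
            gaps[attribute] = sorted(missing)
    return gaps


gaps = uncovered_categories(field_test_households, REQUIRED)
print(gaps)
```

In this illustration no household whose main employment is agriculture has been selected, so the survey team would deliberately add such households before the field test begins.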
Experience with LSMS surveys has shown that field tests should be conducted using at least 100 households. To get enough responses for each module of the questionnaire, it may be necessary to visit additional households to conduct partial interviews in which only those modules that apply to a relatively small number of households are administered. For example, the original 100 households may not include enough pregnant women or people who have been ill in the month preceding the interview to determine whether the fertility and health modules, respectively, are well designed. In such cases survey designers should find additional households that contain pregnant women or ill people and have interviewers administer only the fertility or health module to those households.8

A field test usually takes about one month to complete: about one week for interviewer training, two to three weeks of fieldwork (interviewing), and one or two weeks to discuss the findings and finalize the questionnaires. More time is required if the final questionnaires are to be produced in more than one language, because each version of the questionnaire should be field tested.

While the full field test should cover 100 or more households, much can also be learned from preliminary smaller tests. A general rule of thumb is that about half of the problems will show up in the first 10 households interviewed. In one recent field test, international experts wrote six pages of comments about a single module after interviews were completed for only three households. Such small-scale preliminary field tests are often particularly appropriate for new or difficult modules. Yet survey designers must understand that these are precursors to a full-size field test of the whole questionnaire, and not a substitute.
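The need for supplementary partial interviews, discussed earlier in this section, can be tracked with a simple tally of how many field-test households each module actually applied to. The Python sketch below is illustrative only: the text does not specify how many responses per module are enough, so the minimums shown, like the household identifiers and the function name `module_shortfalls`, are hypothetical.

```python
from collections import Counter

# Hypothetical minimum number of administered interviews per module;
# the text gives no figure, so these values are placeholders.
MINIMUM = {"health": 3, "fertility": 2, "household enterprises": 2}

# Modules that turned out to apply to each field-test household (illustrative data).
applicable = {
    "HH01": {"health", "household enterprises"},
    "HH02": {"household enterprises"},
    "HH03": {"health"},
    "HH04": set(),
}


def module_shortfalls(applicable_by_household, minimum):
    """Return how many additional partial interviews each module still needs."""
    counts = Counter(
        module
        for modules in applicable_by_household.values()
        for module in modules
    )
    return {
        module: needed - counts[module]
        for module, needed in minimum.items()
        if counts[module] < needed
    }


shortfalls = module_shortfalls(applicable, MINIMUM)
print(shortfalls)
```

Each shortfall indicates how many additional households of the relevant type would need to be found and given only that module.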
The personnel involved in a field test should include the survey design team, a few experienced interviewers or field supervisors, and a few of the people consulted by the survey design team, including both policymakers and research analysts. It may also be helpful to include people with experience working on past LSMS or similar multitopic surveys. All of the participants should divide into a small number of teams, each of which includes at least one person with each kind of expertise.

There should only be a few teams involved in the field test, usually around three or four. Mechanisms should be set up to enable the teams to contact each other during the field test so that they can compare notes on the problems they encounter and the solutions they have tried. A good way to set up such mechanisms is to have all of the teams working together for the first few days, perhaps in the capital city. This means that the teams will be in contact with each other every evening during the period when the first and often biggest flaws in the draft questionnaire are uncovered. In some cases the team members can agree on modifications to the questionnaire during the field test itself, which allows these modifications to be field tested.

Each interview during the field test should include, at minimum, the respondent, the interviewer, and an analyst or senior survey specialist. During the field test it is acceptable for the analyst or survey specialist to interrupt the interview tactfully in order to refine the wording of a question or the responses coded for it. Of course, in the actual survey the interviews should be conducted in private, and the interviewers should adhere to the wording of the questionnaire.

The interviewers used in the field test should be drawn from the experienced staff of the statistical agency.
They should be good interviewers, familiar with basic interviewing practices and able to distinguish between problems caused by deficiencies in the questionnaire and problems caused by their lack of familiarity with the questionnaire. The interviewers’ training should focus on the purpose of the survey and the structure and format of the questionnaire. One week of training is usually sufficient, followed by two or three weeks of household interviews.

Survey planners should set aside 1–2 weeks immediately after the field test to review the field test results and debate how to modify the questionnaire in light of those results. The group involved in the field test should go through the questionnaires, module by module, and discuss any problems that arose. At this stage, the team should bear in mind that the length of time required for each interview will fall dramatically when the interviewers are well trained and have become familiar with the questionnaire. As mentioned above, the typical field test interview will be at least twice as long as the average interview in the actual survey.

The data from the field test should not be entered in the computer or examined for any analytical purposes, because in most cases the sample is both nonrandom and very small. However, the questionnaires from the field test can be used to check the performance of the data entry program.

The personal participation of all senior staff (including analysts) is fundamental for both the field test and its evaluation. The following anecdote illustrates this point. In one country, before the field test a manager in the statistics office asserted that collecting information on family assets would be impossible because respondents would fear that the information would be used for tax purposes. The module was included in the field test, and no unusual difficulties were encountered.
But the manager who opposed the module did not witness the field test, and some of those who did participate in the field test did not participate in the module’s evaluation. Despite the successful field experience, the module was removed from the questionnaire, largely because key decisionmakers did not fully participate in the survey design process.

Many small changes are generally made to questionnaires as a result of field testing, including changes in the wording of some questions, in questionnaire format, and in answer codes. If either the questionnaire’s structure or the way in which certain variables are measured is changed substantially, all of the parts of the questionnaire that have been so modified must be tested again. This can delay the survey, but one way to reduce the probability of such a delay is to begin the field test with two or more versions of the most difficult, contentious, or important modules in the questionnaire. If one version clearly works the best, there is no need to do another field test because that version has already been field tested.

Ideally, the household, community, and price questionnaires should all be field tested at the same time. This allows the survey design team to evaluate all of the questionnaires together, taking into account the possibility that changes in one questionnaire may have implications for the design of the others. Simultaneous testing of the three questionnaires can also reduce travel costs since, like the household questionnaires, the community and price questionnaires should be tested in a variety of locations.
Regrettably, in several past LSMS surveys the survey teams neglected to field test the community and price questionnaires, concentrating solely on the household questionnaire. The community and price modules were tested late and haphazardly or, in some cases, not tested at all. It is probably not coincidental that the users of the data from many previous LSMS surveys have often had more complaints about the community and price data than about the household data. If there is not enough staff time to test all three questionnaires at once, it is important to ensure that separate, rigorous field tests are done of the community and price questionnaires.

The health and education modules discussed in this chapter often include detailed facility questionnaires (in other words, school or health clinic questionnaires), which can be very complex (see Chapters 7 and 8 for details). It is essential to field test these facility questionnaires. During the field test the survey team should be sure to visit each type of facility covered by the facility questionnaire. For example, field testing a health facility questionnaire should involve visits to public health posts, public clinics, private doctors’ offices, public hospitals, and private hospitals in both urban and rural areas. Similarly, field testing a school questionnaire should involve visits to public and private schools, primary and secondary schools, and schools in urban and rural areas. Since field testing a facility questionnaire is a major undertaking in its own right, it is probably best to conduct such a field test separately from the field tests of the other questionnaires.

Rules for Formatting Survey Questionnaires

The formatting of survey questionnaires is not a separate step in the overall survey design process. Rather, it influences how the third, fourth, and fifth steps are carried out.
Good questionnaire formatting can make a tremendous difference in the quality of the data collected. This section discusses formatting in detail, making very specific recommendations about how questionnaires should be formatted.⁹

There is, of course, more than one way to format household survey questionnaires. Most of the benefits of good formatting come from selecting a formatting convention and following that convention consistently, rather than choosing the “best” convention from among several possible options. For example, in LSMS questionnaires uppercase and lowercase letters are used to distinguish words spoken aloud during the interview from instructions to the interviewer, but this could be done in other ways, such as using different colors or different fonts. Once a convention is selected, it is extremely important to use it consistently throughout the whole questionnaire. The convention chosen should be the one that is clearest and most likely to minimize the possibility of errors. The draft questionnaires presented in this book follow the formatting conventions explained in this section, which have been used frequently in past LSMS surveys, with successful results.

Questionnaire format is important because a good format minimizes potential interviewer and data entry errors, which improves the accuracy of the data and reduces the time needed to check the data before making them available to data analysts. The objectives underlying a given survey can occasionally have implications for formatting, so some aspects of formatting will vary from country to country. Even so, almost all of what has been learned about questionnaire format in previous LSMS surveys will be applicable to new surveys. Thus the formatting guidelines presented in this section are recommended for all LSMS and similar multitopic surveys, and for other surveys as well.

Identifiers

Every person or object for which data are collected in a survey must be uniquely identified.
This usually requires two or three separate codes. The first code identifies the household. The second code identifies the person or object of interest, such as an individual household member, a household business, or a plot of land. Sometimes there is a third code, which applies, for example, to all children ever born to each woman in the household or to the assets of each business operated by the household.

Whenever possible, the identification codes for the second or third levels of observation should be preprinted on the questionnaire pages to which they pertain. For example, the individual identification code for each household member should be printed on all pages that collect data on individual household members. This ensures that the codes cannot be omitted and avoids any errors that would occur if the interviewer were to write down the wrong codes. An example of these codes appears in the far left column of Figure 3.2, which presents the short version of the education module.

The importance of adequate identifiers is so obvious that it is hard to believe mistakes can be made, but they can. In one health survey the questionnaire consisted of two sheets of paper stapled together. One contained information on the household, while the other contained information on individuals. In order to facilitate data entry, the two pages of the questionnaire were separated. Unfortunately, the household identifier was not put on the page for individuals, making it impossible to link the two parts of the survey with each other after the data were entered.

FIGURE 3.2: ILLUSTRATION OF INDIVIDUAL IDENTIFICATION AND SKIP CODES (EDUCATION MODULE SHORT VERSION)
ID CODE (preprinted in the far left column): 1 2 3 4 5 6 7 8 9 10 11 12
1. Have you ever attended school? YES..1 NO...2 (»NEXT PERSON)
2. Are you currently enrolled in school? YES..1 (»6) NO...2
3. What is the highest grade you have completed in school? PUT CODES FOR DIFFERENT GRADES HERE
4. What is the highest diploma you have attained? PUT CODES FOR DIPLOMAS HERE
5. Were you enrolled in school during the past 12 months? YES..1 (»9) NO...2 (»10)
6. In what grade are you currently enrolled in school? PUT CODES FOR DIFFERENT GRADES HERE
7. What is the highest diploma you have attained so far? PUT CODES FOR DIPLOMAS HERE
8. Is the school you are currently enrolled in public or private? PUBLIC..1 PRIVATE SECULAR..2 PRIVATE RELIGIOUS..3
9. How much has your household spent on your education in the last 12 months for: A. Tuition and other required fees? B. Parent Association fees? C. Uniforms and other clothing? D. Textbooks? E. Other educational materials (exercise books, pens, etc.)? F. Meals, transportation and/or lodging? G. Other expenses (extra classes, optional fees)?
10. Have you ever repeated a grade of school? YES..1 NO...2 (»NEXT PERSON)
11. How many times have you repeated a grade of school? NUMBER OF REPEATED GRADES
Boxed instruction: »» NEXT PERSON

Questionnaire Layout

The LSMS questionnaires are designed so that only one copy of the questionnaire is needed for each household. In contrast, some surveys use one household questionnaire and a separate set of individual questionnaires. This requires that household identification codes be copied perfectly onto all of the individual questionnaires. While perfection is always sought, it is rarely achieved, and separate questionnaires create the risk of improper matching. This is illustrated in the case of the Russian Longitudinal Monitoring Survey. Although care was taken to ensure accurate coding and matching, many errors were introduced. For the first round of the survey, which was held in the summer of 1992, there were 3 percent fewer individual questionnaires than had been expected given the number of household members identified in the household questionnaires. By the summer of 1993, in the third round of the survey, this discrepancy had grown to about 9.5 percent.
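The matching problems described in this section can be made concrete with a short sketch. The Python below is illustrative only: the field names `hh_id` and `person_id`, the sample records, and the functions `link_individuals` and `duplicate_keys` are assumptions introduced here, not part of any LSMS data entry program. It shows why each individual record must carry the household code, and how unmatched or duplicated identifiers might be detected after data entry.

```python
def link_individuals(households, individuals):
    """Match individual records to household records by household code.

    Returns (linked, unmatched). A record is unmatched when its household
    code is missing or does not appear among the household records.
    """
    by_household = {h["hh_id"]: h for h in households}
    linked, unmatched = [], []
    for person in individuals:
        household = by_household.get(person.get("hh_id"))
        if household is None:
            unmatched.append(person)
        else:
            # Household fields first, so individual fields win on any clash.
            linked.append({**household, **person})
    return linked, unmatched


def duplicate_keys(individuals):
    """Return the set of (hh_id, person_id) pairs that occur more than once."""
    seen, duplicates = set(), set()
    for person in individuals:
        key = (person.get("hh_id"), person.get("person_id"))
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    return duplicates


# Hypothetical sample data for illustration.
households = [{"hh_id": 101, "region": "north"}, {"hh_id": 102, "region": "south"}]
individuals = [
    {"hh_id": 101, "person_id": 1, "age": 34},
    {"hh_id": 101, "person_id": 2, "age": 8},
    {"person_id": 1, "age": 51},  # household code was never written down
]

linked, unmatched = link_individuals(households, individuals)
print(len(linked), "linked;", len(unmatched), "cannot be linked")
```

A single questionnaire per household, with preprinted individual codes, removes most of the opportunities for the unmatched case to arise in the first place.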
Putting all of the information into a single household questionnaire implies the need for a grid of some kind whenever there are two or more of a particular unit of analysis in a household. For example, a household often includes several people, may have several plots of land, and may grow several different crops. The grid typically used in LSMS surveys has questions arranged across the top and units of observation (people, plots, or crops) down the side; in other words, each question is a column and each unit of observation is a row. An example of this is shown in Figure 3.2; note that the identification codes for the units of observation (household members) are printed on the left side of the grid page. Sometimes the interviewer must fill in the code in the first column, as in Question 2 of Figure 3.8 (which is discussed below), but this practice should be minimized to reduce the possibility of introducing errors when writing down such codes.

In the grids for individuals, the lines can be differentiated by alternating shaded and unshaded blocks (as in the draft modules in Volume 3 of this book) or by using a different color for each row or block of rows. This helps an interviewer record the information on the correct line.

Exceptionally large households sometimes have so many members that there are not enough lines in the grids for all household members. In these cases a second copy of the household questionnaire will be required, and care must be taken to ensure that the right household and individual numbers are used. As explained in Chapter 4, a coding scheme is needed to distinguish between the first and second copies of the questionnaire filled out for large households.
For example, the individual numbers in the second copy should be changed to start with 13 instead of 1 (assuming that the first questionnaire has room for 12 household members). This is a reasonable approach for large households, but it also introduces a potential source of error; survey designers should set the format of the grids to accommodate as many individuals as is practical. Previous LSMS questionnaires have typically had space for 12–15 individuals.

In cases where the unit of analysis is such that there is only one observation per household (for example, one dwelling per household), the questions pertaining to that unit can be arranged in a single column down the page. One problem with a single column of questions is that much of the page is left blank. To save paper, two or more columns may be put on one page, as long as it is clear that there is no horizontal relationship among the questions in the different columns. An example of this format is provided in Figure 3.3, which shows the first page of Part C of the standard housing module.

Fold-Out Roster Page

The household roster page of the household questionnaire is printed so that it extends to the left of the pages that pertain to individuals in the household. Most importantly, the names of each individual member of the household on the roster page are visible when filling out the other individual-specific pages of the household questionnaire. This has been done four different ways in LSMS surveys, as illustrated by Figure 3.4. In the method shown in Format 1, the sheets in front of the roster are shorter than the cover, the roster, and the sheets that follow the roster. The most common method is shown in Format 2. The roster sheet is folded out to extend beyond the body of the questionnaire and its covers.
In Formats 1 and 2 the roster page is placed behind all of the pages that pertain to individuals, so that the names on the household roster page are visible whenever individual questions are asked.

FIGURE 3.3: ILLUSTRATION OF PRECODING (PART C OF HOUSING MODULE)
1. Is this dwelling owned by a member of your household? YES..1 NO...2 (»11)
2. How did your household obtain this dwelling? PRIVATIZED..1 PURCHASED FROM A PRIVATE PERSON..2 NEWLY BUILT..3 COOPERATIVE ARRANGEMENT..4 SWAPPED..5 (»6) INHERITED..6 (»6) OTHER..7 (»6)
3. How much did you pay for the unit?
4. If you make installment payments for your dwelling, what is the amount of the installment? WRITE ZERO IF THE HOUSEHOLD DOES NOT MAKE INSTALLMENT PAYMENTS. AMOUNT (UNITS OF CURRENCY); TIME UNIT
5. In what year do you expect to make your last installment payment? YEAR
6. Do you have legal title to the land or any document that shows ownership? YES..1 NO...2
7. Do you have legal title to the dwelling or any document that shows ownership? YES..1 NO...2
8. What type of title is it? FULL LEGAL TITLE, REGISTERED..1 LEGAL TITLE, UNREGISTERED..2 PURCHASE RECEIPT..3 OTHER..4
9. Which person holds the title or document to this dwelling? WRITE ID CODE OF THIS PERSON FROM THE ROSTER. 1ST ID CODE; 2ND ID CODE
10. Could you sell this dwelling if you wanted to? YES..1 NO...2 (»13)
11. If you sold this dwelling today how much would you receive for it? AMOUNT (UNITS OF CURRENCY)
12. Estimate, please, the amount of money you could receive as rent if you let this dwelling to another person? AMOUNT (UNITS OF CURRENCY); TIME UNIT
13. Do you rent this dwelling for goods, services or cash? YES..1 NO...2 (»26)
TIME UNITS (box at bottom of page): DAY..3 WEEK..4 FORTNIGHT..5 MONTH..6 QUARTER..7 HALF YEAR..8 YEAR..9
Boxed instruction: »» QUESTION 28

FIGURE 3.4: ROSTER ARRANGEMENTS

An innovation in the Kagera Health and Development Survey in Tanzania was to make the roster page a removable card, as shown in Format 3. This was useful because the survey was designed to be administered four times (every six months for two years) to the same households. The roster card was inserted into a pocket in the back of the questionnaire in the first round of the survey. When the second round started, the roster card was removed from the first questionnaire and placed in the back pocket of the second questionnaire. In this way, individuals retained the same identification codes in each round. A few follow-up questions guaranteed that individuals who moved in or out of the household or were born or died between rounds were counted appropriately. In four rounds of interviews conducted over two years for 800 households, none of the roster cards was lost. However, this success may reflect the intensive supervision carried out by the organizers of that survey, as well as the relatively small sample size. This option should probably not be used in situations with significant quality control problems.

Format 4 was used in the Tunisia questionnaire.
In this format each page is oriented as “portrait” (a vertical page) rather than as “landscape” (a horizontal page) and is spiral-bound so that it opens flat. Each questionnaire page then consists of the full 11 x 17 inches of the two-page spread. The roster folds out to the left. In all four cases the line for each individual member of the household on the roster page is aligned with the corresponding lines on the other individual-specific pages of the household questionnaire.

A final point regarding the fold-out roster page is that it may be useful to have more than one such page per questionnaire. A fold-out roster will be useful whenever there are several pages of questions for the same level of analysis and especially when there are many rows on the grid. For example, in the agricultural module one might make rosters for crops grown or for plots of land. A fold-out roster page would be particularly helpful for the household enterprise module.

Precoding

All of the potential responses to almost all of the questions in the questionnaire should be given code numbers so that the interviewer records only code numbers, as opposed to words or phrases, on the questionnaire. In most cases these response codes should be printed directly in the box where the question appears, or next to the question if there is no box around it. Where the list of codes is lengthy and applies to several questions, it should be placed in a special box on the border of each page for which it is needed. Alternatively, if a list is very long it can be printed on the back of the preceding page (making it visible when the interviewer fills out the page in question). An example of a box on a border of a page is the time unit box shown at the bottom of Figure 3.3.
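One way to see what precoding buys at the data entry stage is the following minimal sketch. The yes/no and time unit codes are those printed in Figure 3.3; the dictionary layout, the field names (`owns_dwelling`, `installment_time_unit`, `rent_time_unit`), and the function `invalid_codes` are assumptions made for illustration and do not describe an actual LSMS data entry program.

```python
# Code lists as printed in Figure 3.3; storing them as dictionaries is an assumption.
YES_NO = {1: "YES", 2: "NO"}
TIME_UNITS = {
    3: "DAY", 4: "WEEK", 5: "FORTNIGHT", 6: "MONTH",
    7: "QUARTER", 8: "HALF YEAR", 9: "YEAR",
}

# Hypothetical field names, each mapped to the code list that governs it.
CODEBOOK = {
    "owns_dwelling": YES_NO,
    "installment_time_unit": TIME_UNITS,
    "rent_time_unit": TIME_UNITS,
}


def invalid_codes(record, codebook=CODEBOOK):
    """Return (field, value) pairs whose value is not a valid precoded answer."""
    problems = []
    for field, codes in codebook.items():
        if field in record and record[field] not in codes:
            problems.append((field, record[field]))
    return problems


# A hypothetical keyed-in record: 2 is not a valid time unit code.
record = {"owns_dwelling": 1, "installment_time_unit": 6, "rent_time_unit": 2}
print(invalid_codes(record))
```

Because every answer is a number drawn from a short, known list, a range check of this kind can run while the questionnaire is being keyed in, which is not possible when interviewers write words or phrases.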
In past LSMS surveys fewer than a dozen questions on the household questionnaire have required the interviewer to write down words or phrases that are given codes, usually by someone else, after the interview. Precoding allows the data to be entered into the computer straight from the completed questionnaire, thus eliminating the time-consuming and error-prone step of transcribing codes onto data entry sheets.

Precoding requires that response codes be clear, simple, and mutually exclusive, that they exhaust all likely answers, that respondents will not all provide the same response, and that none of the codes apply to only a handful of respondents. Designing adequate response codes requires extensive knowledge of the phenomenon being studied as well as careful field testing. A standard technique to ensure that the codes are mutually exclusive is to add a qualifier where more than one answer could apply, asking, for example, “What was the main reason for dropping out of school?” Other standard qualifiers are “What was the first (or last, or principal) reason for...?” Alternatively, spaces can be provided for multiple responses, with an instruction to code all responses (up to, say, the two or three most important) that apply.

A standard technique to ensure that codes encompass all possible answers is to add an “other (specify _______)” code to questions for which an explicit enumeration of each possible response is impossible or inconvenient. In past LSMS surveys the detailed answers were almost never coded, so analysts usually put all “other” responses into a single residual category. One way to increase the probability that the information recorded in the “other (specify _______)” answers will be used at a later date is to enter it (as text) into the computer, without assigning any codes to the responses.
This allows analysts to code any answers that were not precoded in the data released to the public. It also allows the designers of subsequent surveys in the same country to review the answers that were written in (especially in cases in which a significant percentage of the responses were coded “other”) and to modify their coding lists accordingly. In particular, if most of the “other” responses fall into a single, well-defined category, this category should have its own code in any subsequent survey.

There is, of course, a limit to the kind of material that can be covered even by well-designed, precoded questions. But this limit may be less of a disadvantage than it first appears. Because most analyses of LSMS surveys use sophisticated quantitative techniques, it is difficult for these analysts to make use of the exploratory, qualitative information gathered in open-ended questions. So even if such questions were asked, the answers to these questions would not be used much in analysis. If it is clear that some analysts do need extensive information of an exploratory, qualitative nature, the designers of a prospective survey may wish to adopt a different data collection instrument or even a new research technique. See Chapter 25 for a thorough discussion of qualitative data collection alternatives.

Verbatim Questions with Simple Answers

All questions in LSMS surveys are written out in their entirety and are meant to be read out verbatim by the interviewer. This is done to ensure that questions are asked in a uniform way, since different wordings may elicit different responses. For example, the answers that a respondent gives to “Can you read?” and to “Can you read a newspaper or magazine?” will probably be somewhat different.
Other changes may subtly alter the time period referred to, as in the change from “Have you worked since you were married?” to “Did you work after you were married?” Scott and others (1988) discuss some rigorous field experiments that compared such verbatim questionnaires with questionnaires in which the topic was given for each question but the exact wording was not. When the questionnaire that did not contain the exact wording was used, 7 to 20 times more errors occurred than when the verbatim questionnaire was used.

When choosing the wording of questions, it is important to use terms that reflect the language as it is commonly spoken. Using language that is too formal or academic will make the interview stilted and unnatural. For example, “Did you spend any time doing housework?” followed, if necessary, by “...such as cooking, mending, doing laundry, or cleaning?” is better than “Did you spend any time engaged in domestic labor, for example, preparing food, repairing clothes, cleaning clothes, or cleaning house?” It is not always easy to find terms that are simple, short, and yet precise, but that should always be the goal.

In most cases the interviewer reads the question aloud and marks the questionnaire with the code for the answer given by the respondent. For example, for the question, “Are you currently enrolled in school?” the interviewer writes down a 1 for “yes” or a 2 for “no.” For some questions the response categories are part of the question, for example, “Is the school you are currently enrolled in public or private?” There may also be a few questions for which the wording of respondents’ answers may vary even though the meaning is the same. The best thing to do in such cases is to have the interviewer read out all of the response categories.
For example, in Question 4 of Figure 3.5, after reading “Compared to your health one year ago, would you say that your health is...” the interviewer should read the responses “much better now,” “somewhat better now,” “about the same,” “somewhat worse,” and “much worse.” If necessary, the interviewer can explain the differences between the various response categories. However, the reading out of response categories should be used as little as possible, because respondents may not listen to the full list before answering, which can lead to errors.

FIGURE 3.5: ILLUSTRATION OF CASE CONVENTIONS (HEALTH MODULE STANDARD VERSION)
ID CODE (preprinted): 1 2 3 4 5 6 7 8 9 10 11 12
1. IS THIS PERSON ANSWERING FOR HIMSELF/HERSELF? YES..1 (»3) NO...2
2. COPY THE ID CODE OF THE RESPONDENT FROM THE HOUSEHOLD ROSTER. ID CODE
3. During the last four weeks, how many days of your primary daily activities did you miss due to poor health? DAYS
4. Compared with your health one year ago, would you say that your health is: [READ OUT ANSWERS TO RESPONDENT] Much better now..1 Somewhat better now..2 About the same..3 Somewhat worse..4 Much worse..5
5. CHECK THE AGE IN YEARS OF THIS PERSON. 0-6..1 7-14..2 (»NEXT SECTION) 15-39..3 (»24) 40 AND OVER..4
6. Did [NAME] experience diarrhea in the last 7 days? YES..1 NO...2 (»11)
7. Was it mixed with blood? YES..1 NO...2
8. Was it mixed with mucus? YES..1 NO...2
9. Was it a pale liquid? YES..1 NO...2
10. How did you treat it? (1ST, 2ND, 3RD) REDUCED FOOD OR LIQUID GIVEN TO CHILD..1 GAVE SPECIAL FOODS TO CHILD..2 ORAL REHYDRATION THERAPY..3 OTHER (SPECIFY_____)..4 NO TREATMENT..5. Boxed instruction: »» NEXT SECTION

The answers to the questions must be kept simple. This means that additional filter questions are often needed. Adding enough filter questions to ensure simple answers can make the number of questions and skips seem high. Many survey designers are tempted to shorten the questionnaire or simplify the skip pattern in a way that results in complex questions and answers. This should be avoided since it will confuse some respondents and is unlikely to save time.

Survey designers yielded to this temptation in the agricultural module of the 1987–88 Ghana LSMS survey. In that module the following question was asked: “Do you or the members of your household have the right to sell all or part of their land to someone else if they wish?” The precoded answers (which were not read out to the respondents) were “Yes,” “No,” “Only after consulting family members who are not household members,” and “Only after consulting the chief or the village elders.” It is not clear whether the respondents could distinguish between the simple yes answer and the yes answer qualified by the need for consultation. Thus a different formulation might have been better. The question could have been left as is but using only simple “yes” and “no” codes. Then the interviewer could have put a second question to those who answered “yes,” worded as follows: “Do you need to consult with anyone outside the household before selling the land?” The response codes would be “Yes” and “No.” Then a third question would be put to those who answered “yes” to the second: “Whom must you consult?” The response codes for this question would be for “family member,” “village elders,” and other appropriate categories. This formulation would have made the questionnaire longer in terms of the number of questions but would probably not have increased the interview time since some sort of probing probably occurred in the Ghana LSMS when the “yes” answer was given. More importantly, keeping questions and answers simple makes the interpretation of the data much clearer.

Skip Codes

Skip codes are used extensively in LSMS questionnaires.
Skip codes tell the interviewer which question to proceed to after finishing the current question. Some skip codes apply only when a particular answer is given. In such cases an arrow and the number of the question to skip to are positioned in parentheses next to or below the individual response to which the code applies. An example is given in Question 2 of Figure 3.2. If the answer to Question 2 is “yes,” the interviewer should skip Questions 3, 4, and 5 and proceed to Question 6. If the answer to Question 2 is “no,” the interviewer should proceed to Question 3. In Question 1 a similar construction is used, but when the answer is “no” the interviewer is instructed to skip all the remaining questions in the module for this respondent and proceed to interview the next person.

Another kind of skip instruction applies regardless of the response given to the question. When an arrow and a question number or instruction are placed in a box separate from the response codes, the skip instruction contained in the box applies regardless of what answer is given. An example of this is given in Question 10 of Figure 3.5.

There are several advantages to extensive, explicit skip codes. Interviewers do not have to make decisions themselves, nor do they need to remember complicated rules printed in the manual rather than on the questionnaire. This helps ensure that instructions will be followed uniformly. Well-placed skip codes ensure that inapplicable questions are not asked. (Asking inapplicable questions irritates respondents, wastes interview time, and confuses data analysis.) Finally, explicit skip codes imply that a “not applicable” code is almost never used in LSMS questionnaires.

One way to check skip codes is to develop a flow chart of the questions in each module. Flow charts are useful both for checking the logic of the questionnaire and for training interviewers.
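The same check can be sketched in code by treating the skip codes as a routing table. The example below is a rough illustration for the 11-question education module of Figure 3.2. The entries for Questions 1 and 2 follow the description in the text; the entries for Questions 5 and 10 are read from the figure and, together with the rule that questions otherwise proceed in numerical order, should be treated as assumptions. The names `SKIPS`, `questions_asked`, and `unreachable_questions` are invented for this sketch.

```python
from itertools import product

NEXT_PERSON = "NEXT PERSON"
LAST_QUESTION = 11

# (question, answer) -> where the interviewer goes next.
# Questions 1 and 2 follow the text; Questions 5 and 10 are assumptions
# read from the figure.
SKIPS = {
    (1, "NO"): NEXT_PERSON,
    (2, "YES"): 6,
    (5, "YES"): 9,
    (5, "NO"): 10,
    (10, "NO"): NEXT_PERSON,
}


def questions_asked(answers):
    """List the questions asked of one person, given answers keyed by question."""
    asked = []
    question = 1
    while question != NEXT_PERSON and question <= LAST_QUESTION:
        asked.append(question)
        # With no applicable skip code, proceed to the next question (assumed).
        question = SKIPS.get((question, answers.get(question)), question + 1)
    return asked


def unreachable_questions():
    """Return questions that no combination of yes/no answers ever reaches."""
    filters = sorted({question for question, _ in SKIPS})
    reached = set()
    for combo in product(["YES", "NO"], repeat=len(filters)):
        reached.update(questions_asked(dict(zip(filters, combo))))
    return set(range(1, LAST_QUESTION + 1)) - reached


print(questions_asked({1: "YES", 2: "YES", 10: "NO"}))
print(unreachable_questions())
```

A question that turns up in the unreachable set would signal a skip code pointing at the wrong target, which is the kind of error a hand-drawn flow chart is meant to expose.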
Figure 3.6 presents a flow chart of a typical health module used in past LSMS surveys (which differs in several notable ways from the health module presented in Volume 3). The proportions of people who answer yes at each branch are recorded based on results from several previous LSMS surveys. The numbers of individuals that would be asked each set of questions are shown on the left, assuming a base of 10,000 individuals in the sample. The flow chart makes it easy to check whether the skip patterns lead people through the module correctly. For example, it is possible to check that the question on health insurance is asked of all household members, not just of those who are ill. Analyzing the whole household in this way gives survey designers a better sense of the likely length of time it will take to complete each interview than does the number of pages or number of questions in the questionnaire, because many questions will be skipped for many individuals. (For further discussion of the length of the questionnaire see the second section of this chapter.)

Case Conventions

Everything that the interviewer should read aloud should be written in lowercase letters. Instructions to the interviewer should always be written in uppercase letters.¹⁰ Answer codes should also be written in uppercase, unless they are to be read aloud to the respondent. This makes it easy to include instructions on the questionnaire as opposed to relying on the interviewers’ memory of the manual or of instructions that they were given during their training. In Figure 3.5 instructions to the interviewer are printed in Questions 1, 2, 4, and 5. These are in uppercase, as are the answer codes in Questions 1 and 5. (The answer codes in Question 4 are in lowercase because they are to be read aloud to the respondent.)
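Returning briefly to the flow chart in Figure 3.6, the numbers of individuals shown at each branch can be reproduced with simple arithmetic. The sketch below assumes that each range is obtained by applying the lowest percentage to the low end of the preceding range and the highest percentage to the high end; the function name `scale` and the variable names are invented for illustration.

```python
def scale(count_range, percent_range):
    """Apply a (low %, high %) range of YES answers to a (low, high) count range."""
    (low, high), (p_low, p_high) = count_range, percent_range
    # Integer arithmetic keeps the counts exact for these whole-number percentages.
    return (low * p_low // 100, high * p_high // 100)


sample = (10_000, 10_000)             # base of 10,000 individuals
ill = scale(sample, (10, 45))         # ill or injured in the last 4 weeks
consulted = scale(ill, (40, 80))      # consulted someone
overnight = scale(consulted, (5, 8))  # stayed overnight at the clinic or hospital
medicines = scale(ill, (60, 90))      # bought medicines

for label, counts in [("ill", ill), ("consulted", consulted),
                      ("overnight", overnight), ("medicines", medicines)]:
    print(label, counts)
```

Under that assumption the results match the counts printed in the flow chart, which shows how few respondents reach the most deeply nested questions.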
FIGURE 3.6: FLOW CHART OF HEALTH MODULE USED IN PREVIOUS LSMS SURVEYS
10,000 individuals: 1. Were you ill or injured in the last 4 weeks? YES (10-45%)
1000-4500 individuals: 2. How many days in the last 4 weeks did you have to stop doing your usual activities? 3. Was anyone consulted? YES (40-80%)
400-3600 individuals: 4. Who was consulted? 5. Where did you go for that consultation? 6. What was the cost of the consultation? 7. What means of travel did you use? 8. How long did it take to get to the place of consultation? 9. How much did you spend on travel costs? 10. How long did you have to wait? 11. Did you have to stay overnight at the clinic or hospital? YES (5-8%)
20-288 individuals: 12. How many nights did you stay? 13. How much did you have to pay?
1000-4500 individuals: 14. Did you buy any medicines for this illness or injury? YES (60-90%)
600-4050 individuals: 15. How much did you spend on medicines?
10,000 individuals: 16. Do you have health insurance? Then NEXT PERSON.
Each of the four yes/no questions (1, 3, 11, and 14) also has a NO branch.

Enumeration of Lists

There are two methods of gathering information about long lists of items. A typical LSMS questionnaire may use either method depending on particular circumstances.

Consider the case in which one expects that a large proportion of the items on the list will apply to most households. For each item on this list a line is put in the grid and the name and code number of the item is printed on the questionnaire. This approach is used in the consumption module, as shown in Figure 3.7. Although several dozen items are included, it is expected that most households will have consumed many of them. The first question is “Has your household consumed [FOOD] during the past 12 months?” The interviewer first goes down the whole list asking this “yes or no” question. Then the interviewer returns to the first item that was consumed and asks all the follow-up questions for that item before proceeding to the next item. The complete enumeration of items consumed is done before asking the follow-up questions so that respondents will not be tempted to say that they have not consumed something in order to shorten the interview by avoiding the follow-up questions. This temptation is prevented because the enumeration is done before the respondent finds out that there will be follow-up questions on each item enumerated.

FIGURE 3.7: ILLUSTRATION OF CLOSE-ENDED LIST (PART B OF CONSUMPTION MODULE)
Questions are numbered 1 to 10 across the top of the grid; each food item is a row.
Introduction read aloud: In the following questions, I want to ask about all purchases made for your household, regardless of which person made them.
Question 1: Has your household consumed [FOOD] during the past 12 months? Please exclude from your answer any [ITEM] purchased for processing or resale in a household enterprise. PUT A CHECK (✓) IN THE APPROPRIATE BOX FOR EACH FOOD ITEM. IF THE ANSWER TO Q.1 IS YES, ASK Q.2-13. (NO / YES)
PURCHASES SINCE LAST VISIT: Have the members of your household bought any [FOOD] since my last visit, that is since [DAY/DATE]? YES.1 NO..2 (»5). How much did you buy? (AMT, UNIT). How much did you pay in total? (CURRENCY).
PURCHASES TYPICAL MON: How many months in the past 12 months did your household purchase [FOOD]? IF NONE WRITE ZERO, »» 7 (MONTHS). How much do you usually spend on [FOOD] in one of the months that you purchase [FOOD]? (CURRENCY).
HOME PRODUCTION: How many months in the past 12 months did your household consume [FOOD] that you grew or produced at home? IF NONE WRITE ZERO, »» 10 (MONTHS). How much did you consume in a typical month? (AMT, UNIT). What was the value of the [FOOD] you consumed in a typical month from your own production? (CURRENCY).
GIFTS: What is the total value of the [FOOD] consumed that you received as a gift over the past 12 months? IF NONE, WRITE ZERO (CURRENCY).
Food items and codes: Wheat (grain) 1; Wheat (flour or maida) 2; Maize (flour or grain) 3; Jawar/Bajra 4; Fine rice (basmati) 5; Coarse rice 6; Other grains/cereals 7; Gram 8; Dal 9; Groundnuts 10; Liquid vegetable oils (dalda) 11; Ghee, Desi ghee 12; Fresh milk 13; Yogurt and Lassi 14; Milk Powder 15; Baby Formula 16; Sugar (refined) 17.
UNIT CODES (USE CODES WITH STAR WHENEVER POSSIBLE): KILO*..1 GRAM*..2 POUND*..3 OUNCE*..4 LITER*..5 CUP*..6 PINT*..7 QUART*..8 GALLON*..9 BUNCH..10 PECK..11 BUSHEL..12 TIN..13 PIECES..14 DOZENS..15 BOTTLES..16

A second approach is useful when it is expected that only a few of many possible items will pertain to any one household. Consider Figure 3.8.
The large grid on the right contains lines for several durable goods owned by the household, but these are not precoded. Rather, the respondent is asked, using the small grid on the left, whether the household owns certain durable goods. In this example 12 durable goods are considered, but in some cases 20–30 goods have been listed. Most households own only a few durable goods. For all durable goods owned by the household, the interviewer lists the name and the code number in the large grid to the right in Figure 3.8, and asks a series of questions about each good. If the household owns two or more of the same durable good, one line is filled out for each good owned.

Probe Questions

There are some kinds of information that respondents may inadvertently fail to provide. In such cases the questionnaire includes instructions to the interviewer to ask further “probing” questions on the subject. An example of this is Question 9 of Figure 3.1. Suggested probing questions are usually included in the interviewers’ manual and occasionally included in the questionnaire itself. Probe questions are often used to ensure that all items in a respondent-determined list have been reported to the interviewer, or to ensure that the respondent’s answer is properly classified by the interviewer. Interviewers are also asked to probe for answers to questions that ask “how much...?” (This kind of question is commonly found in the consumption, agriculture, and household enterprise modules.) Interviewers should be thoroughly trained to ensure that they fully understand what information to probe for, and how to do so.
Because the interviewer is trained and instructed to probe for information, there should be very few answers of “don’t know” and thus very few codes for “don’t know” in the questionnaire.
In the exceptional case when even a sound interviewing technique does not produce an answer, the interviewer is instructed (in the interviewers’ manual and in training) to write “DK” (for “don’t know”) in the space reserved for an answer code. Such responses are given a special nonnumeric code in the data entry program. The end result for analysis is much the same as having a “don’t know” code for each question. However, this system has the advantage that it discourages interviewers from accepting “don’t know” answers too easily, which they may be tempted to do to speed up the interview. Moreover, the special nonnumeric code for such responses is glaringly obvious when the supervisor reviews the questionnaire.

Letting Respondents Choose Units

For many questions that involve payments or quantities, respondents are allowed to give their answers in whatever units they find most convenient. Examples of this are found in Figure 3.3. In Questions 4 and 12 the code of the time unit in which the respondent replies is placed in the box marked “time unit.” The codes are provided in a box at the bottom of the page. Allowing the respondent to select the time unit means that transactions are expressed in the units in which they normally occur, which may differ from household to household or from person to person. This avoids inaccuracies in conversion. For example, a person paid $510 per week will respond precisely if allowed to respond on a per-week basis. If forced to respond in terms of dollars per month, the respondent might round the figure down to $500 for ease of multiplication and calculate each month as being equivalent to four weeks, thus reporting $2,000 per month. The annualized figure would then become $24,000 instead of the $26,520 that would be reported if the respondent were allowed to report on a per-week basis and the data analyst then calculated the respondent’s annual rate from that answer.
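The arithmetic of the weekly-pay example above can be checked with a short sketch. It is only an illustration: the unit names and the function `annualize` are hypothetical and are not taken from any LSMS data entry program, and the sketch assumes 52 weeks and 12 months per year.

```python
# Payments per year for each reporting unit. The unit names are hypothetical
# labels for this sketch, not the codes printed on the questionnaire.
PERIODS_PER_YEAR = {"week": 52, "month": 12, "year": 1}


def annualize(amount, time_unit):
    """Convert an amount reported per time unit into an annual figure."""
    return amount * PERIODS_PER_YEAR[time_unit]


# The respondent answers in the unit in which pay is actually received.
reported_weekly = annualize(510, "week")

# The respondent is forced to answer per month: rounds $510 down to $500
# and treats a month as four weeks, so reports $2,000 per month.
forced_monthly = annualize(500 * 4, "month")

print(reported_weekly)                   # 26520
print(forced_monthly)                    # 24000
print(reported_weekly - forced_monthly)  # 2520
```

In this sketch the forced monthly answer understates annual pay by $2,520, which is the discrepancy described in the example.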
Of course, data analysis is always slightly more complicated when respondents’ answers must be converted in order to arrive at annualized figures, but since a computer can easily do the conversion, this disadvantage is trivial. However, it is very important to ensure that, where necessary, the questionnaire explicitly asks the respondent how many times per year the payments are made. For example, a worker who reports a daily wage rate may be employed only intermittently. In this case, the questionnaire should ask the respondent how many weeks or months he or she has worked during the preceding 12 months (see Chapter 9 for details).

Figure 3.8: Illustration of Open-Ended List (Part E of Consumption Module)
Small grid (left):
1. Does your household own any of the following items? (YES / NO for each item)
Items (code): Stove 201; Refrigerator 202; Washing machine 203; Sewing/knitting machine 204; Fan 205; Television 206; Video player 207; Tape player/CD player 208; Camera, video camera 209; Bicycle 210; Motorcycle/scooter 211; Car or truck 212.
Interviewer instructions: DETERMINE WHICH DURABLES THE HOUSEHOLD OWNS BY ASKING Q.1. FOR EACH DURABLE OWNED, WRITE THE DESCRIPTION AND CODE IN THE SPACE PROVIDED UNDER Q.2, AND PROCEED TO ASK Q.3-7 FOR EACH ITEM. LIST ALL THE ITEMS OWNED BY THE HOUSEHOLD, THEN PROCEED TO ASK Q.3-7.
Large grid (right), one line per item owned, lines numbered 1 to 16:
2. ITEM (DESCRIPTION, CODE)
3. How many years ago did you acquire this [ITEM]? (YEARS)
4. Did you purchase it or receive it as a gift or payment for services? (PURCHASE..1, GIFT OR PAYMENT..2 »6)
5. How much did you pay for it? (CURRENCY; »7)
6. How much was it worth when you received it? (CURRENCY)
7. If you wanted to sell this [ITEM] today, how much would you receive? (CURRENCY)

A particular place in the questionnaire where it is useful to allow respondents to choose their own units is in the “quantities produced” questions in the agriculture module. In Ghana, for example, respondents were allowed to give answers in 22 different kinds of units (Table 3.1).
A serious problem for analysts who want to convert these different quantities to a single standard unit is that only about half of the units used in this example were standardized, and some of the standardized units were local terms (such as minibag and maxibag) that would be unknown to anyone not familiar with farming in Ghana (see note 11). In the case of standardized local units, the survey team should ensure that such terms are defined (in terms of international standardized units) in a basic information document that includes all of the information that data users will need to analyze the data.

Respondent Codes

It is sometimes useful to know who is answering a certain section of the questionnaire. In general, each household member should answer for himself or herself, but this is not always possible. For example, a household member may be away during the entire week when the field team is working in his or her community. To indicate who is providing the information, a question can be inserted that asks the interviewer whether the person is answering for himself or herself. If someone else is providing the information, the interviewer should fill in the identification code of that person. An example of this is shown in Figure 3.5. This information is useful because a proxy respondent may give less accurate information than the individual who is actually involved in the activity in question. For example, one household member may not know the exact salary of another. Therefore, some analysts may wish to identify any possible biases introduced by the proxy respondents or to omit their responses altogether.

Cardstock Covers

LSMS questionnaires are usually printed with cardstock covers, that is, covers made of very thin cardboard similar to the cardboard used in file folders. In some past surveys it was decided not to use these covers because of their added cost, but this led to the problem that the front and back pages of the questionnaire occasionally came loose.
Since the front page usually carries the key household identifier information and the back page sometimes contains the household roster, any such loss is likely to render the rest of the questionnaire useless. Thus cardstock covers are well worth their cost.

Identifying Sections

The household questionnaire contained in a prototypical full LSMS survey can be very bulky. The Nepal questionnaire, for example, had 70 pages. Therefore, it is useful to devise some ways to make it easy for readers to find their way around in these questionnaires. A few ideas are listed here, and there may well be more.
First, it is useful to have page numbers on each page and a table of contents listing the sections (and their page numbers) at the beginning or end of the household questionnaire.
Second, some inexpensive graphic techniques can be used to divide the questionnaire into smaller parts. For example, some sections of the questionnaire can be printed on different colored paper or in different colored inks, or sheets of colored paper can be inserted between major portions of the questionnaire. It is also possible to print short, dark bars at the edge of each page, with the placement of these bars on the page being the same within each module but lower down (if on the vertical edge) or further to the right (if on the bottom edge) in each successive module. Using just one or a few of these techniques will be sufficient. The questionnaire should not become too colorful or complicated.

Table 3.1 Units of Quantity Used in Ghana, 1987–88
Unit            Code
Pound           *1
Kilogram        *2
Ton             *3
Minibag         *4
Maxibag         *5
Sheet           6
Basket          7
Bowl            8
American tin    *9
Tree            10
Stick           11
Bundle          12
Barrel          13
Liter           *14
Gallon          *15
Beer bottle     *16
Bunch           17
Nut             18
Fruit           19
Log             20
Box             21
All             22
Note: It is preferable to use the unit codes marked by (*) whenever possible.
Source: Ghana LSMS survey (1987–88).
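The unit codes of Table 3.1 lend themselves to a simple lookup when analysts convert reported quantities to a standard unit, as discussed under Letting Respondents Choose Units. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, the ton is assumed to be metric, and the factors for minibag and maxibag are invented placeholders, since the real values would have to come from the survey's basic information document. Units with no defined factor return `None` instead of a guessed value.

```python
# Kilograms per unit, keyed by the unit codes of Table 3.1.
# Only weight units that can be standardized are listed.
KG_PER_UNIT = {
    1: 0.45359237,  # pound
    2: 1.0,         # kilogram
    3: 1000.0,      # ton (ASSUMPTION: metric ton)
    4: 25.0,        # minibag (HYPOTHETICAL placeholder value)
    5: 50.0,        # maxibag (HYPOTHETICAL placeholder value)
}


def to_kilograms(quantity, unit_code):
    """Return the quantity in kilograms, or None if the unit has no factor."""
    factor = KG_PER_UNIT.get(unit_code)
    if factor is None:
        return None  # e.g. basket (7) or bunch (17): not standardized
    return quantity * factor


print(to_kilograms(3, 2))  # 3.0
print(to_kilograms(2, 7))  # None
```

Returning `None` for nonstandardized units keeps those records visible to the analyst, who can then decide whether to exclude them or to look for a locally measured conversion factor.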
Legibility and Spacing

There is an art to laying out the grids for a questionnaire. The lettering must be large enough to read, which is sometimes difficult to accomplish in the compact structure of the grid. Legibility is especially important, as interviews often take place under poor lighting conditions, such as outdoors at dusk or after dark in homes dimly lit with lanterns, oil lamps, or candles. The good print quality now available from laser printers helps, but poor legibility is an ongoing complaint among interviewers.
There must also be enough white (empty) space in the layout of the questionnaire. Whenever the answer will be coded later, a generous space should be allowed to write out fully the information required, such as the person’s name, the name of the school attended by the respondent, and the respondent’s occupation. In other places, judicious use of white space makes the questionnaire easier to read or less confusing than a questionnaire in which every page is crowded with print.
In fact, the fonts used in Volume 3 of this book are probably too small. This was necessary so that Volume 3 could show how typical questionnaire pages should appear. In an actual questionnaire, the size of the pages usually will be somewhat larger than the pages in this book, and the font size should be increased by a similar proportion.

Software for the Questionnaire Layout

Many of the most common word processing and graphics software packages are adequate for producing questionnaire page layouts, and LSMS questionnaires have been produced using several different software packages. The modules in Volume 3 (the electronic versions of which are available to readers in the CD-ROM enclosed in the volume) were produced in Microsoft Excel, for two reasons. First, Excel is widely available. Second, spreadsheet software is better than word processing software at dealing with the long horizontal format of groups of questions on a single topic that are spread across several pages.
Regardless of the software used, it is now much simpler and cheaper to make revisions between the various drafts of the modules than it was in the days when graphic artists had to draw each page by hand. The computerized approach also simplifies translations, as the verbal parts can be overwritten in the local language, leaving intact the skip codes, response codes, and general format.

Appendix 3.1 Common Gaps and Overlaps

This appendix provides a list of the modules that should be checked for gaps and overlaps with respect to the information that they collect. This list is not meant to be exhaustive because household questionnaires of different configurations will be subject to different risks of gaps and overlaps and because there are so many possibilities that it is difficult to list them all. However, some of the most common and important issues are mentioned here. Many more are mentioned in the relevant chapters of this book.

Consumption

Consumption information usually comes from several different modules of the household questionnaire. See the discussion in Chapter 5 on the different components of consumption and the modules in which those components are typically collected.

Income

Information on household income is gathered in the following modules: employment, household enterprise, agriculture, and transfers and other nonlabor income. It is sometimes also collected in the housing and savings modules. It is important to review the questionnaire as a whole to make sure that it accounts for all possible sources of income. In particular, questions about income from any rental property could be placed in the transfers and other nonlabor income module, on the assets page of the savings module, or, if the income comes from renting out a portion of the household’s primary dwelling, in the housing module.
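The review for gaps and overlaps described above can be supported by a simple tabulation of which module asks about which item. The following sketch is one hypothetical way to do this and is not a procedure prescribed by the LSMS: the function name, the income source labels, and the draft questionnaire layout are all invented for illustration.

```python
from collections import defaultdict


def find_gaps_and_overlaps(required_sources, module_coverage):
    """Return (gaps, overlaps) for a draft questionnaire.

    gaps: required sources that no module asks about.
    overlaps: sources asked about in more than one module.
    """
    covered_by = defaultdict(list)
    for module, sources in module_coverage.items():
        for source in sources:
            covered_by[source].append(module)
    gaps = sorted(s for s in required_sources if s not in covered_by)
    overlaps = {s: sorted(m) for s, m in covered_by.items() if len(m) > 1}
    return gaps, overlaps


# HYPOTHETICAL draft: income sources the survey should capture, and the
# sources each module currently asks about (no savings module in this draft).
required = {"wages", "enterprise profits", "crop sales",
            "transfers", "rental income", "interest income"}
draft = {
    "employment": {"wages"},
    "household enterprise": {"enterprise profits"},
    "agriculture": {"crop sales"},
    "transfers and other nonlabor income": {"transfers", "rental income"},
    "housing": {"rental income"},
}

gaps, overlaps = find_gaps_and_overlaps(required, draft)
print(gaps)      # ['interest income']
print(overlaps)  # {'rental income': ['housing', 'transfers and other nonlabor income']}
```

An overlap flagged in this way is not necessarily an error. Rental income, for instance, may deliberately be asked about in the housing module for the primary dwelling only, but each flagged item deserves a check that the questions do not count the same income twice.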

Wealth

Information on household assets is collected in several modules. The housing module gathers information on the household’s principal residence. The household enterprise module gathers information on equipment and land associated with each household enterprise, and on the stocks of inputs and outputs used in each enterprise. The agricultural module gathers information on land, equipment, and livestock. The savings module collects information on other properties and financial assets, and the durable goods submodule of the consumption module collects data on the household’s durable goods. Finally, the credit module gathers information on the household’s liabilities.

Credit

Credit information is collected in several modules, including the modules for housing, consumption, savings, agriculture, and household businesses. There is also a separate credit module. Chapter 21 introduces the credit module and clarifies gaps and overlaps in credit.

Mortgages

Information on any mortgages that a household might hold can be gathered either in the credit module or in the housing, agriculture, and household enterprise modules.

Employment

Analysts often need to know how many hours each household member works in the household’s enterprises and in its agricultural activities as well as hours worked in employment outside the household. In previous LSMS surveys, all of this information was collected in the employment module. As explained in Chapter 9 (and Chapters 18 and 19), this book recommends collecting data on household members’ days and hours of work in household enterprises and agricultural activities in the household enterprise and agriculture modules, respectively, while continuing to ask about the number of hours worked in wage employment in the employment module. However, some survey designers may decide not to include the household enterprise and agriculture modules.
In such cases information on the number of hours spent working on these activities must be collected in the employment module.

Vaccination

If the survey includes a fertility module, questions about vaccination should usually be placed in the fertility module so that this information can be collected not only for children who currently live in the household but also for children who have died or moved to another household. If there is no fertility module or the fertility module does not include all women of childbearing age, vaccination information on children living in the household can be collected in the anthropometry module. Another alternative is to gather this information in the health module, which includes questions on vaccinations in Part C of the standard health questionnaire.

Domestic Housework

Some previous LSMS surveys have collected information on how much time household members spend doing housework (such as cooking, cleaning, and childcare) in the employment module, usually asking only one question. If a time use module is included in a questionnaire, there is no reason to ask questions about housework in the employment module. However, because the time use module is very long, it is unlikely to be used in most LSMS-type multitopic surveys. If the time use module is not included but survey designers want to gather a small amount of information on, for example, the number of hours spent on housework during the previous seven days, one or two questions can be added to the employment module. (See Chapter 9 for further discussion of this issue.)

Notes

The authors would like to express their gratitude to Jere Behrman, Lawrence Haddad, Courtney Harold, John Hoddinott, Alberto Martini, and Raylynn Oliver for comments on an earlier draft.
1. Survey designers occasionally collect redundant information as a cross-check on other data.
For example, most previous LSMS surveys have recorded both the age (in years) and the date of birth of each household member. This is done to verify the accuracy of the age variable.
2. This assumes that a two-stage sample is used. In the case of a three-stage sample, the secondary sampling unit is more pertinent. Generally, the penultimate sampling unit is the appropriate unit for collecting community data.
3. Issues concerning the order of the questions within each module are discussed in the topic-specific chapters in Parts 2 and 3 of this book. For a general discussion of ordering questions in household surveys see United Nations (1985) and Frey and Oishi (1995).
4. The short version of the health module presented in Chapter 9 does not ask for particularly sensitive information, but the standard and extended versions ask detailed questions about health status and health behavior (including drinking and smoking) that can be sensitive. If either the standard or the long module is used, health should not be one of the first modules in the questionnaire.
5. The questions in the household enterprise module that refer to “the past 14 days” can be reworded as “since my last visit” if the second half of the questionnaire is administered two weeks after the interviewer’s first visit.
6. For example, the education module asks questions such as “What grade is [..NAME..] enrolled in?” For this question, the range of acceptable values in the data set is precisely defined. Moreover, it is also related to other information such as the degree obtained and the age of the student. (For example, a six-year-old should not be in secondary school.) In the consumption module, however, a wide range of values might be found for a question such as “How much did you spend on rice in the last two weeks?” which implies that fewer consistency checks are possible.
7. This section is a slightly modified version of the discussion on translating and field testing found in Chapter 3 of Grosh and Muñoz (1996).
8. An alternative approach is to stretch the reference periods during the field test. For instance, instead of asking “Have you been ill or injured during the past 30 days?” as in the actual survey, it may be expedient to ask “Have you been ill or injured during the past 12 months?” or “When was the last time you were ill or injured?” This approach will simplify the logistics of finding enough people to try out the module but will not test very precisely whether the respondents find it difficult to recall the information, since the recall period used in the field test will be longer than the period used in the final questionnaire.
9. This section is a slightly modified version of the discussion of questionnaire formatting found in Chapter 3 of Grosh and Muñoz (1996).
10. For languages that do not have uppercase and lowercase, another way should be found to distinguish instructions from questions. It may be possible to use italics, bold, a different font, or a different color. An example of this is the LSMS survey of rural households in northeast China in 1995. Chinese characters do not have uppercase and lowercase, so two different fonts were used.
11. It is not necessary to convert quantities into standard units (for example, to convert bunches into kilos) to calculate farm income, which was the purpose of the agriculture module in the Ghana LSMS. However, as is common with such rich data sets, analysts are using the data for other purposes as well, such as calculating the total quantities of various crops that were produced.

References

Ainsworth, Martha, and Jacques van der Gaag. 1988. Guidelines for Adapting the LSMS Living Standards Questionnaires to Local Conditions. Living Standards Measurement Study Working Paper 26. Washington, D.C.: World Bank.
Babbie, Earl. 1990. Survey Research Methods. Belmont, Cal.: Wadsworth.
Fink, Arlene. 1995. The Survey Handbook. Thousand Oaks, Cal.: Sage Publications.
Fowler, Floyd. 1993. Survey Research Methods. Second ed. Newbury Park, Cal.: Sage Publications.
Frey, James, and Sabine Mertens Oishi. 1995. How to Conduct Interviews by Telephone and in Person. Thousand Oaks, Cal.: Sage Publications.
Grosh, Margaret, and Juan Muñoz. 1996. A Manual for Planning and Implementing the Living Standards Measurement Study Survey. Living Standards Measurement Study Working Paper 126. Washington, D.C.: World Bank.
Oliver, Raylynn. 1997. Model Living Standards Measurement Study Survey Questionnaire for the Countries of the Former Soviet Union. Living Standards Measurement Study Working Paper 130. Washington, D.C.: World Bank.
Scott, Christopher, Martin Vaessen, Sidiki Coulibaly, and Jane Verrall. 1988. “Verbatim Questionnaires Versus Field Translation or Schedules: An Experimental Study.” International Statistical Review 56 (3): 259–78.
United Nations. 1985. “Development and Design of Survey Questionnaires.” Department of Technical Cooperation for Development, National Household Survey Capability Programme, New York.