BSSn4495: Qualitative research in security studies
Validity, reliability, error, and bias
April 25, 2024
Miriam Matejova, PhD

Agenda
• Measuring concepts
• Bias, error
• Data quality

How to measure…?
• …racism
• …democracy
• …political knowledge

Criteria for measures
• Validity
  – The degree of fit between a measure and the concept it is intended to measure
  – How well a measure “captures” the concept
  – If a measure is not valid, it can lead us to incorrect conclusions about the causes or effects of the underlying concept
  – Problems: the measure does not cover enough of the concept; covers things outside the concept; captures different things in different units

Criteria for measures (cont.)
• Reliability
  – How consistently a measurement procedure produces the same result when the procedure is repeated
    • If two researchers use the same procedure, do they get the same result?
    • If we use the same procedure at two different times, do we get the same result?
  – Requires a well-defined and transparent procedure
  – Reduces subjectivity and the possibility of individual biases affecting measurements

Validity and reliability

Measuring political knowledge
Two measures:
A. Ask respondents a specific set of factual questions; more right answers = more political knowledge (MORE RELIABLE, LESS VALID)
B. Have the interviewer rate the respondent's knowledge after a two-hour in-depth conversation with the respondent (MORE VALID, LESS RELIABLE)

Measurement error
Poor validity or reliability → measurement error.
Two kinds of measurement error:
• 1) Bias (systematic error)
  – Error produced when our measurement procedure yields scores that are, on average, either too high or too low relative to the truth
  – Upward bias vs. downward bias

Measurement error (cont.)
• 2) Random error
  – Error that derives from random features of the measurement process or the phenomenon
  – On average, random error cancels out over lots of iterations (but bias does not)

Random error in measurement
Measured Value = True Value + Bias + Random Error

Measurement Error
Bias is a systematic source of error
• Related to specific sources of discrepancy between operational definition and concept
Random error is an unsystematic source of error
• Direction of error is unpredictable, not related to specific sources of discrepancy between operational definition and concept

Sources of random error
• Imperfect memory (for survey/interview measures)
• Calculation/counting errors

Random error: the good news
Random errors cancel out over lots of iterations, so to minimize random error you can:
• Repeat the measure for lots of cases/individuals
• Repeat the measure for the same case at many points in time

Too much to measure
For many measurement tasks, we cannot measure all instances of a phenomenon. We can only take measures of a subset.
Population: the full set of cases that we’re interested in learning about.
Sample: the subset of the population that we actually measure.

Selection bias
• Occurs when the cases selected into a sample are not representative of the population, because the sample over-represents or under-represents certain types of cases in the population.

Sources of selection bias
– Sampling frame is not representative of the population
  • E.g., election poll based on random sampling from the phone book
    – Sampling frame: phone book
    – Population: all voters
– Self-selection
  • Respondents often have control over whether they join your sample
  • E.g., who decides to take a survey on environmental issues?

How to avoid selection bias?
Random sampling: selecting cases from the population in a manner that gives every case an equal probability of being chosen.
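The contrast between a biased sampling frame and random sampling can be illustrated with a small simulation. This is only a sketch under invented assumptions: the population size, the 60%/40% support rates among listed and unlisted voters, and the function names (build_population, support_share, compare_designs) are all hypothetical and do not come from the lecture.

```python
import random


def build_population():
    """Hypothetical electorate of 100,000 voters.

    Each voter is a tuple (supports_candidate, listed_in_phone_book).
    Assumed for illustration: half of voters are listed; 60% of listed
    voters and 40% of unlisted voters support the candidate, so true
    support in the whole population is exactly 50%.
    """
    listed = [(True, True)] * 30_000 + [(False, True)] * 20_000
    unlisted = [(True, False)] * 20_000 + [(False, False)] * 30_000
    return listed + unlisted


def support_share(voters):
    """Share of the given voters who support the candidate."""
    return sum(1 for supports, _ in voters if supports) / len(voters)


def compare_designs(sample_size, seed=0):
    """Return (random-sample estimate, phone-book estimate) of support."""
    rng = random.Random(seed)
    population = build_population()
    # Sampling frame that misses part of the population: listed voters only.
    phone_book = [voter for voter in population if voter[1]]
    # Simple random sample: every voter has the same chance of selection.
    random_sample = rng.sample(population, sample_size)
    # Random draw from the unrepresentative frame.
    frame_sample = rng.sample(phone_book, sample_size)
    return support_share(random_sample), support_share(frame_sample)


if __name__ == "__main__":
    print("true support:", support_share(build_population()))
    for n in (100, 1_000, 10_000):
        srs, book = compare_designs(n)
        print(f"n={n:>6}  random sample={srs:.3f}  phone book={book:.3f}")
```

Under these assumptions the phone-book estimate should hover around 60% no matter how large the sample gets, while the random-sample estimate should settle toward the true 50% as the sample grows. A larger sample reduces random variation between samples, but it does not remove selection bias.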
• Random sampling relies on the law of large numbers
• As the sample size gets larger, the random sample characteristics will get closer to the population characteristics

Random sampling error
Random sampling error: caused by random variation between samples
• By pure chance, one random sample of a population will be somewhat different from another random sample of the same population
• To minimize random sampling error, increase the size of your sample.

Measurement error due to social norms: “social desirability bias”
• Do you have negative feelings towards people of other ethnicities?
• Have you used illegal drugs in the past two years?
• Have you ever cheated on a test?

Minimizing social desirability bias: list experiments
“I am going to read you a list of things that sometimes make people angry. After I read them, just tell me HOW MANY of them upset you. I don’t want to know which ones.”
(1) The government increasing the tax on gasoline
(2) Professional athletes getting million-dollar contracts
(3) Large corporations polluting the environment
(4) A black family moving in next door

Measurement error due to costs of revealing truthful information
• Bureaucrat admitting to accepting a bribe
• Politicians admitting to having links with certain big businesses
• Authoritarian leaders admitting to engaging in electoral fraud
• Teachers admitting to helping students cheat

Data quality
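Returning to the list experiment: the slide shows only the four-item list. In the standard design, which is an assumption here because the slide does not spell it out, respondents are randomly split into a control group that hears only items (1) to (3) and a treatment group that hears all four. The difference in mean counts between the groups then estimates the share of respondents upset by the sensitive item, without any individual revealing their answer. The sketch below uses simulated, made-up data; the function names and the rates (0.5 for each baseline item, 0.3 for the sensitive item) are hypothetical.

```python
import random
from statistics import fmean


def estimate_prevalence(control_counts, treatment_counts):
    """Difference-in-means estimate of the sensitive item's prevalence."""
    return fmean(treatment_counts) - fmean(control_counts)


def simulate_list_experiment(n_per_group, true_prevalence, seed=0):
    """Simulate item counts for a control and a treatment group.

    Made-up assumption: each of the 3 non-sensitive items upsets a
    respondent independently with probability 0.5.
    """
    rng = random.Random(seed)

    def baseline_count():
        return sum(rng.random() < 0.5 for _ in range(3))

    # Control group hears 3 items; treatment group hears 3 + the sensitive one.
    control = [baseline_count() for _ in range(n_per_group)]
    treatment = [
        baseline_count() + (rng.random() < true_prevalence)
        for _ in range(n_per_group)
    ]
    return control, treatment


if __name__ == "__main__":
    control, treatment = simulate_list_experiment(5_000, true_prevalence=0.3)
    print("estimated prevalence:", round(estimate_prevalence(control, treatment), 3))
```

Because the estimate is a difference between two sample means, it carries random sampling error of its own, so list experiments generally need fairly large groups to give a precise answer.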