Resear '
Preview 6.Chapter Objectives
The middle four chapters of this text, Chapters 5 through 8, concern the design
of experiments. The first half of Chapter 5 outlines the essential features of an
experiment-varying some factor ofinterest (theindependentvariable),controlling
all other factors (extraneous variables), and measuring the outcome (dependent
variables). In the second part of this chapter, you will learn how the validity of a
study can be affected by how well it is designed. When you finish this chapter, you
should be able to:
Define a manipulated independent variable and identify examples that are situational,
task, and instructional variables.
Distinguish between experimentaland control groups.
DescribeJohn StuartMill's rules ofinductivelogic andapply them to the concepts
of experimental and control groups.
Chapter 5. Introduction to Experimental Reseavch
Recognize the presence of confounding variables in an experiment and understand
why confounding creates serious problems for interpreting the results of
an experiment.
Distinguish independent from dependent variables, given a brief description of
any experiment.
Distinguish between independent variables that are manipulated variables or subject
variables, and understand the interpretation problems that accompany the
use of subject variables.
Recognize the factors that can reduce the statistical conclusion validity of an
experiment.
Describe how construct validity applies to the design of an experiment.
Describe the various ways in which an experiment's external validity can be
reduced.
Describe and be able to recognize the various threats to an experiment's internal
validity.
Recognize that external validity might not be important for all research but that
internal validity is essential.
Understand the ethical guidelines for running a "subject pool."
When Robert SessionsWoodworth finallypublished Experimental Psychologyin 1938,
the book's contents were alreadywell known amongpsychologists. As early as 1909,
Woodworth was giving his Columbia University students copies of a mimeographed
handout called "Problems and Methods in Psychology," and a companion handout
called "Laboratory Manual: Experiments in Memory, etc." appeared in 1912. By
1920, the manuscript filled 285 pages and was called "A Textbook of Experimental
Psychology." Afier a 1932 revision, still in mimeograph form, the book finally
was published in 1938. By then Woodworth's students were using it to teach their
own students, and it was so widely known that the publisher's announcement of its
publication said simply, "The Bible Is Out" (Winston, 1990).
The so-called Columbia bible was encyclopedic,with more than 823pages of text
and another 36 pages of references. Afier an introductory chapter, it was organized
into 29 different research topics such as "memory," "maze learning," "reaction
time," "association," "hearing," "the perception of color," and "thinking." Students
wading through the text would learn about the methods used in each content area,
and they would also learn virtually everything there was to know in 1938 about
each topic.
The impact of the Columbia bible on the teaching of experimental psychology
has been incalculable. Indeed, the teaching of experimental psychology today, and
to some degree the structure of the book you're now reading, are largely cast in the
mold set by Woodworth. In particular, he took the term "experiment," until then
loosely defined as virtually any type of empirical research, and gave it the definition
it has today. In particular, he contrasted experimental with correlational research, a
distinction now taken for granted.
The defining feature of the experimental method was the manipulation of what
Woodworth called an "independent variable," which would affect what he called
Essential Features ofExperimenta2 Research
the "dependent variable." In his words, the experimenter "holds all the conditions
constant except for one factor which is his 'experimental factor' or his 'independent
variable.' The observed effect is the 'dependent variable' which in a psychological
experiment is some characteristic ofbehavior or reported experience" (Woodworth,
1938, p. 2). Although the terms were not invented by Woodworth, he was the first
to use them as they are used today.
While the experimental method manipulates independent variables, the correlational
method, according to Woodworth, ''[nl]easures two or more characteristics
of the same individuals [and] computes the correlation of these characteristics.
This method. .. has no 'independent variable' but treats all the measured variables
alike" (Woodworth, 1938, p. 3).You wdl learn more about correlational research in
Chapter 9. In this and the next three chapters, however, the focus will be on the
experimental method, the researcher's most powerful tool for identifying cause-andeffect
relationships.
Essential Features of Experimental Research
Since Woodworth's time, psychologists have thought of an experiment as a systematic
research study in which the investigator directly varies some variable (or
variables), holds all other factors constant, and observes the results of the systematic
variation. The factors under the control of the experimenter are called independent
variables, the variables being held constant are referred to as extraneous variables,
and the behaviors measured are the dependent variables. Before we examine these
concepts more closely, however, you should read Box 5.1, which describes the logical
foundations of the experimental method in a set of rules proposed by the British
philosopherJohn Stuart Mill in 1843.
Stuart Mill and the Rules
C
b )ohn Stuart Mill (1805-1 873) was England's preeminent
nineteenth-century phdosopher. Although he was known
primarily as a political philosopher, much of his work has
direct relevance for psychology. For example, his book on
The Subjection of Women (1869) argued forcefully and well
ahead ofits time that women had abilities equalto those ofmen andoughttobe treated
equallywith men. Ofimportance for our focus onmethodology, in 1843he published
Chapter 5. Introduction to Experimental Research
*Cr ir.i.Qti rYtt"fi;A-F-..Tvi97sisi7 +." I Z k T i,'Y S I . ' -,' .,'-\ ' +*IAj,:i",&v . , I ) 8p' . I S T ' ) 8,. -1; ,.i .:T .: '
A System oflogic, Ratiocinativeand Indztctiv&~?~~g~&,@~~~~~e,dViewofthe PrinciJll&?$@u-~~:'
idence, and the Methods ofscientijic ~ n v e s t ~ a ~ i ~ ~ ~ & ~ ~ s , e ~ & ~ $ ~ t ~ ~ e ~ &they
could into a title!). In his Logic, Mill argued E ~ g ~ t ~ @ & ~ & ~ & ~ ~ ~ x c _ & n c eof p?~$c&g~g@,:~;
(hecalledit "ethology") on the groundsth& ~l&$&nig>pno$@e,il@I&e levelofpreG- sion
ofphysics,it could dojust aswell as 6tlietdirdpifiE9$rhatrm1:4~c*$j$ereds~entft;~~
at the time (meteorologywas the example he used).$Q$ also laid OL$& set of methods
that form the logical basis forwhat you will learn inst.@$chapteranc$&the chapter on
correlation. The methods were those of 'Agreement" and "Difference'~re1evantfor 1
this chapter),and of"Concomitant Vanation" (relevantforcorrelation--see Chapter 9, '
Titken together, the methods of Agreement and Difference enable us to conclude,
with a high degree of confidence, that some outcome, Y, was caused by some factor, .
X. The Method of Agreement states that if X is regularly followed hy $~qpX
is s~ficientfor Y to occur, and could be a cause of Y. That is, "if X, ?hen.V?i5vTheMethod
of Difference states that if Y does not occur when X does not occur, then
'
X is necessary for Y to occur-"if not X, then not Y." Taken together (what M$'
called the "Joint Method"), the methods of Agreement and Difference provide t6;
necessary and sufficient conditlons (i.e., the immediate cause) for the production,^
of Y. . /
To make this more concrete,suppose we are trying to determine ifwatching violent
TV causes a child to be aggressive. "Watching violent TV" is X, and "aggression" 49:
Y. If we can determine that every time a child watches violent TV (X), the result is'.
some act of aggression (Y), then we have satisfied the method of Agreement, and we i'
can say that watching violent TV is enough (sufficient)to produce aggression. If the .
chlld watchesviolent TV,then aggressionoccurs ("IfX, then Y"). Ifwe can also she$?j;
that whenever violent TV is not watched (not X), the child is not aggressive (not %L,c<~
r: ?.thenwe can say that watchingviolent TV is necessaryin order for aggressionto occuli~Ei't
'::dIf the child does not watch violent TV, aggression does not occur,(t'If,not X, th6q':t
not Y"). I
~~~r-ib~ngacd~ i ~ < s q i v v ~. I
It is important to note that in the real world of research, the conditionsdescribedin 1
0 L
.'@ese methods are never met fully. That is, it will be impossible to identlfyand measure
'
&he outcome of everyinstanceof every childwatchingTV Rather, the best one can do ,
!&Is to obseme systematicallyas many instances as possible, under controlled conditlons,
&@nd then draw conclusions with a certain amount of confidence. That is precisely ,,
k& I
&@what research psychologists do and, as you recall from the Chapter 1 discussion of .;
%$(+&ientific thinking, the reason why researchers regard all knowledge based on science
'
@be tentative, pending additionalresearch. As fmdings are replicated, c nfipFpc~in..
%$&hemincreases. $1 !$?2F:&Ygfq-T
As you work through this chapter, especially at the point where you learn aboct-;!
studies with experimental and control groups, you will see that an experhentar group
(e.g., some children shown violent TV shows) accomplishes Mill's Method
.
of Agreement, wlde a control group (e.g., other children not shown violent films) .
accomplishes the Method of Difference. Studieswith both experimentaland control
Essential Features ofExperirnenta1 Research
Establishing Independent Variables
Any experiment can be described as a study investigating the effect of X on Y.
The "X" is Woodworth's independent variable: it is the-factor of interest to the
experimenter, the one that is being studied to see if it wdl influence behavior. It
is sometimes called a "manipulated" factor because the experimenter has complete
control over it and is creating the situations that research participants will encounter
in the study. As you will see, the concept of an independent variable can also be
stretched to cover what are called nonmanipulated or subject variables,but, for now,
let us consider only those independent variables that are under the experimenter's
total control.
Independent variables must have a minimum of two levels. That is, at the very
least, an experiment involves a comparison between two situations (or conditions).
For example, suppose a researcher is interested in the effects of different dosages of
marijuana on reaction time. In such a study, there have to be at least two different
dosage levels in order to make a con~parison.This study would be described as an
experiment with "amount ofmarijuana" as the independent variable and "dosage 1"
and "dosage 2" as the two levels of the independent variable. You could also say that
the study has two conditions in it-the two dosage levels. Of course, independent
variables can have more than two levels. In fact, there are distinctadvantagesto adding
levels beyond the minimum of two, as you will see in Chapter 7 on experimental
design.
Experimental research can be either basic or applied in its goals, and it can be
conducted either in the laboratory or in the field (referback to Chapter 3, pp. 78-83
for an elaboration of these distinctions). Experiments that take place in the field are
sometimes called field experiments. The term field research is a broader term
for any empirical research outside of the laboratory, including both experimental
studies and studies using nonexperimental methods.
Varieties of Independent Variables
The range of factors that can be used as independent variables is limited only by
the creative thinking of the researcher. However, independent variables that are
manipulated in a study tend to fall into three somewhat overlapping categories:
situationalvariables, task variables, and instructional variables.
Situational variables refer to different features in the environment that participants
might encounter. For example, in a helping behavior study, the researcher
interested in studying the effect of the number of bystanders on the chances of help
being offered might create a situation in which participants encounter a person in
need of help. Sometimes the participant is alone with the person needing aid; at
other times the participant and the victim are accompanied by a group of either
three or six bystanders. In this case, the situational independent variable would be
the number of potential helpers on the scene besides the participant, and the levels
would be zero, three, and six bystanders.
Sometimes experimenters vary the type of task performed by participants. One
way to manipulate task variables is to give groups of participants different hnds
of problems to solve. For instance, research on the psychology of reasoning often
Chapter 5. Introduction to Experimental Research
involves giving people different kinds of logical problems to determine the kinds of
errors people tend to make. Similarly, mazes can differ in the degree of complexity,
different types of illusions could be presented in a perception study, and so on.
Instructional variables are manipulated by asking different groups to perform
a particular task in different ways. For example, children in a memory task who are
all shown the same list of words might be given different instructions about how to
memorize the list. Some might be told to forin visual images of the words, others
might be told to form associations between adjacent pairs of words, and still others
might be told simply to repeat each word three times as it is presented.
Of course, it is possible to combine several types of independent variables in a
single study. A study of the effects of crowding, task difficulty, and motivation on
problem-solving ability could have participants placed in either a large or a small
room, thereby manipulating crowding through the situational variable of room
size. Some participants in each type of room could be given difficult crossword
puzzles to solve and others less difficult ones; this illustrates a task variable. Finally,
an instructional variable could manipulate motivation by telling participants that
they will earn either $1or $5 for completing the puzzles.
Control Groups
In some experiments, the independent variable is whether or not some treatment is
administered. The levels of the independent variable in this case are essentially 1and
0; some get the treatment and others don't. In a study of the effects of TV violence
on children's aggressivebehavior, for instance, some children might be shown a violent
TV program, while others don't get to see it, or see a nonviolent TV show.
The term experimental group is used as alabel for the first situation, in which the
treatment is present. Those in the second type of condition, in which treatment is
withheld, are said to be in the control group. Ideally, the participants in a control
group are identical to those in the experimental group in all ways except that the
control group participants do not get the experimental treatment. As you recallfrom
Box 5.1, the conditions of the experimental group satisfi Mill's Method of Agreement
(ifviolent TV, then aggression) and the control group can satisfj the Method
ofDifference (ifno violent TV, then no aggression).Thus, a simple experiment with
an experimental and a control group is an example of what Mill called the "Joint
Method." In essence, the control group provides a baseline measure against which
the experimental group's behavior can be compared. Think of it this way: control
group = con~parisongroup.
Please don't think that control groups are necessary in all research, however. It
is indeed important to co~ztvolextraneous variables, as you are about to learn, but
control groups occur in research only when it is important to have a comparison
with a baseline level of performance. For example, suppose you were interested in
the construct "sense of direction," and wanted to know whether a specific training
program would help people avoid getting lost in new environments. In that study,
a reasonable comparison would be between a training group and a control group
without training. On the other hand, if your empirical question concerns gender
differences in sense of direction, the comparison wdl be between a group of males
and a group of females-neither would be considered a control group. You will
F Essential Features of Experimental Research
learn about several specialized types of control groups in Chapter 7, the first of two
chapters dealing with experimental design.
controlling Extraneous Variables
The second feature of the experimental method is that the researcher tries to control
what are called extraneous variables. These are any variables that are not of
interest to the researcher but which might influence the behavior being studied
if they are not controlled properly. As long as these are held constant, they present
no danger to the study. If they are not adequately controlled, however, they might
influence the behavior being measured in some systematicway. The result is called
confounding. A confound is any uncontrolled extraneous variable that "covaries"
with the independent variable and could provide an alternative explanation of the
results. That is, a confounding variable changesat the same time that an independent
variable changes (i.e.,they "covary") and, consequently,its effect cannot be separated
from the effect of the independent variable. Hence, when a study has a confound,
the results could be due to the effects of either the confounding variable or the
independent variable, or some combination of the two, and there is no way to
decide among these alternatives.
To illustrate some obvious confounding, consider a verbal learning experiment
in which a researcher wants to show that students who try to learn a large amount
of material all at once don't do as well as those who spread their learning over
several sessions. That is, massed practice (cramming?) is predicted to be inferior to
distributed practice. Three groups of students are selected, and each group is given
the same five chapters in a general psychology text to learn. Participants in the
first group are given 3 hours on Monday to study the material. Participants in the
second group are given 3 hours on Monday and 3 hours on Tuesday, and those in
the final group get 3 hours each on Monday, Tuesday, and Wednesday. O n Friday, all
the groups are tested on the material (see Table 5.1 for the design). The results show
that Group 3 scores the highest, followed by Group 2. Group 1 does not do well
at all, and the researcher concludes that distributed practice is superior to massed
practice. Do you agree with this conclusion?
You probably don't, because there are two serious confounds in this study, both
easy to spot. The participants certainly differ in how their practice is distributed
(1, 2, or 3 days), but they also differ in how much total practice they get (3, 6, or
TABLE5.1 Confoundingin a Hypothetical Distribution
ofPractice Experiment
Monday Tuesday Wednesday Thursda Friday
Group 1 3 - - Exam
Group 2 3 3 - - Exam
Group 3 3 3 3 - Exam
Note: The 3 in each equals the number of hours spent studying five chapters of a general psychology
text.
Chapter 5. Introduction to Experimental Research
TABLE5.2 Identifying Confounds
Levels of IV EV 1 EV 2 DV
Distribution Study Retention Retention Test
of Practice Hours Interval Performance
1day 3 hours 3 days Lousy
2 days 6 hours 2 days Average
3 days 9 hours 1 day Great
IV = independendt variable.
EV = extraneous variable.
DV = dependent variable.
9 hours). This is aperfect exampleof a confound-it is impossible to tell if the results
are due to one factor (distribution ofpractice) or the other (totalpractice hours); the
two factors covaryperfectly. The way to describe this situation is to say "distribution
of practice is confounded with total study hours." The second confound is perhaps
less obvious but is equally problematic. It concerns the retention interval. The test
is on Friday for everyone,but different amounts of time have elapsed between study
and test for each group. Perhaps Group 3 did the best because they studied the
material most recently and forgot the least amount. In this experiment, distribution
of practice is confounded both with total study hours and with retention interval.
Each confound by itself could account for the results, and the factors may also have
interacted with each other in some way to provide yet another interpretation.
Look at Table 5.2, which gives you a convenient way to identifj confounds. In
the first column are the levels of the independent variable and in the final column
are the results. The middle columns are extraneous variables that should be held
constant through the use of appropriate controls. If they are not kept constant,
then confounding exists. As you can see for the distributed practice example, the
results could be explained by the variation in any of the first three columns, either
indvidually or in some combination. To correct the confound problem in this case,
you need to ensure that the middle two columns are constant instead of variable.
A problem that students sometimes have with understanding confounds is that
they tend to use the term whenever they spot something in a study that might not be
right. For example, suppose the distribution ofpractice study included the statement
that only femaleswere used in the study.Somestudentsreading the description might
think there's a confound here-gender. What they really mean is they believe both
males and females ought to be in the study and that might indeed be the case, but
gender is not a confound in this example. Gender would be a confound only if males
were usedjust in one condition and only females were used in one other condition.
Then any group differences in the results could be due to the independent variable
or to gender. So be careful. A confound is a serious flaw in a study but not all design
flaws are confounds.
In the Applications exercises at the end of the chapter you will be identifjing
confounds. You might find the task easier if you fit the problems into the Table 5.2
format. Take a minute and redesign the distributed practice study. How would you
eliminate the confounding from these extraneous variables?
Essential Features ofExperirnenta1 Research
Learning to be aware of potential confounding factors and building appropriate
ways to control for them is one of the scientific thinking skills that is most difficult to
develop. Not all confounds are as obvious as the massed/distributed practice example.
We'll encounter the problem often in the remaining chapters and address it again
shortly in the context of a discussionof what is called the internal validity of a study.
Measuring Dependent Variables
The third part of any experiment is measuring some behavior that is presumably
being influenced by the independent variable. The term dependent variable is
used to describ'e those behaviors that are the measured outcomes of experiments.
If, as mentioned earlier, an experiment can be described as the effect of X on Y
and "X" is the independent variable, then "Y"is the dependent variable. In a study
of the effects of TV violence on children's aggressiveness, the dependent variable
would be some measure of aggressiveness. In the distribution of practice study, it
would be a measure of exam performance.
The credibility of any experiment and its chances of discovering anything ofvalue
depend partly on the decisions made about what behaviors to measure as dependent
variables. We've already seen that empirical questions cannot be answered unless the
terms are defined with some precision. You might take a minute and review the
section on operational definitions in Chapter 3 (pp. 85-86). When an experiment is
designed, one key component concerns the operational definitions for the behaviors
to be measured as dependent variables. Unless the behaviors are defined precisely,
replication is impossible.
Deciding on dependent variables can be tricky. A useful guide is to know the
prior research and use already-established dependent measures, those that have been
shown to be reliable and valid. Sometimes you have to develop a new measure,
however, and when you do, a brief pilot study might help you avoid two major
problems that can occur with poorly chosen dependent variables-ceiling and floor
effects. A ceiling effect occurs when the average scores for the different groups in
the study are so high that no difference can be determined. This happens when your
dependent measure is so easy that everyone gets a high score. Conversely, a floor
effect happens when all the scores are extremely low because the task is too difficult
for everyone, once again producing a failure to find any differencesbetween groups.
One finalpoint about variables.It is important to realize that aparticular construct
could be an independent, an extraneous, or a dependent variable, depending on the
research problem at hand. An experiment might manipulate a particular construct
as an independent variable, try to control it as an extraneous factor, or measure it
as a dependent variable. Consider the construct of anxiety, for instance. It could be
a manipulated independent variable by telling participants that they will be experiencing
shocks that will be either moderate or painful when they make errors on a
simulated driving task. Anxiety could also be a factor that needs to be held constant
in some experiments. For instance, if you wanted to evaluate the effects of a public
spealungworkshop on the ability of students to deliver a brief speech, you wouldn't
want to videotape the students in one group without taping those in the other group
as well. If everyone is taped, then the level of anxiety created by that factor (taping)
Chapter 5. Introduction to Experimental Research
is held constant for evelyone. Finally, anxiety could be a dependent variable in a
study of the effects of different types of exams (e.g., multiple choice vs. essay) on
the perceived test anxiety of students during final exam week. Some physiological
measures of anxiety might be used in this case. Anxiety could also be considered a
personality characteristic, with some people having more of it than others; this last
possibility leads to the next topic.
J Self Test 5.1
1. In a study of the effects of problem difficulty (easy or hard) and reward size
($1or $5 for each solution) on an anagram problem-solving task, what are the
independent and dependent variables?
2. What are extraneous variables and what happens if they are not controlled
properly?
3. Explain how frustration could be an independent, extraneous, or dependentvariable,
depending on the study.
Manipulated versus Subject Variables
Up to this point, the term independent variable has ineant some factor manipulated
directly by the researcher. An experiment compares one condition created by and
under the control of the experimenter with another. However, in many studies,
comparisons are also made between groups of people who differ from each other
in ways other than those designed by the researcher. These comparisons are made
between factors that are referred to variously as ex post facto variables, natural group
variables, nonmanipulated variables, or subject variables, which is the term I will
use. They refer to already existing characteristics of the individuals participating
in the study, such as gender, age, socioeconomic class, cultural group, intelligence,
physical or psychiatric disorder, and any personality attribute you can name. When
using subject variables in a study, the researcher cannot manipulate them directly
but must select people for the different conditions of the experiment by virtue of the
characteristicsthey already have.
To dustrate the differences between manipulated and subject variables, consider
a hypothetical study of the effects of anxiety on maze learning in humans. You
could man@ulate anxiety directly by creating a situation in which one group is made
anxious (told they'll be performing in front of a large audience perhaps), while a
second group is not (no audience). In that study, any person who volunteers could
potentially wind up in one group or the other. To do the studyusing a subject variable,
on the other hand, you would select two groups differingin their characteristiclevels
of anxiety and ask each to try the maze. The first group would be those who were
anxious types ofpeople (as determined ahead of time by a personality test for anxiety
proneness). The second group would include more relaxed types of people. Notice
Manipulated versus Subject Rriables
the major differencebetween this situation and one involvingamanipulated variable.
With anxlety as a subjectvariable,volunteers cominginto the study cannot be placed
into either of the conditions (anxious-all-the-time-Fred cannot be put into the lowanxiety
group),but must be in one group or the other, depending on attributes they
already possess prior to entering the study.
Some researchers, true to Woodworth's original use of the term, prefer to reserve
the term independent variable for those variables directly manipulated by the
experimenter. Others are wdling to include subject variables as examples of a particular
type of independent variable on the grounds that the experimenter has some
degree of control over them by virtue of the decisions involved in selecting them in
the first place. I take this latter position and wdl use the term independent variable
in the broader sense. However, whether this term is used broadly (manipulated +
subject) or narrowly (manipulated only) is not important, providing you understand
the difference between a manipulated and a nonmanipulated or subject variable.
Research Example 4-Using Subject Variables
One common type of research using subject variables examines differences from
one culture to another. Ji, Peng, and Nisbett (2000) provide a nice example. In a
series of studies involving various cognitive tasks, they looked at the implications of
the differences between those raised in Asian cultures and those raised in Western
cultures. In general, they pointed out that Asians, especially those from China,
Korea, and Japan, have a "relatively holistic orientation, emphasizing relationships
and connectedness" (p. 943) among objects, rather than on the individual properties
of the objects themselves. Those from Western cultures, especially those deriving
from the Greek "analytic" tradition, are "prone to focus more exclusively on the
object, searching for those attributes of the object that would help explain and
control its behavior" (p. 943).
This cultural differenceledJi et al. (2000) to make several predictions, including
one that produced a study with two separate subject variables-culture and gender.
They chose a cognitive task that has a long history, the rod and frame test (RFT).
While sitting in a darkened room, participants in an RFT study see an illuminated
square frameprojected on a screenin front of them, alongwith a separate illuminated
straightline (rod)inside the fiame. Theframe can be oriented to various anglesby the
experimenter and the participant's task is to move a device that changes the orientation
of the rod. The goal is to make the rod perfectlyvertical,regardless ofthe frame's
orientation. The classic finding (Witkin & Goodenough, 1977)is that some people
(fieldindependent) are quite able to bring the rod into a true vertical position, disregarding
the distraction of the frame, while others (field dependent) adjust the rod
with reference to the frame and not with reference to true vertical. Can you guess the
hypothesis?The researcherspredicted that those from Asian cultures would be more
likely to be field dependent than those from Western cultures. They also hypothesized
greater field dependence for females, a prelction based on a typical finding in
RFT studies. So, in terms of the concepts introduced in Chapter 3 (pp. 102-103),
part of this study (gender)involved replication and part (culture) involved extension.
Because the undergraduate population of the University of Michigan (where the
study was conducted) includes a large number of East Asians,Ji et al. (2000) were
Clzapter 5. Introductiolz to Experimental Research
IEuropean Americans
EastAsians
1
Male Female
FIGURE5.1 Gender and cultural differences in the rod and frame
test, fromJi, Peng, and Nisbett's (2000)cross-cultural study.Note the
vertical lines at the top of each bar; these are called "error bars," and
they reflect variability around the mean (see pp. xx).
able to complete their study using students enrolled in general psychology classes
there (in a few pages you'll be learning about "subject pools"). They compared 56
European Americans with 42 East Asians (most from China, Korea, andJapan) who
had been living in the United States for an average of about 2.5 years. Students in
the two cultural groups were matched in terms of SAT math scores, and there were
about an equal number of males and females in each group.
As you can see from Figure 5.1, the results supported both hypotheses. The
finding about females being more field dependent than males was replicated, and
the difference occurred in both cultures. In addition, the main finding was the
consistent difference between the cultures-those from East Asian cultures were
more field dependent than the European Americans. AsJi et al. (2000) described the
outcome, the relative field independence of the Americans reflected their tendency
to be "more attentive to the object and its relation to the self than to the field"
(p. 951), while the field dependent Asians tended to be "more attentive to the field
and to the relationship between the object and the field" (p. 952). One statistical
point worth noting relates to the concept of an outlier, introduced in Chapter 4
(p. 137).Each subject did the RFT task 16 times and, on average, 1.2of their scores
were omitted from the analysis because they were significantlybeyond the normal
Manipulated versus Subject Vdriables
range of scores. Their operational definition of outlier was somewhat technical, but
related to the &stance from the interquartile range, another concept you recall from
Chapter 4 (p. 138).
Only a study using manipulated independent variables can be called an experiment
in the strictest sense of the term; it is sometimes called a "true" experiment (which
sounds abit pretentious and carries the unfortunate implication that other studiesare
"false"). Studies using independent variables that are subject variables are occasionally
called ex post facto st~tdiesor quasi experiments ("quasi" meaning "to some degree"
here).' Sometimes (often, actually)studieswdl includeboth manipulated and subject
independent variables. Being aware of the presence of subject variables is important
because they affect the hnds of conclusions that canbe drawn from the study's results.
Drawing Conclusions When Using Subject Variables
Put a little asterisk next to this section-it is extremely important. Recall from
Chapter 1that one of the goals of research in psychology is to hscover explanations
for behavior. That is, we wish to know what caused some behavior to occur. Simply
put, with manipulated variables, conclusions about the causes of behavior can be
made; with subject variables, they cannot. The reason has to do with the amount of
control held by the experimenter in each case.
With manipulated variables, the experiment canmeet the criterialistedin Chapter
1 for demonstrating causality. The independent variable precedes the dependent
variable, covaries with it, and, assuming that no confounds are present, can be
considered the most reasonable explanation for the results. In other words, if you
vary some factor and successfullyhold all else constant, the results can be attributed
only to the factor varied. In a confound-free experimental study with two groups,
these groups will be essentially equal to each other (i.e., any differences will be
random ones) in all ways except for the manipulated factor.
When using subject variables, however, the experimenter can also vary some
factor (i.e., select participants having certain characteristics) but cannot hold all
else constant. Selecting participants who are high or low on some definition of
anxiety proneness does not guarantee that the two groups wdl be equivalent in
other ways. In fact, they might be different from each other in several ways (in
self-confidence, perhaps) that could influence the outcome of the study. When a
difference between the groups occurs in this type of study, we cannot say that the
differenceswere caused by the subjectvariable.In terms of the conditions for causality,
while we can say that the independent variable precedes the dependent variable and
covaries with it, we cannot eliminate alternative explanations for the relationship
because certain extraneous factors cannot be controlled. When subject variables are
present, all we can say is that the groups performed differently on the dependent
measure.
'The term quasi-experimental design is actually a broader designation referring to any type of design in
whch participants cannot be randomly assigned to the groups being studied (Cook & Canlpbell, 1979).
These designs are often found in applied research and are elaborated in Chapter 10.
Clzapter 5. Introduction to Expevirnental Research
An example from social psychology might help to clari+ the distinction. Suppose
you were interested in altruistic behavior and wanted to see how it was affected
by the construct of c'self-esteem." The study could be done in two ways. First,
you could manipulate self-esteem directly by first giving participants a personality
test. By providing different kinds of false feedback about the results of the test,
both positive and negative, self-esteem could be raised or lowered temporarily. The
participants could then be asked to do some volunteer work to see if those feeling
good about themselveswould be more likely to help.% second way to do this study
is to give participants a reliable and valid personality test for level of self-esteem
and select those who score in the upper 25% and lower 25% on the measure as the
participants for the two groups. Self-esteem in this case is a subject variable-half of
the participants will be low self-esteem types, while the other half wdl be high selfesteem
types. As in the first study, these two groups of people could be asked about
volunteering.
In the first study, differences in volunteering can be traced directly to the selfesteem
manipulation. If all other factors are properly controlled, the temporary
feeling of increased or decreased self-esteem is the only thing that could have produced
the differences in helping. In the second study, however, you cannot say
that high self-esteem is the direct cause of the helping behavior; what you can
say is that people with high self-esteem are more likely to help than those with
low self-esteem. All you can do is to speculate about the reasons why this might
be true because these participants may differ from each other in other ways unknown
to you. For instance, high self-esteem types of people might have had
prior experience in volunteering, and this experience might have had the joint
effect of raising or strengthening their self-esteem and increasing the chances that
they will volunteer in the future. Or they might have greater expertise in the
specific volunteering tasks (e.g., public speaking skdls). As you will see in Chapter
9, this difficulty in interpreting research with subject variables is exactly the
same problem encountered when trying to draw conclusions from correlational
research.
Returningfor amoment to theJi, Peng, and Nisbett (2000)study,which featured
the subject variables of culture and gender, the authors were careful to avoid drawing
conclusions about causality. The word "cause" never appears in their article, and the
descriptions of results are always in the form "this group scored higher than this
other group."
Before moving on to the discussion of the validity of experimental research, read
Box 5.2. It identifies the variables in a classic study that you probably recall from
your general psychology course-one of the so-called Bobo experiments that first
investigated imitative aggression. Working through the example will help you apply
your knowledge ofindependent, extraneous, and dependent variables, andwill allow
you to see how manipulated and subject variables are often encountered in the same
study.
' ~ a n i ~ u l a t i n ~self-esteem raises ethical questions that were considered in a study by Subvan and Deiker
(1973).See Chapter 2, p. 58.
m
Manipulated versus Subject Variables
)IES--Bob0 Dolls and Aggression
Ask any student who has just completed a course in child,
social, or personality psychology (perhapseven general psychology)
to tell you about the Bobo doll studies. The re-
:.(<.
-
sponse
will be immediate recognition and a brief description
along the lines of "Oh, yes, the studies showing that
chddren will punch out an iidated doll if they see an adult
doing it." A description of one of these studies is a good way to clarify further the
differencesbetween independent, extraneous, and dependent variables. The studywas
published by Albert Bandura and h s colleagues in 1963 and is entitled "Imitation of
Film-Mediated Aggressive Models" (Bandura,Ross, & Ross, 1963).
~stablishin~Independent Variables
The study included both manipulated and subject variables. The major manipulated
variable was the type of experience that preceded the opportunity for aggression.
There were four levels, including three experimentalgroups and a control group.
Experimentalgroup 1:real-life aggression (chddrendirectly observedan adult model
aggressingagainst the Bobo doll)
Experimental group 2: human film aggression (children observed a film of an adult
model aggressingagainst Bobo)
Experimental group 3: cartoon fdm aggression (chddren observed a cartoon of
"Herman the Cat" aggressing against a cartoon Bobo)
Controlgraup: no exposure to aggressive models
The nonmanipulatedindependent variable (subjectvariable) was gender. Mde and
femalestudents&omthe StanfordUniversity Nursery School(mean age =52 months)
were the participants in the study. (Actually,there was also another manipulated variable;
participantsin groups 1and 2 were exposedto eithera same-gender or oppositegender
model.) The basic procedure of the experiment was to expose the children
to some type of aggressive model (or not, for the control group), and then put them
into a room full of toys (includingBobo), thereby givingthem the opportunity to be
aggressive themselves.
Controlling Extraneous Variables
Several possible confounds were avoided. First, in groups 1 and 2, the adults aggressedagainst
a J-joot Bobo doll. When given a chance to pummel Bobo themselves,
the children were put into a room with a 3-joot Bobo doll, This kept the size relationship
between person and doll approximately constant. Second, participants in all
four groupswere mildly frustratedbefore being given a chance to aggress. They were
Chapter 5, Introduction to Experimerztal Research
allowed to play for $few minutes with some
the experimenter that the toys were special
children. Thus, for all of the children,there was an approximately equivalentincrease
in their degree of emotional arousaljust prior to the time when they were given the
opportunity to be aggressive.Any differences in aggressiveness could be attributed to
the mitative effects and not to any emotional differences between the groups
Measuring Dependent Variables
Several Werent measures of aggression were used in this study. Aggressive responses
were categorized as imitative, partially imitative, or nonirnitative, depending on how
closely they matched the model's behavior. For example, the operational definition
of imitative aggressive behaviors included striking the doll with a wooden mallet,
punching it in the nose, and kicking it. Partially imitative behaviors included hitting
something else with the mallet and sitting on the doll but not hitting it. Nonirnitative
aggression included shooting darts from an available dart gun at targets other than
Bobo and actingaggressively toward other objects in the room.
160 malesare generally more
- aggressivehere, as wall as
Inthe other R rnnditjons
-
-
g!
120
0
U='""I7100- - -
0.-
3 ao-
- P
' -- 60--1 +-
h
i 40>: -
20 O>eal-llfe
Human-film Cartoon-film
aggression aggression aggression
Aggression is significantlylower here
than in the three experimentalconditions.
FIGURE5.2 DatafromBandura, Ross, andRoss's
of imitation on aggression.
Brieflx the resultsofthe studywere that childrenin groups 1,2, and 3showedsignificantly
more aggressionthan those in the control group, but the same amount of overall
aggression occurred regardless of the type of modeling. Also, boys were more aggressive
than girlsin all conditions; some gender mfferences also occurred in the form
of the aggression: girls "were more inclined than boys to sit on the Bobo doll but
[unlike the boys] refrained from punchlng it" (Bandura et al., 1963,p. 9). Figure 5.2
summarizes the results.
: The Validity of Experimental Reseavclz
The Validity of Experimental Research
Chapter 4 introduced the concept of validity in the context of measurement. The
term also applies to experiments as a whole. Just as a measure is valid if it measures
what it is supposed to measure, psychologicalresearch is said to be valid if it provides
the understanding about behavior that it is supposed to provide. This section of the
chapter introduces four different types of validity, following the scheme outlined
by Cook and Campbell (1979) for research in field settings but applicable to any
research in psychology. The four types of validity are statistical conclusion validity,
construct validity (again), external vahlty, and internal validity.
\
Statistical Conclusion Vafidity
The previous chapter introduced you to the use of statistics in psychology. In particular,
you learned about measurement scales, the distinction between descriptive
and inferential statistics,and the basics of hypothesis testing. Statistical conclusion
validity concerns the extent to which the researcher uses statistics properly and
draws the appropriate conclusions from the statistical analysis.
The statisticalvalidity of a study can be reduced in several ways. First, researchers
might do the wrong analysis or violate some of the assumptions required for performing
a particular analysis. For instance, the data for a study might be measured
using an ordinal scale, thereby requiring the use of a particular type of statistical
procedure. The researcher, however, mistakenly uses an analysis that is appropriate
only for interval or ratio data. Second, the researcher might selectivelyreport some
analyses that came out as predicted but might not report others (guesswhich ones?),
a practice that borders on fraud (see Chapter 2, pp. 68-70). The third example of
a factor that reduces the statistical validity of a study concerns the reliability of the
measures used. If the dependent measures are not reliable, there will be a great deal
of error variability,which reduces the chances of finlng a significanteffect. If a true
effect exits (i.e., Hoshould be rejected),but low reliability results in a failure to find
that effect, the outcome would be a Type I1 error.
The careful researcher decides on the statistical analysis at the same time that
the experimental design is being planned. In fact, no experiment should ever be
designed without giving thought to how the data wdl be analyzed.
Construct Validity
The previous chapter described construct validity in the context of measuring psychological
constructs: it refers to whether a test truly measures some construct (e.g.,
self-efficacy connectedness to nature). In experimental research, construct validity
has a related meaning: it refers to the adequacy of the operational definitions for
both the independent and the dependent variables used in the study. In a study of the
effects of TV violence on chddren's aggression, questions about construct validity
could be (a)whether the programs chosen by the experimenter are the best choices
to contrast violent with nonviolent television programming, and (b) whether the
Chapter 5. Introduction to Experimental Researclz
operational definitions and measures of aggression used are the best ones that could
be chosen. If the study used violent cartoon characters (e.g.,Elmer Fudd shooting at
Bugs Bunny) compared to nonviolent characters (e.g., Winnie the Pooh), someone
might argue that children's aggressive behavior is unaffected by fantasy; hence, a
more valid manipulation of the independent variable, called "level of filmed violence,"
would involve showing children realistic films of people that varied in the
amount of violence portrayed.
Similarly, someone might criticize the appropriateness of a measure of aggression
used in a particular study.This, in fact, has been a problem in research on aggression.
For rather obvious ethical reasons, you cannot design a study that results in subjects
punching each other's lights out. Instead, aggression has been defined operationally
in a variety of ways, some of which might seem to you to be inore vahd (e.g.,
angered participants believing they are delivering shocks to another person) than
others (e.g., horn honlung by frustrated drivers). As was true for the discussion of
construct validty in the previous chapter when the emphasis was on measurement,
the validity of the choices about exactly how to define independent and dependent
variables develops over time as accumulated research fits into a coherent pattern.
External Validity
Experimental psychologists have been occasionally criticized for knowing a great
deal about college sophomores and white rats and very little about anything else.
This is, in essence, a criticism of external validity, the degree to which research
findings generalizebeyond the specific context of the experiment being conducted.
For research to achieve the highest degree of external validity, it is argued, its results
should generalize in three ways-to other populations, to other environments, and
to other times.
Other Populations
The comment about rats and sophomores fits here. As we have seen in Chapter 2,
part ofthe debate over the appropriatenessofanimal researchhas to do with how well
this researchprovides explanationsthat are relevant for human behavior. Concerning
sophomores, recall that Milgram deliberately avoided using college students, and
selected adultsfrom the generalpopulation as subjectsfor his obedience studies.The
same cannot be said of most socialpsychologists, however. A survey by Sears (1986)
of research in social psychology found that 75% of the research published in 1980
used undergraduates as participants. When Sears repeated the survey for research
published in 1985,the number was 74%.And it is notjust socialpsychologistswhose
studies feature a h g h percentage of college students-since it began publication in
1992, 86% of the empirical articles in theJourna1 of Consurnev Psychology have used
college student samples (Jaffe,2005). Sears argued that the characteristicsof college
students as a population could very well bias the general conclusions about social
phenomena. Compared to the general population, for instance, college students
are more able cognitively, more self-centered, more susceptible to social influence,
and more likely to change their attitudes on issues. To the extent that research
investigatesissues related to those features,results from students might not generalize
The Klidity ofExperimental Research
to other groups, according to Sears. He suggested that researchers expand their
databases and replicate important findings on a variety of populations. However,
he also pointed out that many research areas (e.g., perception, cognition) produce
outcomes relatively unaffected by the special characteristics of college students, and
there is no question that students exist in large numbers and are readily available.
Some special ethical considerations apply when using this group, as outlined in
Box 5.3.
- I : i - I
i . L _" 1 ;%" -7-..- :-,r '
i ; , ~
Icruiting participants: Everyone's &'thePool
Most research psychologists are employed by colleges and
universities and consequentlyfind themselves surrounded by
an available supply ofparticipantsfor their research. Because
studentsmay not readily volunteerto participate in research,
most university psychology departments establish what is
called the subject pool or the participant pool. The term
refersto agroup ofstudents, typicallythose enrolledinintroductorypsychology classes,
who are asked to participate in research as part of a course requirement. If you are a
student at a large university, you have probably had the experience of "volunteering"
for two or three experiments in order to avoid losing points or acquiring a grade of
Incompletefor the course.At alarge university,if 800 studentstake generalpsychology
each semester and each student signsup for three studies,that makes 2,400 participants
available to researchers.
Subject pools are convenientfor researchers,and they are defended on the grounds
that research participation is part of the educational process -el, 1996). Ideally
students can acquire deeper insights into the research process by being in the middle
of experiments and learning something about the psychological phenomena being
investigated. To maintain the "voluntary" nature, students are given the opportunity
to complete the requirement with alternatives other than direct research participation.
Problems exist, however. Critics argue that the pools are not really voluntary, that
alternative activities (e.g., writing papers) are often so onerous and time-consuming
that students are effectivelycompelledto signup for the research, and that the research
experience is more likely to be tedious and meaningless than educational (Korn,
1988). Some research supports such concern. A study by Sieber and Saks (1989)
found evidence that 89% of 366 departments surveyed had pools that faded to meet
at least one of the APAS recommendations (below).
Despite the potential for abuse, many psychology departments try to make the
research experience educational for students. For example, during debriefing for a
memory experiment, the participant/student could be told how the study relates to
the information in Chapter X of the text being used in the introductory course.
Clzapter 5. Introduction to Experimental Research
Many departments also include creative alternative activities. These include having
nonparticipatingstudents (a) observe ongoing studies and record their observations,
(b) participate in some community volunteer work, or (c) attend,aresearch presentation
by a visiting scholar and write a brief summary of it (-el, 1996; McCord,
1991). Some studies have shown that students generally find research participation
valuable, especially ifresearchers make an explicit attempt to tie the participation to
the educationoccurring in the general psychology course (e.g.,Landrum & Chastain,
1999;Leak, 1981).
The APA (1982,pp. 47-48) has provided some explicit guidelines aboutrecruiting
students as research participants, the main points being these:
J Students should be aware of the requirement be
for the course.
J Studentsshouldget athoroughdescriptionof the requirementon
the first day of class, includinga clear description of alternative
activities if they opt not to serve as researchsubjects.
J Alternative activities must equal research participation in time
and effort and: like participation, must have some educational
value.
J All proposals for research using subject pools must have prior
IRB approval;
J Special effort must be madeto treat students courteously.
J There must bea clear and simpleprocedurefor students to complain
about mistreatment without their course grade being af-
fected.
J All other aspects of the APA ethics code must be rigorouslyfol-
lowed.
J The psychology departmentmust have a mechanismin placeto
provide periodic review of pool policies.
The "college sophomore problem" is only one example of the concern over
generahzing to other groups. Another has to do with gender. Some of psychology's
most famous research has been limited by using only males (or, less frequently, only
females), but drawing conclusions as if they apply to everyone. Perhaps the bestknown
example is Lawrence Kohlberg's research on children's moral development.
Kohlberg (1964) asked adolescent boys (aged 10-16) to read and respond to brief
accounts of various moral dilemmas. O n the basis of the boys' responses, Kohlberg
developed a six-stage theory of moral development that has become a fixture in developmental
psychology texts. At the most advanced stage, the person acts according
to a set of universal principles based on preservingjustice and indvidual rights.
Kohlberg's theory has been criticized on external validity grounds. For example,
Gdhgan (1982)arguedthat Kohlberg's model overlooksimportant gender differences
in thinking patterns and in how moral decisions are made. Males may come to place
the highest value on individual rights, but females tend to value the preservation
of indvidual relationships. Hence, females responding to some of Kohlberg's moral
dilemmas might not seem to be as morally advanced as males, but this is due to
a biasing of the entire model because Kohlberg sampled only males, according to
Gilligan.
Research psychologists also are careful about generalizingresultsfiom one culture
to another. For example, "individualist" cultures are said to emphasize the unique
person over the group, and personal responsibility and initiative are valued. On the
other hand, the group is more important than the individual in "collectivist" cultures
(Triandis, 1995).Research conclusionsbased onjust one culture might not be
universally applicable. To takejust one example, most children in the United States
are taught to place great value on personal achievement. InJapan, on the other hand,
children learn that if they stand out from the crowd, they might diminish the value of
others in the group; individual achievement is not as valuable. One study found that
personal achievement was associatedwith positive emotions for American students,
but with negative emotions forJapanese students (Kitayama,Markus, Matsumoto, &
Norasakkunlut, 1997).To conclude that feeling good about individual achievement
is a universal human trait would be a mistake. Does this mean that all research in
psychology should make cross-cultural comparisons?No. Itjust means that conclusions
sometimes need to be drawn cautiously, and with reference only to the group
studied in the research project.
Other Environments
Besides generalizing to other types of individuals, externally valid results are applicable
to other stimulus settings.This problem is the basis for the occasional criticism
of laboratory research mentioned in Chapter 3-it is sometimes said to be artificial
and too far removed from real life. Recall from the discussion of basic and applied
research (pp. 78-80) that the laboratory researcher's response to criticisms about
artificiality is to use Aronson's concept of experimental reahty. The important thing
is that people are involved in the study; mundane reality is secondary. In addition,
laboratory researchers argue that some research is designed purely for theory testing
and, as such, whether the results apply to real-life settings is less relevant than
whether the results provide a good test of the theory (Mook, 1983).
Nonetheless, important developments in many areas of psychology have resulted
from attempts to study psychologicalphenomena in real-life settings.A good example
concerns the hstory of research on human memoly. For much of the twentieth
century, memory research occurred largely in the laboratory, where countless college
sophomores memorized seemingly endless lists of words, nonsense syllables,
strings of digits, and so on. The research created a comprehensive body of knowledge
about basic memory processes that has value for the development of theories
about memory and cognition, but whether principles discovered in the lab generalized
to real-life memory situations was not clear. Change occurred in the 1970s,
led by Cornell's Ulric Neisser. In Cognition and Reality (1976), he argued that the
laboratory tradition in cognitive psychology, while producing iinportant results,
nonetheless had failed to yield enough useful information about inforination processing
in real-world contexts. He called for more research concerning what he referred
to as ecological validity-research with relevance for the everydaycognitive
Chapter 5. Introductiovl to Experimental Research
activities of people trying to adapt to their environment. Experimental psychologists,
Neisser urged, "must make a greater effort to understand coption as it occurs
in the ordinary environment and in the context of natural purposeful activity. This
would not mean an end to laboratory experiments, but a commitment to the study of
variablesthat are ecologicallyimportant rather than those that are easily manageable"
(P 7).
Neisser's call to arms was embraced by many (but not all, of course) cognitive
researchers, and the 1980s and 1990s saw increased study of such topics as eyewitness
memory (e.g., Loftus, 1979) and the long-term recall of subjects learned in
school, such as Spanish (e.g., Bahrick, 1984). Neisser himself completed an interesting
analysis of the memory ofJohn Dean (Neisser,1981),the White House chief
counsel who blew the whistle on President Richard Nixon's attempted cover-up of
dlegal activities in the Watergate scandal of the early 1970s.Dean's testimony before
Congress precipitated the scandal and led to Nixon's resignation. Dean's 245-page
account was so detailed that some reporters referred to him as a human tape recorder.
As you might know, it was later revealed that the Oval Office meetings described by
Dean were also tape-recorded by the somewhat paranoid White House. Comparing
the tapes with Dean's testimony gave Neisser a perfect opportunity to evaluate
Dean's supposedly photographic memory, which turned out to be not so photographic
after all-he recalled the general topics of the meetings reasonably well but
missed a lot of the details and was often confused about sequences of events. The
important point for external validity is that Neisser's study is a good illustration of
how our knowledge of memory can be enriched by studying phenomena outside
of the normal laboratory environment.
er Times
The third way in which external vahdity is sometimes questioned has to do with
the longevity of results. Some of the most famous experiments in the history of
psychology are the conformity studes done by Solomon Asch in the 1950s (e.g.,
Asch, 1956). These experiments were completed during a hstorical period when
conservative values were dominant in the United States, the "red menace" of the
Soviet Union was a force to be concerned about, and conformity and obedience to
authority were valued in American society. In that context, Asch found that college
students were remarkably susceptible to conformity pressures. Would the same be
true today?Would the factors that Asch found to influence conformity (e.g., group
consensus) operate in the same way now? In general, research concerned with more
fundamental processes (e.g., cognition) stands the test of time better than research
involving social factors that may be embedded in some historical context.
A Note of Caution
Although external validity has value under many circumstances, it is important to
point out that it is not always a major concern of research, and some (e.g., Mook,
1983) have even criticized the use of the term, because it carries the implication
that research low in external "validity" is therefore "invalid." Yet there are many
examples of research, completed in the laboratory under so-called artificial condtions,
that have great value for the understanding of human behavior. Consider
research on "false memory," for example (Roedger & McDermott, 1995). The
The Wlidity ofExperimental Research
typical laboratory strategy is to give people a list of words to memorize, including
a number of words from the same category-"sleep," for instance. The list might
include the words dream, bed, pillow, nap, and so on, but not the broader term
sleep. When recalling the list, many people recall the word sleep and they are often
confident that the word was on the list when they are given a recognition test. That
is, a laboratory paradgm exists demonstrating that people can sometimes remember
something with confidence that they did not experience. The phenomenon has
relevance for eyewitness memory (jurors pay more attention to confident eyewitnesses),
but the procedure is far removed from an eyewitness context. It might be
judged by some to be low in external validity. Yet there is important research going
on that explores the theoretical basis for false memory, determining, for instance,
the limits of the.phenomenon and exactly how it occurs (e.g., Goodwin, Meissner,
& Ericsson, 2001). That research will eventually produce a body of knowledge that
comprehensively explains the false memory phenomenon.
In summary, the external validity of some research finding increases as it applies
to other people, places, and times. But must researchers design a study that includes
many different groups of people, takes place in several settings, including ''redstic"
ones, and gets repeated every decade? Of course not. External validity is not
determined by an individual research project-it develops over time as research is
replicated in various contexts-and as we have just seen, it is not always a relevant
concern for research that is theory-based. Indeed, for the researcher designing a
study, considerations of external validity pale compared to the importance of our
next topic.
Internal Validity
The final type of experimental validity described by Cook and Campbell (1979) is
called internal validity-the degree to which an experiment is methodologically
sound and confound-free. In an internally valid study, the researcher feels confident
that the results, as measured by the dependent variable, are hrectly associated with
the independent variable and are not the result of some other, uncontrolled factor.
In a study with confounding factors, as we've already seen in the massed/distributed
practice example,the resultswdl be uninterpretable. The outcome couldbe the result
of the independent variable, the confounding variable(s),or some combination of
both, and there is no clear way to decide between the different interpretations. Such
a study would be quite low in internal validity.
J Self Test 5.2
1. Explainhow "anxiety" could be both amanipulatedvariable and a subjectvariable.
2. In the famous "Bobo doll" study, what were the manipulated and the subject
variables?
3. What is the basic difference between internal and external validity?
4. The study on the memory of John Dean was used to illustrate which form of
validity?
Chapter 5. Introduction to Experimental Research
Threats to Internal Validity
Any uncontrolled extraneous factor (i.e., confound) can reduce a study's internal
validity, but there are a number of problems that require special notice (Cook &
Campbell, 1979).These "threats" to internal validity are especially dangerous when
control groups are absent, a problem that sometimes occurs in program evaluation
research (Chapter10).Many ofthese threatsoccurin studies that extend over aperiod
of time during which several measures are taken. For example, participants might
receive apretest, an experimental treatment ofsomelund, and then aposttest. Ideally,
the treatment should produce some positive effect that can be assessed by observing
changes from the pretest to the posttest. A second general type of threat occurs
when comparisons are made between groups that are said to be "nonequivalent."
These so-called subject selection problems can interact with the other threats.
Pre-Post Studies
Do students learn general psychology better if the course is self-paced and computerized?
If a college institutes a program to reduce test anxiety, can it be shown
that it works? If you train people in various mnemonic strategies, wlll it improve
their memories? These are all empirical questions that ask whether people wlll
change as the result of some experience (a course, a program, memory training).
To judge whether change occurred, one typical procedure is to evaluate people
prior to the experience with what is known as a pretest. Then, after the experience,
some posttest measure is taken. The ideal outcome for the examples I've
just described is that, on the posttest, people (a) know general psychology better
than they did at the outset, (b) are less anxious in test taking than they were
before, or (c) show improvement in their memory. The typical research design
compares experimental and control groups, with the latter not experiencing the
treatment:
Experimental:: pretest treatment posttest
Control:: pretest posttest
In the absence of a control group, there are several threats to the interval validity
of research using pretests. Suppose we are trying to evaluate the effectiveness of a
college's program to help studentswho sufferfrom test anxiety (i.e.,they have decent
study slulls and seem to know the material, but they are so anxious during exams
that they don't perform well on them). During orientation, first-year students fill
out several questionnaires, including one that serves as a pretest for test anxiety. Let's
assume that the scorescan range from 20 to 100,with hlgher scoresinhcating greater
anxiety. Incoming students who score high are asked to participate in the college's
test anxiety program, whch includes relaxation training, study slulls training, and
other techniques. Three months later they are assessed again for test anxiety, and the
results look like this:
Thveats to Internal Validity
pretest
90
posttest
70
Thus, the average pretest score of those selected for the program is 90, and the
average posttest score is 70. Assuming that the difference is statistically significant,
what would you conclude? Did the treatment program work? Was the change due
to the treatment, or could other factors have been involved?I hope you can see that
there are several ways of interpreting this outcome. Read on.
History and Maturation
Sometimes an event occurs between pre- and posttestingthat produces large changes
unrelated to the treatment program; when this happens, the study is confounded
by the threat of history. For example, suppose the college in the above example
decided that grades are counterproductive to learning and that all courses
would henceforth be graded on a pass/fail basis. Furthermore, suppose this decision
came after the pretest for test anxiety and in the middle of the treatment
program for reducing anxiety. The posttest might show a huge drop in anxiety,
but this result could very likely be due to the historical event of the college's
change in gralng policy rather than to the program. Wouldn't you be
a little more relaxed about this research methods course if grades weren't an
issue?
In a sirmlar fashion, the program for test anxiety involves first-year students at the
very start of their college careers, so pre-post changes could also be the result of a
general maturation of these students as they become accustomed to collegelife. As
you probably recall, the first semester of college was a time of real change in your
life. Maturation is always a concern whenever a study extends over some period of
time.
Notice that if a control group is used, the experimenter can account for the
effects of both history and maturation. These effects can be ruled out and the test
anxiety program deemed effective if these results occurred:
Experimental:: pretest treatment posttest
90 70
Control:: pretest posttest
90 90
On the other hand, either history or maturation or both would have to be considered
as explanationsfor the changes in the experimental group if the control group
scores also dropped to 70 on the posttest.
Regression
To regress is to go back, in this case in the direction of a mean score. Hence, the
phenomenon I'm about to describe is sometimes called regression to the mean.
In essence it refers to the fact that if score 1 is an extreme score, then score 2 will
be closer to whatever the mean for the larger set of scores is. Ths is because, for
Chapter 5. Introduction to Experimental Research
-t---- Regression
FIGURE5.3 Regression to the mean.
a large set of scores, most will cluster around the mean and only a few will be far
removed from the mean (i.e., extreme scores).Imagine you are selectingsome score
randomly from the normal distribution in Figure 5.3. Most of the scores center
on the mean; so, if you make a random selection, you'll most likely choose a score
near the mean (X on the left-hand side of Figure 5.3). However, suppose you just
happen to select one that is far removed from the mean (i.e.,an extreme score-Y).
If you then choose again, are you most likely to pick
a. the exact same extreme score again?
b. a score even more extreme than the first one?
c. a score less extreme (i.e., closer to the mean) than the first one?
My guess is that you've chosen alternative "c," which means that you understand
the basic concept of regression to the mean. To take a more concrete example (refer
to the right-hand side of Figure 5.3),suppose you know that on the average (based
on several hundred throws), Ted can throw a baseball 300 feet. Then he throws
one 380 feet. If you were betting on his next throw, where would you put your
money?
a. 380 feet
b. 420 feet
c. 330 feet
Again, I imagine you've chosen "c," further convincing yourself that you get the
idea of the regressionphenomenon. But what does thls have to do with our pretestposttest
study?
In a number of pre-post studies, people are selected for some treatment because
they've made an extreme score on the pretest. Thus, in the test anxiety study,
participants were picked because on the pretest they scored very high for anxiety.
On the posttest, their anxiety scores might improve (i.e., they will be lower than on
the pretest), but the improvement could be a regression effect rather than the result
of the memory improvement program. Once again, a control group of equivalent
hgh-anxiety participants would enable the researcher to spot a possible regression
effect. For instance,the following outcomewould suggestthat someregressionmight
Threats to Internal Klidity
be in~olved,~but the program nonetheless had an effect over and above regression.
Can you see why this is so?
Experimental:: pretes treattftent posttest
90 70
Control:: pretest posttest
90 80
Regression effects can cause anumber ofproblems, and were probably the culprit
in some early stuhes that erroneously questioned the effectiveness of the well-known
Head Start program. That particular example will be taken up in Chapter 10 as an
exampleofsome ofthe problems involvedin assessinglarge-scale, federallysupported
programs.
Testing and Instrumentation
Testing is considered to be a threat to internal validity when the mere fact of talung
the pretest has an effecton posttest scores.There could be apractice effect ofrepeated
testing, or some aspects of the pretest could sensitizeparticipants to something about
the program. For example, if the treatment program is a self-paced, computerized
generalpsychology course, the pretest would be some test of knowledge. Participants
might be sensitizedby the pretest to topics about which they seem to know nothing;
they could then pay more attention to those topics during the course and do better
on the posttest as a result.
Instrumentationis a problem when there are changes in the measurement instrument
from pretest to posttest. In the self-paced general psychology course mentioned
earlier, the pretest and posttest wouldn't be the same but would presumably
be equivalent in level of difficulty. However, if the posttest happened to be easier, it
would produce improvement that was more apparent than real. Instrumentation is
sometimes a problem when the measurement tool involves observations.Those doing
the observingmight get better at it with practice, malung the posttest instrument
essentially different (more accurate in this case) from the pretest instrument.
Like the problems of histoly, maturation, and regression, the possible confounds
of testing and instrumentation can be accounted for by including a control group.
The only exception is that in the case of pretest sensitization, the experimental
group might have a slight advantage over the control group on the posttest because
the knowledge gained fiom the pretest might enable the experimental participants
to focus on specific weaknesses during the treatment phase, whereas the control
participants would not have that opportunity.
Participant Problems
Threats to internal vhdity can also arise from concerns over the individuals participating
in the study. In particular, Cook and Campbell (1979) identified two
problems.
3 ~ o t i c ethat the sentence reads, ''might be involved," not "must be involved." This is because it is also
possible that the control group's change from 90 to 80 could be due to one of the other threats. Regression
would be suspected if these other threats could not be ruled out.
Subject Selection Efff ,
Chapter 5. Introduction to Experimental Research
One of the defining features of an experimental study with a manipulated independent
variable is that participants in the hfferent conditions are equivalent to each
other except for the independent variable. In the next chapter you will learn how
these equivalent groups are formed through random assignment and matching. If
groups are not equivalent, then subject selection effects might occur. For example,
suppose two sections of a general psychology course are being offered and a
researcher wants to compare a traditional lecture course with the one combining
lecture and discussion groups. School policy (a) prevents the researcher from randomly
assigning students to the two courses, and (b)requires full lsclosure of the
nature of the courses. Thus, students can sign up for either section. You can see the
difficultyhere. If students in the lecture plus lscussion course outperform students
in the straight lecture course, what caused the difference?Was it the nature of the
course (the discussion element) or was it something about the students who chose
that course? Maybe they were more articulate (hence, interested in discussion)than
those in the straightlecture course. In short, there is a confound due to the selection
of subjects for the two groups being compared.
Selection effects can also interact with other threats to internal validity. For example,
in a study with two groups, some historical event might affect one group
but not the other. This would be referred to as a history x selection confound (read
as "history by selection"). Similarly, two groups might mature at different rates,
respond to testing at different rates, be influenced by instrumentation in different
ways, or show different degrees of regression.
One of psychology's most famous studies is (unfortunately) a good example of a
subject selection effect. Known as the "ulcers in executive monkeys" study, it was
a pioneering investigation by Joseph Brady in the area of health psychology. Brady
investigated the relationship between stress and its physical consequences by placing
pairs of rhesus monkeys in adjoining restraint chairs. One monkey, the "executive"
(note the allusion to the stereotype of the hard-driving, stressed-out, responsiblefor-everythng
business executive), could avoid mild shocks to its feet that were
programmed to occur every 20 seconds by pressing a lever at any time during the
interval. For the control monkey (stereotype of the worker with no control over
anything),the lever didn't work and it was shocked every time the executive monkey
let the 20 secondsgo by and was shocked. Thus, both monkeyswere shocked equally
often, but only one monkey had the ability to control the shocks. The outcome was
a stomach ulcer for the executive monkey, but none for the control monkey. Brady
then replicated the experiment with a second pair of monkeys and found the same
result. He eventually reported data on four pairs of animals (Brady, Porter, Conrad,
& Mason, 1958),concluding that the psychologicalstress ofbeing in command, not
just of one's own fate but also of that of a subordinate, could lead to health problems
(ulcersin this case).
The Brady study was widely reported in introductory psychology texts, and its
publication in Scientijc Anzerican (Brady, 1958) gave it an even broader audience.
However, a close examination of Brady7sprocedure showed that a subject selection
confound occurred. Specifically,Brady did not place the monkeys randomly in the
two groups. Rather, all eight of them started out as executives in the sense that
Threats to I~zterizalValidity 189
they were pretested on how quickly they would learn the avoidance conditioning
procedure. Those responlng most quickly were placed in the executive condtion
for the experiment proper. Although Brady didn't know it at the time, animals differ
in their characteristic levels of emotionality and the more emotional ones respond
most quickly to shock. Thus, he unwittingly placed highly emotional (andtherefore
ulcer-prone) animals in the executive condition and more laid-back animals in the
control condition.
The first to point out the confound was Weiss (1968), whose better-controlled
studies with rats produced results the oppositeofBradyYs.Weiss found that those with
control over the shock, in fact, developedfewer ulcers than those with no control
over the shocks.
Attrition
Participants do not always conlplete the experiment they begin. Some studies may
last for a relatively long period of time, and people move away, lose interest, and
even die. In some stuhes, participants may become uncomfortable and exercise
their right to be released from further testing. Hence, for any number of reasons,
there may be 100 participants at the start of the study and only 60 at the end. This
problem sometimes is called subject mortahty, or attrition. Attrition is a problem
because, if particular types of people are more likely to drop out than others, then
the group finishing the study is on average made up of lfferent types of people
than is the group that started the study. In essence, t h s is similar to the selection
problem because the result is that the group beginning the study is not equivalentto
the group completing the study. Note that one way to test for differences between
those continuing a study and those leaving is to look at the pretest scores or other
attributes at the outset of the study for both groups. If "attriters" and "continuers"
are inhstinguishable at the start of the study, then overall conclusions at the end of
the study are strengthened, even with the loss through attrition.
J Self Test 5.3
1. Determined to get into graduate school,Jan takes the GRE nine times. In her
first seven attempts, she always scored between 1050 and 1100,averaging 1075.
On her eighth try, she gets a 1250.What do you expect her score to be like on
her ninth try?Why?
2. How can attritionprod~~cean effect that is similar to a subject selection effect?
T h s concludes our introduction to the experimental method. The next three
chapters will elaborate-Chapter 6 begins by distinguishng between-subjects
designs from withn-subjects (or repeated measures) designs and describes a number
of control problems in experimental research. In particular, it looks at the problems
of creating equivalent groups in between-subjects designs, controlling for sequence
effects in withn-subjects designs, and the biasing effects that result from the fact
Clzapter 5. Introduction to Experimental Research
that both experimenters and participants are humans. Chapters 7 and 8 look at a
variety of research designs, ranging from those with a single independent variable
(Chapter 7) to those with multiple independent variables, whch are known as
factorial designs (Chapter 8).
Essential Features of Experimental Research
An experiment in psychology involves establishng independent variables, controlling
extraneous variables, and measuring dependent variables. Independent variables
refer to the creation of experimental conditions or comparisons that are under the
direct control of the researcher. Manipulated independent variables can involveplating
participants in different situations, assigning them different tasks, or giving them
different instructions. Extraneous variables are factors that are not of interest to the
researcher; failure to control them leads to a problem called confounding. When a
confound exists, the results could be due to the independent variable or they could
be due to the confounding variable. Dependent variables are the behaviors that are
measured in the study; they must be defined precisely (operationally).
Manipulated versus Subject Variables
Some research in psychology compares groups of participants who differ &omeach
other in some way before the experiment begins (e.g., gender, age, introversion).
When this occurs, the independent variable of interest in the study is said to be
selected by the experimenter rather than manipulated directly, and it is called a
subject variable. Research in psychology hequently includes both manipulated and
subjectvariables.In awell-controlled study, conclusionsabout cause and effect canbe
drawn when manipulated variables are used, but not when subjectvariables are used.
The Validity of Experimental Research
There are four ways in whch psychological research can be considered valid. Valid
research uses statistical analysis properly (statisticalconclusion validity), defines independent
and dependent variables meaningfully (construct validity), and is free
of confounding variables (internal validity). External validity refers to whether the
study's results generalize beyond the particular experiment just completed.
Threats to Internal Validity
The internal validity of an experiment can be threatened by a number of factors.
History, maturation, regression, testing, and instrumentation are confounding factors
especially likely to occur in poorly controlled studies that include comparisons
rn Applicatiotzs Exercises
between pretests and posttests. Selection problems can occur when comparisons are
made between groups of individuals that are nonequivalent before the study begins
(e.g.,Brady's ulcers in executive monkeys study). Selection problems also can interact
with the other threats to internal validity. In experiments extendng over time,
attrition can result in a type of selection problem-the small group remaining at the
conclusion of the study could be systematicallyhfferent from the larger group that
started the study.
Chapter Review Qu
1. With anxiety as an example, illustrate the hfference between independent
variables that are (a) manipulated variables and (b) subject variables.
2. Distinguish between Mill's methods of Agreement and Difference, and apply
them to a study with an experimental and a control group.
3. Use examples to show the dfferences between situational, task, and instructional
independent variables.
4. What is a confound and why does the presence of one make it difficult to
interpret the results of a study?
5. When a study uses subject variables, it is said that causal conclusions cannot be
drawn. Why?
6. Describe the circumstancesthat could reduce the statistical conclusion vahdity
of an experiment.
7. Describe the three types of circumstances in which external validity can be
reduced.
8. Explain how the presence of a control group can help reduce the various
threats to internal validity. Use histoiy, maturation, or regression as a specific
example.
9. Use the Brady study of "ulcers in executive monkeys" to illustrate selection
effects.
10. What is attrition and why can it produce interpretation problems similar to
subject selection problems?
Applicatzons Exerczses '
Exercise 5.1-Identifying Variables
For each of the following, identify the independent variable(~),the levels of the
independent variable(s),and the dependent variabJe(s).For independent variables,
Chapter 5. Introduction to Experimental Research
identi@whether they are manipulated variables or nonmanipulated subjectvariables.
For dependent variables, indcate the scale of measurement being used.
1. In a cognitive mapping study, first-year students are compared with seniors in
their abihty to point accurately to campus buildings. Some of the buildings are
in the center of the campus along well-traveled routes; other buildngs are on
the periphery of the campus. Participants are asked to indicate (on a scale of
1 to 10) how confident they are about their pointing; the amount of error (in
degrees) in their pointing is also recorded.
2. In a study of the effectiveness of anew drug in treating depression,some patients
receive the drug while others only think they are receiving it. A third group
is not treated. After the program is completed, participants complete the Beck
Depression Inventory and are rated on depression (10-point scale) by trained
observers.
3. In a Pavlovian conditioning study, hungry animals are conditioned to salivate
to the sound of a tone by pairing the tone with food. For some animals, the
tone is turned on and then off before the food is presented. For others, the tone
remains on until the food is presented. For still others, the food precedes the
tone. Experimenters record when salivation first begins and how much saliva
accumulatesfor a fixed time interval.
4. In a study of developmental psycholinguistics, 2-, 3-, and 4-year-old children
are shown dolls and asked to act out several scenes to determine if they can use
certain grammatical rules. Sometimes each child is asked to act out a scene in
the active voice (Ernie hit Bert); at other times, each chdd acts out a scene in
the passive voice (Ernie was hit by Bert). Children are judged by whether or
not they act out the scene accurately (two possible scores) and by how quickly
they begin acting out the scene.
5. In a study of maze learning, some rats are fed after reaching the end of the maze
during the course of 30 trials; others aren't fed at all; still others are not fed for
the first 15 trials but are fed for each of the 15 trials thereafter; a final group is
fed for the first 15 trials and not fed for the last 15. The researcher makes note
of any errors (wrong turns) made and how long it takes the animal to reach the
goal.
6. In a helping behavior study, passersby in a mall are approached by a student
who is either well dressed or shabbily dressed. The student asks for directions
to either the public restroom or the Kmart. Nearby, an experimenter records
whether or not people provide any help.
Exercise 5.2-Spot the Confound(s)
For each of the following, identify the independent and dependent variables, the
levels of each independent variable, and find at least one extraneous variable that has
not been adequately controlled (i.e., that is creating a confound). Use the format
dustrated in Table 5.2.
F.
Applications Exercises
1. A testing company is trying to determine if a new type of driver (club 1)wlll
drive a golf ball greater distances than three competing brands (clubs 2-4).
Twenty male golfpros are recruited. Each golfer hits 50 balls with club 1,then
50 more with 2, then 50 with 3, then 50 with 4. To add reahsm, the experiment
takes place over the first four holes of an actual golf course-the first set of
50 balls is hit from the first tee, the second 50 from the second tee, and so on.
The first four holes are all 380-400 yards in length, and each is a par 4 hole.
2. Aresearcherisinterestedin the abilityofschizophrenicpatientstojudge different
time durations. It is hypothesized that loud noise will adversely affect their
judgments. Participants are tested two ways. In the "quiet" condition, some
participants are tested in a small soundproofroom that is used for hearing tests.
Those in the "noisy" condition are tested in a nurse's office where a stereo is
playing music at a constant (andloud) volume. Because of schedulingproblems,
locked-ward (i.e.,slightlymore dangerous)patients are available for testing only
on Monday and open-ward (i.e.,slightlyless dangerous)patients are availablefor
testing onlyon Thursday.Furthermore,hearingtests arescheduledforThursdays,
so the soundproofroom is available only on Monday.
3. An experimenteris interestedin whether memory can be improvedifpeople use
visual imagery. Participants (allfemales)are placed in one of two groups-some
are trained in imagery techniques, and others are trained to use rote repetition.
The imagery group is given a list of 20 concrete nouns (for which it is easier
to form images than abstract nouns) to study, and the other group is given 20
abstract words (ones that are especially easy to pronounce, so repetition will
be easy), matched with the concrete words for frequency of general usage. To
match the method of presentation with the method of study, participants in the
imagery group are shown the words visually (on a computer screen).To control
for any "compu-phobia," rote participants also sit at the computer terminal,
but the computer is programmed to read the lists to them. After hearing their
respective word lists, participants have 60 seconds to recall as many words as
they can in any order that occurs to them.
4. A socialpsychologistis interested in helpingbehavior and happens to know two
male graduatestudentswho would be happy to assist. The first (Ned)is generally
well dressed, but the second (Ted) doesn't care much about appearances. An
experiment is designed in which passersby in a mall will be approached by a
studentwho is eitherwell-dressed Ned or shabbily dressedTed.Allof the testing
sessions occur between 8 and 9 o'clock in the evening, with Ned working on
Monday and Ted worlung on Friday. The student will approach a shopper and
askfor a dollarfor a cup of coffee.Nearby, the experimenterwill record whether
or not people give money.
Exercise 5.3-Operational Definitions (Again)
In Chapter3, you first learned about operationaldefinitionsand completed an exercise
on the operational definitions of some familiar constructs used in psychological
I research. In this exercise, you are to play the role of an experimenter designing a
i study. For each of the four hypotheses:
Chapter 5. Introduction to Experimental Research
a. identify the independent variable(s), decide how many levels of the independent
variable(s) you would like to use, and identify the levels;
b. identify the dependent variable in each study; and
c. create operational definitions for your independent and dependent variables.
1. People will be more likely to offer help to someone in need if the situation
unambiguously calls for help.
2. Ability to concentrate on a task deteriorates when people feel crowded.
3. Good bowlers improve their performance in the presence of an audience,
whereas average bowlers do worse.
4. Animals learn a difficult maze best when they are moderately aroused. They do
poorly in difficult mazes when their arousal is low or high. When the maze is
easy, performance improves steadily from low to moderate to high arousal.
Answers to the Self Tests:
J 5.1.
I. IVs = problem difficulty and reward size
DV = number of anagrams solved
2. Extraneous variables are all of the factors that need to be controlled or kept
constant from one group to another in an experiment; failure to control
these variables results in a confound.
3. Frustration couldbe manipulated as an IV by havingtwo groups,one allowed
to complete a maze, and the other prevented from doing so. It could also be
an extraneous variable being controlled in a study in which frustration was
avoided completely. It could also be what is measured in a study that looked
to see if self-reported frustration levels differed for those given impossible
problems to solve,whereas others are given solvable problems.
J 5.2.
1. As a manipulated variable, some people in a study could be made anxious
("you will be shocked if you make errors"), and others not; as a subject
variable, people who are generally anxiouswould be in one group, and low
anxious people would be in a second group.
2. Manipulated + the viewing experience shown to children.
Subject + gender.
3. Internal+the studyis freefrommethodologicalflaws,especially confounds.
External + results generalize beyond the confines of the study.
4. Ecological.
J 5.3.
1. Somewherearound 1275; regression to the mean
2. If those who drop out are systematicallydifferent from those who stay, then
the group ofsubjects who startedthe studywill be quite differentfrom those
who finished.