4	The	IBM	SPSS	Statistics	environment
4.1 What will this chapter tell me?
4.2 Versions of IBM SPSS Statistics
4.3 Windows, Mac OS, and Linux
4.4 Getting started
4.5 The data editor
4.6 Entering data into IBM SPSS Statistics
4.7 Importing data
4.8 The SPSS viewer
4.9 Exporting SPSS output
4.10 The syntax editor
4.11 Saving files
4.12 Opening files
4.13 Extending IBM SPSS Statistics
4.14 Brian’s attempt to woo Jane
4.15 What next?
4.16 Key terms that I’ve discovered
Smart Alex’s tasks
4.1	What	will	this	chapter	tell	me?
At	about	5	years	old	I	moved	from	nursery	to	primary	school.	Even	though	my
older	brother	(you	know,	Paul,	‘the	clever	one’)	was	already	there,	I	was	really
apprehensive	on	my	first	day.	My	nursery	school	friends	were	all	going	to
different	schools	and	I	was	terrified	about	meeting	new	children.	I	arrived	in	my
classroom,	and	as	I’d	feared,	it	was	full	of	scary	children.	In	a	fairly	transparent
ploy	to	make	me	think	that	I’d	be	spending	the	next	6	years	building	sand
castles,	the	teacher	told	me	to	play	in	the	sandpit.	While	I	was	nervously	trying
to	discover	whether	I	could	build	a	pile	of	sand	high	enough	to	bury	my	head	in
it,	a	boy	came	to	join	me.	His	name	was	Jonathan	Land,	and	he	was	really	nice.
Within	an	hour,	he	was	my	new	best	friend	(5-year-olds	are	fickle	…)	and	I
loved	school.	We	remained	close	friends	all	through	primary	school.	Sometimes,
new	environments	seem	scarier	than	they	really	are.	This	chapter	introduces	you
to	what	might	seem	like	a	scary	new	environment:	IBM	SPSS	Statistics.	I	won’t
lie,	the	SPSS	environment	is	a	more	unpleasant	environment	in	which	to	spend
time	than	a	sandpit,	but	try	getting	a	plastic	digger	to	do	a	least	squares
regression	for	you.	For	the	purpose	of	this	chapter,	I	intend	to	be	a	5-year-old
called	Jonathan.	Thinking	like	a	5-year-old	comes	quite	naturally	to	me,	so	it
should	be	fine.	I	will	hold	your	hand,	and	show	you	how	to	use	the	diggers,
excavators,	grabbers,	cranes,	front	loaders,	telescopic	handlers,	and	tractors1	in
the	sandpit	of	IBM	SPSS	Statistics.	In	short,	we’re	going	to	learn	the	tools	of
IBM	SPSS	Statistics,	which	will	enable	us,	over	subsequent	chapters,	to	build	a
magical	sand	palace	of	statistics.	Or	thrust	our	faces	into	our	computer	monitor.
Time	will	tell.
1	Yes,	I	have	been	spending	a	lot	of	time	with	a	vehicle-obsessed	2-year-old	boy
recently.
Figure	4.1	All	I	want	for	Christmas	is	…	some	tasteful	wallpaper
4.2	Versions	of	IBM	SPSS	Statistics	
This	book	is	based	primarily	on	version	25	of	IBM	SPSS	Statistics	(I	generally
call	it	SPSS	for	short).	IBM	regularly	improves	and	updates	SPSS,	but	this	book
covers	only	a	small	proportion	of	the	functionality	of	SPSS,	and	focuses	on	tools
that	have	been	in	the	software	a	long	time	and	work	well.	Consequently,
improvements	made	in	new	versions	of	SPSS	Statistics	are	unlikely	to	impact
the	contents	of	this	book.	With	a	bit	of	common	sense,	you	can	get	by	with	a
book	that	doesn’t	explicitly	cover	the	latest	version	(or	the	version	you’re	using).
So,	although	this	edition	was	written	using	version	25,	it	will	happily	cater	for
earlier	versions	(certainly	back	to	version	18),	and	most	likely	for	versions	26
onwards	(unless	IBM	does	a	major	overhaul	just	to	keep	me	on	my	toes).
IBM	SPSS	Statistics	comes	in	four	flavours:2
2	You	can	look	at	a	detailed	comparison	here:
https://www.ibm.com/marketplace/spss-statistics/purchase
Base:	Most	of	the	functionality	covered	in	this	book	is	in	the	base	package.
The	exceptions	are	exact	tests	and	bootstrapping,	which	are	available	only
in	the	premium	edition.
Standard:	This	has	everything	in	the	base	package	but	also	covers
generalized	linear	models	(which	we	don’t	get	into	in	this	book).
Professional:	This	has	everything	in	the	standard	edition,	but	with	missing
value	imputation	and	decision	trees	and	forecasting	(again,	not	covered	in
this	text).
Premium:	This	has	everything	in	the	professional	package	but	also	exact
tests	and	bootstrapping	(which	we	cover	in	this	book),	and	structural
equation	modelling	and	complex	sampling	(which	we	don’t	cover).
There	is	also	a	subscription	model	where	you	can	buy	monthly	access	to	a	base
package	(as	described	above	but	also	including	bootstrapping)	and,	for	an	extra
fee,	add-ons	for:
Custom	tables	and	advanced	statistics	users:	is	similar	to	the	standard
package	above	in	that	it	adds	generalized	linear	models.	It	also	includes
logistic	regression,	survival	analysis,	Bayesian	analysis	and	more
customization	of	tables.
Complex	sampling	and	testing	users:	adds	functionality	for	missing	data
and	complex	sampling	as	well	as	categorical	principal	components	analysis,
multidimensional	scaling,	and	correspondence	analysis.
Forecasting	and	decision	trees	users:	as	the	name	suggests,	this	adds
functionality	for	forecasting	and	decision	trees	as	well	as	neural	network
predictive	models.
If	you	are	subscribing,	then	most	of	the	contents	of	this	book	appear	in	the	base
subscription	package,	with	a	few	things	(e.g.,	Bayesian	statistics	and	logistic
regression) requiring the advanced statistics add-on.
4.3	Windows,	Mac	OS	and	Linux	
SPSS	Statistics	works	on	Windows,	Mac	OS,	and	Linux	(and	Unix-based
operating	systems	such	as	IBM	AIX,	HP-UX,	and	Solaris).	SPSS	Statistics	is
built	on	a	program	called	Java,	which	means	that	the	Windows,	Mac	OS	and
Linux	versions	differ	very	little	(if	at	all).	They	look	a	bit	different,	but	only	in
the	way	that,	say,	Mac	OS	looks	different	from	Windows	anyway.3	I	have	taken
the	screenshots	from	Windows	because	that’s	the	operating	system	that	most
readers	will	use,	but	you	can	use	this	book	if	you	have	a	Mac	(or	Linux).	In	fact,
I	wrote	this	book	using	a	Mac.
3	You	can	get	the	Mac	OS	version	to	display	itself	like	the	Windows	version,	but
I	have	no	idea	why	you’d	want	to	do	that.
Figure	4.2	The	start-up	window	of	IBM	SPSS
4.4	Getting	started	
SPSS	mainly	uses	two	windows:	the	data	editor	(this	is	where	you	input	your
data	and	carry	out	statistical	functions)	and	the	viewer	(this	is	where	the	results
of	any	analysis	appear).	You	can	also	activate	the	syntax	editor	window	(see
Section	4.10),	which	is	for	entering	text	commands	(rather	than	using	dialog
boxes).	Most	beginners	ignore	the	syntax	window	and	click	merrily	away	with
their	mouse,	but	using	syntax	does	open	up	additional	functions	and	can	save
time	in	the	long	run.	Strange	people	who	enjoy	statistics	can	find	numerous	uses
for	syntax	and	dribble	excitedly	when	discussing	it.	At	times	I’ll	force	you	to	use
syntax,	but	only	because	I	wish	to	drown	in	my	own	saliva.
When	SPSS	loads,	the	start-up	window	in	Figure	4.2	appears.	At	the	top	left	is	a
box	labelled	New	Files,	where	you	can	select	to	open	an	empty	data	editor
window,	or	begin	a	database	query	(something	not	covered	in	this	book).
Underneath,	in	the	box	labelled	Recent	Files,	there	will	appear	a	list	of	any	SPSS
data	files	(on	the	current	computer)	on	which	you’ve	recently	worked.	If	you
want	to	open	an	existing	file,	select	it	from	the	list	and	then	click	 	.
If	you	want	to	open	a	file	that	isn’t	in	the	list,	select	 	and	click	
	to	open	a	window	for	browsing	to	the	file	you	want	(see	Section
4.12).	The	dialog	box	also	has	an	overview	of	what’s	new	in	this	release	and
contains	links	to	tutorials	and	support,	and	a	link	to	the	online	developer
community.	If	you	don’t	want	this	dialog	to	appear	when	SPSS	starts	up,	then
select	.	
Figure	4.3	The	SPSS	Data	Editor
4.5	The	data	editor	
Unsurprisingly,	the	data	editor	window	is	where	you	enter	and	view	data	(Figure
4.3).	At	the	top	of	this	window	(or	the	top	of	the	screen	on	a	Mac)	is	a	menu	bar
like	ones	you’ve	probably	seen	in	other	programs.	As	I	am	sure	you’re	aware,
you	can	navigate	menus	by	using	your	mouse/trackpad	to	move	the	on-screen
arrow	to	the	menu	you	want	and	pressing	(clicking)	the	left	mouse	button	once.
The click will reveal a list of menu items, which again you can click
using the mouse. In SPSS, if a menu item is followed by an arrow (▶), then clicking on
it will reveal another list of options (a submenu) to the right of that menu item; if
it doesn’t, then clicking on it will activate a window known as a dialog box. Any
window in which you have to provide information or a response (i.e., ‘have a
dialog’ with the computer) is a dialog box. When referring to selecting items in a
menu, I will use the menu item names connected by arrows to indicate moving
down items or through submenus. For example, if I were to say that you should
select the Save As … option in the File menu, you will see File → Save As …
The	data	editor	has	a	data	view	and	a	variable	view.	The	data	view	is	for
entering	data,	and	the	variable	view	is	for	defining	characteristics	of	the
variables	within	the	data	editor.	To	switch	between	the	views,	select	one	of	the
tabs at the bottom of the data editor (Data View and Variable View); the highlighted tab
indicates	which	view	you’re	in	(although	it’s	obvious).	Let’s	look	at	some
features	of	the	data	editor	that	are	consistent	in	both	views.	First,	the	menus.
Some	letters	are	underlined	within	menu	items	in	Windows,	which	tells	you	the
keyboard	shortcut	for	accessing	that	item.	With	practice	these	shortcuts	are	faster
than	using	the	mouse.	In	Windows,	menu	items	can	be	activated	by
simultaneously	pressing	Alt	on	the	keyboard	and	the	underlined	letter.	So,	to
access the File → Save As … menu item you would simultaneously press Alt
and F on the keyboard to activate the File menu, then, keeping your finger on the
Alt key, press A. In Mac OS, keyboard shortcuts are listed in the menus; for
example, you can save a file by simultaneously pressing ⌘ and S (I denote
these shortcuts as ⌘ + S). Below is a brief reference guide to each of the
menus:
File	This	menu	contains	all	the	options	that	you	expect	to	find	in	File
menus:	you	can	save	data,	graphs	or	output,	open	previously	saved	files	and
print	graphs,	data	or	output.
Edit	This	menu	contains	edit	functions	for	the	data	editor.	For	example,	it	is
possible	to	cut	and	paste	blocks	of	numbers	from	one	part	of	the	data	editor
to	another	(which	is	handy	when	you	realize	that	you’ve	entered	lots	of
numbers in the wrong place). You can insert a new variable into the data
editor (i.e., add a column) using Insert Variable, and add a new row of data
between two existing rows using Insert Cases. Other useful options for
large data sets are the ability to skip to a particular row (Go to Case…) or
column (Go to Variable…) in the data editor. Finally, although for most people
the default preferences are fine, you can change them by selecting Options… from this menu.
View	This	menu	deals	with	system	specifications	such	as	whether	you	have
grid	lines	on	the	data	editor,	or	whether	you	display	value	labels	(exactly
what	value	labels	are	will	become	clear	later).
Data	This	menu	is	all	about	manipulating	the	data	in	the	data	editor.	Some
of the functions we’ll use are the ability to split the file (Split File…) by a
grouping variable (see Section 6.10.4), to run analyses on only a selected
sample of cases (Select Cases…), to weight cases by a variable (Weight
Cases…), which is useful for frequency data (Chapter 19), and to
convert the data from wide format to long or vice versa (Restructure…),
which we’ll use in Chapter 12.
Transform	This	menu	contains	items	relating	to	manipulating	variables	in
the	data	editor.	For	example,	if	you	have	a	variable	that	uses	numbers	to
code	groups	of	cases	then	you	might	want	to	switch	these	codes	around	by
changing the variable itself (Recode into Same Variables…) or creating a new variable (Recode into
Different Variables…); see SPSS Tip 11.2. You can also create new variables from
existing ones (e.g., you might want a variable that is the sum of 10 existing
variables) using the compute function (Compute Variable…); see Section 6.12.6.
Analyze	The	fun	begins	here,	because	the	statistical	procedures	lurk	in	this
menu.	Below	is	a	rundown	of	the	bits	of	the	statistics	menu	that	we’ll	use	in
this	book:
Descriptive	Statistics	We’ll	use	this	for	conducting	descriptive
statistics	(mean,	mode,	median,	etc.),	frequencies	and	general	data
exploration.	We’ll	use	Crosstabs…	for	exploring	frequency	data	and
performing	tests	such	as	chi-square,	Fisher’s	exact	test	and	Cohen’s
kappa	(Chapter	19).
Compare	Means	We’ll	use	this	menu	for	t-tests	(related	and	unrelated
–	Chapter	10)	and	one-way	independent	ANOVA	(Chapter	12).
General	Linear	Model	This	menu	is	for	linear	models	involving
categorical	predictors,	typically	experimental	designs	in	which	you
have	manipulated	a	predictor	variable	using	different	cases
(independent design), the same cases (repeated measures design) or a
combination	of	these	(mixed	designs).	It	also	caters	for	multiple
outcome	variables,	such	as	in	multivariate	analysis	of	variance
(MANOVA)	–	see	Chapters	13–17.
Mixed	Models	We’ll	use	this	menu	in	Chapter	21	to	fit	a	multilevel
linear	model	and	growth	curve.
Correlate	It	doesn’t	take	a	genius	to	work	out	that	this	is	where
measures	of	correlation	hang	out,	including	bivariate	correlations	such
as	Pearson’s	r,	Spearman’s	rho	(ρ)	and	Kendall’s	tau	(τ)	and	partial
correlations	(see	Chapter	8).
Regression	There	are	a	variety	of	regression	techniques	available	in
SPSS,	including	simple	linear	regression,	multiple	linear	regression
(Chapter	9)	and	logistic	regression	(Chapter	20).
Loglinear	Loglinear	analysis	is	hiding	in	this	menu,	waiting	for	you,
and	ready	to	pounce	like	a	tarantula	from	its	burrow	(Chapter	19).
Dimension Reduction You’ll find factor analysis here (Chapter 18).
Scale	We’ll	use	this	menu	for	reliability	analysis	in	Chapter	18.
Nonparametric	Tests	Although,	in	general,	I’m	not	a	fan	of	these	tests,
in	Chapter	7	I	prostitute	my	principles	to	cover	the	Mann–Whitney
test,	the	Kruskal–Wallis	test,	Wilcoxon’s	test	and	Friedman’s	ANOVA.
Graphs	This	menu	is	used	to	access	the	Chart	Builder	(discussed	in	Chapter
5),	which	is	your	gateway	to,	among	others,	bar	charts,	histograms,
scatterplots,	box–whisker	plots,	pie	charts	and	error	bar	graphs.
Utilities	There’s	plenty	of	useful	stuff	here,	but	we	don’t	get	into	it.	I	will
mention that Data File Comments… is useful for writing notes about the data file to
remind	yourself	of	important	details	that	you	might	forget	(where	the	data
come	from,	the	date	they	were	collected	and	so	on).
Extensions	(formerly	Add-ons)	Use	this	menu	to	access	other	IBM	software
that	augments	SPSS	Statistics.	For	example,	IBM	SPSS	Sample	Power
computes	the	sample	size	required	for	studies	and	power	statistics	(see
Section	2.9.7),	and	if	you	have	the	premium	version	you’ll	find	IBM	SPSS
AMOS	listed	here,	which	is	software	for	structural	equation	modelling.
Because	most	people	won’t	have	these	add-ons	(including	me)	I’m	not
going	to	discuss	them	in	the	book.	We’ll	also	use	the	Utilities	submenu	to
install	custom	dialog	boxes	( )	later	in	this	chapter.4
Window	This	menu	allows	you	to	switch	from	window	to	window.	So,	if
you’re	looking	at	the	output	and	you	wish	to	switch	back	to	your	data	sheet,
you	can	do	so	using	this	menu.	There	are	icons	to	shortcut	most	of	the
options	in	this	menu,	so	it	isn’t	particularly	useful.
Help	Use	this	menu	to	access	extensive	searchable	help	files.
4 In version 23 of IBM SPSS Statistics, this function can be found in Utilities →
Custom Dialogs ….
SPSS	Tip	4.1	Save	time	and	avoid	RSI	
By	default,	when	you	go	to	open	a	file,	SPSS	looks	in	the	directory	in
which	it	is	stored,	which	is	usually	not	where	you	store	your	data	and
output.	So,	you	waste	time	navigating	your	computer	trying	to	find
your	data.	If	you	use	SPSS	as	much	as	I	do	then	this	has	two
consequences:	(1)	all	those	seconds	have	added	up	to	weeks
navigating	my	computer	when	I	could	have	been	doing	something
useful	like	playing	my	drum	kit;	(2)	I	have	increased	my	chances	of
getting	RSI	in	my	wrists,	and	if	I’m	going	to	get	RSI	in	my	wrists	I
can	think	of	more	enjoyable	ways	to	achieve	it	than	navigating	my
computer	(drumming	again,	obviously).	Luckily,	we	can	avoid	wrist
death by using Edit → Options… to open the Options dialog box
(Figure	4.4)	and	selecting	the	‘File	Locations’	tab.
In	this	dialog	box	we	can	select	the	folder	in	which	SPSS	will
initially	look	for	data	files	and	other	files.	For	example,	I	keep	my
data	files	in	a	single	folder	called,	rather	unimaginatively,	‘Data’.	In
the dialog box in Figure 4.4 I have clicked on Browse… and then
navigated	to	my	data	folder.	SPSS	will	now	use	this	as	the	default
location	when	I	open	files,	and	my	wrists	are	spared	the	indignity	of
RSI.	You	can	also	select	the	option	for	SPSS	to	use	the	Last	folder
used,	in	which	case	SPSS	remembers	where	you	were	last	time	it	was
loaded	and	uses	that	folder	as	the	default	location	when	you	open	or
save	files.
Figure	4.4	The	Options	dialog	box
At	the	top	of	the	data	editor	window	are	a	set	of	icons	(see	Figure	4.3)	that	are
shortcuts	to	frequently	used	facilities	in	the	menus.	Using	the	icons	saves	you
time.	Below	is	a	brief	list	of	these	icons	and	their	functions.
 Use	this	icon	to	open	a	previously	saved	file	(if	you	are	in	the
data	editor,	SPSS	assumes	you	want	to	open	a	data	file;	if	you	are	in	the	output
viewer,	it	will	offer	to	open	a	viewer	file).
 Use	this	icon	to	save	files.	It	will	save	the	file	you	are	currently
working	on	(be	it	data,	output	or	syntax).	If	the	file	hasn’t	already	been	saved	it
will	produce	the	Save	Data	As	dialog	box.
 Use	this	icon	for	printing	whatever	you	are	currently	working
on	(either	the	data	editor	or	the	output).	The	exact	print	options	will	depend	on
your	printer.	By	default,	SPSS	prints	everything	in	the	output	window,	so	a
useful	way	to	save	trees	is	to	print	only	a	selection	of	the	output	(see	SPSS	Tip
4.5).
 Clicking	on	this	icon	activates	a	list	of	the	last	12	dialog	boxes
that	were	used;	select	any	box	from	the	list	to	reactivate	the	dialog	box.	This
icon	is	a	useful	shortcut	if	you	need	to	repeat	parts	of	an	analysis.
 The	big	arrow	on	this	icon	implies	to	me	that	clicking	it
activates	a	miniaturizing	ray	that	shrinks	you	before	sucking	you	into	a	cell	in
the	data	editor,	where	you	will	spend	the	rest	of	your	days	cage-fighting	decimal
points.	It	turns	out	my	intuition	is	wrong,	though,	and	this	icon	opens	the	‘Case’
tab	of	the	Go	To	dialog	box,	which	enables	you	to	go	to	a	specific	case	(row)	in
the	data	editor.	This	shortcut	is	useful	for	large	data	files.	For	example,	if	we
were	analysing	a	survey	with	3000	respondents,	and	wanted	to	look	at
participant	2407’s	responses,	rather	than	tediously	scrolling	down	the	data	editor
to	find	row	2407	we	could	click	this	icon,	enter	2407	in	the	response	box	and
click Go (Figure 4.5, left).
 As	well	as	data	files	with	huge	numbers	of	cases,	you
sometimes	have	ones	with	huge	numbers	of	variables.	Like	the	previous	icon,
clicking	this	one	opens	the	Go	To	dialog	box	but	in	the	‘Variable’	tab,	which
enables	you	to	go	to	a	specific	variable	(column)	in	the	data	editor.	For	example,
the	data	file	we	use	in	Chapter	18	(SAQ.sav)	contains	23	variables	and	each
variable	represents	a	question	on	a	questionnaire	and	is	named	accordingly.	If	we
wanted	to	go	to	Question	15,	rather	than	getting	wrist	cramp	by	scrolling	across
the	data	editor	to	find	the	column	containing	the	data	for	Question	15,	we	could
click this icon, scroll down the variable list to Question 15 and click
Go (Figure 4.5, right).
Figure	4.5	The	Go	To	dialog	boxes	for	a	case	(left)	and	a	variable	(right)
 Clicking	on	this	icon	opens	a	dialog	box	that	shows	you	the
variables	in	the	data	editor	on	the	left	and	summary	information	about	the
selected	variable	on	the	right.	Figure	4.6	shows	the	dialog	box	for	the	same	data
file	that	we	discussed	for	the	previous	icon.	I	have	selected	the	first	variable	in
the	list	on	the	left,	and	on	the	right	we	see	the	variable	name	(Question_01),	the
label	(Statistics	makes	me	cry),	the	measurement	level	(ordinal),	and	the	value
labels	(e.g.,	the	number	1	represents	the	response	of	‘strongly	agree’).
Figure 4.6 Dialog box for the Variables icon
 If	you	select	a	variable	(column)	in	the	data	editor	by	clicking
on	the	name	of	the	variable	(at	the	top	of	the	column)	so	that	the	column	is
highlighted,	then	clicking	this	icon	will	produce	a	table	of	descriptive	statistics
for	that	variable	in	the	viewer	window.	To	get	descriptive	statistics	for	multiple
variables	hold	down	Ctrl	as	you	click	at	the	top	of	the	columns	you	want	to
summarize	to	highlight	them,	then	click	the	icon.
 I	initially	thought	that	this	icon	would	allow	me	to	spy	on	my
neighbours,	but	this	shining	diamond	of	excitement	was	snatched	cruelly	from
me	as	I	discovered	that	it	enables	me	to	search	for	words	or	numbers	in	the	data
editor	or	viewer.	In	the	data	editor,	clicking	this	icon	initiates	a	search	within	the
variable	(column)	that	is	currently	active.	This	shortcut	is	useful	if	you	realize
from	plotting	the	data	that	you	have	made	an	error,	for	example	typed	20.02
instead	of	2.02	(see	Section	5.4),	and	you	need	to	find	the	error	–	in	this	case	by
searching	for	20.02	within	the	relevant	variable	and	replacing	it	with	2.02
(Figure	4.7).
Figure	4.7	The	Find	and	Replace	dialog	box
 Clicking	on	this	icon	inserts	a	new	case	in	the	data	editor	(it
creates	a	blank	row	at	the	point	that	is	currently	highlighted	in	the	data	editor).
 Clicking	on	this	icon	creates	a	new	variable	to	the	left	of	the
variable	that	is	currently	active	(to	activate	a	variable	click	the	name	at	the	top	of
the	column).
Clicking on this icon is a shortcut to the Data → Split File…
dialog box (see Section 6.10.4). In SPSS, we differentiate groups of cases by
using	a	coding	variable	(see	Section	4.6.5),	and	this	function	runs	any	analyses
separately	for	groups	coded	with	such	a	variable.	For	example,	imagine	we	test
males	and	females	on	their	statistical	ability.	We	would	code	each	participant
with	a	number	that	represents	their	sex	(e.g.,	1	=	female,	0	=	male).	If	we	then
want	to	know	the	mean	statistical	ability	for	males	and	females	separately	we
ask	SPSS	to	split	the	file	by	the	variable	Sex	and	then	run	descriptive	statistics.
This icon shortcuts to the Data → Weight Cases… dialog box. As we
shall	see,	you	sometimes	need	to	use	the	weight	cases	function	when	you	analyse
frequency	data	(see	Section	19.7.2).	It	is	also	useful	for	some	advanced	issues	in
survey	sampling.
This icon is a shortcut to the Data → Select Cases… dialog box,
which	can	be	used	if	you	want	to	analyse	only	a	portion	of	your	data.	This
function	allows	you	to	specify	what	cases	you	want	to	include	in	the	analysis.
 Clicking	on	this	icon	either	displays	or	hides	the	value	labels	of
any	coding	variables	in	the	data	editor.	We	use	a	coding	variable	to	input
information	about	category	or	group	membership.	We	discuss	this	in	Section
4.6.5.	Briefly,	if	we	wanted	to	record	participant	sex,	we	could	create	a	variable
called	Sex	and	assign	1	as	female	and	0	as	male.	We	do	this	by	assigning	value
labels describing the category (e.g., ‘female’) to the number assigned to the
category	(e.g.,	1).	In	the	data	editor,	we’d	enter	a	number	1	for	any	females	and	0
for	any	males.	Clicking	this	icon	toggles	between	the	numbers	you	entered
(you’d	see	a	column	of	0s	and	1s)	and	the	value	labels	you	assigned	to	those
numbers	(you’d	see	a	column	displaying	the	word	‘male’	or	‘female’	in	each
cell).
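If it helps to see the logic behind the last few icons, here is a minimal sketch in Python (a general-purpose programming language that is not part of SPSS) of roughly what splitting a file by a coding variable and toggling value labels amount to. The scores are made up, and the names VALUE_LABELS, split_means and show_labels are invented for illustration; they are not anything that SPSS provides.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical value labels for a coding variable called Sex (1 = female, 0 = male).
VALUE_LABELS = {1: "female", 0: "male"}

# Hypothetical data: each tuple is one case (Sex code, statistical ability score).
cases = [(1, 62), (0, 55), (1, 70), (0, 49), (1, 66)]


def split_means(rows):
    """Return the mean score for each code, in the spirit of descriptives on a split file."""
    groups = defaultdict(list)
    for code, score in rows:
        groups[code].append(score)
    return {code: mean(scores) for code, scores in groups.items()}


def show_labels(rows, labels):
    """Swap each numeric code for its value label, as the value labels icon does on screen."""
    return [(labels[code], score) for code, score in rows]


means_by_code = split_means(cases)
labelled = show_labels(cases, VALUE_LABELS)
print(means_by_code)
print(labelled)
```

The point is only that the numbers stay in the data and the labels are a lookup laid over them; in SPSS you get the same result through the dialog boxes without writing any code.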
4.6	Entering	data	into	IBM	SPSS	Statistics	
4.6.1	Data	formats	
There	are	two	common	data	entry	formats,	which	are	sometimes	referred	to	as
wide	format	data	and	long	format	data.	Most	of	the	time,	we	enter	data	into
SPSS	in	wide	format,	although	you	can	switch	between	wide	and	long	formats
using the Data → Restructure… menu. In the wide format each row represents data
from	one	entity	and	each	column	represents	a	variable.	There	is	no
discrimination	between	predictor	(independent)	and	outcome	(dependent)
variables:	both	appear	in	a	separate	column.	The	key	point	is	that	each	row
represents	one	entity’s	data	(be	that	entity	a	human,	mouse,	tulip,	business,	or
water	sample)	and	any	information	about	that	entity	should	be	entered	across	the
data	editor.	Contrast	this	with	long	format,	in	which	scores	on	an	outcome
variable	appear	in	a	single	column	and	rows	represent	a	combination	of	the
attributes	of	those	scores.	In	long	format	data,	scores	from	a	single	entity	can
appear	over	multiple	rows,	where	each	row	represents	a	combination	of	the
attributes	of	the	score	(the	entity	from	which	the	score	came,	to	which	level	of	an
independent	variable	the	score	belongs,	the	time	point	at	which	the	score	was
recorded,	etc.).
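To make the contrast concrete, the sketch below (in Python, which is not part of SPSS) stacks a tiny wide-format data set into long format. The scores are invented, and the names wide_to_long, entity, time1, time2, level and score are hypothetical choices made for this illustration. It shows the idea only and is not a description of how SPSS restructures data internally.

```python
# Hypothetical wide-format data: one row per entity, one column per variable.
wide = [
    {"entity": "A", "group": 1, "time1": 7, "time2": 10},
    {"entity": "B", "group": 0, "time1": 8, "time2": 9},
]


def wide_to_long(rows, id_vars, value_vars, var_name="level", value_name="score"):
    """Stack the value_vars columns so that each output row holds a single score."""
    long_rows = []
    for row in rows:
        for var in value_vars:
            new_row = {key: row[key] for key in id_vars}  # attributes of the entity
            new_row[var_name] = var                       # which level the score belongs to
            new_row[value_name] = row[var]                # the score itself
            long_rows.append(new_row)
    return long_rows


long_rows = wide_to_long(wide, id_vars=["entity", "group"], value_vars=["time1", "time2"])
for row in long_rows:
    print(row)
```

Entity A occupies a single row in the wide data but two rows in the long data, one for each score, which is exactly the difference between the two formats.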
We	use	the	long	format	in	Chapter	21,	but	for	everything	else	in	this	book	we	use
the	wide	format,	so	let’s	look	at	an	example	of	how	to	enter	data	in	this	way.
Imagine	you	were	interested	in	how	perceptions	of	pain	created	by	hot	and	cold
stimuli	were	influenced	by	whether	or	not	you	swore	while	in	contact	with	the
stimulus	(Stephens,	Atkins,	&	Kingston,	2009).	You	could	place	some	people’s
hands	in	a	bucket	of	very	cold	water	for	a	minute	and	ask	them	to	rate	how
painful	they	thought	the	experience	was	on	a	scale	of	1	to	10.	You	could	then	ask
them	to	hold	a	hot	potato	and	again	measure	their	perception	of	pain.	Half	the
participants	are	encouraged	to	shout	profanities	during	the	experiences.	Imagine
I	was	a	participant	in	the	swearing	group.	You	would	have	a	single	row
representing	my	data,	so	there	would	be	a	different	column	for	my	name,	the
group	I	was	in,	my	pain	perception	for	cold	water	and	my	pain	perception	for	a
hot	potato:	Andy,	Swearing	Group,	7,	10.
The	column	with	the	information	about	the	group	to	which	I	was	assigned	is	a
grouping	variable:	I	can	belong	to	either	the	group	that	could	swear	or	the	group
that	was	forbidden,	but	not	both.	This	variable	is	a	between-group	or
independent	measure	(different	people	belong	to	different	groups).	In	SPSS	we
typically	represent	group	membership	with	numbers,	not	words,	but	assign	labels
to	those	numbers.	As	such,	group	membership	is	represented	by	a	single	column
in	which	the	group	to	which	the	person	belonged	is	defined	using	a	number	(see
Section	4.6.5).	For	example,	we	might	decide	that	if	a	person	was	in	the
swearing	group	we	assign	them	the	number	1,	and	if	they	were	in	the	non-
swearing	group	we	assign	them	a	0.	We	then	assign	a	value	label	to	each
number,	which	is	text	that	describes	what	the	number	represents.	To	enter	group
membership,	we	would	input	the	numbers	we	have	decided	to	use	into	the	data
editor,	but	the	value	labels	remind	us	which	groups	those	numbers	represent	(see
Section	6.10.4).
The	two	pain	scores	make	up	a	repeated	measure	because	all	of	the	participants
produced	a	score	after	contact	with	a	hot	and	cold	stimulus.	Levels	of	this
variable	(see	SPSS	Tip	4.2)	are	entered	in	separate	columns	(one	for	pain	from	a
hot	stimulus	and	one	for	pain	from	a	cold	stimulus).
Figure	4.8	The	variable	view	of	the	SPSS	Data	Editor
SPSS	Tip	4.2	Wide	format	data	entry	
When	using	the	wide	format,	there	is	a	simple	rule:	data	from
different	things	go	in	different	rows	of	the	data	editor,	whereas	data
from	the	same	things	go	in	different	columns	of	the	data	editor.	As
such,	each	person	(or	mollusc,	goat,	organization,	or	whatever	you
have	measured)	is	represented	in	a	different	row.	Data	within	each
person	(or	mollusc,	etc.)	go	in	different	columns.	So,	if	you’ve
prodded	your	mollusc,	or	human,	several	times	with	a	pencil	and
measured	how	much	it	twitches	as	an	outcome,	then	each	prod	will
be	represented	by	a	column.
In	experimental	research	this	means	that	variables	measured	with	the
same	participants	(a	repeated	measure)	should	be	represented	by
several	columns	(each	column	representing	one	level	of	the	repeated-
measures	variable).	However,	any	variable	that	defines	different
groups	of	things	(such	as	when	a	between-group	design	is	used	and
different	participants	are	assigned	to	different	levels	of	the
independent	variable)	is	defined	using	a	single	column.	This	idea	will
become	clearer	as	you	learn	about	how	to	carry	out	specific
procedures.
The	data	editor	is	made	up	of	lots	of	cells,	which	are	boxes	in	which	data	values
can	be	placed.	When	a	cell	is	active,	it	becomes	highlighted	in	orange	(as	in
Figure	4.3).	You	can	move	around	the	data	editor,	from	cell	to	cell,	using	the
arrow	keys	←↑↓→	(on	the	right	of	the	keyboard)	or	by	clicking	the	mouse	on
the	cell	that	you	wish	to	activate.	To	enter	a	number	into	the	data	editor,	move	to
the	cell	in	which	you	want	to	place	the	data	value,	type	the	value,	then	press	the
appropriate	arrow	button	for	the	direction	in	which	you	wish	to	move.	So,	to
enter	a	row	of	data,	move	to	the	far	left	of	the	row,	type	the	first	value	and	then
press	→	(this	process	inputs	the	value	and	moves	you	into	the	next	cell	on	the
right).
4.6.2	The	variable	view	
Before	we	input	data	into	the	data	editor,	we	need	to	create	the	variables	using
the	variable	view.	To	access	this	view	click	the	‘Variable	View’	tab	at	the	bottom
of the data editor; the contents of the window will change (see
Figure	4.8).
Every	row	of	the	variable	view	represents	a	variable,	and	you	set	characteristics
of	each	variable	by	entering	information	into	the	following	labelled	columns
(play	around,	you’ll	get	the	hang	of	it):
Let’s	use	the	variable	view	to	create	some	variables.	Imagine	we	were	interested
in	looking	at	the	differences	between	lecturers	and	students.	We	took	a	random
sample	of	five	psychology	lecturers	from	the	University	of	Sussex	and	five
psychology	students	and	then	measured	how	many	friends	they	had,	their	weekly
alcohol	consumption	(in	units),	their	yearly	income	and	how	neurotic	they	were
(higher	score	is	more	neurotic).	These	data	are	in	Table	4.1.
4.6.3	Creating	a	string	variable	
The	first	variable	in	Table	4.1	is	the	name	of	the	lecturer/student.	This	variable	is
a	string	variable	because	it	consists	of	names	(which	are	strings	of	letters).	To
create	this	variable	in	the	variable	view:
1.	 Click	in	the	first	white	cell	in	the	column	labelled	Name.
2.	 Type	the	word	‘Name’.
3.	 Move	from	this	cell	using	the	arrow	keys	on	the	keyboard	(you	can	also	just
click	in	a	different	cell,	but	this	is	a	very	slow	way	of	doing	it).
Well	done,	you’ve	just	created	your	first	variable.	Notice	that	once	you’ve	typed
a	name,	SPSS	creates	default	settings	for	the	variable	(such	as	assuming	it’s
numeric and assigning 2 decimal places). However, we don’t want a numeric
variable (i.e., numbers): we want to enter people’s names, so we need a string
variable, which means changing the variable type. Move into the column labelled
Type using the arrow keys on the keyboard. The cell will now show the current
type (Numeric) with a small button at its right-hand edge. Click this button to
activate the Variable Type dialog box.
By default, the numeric variable type is selected (Numeric) – see the top of
Figure 4.9. To change the variable to a string variable, click String
(bottom left of Figure 4.9). Next, if you need to enter text of more than 8
characters (the default width), then change this default value to a number
reflecting the maximum number of characters that you will use for a given case
of data. Click OK to return to the variable view.
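As a rough mental model of what has just happened, the Python sketch below treats one row of the variable view as a small record with default settings and then converts it to a string variable that is wide enough for the longest name. Everything here is an assumption made for illustration: VariableDefinition and as_string are invented names, ‘Bartholomew’ is a made-up name that is not in the data, and setting decimals to 0 for a string is simply a choice made in this sketch. None of it is an SPSS interface.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VariableDefinition:
    """A toy stand-in for one row of the variable view (not an SPSS API)."""
    name: str
    var_type: str = "numeric"  # default type described in the text
    width: int = 8             # default width described in the text
    decimals: int = 2          # default decimal places described in the text


def as_string(variable, values):
    """Return a string version of variable, wide enough for the longest value."""
    longest = max((len(value) for value in values), default=0)
    return replace(
        variable,
        var_type="string",
        width=max(variable.width, longest),  # only widen beyond the default if needed
        decimals=0,                          # assumption: decimals are irrelevant to text
    )


name_var = VariableDefinition("Name")
names = ["Ben", "Martin", "Bartholomew"]  # the last name is hypothetical
string_var = as_string(name_var, names)
print(name_var)
print(string_var)
```

The default width of 8 is kept unless a longer value turns up, which mirrors the advice above to change the width only when you need more than 8 characters.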
SPSS	Tip	4.3	Naming	variables	
‘Surely	it’s	a	waste	of	my	time	to	type	in	long	names	for	my
variables	when	I’ve	already	given	them	a	short	one?’	I	hear	you	ask.	I
can	understand	why	it	would	seem	so,	but	as	you	go	through
university	or	your	career	accumulating	data	files,	you	will	be	grateful
that	you	did.	Imagine	you	had	a	variable	called	‘number	of	times	I
wanted	to	bang	the	desk	with	my	face	during	Andy	Field’s	statistics
lecture’;	then	you	might	have	named	the	column	in	SPSS	‘nob’	(short
for	number	of	bangs).	You	thought	you	were	smart	coming	up	with
such	a	succinct	label.	If	you	don’t	add	a	more	detailed	label,	SPSS
uses	this	variable	name	in	all	the	output	from	an	analysis.	Fast
forward	a	few	months	when	you	need	to	look	at	your	data	and	output
again. You look at the 300 columns all labelled things like ‘nob’,
‘pom’, ‘p’, ‘lad’, ‘sit’ and ‘ssoass’ and think to yourself, ‘What does
"nob" stand for? Which of these variables relates to face-butting a
desk?’ Imagine the chaos you could get into if you always used
acronyms for the variable and had an outcome of ‘wait at news kiosk’
for	a	study	about	queuing.	I	deal	with	many	data	sets	with	variables
called	things	like	‘sftg45c’,	and	if	they	don’t	have	proper	variable
labels,	then	I’m	in	all	sorts	of	trouble.	Get	into	a	good	habit	and	label
your	variables.
Next, because I want you to get into good habits, move to the cell in the
Label column and type a description of the variable, such as
‘Participant’s First Name’. Finally, we can specify the scale of measurement for
the variable (see Section 1.6.2) by going to the column labelled Measure and
selecting Nominal, Ordinal or Scale from the drop-down
list. In the case of a string variable, it represents a description of the case and
provides no information about the order of cases, or the magnitude of one case
compared to another. Therefore, select Nominal.
Once	the	variable	has	been	created,	return	to	the	data	view	by	clicking	on	the
‘Data View’ tab at the bottom of the data editor. The contents of
the	window	will	change,	and	notice	that	the	first	column	now	has	the	label
Name.	We	can	enter	the	data	for	this	variable	in	the	column	underneath.	Click
the	white	cell	at	the	top	of	the	column	labelled	Name	and	type	the	first	name,
‘Ben’.	To	register	this	value	in	this	cell,	move	to	a	different	cell	and	because	we
are	entering	data	down	a	column,	the	most	sensible	way	to	do	this	is	to	press	the
↓	key	on	the	keyboard.	This	action	moves	you	down	to	the	next	cell,	and	the
word	‘Ben’	should	appear	in	the	cell	above.	Enter	the	next	name,	‘Martin’,	and
then	press	↓	to	move	down	to	the	next	cell,	and	so	on.
4.6.4	Creating	a	date	variable	
The	second	column	in	our	table	contains	dates	(birth	dates	to	be	exact).	To	create
a	date	variable,	we	more	or	less	repeat	what	we’ve	just	done.	First,	move	back	to
the variable view using the tab at the bottom of the data editor.
Move to the cell in row 2 of the column labelled Name (under the previous
variable you created). Type the word ‘Birth_Date’ (note that I have used an
underscore as a ‘hard’ space to separate the words). Move into the column
labelled Type using
the → key on the keyboard (doing so creates default settings in the other
columns). As before, the cell you have moved into will indicate the default of
Numeric, and to change this we click the button in the cell to activate the Variable
Type dialog box, and click Date (bottom right of Figure 4.9). On the right of
the	dialog	box	is	a	list	of	date	formats,	from	which	you	can	choose	your
preference;	being	British,	I	am	used	to	the	day	coming	before	the	month	and
have	chosen	dd-mmm-yyyy	(i.e.,	21-Jun-1973),	but	Americans,	for	example,
more often put the month before the day and so might select mm/dd/yyyy
(06/21/1973). When you have selected a date format, click OK to
return	to	the	variable	view.	Finally,	move	to	the	cell	in	the	column	labelled	Label
and	type	‘Date	of	Birth’.
Once	the	variable	has	been	created,	return	to	the	data	view	by	clicking	on	the
‘Data View’ tab. The second column now has the label
Birth_Date;	click	the	white	cell	at	the	top	of	this	column	and	type	the	first	value,
03-Jul-1977.	To	register	this	value	in	this	cell,	move	down	to	the	next	cell	by
pressing	the	↓	key.	Now	enter	the	next	date,	and	so	on.
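If it helps to see that dd-mmm-yyyy and mm/dd/yyyy are two ways of displaying the same underlying date, here is a minimal sketch in Python. Python is my choice for illustration and has nothing to do with how SPSS stores dates; the function name is made up, and the month abbreviations assume an English locale.

```python
from datetime import datetime

# Display formats matching the two options discussed above.
BRITISH = "%d-%b-%Y"   # dd-mmm-yyyy
AMERICAN = "%m/%d/%Y"  # mm/dd/yyyy

def reformat_date(text, from_format, to_format):
    """Parse a date written in one display format and rewrite it in another."""
    return datetime.strptime(text, from_format).strftime(to_format)

print(reformat_date("21-Jun-1973", BRITISH, AMERICAN))  # 06/21/1973
print(reformat_date("06/21/1973", AMERICAN, BRITISH))   # 21-Jun-1973
```

The date itself never changes, only the way it is written, which is why choosing a format in the dialog box is a matter of preference.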
Figure	4.9	Defining	numeric,	string	and	date	variables
4.6.5	Creating	coding	variables	
I’ve	mentioned	coding	or	grouping	variables	briefly	already;	they	use	numbers
to	represent	different	groups	or	categories	of	data.	As	such,	a	coding	variable	is
numeric, but because the numbers represent names its level of measurement is
nominal. The groups of data represented by coding variables could be
levels	of	a	treatment	variable	in	an	experiment	(an	experimental	group	or	a
control	group),	different	naturally	occurring	groups	(men	or	women,	ethnic
groups,	marital	status,	etc.),	different	geographic	locations	(countries,	states,
cities,	etc.),	or	different	organizations	(different	hospitals	within	a	healthcare
trust,	different	schools	in	a	study,	different	companies).
In	experiments	that	use	an	independent	design,	coding	variables	represent
predictor	(independent)	variables	that	have	been	measured	between	groups	(i.e.,
different	entities	were	assigned	to	different	groups).	We	do	not,	generally,	use
this	kind	of	coding	variable	for	experimental	designs	where	the	independent
variable	was	manipulated	using	repeated	measures	(i.e.,	participants	take	part	in
all	experimental	conditions).	For	repeated-measures	designs	we	typically	use
different	columns	to	represent	different	experimental	conditions.
Think	back	to	our	swearing	and	pain	experiment.	This	was	an	independent
design	because	we	had	two	groups	representing	the	two	levels	of	our
independent	variable:	one	group	could	swear	during	the	pain	tasks,	the	other
could	not.	Therefore,	we	can	use	a	coding	variable.	We	might	assign	the
experimental	group	(swearing)	a	code	of	1	and	the	control	group	(no	swearing)	a
code	of	0.	To	input	these	data	you	would	create	a	variable	(which	you	might	call
group)	and	type	the	value	1	for	any	participants	in	the	experimental	group,	and	0
for	any	participant	in	the	control	group.	These	codes	tell	SPSS	that	the	cases	that
have	been	assigned	the	value	1	should	be	treated	as	belonging	to	the	same	group,
and	likewise	for	the	cases	assigned	the	value	0.	The	codes	you	use	are	arbitrary
because	the	numbers	themselves	won’t	be	analysed,	so	although	people	typically
use	0,	1,	2,	3,	etc.,	if	you’re	a	particularly	arbitrary	person	feel	free	to	code	one
group	as	616	and	another	as	11	and	so	on.
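To convince yourself that the codes really are arbitrary, here is a small Python sketch (not SPSS) that averages some scores by group using the conventional codes and then the silly ones. The scores are made up for illustration, and the function name is my own.

```python
from collections import defaultdict

def group_means(codes, scores):
    """Average the scores separately for each distinct group code."""
    grouped = defaultdict(list)
    for code, score in zip(codes, scores):
        grouped[code].append(score)
    return {code: sum(vals) / len(vals) for code, vals in grouped.items()}

# Hypothetical scores: the first three cases swore, the last three did not.
scores = [12, 15, 14, 9, 8, 10]

# Conventional codes: 1 = swearing, 0 = no swearing.
usual = group_means([1, 1, 1, 0, 0, 0], scores)
# Arbitrary codes: 616 = swearing, 11 = no swearing.
arbitrary = group_means([616, 616, 616, 11, 11, 11], scores)

print(usual[1] == arbitrary[616], usual[0] == arbitrary[11])  # True True
```

The codes only say which cases belong together, so the group means are identical whichever numbers you pick.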
We	have	a	coding	variable	in	our	data	that	describes	whether	a	person	was	a
lecturer	or	student.	To	create	this	coding	variable,	we	follow	the	same	steps	as
before,	but	we	will	also	have	to	record	which	numeric	codes	are	assigned	to
which groups. First, return to the variable view if you’re not
already in it and move to the cell in the third row under the column labelled
Name. Type a name (let’s call it Group). I’m still trying to instil good habits, so
move along the third row to the column called Label and give the variable a full
description such as, ‘Is the person a lecturer or a student?’ To define the group
codes, move along the row to the column labelled Values. The cell will
indicate the default of None. Click the button in the cell to access the Value
Labels dialog box (see Figure 4.10).
The	Value	Labels	dialog	box	is	used	to	specify	group	codes.	First,	click	in	the
white	space	next	to	where	it	says	Value	(or	press	Alt	and	U	at	the	same	time)	and
type	in	a	code	(e.g.,	1).	The	second	step	is	to	click	in	the	white	space	below,	next
to	where	it	says	Label	(or	press	Tab,	or	Alt	and	L	at	the	same	time)	and	type	in	an
appropriate	label	for	that	group.	In	Figure	4.10	I	have	already	defined	a	code	of
1	for	the	lecturer	group,	and	then	I	have	typed	in	2	as	a	code	and	given	this	a
label of Student. To add this code to the list click Add. When you have
defined all your coding values you might want to check for spelling mistakes in
the value labels by clicking Spelling. To finish, click OK; if you
do this before you have clicked Add to register your most recent code
in the list, SPSS displays a warning that any ‘pending changes will be lost’. This
message is telling you to go back and click Add before continuing.
Finally,	coding	variables	represent	categories	and	so	the	scale	of	measurement	is
nominal	(or	ordinal	if	the	categories	have	a	meaningful	order).	To	specify	this
level of measurement, go to the column labelled Measure and select
Nominal (or Ordinal if the groups have a meaningful order) from the
drop-down	list.
Figure	4.10	Defining	coding	variables	and	their	values
Having	defined	your	codes,	switch	to	the	data	view	and	for	each	participant	type
the	numeric	value	that	represents	their	group	membership	into	the	column
labelled	Group.	In	our	example,	if	a	person	was	a	lecturer,	type	‘1’,	but	if	they
were	a	student	then	type	‘2’	(see	SPSS	Tip	4.4).	SPSS	can	display	either	the
numeric	codes	or	the	value	labels	that	you	assigned	to	them,	and	you	can	toggle
between the two states by clicking the value labels icon in the toolbar (see
Figure 4.11). Figure 4.11 shows how the data should be arranged (remember that
each row of the data editor represents data from one entity): the first five
participants were lecturers,
whereas	participants	6–10	were	students.
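The relationship between the stored codes and the value labels can be sketched in a few lines of Python. This is an illustration of the idea rather than anything SPSS does internally, and the names are my own.

```python
# Value labels as defined in the text: 1 = Lecturer, 2 = Student.
VALUE_LABELS = {1: "Lecturer", 2: "Student"}

# One row per participant: the first five are lecturers, participants 6-10 students.
group = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]

def display(codes, labels, show_labels):
    """Return what the column would show with value labels switched on or off."""
    if show_labels:
        return [labels.get(code, code) for code in codes]
    return list(codes)

print(display(group, VALUE_LABELS, show_labels=False))
print(display(group, VALUE_LABELS, show_labels=True))
```

Switching the labels on changes what you see, not what is stored: the column still contains 1s and 2s.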
4.6.6	Creating	a	numeric	variable	
Our	next	variable	is	Friends,	which	is	numeric.	Numeric	variables	are	the
easiest	ones	to	create	because	they	are	the	default	format	in	SPSS.	Move	back	to
the variable view using the tab at the bottom of the data editor.
Go to the cell in row 4 of the column labelled Name (under the previous variable
you created). Type the word ‘Friends’. Move into the column labelled
Type using the → key on the keyboard. As with the previous variables
we have created, SPSS has assumed that our new variable is Numeric, and
because	our	variable	is	numeric	we	don’t	need	to	change	this	setting.
The	scores	for	the	number	of	friends	have	no	decimal	places	(unless	you	are	a
very strange person indeed, you can’t have 0.23 of a friend). Move to the
Decimals column and type ‘0’ (or decrease the value from 2 to 0 using
the arrows in the cell) to tell SPSS that you don’t want to display decimal places.
Let’s	continue	our	good	habit	of	naming	variables	and	move	to	the	cell	in	the
column	labelled	Label	and	type	‘Number	of	Friends’.	Finally,	number	of	friends
is	measured	on	the	ratio	scale	of	measurement	(see	Section	1.6.2)	and	we	can
specify this by going to the column labelled Measure and selecting
Scale from the drop-down list (this will have been done automatically,
but	it’s	worth	checking).
Figure	4.11	Coding	values	in	the	data	editor	with	the	value	labels	switched	off
and	on
SPSS	Tip	4.4	Copying	and	pasting	into	the	data	editor	and	variable
viewer	
Often	(especially	with	coding	variables),	you	need	to	enter	the	same
value	lots	of	times	into	the	data	editor.	Similarly,	in	the	variable	view,
you	might	have	a	series	of	variables	that	all	have	the	same	value
labels	(e.g.,	variables	representing	questions	on	a	questionnaire	might
all	have	value	labels	of	0	=	never,	1	=	sometimes,	2	=	always	to
represent	responses	to	those	questions).	Rather	than	typing	the	same
number	lots	of	times,	or	entering	the	same	value	labels	multiple
times,	you	can	use	the	copy	and	paste	functions	to	speed	things	up.
All	you	need	to	do	is	to	select	the	cell	containing	the	information	that
you	want	to	copy	(whether	that	is	a	number	or	text	in	the	data	view,
or	a	set	of	value	labels	or	another	characteristic	within	the	variable
view)	and	click	with	the	right	mouse	button	to	activate	a	menu	within
which	you	can	click	(with	the	left	mouse	button)	on	Copy	(top	of
Figure	4.12).	Next,	highlight	any	cells	into	which	you	want	to	place
what	you	have	copied	by	dragging	the	mouse	over	them	while
holding	down	the	left	mouse	button.	These	cells	will	be	highlighted
in	orange.	While	the	pointer	is	over	the	highlighted	cells,	click	with
the	right	mouse	button	to	activate	a	menu	from	which	you	should
click	Paste	(bottom	left	of	Figure	4.12).	The	highlighted	cells	will	be
filled	with	the	value	that	you	copied	(bottom	right	of	Figure	4.12).
Figure	4.12	shows	the	process	of	copying	the	value	‘1’	and	pasting	it
into	four	blank	cells	in	the	same	column.
Figure	4.12	Copying	and	pasting	into	empty	cells
Why	is	the	‘Number	of	Friends’	variable	a	‘scale’	variable?
Once	the	variable	has	been	created,	you	can	return	to	the	data	view	by	clicking
on the ‘Data View’ tab at the bottom of the data editor. The
contents	of	the	window	will	change,	and	you’ll	notice	that	the	fourth	column
now	has	the	label	Friends.	To	enter	the	data,	click	the	white	cell	at	the	top	of	the
column	labelled	Friends	and	type	the	first	value,	5.	Because	we’re	entering
scores	down	the	column	the	most	sensible	way	to	record	this	value	in	this	cell	is
to	press	the	↓	key	on	the	keyboard.	This	action	moves	you	down	to	the	next	cell,
and	the	number	5	is	stored	in	the	cell	above.	Enter	the	next	number,	2,	and	then
press	↓	to	move	down	to	the	next	cell,	and	so	on.
Having	created	the	first	four	variables	with	a	bit	of	guidance,	try	to
enter	the	rest	of	the	variables	in	Table	4.1	yourself.
4.6.7	Missing	values	
Although	we	strive	to	collect	complete	sets	of	data,	often	scores	are	missing.
Missing	data	can	occur	for	a	variety	of	reasons:	in	long	questionnaires
participants	accidentally	(or,	depending	on	how	paranoid	you’re	feeling,
deliberately	to	irritate	you)	miss	out	questions;	in	experimental	procedures
mechanical	faults	can	lead	to	a	score	not	being	recorded;	and	in	research	on
delicate	topics	(e.g.,	sexual	behaviour)	participants	may	exert	their	right	not	to
answer	a	question.	However,	just	because	we	have	missed	out	on	some	data	for	a
participant,	that	doesn’t	mean	that	we	have	to	ignore	the	data	we	do	have
(although	it	creates	statistical	difficulties).	The	simplest	way	to	record	a	missing
score	is	to	leave	the	cell	in	the	data	editor	empty,	but	it	can	be	helpful	to	tell
SPSS	explicitly	that	a	score	is	missing.	We	do	this,	much	like	a	coding	variable,
by	choosing	a	number	to	represent	the	missing	data	point.	You	then	tell	SPSS	to
treat	that	number	as	missing.	For	obvious	reasons,	it	is	important	to	choose	a
code	that	cannot	also	be	a	naturally	occurring	data	value.	For	example,	if	we	use
the	value	9	to	code	missing	values	and	several	participants	genuinely	scored	9,
then	SPSS	will	wrongly	treat	those	scores	as	missing.	You	need	an	‘impossible’
value,	so	people	usually	pick	a	score	greater	than	the	maximum	possible	score
on	the	measure.	For	example,	in	an	experiment	in	which	attitudes	are	measured
on	a	100-point	scale	(so	scores	vary	from	1	to	100)	a	good	code	for	missing
values	might	be	something	like	101,	999	or,	my	personal	favourite,	666	(because
missing	values	are	the	devil’s	work).
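Here is a rough Python sketch of what ‘treat that number as missing’ amounts to, and of why the code has to be an impossible value. The scores are invented and the function is my own illustration, not SPSS’s implementation.

```python
MISSING_CODE = 666  # an 'impossible' score on a 1-100 scale

def mean_ignoring_missing(scores, missing_code=MISSING_CODE):
    """Average the scores after dropping any that equal the missing-value code."""
    valid = [s for s in scores if s != missing_code]
    return sum(valid) / len(valid)

# Hypothetical attitude scores; the third participant's score is missing.
attitudes = [50, 70, 666, 90]
print(mean_ignoring_missing(attitudes))  # 70.0

# A bad choice of code: 9 is a legitimate score, so real data get thrown away.
print(mean_ignoring_missing([9, 9, 3], missing_code=9))  # 3.0
```

In the second call two genuine scores of 9 are discarded, which is exactly the mistake the text warns about.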
Labcoat	Leni’s	Real	Research	4.1	Gonna	be	a	rock	‘n’	roll	singer	
Oxoby, R. J. (2008). Economic Inquiry, 47(3), 598–602.
AC/DC are one of the best-selling hard rock bands in history,
with	around	100	million	certified	sales,	and	an	estimated	200	million
actual	sales.	In	1980	their	original	singer	Bon	Scott	died	of	alcohol
poisoning	and	choking	on	his	own	vomit.	He	was	replaced	by	Brian
Johnson,	who	has	been	their	singer	ever	since.5	Debate	rages	with
unerring	frequency	within	the	rock	music	press	over	who	is	the	better
frontman.	The	conventional	wisdom	is	that	Bon	Scott	was	better,
although	personally,	and	I	seem	to	be	somewhat	in	the	minority	here,
I	prefer	Brian	Johnson.	Anyway,	Robert	Oxoby,	in	a	playful	paper,
decided	to	put	this	argument	to	bed	once	and	for	all	(Oxoby,	2008).
5	Well,	until	all	that	weird	stuff	with	W.	Axl	Rose	in	2016,	which	I’m
trying	to	pretend	didn’t	happen.
Using	a	task	from	experimental	economics	called	the	ultimatum
game,	individuals	are	assigned	the	role	of	either	proposer	or
responder	and	paired	randomly.	Proposers	are	allocated	$10	from
which they have to make a financial offer to the responder (e.g., $2).
The	responder	can	accept	or	reject	this	offer.	If	the	offer	is	rejected
neither	party	gets	any	money,	but	if	the	offer	is	accepted	the
responder	keeps	the	offered	amount	(e.g.,	$2),	and	the	proposer	keeps
the	original	amount	minus	what	they	offered	(e.g.,	$8).	For	half	of
the	participants	the	song	‘It’s	a	long	way	to	the	top’	sung	by	Bon
Scott	was	playing	in	the	background,	for	the	remainder	‘Shoot	to
thrill’	sung	by	Brian	Johnson	was	playing.	Oxoby	measured	the
offers	made	by	proposers,	and	the	minimum	offers	that	responders
accepted	(called	the	minimum	acceptable	offer).	He	reasoned	that
people	would	accept	lower	offers	and	propose	higher	offers	when
listening	to	something	they	like	(because	of	the	‘feel-good	factor’	the
music	creates).	Therefore,	by	comparing	the	value	of	offers	made	and
the	minimum	acceptable	offers	in	the	two	groups,	he	could	see
whether	people	have	more	of	a	feel-good	factor	when	listening	to
Bon	or	Brian.	The	offers	made	(in	$)	are6	as	follows	(there	were	18
people	per	group):
6	These	data	are	estimated	from	Figures	1	and	2	in	the	paper	because
I	couldn’t	get	hold	of	the	author	to	get	the	original	data	files.
Bon	Scott	group:	1,	2,	2,	2,	2,	3,	3,	3,	3,	3,	4,	4,	4,	4,	4,	5,	5,	5
Brian	Johnson	group:	2,	3,	3,	3,	3,	3,	4,	4,	4,	4,	4,	5,	5,	5,	5,	5,	5,
5
Enter	these	data	into	the	SPSS	Data	Editor,	remembering	to	include
value	labels,	to	set	the	measure	property,	to	give	each	variable	a
proper	label,	and	to	set	the	appropriate	number	of	decimal	places.
Answers	are	on	the	companion	website,	and	my	version	of	how	this
file	should	look	can	be	found	in	Oxoby	(2008)	Offers.sav.
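If you are unsure how these two lists become rows and columns, the following Python sketch shows one possible layout: one row per proposer, a coding variable for the singer and a column for the offer. The codes (1 and 2) and all the names are my assumptions, so the companion file may be coded differently.

```python
# Offers (in $) exactly as listed above, 18 per group.
bon_scott = [1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5]
brian_johnson = [2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]

# Hypothetical coding scheme: 1 = Bon Scott, 2 = Brian Johnson.
SINGER_LABELS = {1: "Bon Scott", 2: "Brian Johnson"}

# One row per proposer: (singer code, offer in $).
rows = [(1, offer) for offer in bon_scott] + [(2, offer) for offer in brian_johnson]

def mean_offer(rows, code):
    """Mean offer for the proposers who heard the singer with this code."""
    offers = [offer for singer, offer in rows if singer == code]
    return sum(offers) / len(offers)

for code, label in SINGER_LABELS.items():
    print(label, round(mean_offer(rows, code), 2))
```

The point is the shape of the data (36 rows, two columns) rather than the means, which are just a quick check that nothing was mistyped.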
To specify missing values click in the column labelled Missing in the
variable view and then click the button in the cell to activate the
Missing	Values	dialog	box	in	Figure	4.13.	By	default,	SPSS	assumes	that	no
missing	values	exist,	but	you	can	define	them	in	one	of	two	ways.	The	first	is	to
select	discrete	values	(by	clicking	on	the	radio	button	next	to	where	it	says
Discrete	missing	values),	which	are	single	values	that	represent	missing	data.
SPSS	allows	you	to	specify	up	to	three	values	to	represent	missing	data.	The
reason	why	you	might	choose	to	have	several	numbers	to	represent	missing
values	is	that	you	can	assign	a	different	meaning	to	each	discrete	value.	For
example,	you	could	have	the	number	8	representing	a	response	of	‘not
applicable’,	a	code	of	9	representing	a	‘don’t	know’	response,	and	a	code	of	99
meaning	that	the	participant	failed	to	give	any	response.	SPSS	treats	these	values
in	the	same	way	(it	ignores	them),	but	different	codes	can	be	helpful	to	remind
you	of	why	a	particular	score	is	missing.	The	second	option	is	to	select	a	range
of	values	to	represent	missing	data	and	this	is	useful	in	situations	in	which	it	is
necessary	to	exclude	data	falling	between	two	points.	So,	we	could	exclude	all
scores	between	5	and	10.	With	this	last	option	you	can	also	(but	don’t	have	to)
specify	one	discrete	value.
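The two ways of defining missing values can be written down as a small rule. This Python sketch is only a model of the dialog box as described above; the function names are mine, and I have assumed that the end points of a range count as missing.

```python
def make_missing_rule(discrete=(), value_range=None):
    """Build a function that says whether a score counts as user-missing.

    Mirrors the two options described above: up to three discrete values,
    or a range (low, high) plus at most one discrete value.
    """
    limit = 1 if value_range is not None else 3
    if len(discrete) > limit:
        raise ValueError(f"at most {limit} discrete value(s) allowed here")

    def is_missing(score):
        if score in discrete:
            return True
        if value_range is not None:
            low, high = value_range
            return low <= score <= high  # assumption: range is inclusive
        return False

    return is_missing

# Different codes can carry different reasons, but all are treated as missing.
REASONS = {8: "not applicable", 9: "don't know", 99: "no response"}
by_codes = make_missing_rule(discrete=tuple(REASONS))
by_range = make_missing_rule(discrete=(99,), value_range=(5, 10))

print([by_codes(x) for x in (3, 8, 99)])  # [False, True, True]
print([by_range(x) for x in (3, 7, 99)])  # [False, True, True]
```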
4.7	Importing	data	
We can import data into SPSS from other software packages such as Microsoft
Excel, R, SAS, and Systat by using the File > Import Data menu. If the
package appears in the list (Figure 4.14), select it. If you want to
import from a package that isn’t listed (e.g., R or Systat), then export the data
from these packages as tab-delimited text data (.txt or .dat) or comma-separated
values (.csv) and select the Text Data or CSV Data options in the menu.
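In case ‘tab-delimited’ and ‘comma-separated’ sound mysterious, this Python sketch shows the same two cases written both ways and read back in. The file contents are a made-up fragment (Ben with 5 friends, Martin with 2), and the function is illustrative only.

```python
import csv
import io

# The same two cases, once tab-delimited and once comma-separated.
tab_text = "Name\tFriends\nBen\t5\nMartin\t2\n"
csv_text = "Name,Friends\nBen,5\nMartin,2\n"

def read_delimited(text, delimiter):
    """Read delimited text whose first row holds the variable names."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

from_tab = read_delimited(tab_text, "\t")
from_csv = read_delimited(csv_text, ",")

print(from_tab == from_csv)  # True
print(from_csv[0])
```

Both formats carry the same information; only the character separating the values differs. Note that everything arrives as text, which is why an import step usually asks you to confirm each variable’s type.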
Figure	4.13	Defining	missing	values
Oditi’s	Lantern	Entering	data
‘I,	Oditi,	believe	that	the	secrets	of	life	have	been	hidden	in	a
complex	numeric	code.	Only	by	“analysing”	these	sacred	numbers
can	we	reach	true	enlightenment.	To	crack	the	code	I	must	assemble
thousands	of	followers	to	analyse	and	interpret	these	numbers	(it’s	a
bit	like	the	chimps	and	typewriters	theory).	I	need	you	to	follow	me.
To	spread	the	numbers	to	other	followers	you	must	store	them	in	an
easily	distributable	format	called	a	“data	file”.	You,	my	follower,	are
loyal	and	loved,	and	to	assist	you	my	lantern	displays	a	tutorial	on
how	to	do	this.’
4.8	The	SPSS	viewer	
The	SPSS	viewer	appears	in	a	different	window	than	the	data	editor	and	displays
the	output	of	any	procedures	in	SPSS:	tables	of	results,	graphs,	error	messages
and	pretty	much	everything	you	could	want,	except	for	photos	of	your	cat.
Although	the	SPSS	viewer	is	all-singing	and	all-dancing,	my	prediction	in
previous	editions	of	this	book	that	it	will	one	day	include	a	tea-making	facility
has not come to fruition (IBM, take note ☺). Figure 4.15 shows the viewer. On
the	right	there	is	a	large	space	in	which	all	output	is	displayed.	Graphs	(Section
5.9)	and	tables	displayed	here	can	be	edited	by	double-clicking	on	them.	On	the
left is a tree diagram of the output. This tree diagram provides an easy way to
access	parts	of	the	output,	which	is	useful	when	you	have	conducted	tonnes	of
analyses.	The	tree	structure	is	self-explanatory:	every	time	you	do	something	in
SPSS	(such	as	drawing	a	graph	or	running	a	statistical	procedure),	it	lists	this
procedure	as	a	main	heading.
In	Figure	4.15,	I	ran	a	graphing	procedure	followed	by	a	univariate	analysis	of
variance	(ANOVA),	and	these	names	appear	as	main	headings	in	the	tree
diagram.	For	each	procedure	there	are	subheadings	that	represent	different	parts
of	the	analysis.	For	example,	in	the	ANOVA	procedure,	which	you’ll	learn	more
about	later	in	the	book,	there	are	sections	such	as	Tests	of	Between-Subjects
Effects	(this	is	the	table	containing	the	main	results).	You	can	skip	to	any	one	of
these	sub-components	by	clicking	on	the	appropriate	branch	of	the	tree	diagram.
So,	if	you	wanted	to	skip	to	the	between-groups	effects,	you	would	move	the	on-
screen	arrow	to	the	left-hand	portion	of	the	window	and	click	where	it	says	Tests
of	Between-Subjects	Effects.	This	action	will	highlight	this	part	of	the	output	in
the	main	part	of	the	viewer	(see	SPSS	Tip	4.5).
Figure	4.14	The	Import	Data	menu
Figure	4.15	The	SPSS	viewer
Oditi’s	Lantern	Importing	data	into	SPSS
‘I,	Oditi,	have	become	aware	that	some	of	the	sacred	numbers	that
hide	the	secrets	of	life	are	contained	within	files	other	than	those	of
my	own	design.	We	cannot	afford	to	miss	vital	clues	that	lurk	among
these	rogue	files.	Like	all	good	cults,	we	must	convert	all	to	our
cause,	even	data	files.	Should	you	encounter	one	of	these	files,	you
must	convert	it	to	the	SPSS	format.	My	lantern	shows	you	how.’
Oditi’s	Lantern	Editing	tables
‘I,	Oditi,	impart	to	you,	my	loyal	posse,	the	knowledge	that	SPSS
will	conceal	the	secrets	of	life	within	tables	of	output.	Like	the	author
of	this	book’s	personality,	these	tables	appear	flat	and	lifeless;
however,	if	you	give	them	a	poke	they	have	hidden	depths.	Often	you
will	need	to	seek	out	the	hidden	codes	within	the	tables.	To	do	this,
double-click	on	them.	This	will	reveal	the	“layers”	of	the	table.	Stare
into	my	lantern	and	find	out	how.’
SPSS	Tip	4.5	Printing	and	saving	the	planet	
Rather	than	printing	all	of	your	SPSS	output,	you	can	help	the	planet
by	printing	only	a	selection.	Do	this	by	using	the	tree	diagram	in	the
SPSS	viewer	to	select	parts	of	the	output	for	printing.	For	example,	if
you	decided	that	you	wanted	to	print	a	particular	graph,	click	on	the
word	Graph	in	the	tree	structure	to	highlight	the	graph	in	the	output.
Then,	in	the	Print	menu	you	can	print	just	the	selected	part	of	the
output	(Figure	4.16).	Note	that	if	you	click	a	main	heading	(such	as
Univariate	Analysis	of	Variance)	SPSS	will	highlight	all	the
subheadings	under	that	heading,	which	is	useful	for	printing	all	the
output	from	a	single	statistical	procedure.
Figure	4.16	Printing	only	the	selected	parts	of	SPSS	output
Some	of	the	icons	in	the	viewer	are	the	same	as	those	for	the	data	editor	(so	refer
back	to	our	earlier	list),	but	others	are	unique.
Oditi’s	Lantern	The	SPSS	viewer	window
‘I,	Oditi,	believe	that	by	“analysing”	the	sacred	numbers	we	can	find
the	answers	to	life.	I	have	given	you	the	tools	to	spread	these
numbers	far	and	wide,	but	to	interpret	these	numbers	we	need	“the
viewer”.	The	viewer	is	like	an	X-ray	that	reveals	what	is	beneath	the
raw	numbers.	Use	the	viewer	wisely,	my	friends,	because	if	you	stare
long	enough	you	will	see	your	very	soul.	Stare	into	my	lantern	and
see	a	tutorial	on	the	viewer.’
SPSS	Tip	4.6	Funny	numbers	
SPSS	sometimes	reports	numbers	with	the	letter	‘E’	placed	in	the	mix
just	to	confuse	you.	For	example,	you	might	see	a	value	such	as
9.612	E−02.	Many	students	find	this	notation	confusing.	This
notation means 9.612 × 10⁻², which might be a more familiar
notation,	or	could	be	even	more	confusing.	Think	of	E−02	as
meaning	‘move	the	decimal	place	2	places	to	the	left’,	so	9.612	E−02
becomes	0.09612.	If	the	notation	reads	9.612	E−01,	then	that	would
be	0.9612,	and	9.612	E−03	would	be	0.009612.	Conversely,	E+02
(notice the minus sign has changed to a plus) means ‘move the decimal place
2	places	to	the	right’,	so,	9.612	E+02	becomes	961.2.
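You can check these conversions for yourself. Most programming languages read the same E-notation; this short Python sketch (nothing to do with SPSS itself) converts the values from this tip, written without the space before the E.

```python
import math

# Python reads the same E-notation (written without the space).
for text in ["9.612E-02", "9.612E-01", "9.612E-03", "9.612E+02"]:
    print(text, "=", float(text))

# E-02 really is 'times 10 to the power of -2'.
print(math.isclose(float("9.612E-02"), 9.612 * 10 ** -2))  # True
```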
4.9	Exporting	SPSS	output	
If	you	want	to	share	your	SPSS	output	with	other	people	who	don’t	have	access
to	IBM	SPSS	Statistics,	you	have	two	choices:	(1)	export	the	output	into	a
software	package	that	they	do	have	(such	as	Microsoft	Word)	or	in	the	portable
document	format	(PDF)	that	can	be	read	by	various	free	software	packages;	or
(2)	get	them	to	install	the	free	IBM	SPSS	Smartreader	from	the	IBM	SPSS
website.	The	SPSS	Smartreader	is	basically	a	free	version	of	the	viewer	so	you
can	view	output	but	not	run	new	analyses.
4.10	The	syntax	editor	
I	mentioned	earlier	that	sometimes	it’s	useful	to	use	SPSS	syntax.	Syntax	is	a
language	of	commands	for	carrying	out	statistical	analyses	and	data
manipulations.	Most	people	prefer	to	do	the	things	they	need	to	do	using	dialog
boxes,	but	SPSS	syntax	can	be	useful.	No,	really,	it	can.	For	one	thing,	there	are
things	you	can	do	with	syntax	that	you	can’t	do	through	dialog	boxes
(admittedly,	most	of	these	things	are	advanced,	but	I	will	periodically	show	you
some nice tricks using syntax). The second benefit of syntax comes when you
carry out very similar analyses on several data sets. In these situations, it is
often quicker to do the
analysis and save the syntax as you go along. Then you can adapt it to new data
sets (which is frequently quicker than going through dialog boxes). Finally, using
syntax	creates	a	record	of	your	analysis,	and	makes	it	reproducible,	which	is	an
important	part	of	engaging	in	open	science	practices	(Section	3.6).
Oditi’s	Lantern	Exporting	SPSS	output
‘That	I,	the	almighty	Oditi,	can	discover	the	secrets	within	the
numbers,	they	must	spread	around	the	world.	But	non-believers	do
not	have	SPSS,	so	we	must	send	them	a	link	to	the	IBM	SPSS
Smartreader.	I	have	also	given	to	you,	my	subservient	brethren,	a
tutorial	on	how	to	export	SPSS	output	into	Word.	These	are	the	tools
you	need	to	spread	the	numbers.	Go	forth	and	stare	into	my	lantern.’
To open a syntax editor window, like the one in Figure 4.17, use File > New >
Syntax. The area on the right (the command area) is where you type
syntax	commands,	and	on	the	left	is	a	navigation	area	(like	the	viewer	window).
When	you	have	a	large	file	of	syntax	commands	the	navigation	area	helps	you
find	the	bit	of	syntax	that	you	need.
Like	grammatical	rules	when	we	write,	there	are	rules	that	ensure	that	SPSS
‘understands’ the syntax. For example, each command must end with a full stop. If you
make	a	syntax	error	(i.e.,	break	one	of	the	rules),	SPSS	produces	an	error
message	in	the	viewer	window.	The	messages	can	be	indecipherable	until	you
gain	experience	of	translating	them,	but	they	helpfully	identify	the	line	in	the
syntax	window	in	which	the	error	occurred.	Each	line	in	the	syntax	window	is
numbered	so	you	can	easily	find	the	line	in	which	the	error	occurred,	even	if	you
don’t	understand	what	the	error	is!	Learning	SPSS	syntax	is	time-consuming,	so
in	the	beginning	the	easiest	way	to	generate	syntax	is	to	use	dialog	boxes	to
specify the analysis you want to do and then click Paste (many dialog boxes
have	this	button).	Doing	so	pastes	the	syntax	to	do	the	analysis	you	specified	in
the	dialog	box.	Using	dialog	boxes	in	this	way	is	a	good	way	to	get	a	feel	for
syntax.
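To give a flavour of what pasted syntax looks like, this Python sketch builds two commands that do the same job as the labelling steps earlier in the chapter. The Python functions are my own invention; the commands they produce (VARIABLE LABELS and VALUE LABELS) are written as I understand SPSS syntax, so treat them as a sketch and compare them with what Paste gives you.

```python
def variable_labels_command(variable, label):
    """Build a VARIABLE LABELS command; note the full stop that ends it."""
    return f"VARIABLE LABELS {variable} '{label}'."

def value_labels_command(variable, labels):
    """Build a VALUE LABELS command from a {code: label} dictionary."""
    pairs = " ".join(f"{code} '{label}'" for code, label in labels.items())
    return f"VALUE LABELS {variable} {pairs}."

syntax = [
    variable_labels_command("Friends", "Number of Friends"),
    value_labels_command("Group", {1: "Lecturer", 2: "Student"}),
]

for command in syntax:
    print(command)
```

Each command names what it does, the variable it applies to, and ends with a full stop, which is the rule mentioned above.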
Once you’ve typed in your syntax you run it using the Run menu. Run >
All will run all the syntax in the window, or you can highlight a
selection of syntax using the mouse and select Run > Selection (or click
the run icon in the syntax window) to process the selected syntax. You can
also run the syntax a command at a time from either the current command (Run
> Step Through > From Current), or the beginning (Run > Step
Through > From Start). You can also process the syntax from the cursor to
the end of the syntax window by selecting Run > To End.
A	final	note.	You	can	have	multiple	data	files	open	in	SPSS	simultaneously.
Rather	than	having	a	syntax	window	for	each	data	file,	which	could	get
confusing,	you	can	use	one	syntax	window,	but	select	the	data	file	that	you	want
to run the syntax commands on before you run them using the drop-down list
of open data sets in the syntax window.
Figure	4.17	A	syntax	window	with	some	syntax	in	it
Oditi’s	Lantern	Sin-tax
‘I,	Oditi,	leader	of	the	cult	of	undiscovered	numerical	truths,	require
my	brethren	to	focus	only	on	the	discovery	of	those	truths.	To	focus
their	minds	I	shall	impose	a	tax	on	sinful	acts.	Sinful	acts	(such	as
dichotomizing	a	continuous	variable)	can	distract	from	the	pursuit	of
truth.	To	implement	this	tax,	followers	will	need	to	use	the	sin-tax
window.	Stare	into	my	lantern	to	see	a	tutorial	on	how	to	use	it.’
4.11	Saving	files	
Most	of	you	should	be	familiar	with	how	to	save	files.	Like	most	software,	SPSS
has a save icon and you can use File > Save or File >
Save as … or Ctrl + S (⌘ + S on Mac OS). If the file hasn’t been saved
previously	then	initiating	a	save	will	open	the	Save	As	dialog	box	(see	Figure
4.18).	SPSS	will	save	whatever	is	in	the	window	that	was	active	when	you
initiated	the	save;	for	example,	if	you	are	in	the	data	editor	when	you	initiate	the
save,	then	SPSS	will	save	the	data	file	(not	the	output	or	syntax).	You	use	this
dialog	box	as	you	would	in	any	other	software:	type	a	name	in	the	space	next	to
where	it	says	File	name.	If	you	have	sensitive	data,	you	can	password	encrypt	it
by selecting Encrypt file with password. By default, the file will be saved in
an SPSS format,
which has a .sav file extension for data files, .spv for viewer documents, and .sps
for syntax files. Once a file has previously been saved, it can be saved again
(updated) by clicking on the save icon.
Figure	4.18	The	Save	Data	As	dialog	box
You	can	save	data	in	formats	other	than	SPSS.	Three	of	the	most	useful	are
Microsoft Excel files (.xls, .xlsx), comma-separated values (.csv) and
tab-delimited text (.dat). The latter two file types are plain text, which means that
they can be opened by virtually any spreadsheet or statistics software you can
think of
(including Excel, OpenOffice, Numbers, R, SAS, and Systat). To save your data
file in one of these formats (and others), click the Save as type box and select
a format from
the drop-down list (Figure 4.18). If you select a format other than SPSS, the
Save value labels where defined instead of data values option becomes active.
If you leave this option unchecked, coding
variables (Section 4.6.5) will be exported as the numeric values in the data editor; if
you	select	it	then	coding	variables	will	be	exported	as	string	variables	containing
the	value	labels.	You	can	also	choose	to	include	the	variable	names	in	the
exported	file	(usually	a	good	idea)	as	either	the	Names	at	the	top	of	the	data
editor	columns,	or	the	full	Labels	that	you	gave	to	the	variables.
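The effect of these export options is easier to see than to describe. This Python sketch imitates them for a comma-separated file; it is my own toy version with made-up names, not the code SPSS runs, and it uses participants 1 (a lecturer) and 6 (a student) from our data.

```python
import csv
import io

# Value labels for the coding variable, as defined in Section 4.6.5.
VALUE_LABELS = {"Group": {1: "Lecturer", 2: "Student"}}
data = [{"Participant": 1, "Group": 1}, {"Participant": 6, "Group": 2}]

def export_csv(rows, value_labels, save_value_labels=False, include_names=True):
    """Write rows as CSV text, optionally swapping codes for their value labels."""
    names = list(rows[0])
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    if include_names:
        writer.writerow(names)  # variable names as the first row
    for row in rows:
        cells = []
        for name in names:
            value = row[name]
            if save_value_labels and name in value_labels:
                value = value_labels[name].get(value, value)
            cells.append(value)
        writer.writerow(cells)
    return out.getvalue()

print(export_csv(data, VALUE_LABELS))                          # codes: 1, 2
print(export_csv(data, VALUE_LABELS, save_value_labels=True))  # labels
```

With the option off, whoever opens the file sees 1 and 2 and has to know what they mean; with it on, they see Lecturer and Student but lose the numeric codes.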
4.12	Opening	files	
This	book	relies	on	you	working	with	data	files	that	you	can	download	from	the
companion	website.	You	probably	don’t	need	me	to	tell	you	how	to	open	these
files, but just in case … To load a file into SPSS use the open icon or
select File > Open and then Data to open a data file,
Output to open a viewer file, or Syntax to open a syntax file. This
process opens a dialog box (Figure 4.19), with which I’m sure you’re familiar.
Navigate to wherever you saved the file that you need. SPSS will list the files of
the type you asked to open (so, data files if you selected Data). Open
the file you want by either selecting it and clicking on Open, or
double-clicking on the icon next to the file you want. If you want to open data
in a format other than SPSS (.sav), then
click the Files of type box to display a list of alternative file formats. Click the
appropriate file type, such as Microsoft Excel file (*.xls) or text file (*.dat, *.txt), to
list files of that type in the dialog box.
Figure	4.19	Dialog	box	to	open	a	file
4.13	Extending	IBM	SPSS	Statistics	
IBM	SPSS	Statistics	has	some	powerful	tools	for	users	to	build	their	own
functionality.	For	example,	you	can	create	your	own	dialog	boxes	and	menus	to
run	syntax	that	you	may	have	written.	SPSS	Statistics	also	interfaces	with	a
powerful	open	source	statistical	computing	language	called	R	(R	Core	Team,
2016).	There	are	two	extensions	to	SPSS	that	we	use	in	this	book.	One	is	a	tool
called	PROCESS	and	the	other	is	the	Essentials	for	R	for	Statistics	plugin,	which
will	give	us	access	to	R	so	that	we	can	implement	robust	models	using	the	WRS2
package	(Mair,	Schoenbrodt,	&	Wilcox,	2015).
4.13.1	The	PROCESS	tool	
The	PROCESS	tool	(Hayes,	2018)	wraps	up	a	range	of	functions	written	by
Andrew	Hayes	and	Kristopher	Preacher	(e.g.,	Hayes	&	Matthes,	2009;	Preacher
&	Hayes,	2004,	2008a)	to	do	moderation	and	mediation	analyses,	which	we	look
at	in	Chapter	11.	While	using	these	tools,	spare	a	thought	of	gratitude	to	Hayes
and	Preacher	for	using	their	spare	time	to	do	cool	stuff	like	this	that	makes	it
possible	for	you	to	analyse	your	data	without	having	a	nervous	breakdown.	Even
if	you	think	you	are	having	a	nervous	breakdown,	trust	me,	it’s	not	as	big	as	the
one	you’d	be	having	if	PROCESS	didn’t	exist.	The	PROCESS	tool	is	what’s
known	as	a	custom	dialog	box	and	it	can	be	installed	in	three	steps	(Mac	OS
users	ignore	step	2):
1.	 Download	the	install	file.	Download	the	file	process.spd	from	Andrew
Hayes’s	website:	http://www.processmacro.org/download.html.	Save	this
file	onto	your	computer.
Figure	4.20	Installing	the	PROCESS	menu
2.	 Start	up	IBM	SPSS	Statistics	as	an	administrator.	To	install	the	tool	in
Windows,	you	need	to	start	IBM	SPSS	Statistics	as	an	administrator.	To	do
this, make sure that SPSS isn’t already running, and click the Start menu.
Locate the icon for SPSS, which, if it’s not in your most used list, will be
listed under ‘I’ for IBM SPSS Statistics. The text next to the icon will refer
to the version of SPSS Statistics that you have installed (if you have a
subscription it will say ‘Subscription’ rather than a version number). Click on
this icon with the right mouse button to activate the menu in Figure 4.20.
Within this menu select (you’re back to using the left mouse button now) Run
as administrator. This action opens SPSS
Statistics	but	allows	it	to	make	changes	to	your	computer.	A	dialog	box	will
appear	that	asks	you	whether	you	want	to	let	SPSS	make	changes	to	your
computer	and	you	should	reply	‘yes’.
3.  Once SPSS has loaded select Extensions > Utilities > Install Custom Dialog
(Compatibility mode), which activates a dialog box for opening files (Figure
4.20).7 Locate the file process.spd, select it, and click Open. This installs
the PROCESS
menu	and	dialog	boxes	into	SPSS.	If	you	get	an	error	message,	the	most
likely	explanation	is	that	you	haven’t	opened	SPSS	as	an	administrator	(see
step	2).
7 If you’re using a version of SPSS earlier than 24, you need to select
Utilities > Custom Dialogs > Install Custom Dialog.
4.13.2	Essentials	for	R	
At	various	points	in	the	book	we’re	going	to	use	robust	tests	that	use	R.	To	get
SPSS	Statistics	to	interface	with	R,	we	need	to	install:	(1)	the	version	of	R	that	is
compatible	with	our	version	of	SPSS	Statistics;	and	(2)	the	Essentials	for	R	for
Statistics	plugin	from	IBM.	At	the	time	of	writing,	the	R	plugin	isn’t	available
for	SPSS	Statistics	version	25,	but	by	the	time	the	book	is	published	it	may	well
be.	These	instructions	are	for	SPSS	Statistics	version	24	but	you	can	hopefully
extrapolate	to	other	versions.	First,	let’s	get	the	plugin	and	installation
documentation	from	IBM:
1.	 Create	an	account	on	IBM.com	(www-01.ibm.com).
2.  Go to https://www-01.ibm.com/marketing/iwm/iwm/web/preLogin.do?source=swg-tspssp
3.	 There	will	be	a	long	list	of	stuff	you	can	download.	Select	IBM	SPSS
Statistics	Version	24	–	Essentials	for	R	(or	whatever	version	of	SPSS
Statistics	you’re	using)	and	click	continue.
4.	 Complete	the	privacy	information,	and	read	and	agree	(or	not)	to	IBM’s
terms	and	conditions.
5.	 Download	the	version	of	IBM	SPSS	Statistics	Version	24	–	Essentials	for	R
for	your	operating	system	(Windows,	Mac	OS,	Linux,	etc.)	and	the
corresponding	installation	instructions	(labelled	Installation	Documentation
24.0	Multilingual	for	xxx,	where	xxx	is	the	operating	system	you	use).	By
default	the	website	uses	an	app	called	the	Download	Director	to	manage	the
download.	This	app	never	works	for	me	(on	a	Mac)	and	if	you	have	the
same problem, switch the tab at the top of the list of downloads to
‘Download using http’ and download the files directly
through your browser.
6.	 Open	the	installation	documentation	(it	should	be	a	PDF	file)	and	check
which	version	of	R	you	need	to	install.8
Having	got	the	Essentials	for	R	plugin,	don’t	install	it	yet.	You	need	to
check	which	version	of	R	you	need,	and	download	it.	SPSS	Statistics
typically	uses	an	old	version	of	R	(because	IBM	needs	to	check	that	the
Essentials	for	R	plugin	is	stable	before	releasing	it	and	by	the	time	they
have	done	that	R	has	updated).	Finding	old	versions	of	R	is	tediously
overcomplicated;	I’ve	tried	to	illustrate	the	process	in	Figure	4.21.
7.	 Go	to	https://www.r-project.org/
8.	 Click	the	link	labelled	CRAN	(under	the	Download	heading)	to	go	to	a	page
to	select	a	CRAN	mirror.	A	CRAN	mirror	is	a	location	from	which	to
download	R.	It	doesn’t	matter	which	you	choose;	because	I’m	based	in	the
UK,	I	picked	one	of	the	UK	links	in	Figure	4.21.
9.	 On	the	next	page,	click	the	link	for	the	operating	system	you	use	(Windows,
Mac,	or	Linux).
10.	 You	will	already	know	what	version	of	R	you’re	looking	for	because	I	told
you	to	check	before	getting	to	this	point	(e.g.,	SPSS	Statistics	version	24
uses	R	version	3.2).9	What	happens	next	differs	for	Windows	and	Mac	OS:
Windows:	If	you	selected	the	link	to	the	Windows	version	you’ll	be
directed	to	a	page	for	R	for	Windows.	Click	the	link	labelled	Install	R
for the first time to go to a page to download R for Windows. Do not
click	the	link	at	the	top	of	the	page,	but	scroll	down	to	the	section
labelled	Other	builds,	and	click	the	link	to	Previous	releases.	The
resulting	page	lists	previous	versions	of	R.	Select	the	version	you	want
(for	SPSS	Statistics	24,	select	R	3.2.5,	for	other	versions	of	SPSS
consult	the	documentation).
Mac	OS:	If	you	selected	the	link	to	the	OS	X	version	you’ll	be	directed
to	a	page	for	R	for	Mac	OS	X.	On	this	page	click	the	link	to	the	old
directory.	This	takes	you	to	a	directory	listing.	You	need	to	scroll	down
a	bit	until	you	find	the	.pkg	files.	Click	the	link	to	the	.pkg	file	of	the
version	of	R	that	you	want	(for	SPSS	Statistics	24,	click	R	3.2.4,	for
other	versions	consult	the	documentation).
8	At	the	time	of	writing,	the	installation	documentation	for	SPSS	Statistics	24
links	to	a	PDF	file	for	version	23,	which	says	that	you	need	R	3.1.	This	is	true
for	version	23	of	SPSS	Statistics,	but	version	24	requires	R	3.2	onwards.
9	There	will	be	several	versions	of	R	3.2	which	are	denoted	as	3.2.x,	where	x	is	a
minor	update.	It	shouldn’t	matter	whether	you	install	version	3.2.1	or	3.2.5,	but
you	may	as	well	go	for	the	last	of	the	releases.	In	the	case	of	R	3.2,	the	last
update	before	release	3.3	was	3.2.5.
Figure	4.21	Finding	an	old	version	of	R	is	overly	complicated	…
You	should	now	have	the	install	files	for	R	and	for	the	Essentials	for	R	plugin	in
your	download	folder.	Find	them.	First,	install	R	by	double-clicking	the	install
file	and	going	through	the	usual	install	process	for	your	operating	system.
Having	installed	R,	install	the	Essentials	for	R	plugin	by	double-clicking	the
install	file	to	initiate	a	standard	install.	If	all	that	fails,	there	is	a	guide	(at	the
time	of	writing)	to	installing	the	R	plugin	via	GitHub	at
https://developer.ibm.com/predictiveanalytics/2016/03/21/r-spss-installing-r-essentials-from-github/
or see Oditi’s Lantern.
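Once both installers have finished, one way to check that SPSS Statistics can talk to R is to run a tiny R program from a syntax window. This is only a sketch, and it assumes that the Essentials for R plugin installed successfully. R.version.string is a built-in R value that holds the version of R being used, so you can compare it with the version that the installation documentation asked for.

```spss
BEGIN PROGRAM R.
# Print the version of R that SPSS Statistics is connected to.
print(R.version.string)
END PROGRAM.
```

If the viewer shows an R version, the plugin is working. An error at this point suggests that R or the plugin didn't install properly, or that the versions don't match.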
4.13.3	The	WRS2	package	
Once	the	Essentials	for	R	plugin	is	installed	(see	above)	we	can	access	the
WRS2	package	for	R	(Mair	et	al.,	2015)	by	opening	a	syntax	window	and	typing
and	executing	the	following	syntax:
BEGIN	PROGRAM	R.
install.packages("WRS2")
END	PROGRAM.
The	first	and	last	lines	(remember	the	full	stops)	tell	SPSS	to	talk	to	R	and	then
to	stop.	All	the	stuff	in	between	is	language	that	tells	R	what	to	do.	In	this	case	it
tells	R	to	install	the	package	WRS2.	When	you	run	this	program	a	window	will
appear	asking	you	to	select	a	CRAN	mirror.	Select	any	in	the	list	(it	determines
from	where	R	downloads	the	package,	so	it’s	not	an	important	decision).
I	supply	various	syntax	files	for	robust	analyses	in	R,	and	at	the	top	of	each	one	I
include	this	program	(for	those	who	skipped	this	section).	However,	you	only
need	to	execute	this	program	once,	not	every	time	you	run	an	analysis.	The	only
times	you’d	need	to	re-execute	this	program	would	be:	(1)	if	you	change
computers;	(2)	if	you	upgrade	SPSS	Statistics	or	need	to	reinstall	the	Essentials
for	R	plugin,	or	R	itself,	for	some	reason;	(3)	something	goes	wrong	and	you
think	it	might	help	to	reinstall	WRS2.
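Because the install needs to happen only once, one option is to wrap it in a check so that the program can sit at the top of a syntax file without reinstalling the package every time it runs. The sketch below is an illustration, not necessarily the form used in the supplied syntax files. It relies on R's built-in requireNamespace() function, which returns FALSE when a package isn't installed.

```spss
BEGIN PROGRAM R.
# Install WRS2 only if it is not already available to this copy of R.
if (!requireNamespace("WRS2", quietly = TRUE)) {
  install.packages("WRS2")
}
END PROGRAM.
```

The first time this runs it behaves like the program above (including asking for a CRAN mirror); after that it does nothing.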
Oditi’s	Lantern	SPSS	extensions
‘I,	Oditi,	am	bearded	like	a	great	pirate	sailing	my	ship	of	idiocy
across	the	vacant	seas	of	your	mind.	To	join	my	cult	you	must
become	pirate-like	in	my	image	and	speak	the	pirate	language.	You
must	punctuate	your	speech	with	the	exclamation	‘Rrrrrrrrrrr’.	It	will
help	you	uncover	the	unknown	numerical	truths	embedded	in	the
treasure	maps	of	data.	The	Rrrrrrr	plugin	for	SPSS	Statistics	will
help,	and	my	lantern	is	primed	with	a	visual	cannon-ball	of	an
installation	guide	that	will	blow	your	mind.’
4.13.4	Accessing	the	extensions	
Once	the	PROCESS	tool	has	been	added	to	SPSS	Statistics	it	appears	in	the
Analyze > Regression menu. If you can’t see it then the install hasn’t worked
and	you’ll	need	to	work	through	this	section	again.	At	the	time	of	writing	WRS2
can	be	accessed	only	using	syntax.
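To give a flavour of what accessing WRS2 through syntax looks like, here is a hedged sketch. It assumes that R, the Essentials for R plugin and WRS2 are all installed, and that a data file is open in the data editor. spssdata.GetDataFromSPSS() is supplied by the plugin and yuen() by WRS2; the variable names score and group are hypothetical, so replace them with variables from your own data.

```spss
BEGIN PROGRAM R.
# Load the package (it must already be installed; see Section 4.13.3).
library(WRS2)
# Copy the active SPSS dataset into an R data frame.
mySPSSdata <- spssdata.GetDataFromSPSS(factorMode = "labels")
# Robust comparison of two independent groups based on trimmed means.
# 'score' and 'group' are hypothetical variable names.
print(yuen(score ~ group, data = mySPSSdata))
END PROGRAM.
```

The results appear in the viewer as plain text rather than as the formatted tables that SPSS usually produces.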
4.14	Brian’s	attempt	to	woo	Jane	
Brian	had	been	stung	by	Jane’s	comment.	He	was	many	things,	but	he	didn’t
think	he	had	his	head	up	his	own	backside.	He	retreated	from	Jane	to	get	on	with
his	single	life.	He	listened	to	music,	met	his	friends,	and	played	Uncharted	4.
Truthfully,	he	mainly	played	Uncharted	4.	The	more	he	played,	the	more	he
thought	of	Jane,	and	the	more	he	thought	of	Jane,	the	more	convinced	he	became
that	she’d	be	the	sort	of	person	who	was	into	video	games.	When	he	next	saw
her	he	tried	to	start	a	conversation	about	games,	but	it	went	nowhere.	She	said
computers	were	good	only	for	analysing	data.	The	seed	was	sown,	and	Brian
went	about	researching	statistics	packages.	There	were	a	lot	of	them.	Too	many.
After	hours	on	Google,	he	decided	that	one	called	SPSS	looked	the	easiest	to
learn.	He	would	learn	it,	and	it	would	give	him	something	to	talk	about	with
Jane.	Over	the	following	week	he	read	books,	blogs,	watched	tutorials	on
YouTube,	bugged	his	lecturers,	and	practised	his	new	skills.	He	was	ready	to
chew	the	statistical	software	fat	with	Jane.
Figure	4.22	What	Brian	learnt	from	this	chapter
He	searched	around	campus	for	her:	the	library,	numerous	cafés,	the	quadrangle
–	she	was	nowhere.	Finally,	he	found	her	in	the	obvious	place:	one	of	the
computer	rooms	at	the	back	of	campus	called	the	Euphoria	cluster.	Jane	was
studying	numbers	on	the	screen,	but	it	didn’t	look	like	SPSS.	‘What	the	hell	…,’
Brian	thought	to	himself	as	he	sat	next	to	her	and	asked	…
4.15	What	next?	
At	the	start	of	this	chapter	we	discovered	that	I	feared	my	new	environment	of
primary	school.	My	fear	wasn’t	as	irrational	as	you	might	think,	because,	during
the	time	I	was	growing	up	in	England,	some	idiot	politician	had	decided	that	all
school	children	had	to	drink	a	small	bottle	of	milk	at	the	start	of	the	day.	The
government	supplied	the	milk,	I	think,	for	free,	but	most	free	things	come	at
some	kind	of	price.	The	price	of	free	milk	turned	out	to	be	lifelong	trauma.	The
milk	was	usually	delivered	early	in	the	morning	and	then	left	in	the	hottest	place
someone	could	find	until	we	innocent	children	hopped	and	skipped	into	the
playground	oblivious	to	the	gastric	hell	that	awaited.	We	were	greeted	with	one
of	these	bottles	of	warm	milk	and	a	very	small	straw.	We	were	then	forced	to
drink	it	through	grimacing	faces.	The	straw	was	a	blessing	because	it	filtered	out
the	lumps	formed	in	the	gently	curdling	milk.	Politicians	take	note:	if	you	want
children	to	enjoy	school,	don’t	force-feed	them	warm,	lumpy	milk.
But	despite	gagging	on	warm	milk	every	morning,	primary	school	was	a	very
happy	time	for	me.	With	the	help	of	Jonathan	Land,	my	confidence	grew.	With
this	new	confidence	I	began	to	feel	comfortable	not	just	at	school	but	in	the
world	more	generally.	It	was	time	to	explore.
4.16	Key	terms	that	I’ve	discovered
Currency	variable
Data	editor
Data	view
Date	variable
Long	format	data
Numeric	variable
Smartreader
String	variable
Syntax	editor
Variable	view
Viewer
Wide	format	data
Smart	Alex’s	tasks
Task	1:	Smart	Alex’s	first	task	for	this	chapter	is	to	save	the
data	that	you’ve	entered	in	this	chapter.	Save	it	somewhere	on
the	hard	drive	of	your	computer	(or	a	USB	stick	if	you’re	not
working	on	your	own	computer).	Give	it	a	sensible	title	and
save	it	somewhere	easy	to	find	(perhaps	create	a	folder	called
‘My	Data	Files’	where	you	can	save	all	of	your	files	when
working	through	this	book).	
Task	2:	What	are	the	following	icons	shortcuts	to?	
Task	3:	The	data	below	show	the	score	(out	of	20)	for	20
different	students,	some	of	whom	are	male	and	some	female,
and	some	of	whom	were	taught	using	positive	reinforcement
(being	nice)	and	others	who	were	taught	using	punishment
(electric	shock).	Enter	these	data	into	SPSS	and	save	the	file	as
Method	Of	Teaching.sav.	(Hint:	the	data	should	not	be	entered
in	the	same	way	that	they	are	laid	out	below.)	
Task	4:	Thinking	back	to	Labcoat	Leni’s	Real	Research	4.1,
Oxoby	also	measured	the	minimum	acceptable	offer;	these
MAOs	(in	dollars)	are	below	(again,	they	are	approximations
based	on	the	graphs	in	the	paper).	Enter	these	data	into	the	SPSS
Data	Editor	and	save	this	file	as	Oxoby	(2008)	MAO.sav.	
Bon Scott group: 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5
Brian Johnson group: 0, 1, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 1
Task	5:	According	to	some	highly	unscientific	research	done	by
a	UK	department	store	chain	and	reported	in	Marie	Claire
magazine	(http://ow.ly/9Dxvy),	shopping	is	good	for	you.	They
found	that	the	average	woman	spends	150	minutes	and	walks
2.6	miles	when	she	shops,	burning	off	around	385	calories.	In
contrast,	men	spend	only	about	50	minutes	shopping,	covering
1.5	miles.	This	was	based	on	strapping	a	pedometer	on	a	mere
10	participants.	Although	I	don’t	have	the	actual	data,	some
simulated	data	based	on	these	means	are	below.	Enter	these	data
into	SPSS	and	save	them	as	Shopping	Exercise.sav.	
Task	6:	This	task	was	inspired	by	two	news	stories	that	I
enjoyed.	The	first	was	about	a	Sudanese	man	who	was	forced	to
marry	a	goat	after	being	caught	having	sex	with	it
(http://ow.ly/9DyyP).	I’m	not	sure	whether	he	treated	the	goat	to
a	nice	dinner	in	a	posh	restaurant	beforehand	but,	either	way,
you	have	to	feel	sorry	for	the	goat.	I’d	barely	had	time	to
recover	from	that	story	when	another	appeared	about	an	Indian
man	forced	to	marry	a	dog	to	atone	for	stoning	two	dogs	and
stringing	them	up	in	a	tree	15	years	earlier	(http://ow.ly/9DyFn).
Why	anyone	would	think	it’s	a	good	idea	to	enter	a	dog	into
matrimony	with	a	man	with	a	history	of	violent	behaviour
towards	dogs	is	beyond	me.	Still,	I	wondered	whether	a	goat	or
dog	made	a	better	spouse.	I	found	some	other	people	who	had
been	forced	to	marry	goats	and	dogs	and	measured	their	life
satisfaction	and	how	much	they	like	animals.	Enter	these	data
into	SPSS	and	save	as	Goat	or	Dog.sav.	
Task	7:	One	of	my	favourite	activities,	especially	when	trying	to
do	brain-melting	things	like	writing	statistics	books,	is	drinking
tea.	I	am	English,	after	all.	Fortunately,	tea	improves	your
cognitive	function	–	well,	it	does	in	old	Chinese	people,	at	any
rate	(Feng,	Gwee,	Kua,	&	Ng,	2010).	I	may	not	be	Chinese	and
I’m not that old, but I nevertheless enjoy the idea that tea
might	help	me	think.	Here	are	some	data	based	on	Feng	et	al.’s
study	that	measured	the	number	of	cups	of	tea	drunk	and
cognitive	functioning	in	15	people.	Enter	these	data	into	SPSS
and	save	the	file	as	Tea	Makes	You	Brainy	15.sav.	
Task	8:	Statistics	and	maths	anxiety	are	common	and	affect
people’s	performance	on	maths	and	stats	assignments;	women,
in	particular,	can	lack	confidence	in	mathematics	(Field,	2010).
Zhang, Schmader, & Hall (2013) did an intriguing study, in
which	students	completed	a	maths	test	in	which	some	put	their
own	name	on	the	test	booklet,	whereas	others	were	given	a
booklet	that	already	had	either	a	male	or	female	name	on	it.
Participants	in	the	latter	two	conditions	were	told	that	they
would	use	this	other	person’s	name	for	the	purpose	of	the	test.
Women	who	completed	the	test	using	a	different	name
performed	better	than	those	who	completed	the	test	using	their
own	name.	(There	were	no	such	effects	for	men.)	The	data
below	are	a	random	subsample	of	Zhang	et	al.’s	data.	Enter
them into SPSS and save the file as Zhang (2013)
subsample.sav.
Task	9:	What	is	a	coding	variable?	
Task	10:	What	is	the	difference	between	wide	and	long	format
data?	
Answers	&	additional	resources	are	available	on	the	book’s	website
at	https://edge.sagepub.com/field5e