Epidemiological methods
Epidemiology
 The study of the distribution and
determinants of the frequency of health-related
outcomes in specified populations
 Quantitative discipline
 Measurement of disease / condition / risk
factor frequency is central to epidemiology
 Comparisons require measurements
Much of epidemiological research is
taken up with trying
 to establish associations between
exposures and disease rates
 to measure the extent to which risk
changes as the level of exposure changes
 to establish whether the associations
observed may be truly causal (rather than
being just a consequence of bias or chance)
 Epidemiology has a major role in developing
appropriate strategies to improve public
health through prevention
◦ public health has a wider meaning in this sense: it is
about the health of the whole population.
◦ it does not cover only classic areas, such as
immunization or monitoring of diseases; it also
covers factors such as poverty, smoking and nutrition
 In this sense, epidemiology has a crucial role
in trying to put into perspective the effects on
population health of different risk factors.
Measures of association
 Risk of disease, rate of disease in different
groups of population
 Comparison of risks/rates
Measures of effect
We have 2 groups of individuals:
 An exposed group (group with risk factor
of interest) and unexposed group
(without such factor of interest)
 We are interested in comparing the
amount of disease (mortality or other
health outcome) in the exposed group to
that in the unexposed group
Risk ratio
• we calculate the risk ratio (RR) as:
RR = r1 / r0
where r1 is the risk in the exposed group and
r0 is the risk in the unexposed group
Risk difference
• the absolute difference between two risks (or
rates)
RD = r1 – r0
Example: cohort study of oral contraceptive use and heart attack
                Myocardial infarction
OC use          Yes       No    Total
Yes              25      400      425
No               75     1500     1575
Total           100     1900     2000
Risk (exposed) = 25/425 = 0.059
Risk (unexposed) = 75/1575 = 0.048
Relative risk = 0.059/0.048 = 1.23
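The arithmetic of this example can be checked with a short script. This is a minimal sketch: the variable and function names are chosen here for illustration, and only the four counts come from the table. Working from unrounded risks gives a risk ratio of about 1.24 rather than the 1.23 obtained from the rounded risks.

```python
# Counts from the cohort example (rows: OC use yes / no).
exposed_cases, exposed_total = 25, 425
unexposed_cases, unexposed_total = 75, 1575


def risk(cases, total):
    """Risk = number of cases divided by the number of people in the group."""
    return cases / total


r1 = risk(exposed_cases, exposed_total)      # risk in the exposed group
r0 = risk(unexposed_cases, unexposed_total)  # risk in the unexposed group

risk_ratio = r1 / r0       # RR = r1 / r0
risk_difference = r1 - r0  # RD = r1 - r0

print(f"r1 = {r1:.4f}, r0 = {r0:.4f}")
print(f"RR = {risk_ratio:.3f}")
print(f"RD = {risk_difference:.4f}")
```

The risk difference of roughly 0.011 corresponds to about 11 extra cases per 1000 exposed women in this example.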
 Alternative measure of risk
Odds ratio
The odds of disease is the number of cases divided by
the number of non-cases:
Odds = cases / non-cases
The odds ratio (OR) is the ratio of the odds of disease
among the exposed (odds_exp) to the odds of disease
among the unexposed (odds_unexp):
OR = odds_exp / odds_unexp
For the oral contraceptive example we can calculate
• Odds (exposed): odds_exp = 25/400 = 0.0625
• Odds (unexposed): odds_unexp = 75/1500 = 0.05
• Odds ratio: OR = odds_exp / odds_unexp = 0.0625/0.05 = 1.25
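The same calculation as a minimal Python sketch. The function name is illustrative. The cross-product form (a×d)/(b×c) is a standard algebraic rearrangement of the ratio of odds, added here as a check; it is not taken from the slides.

```python
def odds(cases, non_cases):
    """Odds = number of cases divided by number of non-cases."""
    return cases / non_cases


# Counts from the oral contraceptive example.
odds_exposed = odds(25, 400)     # cases / non-cases among OC users
odds_unexposed = odds(75, 1500)  # cases / non-cases among non-users

odds_ratio = odds_exposed / odds_unexposed

# Equivalent cross-product form: (a * d) / (b * c).
cross_product = (25 * 1500) / (400 * 75)

print(f"odds (exposed)   = {odds_exposed:.4f}")
print(f"odds (unexposed) = {odds_unexposed:.4f}")
print(f"OR = {odds_ratio:.2f} (cross-product: {cross_product:.2f})")
```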
Odds ratio as an approximation to
the risk ratio
 For a rare disease, odds ratio is
approximately equal to the risk ratio
(because the denominators are very similar)
 For common conditions, the OR overestimates
the true RR (it lies further from 1)
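A small numerical illustration of this point. The baseline risks and the fixed risk ratio of 2 below are hypothetical values chosen for the sketch, not data from any study.

```python
def odds_ratio_from_risks(r1, r0):
    """Odds ratio implied by risk r1 in the exposed and r0 in the unexposed."""
    odds_exposed = r1 / (1 - r1)
    odds_unexposed = r0 / (1 - r0)
    return odds_exposed / odds_unexposed


RISK_RATIO = 2.0  # hypothetical risk ratio, held fixed for illustration

# Hypothetical baseline risks in the unexposed, from rare to common.
for r0 in (0.001, 0.01, 0.1, 0.4):
    r1 = RISK_RATIO * r0
    print(f"r0 = {r0}: RR = {r1 / r0:.2f}, OR = {odds_ratio_from_risks(r1, r0):.2f}")
```

With a baseline risk of 0.001 the OR is essentially 2, while at a baseline risk of 0.4 it rises to 6 even though the risk ratio is still 2.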
Measures of population impact
 Population attributable risk (PAR) is
the absolute difference between the risk
(or rate) in the whole population and the
risk or rate in the unexposed group
PAR = r – r0
Population attributable risk fraction
(PARF or PAR%)
 It is a measure of the proportion of all cases
in the study population (exposed and
unexposed) that may be attributed to the
exposure, on the assumption of a causal
association
 It is also called the aetiologic fraction, the
percentage population attributable risk or
the attributable fraction
 If r is rate in the total population
PARF = PAR / r
where PAR = r – r0, so
PARF = (r – r0) / r
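As a worked sketch, the formulas can be applied to the oral contraceptive cohort from the earlier example. Treating that cohort as the whole population is an assumption made here for illustration, and the variable names are not from the slides.

```python
# Counts from the oral contraceptive cohort example.
total_cases, total_population = 100, 2000
unexposed_cases, unexposed_total = 75, 1575

r = total_cases / total_population     # risk in the whole population
r0 = unexposed_cases / unexposed_total  # risk in the unexposed group

par = r - r0      # population attributable risk
parf = par / r    # population attributable risk fraction

print(f"r = {r:.4f}, r0 = {r0:.4f}")
print(f"PAR  = {par:.4f}")
print(f"PARF = {parf:.1%}")
```

Under these assumptions just under 5% of all cases in the cohort would be attributed to the exposure, provided the association is causal.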
Risk or rate difference
the absolute difference between two risks (or
rates)
RD = r1 – r0
Measure of the absolute effect
Similar for rates = rate difference = incidence
rate in exposed – incidence rate in unexposed
Measure of effect: use and interpretation
Risk difference
◦ Use of the measure: public health; interested in excess disease burden due to factor (“attributable risk”)
◦ How to interpret results: close to 0 = little effect; large difference = large effect
Risk ratio
◦ Use of the measure: epidemiology, causation; “This factor doubles the risk of the disease”
◦ How to interpret results: close to 1 = little effect; large ratio = large effect; close to 0 = large effect!
Odds ratio
◦ As for risk ratio; “This factor doubles the odds of the disease”
◦ Only possibility in a case-control study
◦ Used in more advanced statistical methods (logistic regression)
Three major issues in interpretation
of results in any epidemiological
study
 Chance (random variation) –
statistics
 Confounding
 Bias (i.e. systematic error)
Confounding
 Situation when a third factor is associated
with both exposure and disease
 Association between exposure and
disease may not be causal; instead, it is
due to a third factor which is associated
with both exposure and disease.
[Diagram: the confounding factor is linked to both the exposure and the disease]
Case-control study of alcohol and lung cancer
              Alcohol   No alcohol
Cases           450        300
Controls        200        250
Estimated odds ratio = 1.9
The same data stratified by smoking:
                          Non-smokers               Smokers
                      Alcohol   No alcohol    Alcohol   No alcohol
Cases                    50        100          400        200
Controls                100        200          100         50
Estimated odds ratio         1.0                     1.0
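The crude and stratum-specific odds ratios can be reproduced with a short sketch. The function and variable names are illustrative; the counts are those of the stratified table.

```python
def odds_ratio(cases_exp, cases_unexp, controls_exp, controls_unexp):
    """Odds ratio for a case-control 2x2 table, in cross-product form."""
    return (cases_exp * controls_unexp) / (cases_unexp * controls_exp)


# (cases alcohol, cases no alcohol, controls alcohol, controls no alcohol)
strata = {
    "non-smokers": (50, 100, 100, 200),
    "smokers": (400, 200, 100, 50),
}

# Collapsing over smoking recovers the unstratified table: (450, 300, 200, 250).
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)
stratum_ors = {name: odds_ratio(*cells) for name, cells in strata.items()}

print(f"crude OR = {crude_or:.2f}")
for name, value in stratum_ors.items():
    print(f"OR among {name} = {value:.2f}")
```

The crude odds ratio is about 1.9, yet within each smoking stratum it is 1.0: the apparent effect of alcohol disappears once smoking is held constant.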
Alcohol and smoking in controls
               Alcohol   No alcohol
Smokers          100         50
Non-smokers      100        200
Non-drinkers: 1 in 5 were smokers.
Drinkers: 1 in 2 were smokers.
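A minimal sketch of both legs of the confounding relationship, using only counts from the tables above (names are illustrative). The first part reproduces the smoking proportions among controls. The second part computes the odds ratio for smoking and lung cancer by collapsing the stratified table over alcohol; the slides do not show that figure, it is derived here from their data.

```python
# Controls, by drinking status (from the table of alcohol and smoking in controls).
controls = {
    "drinkers": {"smokers": 100, "non_smokers": 100},
    "non_drinkers": {"smokers": 50, "non_smokers": 200},
}


def proportion_smokers(group):
    """Share of smokers within one drinking-status group."""
    return group["smokers"] / (group["smokers"] + group["non_smokers"])


smoking_by_drinking = {name: proportion_smokers(g) for name, g in controls.items()}

# Smoking versus lung cancer, collapsing the stratified table over alcohol.
cases_smokers, cases_non_smokers = 400 + 200, 50 + 100
controls_smokers, controls_non_smokers = 100 + 50, 100 + 200
smoking_or = (cases_smokers * controls_non_smokers) / (
    cases_non_smokers * controls_smokers
)

for name, share in smoking_by_drinking.items():
    print(f"{name}: {share:.0%} smokers")
print(f"OR for smoking and lung cancer = {smoking_or:.1f}")
```

Smoking is more common among drinkers and is strongly associated with lung cancer, which is what makes it a confounder in this example.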
[Diagram: smoking is associated with both alcohol (exposure) and lung cancer (disease)]
Bias
 is a systematic error in the design of an
epidemiological study which leads to a
distortion or error in the study results.
 An association will always be distorted
if the error is differential
Two main types of bias
Selection bias
due to errors in the way the sample is
recruited
Information bias
due to errors in the way in which information
is collected from the sample
Selection bias
 a distortion that results from procedures
used to select subjects or their
participation
 resulting in a difference in the
characteristics between those who are
included in the study and those in the study
population who are not included in the study
sample
Information bias
 Errors in the way information about
exposure or disease is collected
 Misclassification - putting subjects in
the wrong category
 E.g. exposed classified as unexposed, case as control
Misclassification may be
 Random - above / below
 Systematic – all in one direction
 Non–differential (error in one variable
not related to / dependent on the value
of other variables)
 Differential (error in one variable is
related to the value of other variables)
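The difference between non-differential and differential misclassification can be illustrated with a small sketch. All counts and the 20% error rate are hypothetical, invented for this illustration; the helper names are not from the slides.

```python
def odds_ratio(cases_exp, cases_unexp, controls_exp, controls_unexp):
    """Odds ratio for a case-control 2x2 table, in cross-product form."""
    return (cases_exp * controls_unexp) / (cases_unexp * controls_exp)


def misclassify_exposed(exposed, unexposed, miss_fraction):
    """Record a fraction of truly exposed subjects as unexposed."""
    moved = exposed * miss_fraction
    return exposed - moved, unexposed + moved


# Hypothetical true exposure counts: (exposed, unexposed).
true_cases = (100, 100)
true_controls = (50, 150)
true_or = odds_ratio(*true_cases, *true_controls)

# Non-differential: the same 20% error in cases and controls.
nondiff_or = odds_ratio(
    *misclassify_exposed(*true_cases, 0.2),
    *misclassify_exposed(*true_controls, 0.2),
)

# Differential: the error affects only one of the two groups.
diff_cases_only_or = odds_ratio(*misclassify_exposed(*true_cases, 0.2), *true_controls)
diff_controls_only_or = odds_ratio(*true_cases, *misclassify_exposed(*true_controls, 0.2))

print(f"true OR                       = {true_or:.2f}")
print(f"non-differential              = {nondiff_or:.2f}")
print(f"differential (cases only)     = {diff_cases_only_or:.2f}")
print(f"differential (controls only)  = {diff_controls_only_or:.2f}")
```

In this made-up example the non-differential error pulls the odds ratio from 3.0 towards 1, while the differential errors move it to 2.0 or 4.0 depending on which group is affected.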
Assessment of bias
 Non-responders questionnaire
 Baseline characteristics of those lost to
follow-up can be analysed and compared to
those remaining in study
 Objective validation of self-reported
information
Bias: the silent menace
 Cannot be assessed numerically
 No software to identify bias
 If there is a flaw in the design of the study,
increasing numbers will not get rid of it!
 Can only be assessed by careful
evaluation of the design
Causality
 1/ we find an association between
exposure and outcome
 2/ we need to ask whether the
association is causal = does the exposure
cause the outcome?
What is a cause?
Rothman (1986):
An event, condition, or characteristic that plays an essential role in
producing an occurrence of the disease. Source - Modern
Epidemiology.
- Something that has an effect
- Alters disease frequency or health status
Association versus Causation
• Epidemiological research aims to discover aetiology of
disease
• Epidemiology is the study of the association between a
potential cause (risk factor/determinant) and a specific
disease (outcome).
• Presence of a valid statistical association does not imply
causality
• Association is not the same as causation
• Causation goes beyond association
• How do we decide whether a given association is causal or
not?
Sir Austin Bradford Hill (1897-1991)
Exposure and Disease: Association or Causation?
1. Strength
2. Consistency
3. Specificity
4. Temporality
5. Dose-response
6. Biological plausibility
7. Coherence
8. Reversibility
The Bradford Hill criteria of causation (Proc R Soc Med 1965; 58:
295-300)
Bradford Hill Closing Remarks (1965)
“I do not believe … that we can usefully lay down
some hard-and-fast rules of evidence that must be
obeyed before we accept cause and effect.
None … can bring indisputable evidence for or
against the cause-and-effect hypothesis and none
can be required …
What they can do, with greater or less strength, is to
help us to make up our minds on the fundamental
question - is there any other way of explaining the set
of facts before us, is there any other answer equally,
or more, likely than cause and effect?”
Causal Inference
 Not just ticking boxes
 Weigh evidence of causal association against other
explanations
 Understanding, judgement & interpretation
 Cannot prove a causal association
 Can only be inferred based on evidence
 May change in the light of new evidence
[Diagram: evidence of causality weighed against weaknesses in the data]
Public health policy
 Ideally based on ‘evidence’ - meta-analyses
and systematic reviews
 Considerations of efficiency, cost-effectiveness
and harm
 Eradication of poverty for improving health?
 Reduction in social inequality for reducing
health inequality?
Summary
 Epidemiology = the study of the
distribution and determinants of disease in
populations
 Measures of association
 Bias, confounding, chance
 Causality