USMLE Session
Biostatistics
March 12, 2014
The city of Cancerville had a population of 10,000,000 (50% women) in
1995. In 1995, there were 80,000 women with previously diagnosed
ovarian cancer in Cancerville. Twenty thousand new cases of ovarian
cancer were diagnosed in 1995. What was the incidence rate of ovarian
cancer in Cancerville in 1995?
(A) 2,000 per 100,000 population
(B) 4,000 per 100,000 population
(C) 200 per 100,000 population
(D) 400 per 100,000 population
(E) 1,000 per 100,000 population
D 400 per 100,000 population
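The incidence rate is the number of new cases of a disease
during a specific period per population at risk. Twenty
thousand new cases divided by 5 million women gives a rate of
1 case per 250 women, or 400 cases per 100,000 population.
Strictly, the 80,000 women already diagnosed are no longer at
risk; removing them from the denominator gives about 407 per
100,000, which is still closest to choice D.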
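The arithmetic can be checked with a short Python sketch. The function name incidence_per_100k is made up for illustration; the counts are the ones given in the question.

```python
def incidence_per_100k(new_cases: int, population_at_risk: int) -> float:
    """New cases during the period per 100,000 people at risk."""
    return new_cases / population_at_risk * 100_000

women = 10_000_000 // 2      # 50% of the city's population
new_cases = 20_000           # diagnosed during 1995
prevalent_cases = 80_000     # diagnosed before 1995, so not at risk

# Denominator used in the answer key: all women in the city.
approx_rate = incidence_per_100k(new_cases, women)

# Stricter denominator: exclude women who already have the disease.
strict_rate = incidence_per_100k(new_cases, women - prevalent_cases)

print(round(approx_rate), round(strict_rate, 1))
```

Either denominator points to the same answer choice, because the prevalent cases are a small fraction of the 5 million women.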
A laboratory has developed a new test for rapid ascertainment
of serum parathyroid hormone levels. The test is repeated
twenty times on the same sample with a resulting coefficient of
variation of one percent. This is a measure of
(A) Accuracy
(B) Reliability
(C) Precision
(D) Validity
(E) Mode
B Reliability
-The mode is the most commonly occurring value in a series of
data.
-Reliability is a measure of the reproducibility of a test over
different conditions.
-Accuracy is a measure of the extent to which a test
approximates the real value of that which is measured. New
tests are measured against the gold standard, if one exists.
-Validity is the assessment of the degree to which a test
measures that for which it was designed. In other words, you
need to determine whether it reflects the outcome of interest
or other outcomes.
-Precision is the degree to which a measurement is not subject
to random variation.
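For reference, the coefficient of variation named in the question is the standard deviation expressed as a percentage of the mean. A minimal sketch follows; the repeat measurements are hypothetical values invented for illustration, not data from any real assay.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# Hypothetical repeat measurements of one serum sample (pg/mL); not real data.
repeats = [49.5, 50.0, 50.5, 50.0, 49.5, 50.5, 50.0, 50.0]

cv = coefficient_of_variation(repeats)
print(f"CV = {cv:.2f}%")
```

A small CV means the repeated results cluster tightly around their own mean. It says nothing about whether that mean is close to the true value.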
At a large university, a study of pulse rates at rest was conducted
on 5000 students. The mean pulse rate was 70, with a standard
deviation of 10. Which of the following statements is true?
(A) Approximately 95% of the students had pulses between 60
and 80
(B) Approximately 68% of the students had pulses between 60
and 80
(C) Approximately 99.7% of the students had pulses between 50
and 90
(D) Approximately 95% of the students had pulses between 40
and 100
(E) Approximately 68% of the students had pulses between 50
and 90
B Approximately 68% of the students had pulses between 60
and 80
When a test is conducted on a normally distributed population,
68% of the population will have values within one standard
deviation of the mean, 95% of the population will have values
within two standard deviations of the mean, and 99.7% of the
population will have values within three standard deviations of
the mean. Therefore, in this population, 68% of the pulses will
be between 60 and 80, 95% between 50 and 90, and 99.7%
between 40 and 100.
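The 68/95/99.7 figures can be reproduced from the normal distribution itself. This sketch assumes the pulse rates are exactly normal with the stated mean and standard deviation; fraction_between is an illustrative helper.

```python
from statistics import NormalDist

# Assumption: resting pulse is normally distributed, mean 70, SD 10.
pulse = NormalDist(mu=70, sigma=10)

def fraction_between(dist, low, high):
    """Share of the distribution lying between low and high."""
    return dist.cdf(high) - dist.cdf(low)

for k in (1, 2, 3):
    low, high = 70 - 10 * k, 70 + 10 * k
    share = fraction_between(pulse, low, high)
    print(f"{low} to {high}: {share:.1%}")
```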
A statistician analyzes data for several academic departments. She is free to
choose the appropriate methodology to perform her analyses. Which of
the following data would best be analyzed by non-parametric statistical
methods?
(A) Results of a study on the effect of a new lipid-lowering drug on LDL
cholesterol
(B) Results of a study on the effect of asbestos exposure on forced vital
capacity
(C) Results of a study on the relationship between gender and lung cancer
(D) Results of a study on the differences in weight distributions between
children in different countries
(E) Results of a study on the relationship between hemoglobin and
reticulocyte count
C Results of a study on the relationship between gender and
lung cancer
Parametric techniques can be used to analyze data where at
least one of the variables is quantitative (interval or ratio) and
where the data are normally distributed. If the data are not
normally distributed or both variables are qualitative (nominal
or ordinal), non-parametric techniques must be used. Gender
and lung cancer are both qualitative variables, so non-parametric
techniques, such as chi-square, are used to
determine the relationship between them.
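As an illustration of the chi-square approach, the sketch below computes the Pearson chi-square statistic for a 2x2 table of counts. The counts are hypothetical and chosen only to show the mechanics. The comparison value 3.84 is the 5% critical value for one degree of freedom.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed_count - expected) ** 2 / expected
    return stat

# Hypothetical counts (not real data).
# Rows: men, women. Columns: lung cancer, no lung cancer.
observed = [[30, 70],
            [20, 80]]

stat = chi_square_statistic(observed)
significant = stat > 3.84  # 5% critical value, 1 degree of freedom
print(round(stat, 2), significant)
```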
The public health officials of a particular city wish to evaluate the lead levels
of its constituents. In order to develop a sample population, they choose
every 10th family in the city for the study. This is an example of what kind of
population sample?
(A) Stratified selected sample
(B) Cluster selected sample
(C) Simple random sample
(D) Systematically selected sample
(E) Nonrandom selected sample
B Cluster selected sample
In cluster selected samples, the population of interest is
divided into subunits, such as families, and a random sample of
these units is used.
In simple random samples, each individual member of a
population has an equal probability of being chosen.
In stratified selected samples, individuals are chosen randomly
from within stratified groups, such as age groups.
In systematically selected samples, the population is ordered
by some characteristic, such as age, a starting point for
selection is randomly selected, and then the remainder of the
sample is collected by a predetermined scheme, such as
choosing every xth person.
In nonrandom selected samples, some predetermined scheme
is used, such as the first x number of people presenting for a
certain disease to a clinic.
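The random sampling schemes defined above can be contrasted in a short Python sketch. The sampling frame, the field names, and the helper functions are all hypothetical, and this is one possible way to express each scheme rather than a standard implementation.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical frame: 100 people in 20 families of 5, tagged by age group.
people = [
    {"id": i, "family": i // 5, "age_group": ("child", "adult", "senior")[i % 3]}
    for i in range(100)
]

def simple_random(frame, n):
    """Every individual has the same chance of selection."""
    return rng.sample(frame, n)

def systematic(frame, step):
    """Random starting point, then every step-th entry of the ordered frame."""
    start = rng.randrange(step)
    return frame[start::step]

def stratified(frame, key, n_per_stratum):
    """Random draw of fixed size from within each stratum."""
    strata = {}
    for person in frame:
        strata.setdefault(person[key], []).append(person)
    return [p for group in strata.values() for p in rng.sample(group, n_per_stratum)]

def cluster(frame, key, n_clusters):
    """Randomly choose whole subunits and keep everyone inside them."""
    chosen = set(rng.sample(sorted({p[key] for p in frame}), n_clusters))
    return [p for p in frame if p[key] in chosen]

print(len(simple_random(people, 10)), len(systematic(people, 10)),
      len(stratified(people, "age_group", 4)), len(cluster(people, "family", 2)))
```

The key contrast is the unit being randomized: individuals in the simple and stratified schemes, positions in an ordered list in the systematic scheme, and whole subunits in the cluster scheme.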
In reporting the results from a clinical study of a new anti-inflammatory drug for the
treatment of post-operative pain, the study's authors present data comparing the
total days of hospitalization for comparable groups of patients who have received
either the investigative anti-inflammatory drug or a placebo. The attached table
appears in their report. Which of the following would be a valid interpretation of the
data presented in this table?
(A) The p-value is greater than 0.05, indicating that there is no true treatment effect upon total days of post-operative
hospitalization
(B) The treatment group and placebo groups have unequal numbers of participants, and therefore the statistical test
results are not interpretable
(C) The results are suggestive of a true treatment effect, but the study has limited power to detect the effect due to
the relatively small number of study subjects
(D) Statistical testing of two group means yields a t-value, not a p-value
C The results are suggestive of a true treatment effect, but the
study has limited power to detect the effect due to the
relatively small number of study subjects
While the p-value for the differences between the mean days
of post-operative hospitalization is not below the conventional
level of 0.05, it is relatively close to that value. The values of
the treatment group and placebo group means (3.0 and 4.5
days, respectively) do suggest that there is an effect of
treatment. It is likely that the statistical power of the study is
rather limited, given the modest number of people enrolled in
each group. Ideally, this study would be repeated with larger
numbers of study subjects in each of the two groups. While it
would be a mistake to conclude that there was definitively a
treatment effect, it would also be a mistake to conclude that
there was no evidence for a treatment effect.
In clinical trials, it is not necessary that the comparison groups
have identical numbers of subjects, although there should be a
sufficient number of participants in each study group to
effectively evaluate the treatment being considered. Although
a comparison of two group means typically uses the t-test, the
resulting t-value is converted to a p-value, so choice D is incorrect.
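The link between sample size and p-value can be sketched in Python. The group means (3.0 and 4.5 days) come from the explanation above. The standard deviation and group sizes are assumptions made up for illustration, since the table itself is not reproduced here. The sketch also uses a normal approximation in place of the t distribution, which understates the p-value somewhat for small groups.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(mean_a, mean_b, sd, n_per_group):
    """Two-sided p-value for a difference in means (normal approximation)."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = (mean_b - mean_a) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Means from the text; sd=2.0 and the group sizes are hypothetical.
p_small = two_sided_p(3.0, 4.5, sd=2.0, n_per_group=12)
p_large = two_sided_p(3.0, 4.5, sd=2.0, n_per_group=48)

print(f"12 per group: p = {p_small:.3f}")
print(f"48 per group: p = {p_large:.4f}")
```

With the same difference in means, the larger study crosses the 0.05 threshold while the smaller one does not. That is the sense in which a small study has limited power.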
Week 7 USMLE Step 1 Review: Biostatistics, Behavioral
Science, and Nutrition, Steven Katz MSIV