8 Statistical inference based on one sample and two independent samples from the Bernoulli distribution
Theorem 8.1
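Let $X_1, \dots, X_n$ be a random sample from the Bernoulli distribution $A(\vartheta)$ and let the condition $n\vartheta(1-\vartheta) > 9$ hold. Let $M = \frac{1}{n}\sum_{i=1}^{n} X_i$ be the sample mean (sometimes referred to as the sample proportion). Then the statistic
$$U = \frac{M - \vartheta}{\sqrt{\frac{M(1-M)}{n}}} \approx N(0, 1).$$
This is read: the statistic $U$ asymptotically follows the standard normal distribution.
Thus the $100(1-\alpha)\%$ asymptotic confidence limits for the parameter $\vartheta$ are:
$$d = m - \sqrt{\frac{m(1-m)}{n}}\; u_{1-\alpha/2}, \qquad h = m + \sqrt{\frac{m(1-m)}{n}}\; u_{1-\alpha/2},$$
where $m$ is the realization of $M$ and $u_{1-\alpha/2}$ is the $(1-\alpha/2)$-quantile of $N(0, 1)$.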
Remark 8.2
It is essential to understand the interpretation of the mean $M$. The random variable $X$ takes the values one and zero, where one stands for success. Then $\sum_{i=1}^{n} X_i$ is the number of successes in $n$ independent trials and the fraction $\frac{1}{n}\sum_{i=1}^{n} X_i$ is the proportion of successes. The sample proportion is the statistic which estimates the parameter $\vartheta$, the probability of success.
Example 8.3
The marketing department of a company analyzes the market share of a competitor's product of the same kind as the one the company manufactures. In a random sample of 100 consumers, 34 use the competitor's product and the rest use the company's own product.
Find the 95% confidence interval for the proportion of the competitor's product in the market.
Solution
Let $X_i$ be a random variable which takes the value 1 if the $i$-th consumer uses the competitor's product and the value 0 otherwise, $i = 1, 2, \dots, 100$.
Then $X_i \sim A(\vartheta)$ and $X_1, \dots, X_n$ is a random sample from the Bernoulli distribution. The task is to construct the confidence interval for the parameter $\vartheta$ of this distribution.
$n = 100$, $m = 34/100 = 0.34$, $u_{1-\alpha/2} = u_{0.975} = 1.96$.
Since the parameter $\vartheta$ in the approximation condition $n\vartheta(1-\vartheta) > 9$ is unknown, it is replaced by its estimate $m$:
$100 \cdot 0.34 \cdot 0.66 = 22.44 > 9$. Thus the estimate $m$ satisfies the condition. Then:
$$d = 0.34 - \sqrt{\frac{0.34 \cdot 0.66}{100}} \cdot 1.96 = 0.2472, \qquad h = 0.34 + \sqrt{\frac{0.34 \cdot 0.66}{100}} \cdot 1.96 = 0.4328.$$
Thus $0.2472 < \vartheta < 0.4328$ with probability approximately 0.95.
[$\vartheta$ is the probability that a randomly drawn consumer uses the competitor's product; this probability is estimated to lie within the interval (0.2472; 0.4328). The confidence that this interval contains the true parameter is roughly 95%.]
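The computation in this solution can be sketched in Python as follows. This is only an illustrative sketch: the function name `proportion_ci` and its parameters are assumptions made for this example, not part of the text, and the quantile is taken from the standard library instead of being rounded to 1.96.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, conf_level=0.95):
    """Asymptotic confidence limits (d, h) for a Bernoulli parameter.

    Hypothetical helper: follows the limits d, h of Theorem 8.1 and
    checks the condition n*m*(1-m) > 9 with the estimate m in place
    of the unknown parameter.
    """
    m = successes / n  # realized sample proportion
    if n * m * (1 - m) <= 9:
        raise ValueError("condition n*m*(1-m) > 9 is not satisfied")
    alpha = 1 - conf_level
    u = NormalDist().inv_cdf(1 - alpha / 2)  # quantile u_{1-alpha/2}
    half_width = sqrt(m * (1 - m) / n) * u
    return m - half_width, m + half_width


# Data of Example 8.3: 34 of 100 consumers use the competitor's product.
d, h = proportion_ci(34, 100)
print(f"{d:.4f} {h:.4f}")  # 0.2472 0.4328
```

Using the exact quantile (about 1.95996) instead of 1.96 does not change the limits at four decimal places here.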
Theorem 8.4
Let $X_1, \dots, X_n$ be a random sample from $A(\vartheta)$, let $c \in (0, 1)$, let $M$ be the sample mean and let the condition $n\vartheta(1-\vartheta) > 9$ hold.
At the asymptotic significance level $\alpha$ the null hypothesis $H_0: \vartheta = c$ is rejected in favour of the alternative hypothesis $H_1$ if the realization of the test statistic
$$T_0 = \frac{M - c}{\sqrt{\frac{c(1-c)}{n}}}$$
falls within the critical region $W$. The critical regions corresponding to the form of the alternative hypothesis are:
two-tailed test: $H_1: \vartheta \neq c$, $W = (-\infty, -u_{1-\alpha/2}] \cup [u_{1-\alpha/2}, \infty)$
left-tailed test: $H_1: \vartheta < c$, $W = (-\infty, -u_{1-\alpha}]$
right-tailed test: $H_1: \vartheta > c$, $W = [u_{1-\alpha}, \infty)$
[If $H_0$ is true, then $T_0 \approx N(0, 1)$.]
Remark 8.5
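The test statistic is derived using the de Moivre-Laplace theorem:
$$\frac{M - \vartheta}{\sqrt{\frac{\vartheta(1-\vartheta)}{n}}} \approx N(0, 1).$$
Substituting $\vartheta = c$, which holds under $H_0$, gives $T_0$.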
Remark 8.6
The pivotal statistic used to construct the confidence interval (Theorem 8.1) differs from the test statistic stated in the previous theorem: the former has $M(1-M)/n$ under the square root in the denominator, the latter has $c(1-c)/n$.
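A small numerical illustration of this difference is sketched below. The values of `n`, `m` and `c` are hypothetical, chosen only for this sketch, and the variable names are assumptions rather than notation from the text.

```python
from math import sqrt

# Hypothetical values, for illustration only.
n = 100   # sample size
m = 0.34  # realized sample proportion
c = 0.30  # value of the parameter under H0

# Denominator of the pivotal statistic U (confidence interval):
se_interval = sqrt(m * (1 - m) / n)

# Denominator of the test statistic T0 (hypothesis test):
se_test = sqrt(c * (1 - c) / n)

print(f"{se_interval:.4f} {se_test:.4f}")
```

The two denominators coincide only when the sample proportion equals the hypothesized value, so a decision based on the confidence interval and a decision based on the test statistic can differ in borderline cases.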
Example 8.7
A manufacturer of certain components declares that the probability of producing a defective product is $\vartheta = 0.01$. A random sample of 1000 products was drawn and 16 of them were found to be defective. At the asymptotic significance level 0.05 test the hypothesis $H_0: \vartheta = 0.01$ against $H_1: \vartheta \neq 0.01$.
Solution
Since the parameter $\vartheta$ is unknown, the condition of normal approximation $n\vartheta(1-\vartheta) > 9$ is replaced by the condition $nm(1-m) > 9$:
$1000 \cdot \frac{16}{1000} \cdot \frac{984}{1000} = 15.744 > 9$, thus the normal approximation can be used.
The realization of the test statistic is:
$$t_0 = \frac{16/1000 - 0.01}{\sqrt{\frac{0.01 \cdot 0.99}{1000}}} = 1.907.$$
The critical region is $W = (-\infty, -u_{1-\alpha/2}] \cup [u_{1-\alpha/2}, \infty) = (-\infty, -1.96] \cup [1.96, \infty)$.
Since $1.907 \notin W$, $H_0$ is not rejected at the asymptotic significance level 0.05.
[Based on the values of the random sample there is no reason to doubt the declared probability 0.01 of manufacturing a defective product.]
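The test above can be reproduced with a short Python sketch. The function name `one_sample_proportion_test` and the labels `"two-sided"`, `"less"`, `"greater"` are assumptions introduced for this illustration; the statistic and critical regions are those of Theorem 8.4.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_proportion_test(successes, n, c, alpha=0.05,
                               alternative="two-sided"):
    """Return (t0, reject) for H0: parameter = c.

    Hypothetical helper: t0 = (m - c) / sqrt(c(1-c)/n), compared with
    the critical region that matches the chosen alternative.
    """
    m = successes / n
    t0 = (m - c) / sqrt(c * (1 - c) / n)
    std_normal = NormalDist()
    if alternative == "two-sided":
        reject = abs(t0) >= std_normal.inv_cdf(1 - alpha / 2)
    elif alternative == "less":        # H1: parameter < c
        reject = t0 <= -std_normal.inv_cdf(1 - alpha)
    elif alternative == "greater":     # H1: parameter > c
        reject = t0 >= std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("unknown alternative")
    return t0, reject


# Data of Example 8.7: 16 defective products among 1000, c = 0.01.
t0, reject = one_sample_proportion_test(16, 1000, 0.01)
print(f"t0 = {t0:.3f}, reject H0: {reject}")  # t0 = 1.907, reject H0: False
```

Note that with the right-tailed alternative the same data would lead to rejection, since 1.907 exceeds the quantile 1.645; the conclusion depends on the alternative fixed before looking at the data.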
Theorem 8.8
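Let us consider two independent samples. Let $X_{11}, \dots, X_{1n_1}$ be a random sample from the Bernoulli distribution $A(\vartheta_1)$ and $X_{21}, \dots, X_{2n_2}$ be a random sample from $A(\vartheta_2)$. Let the conditions $n_i\vartheta_i(1-\vartheta_i) > 9$, $i = 1, 2$, hold. Let $M_1, M_2$ be the sample means. Then the statistic
$$U = \frac{(M_1 - M_2) - (\vartheta_1 - \vartheta_2)}{\sqrt{\frac{M_1(1-M_1)}{n_1} + \frac{M_2(1-M_2)}{n_2}}} \approx N(0, 1).$$
Thus the $100(1-\alpha)\%$ asymptotic confidence limits for the parametric function $\vartheta_1 - \vartheta_2$ are:
$$d = m_1 - m_2 - \sqrt{\frac{m_1(1-m_1)}{n_1} + \frac{m_2(1-m_2)}{n_2}}\; u_{1-\alpha/2},$$
$$h = m_1 - m_2 + \sqrt{\frac{m_1(1-m_1)}{n_1} + \frac{m_2(1-m_2)}{n_2}}\; u_{1-\alpha/2}.$$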
Example 8.9
The management of a supermarket advertised a week of price reductions. The aim was to find out whether the price reductions affect the proportion of heavy shopping (over 500 Kč). During a week without reductions, 200 customers were drawn at random and 97 of them had done heavy shopping. During the week with reductions the size of the random sample was 300 and the number of customers with heavy shopping was 162. Determine the 95% asymptotic confidence interval for the difference between the probabilities of heavy shopping during the week without reductions and the week with reductions.
Solution
The random variable $X_{1,i}$ takes the value 1 if, during the week without price reductions, the $i$-th randomly drawn customer does heavy shopping, and the value 0 otherwise, $i = 1, \dots, 200$. The random variables $X_{1,1}, \dots, X_{1,200}$ form a random sample from the distribution $A(\vartheta_1)$. Further, the random variable $X_{2,i}$ takes the value 1 if, during the week with price reductions, the $i$-th randomly drawn customer does heavy shopping, and the value 0 otherwise, $i = 1, \dots, 300$. The random variables $X_{2,1}, \dots, X_{2,300}$ form a random sample from the distribution $A(\vartheta_2)$, and this sample is independent of the previous one.
$n_1 = 200$, $n_2 = 300$, $m_1 = 97/200$, $m_2 = 162/300$.
To verify the conditions $n_i\vartheta_i(1-\vartheta_i) > 9$, $i = 1, 2$, of the normal approximation, the unknown parameters $\vartheta_i$ are replaced by their estimates $m_i$. These estimates meet the conditions:
$200 \cdot \frac{97}{200} \cdot \frac{103}{200} = 49.955 > 9$, $\quad 300 \cdot \frac{162}{300} \cdot \frac{138}{300} = 74.52 > 9$.
Thus the $100(1-\alpha)\%$ asymptotic confidence limits for the parametric function $\vartheta_1 - \vartheta_2$ are:
$$d = m_1 - m_2 - \sqrt{\frac{m_1(1-m_1)}{n_1} + \frac{m_2(1-m_2)}{n_2}}\; u_{1-\alpha/2} = \frac{97}{200} - \frac{162}{300} - \sqrt{\frac{\frac{97}{200}\left(1-\frac{97}{200}\right)}{200} + \frac{\frac{162}{300}\left(1-\frac{162}{300}\right)}{300}} \cdot 1.96 = -0.1443,$$
$$h = m_1 - m_2 + \sqrt{\frac{m_1(1-m_1)}{n_1} + \frac{m_2(1-m_2)}{n_2}}\; u_{1-\alpha/2} = \frac{97}{200} - \frac{162}{300} + \sqrt{\frac{\frac{97}{200}\left(1-\frac{97}{200}\right)}{200} + \frac{\frac{162}{300}\left(1-\frac{162}{300}\right)}{300}} \cdot 1.96 = 0.0343.$$
Hence the parametric function $\vartheta_1 - \vartheta_2 \in (-0.1443, 0.0343)$ with probability approximately 0.95.
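The same limits can be obtained with a few lines of Python. This is a sketch under the assumptions of Theorem 8.8; the function name `diff_proportion_ci` and its parameters are hypothetical names used only for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, conf_level=0.95):
    """Asymptotic confidence limits (d, h) for the difference of two
    Bernoulli parameters, based on two independent samples.

    Hypothetical helper: x1, x2 are the numbers of successes and
    n1, n2 the sample sizes.
    """
    m1, m2 = x1 / n1, x2 / n2
    # Check the approximation conditions with the estimates m1, m2.
    for n, m in ((n1, m1), (n2, m2)):
        if n * m * (1 - m) <= 9:
            raise ValueError("condition n*m*(1-m) > 9 is not satisfied")
    alpha = 1 - conf_level
    u = NormalDist().inv_cdf(1 - alpha / 2)  # quantile u_{1-alpha/2}
    se = sqrt(m1 * (1 - m1) / n1 + m2 * (1 - m2) / n2)
    return (m1 - m2) - se * u, (m1 - m2) + se * u


# Data of Example 8.9: 97 of 200 (no reductions), 162 of 300 (reductions).
d, h = diff_proportion_ci(97, 200, 162, 300)
print(f"{d:.4f} {h:.4f}")  # -0.1443 0.0343
```

The interval contains zero, so at this confidence level the data are compatible with equal probabilities of heavy shopping in both weeks.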
Theorem 8.10
Let us consider two independent samples. Let $X_{11}, \dots, X_{1n_1}$ be a random sample from the Bernoulli distribution $A(\vartheta_1)$ and $X_{21}, \dots, X_{2n_2}$ be a random sample from $A(\vartheta_2)$. Let the conditions $n_i\vartheta_i(1-\vartheta_i) > 9$, $i = 1, 2$, hold. Let $M_1, M_2$ be the sample means.
At the asymptotic significance level $\alpha$ the null hypothesis $H_0: \vartheta_1 - \vartheta_2 = c$ is rejected in favour of the alternative hypothesis $H_1$ if the realization of the test statistic
$$T_0 = \frac{(M_1 - M_2) - c}{\sqrt{\frac{M_1(1-M_1)}{n_1} + \frac{M_2(1-M_2)}{n_2}}}$$
falls within the critical region $W$. The critical regions corresponding to the form of the alternative hypothesis are:
two-sided test: $H_1: \vartheta_1 - \vartheta_2 \neq c$, $W = (-\infty, -u_{1-\alpha/2}] \cup [u_{1-\alpha/2}, \infty)$
left-sided test: $H_1: \vartheta_1 - \vartheta_2 < c$, $W = (-\infty, -u_{1-\alpha}]$
right-sided test: $H_1: \vartheta_1 - \vartheta_2 > c$, $W = [u_{1-\alpha}, \infty)$
[If $H_0$ is true, then $T_0 \approx N(0, 1)$.]
Remark 8.11
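In the case of $H_0: \vartheta_1 - \vartheta_2 = 0$ ($c = 0$) the following test statistic is preferable:
$$T_0 = \frac{M_1 - M_2}{\sqrt{M(1-M)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \quad \text{where } M = \frac{n_1 M_1 + n_2 M_2}{n_1 + n_2}$$
is the pooled sample proportion.
[If $H_0$ is true, then $T_0 \approx N(0, 1)$.]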
Example 8.12
Using the data from Example 8.9, test at the asymptotic significance level 0.05 the hypothesis that the week of price reductions does not increase the probability of heavy shopping.
Solution
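We run the left-tailed test of $H_0: \vartheta_1 - \vartheta_2 = 0$ versus $H_1: \vartheta_1 - \vartheta_2 < 0$ at the asymptotic significance level $\alpha = 0.05$.
$n_1 = 200$, $n_2 = 300$, $m_1 = 97/200$, $m_2 = 162/300$, $m = (97 + 162)/500 = 0.518$.
The assumptions of the normal approximation have been verified in Example 8.9.
a) Using the confidence interval method:
For the left-tailed test we use the right-sided confidence interval $(-\infty; h)$, where
$$h = m_1 - m_2 + \sqrt{\frac{m_1(1-m_1)}{n_1} + \frac{m_2(1-m_2)}{n_2}}\; u_{1-\alpha} = \frac{97}{200} - \frac{162}{300} + \sqrt{\frac{\frac{97}{200}\left(1-\frac{97}{200}\right)}{200} + \frac{\frac{162}{300}\left(1-\frac{162}{300}\right)}{300}} \cdot 1.645 = 0.02.$$
Since the value $c = 0$ lies within the interval $(-\infty; 0.02)$, $H_0$ is not rejected at the asymptotic significance level $\alpha = 0.05$. The data therefore do not show that the week of price reductions increases the probability of heavy shopping.
b) Using the classical method:
The test statistic is
$$T_0 = \frac{M_1 - M_2}{\sqrt{M(1-M)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \quad \text{where } M = \frac{n_1 M_1 + n_2 M_2}{n_1 + n_2}.$$
$$m = \frac{200 \cdot \frac{97}{200} + 300 \cdot \frac{162}{300}}{200 + 300} = 0.518$$
$$t_0 = \frac{\frac{97}{200} - \frac{162}{300}}{\sqrt{0.518\,(1 - 0.518)\left(\frac{1}{200} + \frac{1}{300}\right)}} = -1.2058$$
The critical region is
$W = (-\infty, -u_{1-\alpha}] = (-\infty, -u_{0.95}] = (-\infty, -1.645]$.
Since $t_0 \notin W$, $H_0$ is not rejected at the asymptotic significance level $\alpha = 0.05$.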
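The classical method of this solution can be sketched in Python as follows. The function name `pooled_two_proportion_test` and the labels for the alternatives are assumptions made for this illustration; the statistic is the pooled one from Remark 8.11, so the sketch applies only to the case $c = 0$.

```python
from math import sqrt
from statistics import NormalDist


def pooled_two_proportion_test(x1, n1, x2, n2, alpha=0.05,
                               alternative="less"):
    """Return (t0, reject) for H0: the two Bernoulli parameters are equal.

    Hypothetical helper: uses the pooled proportion m = (x1 + x2)/(n1 + n2)
    in the denominator of the test statistic.
    """
    m1, m2 = x1 / n1, x2 / n2
    m = (x1 + x2) / (n1 + n2)  # pooled sample proportion
    t0 = (m1 - m2) / sqrt(m * (1 - m) * (1 / n1 + 1 / n2))
    std_normal = NormalDist()
    if alternative == "less":          # H1: difference < 0
        reject = t0 <= -std_normal.inv_cdf(1 - alpha)
    elif alternative == "greater":     # H1: difference > 0
        reject = t0 >= std_normal.inv_cdf(1 - alpha)
    elif alternative == "two-sided":   # H1: difference != 0
        reject = abs(t0) >= std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("unknown alternative")
    return t0, reject


# Data of Example 8.9, left-tailed test as in Example 8.12.
t0, reject = pooled_two_proportion_test(97, 200, 162, 300)
print(f"t0 = {t0:.4f}, reject H0: {reject}")  # t0 = -1.2058, reject H0: False
```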