Statistical inference I and II
Stanislav Katina (ÚMS, MU)
17.04.2024

Book: Aplikovaná štatistická inferencia I
Figure 1: Book Aplikovaná štatistická inferencia I

Table of contents
1 Overview of testing of statistical hypotheses
2 Three test statistics
3 Confidence intervals
4 Overview of the tests
5 Generalised hypotheses
6 One-sample tests about μ
7 Paired tests about μ
8 One-sample tests about σ²
9 One-sample tests about skewness and kurtosis
10 One-sample tests about correlation coefficient
11 One-sample tests about probability
12 One-sample tests about rate
13 Two-sample tests about difference of two means

10.14 One-sample tests about correlation coefficient

"PowerRho11" <- function(r, r0, n, alpha) {
    z <- atanh(r)
    z0 <- atanh(r0)
    sila <- 1 - pnorm(qnorm(1 - alpha/2) - sqrt(n - 3) * abs(z0 - z))
    return(sila)
}
"PowerRho12" <- function(r, r0, n, alpha) {
    z <- atanh(r)
    z0 <- atanh(r0)
    sila <- 1 - pnorm(qnorm(1 - alpha) + sqrt(n - 3) * (z0 - z))
    return(sila)
}
"PowerRho13" <- function(r, r0, n, alpha) {
    z <- atanh(r)
    z0 <- atanh(r0)
    sila <- 1 - pnorm(qnorm(1 - alpha) - sqrt(n - 3) * (z0 - z))
    return(sila)
}

10.15 One-sample tests about correlation coefficient

[Figure: power functions plotted against ρ − ρ0 (curves for n = 10, 20, 30, 40, 50; ρ0 = 0, α = 0.05) for H11 : ρ − ρ0 ≠ 0 (left), H12 : ρ − ρ0 > 0 (middle) and H13 : ρ − ρ0 < 0 (right).]
Power functions of the one-sample Z-statistic ZW for ρ under H11 (left), H12 (middle) and H13 (right). If ρ0 ≠ 0, the power function under H11 is not symmetric around ρ0.

10.16 One-sample tests about correlation coefficient

For H11, using the Fisher Z-variable, the minimal sample size for a difference ξ − ξ0, α and β is given by

    n ≥ 3 + ((u_{α/2} + u_β)/(ξ − ξ0))².

For H12 and H13, u_{α/2} is changed to u_α.

Example (minimal sample size n)
Calculate the minimal sample size n for ρ = 0.1, 0.2, …, 0.9, ρ0 = 0, with α = 0.05, 1 − β = 0.8 and H11.

Table 1: Minimal sample size n for a difference of ρ and ρ0, where ρ0 = 0
ρ   0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9
n   783   194    85    47    30    20    14    10     7

10.17 One-sample tests about correlation coefficient

"NRho11a" <- function(r, r0, alpha = 0.05, sila = 0.8) {
    z <- atanh(r); z0 <- atanh(r0)
    ualpha <- qnorm(1 - alpha/2)
    ubeta <- qnorm(sila)
    n <- ceiling(3 + ((ualpha + ubeta)/(z - z0))^2)
    names(n) <- paste("r=", r, sep = "")
    return(n)
}
n80 <- NRho11a(seq(0.1, 0.9, by = 0.1), 0, 0.05, 0.8)
## r=0.1 r=0.2 r=0.3 r=0.4 r=0.5 r=0.6 r=0.7 r=0.8 r=0.9
##   783   194    85    47    30    20    14    10     7
n85 <- NRho11a(seq(0.1, 0.9, by = 0.1), 0, 0.05, 0.85)
## r=0.1 r=0.2 r=0.3 r=0.4 r=0.5 r=0.6 r=0.7 r=0.8 r=0.9
##   895   222    97    54    33    22    15    11     8
n90 <- NRho11a(seq(0.1, 0.9, by = 0.1), 0, 0.05, 0.9)
## r=0.1 r=0.2 r=0.3 r=0.4 r=0.5 r=0.6 r=0.7 r=0.8 r=0.9
##  1047   259   113    62    38    25    17    12     8
# for r = r0 = 0.5 the difference z - z0 is zero, hence n = Inf
n80 <- NRho11a(seq(0.1, 0.9, by = 0.1), 0.5, 0.05, 0.8)
## r=0.1 r=0.2 r=0.3 r=0.4 r=0.5 r=0.6 r=0.7 r=0.8 r=0.9
##    42    69   140   501   Inf   383    81    30    13
10.18 One-sample tests about correlation coefficient

Example (minimal sample size n)
Calculate the minimal sample size n for ρ = 0.1, 0.2, …, 0.9 and ρ − ρ0 = 0.1, with α = 0.05, 1 − β = 0.8 and H11.

Table 2: Minimal sample size n for selected differences of ρ and ρ0, together with ZR − ξ0, as a function of ρ and ρ0
ρ        0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9
ρ0       0       0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8
ZR − ξ0  0.1003  0.1024  0.1068  0.1141  0.1257  0.1438  0.1742  0.2313  0.3736
n        783     752     692     606     501     383     262     150     60

10.19 One-sample tests about correlation coefficient

"NRho11b" <- function(r, r0, alpha = 0.05, sila = 0.8) {
    z <- atanh(r); z0 <- atanh(r0)
    ualpha <- qnorm(1 - alpha/2)
    ubeta <- qnorm(sila)
    n <- ceiling(3 + ((ualpha + ubeta)/(z - z0))^2)
    return(n)
}
n80 <- c(
    NRho11b(0.1, 0, 0.05, 0.8),
    NRho11b(0.2, 0.1, 0.05, 0.8),
    NRho11b(0.3, 0.2, 0.05, 0.8),
    NRho11b(0.4, 0.3, 0.05, 0.8),
    NRho11b(0.5, 0.4, 0.05, 0.8),
    NRho11b(0.6, 0.5, 0.05, 0.8),
    NRho11b(0.7, 0.6, 0.05, 0.8),
    NRho11b(0.8, 0.7, 0.05, 0.8),
    NRho11b(0.9, 0.8, 0.05, 0.8)
)
## 783 752 692 606 501 383 262 150 60

10.20 One-sample tests about correlation coefficient

"FisherZ11" <- function(r, r0) {
    z <- atanh(r)
    z0 <- atanh(r0)
    z - z0
}

FisherZ <- FisherZ11(seq(0.1, 0.9, by = 0.1), seq(0, 0.8, by = 0.1))
## 0.1003 0.1024 0.1068 0.1141 0.1257 0.1438 0.1742 0.2313 0.3736

10.21 One-sample tests about correlation coefficient

# one-sided alternative
"NRho12" <- function(r, r0, alpha = 0.05, sila = 0.8) {
    z <- atanh(r)
    z0 <- atanh(r0)
    ualpha <- qnorm(1 - alpha)
    ubeta <- qnorm(sila)
    n <- ceiling(3 + ((ualpha + ubeta)/(z - z0))^2)
    names(n) <- paste("r=", r, sep = "")
    return(n)
}
n80 <- NRho12(seq(0.1, 0.9, by = 0.1), 0, 0.05, 0.8)
## r=0.1 r=0.2 r=0.3 r=0.4 r=0.5 r=0.6 r=0.7 r=0.8 r=0.9
##   618   154    68    38    24    16    12     9     6
n85 <- NRho12(seq(0.1, 0.9, by = 0.1), 0, 0.05, 0.85)
## r=0.1 r=0.2 r=0.3 r=0.4 r=0.5 r=0.6 r=0.7 r=0.8 r=0.9
##   718   178    79    44    27    18    13     9     7
n90 <- NRho12(seq(0.1, 0.9, by = 0.1), 0, 0.05, 0.9)
## r=0.1 r=0.2 r=0.3 r=0.4 r=0.5 r=0.6 r=0.7 r=0.8 r=0.9
##   854   212    93    51    32    21    15    11     7

10.22 One-sample tests about correlation coefficient

Example (convergence of ρ̂ and ξ̂ to the normal distribution)
In R, generate pseudorandom numbers from N2(μ, Σ), where μ1 = 0, μ2 = 0, σ1 = 1, σ2 = 1, n = 5, 10, 20, 50 and 100, M = 10000. Use (a) ρ = 0, (b) ρ = 0.50 and (c) ρ = 0.9. For each m = 1, 2, …, M, calculate the Pearson correlation coefficient r_m and the Fisher Z-variable z_{R,m}. Visualise histograms of the simulated r_m and z_{R,m} and superimpose them with the equivalent theoretical normal densities.

10.23 One-sample tests about correlation coefficient

Example (one-sample tests for testing hypotheses about ρ)
We have the data one-sample-correlation-skull-mf.txt with variable X, the biggest cranium height skull.pH, and variable Y, face height face.H, in millimeters (mm), of an ancient Egyptian male population, and we expect a bivariate normal distribution of these variables, i.e., N2(μ, Σ). A.
Test the null hypothesis (against the general alternative) that the correlation coefficient of the biggest cranium height and face height is equal to ρ0 = 0.251 at significance level α = 0.05. B. Calculate the 100 × (1 − α)% empirical confidence interval for the correlation coefficient, where the coverage probability (confidence coefficient) is equal to 1 − α = 0.95.
Notes: Use (1) the Fisher test statistic ZW, (2) the likelihood ratio test statistic ULR, and the equivalent confidence intervals. Can you find any equivalent of the Fisher Z-test in R?

10.24 One-sample tests about correlation coefficient

Figure 6: The biggest cranium height and face height in millimeters (mm)

10.25 One-sample tests about correlation coefficient

Mathematical formulation:
Null hypothesis – H0 : ρ = ρ0
Alternative hypothesis – H1 : ρ ≠ ρ0, where ρ0 = 0.251.
Verbal formulation:
Null hypothesis – the correlation coefficient of the biggest cranium height and face height in the ancient Egyptian male population is equal to ρ0 = 0.251.
Alternative hypothesis – the correlation coefficient of the biggest cranium height and face height in the ancient Egyptian male population is not equal to ρ0 = 0.251.

10.26 One-sample tests about correlation coefficient

The elements of the testing procedure:
Realisation of the test statistic – the correlation coefficient ρ is not known and therefore has to be estimated. The estimate of the correlation coefficient is r = ρ̂ = 0.331 and the (nearly unbiased) estimate is r_n = ρ̂_n = 0.329 (in this example, the difference between the two estimates is very small).
Then the realisation of the Wald test statistic is zW = √(n − 3) (zR − ξ0) ≐ 1.105, and the realisation of the likelihood ratio test statistic is uLR ≐ 18.987 (if H0 : ρ = 0) or uLR ≐ 1.242 (if H0 : ρ = ρ0).
Rejection region – Wald test: critical values u(1 − α/2) ≐ −1.960 and u(α/2) ≐ 1.960, rejection region W1 = (z_min, u(1 − α/2)) ∪ (u(α/2), z_max) ≐ (−∞, −1.960) ∪ (1.960, ∞); likelihood ratio test: critical value χ²₁(α) ≐ 3.841, rejection region W = (χ²₁(α), t_max) ≐ (3.841, ∞).

10.27 One-sample tests about correlation coefficient

The elements of the testing procedure (cont.):
95% empirical confidence interval for the correlation coefficient –
Wald 95% empirical confidence interval for ρ: (ρ̂_L, ρ̂_U) ≐ (0.187, 0.461);
likelihood 95% empirical confidence interval for ρ: CS_{0.95} = {ρ0 : uLR(ρ0) < χ²₁(α)} ≐ (0.188, 0.460).
p-values – Wald test p-value ≐ 2 Pr(ZW ≥ 1.105 | H0) ≐ 0.269; likelihood ratio test p-value ≐ Pr(ULR ≥ 1.242 | H0) ≐ 0.265.
Statistical conclusion – H0 is not rejected at significance level α = 0.05, because (1) the test statistic does not belong to the critical region, (2) ρ0 = 0.251 belongs to the confidence interval, and (3) the p-value is greater than 0.05.
Verbal conclusion – we do not reject the null hypothesis that the correlation coefficient ρ of the biggest cranium height and face height in the ancient Egyptian male population is equal to ρ0 = 0.251.
10.28 One-sample tests about correlation coefficient

wd <- "~/Documents/TEACHING/LECTURES/M8986-SI-II/DATA/"
fname <- paste(wd, "one-sample-correlation-skull-mf.txt", sep = "")
DATA <- read.table(fname, header = TRUE)
DATA <- na.omit(DATA)
skull.pH <- DATA$skull.pH[DATA$sex == "m"]
face.H <- DATA$face.H[DATA$sex == "m"]
n <- length(skull.pH) ## 164
rho_hat <- cor(skull.pH, face.H) ## 0.3306431

zR <- 1 / 2 * log((1 + rho_hat) / (1 - rho_hat)) ## 0.3435501
rho0 <- 0.251
xi0 <- 1 / 2 * log((1 + rho0) / (1 - rho0)) ## 0.2564798
zW_obs <- sqrt(n - 3) * (zR - xi0) ## 1.104799
tW_obs <- sqrt(n - 2) * rho_hat / sqrt(1 - rho_hat^2) ## 4.459204
uLR_obs <- -n * log(1 - tW_obs^2 / (tW_obs^2 + (n - 2))) ## 18.98718
# neither tW_obs nor this uLR_obs can be used here, because rho0 is not equal to 0
uLR_obs <- -n * (log((1 - rho0^2) * (1 - rho_hat^2) /
    (1 - rho0 * rho_hat)^2)) ## 1.241757

10.29 One-sample tests about correlation coefficient

z_critval_d <- qnorm(0.025) ## -1.959964
z_critval_h <- qnorm(0.975) ## 1.959964
lr_critval <- qchisq(0.95, df = 1) ## 3.841459
"CorrelationCI" <- function(x, y, conf.level = 0.95) {
    z <- atanh(cor(x, y))
    n <- length(x)
    a <- qnorm(1 - (1 - conf.level) / 2) / (n - 3)^0.5
    CI_xi <- c(z - a, z + a)
    CI <- tanh(CI_xi)
    return(CI)
}
CIz <- CorrelationCI(skull.pH, face.H) ## 0.1868617 0.4605561
rho0i <- seq(-0.999, 0.999, by = 0.0000001)
uLR_i <- -n * (log((1 - rho0i^2) * (1 - rho_hat^2) /
    (1 - rho0i * rho_hat)^2))
CI_LR <- range(rho0i[which(uLR_i < lr_critval)])
## 0.1879428 0.4596729
p_val_zW <- 2 * (1 - pnorm(abs(zW_obs))) ## 0.2692467
p_val_LR <- 1 - pchisq(uLR_obs, df = 1) ## 0.2651328
10.30 One-sample tests about correlation coefficient

One-sample t-test about the correlation ρ in R (function cor.test()).
Arguments (inputs):
1 data vectors x and y – x, y
2 alternative – default alternative = "two.sided", other choices "greater", "less"
3 method – default method = "pearson"
4 coverage probability 1 − α – conf.level, default 0.95
Outputs:
1 test statistic tW – statistic; it tests only the null hypothesis ρ = 0, so a confidence interval for a general ρ0 cannot be created from it
2 degrees of freedom (df) – parameter
3 p-value – p.value
4 alternative hypothesis – alternative
5 empirical confidence interval for ρ – conf.int (calculated from the Fisher Z-variable, and therefore not compatible with the t-statistic)
6 point estimate (correlation coefficient) – estimate
7 name of the test – method

11 One-sample tests about probability

11.1 One-sample tests about probability

Tests of probabilities (proportions) are used in anthropology to evaluate qualitative traits in living people and on human skeletons. Qualitative traits are traits that are not evaluated by measurement; instead, their presence or absence is assessed. Populations then differ from each other in the proportion of cases with and without the occurrence of the trait. Qualitative traits include, e.g., blood groups with one antigen, in which case it is the frequency of individuals with the presence or absence of the given antigen.
Also the number of individuals of each sex in a population (sex ratio), side asymmetry (greater value on the right or on the left) and side preference (right- or left-handedness) are traditionally described qualitatively. Other examples of qualitative traits are epigenetic signs on the skull (e.g., metopism – persistence of the sutura interfrontalis in adulthood), where the individual is usually classified into one of two categories (sign present or absent), and dermatoglyphic signs, where individuals are likewise classified into two categories (e.g., presence or absence of a digital triradius c). Yet another example of a qualitative trait is taste sensitivity to phenylthiocarbamide, where individuals are divided into two groups according to whether they perceive the bitter taste of substances with this component or not.

11.2 One-sample tests about probability

A large number of qualitative characteristics are distinguished in anthropology, and the results of their assessment are easily published in text or tables, which enables extensive comparisons with the published results of various studies (in contrast to continuous characteristics, where primary data/individual values are currently mostly not published, so comparisons have to rely on secondary data). In this way, we can very easily test whether a proportion found in our research (e.g., the ratio of individuals perceiving and not perceiving the taste of phenylthiourea in the Czech population) differs from the proportion found in a published comparative study.

11.3 One-sample tests about probability

Example (tests about probability)
Let X ∼ Bin(N, p). Test H0 : p = p0 against H1 : p ≠ p0, where θ0 = (p0, 1 − p0)ᵀ, with θ1 = p0 and θ̂2|0 = 1 − p0. Further, A = a = (1, 0)ᵀ and a0 = a0 = p0. Then
1 UW = (aᵀθ̂ − aᵀθ0)ᵀ (aᵀ I(θ̂) a) (aᵀθ̂ − aᵀθ0), so that uW = N(p̂ − p0)² / (p̂(1 − p̂)),
2 ULR = −2(ℓ(θ0|X) − ℓ(θ̂|X)), so that uLR = 2 Σ_{i=1}^{2} n_i ln(p̂_i/p_{0i}) = 2 Σ_i observed_i × ln(observed_i/expected_i),
3 US = (aᵀ S(θ0))ᵀ (aᵀ (I(θ0))⁻¹ a) aᵀ S(θ0), so that uS = N(p̂ − p0)² / (p0(1 − p0)).
Note: Z²W = UW and its realisation is z²W = uW.

    uW = (p̂ − p0)² I(p)|_{p = p̂, x/N = p̂} = (p̂ − p0)² N/(p(1 − p))|_{p = p̂} = (p̂ − p0)² / (p̂(1 − p̂)/N),

    uS = [∂/∂p ℓ(p|x)]² / I(p)|_{p = p0, x/N = p̂} = (x/p − (N − x)/(1 − p))² / (N/(p(1 − p)))|_{p = p0} = (p̂ − p0)² / (p0(1 − p0)/N).

11.4 One-sample tests about probability

Notes (for the example above): H01 vs H11. Let X ∼ Bin(N, p), where θ = p. Then the log-likelihood function is ℓ(p|x) = x ln p + (N − x) ln(1 − p). If H01 is true, θ0 = p0. If H01 is not true, the MLE of θ is p̂ = x/N.

    uLR = −2(ℓ(p0|x) − ℓ(p̂|x))
        = −2(x ln p0 + (N − x) ln(1 − p0) − x ln p̂ − (N − x) ln(1 − p̂))
        = 2(x ln(p̂/p0) + (N − x) ln((1 − p̂)/(1 − p0)))
        = 2(x ln(x/(N p0)) + (N − x) ln((N − x)/(N − N p0))).

Reparametrisation (notation change). Change p to p1 and q = 1 − p to p2, x to n1 and N − x to n2. Then θ = p = (p1, p2)ᵀ and θ̂ = p̂ = (p̂1, p̂2)ᵀ, where p̂1 = n1/N, p̂2 = n2/N, p1 + p2 = 1 and n1 + n2 = N; θ0 = p0 = (p01, p02)ᵀ, where p01 = p0 and p02 = 1 − p0. Then

    uLR = 2 Σ_{i=1}^{2} n_i ln(p̂_i/p_{0i}) = 2 Σ_i observed_i × ln(observed_i/expected_i).

11.5 One-sample tests about probability

This reparametrisation is important not only to obtain a simpler form of the likelihood ratio test statistic, but also for generalising this statistic to more dimensions or to two-sample and multi-sample cases.
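For a concrete binomial count, the three statistics above reduce to one-line computations. A minimal sketch in R (the values x = 12, N = 30 and p0 = 0.5 are illustrative, not taken from the lecture):

```r
# Wald, likelihood ratio and score statistics for H0: p = p0
# (illustrative values, not from the lecture)
x <- 12; N <- 30; p0 <- 0.5
p_hat <- x / N

uW  <- N * (p_hat - p0)^2 / (p_hat * (1 - p_hat))      # Wald: information at p_hat
uLR <- 2 * (x * log(p_hat / p0) +
            (N - x) * log((1 - p_hat) / (1 - p0)))     # likelihood ratio
uS  <- N * (p_hat - p0)^2 / (p0 * (1 - p0))            # score: information at p0

# all three are compared with the chi-squared quantile with 1 df
p_values <- 1 - pchisq(c(uW, uLR, uS), df = 1)
```

Under H0 all three statistics are asymptotically χ²₁, so for moderate N their realisations (here approximately 1.25, 1.21 and 1.20) and the corresponding p-values are close to each other.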
Wald 100 × (1 − α)% empirical confidence interval for p is defined as

    CS^(W)_{1−α} = {p0 : UW(p0) < χ²₁(α)}.

Likelihood 100 × (1 − α)% empirical confidence interval for p is defined as

    CS^(LR)_{1−α} = {p0 : ULR(p0) < χ²₁(α)}.

Score 100 × (1 − α)% empirical confidence interval for p is defined as

    CS^(S)_{1−α} = {p0 : US(p0) < χ²₁(α)}.

Example (score confidence interval)
Derive the score confidence interval.
Notes (for the example above): The score confidence interval defined above is equivalent to CS^(S)_{1−α} = {p0 : |ZS| < u_{α/2}}. This can be rewritten as the following quadratic inequality in p0:

11.6 One-sample tests about probability

    CS^(S)_{1−α} = {p0 : (1 + u²_{α/2}/N) p0² − (2p̂ + u²_{α/2}/N) p0 + p̂² ≤ 0}.

Based on Wilson (1927), the roots are

    (2p̂ + u²_{α/2}/N ± √((2p̂ + u²_{α/2}/N)² − 4p̂²(1 + u²_{α/2}/N))) / (2(1 + u²_{α/2}/N)).

Finally, we get

    p̂ (N/(N + u²_{α/2})) + (1/2)(u²_{α/2}/(N + u²_{α/2}))
    ± u_{α/2} √( (1/(N + u²_{α/2})) [ p̂(1 − p̂)(N/(N + u²_{α/2})) + (1/4)(u²_{α/2}/(N + u²_{α/2})) ] ).

11.7 One-sample tests about probability

Note: The mid-point of this interval is a weighted average of two elements and lies between p̂ and 0.5, with the weight of p̂ going asymptotically to one. The mid-point shifts p̂ towards 0.5, and with increasing N this shift gets smaller. The variance is a weighted average of the variances at p̂ and at 0.5, using sample size N + u²_{α/2} instead of N (Agresti and Coull 1998).

Minimal sample size: If c = (p − p0)/σ, then the minimal sample size N is defined as

    N ≥ ((u_{α/2} + u_β)/c)² = ((u_{α/2} + u_β)/(p − p0))² p(1 − p).

For H12 and H13, u_{α/2} has to be changed to u_α.
Minimal sample size for the normal approximation: Hald's rule N p(1 − p) > 9 (see table).
Table 3: Minimal sample sizes N for different p using Hald's rule
p      0.01  0.02  0.05  0.10  0.15  0.20  0.30  0.40  0.50
1 − p  0.99  0.98  0.95  0.90  0.85  0.80  0.70  0.60  0.50
N      910   460   190   100    71    57    43    38    36

11.8 One-sample tests about probability

Critical regions and power functions related to ZW:
H0 : p = p0 vs H1 : p ≠ p0, W1 = {ZW ; |ZW| ≥ u_{α/2}}, 1 − β(p) = 1 − Φ(u_{α/2} − |p0 − p| / √(p(1 − p)/N));
H0 : p ≤ p0 vs H1 : p > p0, W2 = {ZW ; ZW ≥ u_α}, 1 − β(p) = 1 − Φ(u_α + (p0 − p) / √(p(1 − p)/N));
H0 : p ≥ p0 vs H1 : p < p0, W3 = {ZW ; ZW ≤ −u_α}, 1 − β(p) = 1 − Φ(u_α − (p0 − p) / √(p(1 − p)/N)).

P-values related to ZW:
p-value = 2 Pr(ZW ≥ |zW| | H01), if H11 : p ≠ p0;
p-value = Pr(ZW ≥ zW | H02), if H12 : p > p0;
p-value = Pr(ZW ≤ zW | H03), if H13 : p < p0.

11.9 One-sample tests about probability

Wald 100 × (1 − α)% empirical CI for p:
for H1 : p ≠ p0, CS_{1−α} = {p0 : p0 ∈ (p̂ − u_{α/2} √(p̂(1 − p̂)/N), p̂ + u_{α/2} √(p̂(1 − p̂)/N))};
for H1 : p > p0, CS_{1−α} = {p0 : p0 ∈ (p̂ − u_α √(p̂(1 − p̂)/N), 1)};
for H1 : p < p0, CS_{1−α} = {p0 : p0 ∈ (0, p̂ + u_α √(p̂(1 − p̂)/N))}.

Different g(p), standard deviations and boundaries of confidence intervals:
1 if g(p) = p/(1 − p), then SD̂[g(p̂)] = √(p̂/(N(1 − p̂)³)), p_L = g_L(p)/(1 + g_L(p)) and p_U = g_U(p)/(1 + g_U(p));
2 if g(p) = ln(p/(1 − p)), then SD̂[g(p̂)] = √(1/(N p̂(1 − p̂))), p_L = exp(g_L(p))/(1 + exp(g_L(p))) and p_U = exp(g_U(p))/(1 + exp(g_U(p)));
3 if g(p) = arcsin(√p), then SD̂[g(p̂)] = 1/(2√N), p_L = sin²(g_L(p)) and p_U = sin²(g_U(p)).

11.10 One-sample tests about probability

For different g(p), the Wald test statistic is defined as Z^(alt)_W = (g(p̂) − g(p0)) / SD̂[g(p̂)].
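The three transformed intervals can be sketched in R directly from the formulas above; the function name TransformedWaldCI and the inputs x = 12, N = 40 are ours, chosen for illustration:

```r
# Back-transformed Wald CIs for p on the odds, logit and arcsine scales
# (a sketch following the g(p) list above; not lecture code)
"TransformedWaldCI" <- function(x, N, conf.level = 0.95) {
    p_hat <- x / N
    u <- qnorm(1 - (1 - conf.level) / 2)
    # g(p) = p/(1-p), SD = sqrt(p_hat / (N (1 - p_hat)^3))
    g1 <- p_hat / (1 - p_hat) + c(-1, 1) * u * sqrt(p_hat / (N * (1 - p_hat)^3))
    CI1 <- g1 / (1 + g1)
    # g(p) = ln(p/(1-p)), SD = sqrt(1 / (N p_hat (1 - p_hat)))
    g2 <- log(p_hat / (1 - p_hat)) + c(-1, 1) * u / sqrt(N * p_hat * (1 - p_hat))
    CI2 <- exp(g2) / (1 + exp(g2))
    # g(p) = arcsin(sqrt(p)), SD = 1 / (2 sqrt(N))
    g3 <- asin(sqrt(p_hat)) + c(-1, 1) * u / (2 * sqrt(N))
    CI3 <- sin(g3)^2
    rbind(odds = CI1, logit = CI2, arcsine = CI3)
}
CIs <- TransformedWaldCI(x = 12, N = 40)
```

Unlike the plain Wald interval, the logit-based interval always stays inside (0, 1), since the inverse logit maps the whole real line into (0, 1).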
Example (variance for different g(p))
Using the delta method, derive the variances V̂ar[g(p̂)] for:
1 g(p) = p/(1 − p),
2 g(p) = ln(p/(1 − p)),
3 g(p) = arcsin(√p).
Notes: If g(p) = p/(1 − p), then

    V̂ar[g(p̂)] = [∂/∂p (p/(1 − p))]² / (−∂²/∂p² ln L(p|x)) |_{p = p̂}
               = [((1 − p) + p)/(1 − p)²]² / (N/(p(1 − p))) |_{p = p̂}
               = p̂ / (N(1 − p̂)³).

11.11 One-sample tests about probability

Example (comparison of three CIs when p̂ = 0)
Let N = 25 and x = 0. Calculate the Wald empirical 95% CI for p, the score empirical 95% CI for p, and the likelihood empirical 95% CI for p.
Solution (also in R): If x = 0, then p̂ = 0/25 = 0. The Wald empirical 95% CI for p is equal to 0 ± 1.96 √((0 × 1)/25) = (0, 0). The score empirical 95% CI for p is equal to

    (1/2)(1.96²/(25 + 1.96²)) ± 1.96 √(1.96²/(4(25 + 1.96²)²)) = (0.000, 0.133).

The log-likelihood is ℓ(p|x) = 25 ln(1 − p). Since ℓ(p̂|x) = 0, we have uLR = −2(ℓ(p0|x) − ℓ(p̂|x)) = −50 ln(1 − p0), and the 95% likelihood CI is CS_{1−α} = {p0 : ULR ≤ χ²₁(α)}, where χ²₁(α) = 3.84. Then p̂_U = 1 − exp(−3.84/50) = 0.074, so the 95% likelihood CI is equal to (0.000, 0.074).

11.12 One-sample tests about probability

N <- 25; p_hat <- 0/N
var_phat <- p_hat * (1 - p_hat)/N
Wald_CI <- p_hat + c(-1, 1) * qnorm(0.975) * sqrt(var_phat) ## 0 0
library(Hmisc)
Wald_CI <- binconf(x = 0, n = 25, method = "asymptotic") ## 0 0

skore_CI <- prop.test(x = 0, n = 25, conf.level = 0.95,
    correct = FALSE)$conf.int ## 0 0.1331923
"LRbinom" <- function(pi_0, x, N, alpha) {
    -2 * (x * log(pi_0) + (N - x) * log(1 - pi_0)) -
        qchisq(1 - alpha, df = 1)
}
LR_CI <- uniroot(f = LRbinom, interval = c(0.000001, 0.999999),
    x = 0, N = 25, alpha = 0.05)$root ## 0.07395093

The CIs calculated using these three methods are different.
If p is close to zero, the distribution of X/N is extremely right-skewed for small N. In these situations, the score or likelihood CI should be used.

11.13 One-sample tests about probability

Example (comparison of several confidence intervals for p)
Compare the following empirical CIs for p: A. Wald CI, B. likelihood CI, C. score CI, D. back-transformed Wald CI for g(p) = p/(1 − p), E. back-transformed Wald CI for g(p) = ln(p/(1 − p)), F. back-transformed Wald CI for g(p) = arcsin(√p), where
1 N = 1400 and p = 0.10,
2 N = 40 and p = 0.10,
3 N = 1400 and p = 0.30,
4 N = 40 and p = 0.30,
5 N = 1400 and p = 0.50,
6 N = 40 and p = 0.50.
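As a starting point, the Wald (A) and score (C) intervals can be compared over the six (N, p) combinations above with a short R sketch (prop.test() with correct = FALSE returns the Wilson score interval; the remaining interval types are left as an exercise):

```r
# Wald vs Wilson score 95% CIs for p over the (N, p) grid of the example
grid <- expand.grid(N = c(1400, 40), p = c(0.10, 0.30, 0.50))
for (i in seq_len(nrow(grid))) {
    N <- grid$N[i]; p_hat <- grid$p[i]
    x <- round(N * p_hat)                      # observed count matching p_hat
    u <- qnorm(0.975)
    wald <- p_hat + c(-1, 1) * u * sqrt(p_hat * (1 - p_hat) / N)
    score <- prop.test(x, N, correct = FALSE)$conf.int
    cat(sprintf("N = %4d, p = %.2f: Wald (%.3f, %.3f), score (%.3f, %.3f)\n",
                N, p_hat, wald[1], wald[2], score[1], score[2]))
}
```

For N = 1400 the two intervals nearly coincide, while for N = 40 and p away from 0.5 the score interval is visibly shifted towards 0.5, in line with the Agresti–Coull interpretation of the Wilson mid-point given above.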