Statistics for Computer Sciences
Lecture 10 to Lecture 12: Testing of Statistical Hypotheses
Stanislav Katina
Institute of Mathematics and Statistics, Masaryk University
Honorary Research Fellow, The University of Glasgow
December 2, 2015

Testing of Statistical Hypotheses
Null and alternative hypothesis
◮ a 'hypothesis' is a theory which is assumed to be true unless evidence is obtained which indicates otherwise
◮ 'null' means 'nothing', and the term 'null hypothesis' (H0) means a 'theory of no change' – that is, 'no change' from what would be expected from past experience
◮ the 'alternative hypothesis' (H1) means a 'theory of change' – that is, 'change' from what would be expected from past experience
◮ the procedure used to decide between these two opposite theories is called a 'hypothesis test' or sometimes a 'significance test'
◮ one-tail test – a test in which the alternative hypothesis proposes a change in the parameter in only one direction, an increase or a decrease
◮ two-tail test – a test in which the alternative hypothesis proposes a change in the parameter in either direction

Testing of Statistical Hypotheses
Test statistic, rejection and acceptance region, critical value and quantile
◮ the test statistic is calculated from the sample – its value is used to decide whether the null hypothesis should be rejected
◮ the rejection (or critical) region gives the values of the test statistic for which the null hypothesis is rejected
◮ the acceptance region gives the values of the test statistic for which the null hypothesis is not rejected
◮ the boundary value(s) of the rejection region is (are) called the critical value(s) or quantile(s)
◮ the significance level α of a test gives the probability of the test statistic falling in the rejection region when the null hypothesis is true

Testing of Statistical Hypotheses
Hypothesis testing procedure
◮ a hypothesis is a statement about a population parameter based on a sample from this population
◮ H0 and H1 are two complementary hypotheses in a hypothesis testing problem
◮ a hypothesis testing procedure or hypothesis test is a rule that specifies for which sample values the decision is made to accept the null hypothesis as true, and for which sample values H0 is rejected
◮ the subset of the sample space for which H0 will be rejected is called the rejection region (critical region)
◮ the complement of the rejection region is called the acceptance region

Testing of Statistical Hypotheses
Four possibilities
Four choices:
A H0 is true – our decision is to reject H0
B H0 is true – our decision is not to reject H0
C H1 is true – our decision is not to reject H0
D H1 is true – our decision is to reject H0
Decision-reality table:
decision / reality      H0 is true       H0 is not true
to reject H0            Type I error     true decision
not to reject H0        true decision    Type II error

Testing of Statistical Hypotheses
Four possibilities
Four choices:
A) Pr(A) = Pr(Type I error) ≤ α [significance level]
B) Pr(B) ≥ 1 − α [coverage probability, confidence coefficient (level)]
C) Pr(C) = Pr(Type II error) ≤ β
D) Pr(D) ≥ 1 − β [power]
Four choices (formalised):
A) α ≥ Pr(Type I error) = Pr(reject H0 | H0 is true)
B) 1 − α ≤ Pr(don't reject H0 | H0 is true)
C) β ≥ Pr(Type II error) = Pr(don't reject H0 | H0 isn't true)
D) 1 − β ≤ Pr(reject H0 | H0 isn't true)

Testing of Statistical Hypotheses
Empirical 100 × (1 − α)% confidence intervals for parameter θ
Relationship of confidence interval and statistical test:
◮ empirical 100 × (1 − α)% confidence interval (CI) for parameter θ
◮ α-level hypothesis test about θ
Three types of intervals:
◮ two-tailed CI – Pr(LB(X) < θ < UB(X)) = 1 − α
◮ one-tailed (right-tailed) CI – Pr(θ < UB∗(X)) = 1 − α
◮ one-tailed (left-tailed) CI – Pr(LB∗(X) < θ) = 1 − α
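To make the three interval types concrete, here is a minimal R sketch for the mean of N(µ, σ²) with σ known; the simulated sample and the values of µ, σ and α are illustrative assumptions, not taken from the lecture.

set.seed(1)                                  # hypothetical sample
sigma <- 2; n <- 25; alpha <- 0.05
x <- rnorm(n, mean = 10, sd = sigma)
xbar <- mean(x); se <- sigma/sqrt(n)
xbar + c(-1, 1)*qnorm(1 - alpha/2)*se        # two-tailed CI: (LB, UB)
c(-Inf, xbar + qnorm(1 - alpha)*se)          # right-tailed: Pr(theta < UB*) = 1 - alpha
c(xbar - qnorm(1 - alpha)*se, Inf)           # left-tailed: Pr(LB* < theta) = 1 - alpha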
Testing of Statistical Hypotheses
Acceptance region
Definition (Acceptance region of H0)
Let X be a random variable with a certain distribution (probabilistic model) dependent on a parameter θ ∈ Θ, and let g(θ) be a parametric function. We test the null hypothesis H01 : g(θ) = g(θ0) against the two-sided alternative H11 : g(θ) ≠ g(θ0). Let (LB, UB) be an interval estimate of the parametric function g(θ) with coverage probability 1 − α. Then AIS,1 = {LB, UB; g(θ0) ∈ (LB, UB)} is the acceptance region of the test of H01 against H11 at significance level α. If we test H02 : g(θ) ≤ g(θ0) against the one-sided (right) alternative H12 : g(θ) > g(θ0), and if LB∗ is a lower estimate of g(θ) with coverage probability 1 − α, then AIS,2 = {LB∗; LB∗ < g(θ0)} is the acceptance region of the test of H02 against H12 at significance level α. If we test H03 : g(θ) ≥ g(θ0) against the one-sided (left) alternative H13 : g(θ) < g(θ0), and if UB∗ is an upper estimate of g(θ) with coverage probability 1 − α, then AIS,3 = {UB∗; UB∗ > g(θ0)} is the acceptance region of the test of H03 against H13 at significance level α.

Testing of Statistical Hypotheses
Rejection region
Definition (Rejection (critical) region of H0)
Let X be a random variable with a certain distribution (probabilistic model) dependent on a parameter θ ∈ Θ, and let g(θ) be a parametric function. We test the null hypothesis H01 : g(θ) = g(θ0) against the two-sided alternative H11 : g(θ) ≠ g(θ0). Let (LB, UB) be an interval estimate of the parametric function g(θ) with coverage probability 1 − α. Then WIS,1 = {LB, UB; g(θ0) ∉ (LB, UB)} is the critical region of the test of H01 against H11 at significance level α. If we test H02 : g(θ) ≤ g(θ0) against the one-sided (right) alternative H12 : g(θ) > g(θ0), and if LB∗ is a lower estimate of g(θ) with coverage probability 1 − α, then WIS,2 = {LB∗; LB∗ ≥ g(θ0)} is the critical region of the test of H02 against H12 at significance level α. If we test H03 : g(θ) ≥ g(θ0) against the one-sided (left) alternative H13 : g(θ) < g(θ0), and if UB∗ is an upper estimate of g(θ) with coverage probability 1 − α, then WIS,3 = {UB∗; UB∗ ≤ g(θ0)} is the critical region of the test of H03 against H13 at significance level α.

Testing of Statistical Hypotheses
Test criterion
Definition (Test criterion)
A test criterion is a test statistic T0 = T0(X1, X2, . . . , Xn) with a known (asymptotic) distribution if H0 is true. The set of possible values of T0 is divided into two subsets, the acceptance region of H0 (notation A) and the critical region of H0 (notation W). These two regions are separated by the critical values tα/2 and t1−α/2, resp. tα and t1−α (for the particular H0 and H1), of the distribution of the test statistic T0 (if H0 is true).
Definition (Confidence interval)
A confidence interval (CI) is a type of interval estimate of a population parameter θ. It is an observed, often called empirical, interval (i.e. it is calculated from the observations) that should include the value of the unobservable parameter θ if the experiment is repeated. The frequency with which the observed interval contains the parameter is determined by the confidence coefficient 1 − α (i.e. confidence level, coverage probability).
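The acceptance regions above reduce to checking whether g(θ0) lies inside the interval estimate. The helper below is a hypothetical illustration of AIS,1 in R; the function name and arguments are ours, not from the lecture.

ci.test <- function(LB, UB, g.theta0) {
  # A_IS,1: do not reject H0 iff g(theta0) lies inside (LB, UB)
  if (LB < g.theta0 && g.theta0 < UB) "do not reject H0" else "reject H0"
}
ci.test(LB = 9.2, UB = 10.8, g.theta0 = 10)  # "do not reject H0"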
Testing of Statistical Hypotheses
To carry out a hypothesis test
Step 1 define the null and alternative hypotheses (H0 and H1)
Step 2 decide on a significance level α = 0.1, 0.05, 0.01
Step 3 calculate the test statistic (test criterion) T0
Step 4 determine the critical value(s)
Step 5 decide on the outcome of the test (reject/don't reject H0) in one of the following ways:
◮ based on the critical region W = WT (observed test statistic t0 = tobs and critical values tα/2 and t1−α/2, resp. tα and t1−α),
◮ based on the critical region WIS, i.e. the empirical confidence interval (and g(θ0)),
◮ based on the p-value.
Step 6 state the conclusion in words

Testing of Statistical Hypotheses
To carry out a hypothesis test – based on test statistic and critical value
Definition (Testing based on critical region W)
Rejecting H0. If the observed test statistic (realisation of the test statistic) t0 of the test statistic T0 falls within the critical region W (equivalently, it is not in the acceptance region A), H0 is rejected at significance level α, i.e. we have sufficient evidence to reject H0.
Not rejecting H0. If the observed test statistic t0 of the test statistic T0 falls within the acceptance region A (equivalently, it is not in the critical region W), H0 is not rejected at significance level α, i.e. we do not have sufficient evidence to reject H0.
Let tmin be the smallest possible value of the test criterion T0 and tmax the largest possible value of the test criterion T0, with tα denoting the upper-tail critical value of T0; then
1. two-sided alternative – critical region W1 = (tmin, t1−α/2) ∪ (tα/2, tmax),
2. one-sided (right) alternative – critical region W2 = (tα, tmax),
3. one-sided (left) alternative – critical region W3 = (tmin, t1−α).

Testing of Statistical Hypotheses
To carry out a hypothesis test – based on CI
Definition (Testing based on CI)
Rejecting H0: If g(θ0) is not within the CI, H0 is rejected at the significance level α, i.e. we have sufficient evidence to reject H0.
Not rejecting H0: If g(θ0) is within the CI, H0 is not rejected at the significance level α, i.e. we do not have sufficient evidence to reject H0.
Relationship of confidence interval and statistical test:
◮ hypothesis testing ≡ CIs
◮ α-level hypothesis test ≡ 100 × (1 − α)% CI
◮ one-tail test ≡ one-sided CI (left-sided CI ≡ right-sided alternative, right-sided CI ≡ left-sided alternative)
◮ two-tail test ≡ two-sided CI
◮ parameter(s) ∈ CI ≡ don't reject H0
◮ parameter(s) ∉ CI ≡ reject H0

Testing of Statistical Hypotheses
To carry out a hypothesis test – based on p-value (observed significance level)
Definition (Testing based on p-value)
The minimal significance level α (for some test statistic T0) at which H02 : g(θ) ≤ g(θ0) is rejected (tested against H12 : g(θ) > g(θ0)) is called the observed significance level or p-value, i.e.
p-value = αobs = sup_{θ∈Θ0} Pr(T(X1, X2, . . . , Xn) ≥ T(x1, x2, . . . , xn); θ).
This can be written less formally as
p-value = Pr(a test statistic equal to or greater than the observed one | H0 is true).
The closer αobs is to zero, the smaller is the probability that the test statistic T(X1, X2, . . . , Xn) produces (under H0) a value equal to or more extreme than the observed one, while this probability is higher under H1. Therefore, the p-value can be understood as an indicator of the credibility of H0.
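As a worked sketch of Steps 1–6, consider a one-sample test of H0 : µ = µ0 against a two-sided alternative for N(µ, σ²) with σ known, carried out in all three equivalent ways; the data and the value µ0 = 10 are hypothetical, not from the lecture.

set.seed(1)
n <- 25; sigma <- 2; mu0 <- 10; alpha <- 0.05   # Steps 1-2: H0: mu = mu0, level alpha
x <- rnorm(n, mean = 10.9, sd = sigma)          # hypothetical sample
t0 <- (mean(x) - mu0)/(sigma/sqrt(n))           # Step 3: test statistic
crit <- qnorm(1 - alpha/2)                      # Step 4: critical value
abs(t0) > crit                                  # Step 5a: TRUE -> t0 in W, reject H0
mean(x) + c(-1, 1)*crit*sigma/sqrt(n)           # Step 5b: reject if mu0 outside the CI
2*min(pnorm(t0), 1 - pnorm(t0))                 # Step 5c: reject if p-value < alpha

All three decision rules necessarily agree.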
Testing of Statistical Hypotheses
To carry out a hypothesis test – based on p-value (observed significance level)
◮ Usually, if αobs < α = 0.05, there is sufficient evidence to reject H0 and the result of the test is statistically significant.
◮ If αobs > α = 0.1, there is not sufficient evidence to reject H0 and the result of the test is not statistically significant.
◮ Values between 0.05 and 0.1 should be taken as reference points in a broad sense. As αobs gets closer to either boundary point of the interval ⟨0.05, 0.1), this is taken as increasing evidence for one or the other alternative.
◮ Situations with αobs ∈ ⟨0.05, 0.1) are usually the most difficult to handle, and the result is then marginally statistically significant.

Testing of Statistical Hypotheses
To carry out a hypothesis test – based on p-value (observed significance level)
Wording of the results of a statistical test:
range for p-value    stars of significance    wording of the result
⟨0, 0.001)           ***                      extremely highly statistically significant
⟨0.001, 0.01)        **                       highly statistically significant
⟨0.01, 0.05)         *                        statistically significant
⟨0.05, 0.1)          ·                        marginally statistically significant
⟨0.1, 1⟩                                      non-significant

Testing of Statistical Hypotheses
To carry out a hypothesis test – based on p-value (observed significance level)
Interpretation of p-values:
◮ p-value < 0.001: the prevalence of the estimated effect is smaller than one to one thousand (the odds of the estimated effect are smaller than 1 : 999) if the effect is not present in the population (the presence of such an effect is highly improbable if the effect is not present in the population – and – the presence of such an effect is highly probable if the effect is present in the population)
◮ p-value < 0.01: the prevalence of the estimated effect is smaller than one to one hundred (the odds of the estimated effect are smaller than 1 : 99) if the effect is not present in the population (the presence of such an effect is very improbable if the effect is not present in the population – and – the presence of such an effect is very probable if the effect is present in the population)
◮ p-value < 0.05: the prevalence of the estimated effect is smaller than five to one hundred (the odds of the estimated effect are smaller than 5 : 95, i.e. 1 : 19) if the effect is not present in the population (the presence of such an effect is sufficiently improbable if the effect is not present in the population – and – the presence of such an effect is sufficiently probable if the effect is present in the population)
◮ p-value ≥ 0.05: the prevalence of the estimated effect is five to one hundred or greater (5% or more)
◮ p-value = k, k ∈ ⟨0.05, 1⟩: the prevalence of the estimated effect is 100 × k to one hundred (100 × k % or more)

Testing of Statistical Hypotheses
To carry out a hypothesis test – based on p-value (observed significance level)
How is the p-value (mostly) calculated?
1. two-sided alternative –
p-value = 2 min(Pr(T0 ≤ t0 | H0), Pr(T0 ≥ t0 | H0)),
e.g. for the normal and Student distribution of the test statistic (symmetric distributions) and for the χ²_df and F_{df1,df2} distribution of the test statistic (asymmetric distributions), or
p-value = min(Pr(T0 ≤ t0 | H0), Pr(T0 ≥ t0 | H0)),
e.g. for the χ²_df and F_{df1,df2} distribution of the test statistic (asymmetric distributions)
2. one-sided (right) alternative – p-value = Pr(T0 ≥ t0 | H0)
3. one-sided (left) alternative – p-value = Pr(T0 ≤ t0 | H0)
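These formulas translate directly into R. The sketch below assumes a test statistic with a χ² null distribution with df degrees of freedom (the asymmetric case); the values of t0 and df are hypothetical.

t0 <- 7.3; df <- 3                            # hypothetical observed statistic
2*min(pchisq(t0, df), 1 - pchisq(t0, df))     # two-sided alternative, 2*min(...) version
min(pchisq(t0, df), 1 - pchisq(t0, df))       # two-sided alternative, min(...) version
1 - pchisq(t0, df)                            # one-sided (right) alternative
pchisq(t0, df)                                # one-sided (left) alternative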
Testing of Statistical Hypotheses
On a philosophical level
◮ distinction between 'rejecting H0' and 'accepting H1'
◮ 'rejecting H0' – implies nothing about which state the experimenter is accepting, only that the state defined by H0 is being rejected
◮ distinction between 'accepting H0' and 'not rejecting H0'
◮ 'accepting H0' – the experimenter is willing to assert the state of nature specified by H0
◮ 'not rejecting H0' – the experimenter really does not believe H0 but does not have the evidence to reject it

Testing of Statistical Hypotheses
Conservative and liberal test and CI
Definition (Conservative and liberal test)
A test whose actual/observed significance level is smaller than the nominal significance level α is called conservative (the test should theoretically reject H0 "quickly", but in reality the opposite holds, i.e. the test rejects "slowly"). A test whose actual/observed significance level is greater than the nominal significance level α is called liberal (the test should theoretically reject H0 "slowly", but in reality the opposite holds, i.e. the test rejects "quickly").
Definition (Conservative and liberal CI)
A CI whose actual/real coverage probability is greater than the nominal coverage probability 1 − α is called conservative (i.e. the probability that θ0 is within the CI is greater than expected). A CI whose actual/real coverage probability is smaller than the nominal coverage probability 1 − α is called liberal (i.e. the probability that θ0 is within the CI is smaller than expected).

Testing of Statistical Hypotheses
Likelihood ratio – generalised relative likelihood
Two types of hypotheses:
1. simple hypothesis – H0 : θ = θ0 against H1 : θ ≠ θ0; the simple likelihood ratio is equal to
λ(x) = L(θ0|x) / sup_{θ∈Θ} L(θ|x) = L(θ0|x) / L(θ̂|x),
where λ(X) is the test statistic, θ̂ is the maximum likelihood estimate of θ, and L(θ|x) is continuous for all x.
2. composite hypothesis – H0 : θ ∈ Θ0 against H1 : θ ∈ Θ1; the generalised likelihood ratio is equal to
λ(x) = sup_{θ∈Θ0} L(θ|x) / sup_{θ∈Θ} L(θ|x).

Testing of Statistical Hypotheses
Likelihood ratio test statistic
The subsets of Θ, Θ0 and Θ1, remain the same after a monotone transformation of λ(x), i.e. the statistical tests before and after the transformation are equivalent. Therefore, the likelihood ratio test statistic is equal to
ULR = −2 ln λ(X).
Its realisation, the observed likelihood ratio test statistic, is equal to uLR = −2 ln λ(x), where uLR ∈ ⟨0, ∞).

Testing of Statistical Hypotheses
Three test statistics
Geometrical interpretation:
1. ULR – measures the properly standardised difference between the log-likelihoods at θ̂ and θ0 (i.e. in the direction of the y axis)
2. UW – measures the properly standardised absolute value of the difference between θ̂ and θ0 (in the direction of the x axis)
3. US – measures the properly standardised slope of the log-likelihood at θ0
Example (normal distribution)
Let X ∼ N(µ, σ²), where σ² is known, and test H0 : θ = θ0 against H1 : θ ≠ θ0, where θ = µ. Then
1. ULR = −2(l(θ0|X) − l(θ̂|X)) = −Σ_{i=1}^n (Xi − X̄)²/σ² + Σ_{i=1}^n (Xi − µ0)²/σ² = n(X̄ − µ0)²/σ²,
2. UW = (X̄ − µ0)² I(µ̂) = n(X̄ − µ0)²/σ²,
3. US = (S(µ0))²/I(µ0) = (n(X̄ − µ0)/σ²)²/(n/σ²) = n(X̄ − µ0)²/σ².
All three test statistics are equal, i.e. ULR = UW = US.
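The equality ULR = UW = US in the normal example can be checked numerically with the R sketch below; the sample and the value µ0 = 10 are hypothetical.

set.seed(1)
sigma <- 2; mu0 <- 10                          # hypothetical known sigma and H0 value
x <- rnorm(25, mean = 10.8, sd = sigma); n <- length(x)
ULR <- -2*(sum(dnorm(x, mu0, sigma, log = TRUE)) -
           sum(dnorm(x, mean(x), sigma, log = TRUE)))  # -2(l(mu0|X) - l(muhat|X))
UW <- (mean(x) - mu0)^2 * (n/sigma^2)                  # (muhat - mu0)^2 I(muhat)
US <- (n*(mean(x) - mu0)/sigma^2)^2 / (n/sigma^2)      # S(mu0)^2 / I(mu0)
c(ULR, UW, US)                                         # all three coincide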
Testing of Statistical Hypotheses
Three test statistics
If θ is a scalar, the three test statistics are defined as (∼D denoting the asymptotic distribution if H0 is true):
1. ULR = −2(l(θ0|X) − l(θ̂|X)) ∼D χ²₁,
2. UW = (θ̂ − θ0)² I(θ̂) ∼D χ²₁ and, equivalently, UW^{1/2} = ZW ∼D N(0, 1),
3. US = (S(θ0))²/I(θ0) ∼D χ²₁ and, equivalently, US^{1/2} = ZS ∼D N(0, 1).
If θ is a vector, the three test statistics are defined as:
1. ULR = −2(l(θ0|X) − l(θ̂|X)) ∼D χ²_k,
2. UW = (θ̂ − θ0)ᵀ I(θ̂)(θ̂ − θ0) ∼D χ²_k,
3. US = (S(θ0))ᵀ (I(θ0))⁻¹ S(θ0) ∼D χ²_k.

Testing of Statistical Hypotheses
Three test statistics and related confidence intervals
If θ is a scalar, three confidence intervals are defined as follows:
1. the likelihood ratio empirical (1 − α) × 100% CI for θ is defined as
CS_{1−α} = {θ : ULR(θ) < χ²₁(α)}, where ULR(θ) = −2 ln [L(θ|x)/L(θ̂|x)],
2. the Wald empirical (1 − α) × 100% CI for θ is defined based on the pivot (pivotal statistic) Tpiv = UW(θ),
3. the score empirical (1 − α) × 100% CI for θ is defined based on the pivot Tpiv = US(θ).
If θ is a vector, the CIs can be generalised to a confidence set CS_{1−α}.
◮ If k = 2, CS_{1−α} is a confidence ellipse.
◮ If k > 2, CS_{1−α} is a confidence ellipsoid.
Additionally, if k = 1, CS_{1−α} is a confidence interval.

Testing of Statistical Hypotheses
Confidence intervals
The Wald empirical (1 − α) × 100% CI for θ is defined as
(d, h) = (θ̂ − tα/2 SE[θ̂], θ̂ + tα/2 SE[θ̂]),
where the critical value tα/2 depends on the choice of θ (i.e. on the distribution of θ̂).
The likelihood ratio empirical (1 − α) × 100% CI for θ is defined by its lower and upper bounds as k% cut-offs of the standardised relative log-likelihood as follows:
Pr(L(θ|x)/L(θ̂|x) > cα) = Pr(−2 ln [L(θ|x)/L(θ̂|x)] < −2 ln cα) = 1 − α,
where cα = e^{−χ²₁(α)/2}. Then
◮ if 1 − α = 0.95, then cα = 0.1465001 ≈ 0.15 (15% cut-off),
◮ if 1 − α = 0.90, then cα = 0.2585227 ≈ 0.26 (26% cut-off),
◮ if 1 − α = 0.99, then cα = 0.0362452 ≈ 0.04 (4% cut-off).

Testing of Statistical Hypotheses
Likelihood confidence intervals – bisection method
Bisection method
Let θ01, θ02 ∈ ⟨θL, θU⟩ with f(θ01)f(θ02) < 0, where f(·) is continuous with at least one root within the interval ⟨θ01, θ02⟩ and
f(θ) = −2 ln [L(θ|x)/L(θ̂|x)] − χ²₁(α) = 0.
If the first derivative of f(·) has constant sign, then exactly one root θ∗ ∈ ⟨θ01, θ02⟩ of f(θ) = 0 exists. The iterative process is defined as follows:
1. initialisation step – starting point θ⁽⁰⁾ = (θ01 + θ02)/2 and i = 1,
2. updating equations – the substitution of the boundaries θ01 and θ02 is defined as
⟨θi1, θi2⟩ = ⟨θ(i−1),1, θ⁽ⁱ⁻¹⁾⟩ if f(θ(i−1),1)f(θ⁽ⁱ⁻¹⁾) < 0, and ⟨θi1, θi2⟩ = ⟨θ⁽ⁱ⁻¹⁾, θ(i−1),2⟩ if f(θ(i−1),1)f(θ⁽ⁱ⁻¹⁾) > 0;
if f(θ⁽ⁱ⁻¹⁾) = 0, the process ends; otherwise set θ⁽ⁱ⁾ = (θi1 + θi2)/2, increase i by one, and repeat step 2 until convergence.

Testing of Statistical Hypotheses
Likelihood confidence intervals – Brent-Dekker method
Example (Brent-Dekker method)
Let X ∼ Bin(N, p), where N = 10 and n = x = 8. Estimate the boundaries of the empirical 100 × (1 − α)% CI for (1) p and (2) the odds p/(1 − p). The empirical CIs are of two types, (A) likelihood and (B) Wald. Draw the log-likelihood function and its quadratic approximation with the lower and upper boundaries of the CIs.
Solution (partial)
Wald empirical 100 × (1 − α)% CI for p: p̂ = 8/10 = 0.8; SE[p̂] = √(p̂(1 − p̂)/N) = 0.13;
(d, h) = (p̂ − uα/2 SE[p̂], p̂ + uα/2 SE[p̂]) = (0.55, 1.05).
Likelihood empirical 100 × (1 − α)% CI for p:
CS_{1−α} = {p : −2 ln [L(p|x)/L(p̂|x)] ≤ 3.84}, where (d, h) = (0.50, 0.96).
Wald empirical 100 × (1 − α)% CI for g(p): g(p̂) = ln [p̂/(1 − p̂)] = ln(0.8/0.2) = 1.39;
∂g(p)/∂p = 1/p + 1/(1 − p); SE[g(p̂)] = SE[p̂](1/p̂ + 1/(1 − p̂)) = √(p̂(1 − p̂)/N)(1/p̂ + 1/(1 − p̂)) = √(1/n + 1/(N − n)) = 0.79.
Then (dg, hg) = (−0.16, 2.94) and, back-transformed, (d, h) = (0.46, 0.95).
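A minimal R sketch of the bisection iteration applied to this binomial example (x = 8, N = 10), searching for the lower boundary of the likelihood CI for p; the starting bracket ⟨0.4, p̂⟩ and the tolerance are our choices, not from the lecture.

x <- 8; N <- 10; phat <- x/N
f <- function(p)                              # f(p) = -2 ln[L(p|x)/L(phat|x)] - chi^2_1(alpha)
  -2*(dbinom(x, N, p, log = TRUE) - dbinom(x, N, phat, log = TRUE)) - qchisq(0.95, df = 1)
th1 <- 0.4; th2 <- phat                       # f(th1) > 0, f(th2) < 0: one root inside
for (i in 1:100) {
  mid <- (th1 + th2)/2
  if (f(th1)*f(mid) < 0) th2 <- mid else th1 <- mid  # keep the sign-changing half
  if (th2 - th1 < 1e-8) break
}
(th1 + th2)/2                                 # lower boundary, approx. 0.5010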
Testing of Statistical Hypotheses
Likelihood confidence intervals – Brent-Dekker method
x <- 8; N <- 10
probs <- seq(0.4, 0.99, length = 1000)          # grid of p values
like <- dbinom(x, N, probs)                     # likelihood L(p|x)
rellike <- like/max(like)                       # relative likelihood L(p|x)/L(phat|x)
relloglike <- -2*log(rellike)                   # -2 ln relative likelihood
cutoff <- exp(-1/2*qchisq(0.95, df = 1))        # 0.1465001
like.CI.p <- range(probs[rellike > cutoff])     # 0.5009910 0.9634234
cutoff <- qchisq(0.95, df = 1)                  # 3.841459
like.CI.p <- range(probs[relloglike < cutoff])  # 0.5009910 0.9634234
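The code above locates the boundaries by grid search. The Brent-Dekker method itself can be sketched with base R's uniroot(), which implements Brent's root-finding algorithm; the bracketing intervals below are our choices.

x <- 8; N <- 10; phat <- x/N
f <- function(p)                              # zero at the likelihood CI boundaries
  -2*(dbinom(x, N, p, log = TRUE) - dbinom(x, N, phat, log = TRUE)) - qchisq(0.95, df = 1)
uniroot(f, c(0.40, phat))$root                # lower boundary, approx. 0.5010
uniroot(f, c(phat, 0.999))$root               # upper boundary, approx. 0.9634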