Filters in Image Processing
Image Transforms (I)
David Svoboda
email: svoboda@fi.muni.cz
Centre for Biomedical Image Analysis
Faculty of Informatics, Masaryk University, Brno, CZ
October 4, 2019
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 1 / 49
Outline
1 Revision
Algebra
Statistics
2 Motivation
3 PCA-based Transform
Design
An example
4 Discrete Cosine Transform (DCT)
1D DCT
1D FastDCT
2D DCT
5 Walsh-Hadamard Transform
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 2 / 49
1 Revision
Algebra
Statistics
2 Motivation
3 PCA-based Transform
Design
An example
4 Discrete Cosine Transform (DCT)
1D DCT
1D FastDCT
2D DCT
5 Walsh-Hadamard Transform
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 3 / 49
Revision
Algebra
Deﬁnition:
Let C ⊂ Rn×n be a square matrix. Vector v ∈ Cn is an eigenvector of
C ⇔ ∃λ ∈ C : Cv = λv. λ is called an eigenvalue.
Properties of eigenvalues:
Cv = λv equals to (C − λE)v = 0
The solution of equation |C − λE| = 0 is identical to the search for roots in
the polynomial of degree n.
if C is symmetric then all the eigenvalues are real
Properties of eigenvectors:
the two eigenvectors corresponding to two diﬀerent eigenvalues are
orthogonal
if C is symmetric then all the eigenvectors are real and orthogonal
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 4 / 49
Revision
Algebra
Properties of symmetric matrices:
if C ⊂ Rn×n : C = CT with its eigenvectors e1, e2, . . . , en and
eigenvalues λ1, λ2, . . . , λn then ∃A ⊂ Rn×n such that AT CA = D,
where
D =







λ1 0 0 . . . 0
0 λ2 0 . . . 0
0 0 λ3 . . . 0
...
...
...
...
...
0 0 0 . . . λn







and A =







e1
e2
e3
...
en







David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 5 / 49
Revision
Statistics
Cross-correlation:
given two 1D signals f (i) and g(i):
(f g)(i) =
k
f (i + k)g(k)
it is a measure of similarity of two signals
Auto-correlation:
given 1D signal f (i):
(f f )(i) =
k
f (i + k)f (k)
it is a measure for ﬁnding repeating patterns
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 6 / 49
Revision
Statistics
Auto-correlation example
0 500 1000
0
20
40
60
80
100
Highly correlated signal
0 500 1000 1500 2000
0
1
2
3
4
5
x 10
6 Auto−correlation
0 500 1000
0
20
40
60
80
100
Less correlated signal
0 500 1000 1500 2000
0
1
2
3
4
5
x 10
6 Auto−correlation
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 7 / 49
Revision
Statistics
Auto-correlation example
Highly correlated signal
50 100 150 200 250
50
100
150
200
250
Auto−correlation
100 200 300 400 500
100
200
300
400
500
Less correlated signal
50 100 150 200 250
50
100
150
200
250
Auto−correlation
100 200 300 400 500
100
200
300
400
500
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 8 / 49
Revision
Statistics
Highly correlated signals
contain repeating patterns
energy (nonzero values) is stretched over the whole space
this is what we usually have
Decorrelated signals
energy (nonzero values) is compacted in one location
easy to compress
this is what we wish to have
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 9 / 49
Revision
Statistics
Covariance
Covariance exhibits how much two signals f and g (let us assume
|f | = |g| = n) vary from the mean with respect to each other:
cov(f , g) =
n
i=1
(fi − f )(gi − g)
n − 1
Covariance Matrix – a matrix of covariances between two signals f
and g:
C =
cov(f , f ) cov(f , g)
cov(g, f ) cov(g, g)
Matrix C is always real and symmetric.
Notice: Two signals f and g are decorrelated iﬀ
cov(f , g) = cov(g, f ) = 0, i.e. when the matrix C is diagonal.
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 10 / 49
1 Revision
Algebra
Statistics
2 Motivation
3 PCA-based Transform
Design
An example
4 Discrete Cosine Transform (DCT)
1D DCT
1D FastDCT
2D DCT
5 Walsh-Hadamard Transform
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 11 / 49
Motivation
The most common linear transforms
sinusoidal transforms
DFT (discrete Fourier transform)
DCT (discrete cosine transform)
DST (discrete sine transform)
rectangular wave transforms
Walsh-Hadamard transform
Haar transform
variable basis
Discrete wavelet transform
eigenvector-based transforms
Karhunen-Loeve transform
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 12 / 49
Motivation
Energy compaction
Evaluate DFT over some short signal:
f = [1 3 4 2]
⇓ /DFT/
F =
1
√
4
[10 (−3 − i) 0 (−3 + i)]
Measure the energy of the signals:
E(f ) = f (i)
2
= 1
2
+ 3
2
+ 4
2
+ 2
2
= 30
E(F) = F(i)
2
= 25 + 10/4 + 10/4 = 30
The ﬁrst component in f accounts for 3.3% of energy while the ﬁrst
component in F accounts for 83.3% of energy.
Conclusion: Ability of energy compaction ≈ optimality of the transform
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 13 / 49
Motivation
Why do we need image transforms?
manipulation with data in another domain might be simpler
example of use: low-pass, high-pass ﬁltering
data decorrelation ≈ co-variance removal ≈ energy compaction
example of use: image compression
Which transform properties are the most important?
speed
simple to implement
Notice: The requirements suggest to use linear transforms, i.e. the input
signal f is transformed into the signal F by:
F = Af ,
where A is a transform matrix.
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 14 / 49
1 Revision
Algebra
Statistics
2 Motivation
3 PCA-based Transform
Design
An example
4 Discrete Cosine Transform (DCT)
1D DCT
1D FastDCT
2D DCT
5 Walsh-Hadamard Transform
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 15 / 49
PCA-based Transform
Design of transform matrix A
Task to solve
Given an input discrete signal, design a transformation matrix such that
the transformed signal has decorrelated samples (they are mutually
independent).
Solution
Given a 2D image, let us break it up into k blocks of n pixels each.
1 2 3
4
k
...
...
Each block i is characterized with its vector b(i), i = 1, 2, . . . , k where
length(b(i)) = n.
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 16 / 49
PCA-based Transform
Design of transform matrix A
input input expected output
(correlated) (mean centered) (decorrelated)
b(i)
v(i)
= b(i)
− b w(i)
= Av(i)
, i = 1, 2, . . . , k
V = v(1)
, v(2)
, . . . , v(k)
W = AV = w(1)
, w(2)
, . . . , w(k)
CV = V · V T
CW = W · W T
CV . . . real, symmetric CW . . . diagonal
CW = W · W T
= (AV ) · (AV )T
= A(V · V T
)AT
= A · CV · AT
Notice:
the oﬀ-diagonal elements of covariance matrix CV are the covariances of the
v(i)
vectors . . . (V · V T
)ab =
k
i=1 v
(i)
a v
(i)
b
the oﬀ-diagonal elements of covariance matrix CW are zero
the eigenvectors of CV form the rows of a new matrix A
CW is a diagonal matrix formed of the eigenvalues of CV
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 17 / 49
PCA-based Transform
Algorithm:
1 collect the input data b(i)
2 form the matrix V
3 calculate the covariance matrix CV
4 ﬁnd eigenvectors and eigenvalues of CV
5 use eigenvectors to form the transform matrix A
6 use A as a transform matrix to original data
7 get transformed decorrelated data: w(i) = A(b(i) − b)
Notice: Red lines ≡ Principal Component Analysis (PCA).
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 18 / 49
PCA-based Transform
An example
Let us submit the following 2D image to the PCA
2.5 2.4 0.5 0.7
2.2 2.9 1.9 2.2
3.1 3.0 2.3 2.7
2.0 1.6 1.0 1.1
1.5 1.6 1.1 0.9
Notice: Neighbouring pixels have usually similar value.
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 19 / 49
PCA-based Transform
An example
We have chosen n = 2, k = 10 and fetched k vectors b(i), i = 1, 2, . . . , k in
the following manner:
X Y
b(1) 2.5 2.4
b(2) 0.5 0.7
b(3) 2.2 2.9
b(4) 1.9 2.2
b(5) 3.1 3.0
b(6) 2.3 2.7
b(7) 2.0 1.6
b(8) 1.0 1.1
b(9) 1.5 1.6
b(10) 1.1 0.9
Correlation of the neighbouring intensities:
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 20 / 49
PCA-based Transform
An example
Modify the input datasets:
X Y X − X Y − Y
b(1) 2.5 2.4 v(1) 0.69 0.49
b(2) 0.5 0.7 v(2) -1.31 -1.21
b(3) 2.2 2.9 v(3) 0.39 0.99
b(4) 1.9 2.2 v(4) 0.09 0.29
b(5) 3.1 3.0 v(5) 1.29 1.09
b(6) 2.3 2.7 v(6) 0.49 0.79
b(7) 2.0 1.6 v(7) 0.19 -0.31
b(8) 1.0 1.1 v(8) -0.81 -0.81
b(9) 1.5 1.6 v(9) -0.31 -0.31
b(10) 1.1 0.9 v(10) -0.71 -1.01
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 21 / 49
PCA-based Transform
An example
Evaluate all the covariances CV (X, X), CV (X, Y ), CV (Y , X), and
CV (Y , Y ) and form the covariance matrix CV :
CV = V · V T
=
0.617 0.615
0.615 0.717
Since the covariance matrix is square, we can calculate the eigenvectors
and eigenvalues for this matrix:
eigenvalues
CW = W · W T
=
0.049 0
0 1.284
eigenvectors
A =
−0.735 0.678
−0.678 −0.735
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 22 / 49
PCA-based Transform
An example
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 23 / 49
PCA-based Transform
An example
The following statements are equal:
the eigenvalue λj is the highest one
the eigenvector ej is more dominant
the energy (the majority in information) is gathered in element
w
(i)
j , i = 1, 2, . . . , k
the element w
(i)
j , i = 1, 2, . . . , k has the greatest variance
The following statements are also equal:
the eigenvalue λl is the lowest one
the eigenvector el is practically useless
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 24 / 49
PCA-based Transform
An example
Transformed (decorrelated) data
X Y
w(1) -0.827 -0.175
w(2) 1.777 0.142
w(3) -0.992 0.384
w(4) -0.274 0.130
w(5) -1.675 -0.209
w(6) -0.912 0.175
w(7) 0.099 -0.349
w(8) 1.144 0.046
w(9) 0.438 0.017
w(10) 1.223 -0.162
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 25 / 49
PCA-based Transform
An example
Clear less important component
X Y
w(1) -0.827 0.000
w(2) 1.777 0.000
w(3) -0.992 0.000
w(4) -0.274 0.000
w(5) -1.675 0.000
w(6) -0.912 0.000
w(7) 0.099 0.000
w(8) 1.144 0.000
w(9) 0.438 0.000
w(10) 1.223 0.000
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 26 / 49
PCA-based Transform
An example
Transformed back with A−1
oldX oldY newX newY
b(1) 2.5 2.4 2.42 2.47
b(2) 0.5 0.7 0.50 0.71
b(3) 2.2 2.9 2.54 2.58
b(4) 1.9 2.2 2.01 2.09
b(5) 3.1 3.0 3.04 3.05
b(6) 2.3 2.7 2.48 2.53
b(7) 2.0 1.6 1.74 1.84
b(8) 1.0 1.1 0.97 1.13
b(9) 1.5 1.6 1.49 1.61
b(10) 1.1 0.9 0.91 1.08
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 27 / 49
PCA-based Transform
Properties:
optimal decorrelation method (see the example)
too heavy for evaluation (searching for eigenvalues and eigenvectors)
matrix A is data dependent (cannot be precomputed)
all the eigenvectors must be kept for inverse transform
Conclusion: rather theoretical method
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 28 / 49
1 Revision
Algebra
Statistics
2 Motivation
3 PCA-based Transform
Design
An example
4 Discrete Cosine Transform (DCT)
1D DCT
1D FastDCT
2D DCT
5 Walsh-Hadamard Transform
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 29 / 49
Discrete Cosine Transform (DCT)
DFT → DCT
Let f = [f (0) f (1) . . . f (N − 1)] be a discrete signal.
Let us modify it as follows:
f (0) f (1) f (2) . . . f (N − 1)
⇓ mirror
f (N − 1) . . . f (1) f (0) f (0) f (1) f (2) . . . f (N − 1)
⇓ insert zeros
f (N − 1) 0 . . . 0 f (1) 0 f (0) 0 f (0) 0 f (1) 0 f (2) . . . 0 . . . f (N − 1) 0
⇓ DFT domain is periodical
0 f (0) 0 f (1) 0 f (2) . . . 0 . . . f (N − 1) 0 f (N − 1) 0 . . . 0 f (1) 0 f (0)
⇓ name substitution
c(0) c(1) c(2) c(3) c(4) . . . c(4N − 1)
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 30 / 49
Discrete Cosine Transform (DCT)
DFT → DCT
The relationship between signal ’c’ and ’f ’:
c(2n) = 0 iﬀ 0 ≤ n < N
c(2n + 1) = f (n) iﬀ 0 ≤ n < N
c(4N − n) = c(n) iﬀ 0 < n < 2N
The basic properties of signal ’c’:
c is even (ready for DFT working over real even data)
|c| = 4N
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 31 / 49
Discrete Cosine Transform (DCT)
DFT → DCT
Let us apply DFT to c:
C(k) =
4N−1
j=0
c(j)e−2πijk
4N = /symmetry/ =
2N−1
j=0
c(j) e−2πijk
4N + e−
2πi(4N−j)k
4N
=
2N−1
j=0
c(j) e−2πijk
4N + e
2πijk
4N /Euler-Moivre eq./
=
2N−1
j=0
c(j) cos
2πjk
4N
− i sin
2πjk
4N
+ cos
2πjk
4N
+ i sin
2πjk
4N
=
2N−1
j=0
c(j) 2 cos
2πjk
4N
= . . .
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 32 / 49
Discrete Cosine Transform (DCT)
DFT → DCT
C(k) =
2N−1
j=0
c(j) 2 cos
2πjk
4N
/separating odd and even items/
=
N−1
l=0
2c(2l) cos
2π2lk
4N
/j=2l, l ∈ N/
+
N−1
l=0
2c(2l + 1) cos
2π(2l + 1)k
4N
/j=2l+1, l ∈ N/
=
N−1
l=0
2f (l) cos
(2l + 1)πk
2N
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 33 / 49
Discrete Cosine Transform (DCT)
DFT → DCT
Signal C(k) is:
even, because ’c’ is even
twice replicated, because ’c’ is stretched by zeros
0 10 20 30
0
2
4
6
8
input signal
n
f(n)
0 10 20 30
0
2
4
6
8
modified signal
j
c(j)
0 10 20 30
0
10
20
30
40
50
DFT of the modified signal
k
C(k)
Notice: Only the coeﬃcients C(k), k = {0, 1, 2, . . . , N − 1} are used.
This version of DCT is also known as DCT-II.
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 34 / 49
Discrete Cosine Transform (DCT)
Deﬁnition
Forward 1D-DCT:
C(k) = α(k)
N−1
l=0
f (l) cos
(2l + 1)πk
2N
, k = {0, 1, 2, . . . , N − 1}
where
α(k) =



1
N iﬀ k = 0
2
N iﬀ k = 0
Inverse 1D-DCT:
f (l) =
N−1
k=0
α(k)C(k) cos
(2l + 1)πk
2N
, l = {0, 1, 2, . . . , N − 1}
Basic properties:
basis formed of sampled cosine waves only
no complex numbers
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 35 / 49
Fast Discrete Cosine Transform (F-DCT)
Derivation
Let us recombine the input signal:
y(l) = f (2l)
y(N − 1 − l) = f (2l + 1) (l = 0, . . . , N/2 − 1)
Example: f = [1 2 3 4 5 6 7 8] → y = [1 3 5 7 8 6 4 2]
Applying DCT on signal f we can deduce:
C(k) = α(k)
N−1
l=0
f (l) cos
(2l + 1)πk
2N
/f → y/ = · · · =
= α(k)
N−1
l=0
y(l) cos
(4l + 1)πk
2N
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 36 / 49
Fast Discrete Cosine Transform (F-DCT)
Derivation (cont’d)
Let us apply DFT on signal y:
Y(k) =
N−1
l=0
y(l)e
−2πikl
N
=
N−1
l=0
y(l) cos
2πkl
N
− i sin
2πkl
N
/ · e− πik
2N /
Real e− πik
2N Y(k) =
N−1
l=0
y(l) cos
2πkl
N
cos
kπ
2N
− sin
2πkl
N
sin
kπ
2N
=
N−1
l=0
y(l) cos
(4l + 1)kπ
2N
= C(k)/α(k)
Imag e− πik
2N Y(k) = omitted
C(k) = α(k)Real e−πik
2N Y(k)
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 37 / 49
Fast Discrete Cosine Transform (F-DCT)
Algorithm
1 recombine input sequence f of length N to get y:
y(l) = f (2l)
y(N − 1 − l) = f (2l + 1) (l = 0, . . . , N/2 − 1)
2 apply FFT to y:
Y = FFT(y)
3 for each k = 0, . . . , N − 1 do:
1 multiply the k-th Fourier coeﬃcient by factor e− πik
2N :
Y (k) = e− πik
2N Y(k)
2 fetch only real part from each Fourier coeﬃcient and normalize the
results:
C(k) = α(k)Real [Y (k)]
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 38 / 49
2D Discrete Cosine Transform (2D-DCT)
Deﬁnition
Forward 2D-DCT:
C(u, v) = α(u)α(v)
N−1
k=0
N−1
l=0
f (k, l) cos
(2k + 1)πu
2N
cos
(2l + 1)πv
2N
where u, v = {0, 1, 2, . . . , N − 1}
Inverse 2D-DCT:
f (k, l) =
N−1
u=0
N−1
v=0
α(u)α(v)C(u, v) cos
(2k + 1)πu
2N
cos
(2l + 1)πv
2N
Basic properties:
2D-DCT it is simply an extension of 1D-DCT
separable
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 39 / 49
2D Discrete Cosine Transform (2D-DCT)
Basis functions
1D-DCT
8 basis 1D functions
(N = 8)
a
0
2
1
3
4
5
6
7
b
2D-DCT
64 basis 2D functions
(N×N = 8×8)
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 40 / 49
Discrete Cosine Transform (DCT)
Properties
Almost as eﬃcient as PCA in term of decorrelation optimality.
The transform matrix can be easily prepared without having the data.
Unlike DFT, DCT works with real data.
As its is derived directly from FFT, there exists fast alternative for
DCT.
Regarding its construction, it works with symmetric data.
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 41 / 49
1 Revision
Algebra
Statistics
2 Motivation
3 PCA-based Transform
Design
An example
4 Discrete Cosine Transform (DCT)
1D DCT
1D FastDCT
2D DCT
5 Walsh-Hadamard Transform
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 42 / 49
Walsh-Hadamard Transform
Jacques Salomon Hadamard (1865 – 1963)
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 43 / 49
Walsh-Hadamard Transform
Similarly to DFT we can deﬁne Hadamard matrix which deﬁnes the
transform:
Hm =
1
√
2
Hm−1 Hm−1
Hm−1 −Hm−1
H1 =
1
√
2
1 1
1 −1
H0 = +1
An example:
H2 =
1
2




1 1 1 1
1 −1 1 −1
1 1 −1 −1
1 −1 −1 1




David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 44 / 49
Walsh-Hadamard Transform
An 8-samples long Walsh functions:
a
0
1
2
3
4
5
6
7
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 45 / 49
Walsh-Hadamard Transform
Formal deﬁnition:
H(k) =
1
N
N−1
n=0
f (n)
m−1
i=0
(−1)B(m,i,n,k)
with k ∈ {0, 1, . . . , N − 1}, N = 2m, and B(m, i, n, k) = bi (n)bm−1−i (k),
where bk(v) denotes the k-th bit in the binary representation of a
non-negative integer v.
Notice: Similarly to FT we can deﬁne inverse (IWHT) or fast (FWHT)
Walsh-Hadamard transform.
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 46 / 49
Walsh-Hadamard Transform
Properties:
Unlike to DCT the basis functions contain plus and minus ones only.
So-called Walsh functions are used as a basis functions.
Any two basis functions are orthogonal.
In real space only (like DCT).
Worse approximation of PCA than DCT.
Wide application in digital communications.
One can implement the WHT on smaller, cheaper hardware.
Notice: Magnitude in WHT is aﬀected by phase shifts in the signal!
Typically, orthogonality is broken.
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 47 / 49
Bibliography
Gonzalez, R. C., Woods, R. E. Digital image processing / 2nd ed.,
Upper Saddle River: Prentice Hall, c2002, pages 793, ISBN
0201180758
Klette R., Zamperoni P. Handbook of Image Processing Operators,
Wiley, 1996, ISBN-0471956422
Salomon D. Data Compression, The Complete Reference, 4th edition,
Springer, London, 2007, ISBN-1846286025
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 48 / 49
You should know the answers . . .
Explain the diﬀerence between correlated and decorrelated signal.
How does the PCA decorrelate the given signal?
How do we get the PCA transformation matrix A in practice?
What is the content of matrix A used in PCA transform?
Provide your own example (diﬀerent from the examples presented in
the lecture) suitable for PCA transform.
Explain the relationship between DFT and DCT.
Describe the F-DCT algorithm and compute F-DCT([1 6 6 1]).
Analyze the ability to compact the energy for the following
transforms: PCA, DFT, DCT, WHT
David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 49 / 49