Filters in Image Processing Image Transforms (I) David Svoboda email: svoboda@fi.muni.cz Centre for Biomedical Image Analysis Faculty of Informatics, Masaryk University, Brno, CZ October 4, 2019 David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 1 / 49 Outline 1 Revision Algebra Statistics 2 Motivation 3 PCA-based Transform Design An example 4 Discrete Cosine Transform (DCT) 1D DCT 1D FastDCT 2D DCT 5 Walsh-Hadamard Transform David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 2 / 49 1 Revision Algebra Statistics 2 Motivation 3 PCA-based Transform Design An example 4 Discrete Cosine Transform (DCT) 1D DCT 1D FastDCT 2D DCT 5 Walsh-Hadamard Transform David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 3 / 49 Revision Algebra Definition: Let C ⊂ Rn×n be a square matrix. Vector v ∈ Cn is an eigenvector of C ⇔ ∃λ ∈ C : Cv = λv. λ is called an eigenvalue. Properties of eigenvalues: Cv = λv equals to (C − λE)v = 0 The solution of equation |C − λE| = 0 is identical to the search for roots in the polynomial of degree n. if C is symmetric then all the eigenvalues are real Properties of eigenvectors: the two eigenvectors corresponding to two different eigenvalues are orthogonal if C is symmetric then all the eigenvectors are real and orthogonal David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 4 / 49 Revision Algebra Properties of symmetric matrices: if C ⊂ Rn×n : C = CT with its eigenvectors e1, e2, . . . , en and eigenvalues λ1, λ2, . . . , λn then ∃A ⊂ Rn×n such that AT CA = D, where D =        λ1 0 0 . . . 0 0 λ2 0 . . . 0 0 0 λ3 . . . 0 ... ... ... ... ... 0 0 0 . . . λn        and A =        e1 e2 e3 ... en        David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 5 / 49 Revision Statistics Cross-correlation: given two 1D signals f (i) and g(i): (f g)(i) = k f (i + k)g(k) it is a measure of similarity of two signals Auto-correlation: given 1D signal f (i): (f f )(i) = k f (i + k)f (k) it is a measure for finding repeating patterns David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 6 / 49 Revision Statistics Auto-correlation example 0 500 1000 0 20 40 60 80 100 Highly correlated signal 0 500 1000 1500 2000 0 1 2 3 4 5 x 10 6 Auto−correlation 0 500 1000 0 20 40 60 80 100 Less correlated signal 0 500 1000 1500 2000 0 1 2 3 4 5 x 10 6 Auto−correlation David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 7 / 49 Revision Statistics Auto-correlation example Highly correlated signal 50 100 150 200 250 50 100 150 200 250 Auto−correlation 100 200 300 400 500 100 200 300 400 500 Less correlated signal 50 100 150 200 250 50 100 150 200 250 Auto−correlation 100 200 300 400 500 100 200 300 400 500 David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 8 / 49 Revision Statistics Highly correlated signals contain repeating patterns energy (nonzero values) is stretched over the whole space this is what we usually have Decorrelated signals energy (nonzero values) is compacted in one location easy to compress this is what we wish to have David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 9 / 49 Revision Statistics Covariance Covariance exhibits how much two signals f and g (let us assume |f | = |g| = n) vary from the mean with respect to each other: cov(f , g) = n i=1 (fi − f )(gi − g) n − 1 Covariance Matrix – a matrix of covariances between two signals f and g: C = cov(f , f ) cov(f , g) cov(g, f ) cov(g, g) Matrix C is always real and symmetric. Notice: Two signals f and g are decorrelated iff cov(f , g) = cov(g, f ) = 0, i.e. when the matrix C is diagonal. David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 10 / 49 1 Revision Algebra Statistics 2 Motivation 3 PCA-based Transform Design An example 4 Discrete Cosine Transform (DCT) 1D DCT 1D FastDCT 2D DCT 5 Walsh-Hadamard Transform David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 11 / 49 Motivation The most common linear transforms sinusoidal transforms DFT (discrete Fourier transform) DCT (discrete cosine transform) DST (discrete sine transform) rectangular wave transforms Walsh-Hadamard transform Haar transform variable basis Discrete wavelet transform eigenvector-based transforms Karhunen-Loeve transform David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 12 / 49 Motivation Energy compaction Evaluate DFT over some short signal: f = [1 3 4 2] ⇓ /DFT/ F = 1 √ 4 [10 (−3 − i) 0 (−3 + i)] Measure the energy of the signals: E(f ) = f (i) 2 = 1 2 + 3 2 + 4 2 + 2 2 = 30 E(F) = F(i) 2 = 25 + 10/4 + 10/4 = 30 The first component in f accounts for 3.3% of energy while the first component in F accounts for 83.3% of energy. Conclusion: Ability of energy compaction ≈ optimality of the transform David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 13 / 49 Motivation Why do we need image transforms? manipulation with data in another domain might be simpler example of use: low-pass, high-pass filtering data decorrelation ≈ co-variance removal ≈ energy compaction example of use: image compression Which transform properties are the most important? speed simple to implement Notice: The requirements suggest to use linear transforms, i.e. the input signal f is transformed into the signal F by: F = Af , where A is a transform matrix. David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 14 / 49 1 Revision Algebra Statistics 2 Motivation 3 PCA-based Transform Design An example 4 Discrete Cosine Transform (DCT) 1D DCT 1D FastDCT 2D DCT 5 Walsh-Hadamard Transform David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 15 / 49 PCA-based Transform Design of transform matrix A Task to solve Given an input discrete signal, design a transformation matrix such that the transformed signal has decorrelated samples (they are mutually independent). Solution Given a 2D image, let us break it up into k blocks of n pixels each. 1 2 3 4 k ... ... Each block i is characterized with its vector b(i), i = 1, 2, . . . , k where length(b(i)) = n. David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 16 / 49 PCA-based Transform Design of transform matrix A input input expected output (correlated) (mean centered) (decorrelated) b(i) v(i) = b(i) − b w(i) = Av(i) , i = 1, 2, . . . , k V = v(1) , v(2) , . . . , v(k) W = AV = w(1) , w(2) , . . . , w(k) CV = V · V T CW = W · W T CV . . . real, symmetric CW . . . diagonal CW = W · W T = (AV ) · (AV )T = A(V · V T )AT = A · CV · AT Notice: the off-diagonal elements of covariance matrix CV are the covariances of the v(i) vectors . . . (V · V T )ab = k i=1 v (i) a v (i) b the off-diagonal elements of covariance matrix CW are zero the eigenvectors of CV form the rows of a new matrix A CW is a diagonal matrix formed of the eigenvalues of CV David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 17 / 49 PCA-based Transform Algorithm: 1 collect the input data b(i) 2 form the matrix V 3 calculate the covariance matrix CV 4 find eigenvectors and eigenvalues of CV 5 use eigenvectors to form the transform matrix A 6 use A as a transform matrix to original data 7 get transformed decorrelated data: w(i) = A(b(i) − b) Notice: Red lines ≡ Principal Component Analysis (PCA). David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 18 / 49 PCA-based Transform An example Let us submit the following 2D image to the PCA 2.5 2.4 0.5 0.7 2.2 2.9 1.9 2.2 3.1 3.0 2.3 2.7 2.0 1.6 1.0 1.1 1.5 1.6 1.1 0.9 Notice: Neighbouring pixels have usually similar value. David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 19 / 49 PCA-based Transform An example We have chosen n = 2, k = 10 and fetched k vectors b(i), i = 1, 2, . . . , k in the following manner: X Y b(1) 2.5 2.4 b(2) 0.5 0.7 b(3) 2.2 2.9 b(4) 1.9 2.2 b(5) 3.1 3.0 b(6) 2.3 2.7 b(7) 2.0 1.6 b(8) 1.0 1.1 b(9) 1.5 1.6 b(10) 1.1 0.9 Correlation of the neighbouring intensities: David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 20 / 49 PCA-based Transform An example Modify the input datasets: X Y X − X Y − Y b(1) 2.5 2.4 v(1) 0.69 0.49 b(2) 0.5 0.7 v(2) -1.31 -1.21 b(3) 2.2 2.9 v(3) 0.39 0.99 b(4) 1.9 2.2 v(4) 0.09 0.29 b(5) 3.1 3.0 v(5) 1.29 1.09 b(6) 2.3 2.7 v(6) 0.49 0.79 b(7) 2.0 1.6 v(7) 0.19 -0.31 b(8) 1.0 1.1 v(8) -0.81 -0.81 b(9) 1.5 1.6 v(9) -0.31 -0.31 b(10) 1.1 0.9 v(10) -0.71 -1.01 David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 21 / 49 PCA-based Transform An example Evaluate all the covariances CV (X, X), CV (X, Y ), CV (Y , X), and CV (Y , Y ) and form the covariance matrix CV : CV = V · V T = 0.617 0.615 0.615 0.717 Since the covariance matrix is square, we can calculate the eigenvectors and eigenvalues for this matrix: eigenvalues CW = W · W T = 0.049 0 0 1.284 eigenvectors A = −0.735 0.678 −0.678 −0.735 David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 22 / 49 PCA-based Transform An example David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 23 / 49 PCA-based Transform An example The following statements are equal: the eigenvalue λj is the highest one the eigenvector ej is more dominant the energy (the majority in information) is gathered in element w (i) j , i = 1, 2, . . . , k the element w (i) j , i = 1, 2, . . . , k has the greatest variance The following statements are also equal: the eigenvalue λl is the lowest one the eigenvector el is practically useless David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 24 / 49 PCA-based Transform An example Transformed (decorrelated) data X Y w(1) -0.827 -0.175 w(2) 1.777 0.142 w(3) -0.992 0.384 w(4) -0.274 0.130 w(5) -1.675 -0.209 w(6) -0.912 0.175 w(7) 0.099 -0.349 w(8) 1.144 0.046 w(9) 0.438 0.017 w(10) 1.223 -0.162 David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 25 / 49 PCA-based Transform An example Clear less important component X Y w(1) -0.827 0.000 w(2) 1.777 0.000 w(3) -0.992 0.000 w(4) -0.274 0.000 w(5) -1.675 0.000 w(6) -0.912 0.000 w(7) 0.099 0.000 w(8) 1.144 0.000 w(9) 0.438 0.000 w(10) 1.223 0.000 David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 26 / 49 PCA-based Transform An example Transformed back with A−1 oldX oldY newX newY b(1) 2.5 2.4 2.42 2.47 b(2) 0.5 0.7 0.50 0.71 b(3) 2.2 2.9 2.54 2.58 b(4) 1.9 2.2 2.01 2.09 b(5) 3.1 3.0 3.04 3.05 b(6) 2.3 2.7 2.48 2.53 b(7) 2.0 1.6 1.74 1.84 b(8) 1.0 1.1 0.97 1.13 b(9) 1.5 1.6 1.49 1.61 b(10) 1.1 0.9 0.91 1.08 David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 27 / 49 PCA-based Transform Properties: optimal decorrelation method (see the example) too heavy for evaluation (searching for eigenvalues and eigenvectors) matrix A is data dependent (cannot be precomputed) all the eigenvectors must be kept for inverse transform Conclusion: rather theoretical method David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 28 / 49 1 Revision Algebra Statistics 2 Motivation 3 PCA-based Transform Design An example 4 Discrete Cosine Transform (DCT) 1D DCT 1D FastDCT 2D DCT 5 Walsh-Hadamard Transform David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 29 / 49 Discrete Cosine Transform (DCT) DFT → DCT Let f = [f (0) f (1) . . . f (N − 1)] be a discrete signal. Let us modify it as follows: f (0) f (1) f (2) . . . f (N − 1) ⇓ mirror f (N − 1) . . . f (1) f (0) f (0) f (1) f (2) . . . f (N − 1) ⇓ insert zeros f (N − 1) 0 . . . 0 f (1) 0 f (0) 0 f (0) 0 f (1) 0 f (2) . . . 0 . . . f (N − 1) 0 ⇓ DFT domain is periodical 0 f (0) 0 f (1) 0 f (2) . . . 0 . . . f (N − 1) 0 f (N − 1) 0 . . . 0 f (1) 0 f (0) ⇓ name substitution c(0) c(1) c(2) c(3) c(4) . . . c(4N − 1) David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 30 / 49 Discrete Cosine Transform (DCT) DFT → DCT The relationship between signal ’c’ and ’f ’: c(2n) = 0 iff 0 ≤ n < N c(2n + 1) = f (n) iff 0 ≤ n < N c(4N − n) = c(n) iff 0 < n < 2N The basic properties of signal ’c’: c is even (ready for DFT working over real even data) |c| = 4N David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 31 / 49 Discrete Cosine Transform (DCT) DFT → DCT Let us apply DFT to c: C(k) = 4N−1 j=0 c(j)e−2πijk 4N = /symmetry/ = 2N−1 j=0 c(j) e−2πijk 4N + e− 2πi(4N−j)k 4N = 2N−1 j=0 c(j) e−2πijk 4N + e 2πijk 4N /Euler-Moivre eq./ = 2N−1 j=0 c(j) cos 2πjk 4N − i sin 2πjk 4N + cos 2πjk 4N + i sin 2πjk 4N = 2N−1 j=0 c(j) 2 cos 2πjk 4N = . . . David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 32 / 49 Discrete Cosine Transform (DCT) DFT → DCT C(k) = 2N−1 j=0 c(j) 2 cos 2πjk 4N /separating odd and even items/ = N−1 l=0 2c(2l) cos 2π2lk 4N /j=2l, l ∈ N/ + N−1 l=0 2c(2l + 1) cos 2π(2l + 1)k 4N /j=2l+1, l ∈ N/ = N−1 l=0 2f (l) cos (2l + 1)πk 2N David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 33 / 49 Discrete Cosine Transform (DCT) DFT → DCT Signal C(k) is: even, because ’c’ is even twice replicated, because ’c’ is stretched by zeros 0 10 20 30 0 2 4 6 8 input signal n f(n) 0 10 20 30 0 2 4 6 8 modified signal j c(j) 0 10 20 30 0 10 20 30 40 50 DFT of the modified signal k C(k) Notice: Only the coefficients C(k), k = {0, 1, 2, . . . , N − 1} are used. This version of DCT is also known as DCT-II. David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 34 / 49 Discrete Cosine Transform (DCT) Definition Forward 1D-DCT: C(k) = α(k) N−1 l=0 f (l) cos (2l + 1)πk 2N , k = {0, 1, 2, . . . , N − 1} where α(k) =    1 N iff k = 0 2 N iff k = 0 Inverse 1D-DCT: f (l) = N−1 k=0 α(k)C(k) cos (2l + 1)πk 2N , l = {0, 1, 2, . . . , N − 1} Basic properties: basis formed of sampled cosine waves only no complex numbers David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 35 / 49 Fast Discrete Cosine Transform (F-DCT) Derivation Let us recombine the input signal: y(l) = f (2l) y(N − 1 − l) = f (2l + 1) (l = 0, . . . , N/2 − 1) Example: f = [1 2 3 4 5 6 7 8] → y = [1 3 5 7 8 6 4 2] Applying DCT on signal f we can deduce: C(k) = α(k) N−1 l=0 f (l) cos (2l + 1)πk 2N /f → y/ = · · · = = α(k) N−1 l=0 y(l) cos (4l + 1)πk 2N David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 36 / 49 Fast Discrete Cosine Transform (F-DCT) Derivation (cont’d) Let us apply DFT on signal y: Y(k) = N−1 l=0 y(l)e −2πikl N = N−1 l=0 y(l) cos 2πkl N − i sin 2πkl N / · e− πik 2N / Real e− πik 2N Y(k) = N−1 l=0 y(l) cos 2πkl N cos kπ 2N − sin 2πkl N sin kπ 2N = N−1 l=0 y(l) cos (4l + 1)kπ 2N = C(k)/α(k) Imag e− πik 2N Y(k) = omitted C(k) = α(k)Real e−πik 2N Y(k) David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 37 / 49 Fast Discrete Cosine Transform (F-DCT) Algorithm 1 recombine input sequence f of length N to get y: y(l) = f (2l) y(N − 1 − l) = f (2l + 1) (l = 0, . . . , N/2 − 1) 2 apply FFT to y: Y = FFT(y) 3 for each k = 0, . . . , N − 1 do: 1 multiply the k-th Fourier coefficient by factor e− πik 2N : Y (k) = e− πik 2N Y(k) 2 fetch only real part from each Fourier coefficient and normalize the results: C(k) = α(k)Real [Y (k)] David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 38 / 49 2D Discrete Cosine Transform (2D-DCT) Definition Forward 2D-DCT: C(u, v) = α(u)α(v) N−1 k=0 N−1 l=0 f (k, l) cos (2k + 1)πu 2N cos (2l + 1)πv 2N where u, v = {0, 1, 2, . . . , N − 1} Inverse 2D-DCT: f (k, l) = N−1 u=0 N−1 v=0 α(u)α(v)C(u, v) cos (2k + 1)πu 2N cos (2l + 1)πv 2N Basic properties: 2D-DCT it is simply an extension of 1D-DCT separable David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 39 / 49 2D Discrete Cosine Transform (2D-DCT) Basis functions 1D-DCT 8 basis 1D functions (N = 8) a 0 2 1 3 4 5 6 7 b 2D-DCT 64 basis 2D functions (N×N = 8×8) David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 40 / 49 Discrete Cosine Transform (DCT) Properties Almost as efficient as PCA in term of decorrelation optimality. The transform matrix can be easily prepared without having the data. Unlike DFT, DCT works with real data. As its is derived directly from FFT, there exists fast alternative for DCT. Regarding its construction, it works with symmetric data. David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 41 / 49 1 Revision Algebra Statistics 2 Motivation 3 PCA-based Transform Design An example 4 Discrete Cosine Transform (DCT) 1D DCT 1D FastDCT 2D DCT 5 Walsh-Hadamard Transform David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 42 / 49 Walsh-Hadamard Transform Jacques Salomon Hadamard (1865 – 1963) David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 43 / 49 Walsh-Hadamard Transform Similarly to DFT we can define Hadamard matrix which defines the transform: Hm = 1 √ 2 Hm−1 Hm−1 Hm−1 −Hm−1 H1 = 1 √ 2 1 1 1 −1 H0 = +1 An example: H2 = 1 2     1 1 1 1 1 −1 1 −1 1 1 −1 −1 1 −1 −1 1     David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 44 / 49 Walsh-Hadamard Transform An 8-samples long Walsh functions: a 0 1 2 3 4 5 6 7 David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 45 / 49 Walsh-Hadamard Transform Formal definition: H(k) = 1 N N−1 n=0 f (n) m−1 i=0 (−1)B(m,i,n,k) with k ∈ {0, 1, . . . , N − 1}, N = 2m, and B(m, i, n, k) = bi (n)bm−1−i (k), where bk(v) denotes the k-th bit in the binary representation of a non-negative integer v. Notice: Similarly to FT we can define inverse (IWHT) or fast (FWHT) Walsh-Hadamard transform. David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 46 / 49 Walsh-Hadamard Transform Properties: Unlike to DCT the basis functions contain plus and minus ones only. So-called Walsh functions are used as a basis functions. Any two basis functions are orthogonal. In real space only (like DCT). Worse approximation of PCA than DCT. Wide application in digital communications. One can implement the WHT on smaller, cheaper hardware. Notice: Magnitude in WHT is affected by phase shifts in the signal! Typically, orthogonality is broken. David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 47 / 49 Bibliography Gonzalez, R. C., Woods, R. E. Digital image processing / 2nd ed., Upper Saddle River: Prentice Hall, c2002, pages 793, ISBN 0201180758 Klette R., Zamperoni P. Handbook of Image Processing Operators, Wiley, 1996, ISBN-0471956422 Salomon D. Data Compression, The Complete Reference, 4th edition, Springer, London, 2007, ISBN-1846286025 David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 48 / 49 You should know the answers . . . Explain the difference between correlated and decorrelated signal. How does the PCA decorrelate the given signal? How do we get the PCA transformation matrix A in practice? What is the content of matrix A used in PCA transform? Provide your own example (different from the examples presented in the lecture) suitable for PCA transform. Explain the relationship between DFT and DCT. Describe the F-DCT algorithm and compute F-DCT([1 6 6 1]). Analyze the ability to compact the energy for the following transforms: PCA, DFT, DCT, WHT David Svoboda (CBIA@FI) Filters in Image Processing autumn 2019 49 / 49