Digital Signal Processing
Moslem Amiri, Václav Přenosil
Masaryk University
Understanding Digital Signal Processing, Third Edition, Richard Lyons
(0-13-261480-4) © Pearson Education, 2011.
The Fast Fourier Transform
2
Relationship of FFT to DFT
 Radix-2 FFT algorithm
 A very efficient process for performing DFTs
under constraint that DFT size be an integral
power of two
 Radix-2 FFT greatly reduces the number of
necessary arithmetic operations
 The number of complex multiplications necessary
for an N-point DFT is N2
 The number of complex multiplications for an Npoint
FFT is approximately (N/2)log2N
∑
−
=
−
=
1
0
/2
)()(
N
n
Nmnj
enxmX π
3
Relationship of FFT to DFT
4
Relationship of FFT to DFT
 FFT is not an approximation of DFT
 It’s exactly equal to DFT
 All of performance characteristics of DFT, output
symmetry, linearity, output magnitudes, leakage,
scalloping loss, etc., also describe the behavior of
FFT
5
Hints on Using FFTs in Practice
 Sample fast enough and long enough
 Sampling rate must be greater than twice the
bandwidth of continuous A/D input signal
 Sample at 2.5 to 4 times the signal bandwidth
 If we don’t know signal’s bandwidth, we should
mistrust any FFT results that have significant spectral
components at frequencies near half fs
 Be suspicious of aliasing if there are any spectral
components whose frequencies depend on fs
 If we suspect that aliasing is occurring or continuous
signal contains broadband noise, we’ll have to use an
analog lowpass filter prior to A/D conversion
 Cutoff frequency of lowpass filter must be greater than
frequency band of interest but less than half fs
6
Hints on Using FFTs in Practice
 Sample fast enough and long enough
 How many samples must we collect
 Data collection time interval must be long enough to
satisfy desired FFT frequency resolution for given fs
 Total data collection time interval is N/fs seconds, and
N-point FFT bin-to-bin frequency resolution is fs/N Hz
 For example, if we need a spectral resolution of 5 Hz,
then fs/N = 5 Hz, and
 If fs is, say, 10 kHz, then N must be at least 2000, and
we’d choose N = 2048 because this number is a power
of two
s
ss
f
ff
N 2.0
5resolutiondesired
===
7
Hints on Using FFTs in Practice
 Manipulating time data prior to transformation
 If length of time-domain data sequence is not an
integral power of two, we have two options
 Discard enough data samples so that remaining
sequence length is some integral power of two
 Not recommended
 A better approach is to append enough zerovalued
samples to the end of time data sequence
to match the number of points of the next largest
radix-2 FFT
 Zero-padding technique
8
Hints on Using FFTs in Practice
 Manipulating time data prior to transformation
 We can multiply time data by a window function
to alleviate leakage problem
 But frequency resolution is degraded when windows
are used
 If appending zeros is necessary to extend a time
sequence, append zeros after multiplying original
time data sequence by a window function
9
Hints on Using FFTs in Practice
 Manipulating time data prior to transformation
 Even when windowing is employed, high-level
spectral components can obscure nearby lowlevel
spectral components
 This is especially evident when original time data has a
nonzero average, i.e., it’s riding on a DC bias
 A large-amplitude DC spectral component at 0 Hz will
overshadow its spectral neighbors
 We can eliminate this problem by calculating average
of time sequence and subtracting that average value
from each sample in original sequence
 The averaging and subtraction process must be
performed before windowing
10
Hints on Using FFTs in Practice
 Enhancing FFT results
 To detect signal energy in presence of noise
(enough time-domain data is available), we can
improve sensitivity of processing by averaging
multiple FFTs
 A 2N-point real sequence can be transformed
with a single N-point complex radix-2 FFT to
speed up our processing
 If we need FFT of unwindowed and also
windowed time-domain data, we can perform FFT
of unwindowed data, and then we can perform
frequency-domain windowing to reduce spectral
leakage on any, or all, of FFT bin outputs
11
Hints on Using FFTs in Practice
 Interpreting FFT results
 First step in interpreting FFT results is to compute
absolute frequency of individual FFT bin centers
 Like DFT, FFT bin spacing is fs/N
 For m = 0, 1, 2, 3, . . ., N−1, absolute frequency of mth
bin center is mfs/N
 If FFT’s input time samples are real, only X(m)
outputs from m = 0 to m = N/2 are independent
 We need determine only absolute FFT bin frequencies
for m over range of 0 ≤ m ≤ N/2
 If FFT input samples are complex, all N of FFT outputs
are independent, and we should compute absolute
FFT bin frequencies for m over range of 0 ≤ m ≤ N−1
12
Hints on Using FFTs in Practice
 Interpreting FFT results
 We can determine true amplitude of time-domain
signals from their FFT spectral results
 Radix-2 FFT outputs are complex
 FFT output magnitude samples
are all inherently multiplied by factor N/2, when input
samples are real
 If FFT input samples are complex, scaling factor is N
 So to determine correct amplitudes of time-domain
sinusoidal components, divide FFT magnitudes by N/2
for real inputs and N for complex inputs
)()()( real mjXmXmX imag+=
2
imag
2
realmag )()()()( mXmXmXmX +==
13
Hints on Using FFTs in Practice
 Interpreting FFT results
 If a window function was used on original timedomain
data, some of FFT input samples will be
attenuated
 This reduces the resultant FFT output magnitudes from
their true unwindowed values
 To calculate correct amplitudes of various time-domain
sinusoidal components, we have to further divide FFT
magnitudes by appropriate processing loss factor
associated with the window function used
14
Hints on Using FFTs in Practice
 Interpreting FFT results
 To determine power spectrum XPS(m)
 Normalization through division by (|X(m)|max)2 or
|X(m)|max eliminates effect of any absolute FFT
scale factor (N or N/2) or window scale factor
 No compensation need be performed








⋅=








⋅=
⋅=
+==
max
10
2
max
2
10
2
10
2
imag
2
real
2
)(
)(
log20)(normalized
))((
)(
log10)(normalized
dB))((log10)(
)()()()(
mX
mX
mX
mX
mX
mX
mXmX
mXmXmXmX
dB
dB
dB
PS
15
Hints on Using FFTs in Practice
 Interpreting FFT results
 Phase angles Xø(m)
 Our calculations (or compiler) should detect
occurrences of Xreal(m) = 0 and set corresponding
Xø(m) to 90° if Ximag(m) > 0, set Xø(m) to 0° if Ximag(m) =
0, and set Xø(m) to −90° if Ximag(m) < 0
 FFT outputs containing significant noise
components can cause large fluctuations in the
computed Xø(m) phase angles
 Xø(m) samples are meaningful when corresponding
|X(m)| is well above average FFT output noise level








= −
)(
)(
tan)(
real
1
mX
mX
mX
imag
φ
16
Derivation of Radix-2 FFT Algorithm
 where m is in range 0 to N/2−1
 Index m has that reduced range because each of the
two N/2-point DFTs on the right side are periodic in m
with period N/2
∑∑
∑∑
∑∑
∑
−
=
−
=
==
=
−
=
−
=
=
−
=
+−
−
=
−
−
=
−
++= →
++= →
++=
=
−
−
−
1)2/(
0
2/
1)2/(
0
2/
1)2/(
0
2
1)2/(
0
2
1)2/(
0
/)12(2
1)2/(
0
/)2(2
1
0
/2
)12()2(
)12()2(
)12()2(
)()(
2/
)2//(2
)/(222
/2
N
n
mn
N
m
N
N
n
mn
N
We
eW
N
n
mn
N
m
N
N
n
mn
N
eW
N
n
Nmnj
N
n
Nmnj
N
n
Nmnj
WnxWWnx
WnxWWnx
enxenx
enxmX
N
Nj
Nj
N
Nj
N
π
π
π
ππ
π
17
Derivation of Radix-2 FFT Algorithm
 We have two N/2 summations whose results can
be combined to give the first N/2 samples of an
N-point DFT
 Benefits of breaking N-point DFT into two parts
 Reduction of number crunching because W terms in
the two summations are identical
 Also the upper half of DFT outputs is easy to calculate
∑∑
−
=
−
=
=
++= →
−
1)2/(
0
2/
1)2/(
0
2/ )12()2()(
)2//(2
2/
N
n
mn
N
m
N
N
n
mn
N
eW
WnxWWnxmX
Nj
N
π
18
Derivation of Radix-2 FFT Algorithm
 We just change sign of twiddle factor and use results
of the two summations from X(m) to get X(m+N/2)
 m goes from 0 to (N/2)−1
 To compute an N-point DFT, we actually perform two
N/2-point DFTs—one N/2-point DFT on even-indexed
and one N/2-point DFT on odd-indexed x(n) samples
∑∑
∑∑
∑∑
−
=
−
=
−+
−+
−
=
++
−
=
+
−
=
−
=
=
+−=+
−=−=== →
====
++=+
++= →
−
1)2/(
0
2/
1)2/(
0
2/
2/22/)2/(factortwiddle
2/2/
2/22
2/
2/
2/2/
)2/(
2/
1)2/(
0
)2/(
2/
)2/(
1)2/(
0
)2/(
2/
1)2/(
0
2/
1)2/(
0
2/
)12()2()2/(
)1()(
)1()(
)12()2()2/(
)12()2()(
)2//(2
2/
N
n
mn
N
m
N
N
n
mn
N
m
N
m
N
NNjm
N
N
N
m
N
Nm
N
mn
N
mn
N
NNnjmn
N
Nn
N
mn
N
Nmn
N
N
n
Nmn
N
Nm
N
N
n
Nmn
N
N
n
mn
N
m
N
N
n
mn
N
eW
WnxWWnxNmX
WWeWWWW
WWeWWWW
WnxWWnxNmX
WnxWWnxmX
Nj
N
π
π
π
19
Derivation of Radix-2 FFT Algorithm
∑
∑
∑
∑
−
=
−
=
−
=
−
=
+−
=+
++
=
1)2/(
0
2/
1)2/(
0
2/
1)2/(
0
2/
1)2/(
0
2/
)12(
)2()2/(
)12(
)2()(
N
n
mn
N
m
N
N
n
mn
N
N
n
mn
N
m
N
N
n
mn
N
WnxW
WnxNmX
WnxW
WnxmX
20
Derivation of Radix-2 FFT Algorithm
 Twiddle factors
 Because −e−j2πm/N = e−j2π(m+N/2)/N, negative W
twiddle factors are implemented with positive W
twiddle factors that follow the lower DFT in Fig. 4-
2
∑∑
∑∑
−
=
−
=
−
=
−
=
+−=+
++=
1)2/(
0
2/
1)2/(
0
2/
1)2/(
0
2/
1)2/(
0
2/
)12()2()2/(
)12()2()(
N
n
mn
N
m
N
N
n
mn
N
N
n
mn
N
m
N
N
n
mn
N
WnxWWnxNmX
WnxWWnxmX
21
Derivation of Radix-2 FFT Algorithm
∑∑
∑∑
∑∑
∑
∑∑
∑∑
−
=
−
=
−
=
−
=
=
−
=
+
−
=
−
=
−
=
−
=
−
=
−
=
+++=
++= →
++=
=
−=+ →
+= →
+−=+
++=
1)4/(
0
4/2/
1)4/(
0
4/
1)4/(
0
4/2/
1)4/(
0
4/
1)4/(
0
)12(
2/
1)4/(
0
2
2/
1)2/(
0
2/
tionsimplifica
tionsimplifica
1)2/(
0
2/
1)2/(
0
2/
1)2/(
0
2/
1)2/(
0
2/
)34()14()(
)24()4()(
)24()4(
)2()(
)()()2/(
)()()(
)12()2()2/(
)12()2()(
4/
2
2/
N
n
mn
N
m
N
N
n
mn
N
N
n
mn
N
m
N
N
n
mn
N
WW
N
n
mn
N
N
n
mn
N
N
n
mn
N
m
N
m
N
N
n
mn
N
m
N
N
n
mn
N
N
n
mn
N
m
N
N
n
mn
N
WnxWWnxmB
WnxWWnxmA
WnxWnx
WnxmA
mBWmANmX
mBWmAmX
WnxWWnxNmX
WnxWWnxmX
mn
N
mn
N
22
Derivation of Radix-2 FFT Algorithm
∑
∑
∑
∑
−
=
−
=
−
=
−
=
++
+=
++
=
−=+
+=
1)4/(
0
4/2/
1)4/(
0
4/
1)4/(
0
4/2/
1)4/(
0
4/
)34(
)14()(
)24(
)4()(
)()()2/(
)()()(
N
n
mn
N
m
N
N
n
mn
N
N
n
mn
N
m
N
N
n
mn
N
m
N
m
N
WnxW
WnxmB
WnxW
WnxmA
mBWmANmX
mBWmAmX
Twiddle factor for N = 8,
ranges from to
because the m index, for
A(m) and B(m), goes from 0
to 3
m
NW 2/
0
4W 3
4W
23
Derivation of Radix-2 FFT Algorithm
 Fig. 4-3
 For any N-point DFT, we break each of N/2-point
DFTs into two N/4-point DFTs to further reduce
the number of sine and cosine multiplications
 Eventually, we arrive at an array of 2-point DFTs
where no further computational savings could be
realized
 The 2-point DFT functions cannot be partitioned into
smaller parts
 Butterfly of a single 2-point DFT is shown in Fig. 4-4
24
Derivation of Radix-2 FFT Algorithm
 The 2-point DFT blocks in Fig. 4-3 are replaced
by butterfly in Fig. 4-4 to give a full 8-point FFT
implementation of DFT as shown in Fig. 4-5
1
1
2/22/
/020
−===
==
−−
−
ππ
π
jNNjN
N
Nj
N
eeW
eW
25
Derivation of Radix-2 FFT Algorithm
26
FFT Input/Output Data Index Bit Reversal
 Decimation-in-time FFT implementation
 Was the title of Fig. 4-5
 Decimation-in-time phrase refers to how we
broke DFT input samples into odd and even parts
 This time decimation leads to scrambled order of
input data’s index n in Fig. 4-5
 Shuffling of input data is known as bit reversal
 Because scrambled order of input data index can be
obtained by reversing bits of binary representation of
normal input data index order
27
FFT Input/Output Data Index Bit Reversal
 Input index bit reversal for an 8-point FFT
Normal order of
index n
Binary bits of
index n
Reversed bits
of index n
Bit-reversed
order of index n
0 000 000 0
1 001 100 4
2 010 010 2
3 011 110 6
4 100 001 1
5 101 101 5
6 110 011 3
7 111 111 7
28
Radix-2 FFT Butterfly Structures
 Twiddle factors in Fig. 4-5
 To simplify signal flows, replace twiddle factors
with their equivalent values referenced to
where N = 8
 We show just exponents m of , to get FFT structure
shown in Fig. 4-8
 Fig. 4-8
 from Fig. 4-5 
 from Fig. 4-5 
 …
 1s and −1s in the first stage of Fig. 4-5 are
replaced by 0s and 4s, respectively
m
NW
m
NW
1
4W
2
8W
2
4W 4
8W
29
Radix-2 FFT Butterfly Structures
30
Radix-2 FFT Butterfly Structures
31
Radix-2 FFT Butterfly Structures
 Fig. 4-9
 Input data is in its normal order and output data
indices are bit-reversed
 In this case, a bit-reversal operation needs to be
performed at output of FFT to unscramble
frequency-domain results
 Fig. 4-10
 Shows an FFT signal-flow structure that avoids
bit-reversal problem altogether
32
Radix-2 FFT Butterfly Structures
33
Radix-2 FFT Butterfly Structures
 Bit reversal
 A few years ago, hardware implementations of
FFT spent most of their time performing
multiplications
 Bit-reversal process necessary to access data in
memory wasn’t a significant portion of overall FFT
computational problem
 Now that high-speed multiplier/accumulator
integrated circuits can multiply two numbers in a
single clock cycle, FFT data multiplexing and
memory addressing are more important
 Led to development of efficient algorithms to perform
bit reversal
34
Radix-2 FFT Butterfly Structures
 Decimation-in-frequency algorithm
 Decimation-in-time or -frequency is determined
by whether the DFT inputs or outputs are
partitioned (into odd and even) when deriving a
particular FFT butterfly structure from the DFT
equations
 Decimation-in-frequency butterfly structures
(analogous to structures in Figs. 4-8 through 4-
10) are illustrated in Figs. 4-11 through 4-13
 An equivalent decimation-in-frequency FFT structure
exists for each decimation-in-time FFT structure
 The number of necessary multiplications is the same
for both structures
35
Radix-2 FFT Butterfly Structures
36
Radix-2 FFT Butterfly Structures
37
Radix-2 FFT Butterfly Structures
38
Alternate Single-Butterfly Structures
 Butterfly structures
 FFT butterfly structures are direct result of
derivations of decimation-in-time and decimationin-frequency
algorithms
 Twiddle factors always take general forms shown in
Fig. 4-14(a)
39
Alternate Single-Butterfly Structures
40
Alternate Single-Butterfly Structures
 Fig. 4-14
 To implement decimation-in-time butterfly of (a),
we have to perform two complex multiplications
and two complex additions
 So we replace in (a) with to give us
simplified butterflies in (b)
 Because twiddle factors in (b) differ only by their
signs, the optimized butterflies in (c) can be used
k
N
k
N
NNjk
N
N
N
k
N
Nk
N
Nk
N
k
N
WWeWWWW
yWxy
yWxx
−=−=== →
+=
+=
−+
+
)1()(
'
'
2/22/2/tionsimplifica
2/
π
2/Nk
NW + k
NW−
41
Alternate Single-Butterfly Structures
 Optimized butterflies in 4-14(c)
 Require two complex additions but only one
complex multiplication, thus reducing
computational workload
 Because there are (N/2)log2N butterflies in an Npoint
FFT, the number of complex multiplications
performed by an FFT is (N/2)log2N
 An algorithm is decimation-in-time if the twiddle
factor precedes the −1 in optimized butterflies
 An algorithm is decimation-in-frequency if the
twiddle factor follows the −1 in optimized
butterflies
42
Alternate Single-Butterfly Structures
43
Alternate Single-Butterfly Structures
 In-place FFT algorithms
 An in-place algorithm is depicted in Fig. 4-5
 Output of a butterfly operation can be stored in
the same hardware memory locations that
previously held butterfly’s input data
 No intermediate storage is necessary
 For an N-point FFT, only 2N memory locations
are needed
 The 2 comes from fact that each butterfly node
represents a data value that has both a real and an
imaginary part
 Data routing and memory addressing are rather
complicated
44
Alternate Single-Butterfly Structures
 Double-memory FFT algorithms
 A double-memory FFT structure is depicted in
Fig. 4-10
 Intermediate storage is necessary because we no
longer have standard butterflies, and 4N memory
locations are needed
 Data routing and memory address control are
much simpler in double-memory FFT structures
than in-place technique