Digital Signal Processing
Statistics, Probability and Noise

Moslem Amiri, Václav Přenosil
Masaryk University

Resource: “The Scientist and Engineer's Guide to Digital Signal Processing” (www.dspguide.com) by Steven W. Smith

Signal and Graph Terminology
- Signal
  - A description of how one parameter is related to another one
- Continuous signal
  - Both parameters can assume a continuous range of values
- Discrete signal or digitized signal
  - Signals formed from quantized parameters
- [Figure: two discrete signals]
- Domain
  - Type of parameter on the horizontal axis

Mean and Standard Deviation
- Mean (µ)
  - Average value of a signal (N samples)
    $$\mu = \frac{1}{N} \sum_{i=0}^{N-1} x_i$$
- Standard deviation (σ)
  - A measure of how far the signal fluctuates from the mean
  - Only measures the AC portion of a signal
  - Variance (σ²): power of this fluctuation
    $$\sigma^2 = \frac{1}{N-1} \sum_{i=0}^{N-1} (x_i - \mu)^2$$
- rms (root-mean-square)
  - Measures both AC and DC components of a signal
  - A signal with no DC → rms = σ
- [Figure: relationship between σ and peak-to-peak value of several waveforms]
- Two limitations of calculation using
    $$\sigma^2 = \frac{1}{N-1} \sum_{i=0}^{N-1} (x_i - \mu)^2 \qquad \mu = \frac{1}{N} \sum_{i=0}^{N-1} x_i$$
  - If µ >> σ → subtracting two numbers that are very close in value → excessive round-off error
  - Running statistics → requires all samples to be involved in each new calculation
- Solution
    $$\sigma^2 = \frac{1}{N-1} \left[ \sum_{i=0}^{N-1} x_i^2 - \frac{1}{N} \left( \sum_{i=0}^{N-1} x_i \right)^2 \right]$$
- In some situations
  - Mean (µ) = what is being measured
  - Standard deviation (σ) = noise and other interference
  - Signal-to-noise ratio (SNR) = µ/σ
  - Coefficient of variation (CV) = (σ/µ) × 100%
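The running-statistics form of the variance can be sketched in Python as below. This is a minimal illustration, not code from the slides: the class name `RunningStats`, its method names, and the sample values are assumptions chosen for the example. It keeps only N, the sum of the samples, and the sum of their squares, so each new sample updates µ and σ without revisiting earlier samples.

```python
import math


class RunningStats:
    """Hypothetical helper: keeps N, sum(x) and sum(x^2) as samples arrive."""

    def __init__(self):
        self.n = 0           # number of samples seen so far
        self.total = 0.0     # running sum of x_i
        self.total_sq = 0.0  # running sum of x_i^2

    def add(self, x):
        self.n += 1
        self.total += x
        self.total_sq += x * x

    def mean(self):
        return self.total / self.n

    def variance(self):
        # sigma^2 = [sum(x^2) - (sum(x))^2 / N] / (N - 1)
        return (self.total_sq - self.total ** 2 / self.n) / (self.n - 1)

    def std(self):
        # Clamp at zero in case rounding makes the variance slightly negative.
        return math.sqrt(max(self.variance(), 0.0))


def snr(mean, std):
    """Signal-to-noise ratio: mean divided by standard deviation."""
    return mean / std


def cv_percent(mean, std):
    """Coefficient of variation, in percent."""
    return std / mean * 100.0


# Illustrative samples (assumed values, not data from the slides).
signal = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
stats = RunningStats()
for sample in signal:
    stats.add(sample)

print(stats.mean(), stats.std())
print(snr(stats.mean(), stats.std()), cv_percent(stats.mean(), stats.std()))
```

Calling `add` once per incoming sample is enough to query `mean()` and `std()` at any point, which is the property the running-statistics formula is meant to provide.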
Signal vs. Underlying Process
- Statistics
  - Interpreting numerical data such as acquired signals
- Statistical noise
  - Statistics of the acquired signal change each time the experiment is repeated
- Probability
  - Used to understand the processes that generate signals
  - Probabilities of the underlying process are constant
- Typical error in calculating the mean of an underlying process due to statistical noise
    $$\text{Typical error} = \frac{\sigma}{N^{1/2}}$$
- Error in mean
  - Reduces the value of σ
  - To compensate for this
    $$\sigma^2 = \frac{1}{N} \sum_{i=0}^{N-1} (x_i - \mu)^2 \qquad \sigma^2 = \frac{1}{N-1} \sum_{i=0}^{N-1} (x_i - \mu)^2$$
  - Left equation → σ of the acquired signal
  - Right equation → an estimate of σ of the underlying process
- Nonstationary signals
  - Are not a result of statistical noise
  - Generated from the underlying process changing
  - Problem: slowly changing µ interferes with calculating σ
  - Solution: breaking the signal into short sections, calculating statistics for each section, averaging the σs

The Histogram, Pmf and Pdf
- Histogram
  - Displays the number of samples having each possible value
  - Represented by H_i, where i is the index for the value of the sample
  - H_i is the number of samples that have a value of i
  - If M is the number of points in the histogram and N the number of points in the signal
    $$N = \sum_{i=0}^{M-1} H_i$$
    $$\mu = \frac{1}{N} \sum_{i=0}^{M-1} i H_i$$
    $$\sigma^2 = \frac{1}{N-1} \sum_{i=0}^{M-1} (i - \mu)^2 H_i$$
  - Histogram is formed from an acquired signal → noisy
- Probability mass function (pmf)
  - Curve for the underlying process
  - What would be obtained with an infinite number of samples
  - Can be estimated from the histogram or by some mathematical technique
  - Vertical axis of the pmf is expressed on a fractional basis: each value in the histogram divided by the total number of samples
  - Sum of all values in the pmf = 1
- Probability density (or distribution) function (pdf)
  - Pdf is to continuous signals what pmf is to discrete ones
  - Indicates the signal can take on a continuous range of values
  - Vertical axis of the pdf is in units of probability density
  - Total area under the pdf curve = 1
- Problem in calculating histogram
  - Number of levels each sample can take on >> number of samples in the signal
  - Always true for signals represented in floating point notation
  - Previously described approach involves counting the number of samples that have each of the possible quantization levels
  - Not possible with floating point data → billions of possible levels, nearly all of which have no samples
- Solution: binning
  - Done by arbitrarily selecting the length of the histogram to be some convenient number of points, called bins
  - Value of each bin → total number of samples having a value within a certain range

The Normal Distribution
- Signals formed from random processes
  - Usually have a bell-shaped pdf
  - Called normal distribution or Gauss distribution
- Basic shape of the curve generated from
    $$y(x) = e^{-x^2}$$
- Adding adjustable µ and σ + normalizing (area under curve = 1) → general form of the normal distribution
    $$P(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x-\mu)^2 / 2\sigma^2}$$
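The histogram formulas, the pmf estimate, binning, and the general normal pdf can be sketched together in Python. This is a hedged illustration under stated assumptions: the function names, the sample values, the bin range, and the µ and σ used for the area check are all hypothetical choices for the example, not taken from the slides.

```python
import math


def histogram(signal, num_levels):
    """H[i] = number of samples equal to i (integer samples in 0..num_levels-1)."""
    hist = [0] * num_levels
    for x in signal:
        hist[x] += 1
    return hist


def stats_from_histogram(hist):
    """Return N, mean and standard deviation computed from the histogram alone."""
    n = sum(hist)
    mean = sum(i * h for i, h in enumerate(hist)) / n
    var = sum((i - mean) ** 2 * h for i, h in enumerate(hist)) / (n - 1)
    return n, mean, math.sqrt(var)


def pmf_estimate(hist):
    """Estimate the pmf: each histogram value divided by the number of samples."""
    n = sum(hist)
    return [h / n for h in hist]


def binned_histogram(signal, num_bins, low, high):
    """Count floating point samples in num_bins equal-width bins over [low, high)."""
    bins = [0] * num_bins
    width = (high - low) / num_bins
    for x in signal:
        if low <= x < high:
            # min() guards against rounding pushing a sample past the last bin.
            index = min(int((x - low) / width), num_bins - 1)
            bins[index] += 1
    return bins


def normal_pdf(x, mu, sigma):
    """General form of the normal distribution, P(x)."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


# Integer-valued signal (assumed values) and its histogram statistics.
samples = [3, 1, 2, 3, 3, 0, 2, 3]
hist = histogram(samples, 4)
n, mean, std = stats_from_histogram(hist)
pmf = pmf_estimate(hist)

# Floating point samples (assumed values) sorted into 4 bins over [0, 1).
binned = binned_histogram([0.05, 0.15, 0.151, 0.95, 0.5], 4, 0.0, 1.0)

# Rectangle-rule check that the area under the pdf is close to 1.
mu, sigma = 2.0, 0.5
step = 0.01 * sigma
area = sum(normal_pdf(mu + k * step, mu, sigma) * step for k in range(-800, 801))

print(hist, mean, std, pmf, binned, area)
```

Working from the histogram rather than the raw samples means the mean and standard deviation need only M terms instead of N, which matters when the signal is much longer than the number of possible values.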
- Cumulative distribution function (cdf)
  - Integral of the pdf
  - Used to find the probability that a signal is within a certain range of values
  - Problem with Gaussian
    - Cannot be integrated using elementary methods
    - Solution: calculating by numerical integration, providing a table for use in calculating probabilities

Digital Noise Generation
- Generate signals that resemble various types of random noise
  - To test the performance of algorithms that must work in the presence of noise
- Random number generator
  - Heart of digital noise generation
- First method
  - Central limit theorem
    - A sum of random numbers becomes normally distributed as more and more of the random numbers are added together
  - In programming languages
    - X = RND → 0 < X < 1, uniform distribution, µ = 0.5, σ = 1/√12 ≈ 0.29
    - X = RND + RND → 0 < X < 2, triangular distribution, µ = 1, σ = 1/√6
    - X = RND + … + RND (12 times) → 0 < X < 12, Gaussian distribution, µ = 6, σ = 1
  - Can be used to create a normally distributed noise signal
  - For each sample in the signal
    - Add twelve random numbers
    - Subtract six to make the mean equal to zero
    - Multiply by the standard deviation desired
    - Add the desired mean
- Second method
  - Random number generator invoked twice to obtain R1 and R2
  - A normally distributed random number, X, is found (µ = 0, σ = 1)
    $$X = (-2 \log_e R_1)^{1/2} \cos(2\pi R_2)$$
  - To generate normally distributed random signals
    - Take each number generated by the equation above
    - Multiply it by the desired standard deviation
    - Add the desired mean
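Both noise generation methods can be sketched in Python as follows. This is a minimal illustration under assumptions: Python's `random.Random` stands in for RND, and the function names, the seed, the signal length, and the target mean and standard deviation are hypothetical choices for the example.

```python
import math
import random


def noise_sum_of_twelve(n, mean, std, rng):
    """First method: sum twelve uniform numbers, subtract six, scale, shift."""
    out = []
    for _ in range(n):
        x = sum(rng.random() for _ in range(12))  # mean 6, sigma 1
        out.append((x - 6.0) * std + mean)
    return out


def noise_two_uniforms(n, mean, std, rng):
    """Second method: X = (-2 ln R1)^(1/2) * cos(2 pi R2), then scale and shift."""
    out = []
    for _ in range(n):
        r1 = 1.0 - rng.random()  # lies in (0, 1], so log(r1) is always defined
        r2 = rng.random()
        x = math.sqrt(-2.0 * math.log(r1)) * math.cos(2.0 * math.pi * r2)
        out.append(x * std + mean)
    return out


def mean_and_std(signal):
    """Mean and standard deviation (N - 1 in the denominator)."""
    n = len(signal)
    mu = sum(signal) / n
    var = sum((x - mu) ** 2 for x in signal) / (n - 1)
    return mu, math.sqrt(var)


rng = random.Random(12345)  # fixed seed (assumed) so the run is repeatable
a = noise_sum_of_twelve(20000, 10.0, 2.0, rng)
b = noise_two_uniforms(20000, 10.0, 2.0, rng)
mu_a, sd_a = mean_and_std(a)
mu_b, sd_b = mean_and_std(b)

print(mu_a, sd_a)
print(mu_b, sd_b)
```

One visible difference between the two: the sum of twelve uniform numbers can never fall more than six standard deviations from the mean, while the second method has no such hard limit.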
Precision and Accuracy
- Example
  - An oceanographer measuring water depth using a sonar system
  - Location is exactly 1000 meters deep (true value)
  - Oceanographer takes many successive readings
  - Measurements are arranged as the following histogram
  - [Figure: histogram of the depth measurements]
- Accuracy
  - Amount of shift of the mean from the true value
- Precision
  - Width of the distribution, expressed by standard deviation, signal-to-noise ratio, or CV
- Poor repeatability
  - A measurement that has good accuracy, but poor precision
  - Poor precision results from random errors
    - Random errors change each time the measurement is repeated
    - Precision is a measure of random noise
    - Averaging several measurements always improves precision
- Precise measurement, but with poor accuracy
  - Poor accuracy results from systematic errors
    - Systematic errors are repeated in exactly the same manner each time the measurement is conducted
    - Accuracy is dependent on how the system is calibrated
    - Averaging individual measurements does not improve accuracy
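The effect of averaging on precision and accuracy can be illustrated with a small simulation of the sonar example. This is a sketch under assumptions: only the 1000 meter true depth comes from the slides, while the 5 meter systematic offset, the 20 meter random noise, the seed, the group size, and all function names are hypothetical values chosen for the example.

```python
import math
import random

TRUE_DEPTH = 1000.0  # true value from the example, in meters


def take_readings(n, offset, noise_std, rng):
    """Simulated readings: true depth + systematic offset + random noise."""
    return [TRUE_DEPTH + offset + rng.gauss(0.0, noise_std) for _ in range(n)]


def averaged(readings, group):
    """Average consecutive groups of `group` readings."""
    return [
        sum(readings[i:i + group]) / group
        for i in range(0, len(readings) - group + 1, group)
    ]


def mean_and_std(values):
    """Mean and standard deviation (N - 1 in the denominator)."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / (n - 1)
    return mu, math.sqrt(var)


rng = random.Random(2024)                    # fixed seed (assumed) for repeatability
raw = take_readings(10000, 5.0, 20.0, rng)   # assumed 5 m offset, 20 m noise
avg = averaged(raw, 100)                     # 100 averaged measurements

mean_raw, std_raw = mean_and_std(raw)
mean_avg, std_avg = mean_and_std(avg)

print("raw:      mean error", mean_raw - TRUE_DEPTH, "sigma", std_raw)
print("averaged: mean error", mean_avg - TRUE_DEPTH, "sigma", std_avg)
```

In this sketch the spread of the averaged values shrinks by roughly the square root of the group size, while the shift of the mean from the true depth stays at about the assumed offset, since averaging leaves a systematic error in place.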