9 Statistical errors, confidence intervals and limits

In Chapters 5-8, several methods for estimating properties of p.d.f.s (moments and other parameters) have been discussed, along with techniques for obtaining the variance of the estimators. Up to now the topic of 'error analysis' has been limited to reporting the variances (and covariances) of estimators, or equivalently the standard deviations and correlation coefficients. This turns out to be inadequate in certain cases, and other ways of communicating the statistical uncertainty of a measurement must be found.

After reviewing in Section 9.1 what is meant by reporting the standard deviation as an estimate of statistical uncertainty, the confidence interval is introduced in Section 9.2. This allows for a quantitative statement about the fraction of times that such an interval would contain the true value of the parameter in a large number of repeated experiments. Confidence intervals are treated for a number of important cases in Sections 9.3 through 9.6, and are extended to the multidimensional case in Section 9.7. In Sections 9.8 and 9.9, both Bayesian and classical confidence intervals are used to estimate limits on parameters near a physically excluded region.

9.1 The standard deviation as statistical error

Suppose the result of an experiment is an estimate of a certain parameter. The variance (or equivalently its square root, the standard deviation) of the estimator is a measure of how widely the estimates would be distributed if the experiment were to be repeated many times with the same number of observations per experiment. As such, the standard deviation $\sigma$ is often reported as the statistical uncertainty of a measurement, and is referred to as the standard error.

For example, suppose one has $n$ observations of a random variable $x$ and a hypothesis for the p.d.f. $f(x;\theta)$ which contains an unknown parameter $\theta$. From the sample $x_1, \ldots, x_n$ a function $\hat{\theta}(x_1, \ldots, x_n)$ is constructed (e.g. using maximum likelihood) as an estimator for $\theta$. Using one of the techniques discussed in Chapters 5-8 (e.g. analytic method, RCF bound, Monte Carlo, graphical), the standard deviation of $\hat{\theta}$ can be estimated. Let $\hat{\theta}_{\rm obs}$ be the value of the estimator actually observed, and $\hat{\sigma}_{\hat{\theta}}$ the estimate of its standard deviation. In reporting the measurement as $\hat{\theta}_{\rm obs} \pm \hat{\sigma}_{\hat{\theta}}$, one means that repeated estimates would be distributed according to a sampling p.d.f. $g(\hat{\theta})$ centered around some true value $\theta$ and with true standard deviation $\sigma_{\hat{\theta}}$, which are estimated to be $\hat{\theta}_{\rm obs}$ and $\hat{\sigma}_{\hat{\theta}}$.

For most practical estimators, the sampling p.d.f. $g(\hat{\theta})$ becomes approximately Gaussian in the large sample limit. If more than one parameter is estimated, then the p.d.f. becomes a multidimensional Gaussian characterized by a covariance matrix $V$. Thus by estimating the standard deviation, or for more than one parameter the covariance matrix, one effectively summarizes all of the information available about how repeated estimates would be distributed. By using the error propagation techniques of Section 1.6, the covariance matrix also gives the equivalent information, at least approximately, for functions of the estimators.

9.2 Classical confidence intervals (exact method)

Although the 'standard deviation' definition of statistical error bars could in principle be used regardless of the form of the estimator's p.d.f., a more precise statement of the uncertainty can be made with a confidence interval. Suppose the p.d.f. of an estimator $\hat{\theta}$ is $g(\hat{\theta};\theta)$, where $\theta$ is the true (unknown) value of the parameter. From $g(\hat{\theta};\theta)$ one can determine the value $u_\alpha$ such that there is a fixed probability $\alpha$ to observe $\hat{\theta} \ge u_\alpha$, and similarly the value $v_\beta$ such that there is a probability $\beta$ to observe $\hat{\theta} \le v_\beta$. The values $u_\alpha$ and $v_\beta$ depend on the true value of $\theta$, and are thus determined by

  $\alpha = P(\hat{\theta} \ge u_\alpha(\theta)) = \int_{u_\alpha(\theta)}^{\infty} g(\hat{\theta};\theta)\,d\hat{\theta} = 1 - G(u_\alpha(\theta);\theta)$ ,   (9.1)

  $\beta = P(\hat{\theta} \le v_\beta(\theta)) = \int_{-\infty}^{v_\beta(\theta)} g(\hat{\theta};\theta)\,d\hat{\theta} = G(v_\beta(\theta);\theta)$ ,   (9.2)

where $G$ is the cumulative distribution corresponding to $g(\hat{\theta};\theta)$.
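As a concrete sketch of these definitions, the belt functions $u_\alpha(\theta)$ and $v_\beta(\theta)$ take a simple closed form when $g(\hat{\theta};\theta)$ is Gaussian with known standard deviation. The following fragment (Python with scipy, used here in place of the CERNLIB routines cited later in this chapter; the values $\sigma_{\hat{\theta}} = 1$ and $\alpha = \beta = 0.05$ are purely illustrative) shows the construction:

```python
# Belt functions u_alpha(theta), v_beta(theta) of eqs. (9.1), (9.2) for a
# Gaussian estimator with known standard deviation (illustrative values).
from scipy.stats import norm

sigma = 1.0               # assumed standard deviation of the estimator
alpha, beta = 0.05, 0.05  # one-sided tail probabilities

def u_alpha(theta):
    # P(theta_hat >= u_alpha) = alpha  =>  u_alpha = theta + sigma * Phi^{-1}(1 - alpha)
    return theta + sigma * norm.ppf(1.0 - alpha)

def v_beta(theta):
    # P(theta_hat <= v_beta) = beta    =>  v_beta = theta - sigma * Phi^{-1}(1 - beta)
    return theta - sigma * norm.ppf(1.0 - beta)

print(u_alpha(3.0), v_beta(3.0))  # edges of the confidence belt at theta = 3
```

Plotting $u_\alpha(\theta)$ and $v_\beta(\theta)$ as functions of $\theta$ gives the confidence belt; reading it horizontally at the observed value $\hat{\theta}_{\rm obs}$ yields the interval $[a, b]$ discussed next.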
Fig. 9.2 Construction of the confidence interval $[a, b]$ given an observed value $\hat{\theta}_{\rm obs}$ of the estimator $\hat{\theta}$.

If the functions $u_\alpha(\theta)$ and $v_\beta(\theta)$ are monotonically increasing, one can define the inverse functions

  $a(\hat{\theta}) \equiv u_\alpha^{-1}(\hat{\theta})$ ,  $b(\hat{\theta}) \equiv v_\beta^{-1}(\hat{\theta})$ .   (9.3)

The inequalities $\hat{\theta} \ge u_\alpha(\theta)$ and $\hat{\theta} \le v_\beta(\theta)$ then imply respectively $a(\hat{\theta}) \ge \theta$ and $b(\hat{\theta}) \le \theta$. Equations (9.1) and (9.2) thus become

  $P(a(\hat{\theta}) \ge \theta) = \alpha$ ,  $P(b(\hat{\theta}) \le \theta) = \beta$ ,   (9.4)

or taken together,

  $P(a(\hat{\theta}) \le \theta \le b(\hat{\theta})) = 1 - \alpha - \beta$ .   (9.5)

The interval $[a, b]$ determined in this way is called a confidence interval at a confidence level (or coverage probability) of $1 - \alpha - \beta$.

Note the relation between a confidence interval and a test of goodness-of-fit. If one took $\theta = a$ as a hypothesis and regarded outcomes $\hat{\theta} \ge \hat{\theta}_{\rm obs}$ as having equal or less agreement with the hypothesis than the result obtained (a one-sided test), then the resulting $P$-value of the test is $\alpha$. For the confidence interval, however, the probability $\alpha$ is specified first, and the value $a$ is a random quantity depending on the data. For a goodness-of-fit test, the hypothesis, here $\theta = a$, is specified and the $P$-value is treated as a random variable. Note that one sometimes calls the $P$-value, here equal to $\alpha$, the 'confidence level' of the test, whereas the one-sided confidence interval $\theta > a$ has a confidence level of $1 - \alpha$. That is, for a test, small $\alpha$ indicates a low level of confidence in the hypothesis $\theta = a$. For a confidence interval, small $\alpha$ indicates a high level of confidence that the interval $\theta > a$ includes the true parameter. To avoid confusion we will use the term $P$-value or (observed) significance level for goodness-of-fit tests, and reserve the term confidence level to mean the coverage probability of a confidence interval.

The confidence interval $[a, b]$ is often expressed by reporting the result of a measurement as $\hat{\theta}^{+d}_{-c}$, where $\hat{\theta}$ is the estimated value, and $c = \hat{\theta} - a$ and $d = b - \hat{\theta}$ are usually displayed as error bars. In many cases the p.d.f. $g(\hat{\theta};\theta)$ is approximately Gaussian, so that an interval of plus or minus one standard deviation around the measured value corresponds to a central confidence interval with $1 - \gamma = 0.683$ (see Section 9.3). The 68.3% central confidence interval is usually adopted as the conventional definition for error bars even when the p.d.f. of the estimator is not Gaussian.

If, for example, the result of an experiment is reported as $\hat{\theta} = 5.79^{+0.32}_{-0.25}$, it is meant that if one were to construct the interval $[\hat{\theta} - c, \hat{\theta} + d]$ according to the prescription described above in a large number of similar experiments with the same number of measurements per experiment, then the interval would include the true value $\theta$ in a fraction $1 - \alpha - \beta$ of the cases. It does not mean that the probability (in the sense of relative frequency) that the true value of $\theta$ is in the fixed interval $[5.54, 6.11]$ is $1 - \alpha - \beta$. In the frequency interpretation, the true parameter $\theta$ is not a random variable and is assumed not to fluctuate from experiment to experiment. In this sense the probability that $\theta$ is in $[5.54, 6.11]$ is either 0 or 1, but we do not know which. The interval itself, however, is subject to fluctuations since it is constructed from the data.

A difficulty in constructing confidence intervals is that the p.d.f. of the estimator $g(\hat{\theta};\theta)$, or equivalently the cumulative distribution $G(\hat{\theta};\theta)$, must be known. An example is given in Section 10.4, where the p.d.f. for the estimator of the mean $\xi$ of an exponential distribution is derived, and from this a confidence interval for $\xi$ is determined. In many practical applications, estimators are Gaussian distributed (at least approximately). In this case the confidence interval can be determined easily; this is treated in detail in the next section. Even in the case of a non-Gaussian estimator, however, a simple approximate technique can be applied using the likelihood function; this is described in Section 9.6.
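The frequency interpretation above lends itself to a direct numerical check. The following sketch (Python with numpy; a Gaussian estimator is assumed and the true value, standard deviation and number of experiments are illustrative) simulates many repetitions of an experiment and counts how often the conventional 'one sigma' interval covers the true value:

```python
# Coverage check of the central 68.3% ('one sigma') confidence interval,
# assuming a Gaussian estimator; all numerical values are illustrative.
import numpy as np

rng = np.random.default_rng(1)
theta_true, sigma, n_expts = 5.79, 1.0, 100_000

theta_hat = rng.normal(theta_true, sigma, n_expts)  # one estimate per experiment
covered = (theta_hat - sigma <= theta_true) & (theta_true <= theta_hat + sigma)
print(covered.mean())  # fraction of intervals containing theta_true, ~0.683
```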
9.3 Confidence interval for a Gaussian distributed estimator

A simple and very important application of a confidence interval is when the distribution of $\hat{\theta}$ is Gaussian with mean $\theta$ and standard deviation $\sigma_{\hat{\theta}}$. That is, the cumulative distribution of $\hat{\theta}$ is

  $G(\hat{\theta};\theta,\sigma_{\hat{\theta}}) = \int_{-\infty}^{\hat{\theta}} \frac{1}{\sqrt{2\pi\sigma_{\hat{\theta}}^2}} \exp\!\left(\frac{-(\hat{\theta}'-\theta)^2}{2\sigma_{\hat{\theta}}^2}\right) d\hat{\theta}' = \Phi\!\left(\frac{\hat{\theta}-\theta}{\sigma_{\hat{\theta}}}\right)$ ,   (9.6)

where $\Phi$ is the cumulative distribution of the standard Gaussian. Using this in equations (9.1) and (9.2), the interval limits $a$ and $b$ given the observed value $\hat{\theta}_{\rm obs}$ are found by solving

  $\alpha = P(\hat{\theta} \ge \hat{\theta}_{\rm obs}; a) = 1 - G(\hat{\theta}_{\rm obs}; a, \sigma_{\hat{\theta}})$ ,  $\beta = P(\hat{\theta} \le \hat{\theta}_{\rm obs}; b) = G(\hat{\theta}_{\rm obs}; b, \sigma_{\hat{\theta}})$ ,   (9.9)

which gives

  $a = \hat{\theta}_{\rm obs} - \sigma_{\hat{\theta}}\,\Phi^{-1}(1-\alpha)$ ,  $b = \hat{\theta}_{\rm obs} + \sigma_{\hat{\theta}}\,\Phi^{-1}(1-\beta)$ .   (9.10)

The quantiles $\Phi^{-1}(1-\alpha)$ and $\Phi^{-1}(1-\beta)$ represent how far away the interval limits $a$ and $b$ are located with respect to the estimate $\hat{\theta}_{\rm obs}$ in units of the standard deviation $\sigma_{\hat{\theta}}$. The relationship between the quantiles of the standard Gaussian distribution and the confidence level is illustrated in Fig. 9.4(a) for central and Fig. 9.4(b) for one-sided confidence intervals.

Consider a central confidence interval with $\alpha = \beta = \gamma/2$. The confidence level $1-\gamma$ is often chosen such that the quantile is a small integer, e.g. $\Phi^{-1}(1-\gamma/2) = 1, 2, 3, \ldots$. Similarly, for one-sided intervals (limits) one often chooses a small integer for $\Phi^{-1}(1-\alpha)$. Commonly used values for both central and one-sided intervals are shown in Table 9.1. Alternatively one can choose a round number for the confidence level instead of for the quantile. Commonly used values are shown in Table 9.2. Other possible values can be obtained from [Bra92, Fro79, Dud88] or from computer routines (e.g. the routine GAUSIN in [CER97]).

Table 9.1 The values of the confidence level for different values of the quantile of the standard Gaussian $\Phi^{-1}$: for central intervals (left) the quantile $\Phi^{-1}(1-\gamma/2)$ and confidence level $1-\gamma$; for one-sided intervals (right) the quantile $\Phi^{-1}(1-\alpha)$ and confidence level $1-\alpha$.

  Φ^{-1}(1-γ/2)   1-γ        Φ^{-1}(1-α)   1-α
  1               0.6827     1             0.8413
  2               0.9544     2             0.9772
  3               0.9973     3             0.9987

Table 9.2 The values of the quantile of the standard Gaussian $\Phi^{-1}$ for different values of the confidence level: for central intervals (left) the confidence level $1-\gamma$ and the quantile $\Phi^{-1}(1-\gamma/2)$; for one-sided intervals (right) the confidence level $1-\alpha$ and the quantile $\Phi^{-1}(1-\alpha)$.

  1-γ     Φ^{-1}(1-γ/2)      1-α     Φ^{-1}(1-α)
  0.90    1.645              0.90    1.282
  0.95    1.960              0.95    1.645
  0.99    2.576              0.99    2.326

For the conventional 68.3% central confidence interval one has $\alpha = \beta = \gamma/2$, with $\Phi^{-1}(1-\gamma/2) = 1$, i.e. a '1$\sigma$ error bar'. This results in the simple prescription $[a, b] = [\hat{\theta}_{\rm obs} - \hat{\sigma}_{\hat{\theta}},\, \hat{\theta}_{\rm obs} + \hat{\sigma}_{\hat{\theta}}]$. One can also give only a lower or upper limit, i.e. a one-sided confidence interval,

  $\theta_{\rm lo} = \hat{\theta}_{\rm obs} - \sigma_{\hat{\theta}}\,\Phi^{-1}(1-\alpha)$ ,  $\theta_{\rm up} = \hat{\theta}_{\rm obs} + \sigma_{\hat{\theta}}\,\Phi^{-1}(1-\beta)$ .   (9.12)

9.4 Confidence interval for the mean of a Poisson variable

Consider now an experiment in which one counts a number of events $n$, regarded as a Poisson variable with unknown mean $\nu$. The ML estimator is $\hat{\nu} = n$, so that an experiment yielding $n_{\rm obs}$ events gives $\hat{\nu}_{\rm obs} = n_{\rm obs}$, and from this we would like to construct a confidence interval for the mean $\nu$.

For the case of a discrete variable, the procedure for determining the confidence interval described in Section 9.2 cannot be directly applied. This is because the functions $u_\alpha(\theta)$ and $v_\beta(\theta)$, which determine the confidence belt, do not exist for all values of the parameter $\theta$. For the Poisson case, for example, we would need to find $u_\alpha(\nu)$ and $v_\beta(\nu)$ such that $P(\hat{\nu} \ge u_\alpha(\nu)) = \alpha$ and $P(\hat{\nu} \le v_\beta(\nu)) = \beta$ for all values of the parameter $\nu$. But if $\alpha$ and $\beta$ are fixed, then because $\hat{\nu}$ only takes on discrete values, these equations hold in general only for particular values of $\nu$.

A confidence interval $[a, b]$ can still be determined, however, by using equations (9.9). For the case of a discrete random variable and a parameter $\nu$ these become

  $\alpha = P(\hat{\nu} \ge \hat{\nu}_{\rm obs}; a)$ ,  $\beta = P(\hat{\nu} \le \hat{\nu}_{\rm obs}; b)$ ,   (9.15)

and in particular for a Poisson variable one has

  $\alpha = \sum_{n=n_{\rm obs}}^{\infty} f(n;a) = 1 - \sum_{n=0}^{n_{\rm obs}-1} \frac{a^n}{n!} e^{-a}$ ,  $\beta = \sum_{n=0}^{n_{\rm obs}} f(n;b) = \sum_{n=0}^{n_{\rm obs}} \frac{b^n}{n!} e^{-b}$ .   (9.16)

For an estimate $\hat{\nu} = n_{\rm obs}$ and given probabilities $\alpha$ and $\beta$, these equations can be solved numerically for $a$ and $b$. Here one can use the following relation between the Poisson and $\chi^2$ distributions,

  $\sum_{n=0}^{n_{\rm obs}} \frac{\nu^n}{n!} e^{-\nu} = \int_{2\nu}^{\infty} f_{\chi^2}(z; n_d = 2(n_{\rm obs}+1))\,dz = 1 - F_{\chi^2}(2\nu;\, n_d = 2(n_{\rm obs}+1))$ ,   (9.17)

where $f_{\chi^2}$ is the $\chi^2$ p.d.f. for $n_d$ degrees of freedom and $F_{\chi^2}$ is the corresponding cumulative distribution.
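Relation (9.17) is easy to check numerically; a minimal sketch (Python with scipy assumed, with $\nu$ and $n_{\rm obs}$ arbitrary test values) compares the two sides:

```python
# Check of the Poisson - chi^2 relation (9.17):
#   sum_{n=0}^{n_obs} nu^n e^{-nu} / n! = 1 - F_chi2(2*nu; 2*(n_obs + 1)).
from scipy.stats import poisson, chi2

nu, n_obs = 4.2, 3                            # arbitrary test values
lhs = poisson.cdf(n_obs, nu)                  # left-hand side: Poisson sum
rhs = chi2.sf(2.0 * nu, df=2 * (n_obs + 1))   # right-hand side: chi^2 tail
print(lhs, rhs)                               # identical up to rounding
```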
Using this relation in equations (9.16), one then has

  $a = \tfrac{1}{2} F_{\chi^2}^{-1}(\alpha;\, n_d = 2 n_{\rm obs})$ ,  $b = \tfrac{1}{2} F_{\chi^2}^{-1}(1-\beta;\, n_d = 2(n_{\rm obs}+1))$ .   (9.18)

Quantiles $F_{\chi^2}^{-1}$ of the $\chi^2$ distribution can be obtained from standard tables (e.g. in [Bra92]) or from computer routines such as CHISIN in [CER97]. Some values for $n_{\rm obs} = 0, \ldots, 10$ are shown in Table 9.3. Note that the lower limit $a$ cannot be determined if $n_{\rm obs} = 0$.

Equations (9.15) say that if $\nu = a$ ($\nu = b$), then the probability is $\alpha$ ($\beta$) to observe a value greater (less) than or equal to the one actually observed. Because the case of equality, $\hat{\nu} = \hat{\nu}_{\rm obs}$, is included in the inequalities (9.15), one obtains a conservatively large confidence interval, i.e.

  $P(a \le \nu) \ge 1 - \alpha$ ,  $P(\nu \le b) \ge 1 - \beta$ ,  $P(a \le \nu \le b) \ge 1 - \alpha - \beta$ .   (9.19)

An important special case is when the observed number $n_{\rm obs}$ is zero, and one is interested in establishing an upper limit $b$. Equation (9.16) then becomes

  $\beta = \sum_{n=0}^{0} \frac{b^n}{n!} e^{-b} = e^{-b}$ ,   (9.20)

or $b = -\log\beta$; for $1 - \beta = 0.95$ this gives the well-known upper limit $b \approx 3.0$.

9.5 Confidence interval for correlation coefficient, transformation of parameters

For the sample correlation coefficient $r$ of $n$ data pairs, used as an estimator for the true correlation coefficient $\rho$, the p.d.f. is markedly non-Gaussian unless the sample is very large. A central confidence interval $[a', b']$ can nevertheless be constructed as in Section 9.3 for the transformed variable $z = \tanh^{-1} r$, which is approximately Gaussian, and converted back into an interval $[A, B]$ for $\rho$ simply by using the inverse of the transformation (9.22), i.e. $A = \tanh a'$ and $B = \tanh b'$.

Consider for example a sample of size $n = 20$ for which one has obtained the estimate $r = 0.5$. From equation (5.17) the standard deviation of $r$ can be estimated as $\hat{\sigma}_r = (1 - r^2)/\sqrt{n} = 0.168$. If one were to make the incorrect approximation that $r$ is Gaussian distributed for such a small sample, this would lead to a 68.3% central confidence interval for $\rho$ of $[0.332, 0.668]$, or $[0.067, 0.933]$ at a confidence level of 99%. Thus since the sample correlation coefficient $r$ is almost three times the standard error $\hat{\sigma}_r$ away from zero, the Gaussian approximation would suggest a significant correlation; for such a small sample, however, the statement should be based on the interval obtained from the transformed variable $z$.

9.7 Multidimensional confidence regions

For $n$ parameters $\theta = (\theta_1, \ldots, \theta_n)$, a confidence region with coverage probability $1-\gamma$ is given (in the large-sample limit, where the likelihood function becomes Gaussian) by the set of $\theta$ values for which

  $\log L(\theta) \ge \log L_{\rm max} - \frac{Q_\gamma}{2}$ ,   (9.37)

where $Q_\gamma$ is the corresponding quantile of the $\chi^2$ distribution. As in the single-parameter case, one can still use the prescription given by (9.37) even if the likelihood function is not Gaussian, in which case the probability statement (9.34) is only approximate. For an increasing number of parameters, the approach to the Gaussian limit becomes slower as a function of the sample size, and furthermore it is difficult to quantify when a sample is large enough for (9.34) to apply. If needed, one can determine the probability that a region constructed according to (9.37) includes the true parameter by means of a Monte Carlo calculation.

Quantiles $Q_\gamma = F_{\chi^2}^{-1}(1-\gamma; n)$ of the $\chi^2$ distribution for several confidence levels $1-\gamma$ and $n = 1, 2, 3, 4, 5$ parameters are given in Table 9.5; values of the confidence level for various values of the quantile $Q_\gamma$ are shown in Table 9.4.

Table 9.4 The values of the confidence level $1-\gamma$ for different values of $Q_\gamma$ and for $n = 1, 2, 3, 4, 5$ fitted parameters.

  Q_γ    n = 1   n = 2   n = 3   n = 4   n = 5
  1.0    0.683   0.393   0.199   0.090   0.037
  2.0    0.843   0.632   0.428   0.264   0.151
  4.0    0.954   0.865   0.739   0.594   0.451
  9.0    0.997   0.989   0.971   0.939   0.891

Table 9.5 The values of the quantile $Q_\gamma$ for different values of the confidence level $1-\gamma$, for $n = 1, 2, 3, 4, 5$ fitted parameters.

  1-γ     n = 1   n = 2   n = 3   n = 4   n = 5
  0.683   1.00    2.30    3.53    4.72    5.89
  0.90    2.71    4.61    6.25    7.78    9.24
  0.95    3.84    5.99    7.82    9.49    11.1
  0.99    6.63    9.21    11.3    13.3    15.1

For $n = 1$ the expression (9.36) for $Q_\gamma$ can be shown to imply

  $\sqrt{Q_\gamma} = \Phi^{-1}(1 - \gamma/2)$ ,   (9.38)

where $\Phi^{-1}$ is the inverse function of the standard normal cumulative distribution. The procedure here thus reduces to that for a single parameter given in Section 9.6, where $N = \sqrt{Q_\gamma}$ is the half-width of the interval in standard deviations (see equations (9.28), (9.29)). The values for $n = 1$ in Tables 9.4 and 9.5 are thus related to those in Tables 9.1 and 9.2 by equation (9.38).
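The entries in both tables follow directly from the cumulative $\chi^2$ distribution; a short sketch (Python with scipy, standing in for the tabulated values) reproduces them and checks relation (9.38):

```python
# Reproduce Tables 9.4 and 9.5 from the chi^2 distribution, and check
# relation (9.38) for n = 1.
from scipy.stats import chi2, norm

for Q in (1.0, 2.0, 4.0, 9.0):                       # Table 9.4: 1 - gamma
    print(Q, [round(chi2.cdf(Q, df=n), 3) for n in range(1, 6)])

for cl in (0.683, 0.90, 0.95, 0.99):                 # Table 9.5: Q_gamma
    print(cl, [round(chi2.ppf(cl, df=n), 2) for n in range(1, 6)])

gamma = 0.10                                         # e.g. 1 - gamma = 0.90
print(chi2.ppf(1 - gamma, df=1) ** 0.5,              # sqrt(Q_gamma) ...
      norm.ppf(1 - gamma / 2))                       # ... equals Phi^{-1}(1 - gamma/2)
```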
For increasing $n$, the confidence level for a given $Q_\gamma$ decreases. For example, in the single-parameter case, $Q_\gamma = 1$ corresponds to $1-\gamma = 0.683$. For $n = 2$, $Q_\gamma = 1$ gives a confidence level of only 0.393, and in order to obtain $1-\gamma = 0.683$ one needs $Q_\gamma = 2.30$. We should emphasize that, as in the single-parameter case, the confidence region $Q(\hat{\theta}, \theta) \le Q_\gamma$ is a random region in $\theta$-space. The confidence region varies upon repetition of the experiment, since $\hat{\theta}$ is a random variable. The true parameters, on the other hand, are unknown constants.

9.8 Limits near a physical boundary

Often the purpose of an experiment is to search for a new effect, the existence of which would imply that a certain parameter is not equal to zero. For example, one could attempt to measure the mass of the neutrino, which in the standard theory is massless. If the data yield a value of the parameter significantly different from zero, then the new effect has been discovered, and the parameter's value and a confidence interval to reflect its error are given as the result. If, on the other hand, the data result in a fitted value of the parameter that is consistent with zero, then the result of the experiment is reported by giving an upper limit on the parameter. (A similar situation occurs when absence of the new effect corresponds to a parameter being large or infinite; one then places a lower limit. For simplicity we will consider here only upper limits.)

Difficulties arise when an estimator can take on values in the excluded region. This can occur if the estimator $\hat{\theta}$ for a parameter $\theta$ is of the form $\hat{\theta} = x - y$, where both $x$ and $y$ are random variables, i.e. they have random measurement errors. The mass squared of a particle, for example, can be estimated by measuring independently its energy $E$ and momentum $p$, and using $\widehat{m^2} = E^2 - p^2$. Although the mass squared should come out positive, measurement errors in $E$ and $p$ could result in a negative value for $\widehat{m^2}$. Then the question is how to place a limit on $m^2$, or more generally on a parameter $\theta$ when the estimate is in or near an excluded region.

Consider further the example of an estimator $\hat{\theta} = x - y$ where $x$ and $y$ are Gaussian variables with means $\mu_x$, $\mu_y$ and variances $\sigma_x^2$, $\sigma_y^2$. One can show that the difference $\hat{\theta} = x - y$ is also a Gaussian variable with mean $\theta = \mu_x - \mu_y$ and variance $\sigma_{\hat{\theta}}^2 = \sigma_x^2 + \sigma_y^2$. (This can be shown using characteristic functions as described in Chapter 10.) Assume that $\theta$ is known a priori to be non-negative (e.g. like the mass squared), and suppose the experiment has resulted in a value $\hat{\theta}_{\rm obs}$ for the estimator $\hat{\theta}$. According to (9.12), the upper limit $\theta_{\rm up}$ at a confidence level $1-\beta$ is

  $\theta_{\rm up} = \hat{\theta}_{\rm obs} + \sigma_{\hat{\theta}}\,\Phi^{-1}(1-\beta)$ .   (9.39)

For the commonly used 95% confidence level one obtains from Table 9.2 the quantile $\Phi^{-1}(0.95) = 1.645$. The interval $(-\infty, \theta_{\rm up}]$ is constructed to include the true value $\theta$ with a probability of 95%, regardless of what $\theta$ actually is.

Suppose now that the standard deviation is $\sigma_{\hat{\theta}} = 1$ and the experiment has yielded $\hat{\theta}_{\rm obs} = -2.0$. Equation (9.39) then gives an upper limit $\theta_{\rm up} = -0.355$ at 95% confidence level, i.e. the entire interval $(-\infty, \theta_{\rm up}]$ lies in the physically excluded region $\theta < 0$. By construction such an outcome occurs in a fraction $\beta$ of experiments, but the reported interval then contains no physically possible value of $\theta$. One possibility is to report the measured value and its standard deviation anyway, even if the estimate is in the physically excluded region. In this way, the average of many experiments (e.g. as in Section 7.6) will converge to the correct value as long as the estimator is unbiased. In cases where the p.d.f. of $\hat{\theta}$ is significantly non-Gaussian, the entire likelihood function $L(\theta)$ should be given, which can be combined with that of other experiments as discussed in Section 6.12.
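A minimal numerical sketch of equation (9.39) for this example (Python with scipy; $\hat{\theta}_{\rm obs} = -2.0$ and $\sigma_{\hat{\theta}} = 1$ as in the text) makes the difficulty explicit:

```python
# Classical upper limit (9.39) for the example above: sigma = 1.0 and
# theta_hat_obs = -2.0 (values taken from the text's example).
from scipy.stats import norm

theta_obs, sigma = -2.0, 1.0
for cl in (0.95, 0.99):
    print(cl, round(theta_obs + sigma * norm.ppf(cl), 3))
# 0.95 -> -0.355: the entire interval lies in the excluded region theta < 0
# 0.99 ->  0.326: positive, but smaller than the resolution sigma = 1
```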
Nevertheless, most experimenters want to report some sort of upper limit, and in situations such as the one described above a number of techniques have been proposed (see e.g. [Hig83, Jam91]). There is unfortunately no established convention on how this should be done, and one should therefore state what procedure was used.

As a solution to the difficulties posed by an upper limit in an unphysical region, one might be tempted to simply increase the confidence level until the limit enters the allowed region. In the previous example, if we had taken a confidence level $1-\beta = 0.99$, then from Table 9.2 one has $\Phi^{-1}(0.99) = 2.326$, giving $\theta_{\rm up} = 0.326$. This would lead one to quote an upper limit that is smaller than the intrinsic resolution of the experiment. Carried to an extreme, the confidence level could be tuned to give an arbitrarily small positive limit: $\Phi^{-1}(0.97725) = 2.00001$, or $\theta_{\rm up} = 10^{-5}$ at a confidence level of 97.725%!

In order to avoid this type of difficulty, a commonly used technique is to simply shift a negative estimate to zero before applying equation (9.39), i.e.

  $\theta_{\rm up} = \max(\hat{\theta}_{\rm obs}, 0) + \sigma_{\hat{\theta}}\,\Phi^{-1}(1-\beta)$ .   (9.40)

In this way the upper limit is always at least the same order of magnitude as the resolution of the experiment. If $\hat{\theta}_{\rm obs}$ is positive, the limit coincides with that of the classical procedure. This technique has a certain intuitive appeal and is often used, but the interpretation as an interval that will cover the true parameter value with probability $1-\beta$ no longer applies. The coverage probability is clearly greater than $1-\beta$, since the shifted upper limit (9.40) is in all cases greater than or equal to the classical one (9.39).

Another alternative is to report an interval based on the Bayesian posterior p.d.f. $p(\theta|x)$. As in Section 6.13, this is obtained from Bayes' theorem,

  $p(\theta|x) = \frac{L(x|\theta)\,\pi(\theta)}{\int L(x|\theta')\,\pi(\theta')\,d\theta'}$ ,   (9.41)

where $x$ represents the observed data, $L(x|\theta)$ is the likelihood function and $\pi(\theta)$ is the prior p.d.f. for $\theta$. In Section 6.13, the mode of $p(\theta|x)$ was used as an estimator for $\theta$, and it was shown that this coincides with the ML estimator if the prior density $\pi(\theta)$ is uniform. Here, we can use $p(\theta|x)$ to determine an interval $[a, b]$ such that for given probabilities $\alpha$ and $\beta$ one has

  $\alpha = \int_{-\infty}^{a} p(\theta|x)\,d\theta$ ,  $\beta = \int_{b}^{\infty} p(\theta|x)\,d\theta$ .   (9.42)

The prior knowledge $\theta \ge 0$ can easily be incorporated by setting the prior p.d.f. $\pi(\theta)$ to zero in the excluded region. Bayes' theorem then gives a posterior probability $p(\theta|x)$ with $p(\theta|x) = 0$ for $\theta < 0$. The upper limit is thus determined by

  $1 - \beta = \int_{-\infty}^{\theta_{\rm up}} p(\theta|x)\,d\theta = \frac{\int_{-\infty}^{\theta_{\rm up}} L(x|\theta)\,\pi(\theta)\,d\theta}{\int_{-\infty}^{\infty} L(x|\theta)\,\pi(\theta)\,d\theta}$ .   (9.43)

The difficulties here have already been mentioned in Section 6.13, namely that there is no unique way to specify the prior density $\pi(\theta)$. A common choice is

  $\pi(\theta) = 0$ for $\theta < 0$, $\pi(\theta) = 1$ for $\theta \ge 0$.   (9.44)

The prescription says in effect: normalize the likelihood function to unit area in the physical region, and then integrate it out to $\theta_{\rm up}$ such that the fraction of area covered is $1-\beta$. Although the method is simple, it has some conceptual drawbacks. For the case where one knows $\theta \ge 0$ (e.g. the neutrino mass) one does not really believe that $0 \le \theta \le 1$ has the same prior probability as $10^{40} \le \theta \le 10^{40} + 1$. Furthermore, the upper limit derived from $\pi(\theta) = $ constant is not invariant with respect to a nonlinear transformation of the parameter. It has been argued [Jef48] that in cases where $\theta \ge 0$ but with no other prior information, one should use

  $\pi(\theta) = 0$ for $\theta < 0$, $\pi(\theta) = 1/\theta$ for $\theta > 0$.   (9.45)

This has the advantage that upper limits are invariant with respect to a transformation of the parameter by raising to an arbitrary power. It is equivalent to a uniform (improper) prior of the form (9.44) for $\log\theta$. For this to be usable, however, the likelihood function must go to zero for $\theta \to 0$ and $\theta \to \infty$, or else the integrals in (9.43) diverge. It is thus not applicable in a number of cases of practical interest, including the example discussed in this section. Therefore, despite its conceptual difficulties, the uniform prior density is the most commonly used choice for setting limits on parameters.
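The three prescriptions can be compared directly, as in Fig. 9.8 below. For a Gaussian likelihood with a flat prior, equation (9.43) has a closed-form solution, since the posterior is simply a Gaussian truncated at zero. A minimal sketch (Python with scipy; $\sigma_{\hat{\theta}} = 1$ as in the example above, and the grid of $\hat{\theta}_{\rm obs}$ values is illustrative):

```python
# 95% upper limits for theta >= 0 with a Gaussian estimator: classical (9.39),
# shifted (9.40), and Bayesian with uniform prior (9.43)-(9.44).
from scipy.stats import norm

sigma, beta = 1.0, 0.05
q = norm.ppf(1.0 - beta)                 # Phi^{-1}(0.95) = 1.645

def classical(t):
    return t + sigma * q                 # eq. (9.39)

def shifted(t):
    return max(t, 0.0) + sigma * q       # eq. (9.40)

def bayes_flat(t):
    # Posterior is a Gaussian in theta, centred at t, truncated to theta >= 0;
    # c is the fraction of the untruncated Gaussian lying below zero.
    c = norm.cdf(-t / sigma)
    return t + sigma * norm.ppf(c + (1.0 - beta) * (1.0 - c))

for t in (-2.0, -1.0, 0.0, 1.0, 3.0):
    print(t, round(classical(t), 3), round(shifted(t), 3), round(bayes_flat(t), 3))
```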
Figure 9.8 shows the upper limits at 95% confidence level derived according to the classical, shifted and Bayesian techniques as a function of $\hat{\theta}_{\rm obs} = x - y$ for $\sigma_{\hat{\theta}} = 1$. For the Bayesian limit, a prior density $\pi(\theta) = $ constant was used. The shifted and classical techniques are equal for $\hat{\theta}_{\rm obs} > 0$. The Bayesian limit is always positive, and is always greater than the classical limit. As $\hat{\theta}_{\rm obs}$ becomes larger than the experimental resolution $\sigma_{\hat{\theta}}$, the Bayesian and classical limits rapidly approach each other.

Fig. 9.8 Upper limits at 95% confidence level for the example of Section 9.8 using the classical, shifted and Bayesian ($\pi(\theta) = $ const.) techniques. The shifted and classical techniques are equal for $\hat{\theta}_{\rm obs} > 0$.

9.9 Upper limit on the mean of Poisson variable with background

As a final example, recall Section 9.4 where an upper limit was placed on the mean $\nu$ of a Poisson variable $n$. Often one is faced with a somewhat more complicated situation where the observed value of $n$ is the sum of the desired signal events $n_s$ as well as background events $n_b$,

  $n = n_s + n_b$ ,   (9.46)

where both $n_s$ and $n_b$ can be regarded as Poisson variables with means $\nu_s$ and $\nu_b$, respectively. Suppose for the moment that the mean for the background $\nu_b$ is known without any uncertainty. For $\nu_s$ one only knows a priori that $\nu_s \ge 0$. The goal is to construct an upper limit for the signal parameter $\nu_s$ given a measured value of $n$.

Since $n$ is the sum of two Poisson variables, one can show that it is itself a Poisson variable, with the probability function

  $f(n; \nu_s, \nu_b) = \frac{(\nu_s + \nu_b)^n}{n!}\, e^{-(\nu_s + \nu_b)}$ .   (9.47)

The ML estimator for $\nu_s$ is

  $\hat{\nu}_s = n - \nu_b$ ,   (9.48)

which has zero bias since $E[n] = \nu_s + \nu_b$. Equations (9.15), which are used to determine the confidence interval, become

  $\alpha = P(\hat{\nu}_s \ge \hat{\nu}_s^{\rm obs};\, \nu_s^{\rm lo}) = \sum_{n=n_{\rm obs}}^{\infty} \frac{(\nu_s^{\rm lo} + \nu_b)^n}{n!}\, e^{-(\nu_s^{\rm lo} + \nu_b)}$ ,
  $\beta = P(\hat{\nu}_s \le \hat{\nu}_s^{\rm obs};\, \nu_s^{\rm up}) = \sum_{n=0}^{n_{\rm obs}} \frac{(\nu_s^{\rm up} + \nu_b)^n}{n!}\, e^{-(\nu_s^{\rm up} + \nu_b)}$ .   (9.49)

Using relation (9.17), the upper limit can be expressed as

  $\nu_s^{\rm up} = \tfrac{1}{2} F_{\chi^2}^{-1}(1-\beta;\, n_d = 2(n_{\rm obs}+1)) - \nu_b$ ,   (9.50)

which is shown as a function of $\nu_b$ in Fig. 9.9(a) for various numbers of observed events $n_{\rm obs}$. One sees that $\nu_s^{\rm up}$ can come out negative if fewer events are observed than the number expected from background alone; this is the analogue of the problem encountered in Section 9.8, with $\hat{\nu}_s = n - \nu_b$ playing the role of the estimator near the physical boundary $\nu_s = 0$. As before, one can report the estimate $\hat{\nu}_s$ along with its statistical error even if $\hat{\nu}_s$ comes out negative. In this way the average of many experiments will converge to the correct value.

If, in addition, one wishes to report an upper limit on $\nu_s$, the Bayesian method can be used with, for example, a uniform prior density [Hel83]. The likelihood function is given by the probability (9.47), now regarded as a function of $\nu_s$,

  $L(n_{\rm obs}|\nu_s) = \frac{(\nu_s + \nu_b)^{n_{\rm obs}}}{n_{\rm obs}!}\, e^{-(\nu_s + \nu_b)}$ .   (9.51)

The posterior probability density for $\nu_s$ is obtained as usual from Bayes' theorem,

  $p(\nu_s|n_{\rm obs}) = \frac{L(n_{\rm obs}|\nu_s)\,\pi(\nu_s)}{\int L(n_{\rm obs}|\nu_s')\,\pi(\nu_s')\,d\nu_s'}$ .   (9.52)

Fig. 9.9 Upper limits $\nu_s^{\rm up}$ at a confidence level of $1-\beta = 0.95$ for different numbers of events observed $n_{\rm obs}$ and as a function of the expected number of background events $\nu_b$. (a) The classical limit. (b) The Bayesian limit based on a uniform prior density for $\nu_s$.

Taking $\pi(\nu_s)$ to be constant for $\nu_s \ge 0$ and zero for $\nu_s < 0$, the upper limit $\nu_s^{\rm up}$ at a confidence level of $1-\beta$ is given by

  $1 - \beta = \frac{\int_0^{\nu_s^{\rm up}} L(n_{\rm obs}|\nu_s)\,d\nu_s}{\int_0^{\infty} L(n_{\rm obs}|\nu_s)\,d\nu_s}$ .   (9.53)
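Equation (9.53) can be solved directly by numerical integration, without the closed form derived below. A minimal sketch (Python with scipy; $n_{\rm obs} = 6$ and $\nu_b = 3.0$ are illustrative, and the upper integration limit of 50 stands in for infinity):

```python
# Solve (9.53) numerically for the Bayesian upper limit nu_s_up with a flat
# prior, using the Poisson likelihood (9.51); n_obs and nu_b are illustrative.
from math import exp, factorial
from scipy.integrate import quad
from scipy.optimize import brentq

n_obs, nu_b, beta = 6, 3.0, 0.05

def L(nu_s):
    nu = nu_s + nu_b
    return nu ** n_obs * exp(-nu) / factorial(n_obs)

norm_const, _ = quad(L, 0.0, 50.0)     # denominator of (9.53)

def excess(nu_up):
    num, _ = quad(L, 0.0, nu_up)
    return num / norm_const - (1.0 - beta)

nu_s_up = brentq(excess, 0.0, 50.0)    # root of (9.53)
print(round(nu_s_up, 2))               # cf. Fig. 9.9(b)
```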
The integrals in (9.53) can be related to incomplete gamma functions (see e.g. [Arf95]), or, since $n_{\rm obs}$ is a positive integer, they can be solved by making the substitution $x = \nu_s + \nu_b$ and integrating by parts $n_{\rm obs}$ times. Equation (9.53) then becomes

  $\beta = \frac{e^{-(\nu_s^{\rm up} + \nu_b)} \sum_{n=0}^{n_{\rm obs}} (\nu_s^{\rm up} + \nu_b)^n / n!}{e^{-\nu_b} \sum_{n=0}^{n_{\rm obs}} \nu_b^n / n!}$ .   (9.54)

This can be solved numerically for the upper limit $\nu_s^{\rm up}$. The upper limit as a function of $\nu_b$ is shown in Fig. 9.9(b) for various values of $n_{\rm obs}$. For the case without background, setting $\nu_b = 0$ gives

  $\beta = e^{-\nu_s^{\rm up}} \sum_{n=0}^{n_{\rm obs}} \frac{(\nu_s^{\rm up})^n}{n!}$ ,   (9.55)

which is identical to the equation for the classical upper limit (9.16). This can be seen by comparing Figs 9.9(a) and (b). The Bayesian limit is always greater than or equal to the corresponding classical one, with the two agreeing only for $\nu_b = 0$.

The agreement for the case without background must be considered accidental, however, since the Bayesian limit depends on the particular choice of a constant prior density $\pi(\nu_s)$. Nevertheless, the coincidence spares one the trouble of having to defend either the classical or Bayesian viewpoint, which may account for the general acceptance of the uniform prior density in this case.

Often the result of an experiment is not simply the number $n$ of observed events, but includes in addition measured values $x_1, \ldots, x_n$ of some property of the events. Suppose the probability density for $x$ is

  $f(x; \nu_s, \nu_b) = \frac{\nu_s f_s(x) + \nu_b f_b(x)}{\nu_s + \nu_b}$ ,   (9.56)

where the components $f_s(x)$ for signal and $f_b(x)$ for background events are both assumed to be known. If these p.d.f.s have different shapes, then the values of $x$ contain additional information on whether the observed events were signal or background. This information can be incorporated into the limit for $\nu_s$ by using the extended likelihood function,

  $L(\nu_s) = \frac{(\nu_s + \nu_b)^n}{n!}\, e^{-(\nu_s + \nu_b)} \prod_{i=1}^{n} \frac{\nu_s f_s(x_i) + \nu_b f_b(x_i)}{\nu_s + \nu_b} = \frac{e^{-(\nu_s + \nu_b)}}{n!} \prod_{i=1}^{n} \left[\nu_s f_s(x_i) + \nu_b f_b(x_i)\right]$ ,   (9.57)

as defined in Section 6.9, or by using the corresponding formula for binned data as discussed in Section 6.10.

In the classical case, one uses the likelihood function to find the estimator $\hat{\nu}_s$. In order to find the classical upper limit, however, one requires the p.d.f. of $\hat{\nu}_s$. This is no longer as simple to find as before, where only the number of events was counted, and must in general be determined numerically. For example, one can perform Monte Carlo experiments using a given value of $\nu_s$ (and the known value $\nu_b$) to generate numbers $n_s$ and $n_b$ from a Poisson distribution, and corresponding $x$ values according to $f_s(x)$ and $f_b(x)$. By adjusting $\nu_s$, one can find that value for which there is a probability $\beta$ to obtain $\hat{\nu}_s \le \hat{\nu}_s^{\rm obs}$. Here one must still deal with the problem that the limit can turn out negative.

In the Bayesian approach, $L(\nu_s)$ is used directly in Bayes' theorem as before. Solving equation (9.53) for $\nu_s^{\rm up}$ must in general be done numerically. This has the advantage of not requiring the sampling p.d.f. for the estimator $\hat{\nu}_s$, in addition to the previously mentioned advantage of automatically incorporating the prior knowledge $\nu_s \ge 0$ into the limit. Further discussion of the issue of Bayesian versus classical limits can be found in [Hig83, Jam91, Cou95]. A technique for incorporating systematic uncertainties in the limit is given in [Cou92].

10 Characteristic functions and related examples

10.1 Definition and properties of the characteristic function

The characteristic function $\phi_x(k)$ for a random variable $x$ with p.d.f. $f(x)$ is defined as the expectation value of $e^{ikx}$,

  $\phi_x(k) = E[e^{ikx}] = \int e^{ikx} f(x)\,dx$ .   (10.1)

This is essentially the Fourier transform of the probability density function.
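Definition (10.1) can be checked numerically for a case where the characteristic function is known in closed form; for a Gaussian of mean $\mu$ and standard deviation $\sigma$ it is $\exp(i\mu k - \sigma^2 k^2/2)$ (cf. Table 10.1). A minimal sketch (Python with scipy; the values of $\mu$, $\sigma$ and $k$ are illustrative):

```python
# Numerical check of (10.1) for a Gaussian: the integral should reproduce
# the known form exp(i*mu*k - sigma^2 * k^2 / 2). Values are illustrative.
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

mu, sigma, k = 1.0, 2.0, 0.7

re, _ = quad(lambda x: np.cos(k * x) * norm.pdf(x, mu, sigma), -np.inf, np.inf)
im, _ = quad(lambda x: np.sin(k * x) * norm.pdf(x, mu, sigma), -np.inf, np.inf)
print(complex(re, im), np.exp(1j * k * mu - 0.5 * (k * sigma) ** 2))
```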
The characteristic function is useful in proving a number of important theorems, in particular those involving sums of random variables. One can show that there is a one-to-one correspondence between the p.d.f. and the characteristic function, so that knowledge of one is equivalent to knowledge of the other. Some characteristic functions of important p.d.f.s are given in Table 10.1.

Suppose one has $n$ independent random variables $x_1, \ldots, x_n$, with p.d.f.s $f_1(x_1), \ldots, f_n(x_n)$ and corresponding characteristic functions $\phi_1(k), \ldots, \phi_n(k)$, and consider the sum $z = \sum_i x_i$. The characteristic function $\phi_z(k)$ for $z$ is related to those of the $x_i$ by

  $\phi_z(k) = \int \cdots \int \exp\!\left(ik \sum_i x_i\right) f_1(x_1) \cdots f_n(x_n)\,dx_1 \cdots dx_n$
  $\phantom{\phi_z(k)} = \int e^{ikx_1} f_1(x_1)\,dx_1 \cdots \int e^{ikx_n} f_n(x_n)\,dx_n = \phi_1(k) \cdots \phi_n(k)$ ,

i.e. the characteristic function of a sum of independent random variables is the product of the individual characteristic functions.
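This multiplicative property is easy to verify by simulation. A minimal sketch (Python with numpy; two exponential distributions with means $\xi_1$, $\xi_2$ are used, whose characteristic functions are $(1 - ik\xi)^{-1}$, cf. Table 10.1; all numerical values are illustrative):

```python
# Monte Carlo check that phi_z(k) = phi_1(k) * phi_2(k) for z = x1 + x2 with
# x1, x2 independent exponential variables of means xi1, xi2 (illustrative).
import numpy as np

rng = np.random.default_rng(7)
xi1, xi2, k, n = 1.0, 2.0, 0.8, 200_000

z = rng.exponential(xi1, n) + rng.exponential(xi2, n)
phi_z_mc = np.exp(1j * k * z).mean()                        # estimate of E[exp(ikz)]
phi_prod = 1.0 / ((1 - 1j * k * xi1) * (1 - 1j * k * xi2))  # product of analytic forms
print(phi_z_mc, phi_prod)                                   # agree to MC precision
```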