Statistical aspects of quantitative real-time PCR experiment design

Robert R. Kitchen a, Mikael Kubista b,c, Ales Tichopad c,d,*

a School of Physics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, UK
b TATAA Biocenter, Odinsgatan 28, 411 03 Göteborg, Sweden
c Institute of Biotechnology AS CR, Lb Building, 2nd Floor, Videnska 1083, 142 20 Prague 4, Czech Republic
d Technical University Munich, Physiology Weihenstephan, Weihenstephaner Berg 3, 85354 Freising-Weihenstephan, Germany

Article history: Accepted 18 January 2010. Available online 28 January 2010.

Keywords: Real-time PCR; qPCR; Gene expression; Experiment design; Sampling plan; Error propagation; Statistical power; Prospective power analysis; Nested analysis of variance; powerNest

Abstract

Experiments using quantitative real-time PCR to test hypotheses are limited by technical and biological variability; we seek to minimise sources of confounding variability through optimum use of biological and technical replicates. The quality of an experiment design is commonly assessed by calculating its prospective power. Such calculations rely on knowledge of the expected variances of the measurements of each group of samples and the magnitude of the treatment effect, the estimation of which is often uninformed and unreliable. Here we introduce a method that exploits a small pilot study to estimate the biological and technical variances in order to improve the design of a subsequent large experiment. We measure the variance contributions at several ‘levels’ of the experiment design and provide a means of using this information to predict both the total variance and the prospective power of the assay. A validation of the method is provided through a variance analysis of representative genes in several bovine tissue-types. We also discuss the effect of normalisation to a reference gene in terms of the measured variance components of the gene of interest.
Finally, we describe a software implementation of these methods, powerNest, that gives the user the opportunity to input data from a pilot study and interactively modify the design of the assay. The software automatically calculates expected variances, statistical power, and optimal design of the larger experiment. powerNest enables the researcher to minimise the total confounding variance and maximise prospective power for a specified maximum cost for the large study.

© 2010 Elsevier Inc. All rights reserved.

1. Introduction

1.1. The importance of experiment design

The typical quantitative real-time PCR (qPCR) experiment is designed to test the hypothesis that there is no difference in the expression of a gene between two or more subpopulations; this is based on experiments performed on representative groups of biological subjects that, for example, exhibit different phenotypic traits or have been exposed to different treatments [1–5]. If this hypothesis is unlikely to be true, the alternative hypothesis, stating that there is a difference between the subpopulations, is supported. The ability of the researcher to obtain a statistically significant result from the testing of these hypotheses is governed by three factors: (1) the ‘treatment effect’, that is, the magnitude of the mean differential expression between the chosen subpopulations; (2) the inherent and expected ‘biological variability’ in the expression of the gene between subjects randomly selected from within each subpopulation; (3) the ‘technical noise’ introduced through measurement error. The larger the treatment effect, the easier it becomes to resolve from the confounding noise. Biological variability is generally unavoidable, but one can seek to minimise its impact by randomly selecting large numbers of subjects (biological replicates) from each subpopulation. Technical noise can be minimised through careful lab practice, the use of technical replicates, and the addition of appropriate controls [6].
Methods 50 (2010) 231–236. doi:10.1016/j.ymeth.2010.01.025
* Corresponding author. Address: Technical University Munich, Physiology Weihenstephan, Weihenstephaner Berg 3, 85354 Freising-Weihenstephan, Germany. Fax: +49 8161 71 4204. E-mail address: ales@tichopad.de (A. Tichopad).

The concept of treatment effect and measurement variability is the basis of statistical power. The power of a statistical test is the probability of rejecting the null hypothesis, given that the null is false and the alternative hypothesis is true [7]. In other words, the power is the biological resolution of the experiment; it quantifies the likelihood of being able to resolve any differential expression between treatment groups based on the variance of available measurements. Power increases with increasing magnitude of the differential expression, increasing number of biological replicates, increasing measurement precision, and decreasing biological variability. The objective is to maximise the statistical resolution of the assay by minimising the confounding variance in the measured experiment data, such that a determination of the treatment effect can be more confidently reported. It may sometimes be the case that results collected with an assay appear more reproducible when small numbers of biological and technical replicates are employed. This apparent increase in precision and power is illusory, however, and significant results may simply reflect chance fluctuations in the particular subjects or measurement processes chosen for the experiment [8].
It is generally considered good experimental practice to vary the conditions of the assay, by sampling multiple subjects and analysing multiple technical replicates, to increase the chance that the statistical significance of the results obtained is real and reproducible in different settings [9].

1.2. qPCR experiment design and error propagation

Between the selection of subjects from the subpopulations and the gathering of expression data by qPCR there are several steps of sample pre-processing that are necessary to prepare the genetic material for analysis. These procedures, illustrated in Fig. 1, are typically: (1) the sampling of material from each subject and the extraction of the nucleic acid; (2) in the case of RNA analysis, the reverse-transcription (RT) of RNA to convert it into cDNA; (3) the amplification of the cDNA by qPCR. Some protocols may include additional steps such as fixation of the sample, transportation, and storage. All of these procedures are susceptible to the introduction of error [10] and, combined, they represent the technical noise in the obtained RT-qPCR measurement. In addition to the biological variability between subjects, these sources of technical noise all contribute to the total variance of the measured expressions reported by the qPCR. The minimisation of this variance can be achieved through effective, informed experiment designs and sampling plans that employ replicates where they are expected to have the greatest benefit. The challenge is therefore to design experiments with the optimal number of biological and technical replicates such that the statistical power is maximised and sufficient to test a biological hypothesis, while maintaining an affordable and realistic sampling plan. It is assumed that the technical noise introduced into the experiment at each subsequent stage of the sample pre-processing procedure is independent and, as such, the effect on the overall noise of the assay is additive.
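This additivity can be illustrated with a short simulation. The sketch below is ours, not part of the original method, and the step-wise standard deviations are hypothetical; it simply confirms that when independent noise terms are summed, their variances add.

```python
import random
import statistics

# Hypothetical step-wise standard deviations (in cycles, on the log scale);
# these numbers are illustrative assumptions, not estimates from the paper.
SD_SAMPLING, SD_RT, SD_QPCR = 0.4, 0.3, 0.15

random.seed(0)

# Each simulated Cq deviation is the sum of independent noise terms,
# one per processing step.
cq = [
    random.gauss(0, SD_SAMPLING) + random.gauss(0, SD_RT) + random.gauss(0, SD_QPCR)
    for _ in range(200_000)
]

observed = statistics.variance(cq)
expected = SD_SAMPLING**2 + SD_RT**2 + SD_QPCR**2  # 0.16 + 0.09 + 0.0225 = 0.2725
print(observed, expected)
```

With a large number of draws the observed variance of the summed noise closely matches the sum of the individual variances, which is the additivity assumption underlying the nested model developed below.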
Namely, the magnitudes of error introduced due to pipetting, uncertainties in instrument readings, and chemical noise in the different processing steps are not considered to be interdependent. There are, however, a few exceptions where this assumption is invalid; for example, interference due to the presence of an inhibitor may not be independent if the same inhibitor impairs the performance of several steps of the sample processing.

1.3. Focus of the paper

Due to the large scope for the introduction of error into qPCR measurements and results, it is not only essential that experiments are well conducted and validated, but also that they are carefully designed and documented; this enables the researcher to maximise the likelihood of accurately and reproducibly reporting interesting biological phenomena [11]. Power analyses afford the researcher a valuable tool with which to estimate the resolution of the assay, in terms of its ability to test a specified hypothesis, while the experiment is still in the design phase. The importance and utility of these prospective power analyses are widely accepted; however, the calculation of power is often reduced to mere guesswork because, by definition, the magnitude of the effect to be studied and the measurement variation in the prospective assay cannot be precisely known [12]. After the assay has been performed and data are available, however, these variables are known (at least in terms of the set of samples analysed), allowing a more accurate calculation of power. Such retrospective power calculations have been shown to be useful, although the measured effect size is often less informative than the variance estimated from the data [13]. In this paper we describe a method of estimating the components of biological variation and technical noise directly from qPCR measurements.
This is achieved through the exploitation of a small number of biological and technical replicates at each stage of the sample-processing procedure; these stages being the subject, sample, RT, and qPCR levels. These biological replicates form a pilot study to a larger, prospective investigation and, as such, should be drawn randomly from a larger cohort of subjects that are to be used as the basis of the future investigation. The variance components estimated from this small pilot study are used to determine the optimal experiment design and sampling plan for the subsequent prospective study. We further exploit these measured variances by including them in the prospective power calculation for the future study, providing a more accurate, evidence-based estimate of the expected experiment error. As a validation of the method, variance components are estimated for several genes from a number of bovine tissue-types and contrasted with the components from the same data following normalisation to a reference gene. We use these data to qualitatively assess the utility of technical replication at only the qPCR level; a common practice in qPCR assay design and one for which the rationale is unclear, except perhaps as a low-cost insurance against a failed PCR. Finally, we present powerNest, a software implementation of these methods that provides an intuitive and efficient means of optimising the sampling plan given the data from such a pilot study.

Fig. 1. Example 2 × 3 × 3 × 3 experiment design. Example experiment design for 2 subjects belonging to a single subpopulation. Three qPCRs are performed for each of 3 RTs of 3 samples from each subject. The result is 27 Cq measurements for each subject, 54 in total for the subpopulation. From this design, variance components at each stage of sampling can be estimated. Mouse image appears courtesy of the Wellcome Trust Sanger Institute.

2. Description of method

2.1. Model

We define the model for any given Cq measurement by qPCR based on four processing steps that account for both the biological variability and the technical noise that influence the measured value. These are the choice of subjects from the subpopulation, the replicate samples extracted from each subject, the replicate RTs of mRNA from the same sample, and the replicate qPCRs of cDNAs from the same RT tube. These effects are all assumed to be independent and randomly drawn from normal distributions on the logarithmic scale [14,15]. Although the introduction of biological and technical noise at each of the sampling levels is independent, the observed variances are not. The variation introduced at a given level propagates additively throughout subsequent levels, allowing these variance contributions to be modelled. All factors therefore meet in unique combinations and so a nested, or hierarchical, model of additive noise is applied to the measured Cq values [16] such that any given measured Cq can be expressed as

Cq_gijkl = μ_g + α_gi + β_gij + γ_gijk + δ_gijkl,   (1)

where μ_g is the geometric-mean expression of the gene in the gth subpopulation (which is equivalent to the arithmetic average of the fold-change or expression of the gene on the log-scale), α_gi is the random effect of the ith subject in the gth subpopulation, β_gij is the random effect of the jth sample extracted from the ith subject in the gth subpopulation, γ_gijk is the random effect of the kth RT reaction from the jth sample extracted from the ith subject in the gth subpopulation, and δ_gijkl is the random effect of the lth qPCR from the kth RT of the jth sample extracted from the ith subject in the gth subpopulation. This model was previously justified in terms of its application to qPCR experiment design in [17]. The total variance of any given Cq measurement follows directly from the model in Eq.
(1) and is defined as

σ²_Cq = σ²_g + σ²_i + σ²_j + σ²_k + σ²_l.   (2)

In simpler terms, the expected variance of each measurement can be divided into two categories: the first is the treatment variation between subpopulations, expressed by the σ²_g term; the second is the confounding biological variance and processing noise encompassed by the sum of the remaining variance components, corresponding to inter-subject, inter-sample, inter-RT, and inter-qPCR variation. To maximise the statistical power of the assay one should minimise the confounding variance to be able to accurately resolve the treatment effect. The variance model in Eq. (2) is used to define a nested analysis of variance (nested-ANOVA) that produces estimates of each of the four modelled components of variance. The calculation of these variance components is performed as described in [18], by a procedure essentially based on the subtraction of the sum-squared variations of each level from that of the respective immediate higher level. The relative contribution of each component, vc_x, to the total variance is expressed as a percentage:

vc_x = 100 × σ²_x / (σ²_i + σ²_j + σ²_k + σ²_l),   (3)

where x = i, j, k, or l.

2.2. Experiment optimisation

In terms of the optimisation of the experimental design, the objective is to minimise the total expected technical and biological variation within each treatment group, g, which is defined as

σ̂²_Cq,g = s²_i/n_i + s²_j/(n_i·n_j) + s²_k/(n_i·n_j·n_k) + s²_l/(n_i·n_j·n_k·n_l),   (4)

where s²_i, s²_j, s²_k, and s²_l are the variances of the subject, sample, RT, and qPCR levels, respectively, estimated from the pilot data. Additionally, n_i is the number of subjects, n_j is the number of replicate samples from each subject, n_k is the number of replicate RTs from each sample, and n_l is the number of replicate qPCRs from each RT. By varying the n replicates at each level, σ̂²_Cq,g can be changed. The optimal design is the one in which σ̂²_Cq,g is minimised.
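The mean-square subtraction procedure described above can be sketched for a balanced pilot design as follows. This is our minimal illustration of a balanced nested-ANOVA (the function name and the use of NumPy are our choices, not powerNest's); negative component estimates are truncated to zero, as is conventional, and at least two replicates are assumed at every level.

```python
import numpy as np

def nested_variance_components(cq):
    """Estimate variance components from a balanced nested design.

    `cq` is an array of shape (subjects, samples, RTs, qPCRs) holding the
    Cq measurements of one treatment group. Assumes >= 2 replicates per level.
    """
    ni, nj, nk, nl = cq.shape
    m_ijk = cq.mean(axis=3)     # per-RT means
    m_ij = m_ijk.mean(axis=2)   # per-sample means
    m_i = m_ij.mean(axis=1)     # per-subject means
    m = m_i.mean()              # grand mean

    # Mean squares at each level of the hierarchy.
    ms_qpcr = ((cq - m_ijk[..., None]) ** 2).sum() / (ni * nj * nk * (nl - 1))
    ms_rt = nl * ((m_ijk - m_ij[..., None]) ** 2).sum() / (ni * nj * (nk - 1))
    ms_sample = nk * nl * ((m_ij - m_i[:, None]) ** 2).sum() / (ni * (nj - 1))
    ms_subject = nj * nk * nl * ((m_i - m) ** 2).sum() / (ni - 1)

    # Successive subtraction of mean squares yields the variance components;
    # negative estimates are truncated to zero.
    return {
        "qPCR": ms_qpcr,
        "RT": max(0.0, (ms_rt - ms_qpcr) / nl),
        "sample": max(0.0, (ms_sample - ms_rt) / (nk * nl)),
        "subject": max(0.0, (ms_subject - ms_sample) / (nj * nk * nl)),
    }
```

The relative contributions of Eq. (3) then follow by dividing each component by the sum of all four and multiplying by 100.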
The inclusion of a financial cost in the calculation of the optimal design is trivial; the total cost of the experiment is

C_T = c_i·n_i + c_j·n_i·n_j + c_k·n_i·n_j·n_k + c_l·n_i·n_j·n_k·n_l,   (5)

where c_i, c_j, c_k, and c_l are the costs of producing a subject, sample, RT, and qPCR, respectively.

2.3. Statistical power

The power of a statistical test is the probability of rejecting the null hypothesis, given that the null is false and the alternative hypothesis is true. Power is simply a restatement of the Type II error rate, β, of falsely accepting a null hypothesis; power = 1 − β. The power depends on two factors: the significance criterion and the effect size. The significance criterion, α, is the Type I error rate of falsely rejecting a null hypothesis and must be specified before the power can be calculated. α is often referred to as the rate of false-positives and β as the rate of false-negatives. The α and β symbols used in terms of the significance criterion and power bear no relation to the α_gi and β_gij introduced in Eq. (1). For the purposes of this method only two classes of power calculation are considered: that used for testing the average expression of a single subpopulation in terms of a difference from a pre-specified value, and that used for testing the means of two subpopulations in terms of the difference from each other. The effect size, d₁, in the case of a comparison of the mean expression of a single subpopulation with some pre-determined value, c, is simply

d₁ = (m_A − c)/σ_A,   (6)

where m_A and σ_A are the mean and standard deviation of the subpopulation, respectively, and correspond directly to μ_g in Eq. (1) and σ̂²_Cq,g in Eq. (4).
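Eqs. (4) and (5) together define a small constrained optimisation: find the replicate counts (n_i, n_j, n_k, n_l) that minimise the expected confounding variance subject to a budget. The brute-force search below is our sketch (powerNest's internal algorithm is not specified here), and all variance components, costs, and search bounds are illustrative assumptions.

```python
from itertools import product

def design_variance(var, n):
    """Expected confounding variance of a group mean, Eq. (4)."""
    ni, nj, nk, nl = n
    return (var["subject"] / ni
            + var["sample"] / (ni * nj)
            + var["RT"] / (ni * nj * nk)
            + var["qPCR"] / (ni * nj * nk * nl))

def design_cost(cost, n):
    """Total cost of the experiment, Eq. (5)."""
    ni, nj, nk, nl = n
    return (cost["subject"] * ni
            + cost["sample"] * ni * nj
            + cost["RT"] * ni * nj * nk
            + cost["qPCR"] * ni * nj * nk * nl)

def best_design(var, cost, budget, max_reps=(30, 5, 5, 5)):
    """Exhaustively search affordable designs for the minimal variance.

    Returns (variance, (ni, nj, nk, nl)), or None if nothing fits the budget.
    """
    best = None
    for n in product(*(range(1, m + 1) for m in max_reps)):
        if design_cost(cost, n) <= budget:
            v = design_variance(var, n)
            if best is None or v < best[0]:
                best = (v, n)
    return best
```

Because the number of sensible replicate counts per level is small, exhaustive enumeration is cheap; a few thousand candidate designs are evaluated in well under a second.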
The effect size, d₂, in the case of a comparison of two subpopulations with unequal variances is defined as the difference between the means of the subpopulations divided by the precision of the measurement of each, thus

d₂ = |m_A − m_B| / [(σ²_A + σ²_B)/2]^(1/2),   (7)

where m_A and σ²_A are the mean and variance of one subpopulation and m_B and σ²_B are the mean and variance of the second subpopulation. Given the number of samples in the subpopulation(s) and the desired significance criterion, the power can either be determined from a table, as found in [7] for example, or calculated from the cumulative distribution function of the t-distribution. N.B. a compensation is required when using a table to find the power of a test using a single subpopulation: the effect size, d₁, should be multiplied by √2 to compensate for the fact that c is a hypothetical population parameter without any associated sampling error.

2.4. Software implementation

Here we present a software implementation of this model. The software has been customised for use with small pilot datasets for which the variance structure of the experiment design is estimated. Cq data can be entered into the software as MS Excel spreadsheets or plain text files. The Cq values must be allocated to the correct position in the experiment design hierarchy so that the software can determine to which subject, sample, and RT replicate each given data point belongs. This allocation can be performed manually in the software or be pre-specified in the input data file. In the case of the latter, a template file is available, the use of which enables the software to automatically parse the experiment design. The user-interface for data input, design specification, and results analysis is shown in Fig. 2.
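The power calculation from the cumulative distribution function described in Section 2.3 can be sketched with the standard library alone. Here we use the normal approximation to the t-distribution, which is adequate for moderate group sizes; an exact calculation would use the noncentral t-distribution (e.g. scipy.stats.nct). The function names are ours.

```python
from math import sqrt
from statistics import NormalDist

_nd = NormalDist()

def power_two_sample(d2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test of effect size d2
    (Eq. (7)), with n_per_group biologically distinct observations per group."""
    z_crit = _nd.inv_cdf(1 - alpha / 2)
    ncp = d2 * sqrt(n_per_group / 2)  # approximate noncentrality parameter
    return _nd.cdf(ncp - z_crit) + _nd.cdf(-ncp - z_crit)

def power_one_sample(d1, n, alpha=0.05):
    """Approximate power of a two-sided, one-sample test of effect size d1
    (Eq. (6)) against a fixed value c, which carries no sampling error."""
    z_crit = _nd.inv_cdf(1 - alpha / 2)
    ncp = d1 * sqrt(n)
    return _nd.cdf(ncp - z_crit) + _nd.cdf(-ncp - z_crit)
```

Note that, for the same effect size and n, the one-sample noncentrality parameter is larger by a factor of √2; this is exactly the table compensation described in Section 2.3.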
Once the data have been allocated to their position in the experimental hierarchy, the nested-ANOVA automatically performs the analysis to determine estimates for the variance components of each of the four levels. These components are reported in terms of the relative, fractional contributions to the total variance in the data as detailed in Eq. (3). In the situation where experiment data are unavailable, a facility to manually input error estimates is also provided for each level. The results generated by using this facility must be interpreted with caution, depending on the researcher's confidence in the quality of the input variance estimations. Once the data are allocated to their correct position within the design of the pilot experiment, the user is provided the opportunity to modify the design and sampling plan by adding and removing biological and technical replicates at each level of the design. Using the measured variance structure, error estimates and statistical power are automatically calculated and displayed for each of the modified designs. A facility is also provided for inputting the approximate financial cost of performing a single replicate at each level. If this information is available, the software will display the total cost of each design as well as the expected total error. Given the variance and costing information, the software is capable of determining an experiment design that minimises the total variance for a specified financial cost. This is achieved through an implementation of Eqs. (4) and (5). The user can choose from various designs, such as those optimised for cost-performance, for the minimisation of biological and technical error, or for the maximisation of statistical power.

2.5. Power calculation

For a single dataset from a single treatment group, the power of different experiment designs can be estimated in terms of the difference of the mean of the given data compared to a pre-specified value.
The population variance is estimated either from the variance of the input data, or by manual estimate. Data can be entered for multiple treatment groups such that the entire experiment design can be optimised based on the observed variances. Given this information, the software provides an automatic power calculation such that the statistical resolution of the assay for the desired contrast can be maximised before the experiment is performed. The automatic optimisation of the entire experiment design is capable of producing designs in which the replicate structure of each treatment group is unique, enabling the overall error of the entire experiment to be minimised (i.e. different designs for each subject group depending on the result of the nested-ANOVA). The power is calculated based on the measured variance structure of the input data for the treatment group(s) using the effect size formulae defined in Eq. (6) or Eq. (7). For each design, the power is calculated using the number of biologically distinct observations (usually subjects, sometimes samples), the difference between the means of the treatment groups, and the precision of the measurements. The difference between the means can be either specified manually (preferred) or estimated from the data. The software can also plot a graph of the number of biologically distinct observations vs. estimated power, examples of which are illustrated in Fig. 3.

2.6. Experimental application

This method and the software were first used in the analysis of several different types of biological material by Tichopad et al. [17], in which the relative contributions of each of the processing levels to the total variance were estimated for bovine liver, blood, and culture samples for a number of different genes. The sampling plans for each of the sample types in this study were designed to include sufficient biological and technical replication to allow the estimation of the variance components by the nested-ANOVA, Fig. 4(A).
Here we extend this analysis to estimate variance components of the same data normalised to the reference gene, ActB, in each of the three tissues, Fig. 4(B). We use these data to compare the measured variance structures before (Cq) and after (ΔCq) normalisation to the reference gene.

Fig. 2. The powerNest software user interface. The main interface of the powerNest software. Cq data for a number of subject groups from a pilot study can be entered and grouped to provide estimates of the variance components of the experiment. Subjects within each group are assumed to be biological replicates; qPCRs from the same RT, RTs from the same sample, and samples from the same subject are assumed to be technical replicates.

Prior to normalisation, the analysis of the liver tissue revealed substantial variation, with an average total standard deviation that, in terms of the Cq, corresponds to a 2.6-fold variation between measurements. In blood, the noise arising from sampling and extraction was consistently small across all of the studied genes, both before and after normalisation to the reference gene, indicating that this step is very reproducible for such samples. The cell culture samples were found to exhibit the lowest overall confounding variation, attributable to the clonal nature of these cultures. In all studied genes, with the exception of the low-expressed FGF7 in liver and IFNγ in blood, the magnitude of variance attributed to the RT step was reduced after normalisation. Excluding FGF7 and IFNγ, the estimated standard deviations at the RT step ranged between 0.18 and 0.46 cycles with a mean of 0.31 cycles in raw data, and were reduced to 0.03–0.25 cycles with a mean of 0.17 cycles following normalisation. The total standard deviations observed in blood and culture samples were only marginally affected by normalisation, while the total standard deviations of genes in liver (excluding FGF7) were dramatically reduced.
The total standard deviation in FGF7 more than doubled following normalisation due to a large increase in the variance attributed to both the sampling and RT steps; the reason for this is unknown and, with only a single observation, we cannot speculate as to the significance of this result. Many published reports have described the use of experimental protocols that perform only qPCR replicates. On the basis of the variance contributions we have estimated for the three studied sample types, we are able to evaluate the importance of qPCR replicates. Again excluding the low-expressed genes, FGF7 and IFNγ, we found the standard deviations in raw data at the qPCR level to be 0.07–0.21 cycles, with a mean of 0.13 cycles; similar to previous findings [14]. We conclude that a qPCR standard deviation of 0.13 cycles is a good estimate for genes that are expressed at reasonable levels and assayed with a protocol that yields at least some 25 template copies per qPCR.

3. Concluding remarks

The powerNest software application was specifically developed to implement the method presented in this article; it calculates the biological and technical variance components for a given dataset and can deliver cost-optimal, variance-minimising experiment designs. Multiple datasets can be analysed simultaneously such that an estimate of statistical power can be calculated for a specified contrast between them. There are currently several published algorithms and software tools that address the analysis of gene expression with data generated by qPCR experiments; these include, among other things, different approaches to normalisation, the use of reference genes, and clustering of multiple targets and samples [19]. In addition, generalised software implementations of the nested-ANOVA and power calculations are also available [20,21].
powerNest, however, represents the first dedicated tool to assist the researcher throughout the planning phase of an experiment and is available online at www.powernest.net. General results for each of the pre-processing levels in the experiments described here highlight the importance of choosing the correct design for the specific environment of the experiment, such as the tissues and genes to be analysed. Across all of the tissues and genes analysed, the variance contribution from the qPCR step was only around 10% of the total, and the contribution from the RT step was found to exhibit about twice this variability, a result that is in agreement with earlier findings [22]. Along with the RT step, the variance of the qPCR replicates was found to be independent of the gene being assayed. We conclude that the use of technical replicates at the qPCR level has minimal impact on the precision of the estimated Cq value, in agreement with previous findings [17,23]. In almost all observations, normalisation to the reference gene reduced the variance attributable to the RT step, and the total variance was reduced in cases where the variance structure of the reference was similar to that of the gene of interest. The variability between sample replicates was found to be highly tissue-dependent, and inconsistent estimates of the inter-subject variation in blood and culture tissues suggest that this variation may be gene-dependent.

Fig. 3. Example power-curves. Example power-curves for a number of theoretical experiments at various measurement precisions (x = 0.05 to 5.0), at an alpha-level of 5%, plotted as power against the total number of samples. The power is calculated for unpaired, two-tailed t-tests between two groups of samples of equal size and equal variance.
For each curve, the variable ‘x’ is defined as the ratio of the standard deviation of the two groups to the difference in the means of the two groups. For example, x = 0.5 corresponds to a standard deviation equal to half the difference in the means, x = 1.0 corresponds to a standard deviation equal to the difference in the means, and x = 3.0 corresponds to a standard deviation equal to three times the difference in the means.

It should be highlighted that, in order for the technique described here to be valid, it is essential that the subjects, samples, and pre-processing procedures used for the pilot are representative of those taken forward to the larger assay. The likelihood of the pilot being representative is clearly increased through the use of larger numbers of biological and technical replicates; however, a sensible compromise must be made to limit the size and cost of the pilot study. We would generally recommend that, for the pilot to offer meaningful variance estimates, no fewer than three replicates are used at each level. In addition, although the use of technical replicates increases the statistical power of the assay by increasing the precision of the measurements, technical replicates are not independent and do not increase the number of biological observations of the given subpopulation. When measurement is expensive and/or the individual measurements are very precise, it is preferable to add biological replicates rather than technical replicates. In conditions where the dominant source of variability is between measurements rather than between the biological replicates, exemplified by the bovine liver described above, the use of technical replicates will be very effective in increasing precision.
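This trade-off can be made concrete with Eq. (4). In the sketch below the variance components are hypothetical numbers of our choosing, picked so that subject-level variability dominates; in that regime, tripling the number of qPCR replicates barely reduces the expected variance of the group mean, while tripling the number of subjects reduces it roughly three-fold.

```python
def group_variance(v_subject, v_sample, v_rt, v_qpcr, ni, nj=1, nk=1, nl=1):
    """Expected confounding variance of a group mean, Eq. (4)."""
    return (v_subject / ni
            + v_sample / (ni * nj)
            + v_rt / (ni * nj * nk)
            + v_qpcr / (ni * nj * nk * nl))

# Hypothetical components where biology dominates (blood-like samples):
base = group_variance(1.0, 0.1, 0.1, 0.05, ni=6)                 # 6 subjects, singlets
more_qpcr = group_variance(1.0, 0.1, 0.1, 0.05, ni=6, nl=3)      # triplicate qPCRs
more_subjects = group_variance(1.0, 0.1, 0.1, 0.05, ni=18)       # triple the subjects

print(base, more_qpcr, more_subjects)
```

Reversing the relative sizes of the components (measurement noise dominating, as in the liver data above) reverses the conclusion, which is why the pilot-based component estimates matter.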
In general, however, the most effective means of increasing the power and validity of qPCR experiments is to increase the number of independent biological replicates randomly selected from within each subpopulation.

Acknowledgements

We thank Becky Carlyle and Jano van Hemert for helpful discussions. This work was in part funded by EU Framework VI FP6-2004-IST-NMP-2: SmartHEALTH and GAAV IAA500520809. RRK is funded by the UK EPSRC.

References

[1] S.A. Bustin, J. Mol. Endocrinol. 25 (2) (2000) 169–193.
[2] M. Kubista, J.M. Andrade, M. Bengtsson, A. Forootan, J. Jonák, K. Lind, R. Sindelka, R. Sjöback, B. Sjögreen, L. Strömbom, A. Ståhlberg, N. Zoric, Mol. Aspects Med. 27 (2006) 95–125.
[3] T. Nolan, R.E. Hands, S.A. Bustin, Nat. Protoc. 1 (3) (2006) 1559–1582.
[4] M.L. Wong, J.F. Medrano, BioTechniques 39 (1) (2005) 75–85.
[5] K.J. Livak, T.D. Schmittgen, Methods 25 (4) (2001) 402–408.
[6] R.A. Fisher, The Design of Experiments, ninth ed., Hafner Press, 1971, p. 248.
[7] J. Cohen, Statistical Power Analysis for the Behavioral Sciences, second ed., Hillsdale, N.J., 1988, p. 567.
[8] G.A. Churchill, Nat. Genet. 32 (Suppl.) (2002) 490–495.
[9] P.R. Rosenbaum, Replicating effects and biases, Am. Stat. 55 (3) (2001) 223–227.
[10] S.A. Bustin, T. Nolan, J. Biomol. Tech. 15 (3) (2004) 155–166.
[11] S.A. Bustin, V. Benes, J.A. Garson, J. Hellemans, J. Huggett, M. Kubista, R. Mueller, T. Nolan, M.W. Pfaffl, G.L. Shipley, J. Vandesompele, C.T. Wittwer, Clin. Chem. 55 (4) (2009) 611–622.
[12] R.V. Lenth, Am. Stat. 55 (3) (2001) 187–193.
[13] L. Thomas, Conservation Biol. 11 (1) (1997) 276–280.
[14] A. Ståhlberg, J. Håkansson, X. Xian, H. Semb, M. Kubista, Clin. Chem. 50 (3) (2004) 509–515.
[15] E. Limpert, W.A. Stahel, M. Abbt, Bioscience 51 (5) (2001) 341–352.
[16] A.L. Oberg, D.W. Mahoney, Methods Mol. Biol. 404 (2007) 213–234.
[17] A. Tichopad, R. Kitchen, I. Riedmaier, C. Becker, A. Ståhlberg, M. Kubista, Clin. Chem. 55 (10) (2009) 1816–1823.
[18] G.W. Snedecor, W.G. Cochran, Statistical Methods, eighth ed., Iowa State Univ. Press, 1989, p. 503.
[19] M.W. Pfaffl, J. Vandesompele, M. Kubista, Data analysis software, in: J. Logan, K. Edwards, N. Saunders (Eds.), Real-Time PCR: Current Technology and Applications, Caister Academic Press, 2009. ISBN: 978-1-904455-39-4.
[20] R Development Core Team, R: A language and environment for statistical computing, R Foundation for Statistical Computing, ISBN: 3-900051-07-0, URL: http://www.R-project.org, 2008.
[21] MultiD Analyses, GenEx software, URL: http://www.multid.se, 2009.
[22] A. Ståhlberg, M. Kubista, M. Pfaffl, Clin. Chem. 50 (9) (2004) 1678–1680.
[23] M. Bengtsson, M. Hemberg, P. Rorsman, A. Ståhlberg, BMC Mol. Biol. 9 (2008) 63.

Fig. 4. Estimated confounding variation contributed by the sample processing steps. The estimated variance contribution of each of the four sampling levels (subjects, samples, RTs, qPCRs) to the overall variance in the measurements for several bovine tissues (liver, blood, cell culture) and genes (including ActB, Casp3, IL-1b, FGF7, IFNγ, Histone, IL8, and Bcl2). The top row of plots (A) illustrates the variances of raw Cq data while the second row (B) illustrates variances of ΔCq data after normalising to the reference gene, ActB.