Stano Pekár

Gamma and lognormal data arise:
• precise measurements of small quantities (concentration), weight, time, etc.
• measurements are continuous
  - negative values and zeros are not allowed
  - distribution is skewed to the right

Lognormal distribution
• logarithmic transformation of measurements will homogenise variance and adjust asymmetry of distribution
• moments
  - 2 parameters ($\mu_{tr}$, $\sigma_{tr}$)
  - while on the log scale the variance is independent of the mean, on the original scale the variance is a function of the expected mean
• predicted values:
  $E(y) = \exp\left(\mu_{tr} + \frac{\sigma_{tr}^2}{2}\right)$
  $\mathrm{Var}(y) = \left[\exp(\sigma_{tr}^2) - 1\right]\exp\left(2\mu_{tr} + \sigma_{tr}^2\right)$
  $\text{median} = \exp(Q)$, where Q is an estimate on the log scale

Gamma distribution
• used to model inverse polynomials
• moments
  - 2 parameters ($\mu$, $\phi$)
  - $E(y) = \mu$, $\mathrm{Var}(y) = \phi\mu^2$
• dispersion parameter $\phi = \mathrm{Var}(y) / \mu^2$

[Figure: three panels of y against x showing inverse polynomial curves]
  $\frac{1}{y} = a + b\,\frac{1}{x}$
  $\frac{1}{y} = a + bx$
  $\frac{x}{y} = a + bx + cx^2$

• Welch test (t.test) to compare two means with heterogeneous variances
• glm(formula, Gamma(link = ...))
• links:
  - inverse (inverse, default): $1/y$
  - logarithmic (log)
  - identity (identity)
• lm(log(y) ~ ...)

Background
In euryphagous predators the size of prey is positively related to their body size. There is an upper limit due to, e.g., morphological constraints.
Design
In the laboratory, acceptance of food was studied in 36 species of granivorous carabid beetles. Each beetle was offered seeds of various sizes [g]. Preferred seed size was recorded. For each beetle, body size [mm] was recorded too.
Hypotheses
Is the size of seeds related to the carabid body size? What is the shape of the relationship?
Variables
body
seed

Coefficient of determination:
Asymptote:

Background
In the gift-giving spider, a male brings prey to a female in order to avoid being cannibalised. Several variables can potentially influence how quickly the female will accept the gift.
Design
In the laboratory, the effect of two variables was studied: satiation of the female (satiated, starved) and her mating experience (mated, virgin). Time [s] of the gift presentation was recorded. The experiment was fully factorial; for each combination 10 males and females were used.
Hypotheses
Is presentation time affected by either of the two variables? If it is, what is the difference between factor levels?
Variables
MATING: mated, virgin
FEED: satiated, starved
time

Background
The nutritional quality of the diet affects the growth of organisms in various ways. To find the optimal diet for cockroaches, the following experiment was performed.
Design
The effect of five diet types (control, lipid1, lipid2, protein1, protein2) was tested on the body weight [g] of male and female cockroaches. For each diet 10 females and 7 males were used. Their body weight [g] was recorded before and after the experiment.
Hypotheses
Is weight influenced by the diet type? If so, which diet resulted in the largest weight? Is weight on the diets similar for males and females?
Variables
DIET: control, lipid1, lipid2, protein1, protein2
SEX: male, female
start
weight

Poisson data arise when data are:
  - counts/frequencies of individuals, species, cells
  - events of behaviour, etc.
  - always non-negative integers
  - counts are often low (including 0)
• we count how many times an event occurred but we do not know how often it did not occur (we do not know n)
• moment: $E(y) = \mathrm{Var}(y) = \mu$

• χ² test (chisq.test) to analyse 2-dimensional tables
• Fisher exact test (fisher.test) to analyse 2x2 tables
• Mantel-Haenszel test (mantelhaen.test) to analyse 3-dimensional tables for independence
• Log-linear analysis (loglin) to study complex frequency tables
• Contingency tables (xtabs) to study effect of factors
• Standard regression (lm) can be used after transformation
  - square-root transformation, $\sqrt{y}$
  - can predict values out of bounds (negative)
• Poisson GLM (glm) to study effect of both factorial and continuous predictors

• glm(..., family = poisson(link = ...))
  link functions:
  - logarithmic (log)
  - square-root (sqrt)
  - identity (identity)
• estimated parameters are on the logarithmic scale ($-\infty$, $+\infty$)
• inverse function to log is exp: $e^{Q}$

Background
Diversity of organisms changes with the age of the habitat. According to the intermediate disturbance hypothesis, the diversity increases and then decreases with age, thus being highest at medium age.
Design
In 15 apple orchards the diversity of arachnids was studied on trees. The orchards were of variable age, classified into 3 classes: 0-9, 10-19 and 20-30 years old. Each class was represented by 5 orchards.
Hypotheses
Is diversity related to the age of orchards? What is the trend of change?
Variables
ORCHARD: young, older, oldest
divers

• over- or underdispersion arises when the dispersion parameter φ ≠ 1, i.e. the residual deviance is not similar to the residual degrees of freedom
  - overdispersion: variance is larger → φ > 1
  - underdispersion: variance is smaller → φ < 1
• causes:
  - if the distribution is aggregated
  - if counts are not independent
  - lack of important variables, etc.
  - suspicious data
• under the Poisson assumption $\mathrm{Var}(y) = E(y)$, i.e. $\phi = 1$
• solution: use quasipoisson family, in which $\mathrm{Var}(y) = \phi\,E(y)$
• this will influence SE of parameter estimates
  - if φ > 1 then SE will be larger
  - if φ < 1 then SE will be smaller
• without correction for overdispersion there would be too many false positive results (in favour of $H_A$)
• when using quasipoisson, χ²- and z-tests have to change to F- and t-tests

Background
Abundance of carabid beetles in cereals depends on abiotic and biotic factors. If we understand how abiotic factors influence the abundance of carabids, then we can adapt certain management practices to increase the abundance when needed.
Design
In the field, on 21 wheat plots the abundance of carabid beetles was studied by means of pitfall traps. At every site average day temperature [°C] and average sun activity [W/m²] were recorded.
Hypotheses
Was the abundance of beetles affected by either of the two variables? If so, what is the model of the relationship?
Variables
temp
sun
abun

Background
Some spiders are specialised in their diet. Specialisation can involve evolution of physiological and behavioural traits, such as prey-specific venom and number of attacks.
Design
In the lab, the number of attacks of an ant-eating spider on ants of two subfamilies was observed. For each subfamily 20 species of ants were used. Each ant species was tested once. For each ant, body size was recorded as it may influence its susceptibility to venom.
Hypotheses
Was the number of attacks related to ant size? Was the number of attacks similar for ants of both subfamilies? What is the shape of the relationship?
Variables
ANT: famA, famB
size
number

Background
Some predators use conditional strategies to catch prey. The use of a strategy often depends on the characteristics of the prey.
Design
In the field, it was observed which of three strategies spiders used to capture prey. For each trial, size (two size classes) and movement (slow or fast) of prey were recorded. Altogether 88 trials were observed.
Hypotheses
Is the use of a strategy influenced by prey size and its movement? If so, which prey is captured by strategy A, B and C?
Variables
PREY: fast, slow
SIZE: large, small
STRATEGY: stratA, stratB, stratC
freq

              slow            fast
          small  large    small  large
stratA      19     10       21     12
stratB       4     10        0      8
stratC       0      1        1      2

The negative binomial (NB) model is a parametric alternative to the Poisson model with overdispersion
• distribution of y is strongly asymmetric with many zeros
• NB has two parameters, μ and θ
• moments: $E(y) = \mu$, $\mathrm{Var}(y) = \mu + \frac{\mu^2}{\theta}$
  - θ is the aggregation parameter (0, ∞)
  - if θ ≥ 1 .. random distribution, θ < 1 .. aggregated distribution
  - θ can be estimated from $\hat{\theta} = \frac{\bar{y}^2}{s^2 - \bar{y}}$

• glm.nb(formula) from the MASS library
• links:
  - log (default)
  - sqrt
  - identity
• begin with a Poisson model; if overdispersion is large, switch to glm.nb

Background
Grain beetles are serious pests in grain stores. They may occur not only in the grain but also in crevices of corridors. It is essential to know where they occur before control methods are applied.
Design
Density of grain beetles was surveyed in a grain store by means of sticky traps. Traps were installed in two places: 25 traps in the corridors and 25 traps in the grain. After a few days the number of beetles was recorded.
Hypotheses
Is the density of beetles similar in both places? If not, how different is it?
Variables
PLACE: floor, grain
density

Binomial data arise:
• when we count response to a certain stimulus → dose-response studies
• whenever we record whether an event has occurred or not within a known population (n)
• events: death, birth, germination, attack, consumption, reaction, etc.
• there are no classical replications
  - records are clustered to p or q
• p .. probability of success, q .. probability of failure
• clustering of responses (100 successes out of 200 and 200 out of 300):
  - pooled counts: $p = \frac{100 + 200}{200 + 300} = \frac{300}{500} = 0.6$
  - mean of proportions: $p = \frac{0.5 + 0.667}{2} = 0.58$

• distribution is bounded [0 < p < 1]
• variance is not constant, maximal when p = q = 0.5
• moments: $E(y) = n\pi$, $\mathrm{Var}(y) = n\pi(1 - \pi)$
• estimated parameters are on the logit scale ($-\infty$, $+\infty$): $\log\left(\frac{p}{1 - p}\right) = a + bx$
• logistic model will always asymptote at 0 and 1
  - predicted values are then always within [0, 1]
• inverse function to logit is anti-logit, $\hat{y} = \frac{1}{1 + e^{-Q}}$, where Q is a parameter estimate
• odds ratio: $\frac{p}{1 - p} = e^{Q}$

• Exact binomial test (binom.test) to compare a single proportion
• Proportion test (prop.test) to compare two proportions
• Contingency tables (xtabs) to study effect of factors
• Logistic regression to study effect of continuous predictors
• Standard regression (lm) can be used after transformation
  - angular transformation, $\arcsin\sqrt{p}$
  - can predict values out of bounds (negative or >1)
• Binomial GLM (glm) to study effect of both factorial and continuous predictors

• glm(..., family = binomial(link = ...))
  link functions:
  - logit (logit): $\log\left(\frac{p}{1 - p}\right)$
  - probit (probit)
  - complementary log-log (cloglog): $\log(-\log(1 - p))$

[Figure: cloglog, probit and logit curves; p from 0 to 1 plotted against x]

Data format:
• Binomial distribution ... individuals within a group are homogeneous
  - two vectors (y, n-y) or (y, n) of integers
• Bernoulli (binary) distribution ... individuals within a group are heterogeneous, each characterised by a continuous character
  - n = 1
  - single vector of 0's or 1's

Background
Some weed seeds may germinate following water priming (by rain) more than others, thus likely gaining a competitive advantage.
Design
The effect of water priming on the germination of weed seeds of 4 genera was studied in the laboratory. On each of 5 days 400 seeds of each genus were sown (200 seeds on control and 200 seeds on wet soil). Altogether 2000 seeds per genus were sown. Germination was recorded thereafter.
Based on the assumption of similar conditions during the 5 days, data from the 5 days were pooled.
Hypotheses
• Does water priming promote germination?
• If it does, was the effect similar for all four genera?
• Which genus germinated most and least?
Variables:
TREATMENT: control, water
GENUS: genA, genB, genC, genD
germ
n

• statistical and biological effects are not identical
• statistical effects are affected by precision of measurements, number of measurements, type of test
• Cohen's coefficient: $h = 2\arcsin\sqrt{p_1} - 2\arcsin\sqrt{p_2}$
  • h < 0.2 … weak effect
  • h > 0.8 … strong effect

• over- or underdispersion arises when the dispersion parameter φ ≠ 1
  - overdispersion: variance is larger → φ > 1
  - underdispersion: variance is smaller → φ < 1
• causes:
  - if the model is misspecified
  - lacks important explanatory variables
  - relative frequency is not constant within a group
• solution: use quasibinomial family, in which variance is estimated as $\mathrm{Var}(y) = \phi\,n\pi(1 - \pi)$ instead of $\mathrm{Var}(y) = n\pi(1 - \pi)$
• this will influence SE of parameter estimates
  - if φ > 1 then SE will be larger
  - if φ < 1 then SE will be smaller
• when using quasibinomial, χ²- and z-tests have to change to F- and t-tests, which changes P values

Background
Production of an eggsac is influenced by a number of variables, such as body size, i.e. the amount of consumed food. For an experimental study we need to be able to predict the probability of production at a range of body sizes.
Design
In the laboratory, production of eggsacs was studied in a spider with a variable body size [mm]. As the body size was measured with the precision of 0.5 mm, all 160 individuals were classified into size classes, each containing 15 to 30 specimens. Females that produced an eggsac were recorded.
Hypotheses
• Is eggsac production related to the body size?
• If it is, what is the shape of the relationship?
• What is the model that can be used to predict eggsac production for spider sizes of 3–12 mm?
Variables:
body
n
eggs

Background
Synthetic insecticides often have a species-specific efficiency.
The recommended doses or concentrations then have to be adjusted.
Design
In the laboratory, the effect of an insecticide on the mortality of two aphid species was studied. The insecticide was applied at 6 concentrations [ppm]. Each concentration was tested on 30 individuals of both aphid species.
Hypotheses
• Is mortality affected by the concentration?
• Was the efficiency similar for both species?
• What is the LC50 (i.e. 50% lethal concentration) for both species?
Variables:
SPECIES: A, B
conc
n
dead

Background
Granivorous ants collect various seeds and bring them into the nest. Sympatrically occurring species may show trophic niche partitioning related to the size of collected seeds.
Design
Seed preference of two ant species was studied in the laboratory. Each of 25 ants of both species was offered a seed of variable size, expressed as its weight [mg]. The response of an ant was classified as "yes" or "no" if it took or refused to take a seed, respectively.
Hypotheses
• Is acceptance related to the seed size?
• Did both species have a similar preference for seed sizes?
• If not, what is the threshold size of seeds for both species? (The threshold size is defined as a size that is accepted with higher than 90% probability.)
Variables:
SPECIES: specA, specB
seed
take

• several coefficients of determination exist for GLMs
• McFadden's coefficient: based on the likelihood of models
• ranges from 0 to 1
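
McFadden's coefficient can be sketched in R as follows. This is only a minimal illustration on simulated Bernoulli data, assuming the usual definition 1 - logLik(model) / logLik(null model); the simulated data, the seed and the object names (fit, null, mcfadden) are hypothetical and do not come from the examples above.

```r
# Simulated 0/1 responses; all values and names are illustrative assumptions
set.seed(1)
x <- seq(0, 10, length.out = 200)
y <- rbinom(length(x), size = 1, prob = plogis(-3 + 0.6 * x))

# Model of interest and intercept-only (null) model
fit  <- glm(y ~ x, family = binomial(link = "logit"))
null <- glm(y ~ 1, family = binomial(link = "logit"))

# McFadden's coefficient: 1 - logLik(model) / logLik(null model)
mcfadden <- 1 - as.numeric(logLik(fit)) / as.numeric(logLik(null))
mcfadden
```

For 0/1 data both log-likelihoods are negative and the fitted model cannot fit worse than the null model, so the value falls between 0 and 1. With grouped binomial data (y, n-y) the log-likelihood contains additional constant terms, so the value would be interpreted with more caution.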