Stano Pekár
■ Gamma and lognormal data arise:
• precise measurements of small quantities (concentration), weight, time, etc.
• measurements are continuous
- negative values and zeros are not allowed
- distribution is skewed to the right
• logarithmic transformation of measurements will homogenise variance and adjust asymmetry of distribution
• moments - 2 parameters (jutr, atr)
- while on log scale variance is independent of mean, on original scale variance is a function of expected mean
• predicted values: ^^^^^^Q
• used to model inverse polynomials moments - 2 parameters (ju, cp)
Var(y) =
dispersion parameter (cp) = Var(y) / ju:
X
X
Welch test (t. test) to compare two means with heterogenous
variances
glm (formula, Gamma (link= . . .) )
• links:
- inverse (default)
- logarithmic (log)
- identity (identity)
• lm (log (y) ~ . .)
Background
In euryphagous predators the size of prey is positively related to their body size. There is an upper limit due to e.g. morphological constraints.
Design
In the laboratory, acceptance of food was studied in 36 species of granivorous beetles. Each carabid beetle was offered seeds of various sizes [g]. Preferred seed size was recorded. For each beetle body size [mm] was recorded too.
1 a 1
= a + p
kde seedi ~ Gamaifi^
ml <- glm anova(ml, test=rrF11)
Analysis of Deviance Table
Model: Gamma, link:: inverse Response: seed
Terms added sequentially (first to last)
Df Deviance Resid, Df Resid* Dev F Pr(>F)
NULL 35 15.3681
I(l/body) 1 8.3662 34 7.0019 42.624 1.787e-07 ***
1 o 1 1
= a + p-+ y-
fit body t hody\
seed i - Gama(fip
iti2 <- glm(seed - I + i anova (ml, m2, test=TTFM)
Analysis of Deviance Table
Model 1: Model 2: Resid.
_
seed - I(1/body)
seed ~ I (1/body) + I (l/bodyA2)
Df Resid, Dev Df Deviance F Pr(>F)
34 7,0019
33 7.0016 1 0.0003 0.0013 0.9713
o d
O
S ° 75 °
Vj o
CD
CN
O I
o i
o
O
Residuals vs Fitted
B
oil
o
o $> o .... o ....
lA
o 0
1 5 1 1 1 10 15 20 Predicted values glm[seed~l[l ]/body) i 25
-o o
O
CN O
O
d
CN
o
o I
0 Q
i-1-1-1-1-1-1-r
0 5 10 15 20 25 30 35
body
> summary (ml)
Call:
glm(formula = seed - l(l/body), family = Gamma)
Deviance Residuals:
Min IQ Median 3Q Max
-0.7530027 -0.4237538 0.0008676 0.2527096 0.7024871
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.7418 0.3162 5.508 3.76e-Q6 ***
I(l/body) 11.8626 2.4463 4.849 2.69e-05 ***
Signif. codes: 0 0.001 0,01 **' 0.05 1.f 0.1 1 ' 1
(Dispersion parameter for Gamma family taken to be 0.1962785)
Null deviance: 15.3681 on 35 degrees of freedom Residual deviance: 7.0019 on 34 degrees of freedom AIC: -49.67 6
REBBBI |(1536S1 - 7.0019) / 15.3681 = 0.54. 1/1.7418 = 0.574,
Coefficient of determination: Asymptote: immia
seed; = a + f}hodyi + yhodyf + z{, kde z{ - JV(0, a2), nezávisle pro jednotlivé jedince.
> m3 <- Im(seed ~ poly(body,2))
> summary(m3)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.30842 0.02318 13.305 B.17e-15 ***
poly(body, 2)1 0.55682 0.13908 4.004 0.000333 ***
poly(body, 2)2 -0.41591 0.13908 -2.990 0.005235 **
Signif. codes: 0 0.001 0.01 »*' 0.05 0.1 1 ' 1
Residual standard error: 0.1391 on 33 degrees of freedom Multiple R-Squared: 0.4307, Adjusted R-squared: 0.3962
F-statistic: 12.49 on 2 and 33 BE, p-value: 9.173e-05
-o
Q
l.-i
Ö
— Ö o
IS*
Q
CM
o
body
CM Ö
o Ö
a)
Od
CM Ö
o o
0.1
B
Residuals vs Fitted
o 34fl
0.2
—I-1-
0.3 0.4
27 o
0.5
Fitted values lm(seed - poly (body, 2JJ
2-way ANOYA
Background
In the gift-giving spider a male brings a prey to a female in order to avoid being cannibalised. Several variables can potentially influence how quickly female will accept the gift.
In the laboratory, effect of two variables was studied: satiation of female (satiated, starved) and their mating experience (mated, virgin). Time [s] of the gift presentation was recorded. Experiment was fully factorial, for each combination 10 males and females were used.
Hypotheses
Is presentation time affected by any of the two variables? If it is what is the difference between factor levels?
Variables
MATING: mated, virgin FEED: satiated, starved time
timeijk = a + MATING■ + FEEDk + MATING\FEEDjk + e()jt, s - JV(0, a2), nezävisle pro jednotliva pozoroväni.
> ml <- lm(time ~ mating*feed)
> anova(ml)
Analysis of Variance Table
Response: time
Df Sum Sq Mean Sq F value Pr(>F)
mating _ 165122 165122 0 .2558 0. 616098
feed _ 5625000 5625000 8 .7142 0. 005528 **
mating r feed _ 40322 40322 C: .0625 0. 904058
Residuals 36 23237845 645496
> anova(m3)
Analysis of Variance Table
Response: time
Df Sum Sq Mean Sq F value Pr(>F)
feed 1 5625000 5625000 9.1177 0.004507 **
Residuals 38 23443290 616929
Fitted values Im (time feed)
Predicted valjes glm(Hme ~ feed)
log(f^) = a + MATING} + FEEDk + MATING:FEEDjk, s ftme^ ~ Gama(fyk,
nezavisle pro jednotliva pozorovani,
> m4 <- glm(time - mating*feed, Gamma anova(m4, test=rrFTT)
4 1- ■ Df Deviance Resid. Df Resid, Dev F Px(>F)
NULL 39 122.018
mating 1 0.564 38 121.454 0. 3888 0. 5368618
feed 1 26.258 37 95.196 18, 1021 0. 0001425 ***
mating: feed 1 0.083 36 95.113 C. 0570 0. 8126218
---_
> anova(m6, test="F">
■t w m Df Deviance Resid. Df Resid. Dev F Pr(>F)
NULL 39 122. 018
feed 1 25.922 38 96. 096 20 .3 6.138e-05 ***
Signif, codes: 0 0,001 ,**' 0,01 »*' 0.05 0,1 1 ' 1
> summary (m6)
Coefficients
Estimate Std. Error t value Pr(>|t|)
(Intercept) £.8222 feedstarved -1.6982
0.2527 26.999 < 2e-16 *** 0.3573 -4.752 2.87e-05 ***
> m7 <- lm(log(time) - mating*feed}
> anova(m7)
Analysis of Variance Table
Response: log(time)
Df Sim Sq Mean Sq F value Pr(>F)
mating 1 11.432 11.432 2.7578 0.10547
feed 1 19.262 19.262 4.6468 0.03787
matingrfeed 1 0.019 0.019 0.0045 0.94681 Residuals 36 149.226 4.145
> m8 <- lm(log(time) - feed)
> summary (m8)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.4658 0.4598 11.887 2.27e-14 ***
feedstarved -1.3379 0.6503 -2.134 0.0393 *
Background
The nutritional quality of the diet affects growth of organisms in a various ways. To find optimal diet for cockroaches the following experiments was performed.
Design
Effect of five diet types (control, lipidl, lipid2, proteinl, protein2) was tested on body weight [g] of male and female cockroaches. For each diet 10 females and 7 males were used. Their body weight [g] was recorded before and after the experiment.
Hypotheses
Is weight influenced by the diet type?
If so which diet resulted in largest weight?
Is weight on diets similar for males and females?
0.2 0.3 0.4 0.5
0.2 0.3 0.4 0.5 L_I_I_I_I
lipidl Iipid2 prof&in1 protein2
O ° O O
O ^° A 0 4 A o
°o°^ Aoo0 o o OA j, A A 0 A A Q O Q O 0 o A A O A A*0 & A ^ O A A
4 H
5
2 H
i H
n i i í
0.2 0.3 0.4 0.5
0 2 0 3 0 4 0.5
0.2 0.3 0.4 0.5
sta rt
> ml <- lm(log(weight) - diet*sex*start)
> anova(ml)
Analysis of Variance Table
Response: log(weight)
Df Sum Sq Mean Sq F value Pr (>F)
diet 4 16.1349 4.0337 150.3981 <2e-16 ***
sex 1 0 .0261 0.0261 0.9732 0 .3275
start 1 0 .0455 0.0455 1.6956 0 .1975
diet:sex 4 0 .0866 0.0217 0 . 8073 0 . 5250
diet:start 4 0 .0244 0.0061 0.2272 0.9222
sex:start 1 0 .0315 0.0315 1 .1743 0.2825
diet:sex:start 4 0 .1829 0.0457 1 . 7048 0 .1596
Residuals 65 1 .7433 0.0268
> anova(lm(log(weight) ~ sex*diet*start))
Analysis of Variance Table
Response: log(weight)
sex die ~ start sex:diet sex:start diet:start
Df Sum Sq Mean Sq F value Pr(>F)
1 0.0261
4 16.1349
1 0.0455
4 0.0366
1 0.0196
4 0.0363
sex:diet:start 4 0.1329 Residuals 65 1.7433
0.0261 0.9732 0.3275
4.0337 150.3981 <2e-16 ***
0.0455 1.6956 0.1975
0.0217 0.8073 0.5250
0.0196 0.7302 0.3959
0.0091 0.3382 0.8512
0.0457 1.7043 0.1596 0.0268
log(weightiJk) = a + DIET, + SEXk + fistarti + ystartf + DIET:SEXjk + 5jistarti +
8lkstartI + Sjkstarti + w^startf + a; ik$tartl + a> ^start? + £jyjt, (9-13) kde £ľí>it - MO, tf2)> nezávisle pro jednotlivá měření.
> m2 <- lm(log(weight) - diet*sex*poly(start,2>)
> anova(ml, m2)
Analysis of Variance Table
Model 1: log (weight) - diet * sex * start Model 2: log (weight) - diet * sex * poly(start, 2) Res.Df RSS Df Sum of Sq F Pr (>F)
1 65 1.7433
2 55 1.4122 10 0.33113 1.2896 0.2592 > anova(m3)
Analysis of Variance Table
Response: log(weight)
Df Sum Sq Mean Sq J ■ value Pr(>F)
die 7 4 16.1349 4 .0337 144 .4941 <2e-16
sex _ 0.0261 0 .0261 0 .9350 0.3369
start _ 0.0455 0 .0455 _ .6290 0.2061
diet:sex 4 0.0366 0 .0217 0 .7756 0.5448
diet:start 4 0.0244 0 .0061 0 .2133 0.9274
sex:start _ 0.0315 0 .0315 _ .1282 0.2919
Residuals 69 1.9262 0 .0279
> summary (m8 )
Call:
lm(formula = log(weight) ~ diet)
Residuals:
Min 1Q Median 3Q Max
-0.33311 -0.09764 -0.02934 0,11146 0.41505
Coefficients:
Estimate Std. Error t value Pr(>1t1)
(Intercept) -0.05319 0 .03967 -1.341 0.184
dietlipidl 0,55181 0 .05610 9,836 2.02e-15 * * *
dietlipid2 0,52190 0 .05610 9,303 2 .23e-14 * * X
dietproteinl 1.17298 0 .05610 20. 908 < 2e-16 * * X
dietprotein2 1,12984 0 .05610 20.139 < 2e-16
---
> summary (m9)
■i * ■ Coefficients:
Estimate std. Err M" t value Pr (>|t 1 )
(Intercept) - 0 .05319 C , ,03940 -1 ,35 0 .181
diet21ipid 0 .53686 C: , ,04825 11 ,13 <2e- 16 ***
diet2prot _ .15141 C: , .04825 23 .86 <2e- 16 ***
>í/jalyse%
Stano Pekár
■ Poisson data arise when data are:
- counts/frequencies of individuals, species, cells
- events of behaviour, etc.
- always positive integers
- counts are often low (including 0)
• we count how many times an event occurred but we do not know how often it did not occur (we do not know n)
moment: ^^^^^^^^^^U
IF
die
• x2 test (chisq. test) to analyse 2-dimension tables
• Fisher exact test (fisher .test) to analyse 2x2 tables
• Mantel-Haenszel test (mantelhaen .test) to analyse 3-dimension tables for independence
• Log-linear analysis (loglin) to study complex frequency tables
• Contingency tables (xtabs) to study effect of factors
• Standard regression (lm) can be used after transformation
- squareroot transformation
- can predict values out of bounds (negative)
• Poisson GLM (glm) to study effect of both factorial and continuous predictors
•glm(..., family = poisson(link=...))
link functions:
- logarithmic (log)
- squareroot (sqrt)
- identity (identity)
• estimated parameters are on logaritmic scale (-oo, +oc)
• inverse function to log is exp
1-wiy IIO¥I
Background ^ Diversity of organisms changes with the age of the habitat. According to the intermediate disturbance hypothesis, the diversity increases and then decreases with age, thus being highest at medium age.
■
Design KIHBHBai^^™
In 15 apple orchards diversity of arachnids was studied on trees, The orchards were of variable age, classified into 3 classes: 0-9, 10-19 and 20-30 years old. Each class was represented by 5 orchards.
> ml <- glm (divers orchard, f amily=poisson)
> anova (ml f test="ChiM)
Analysis of Deviance Table
Model: poisson, link: log Response: divers
Terms added sequentially (first to last)
Df Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL 14 38.964
orchard 2 26.246 12 12.718 1.999e-06
Coefficients :
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.2192 0.1474 15.051 < 2e-16
orchardolder 0.3442 0.1763 4.788 1.63e-06
orchardoldest 0.3457 0.1927 1.794 0.0727
> contrasts(orchard) <- "contr.helmertIT
> m2 <- glni (divers orchard, family=poisson)
> summary (rn2)
+ + +
Coefficients :
Estimate Std. Error z value Pr(>|z|) (Intercept) 2.61585 0.07186 36.404 < 2e-16 *** orchardl 0.42209 0.08815 4.783 1.68e-06 +** orchard2 -0.02545 0.05072 -0.502 0.616
> orchardl <- ordered(orchard)
> m3 <- glm(divers ^ orchardl, family=poisson)
> summary (m3)
Coefficients :
Estimate Std. Error z value Pr(>1z 1 )
(Intercept) 2.61585 0 .07186 36.404 < 2e-16 ***
orchardl.L 0.24448 0 .13624 1.794 0.0727 .
orchardl.Q -0.54 813 0 .11144 -4.919 8.71e-07 ***
> m.3 <- glm (divers ~ orchard - 1, poisson)
> summary (m3)
Coefficients :
orchardyoung orchardolder orchardoldest
Estimate Std. Error z value Pr(>|z|)
2.21920 3.06339 2.56495
0 .14744 0 .09667 0 .12403
15.05 31. 69 20. 68
<2e-16 *** <2e-16 *** <2e-16 ***
> exp {confint (m3) )
Waiting for profiling to be done...
2.5% 97.5% orchardyoung 6.790864 12.12010 orchardolder 17.597063 25.71441 orchardoldest 10.090235 16.42096
• arises when dispersion parameter cp K^^^BHwarHBI
i.e. the residual deviance is not similar to the residual degrees of freedom _
- overdispersion: variance is larger —» cp > 1
- underdispersion: variance is smaller cp<\
• causes:
- if the distribution is aggregated
- if counts are not independent
- lack of important variables, etc.
- suspicious data
• solution: use quasipoisson family
• this will influence SE of parameter estimates - if cp > 1 then SE will be larger
-if cp < 1 then SE will be smaller
• without correction for overdispersion there would be too many false positive results (in favour of HA)
• when using quasipoisson y}- and z- tests have to change to F- and t- tests
Background
Abundance of carabid beetles in cereals depends on abiotic and biotic factors. If we understand how abiotic factors influence abundance of carabids then we can adapt certain management practices to increase the abundance when needed.
1
Design
In the field, on 21 wheat plots the abundance of carabid beetles was studied by means of pitfall traps. At every site average day temperature [°C] and average sun activity [W/m2] was recorded.
Hypotheses
Was abundance of beetles affected by any of the two variables? If so what is the model of the relationship?
Variables
temp
sun
abun
log(/iŕ) = a + ßltempi + j82swni + ötempisuni, kde öfrwri,. ~ Poi(fit), nezávisle pro jednotlivé porosty.
> ml <- glm(abun ~ temp*sunf family=poisson)
> summary(ml)
Coefficients
(Intercept) temp sun
temp:sun
Estimate Std. Error z value Pr(>|z|)
4.195e+G0 4.745e-01 8.840 < 2e-16 ***
-5.386e-G2 2.258e-Q2 -2.385 0.0171 *
■1.151e-03 2.364e-04 -4.869 1.12e-06 ***
6.257e-05 1.006e-05 6.221 4.95e-10 ***
Signif. codes: 0
^ -k -k -k f
0.001
\ -kk r
0.01
0.05
0.1
1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 317.229 on 20 degrees of freedom Residual deviance: 98.657 on 17 degrees of freedom
> m2 <- update(mlf family=quasipoisson)
> anova(m2 f test="F")
Df Deviance Resid.
NULL temp sun
temp:sun
1 1 1
153.10 27 .90 37 .57
Df Resid. Dev F Pr(>F)
20 317.23
19 164.12 24.5836 0.0001196 ***
18 136.23 4.4796 0.0493541 *
17 98.66 6.0324 0.0251002 *
> m3 <- glm anova(m3, test=rrChi")
4 i m Df Deviance Resid. Df Resid. Dev P(>I Chi|)
NULL 19 75 .292
temp 1 40.291 18 35 .001 2.188e-10
sun 1 12 .165 17 22 .836 4.870e-04
temp:sun 1 0.117 16 22 .719 0.732
> m.4 <- update (m3, ~.-temp: sun)
> anova(m4, test=rľChi")
■■ ■■ ■ Df Deviance Resid. Df Resid. Dev P(>l Chi | )
NULL 19 75 .292
temp 1 4 0 .291 35 .001 2.188e-10
sun 1 _1 .165 17 22 .836 4.870e-04
> library(car)
> Anova(m4)
Analysis of Deviance Table (Type II tests)
Response: abun
LR Chisq Df Pr(>Chisq)
temp 12.567 1 0.0003926 * * *
sun 12.165 1 0.0004870 * * *
> summary (m4)
4 » ■
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.283e+Q0 2.088e-01 10.933 < 2e-16 *** temp 3.781e-Q2 1.070e-02 3.534 0.000409 *** sun 1.954e-04 5.655e-05 3.455 0.000550 ***
Signif. codes: 0 0.001 0.01 0.05 0.1 1 ' 1
{Dispersion parameter for poisson family taken to be 1)
Null deviance: 75.2 92 on 19 degrees of freedom
Residual deviance: 22.836 on 17 degrees of freedom AIC: 135.76
Some spiders are specialised in their diet. Specialisation can involve evolution of physiological and behavioural traits, such as prey-specific venom and number of attacks.
Design
In the lab, the number of attacks of an ant-eating spider on ants of two subfamilies was observed. For each subfamily 20 species of ants were used. Each ant species was tested once. For each ant body size was recorded as it may influence its susceptibility to venom.
Hypotheses
Was the number of attack related to ant size?
Was the number of attacks similar for ants of both subfamilies?
What is the shape of the relationship?
Variables
ANT: famA, famB
size
number
> ml <- glm(number * size*ant, family=poisson)
> anova(mlf test=MChiM)
NULL s ize ant
s ize:ant
Df Deviance Resid.
1 93.395 1 75,555 1 25.804
Df Resid. Dev P(>|Chi|) 39 215.561 38 122.167 4.284e-22 37 46.612 3,554e-18
36 20.808 3.779e-07
> summary (ml)
m m m
Coefficients :
Estimate £
(Intercept) 0.89794
size -0.02154
antfamB -2.66924
size:antfamB 0.70407
Signif. codes: 0
d. Error z value Pr (>|z| ) 0.64904 1.383 0.166512 0.12456 -0.173 0.862735 0.80637 -3.310 0.000932 *** 0.14579 4.829 1.37e-06 ***
0.001 0.01 0.05 0.
{Dispersion parameter for poisson family taken to be 1)
Null deviance: 215.561 on 39 degrees of freedom Residual deviance: 20.803 on 36 degrees of freedom AIC: 153.15
A
Residuals vs Fitted
B
o
CM -
ol3 34 o
0 □ o
°\ 0 V ° O 0 a o
"O" o 0
0
0 o 32°
0.5
.0
i 1.5
2.0 2.5
3.0
Predicted values glmfnumber ~ size * ant]
o
o
d
o d
d o
si;
> m2 <- glm(number ^ poly(sizef2)*antr poisson)
> anova(mlf m2 , test=MChi")
Analysis of Deviance Table
Model 1: number - size * ant
Model 2: number - poly(size, 2) * ant
Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1 36 20.3084
2 34 20.7673 2 0.0411 0.9797
Jnumber = a + fisizei + ysxze] + e kde e. ~ N(0, cr2), nezávisle pro jednotlivá pozorování.
> m3 <- lm (sqrt (number) -* size + I(sizeA2), subset= (ant==" f amB,T) )
> anova(m3)
Analysis of Variance Table
Response: sqrt (number)
Df Sum Sq Mean Sq F value Pr(>F) size 1 28.3476 28.3476 161.0631 4.253e-10 ***
I(size^2) 1 1.1930 1.1930 6.7783 0.01855 * Residuals 17 2.9921 0.1760
o
=)
ač
A
Residuals vs Fitted
B
o
CM
1
E
o
Fitted values lm(sqrt(number)~size + l(sizeA2))
size
Background
Some predators use conditional strategies to catch prey. The use of strategy often depends on the characteristics of prey.
Design
In the field, it was observed which of three strategies spiders used to capture prey. For each trial, size (two size classes) and movement (slow or fast) of prey was recorded. Altogether 88 trials were observed.
Hypotheses
Is use of strategy influenced by prey size and its movement? If so which prey is captured by strategy A, B and C?
Variables
PREY: fast, slow
SIZE: large, small
STRATEGY: stratA, stratB, stratC
freq
A
B
stratA stratB stratC stratA stratB stratC
strategy strategy
> ml <- glm(freq ~ strategy*size*preyF family=poisson)
> summary(ml)
Call:
glm(formula = freq ~ strategy * size * prey, family = poisson)
Deviance Residuals: [1] 000000000000
Coefficients :
Estimate Sto L. Error z value Pr(>|z|)
{Intercept) : .485e+00 2. 887e-01 8 .608 <2e-16
strategystratB -4 .Q55e-01 4. 564e-01 -0.888 0. 3744
strategystratC -1 .792e+00 7. 638e-01 -2.346 0. 0190
sizesmall 5 .596e-01 3. 619e-01 1 .546 0. 1220
preyslow -1 .823e-01 4. 282e-01 -0.426 0. 6702
strategystratB: sizesmall -2 .594e+01 6. 965e+G4 -0 .000372 0. 9997
strategystratC: sizesmall -1 .253e+G0 1. 277e+00 -0.981 0. 3266
strategystratB: preyslow 4 .055e-01 6. 390e-01 0 .635 0. 5257
strategystratC: preyslow -5 .108e-01 1. 297e+00 -0.394 0. 6938
sizesmall:preyslow B .224e-02 5. 325e-01 0 .154 0. 8773
strategystratB: sizesmall :preyslow : .433e+01 6. 965e+G4 0 .000350 0. 9997
strategystratC: sizesmall :preyslow -2 .269e+01 6. 965e+04 -0 .000326 0. 9997
> anova(ml, test="Chi")
+ + +
Df Deviance
NULL
strategy 2 64. .205
s ize 1 0. .045
prey 1 0. .000
strategy:size 2 15, . 939
strategy:prey 2 2. . 962
size:prey 1 0. .507
strategy:size:prey 2 4, .307
Resid. Df Resid. Dev P(>|Chi|)
11 87 .966
23.761 1 .143e-14
8 23.715 0 .331
7 23.715 1.000
5 7 .776 3 .458e-04
3 4 .814 0 .227
2 4 .307 0 .476
0 3.033e-10 0.116
> m2 <- update(mlr ~.-strategy: size rprey)
> anova(m2, test=M Chi")
■ ■ ■ Df Deviance Resid . Df Resid. Dev P (> IChi|)
NULL 11 87. 966
strategy : 64 .205 9 23. 761 1.143e-14
size 1 0 .045 6 23. 715 0.831
prey 1 0 .000 7 23. 715 1.000
strategy:size : 15 .939 5 7. 776 3.458e-04
strategy:prey : 2 .962 3 4. 814 0.227
size:prey 1 0 .507 2 4. 307 0.476
> m.3 <- update (m2 , ~*. - strategy: prey)
> anova (m.3 r test=" ChiM)
+ + + Df Deviance Resid . Df Resid. Dev P(>|Chi|)
NULL 11 87. 966
strategy 2 64 .205 9 23. 761 1.143e-14
size 1 0 .045 8 23. 715 0.831
prey 1 0 .000 7 23. 715 1.000
strategy:size 2 15.939 5 7. 776 3.458e-04
size:prey 1 0 .045 4 7. 731 0.831
> summary (m3)
Call:
glm(formula = freq - strategy + size + prey + strategy:size + size:prey, family = poisson)
Deviance Residuals:
: : 3 4 5 t 7
-0.3233 1.2076 - : .0111 -0.2297 0 .3990 -0 . 4079 0.3227
S 9 10 11 12
-1.9777 0.6395 G .2194 -0.4077 0 .3585
Coefficients :
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2 .42083 0 .26010 9 .307 < 2e-16 ***
strategystratB -0 .20067 0 .31782 -0 . 631 G .527782
strategystrate -1 .99243 0 .61546 -3 .237 G .001207 **
s izesmall 0 .55237 0 .34042 1 . 623 G .104669
preyslow -0 .04652 0 .30508 -0 .152 G .878805
strategystratB: sizesmall -2 .10191 0 .61318 -3 .428 G .000608 ***
strategystratC: sizesmall -1 .69645 1 .18481 -1 .432 G .152193
sizesmallipreyslow 0 .09097 0 .42662 0 .213 G .831142
> attacks <- tapply (predict (m3 , type=MresponseTT) , list (size, strategy) , mean)
> attacks
stratA stratB stratC large 11 9 1.5
small 20 2 0.5
30-
25-
in
ij 20H o
Ž 15H o
10-
5-
0-
I a roe
stratA stratB
srnal
stratA stratB
+
stratC
Strategy
stratC _I_
■ NB is a parametric alternative to Poisson model with overdispersion
• distribution of y is strongly asymmetric with many zeros
• NB has two parameters, ju and 6
• moments:
- 6 is aggregation parameter (0?oo)
- if 6 > 1 .. random distribution, 6 < 1 .. aggregated distribution
- 6 can be estimated from
glm. nb (f ormula) from MASS library
• links:
log (default) sqrt
identity
• begin with Poisson model, if overdispersion is large switch to glm. nb
Grain beetles are serious pests in grain stores. They may occur not only in the grain but also in crevices of corridors. It is essential to know where they occur before control methods are applied.
Design
Density of grain beetles was surveyed in a grain store by means of sticky traps. Traps were installed in two places: 25 traps in the corridors and 25 traps in the grain. After few days number of beetles was recorded.
log(fy) = a + PLACEj, kde density. ~ Poi(ß), nezávisle pro jednotlivé pasti.
> ml <- glm(density ~ placef family=quasipoisson)
> anova(mlf test="F")
m- m- -m
Df Deviance Resid. Df Resid. Dev F Pr(>F)
NULL 4 9 80 2 6.5
place 1 1350.1 48 6676.4 6.0434 0.01762 *
> summary (nil)
■■■■■■
Coefficients :
Estimate Std. Error t value Pr(>|t|) (Intercept) 4.5161 0.3125 14.45 <2e-16 ***
placegrain -1.6280 0.7715 -2.11 0.0401 *
Signif. codes: 0 0.001 0.01 **' 0.05 0.1 1 ' 1
(Dispersion parameter for quasipoisson family taken to be 223.3983)
log(ty) = a + PLACE., kde density j ~ NB(pip 0), nezávisle pro jednotlivé pasti.
> tapply(density, place, var)/tapply(densityf placef mean)
floor grain 386.53096 60.20546
> tapply(density, place, function(x) mean(x)A2/(var(x)-mean(x)))
floor grain 0.2372524 0.3033504
> library(MASS)
> m2 <- glm.nb(density ~ place)
> anova(m2)
Analysis of Deviance Table
Model: Negative Binomial(0.3318), link: log Response: density
Terms added sequentially (first to last)
Df Deviance Resid. Df Resid. Dev P(>|Chi|) NULL 49 70.174 place 1 9.877 48 60.297 0.002
Warning message:
In anova.negbin(m2) : tests made without re-estimating 'theta'
> summary (m.2)
Call:
glm.nb(formula = density - place, init.theta = 0.33184400 6124825,
link = log) Coefficients :
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.5161 0.3 47 8 12.98 4 < 2e-16 ***
placegrain -1.6280 0.4937 -3.297 0.000976 ***
Signif. codes: 0 0.001 0.01 0.05 *0.1 1 ' 1
(Dispersion parameter for Negative Binomial(0.3318) family taken to be 1)
Null deviance: 70.174 on 4 9 degrees of freedom Residual deviance: 60.297 on 48 degrees of freedom AIC: 430.95
Number of Fisher Scoring iterations: 1
Theta: 0.3318 Std. Err.: 0.0610
2 x log-likelihood: -424.9480
> a <- split(x=density, f=place)
> m.3 <- glm, nb (floor ~ 1)
> summary (m3)
m- m- -m
Null deviance: 31.307 Residual deviance: 31.307 AIC: 245.47
on 24 degrees of freedom on 24 degrees of freedom
Number of Fisher Scoring iterations: 1
> m4 <- glm«nb(grain ~ 1)
> summary {m.4)
■ m m
Null deviance: 2 9.197
Residual deviance: 2 9.197 AIC: 186.78
on 24 degrees of freedom on 24 degrees of freedom
Number of Fisher Scoring iterations: 1
Theta: 0.39 9 Std. Err.: 0.111
2 x log-likelihood:
-182.780
> m.5 <- glm. nb (density ~ place-1)
> exp {confint 1)
• Binomial GLM (glm) to study effect of both factorial and continuous predictors
• glm(..., family = binomial(link=...))
link functions:
- logit (logit)
- probit (probit)
- complementary logit (cloglog)
Data format:
• Binomial distribution ... individuals within a group are homogenous
- two vectors (y? n-y) or (y? n) of integers
• Bernoulli (binary) distribution ... individuals within a group are heterogenous, each characterised by a continuous character
-n=\
- single vector of O's or 1 's
Background
Some weed seeds may germinate following water priming (by rain) more than others thus attaining likely competitive advantage.
m
Design ^^^^^m^^^^m^H
The effect of water priming on the germination of weed seeds of 4 genera was studied in the laboratory. Each of 5 days 400 seeds of each genus were sown (200 seeds on control and 200 seeds on wet soil). Altogether 2000 seeds per genus were sown. Germination was recorded thereafter. Based on assumption of similar conditions during 5 days, data from 5 days were pooled.
Hypotheses
• Does water priming promote germination?
• If it does was the effect similar for all four
• Which species germinated most and least?
ienera?
Variables:
TREATMENT: control, water GENUS: genA, genB, genC, genD
germ
> y <- abind(germ, n-germ)
> ml <- glm(y ~ genus*treatmentf family=binomial)
> anova(mlf test="ChiM)
Analysis of Deviance Table
Model: binomial, link: logit Response: y
Terms added sequentially (first to last)
Df
NULL
genus 3 treatment 1 genus:treatment 3
Deviance Resid. Df
7
638.74 4 30.23 3 0 .37 0
esid. Dev P(>|Chi|) 669.34 30.60 4.026e-138 0.37 3.840e-08 1.212e-13 0.95
> m2 <- update(mlf ~.-genus:pesticide)
> summary(m2)
Coefficients
(Intercept) genusgenB genusgenC genusgenD
Estimate Std. Error z value Pr(>|z|)
0 .56138 -1.59933 -1.15462 -1.06030
0.05256 10.681
treatmentwater 0.25859
0 0 0 0
06860 06614 06583 04710
-23.313 -17.457 -16.106 5.491
<2e-16 ***
<2e-16
-k -k -k
<2e-16 *** <2e-16 *** 4e-08 ***
> 1/ (1+ exp(-0.56138))
[1] 0.6367718
> 1/(1 + exp(-0.56138+1.59933))
[1] 0.2615457
> genus1 <- genus
> levels(genusl)
[1] "genA" "genB" "genC" "genD"
> levels(genusl)[3:4] <- "genCD"
> m3 <- glm(y ~ genusl + treatment, binomial)
> anova(m2, m3, test="Chi,T)
Analysis of Deviance Table
Model 1: y - genus + treatment Model 2: y - genusl + treatment
Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1 3 0.37316
2 4 2.49523 -1 -2.12207 0.14519
> summary(m3)
m- m- -m
Coefficients :
Estimate Std. Error z value Pr(>|z|)
(Intercept) genuslgenB genuslgenCD
-1.59933
-1.10723
0.56141
0.05749
0.06860
0.05256
23.31
19.26
10.68
< 2e-16 ***
< 2e-16 ***
< 2e-16 ***
treatmentwater
0 .25852
0.04709
5.49 4.02e-08 ***
> genus2 <- genus1
> levels(genus2)
[1] "genA" "genB" "genCD"
> levels(genus2) [2 : 3] <- "genBCD"
> m4 <- glm(y ~ genus2 + treatmentf binomial)
> anova (m.3 f m.4 f test="Chi,T) Analysis of Deviance Table
Model 1: Model 2: Resid.
1
2
y - genus1 + treatment y - genus2 + treatment Df Resid. Dev Df Deviance P(>|Chi|)
4 2.495
5 73.684 -1 -71.189 3.246e-17
• statistical and biological effects are not identical
• statistical effects are affected by precision of measurements, number of measurements, type of test
• Cohen's coefficient:
• h < 0.2 ... weak effect
• h > 0.8 ... strong effect
> both <- paste(pesticide,genus!
> m4 <- glm(y ~ factor(both) - 1, binomial)
> 1/(1+exp (-confint (m4 ) ) )
(Waiting for profiling to be done...
2.5% 97.5%
factor (both)control genA 0.6048442 0.6644666 factor (both)control genB 0.2315221 0 .2857134 factor(both)control genCD 0.3485230 0.3908104 factor(both)water genA 0.6670153 0.7239840 factor(both)water genB 0.2896266 0.3473026 factor(both)water genCD 0.4044330 0.4477560
• arises when dispersion parameter cp P^EffWffpfWBI
- overdispersion: variance is larger —» cp > 1
- underdispersion: variance is smaller —» cp < 1
• causes:
- if the model is mispecified
- lacks important explanatory variables
- relative frequency is not constant within a group
• solution: use quasibinomial family in which variance is
estimated as ^^^^^^^^^^ instead of m
this will influence SE of parameter estimates
- if (p > 1 then SE will be larger
- if cp < 1 then SE will be smaller
changes P values
• when using quasibinomial %2- and z- tests have to change to F- and t- tests
Rtartssioii
Background ^ Production of eggsac is influenced by a number of variables, such as body size, i.e. amount of consumed food. For an experimental study we need to be able to predict probability of production at a range of body sizes. 9^^^BI^^^H^^^H
Design ^^^^^^■■■^^^^B
In the laboratory, production of eggsacs was studied in a spider with a variable body size [mm]. As the body size was measured with the precision of 0.5 mm, all 160 individuals were classified into size classes each containing 15 to 30 specimens. Females that produced eggsac were recorded.
Hypotheses
• Is eggsac production related to the body size?
• If it is what is the shape of the relationship?
• What is the model that can be used to predict e; for spider sizes of 3-12 mm?
Variables:
;sac production
> tr <- asin(sqrt(p))
> ml <- lm(tr ~ body + I(bodyA2), weights=n)
> summary(ml)
Coefficients :
Estimate Std. Error t (Intercept) -2.34592 0.59329 body 1.30161 0.24776 I(bodyA2) -0.11121 0.02433
> m2 <- update(ml, -.-I(bodyA2))
> summary(m2)
■ m m
Coefficients :
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.28836 0.31429 0.918 0.4010
body 0.17649 0.06279 2.811 0.0375 *
value Pr(>|t|) -3.954 0.01676 *
5.254 0.00628 ** -4.571 0.01025 *
body
body
> y <- cbind(eggs, n-eggs)
> m3 <- glm(y ~ body + I(bodyA2), family=binomial)
> summary(m3)
m -m m
Coefficients :
Estimate Std. Error z value Pr(>|z|) (Intercept) -13.7857 3.8482 -3.582 0.000340 ***
body 5.7218 1.6771 3.412 0.000645 ***
I(bodyA2) -0.4825 0.1695 -2.846 0.004427 **
Signif. codes: 0 0.001 ***' 0.01 %*' 0.05 *.r 0.1 1 ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 44.2136 on 6 degrees of freedom Residual deviance: 3.3357 on 4 degrees of freedom
> summary (m.4)
■ ■ ■ Coefficients :
Estimate Std. Error z value Pr(> z|)
(Intercept) -3.9270 1.1038 -3.558 0.000374 ***
body 1.2079 0.2756 4.383 1.17e-05 ***
Signif. codes: 0 *# o .001 ***' 0.01 **' 0.05 y.r 0.1 1 ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 44 .214 on 6 degrees of freedom
Residual deviance: 11 .072 on 5 degrees of freedom
> m.5 <- update (m.4 f f amily=quasibinomial)
> summary (m5)
(Dispersion parameter for quasibinomial family taken to be 3.332466)
> anova(m5, test="F")
+ * * Df Deviance Resid. Df Resid. Dev F Pr(>F)
NULL 6 44 .214
body 1 33.141 5 11 .072 9. 945 0.02528 *
body
Background i>iti mm.
Synthetic insecticides often have a species-specific efficiency. The recommended doses or concentrations then have to adjusted.
Design
In the laboratory an effect of an insecticide on the mortality of two aphid species was studied. The insecticide was applied at 6 concentrations [ppm]. Each concentration was tested on 30 individuals of both aphid species.
Hypotheses
•Is mortality affected by the concentration?
• Was the efficiency similar for both species?
• What is the LC50 (i.e. 50% lethal concentration) for both species?
Variables: SPECIES: A, B
cone
dead
log
/ 7t ^
\
= a + SPECIES . + fJlogiconCi) + ôjlogiconc^)^
kde decide ~ Binin^ nezávisle pro jednotlivá pozorování.
> y <- cbind(dead, n-dead)
> ml <- glm (y ** log (cone) *species F binomial)
> anava(ml)
■■■■■■ Df Deviance Resid . Df Resid. Dev P(>1 Chi 1)
NULL 11 185 .807
log(conc) 1 110.170 10 75 . 638 8.996e-26
species 1 62.087 9 13 .551 3.286e-15
log(cone):species 1 1.343 8 12 .207 0.246
> m2 <- update(mlf ^. -log(cone) :species)
> anova(m2)
m- m- -m Df Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL 11 185.807
log(conc) 1 110.170 10 75.638 8 .996e-26
species 1 62.087 9 13.551 3.286e-15
> summary (m2)
Coefficients :
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.3825 0.2201 6.280 3.39e-10 ***
log(conc) 1.2328 0.1348 9. 146 < 2e-16 ***
speciesB -2.2117 0.3180 -6.955 3.52e-12 ***
Signif. codes: 0 0.001 0.01 %*' 0.05 0
(Dispersion parameter for binomial family taken to be
Null deviance: 185.807 on 11 degrees of freedom Residual deviance: 13.551 on 9 degrees of freedom
> m.3 <- glm(y ~ species + log (cone) - 1, binomial)
> summary (m3)
Coefficients :
Estimate Std. Error z value Pr(>|z|)
speciesA speciesB log(cone)
1.3325 -0.8293 1.2328
0 .2201 0 .2020 0.1343
> library(MASS)
> dose.p(m3, cf=c(l,3), p=0.5)
Dose SE p = 0.5: -1.121418 0.1627097
> dose.p(m3, cf=c(2,3), p=0.5)
Dose SE p = 0.5: 0.6726313 0.159251
6.230 3.39e-10 *** 4.106 4.02e-05 *** 9.146 < 2e-16 ***
exp(-1.121) - 0326
exp(0,673) = 1.96.
Granivorous ants collect various seeds and bring them into nest. Sympatrically occurring species may show trophic niche partitionin related to the size of collected seeds.
Design
Seed preference of two ant species was studied in the laboratory. Each of 25 ants of both species was offered seeds of variable size expressed as its weight [mg]. Response of ants was classified as "yes" or "no" if it took or refused to take a seed, respectively.
Hypotheses
• Is acceptance related to the seed size?
• Did both species have similar preference for seed sizes?
• If not what is the threshold size of seeds for both species?
(The threshold size is defined as a size that is accepted with higher than 90% probability)
Variables:
SPECIES: specA, specB
seed
take
0.8-
O.ó-
0.4-
0.2-
0.0-
O ODWQ DO O O
QEO CD
i-1-1-1-1-1- i-1-1-1-1-r
0.0 0.5 1.0 1.5 2.0 2.5
seed
> ml <- glm(take - seed*speciesf family=binomial)
> summary(ml)
■ m m
Coefficients :
Estimate Std. Error z value Pr(>1z 1 )
(Intercept) 4.012 1. 646 2 .437 0.01480 -J:
seed -8.346 3.315 -2.517 0.01182 ■J;
speciesspecB -10.957 3. 697 -2.964 0.00304 * *
seed:speciesspecB 19.147 6.141 3.118 0.00182 * *
Signif. codes: 0 \-k**r 0.001 A **' 0.01 1 * ' 0 . 05 y.r 0 .1
{Dispersion parameter for binomial family taken to be 1)
Null deviance: 68.593 on 4 9 degrees of freedom Residual deviance: 24.726 on 4 6 degrees of freedom
> anova(mlf test="Chi")
* + * Df Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL 49 68 .593
seed 1 0.054 48 68 .539 0.817
species 1 0.325 47 68 .214 0 .568
seed:species 1 43.488 46 24 .726 4.267e-ll
> m2 <- glm(take ** log (seed) *species, binomial)
> AIC (ml, m2)
df AIC ml 4 32.72631 m2 4 32.23823
> m3 <- glm(talce ** seed*species, binomial (link=cloglog) )
> AIC(m3)
[1] 31.63241
• several for GLM models
• McFaden's coefficient - based on likelihood of models
• ranges from 0 to 1
binomial) > 1-logLik(ml)/logLik(m4)
'logLik' 0.6395213 (df=4)
Residuals vs Fitted
Normal Q-Q
Predicted values
Obs. number
> (log<0.9/0.1)-4.012)/-8.346
[1] 0.2174425
> (log<0.9/0.1)-4.012+10.957)/{-8.346+19.147)
[1] 0.3464239