Gaussova distribuce a testování hypotéz Mojmír Dočekal 2024-04-30 2 Contents 1 Dvě témata 5 2 Gaussova distribuce 7 2.1 Nejznámější typ distribuce . . . . . . . . . . . . . . . . . . . . . . 7 2.2 Rozdíly mezi Gaussovými křivkami . . . . . . . . . . . . . . . . . 10 2.3 Random effects v experimentech . . . . . . . . . . . . . . . . . . 12 3 Testování hypotéz 29 3.1 Historie . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.2 Příklad s mincí . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.3 Lingvistická aplikace . . . . . . . . . . . . . . . . . . . . . . . . . 33 3.4 Zpět k random efektům . . . . . . . . . . . . . . . . . . . . . . . 34 3.5 Domácí úkol? (spíš ne) . . . . . . . . . . . . . . . . . . . . . . . . 49 3 4 CONTENTS Chapter 1 Dvě témata • Gaussova distribuce • testování hypotéz: t-distribuce, t.test 5 6 CHAPTER 1. DVĚ TÉMATA Chapter 2 Gaussova distribuce 2.1 Nejznámější typ distribuce • klasický příklad x = seq(-4, 4, 0.1) y = dnorm(x) plot(x, y, xlab = "x", ylab = "density", ylim = c(0, 0.8)) −4 −2 0 2 4 0.00.20.40.60.8 x density • některé příklady: 1. průměrná výška, váha, … 7 8 CHAPTER 2. GAUSSOVA DISTRIBUCE library(HistData) data("Galton") hist(Galton$parent) Histogram of Galton$parent Galton$parent Frequency 64 66 68 70 72 050100150200 summary(Galton$parent) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 64.00 67.50 68.50 68.31 69.50 73.00 sd(Galton$parent) ## [1] 1.787333 • a děti: library(HistData) data("Galton") hist(Galton$child) 2.1. NEJZNÁMĚJŠÍ TYP DISTRIBUCE 9 Histogram of Galton$child Galton$child Frequency 62 64 66 68 70 72 74 050100150 summary(Galton$child) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 61.70 66.20 68.20 68.09 70.20 73.70 sd(Galton$child) ## [1] 2.517941 2. chyby v měření (historie) • historie • Gaussovo nármální rozložení • objeveno Gaussem při řešení problému dráhu trpasličí planety Ceres • vycházel z chyb měření (každé měření nutně vnáší chybu) • jejich aproximací přesně vypočítal, kde se Ceres objeví po průletu za sluncem První využití ve vědách o člověku • Adolph Qutelet • autor Body Mass Indexu • ‘homme moyen’: mean • aplikace Gaussovy distribuce na “sociální fyziku” • random effects z experimentu 10 CHAPTER 2. GAUSSOVA DISTRIBUCE 2.2 Rozdíly mezi Gaussovými křivkami • hustota pravděpodobnosti (density function) x = seq(-4, 4, 0.1) y = dnorm(x) plot(x, y, xlab = "x", ylab = "density", ylim = c(0, 0.8)) −4 −2 0 2 4 0.00.20.40.60.8 x density • other mean x = seq(0, 8, 0.1) y = dnorm(x, mean = 4) plot(x, y, xlab = "x", ylab = "density", ylim = c(0, 0.8)) 2.2. ROZDÍLY MEZI GAUSSOVÝMI KŘIVKAMI 11 0 2 4 6 8 0.00.20.40.60.8 x density • jiná standardní odchylka x = seq(-4, 4, 0.1) y = dnorm(x, mean = 0, sd = 2) plot(x, y, xlab = "x", ylab = "density", ylim = c(0, 0.8)) −4 −2 0 2 4 0.00.20.40.60.8 x density • ještě jiná standardní odchylka x = seq(-4, 4, 0.1) y = dnorm(x, mean = 0, sd = 0.5) plot(x, y, xlab = "x", ylab = "density", ylim = c(0, 0.8)) 12 CHAPTER 2. GAUSSOVA DISTRIBUCE −4 −2 0 2 4 0.00.20.40.60.8 x density 2.3 Random effects v experimentech • odpovědi nemají nutně normální distribuci • ani na fillery ani na podmínky • načtení itemů Nicméně random effects mají Gaussovo rozdělení • napřed deskriptivní statistika items <- items[-which(items$participant %in% c(32)),] items <- items %>% mutate(condition=replace(condition, condition=="item-méně_než", "fewer")) %>% mutate(condition=replace(condition, condition=="item-nanejvýš", "at-most")) %>% mutate(condition=replace(condition, condition=="item-ne_víc_než", "no-more")) %>% mutate(condition=replace(condition, condition=="item-trochu_méně", "slightly-less" as.data.frame() ddply(items, .(condition), summarise, Means = mean(rating1, na.rm=TRUE)) ## condition Means ## 1 at-most 1.265464 ## 2 fewer 2.507732 ## 3 no-more 1.314433 ## 4 slightly-less 2.203608 2.3. RANDOM EFFECTS V EXPERIMENTECH 13 ddply(items, .(condition), summarise, Medians = median(rating1,na.rm=TRUE)) ## condition Medians ## 1 at-most 1 ## 2 fewer 2 ## 3 no-more 1 ## 4 slightly-less 2 library(ggplot2) data.to.plot <- items graph_to_plot <- ggplot(data.to.plot, aes(condition,rating1)) graph_to_plot + geom_boxplot() + stat_summary(fun.y=mean, geom="point", size=3) ## Warning: The `fun.y` argument of `stat_summary()` is deprecated as of ggplot2 3.3.0. ## i Please use the `fun` argument instead. ## This warning is displayed once every 8 hours. ## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was ## generated. 1 2 3 4 5 at−most fewer no−more slightly−less condition rating1 items$classAB <- "A" items$classAB[items$condition == "at-most"] <- "B" items$classAB[items$condition == "no-more"] <- "B" 14 CHAPTER 2. GAUSSOVA DISTRIBUCE p <- ggplot(items, aes(condition, rating1, fill = classAB)) + stat_summary(geom = "bar", fun.y = mean, position = "dodge") + stat_summary(geom = "errorbar", fun.data = mean_se, size=.3, width=.2, position=position_dodge(.9)) ## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0. ## i Please use `linewidth` instead. ## This warning is displayed once every 8 hours. ## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was ## generated. p + labs(y = "acceptability") + theme(axis.text=element_text(size=15), axis.title=element_text(size=17,face="bold")) 0 1 2 at−most fewer no−more slightly−less condition acceptability classAB A B # histograms • vybereme jednu, druhou podmínku • pak sloučíme # first by condition at_most <- filter(items, condition=="at-most") head(at_most) ## materials participant item condition position question.order rating1 2.3. RANDOM EFFECTS V EXPERIMENTECH 15 ## 1 Items 1 2 at-most 21 1 1 ## 2 Items 1 6 at-most 32 1 1 ## 3 Items 1 10 at-most 28 1 1 ## 4 Items 1 14 at-most 9 1 1 ## 5 Items 2 1 at-most 18 1 1 ## 6 Items 2 5 at-most 26 1 1 ## ## 1 Kontext: Aleš čte na obalu piv ## 2 Kontext: Aleš si čte pracovní řád firmy, kde je napsáno: ## 3 Kontext: Aleš se dívá na válečný dokument, kde uslyší následující větu: „Ten den mo ## 4 Kontext: Aleš čte článek o jedné exotické ## 5 Kontext: Aleš čte na obalu čokolády následující vět ## 6 Kontext: Aleš si čte studijní řád místní univerzity, kde vidí následující větu: "Diplomová p ## classAB ## 1 B ## 2 B ## 3 B ## 4 B ## 5 B ## 6 B write.csv(at_most, "at_most.csv") ggplot(at_most, aes(x=rating1)) + geom_histogram() ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. 16 CHAPTER 2. GAUSSOVA DISTRIBUCE 0 100 200 300 1 2 3 4 5 rating1 count # two conditions at_most_no_more <- filter(items, condition=="at-most" | condition == "fewer") head(at_most_no_more) ## materials participant item condition position question.order rating1 ## 1 Items 1 1 fewer 17 1 4 ## 2 Items 1 2 at-most 21 1 1 ## 3 Items 1 5 fewer 5 1 5 ## 4 Items 1 6 at-most 32 1 1 ## 5 Items 1 9 fewer 34 1 2 ## 6 Items 1 10 at-most 28 1 1 ## ## 1 Kontext: Aleš čte na obalu čokolády násle ## 2 Kontext: Aleš čte na ## 3 Kontext: Aleš si čte studijní řád místní univerzity, kde vidí následující větu: "D ## 4 Kontext: Aleš si čte pracovní řád firmy, kde j ## 5 Kontext: Aleš si čte zprávu valné hromady firmy EB, kde stojí následu ## 6 Kontext: Aleš se dívá na válečný dokument, kde uslyší následující větu: „ ## classAB ## 1 A ## 2 B ## 3 A ## 4 B ## 5 A 2.3. RANDOM EFFECTS V EXPERIMENTECH 17 ## 6 B ggplot(at_most_no_more, aes(x = rating1)) + geom_histogram(fill = "white", colour = "black") + facet_grid(condition ~ .) ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. at−mostfewer 1 2 3 4 5 0 100 200 300 0 100 200 300 rating1 count • jasně ne-Gaussovské (normální rozložení) # all conditions # https://r-graphics.org/recipe-distribution-multi-hist ggplot(items, aes(x = rating1)) + geom_histogram(fill = "white", colour = "black") + facet_grid(condition ~ .) ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. 18 CHAPTER 2. GAUSSOVA DISTRIBUCE at−mostfewerno−moreslightly−less 1 2 3 4 5 0 100 200 300 0 100 200 300 0 100 200 300 0 100 200 300 rating1 count • testy # tests for normality shapiro.test(filter(at_most_no_more, condition == "at-most")$rating1) ## ## Shapiro-Wilk normality test ## ## data: filter(at_most_no_more, condition == "at-most")$rating1 ## W = 0.39231, p-value < 2.2e-16 # filter(at_most_no_more, condition == "at-most")$rating1 # filter(at_most_no_more, condition == "at-most") # not normal distribution # Wilcox test wilcox.test(filter(at_most_no_more, condition == "at-most")$rating1, + filter(at_most_no_more, condition == "fewer")$rating1) ## ## Wilcoxon rank sum test with continuity correction ## ## data: filter(at_most_no_more, condition == "at-most")$rating1 and +filter(at_most_n ## W = 42624, p-value < 2.2e-16 ## alternative hypothesis: true location shift is not equal to 0 2.3. RANDOM EFFECTS V EXPERIMENTECH 19 # quantile graphs qqnorm(filter(at_most_no_more, condition == "at-most")$rating1) −3 −2 −1 0 1 2 3 12345 Normal Q−Q Plot Theoretical Quantiles SampleQuantiles mean(filter(at_most_no_more, condition == "at-most")$rating1) ## [1] 1.265464 sd(filter(at_most_no_more, condition == "at-most")$rating1) ## [1] 0.7703438 # vs. Gauss distribution qqnorm(rnorm(length(filter(at_most_no_more, condition == "at-most")$rating1), 1.27, 0.77)) 20 CHAPTER 2. GAUSSOVA DISTRIBUCE −3 −2 −1 0 1 2 3 −10123 Normal Q−Q Plot Theoretical Quantiles SampleQuantiles ggplot(items, aes(x = rating1, fill = condition)) + geom_histogram(position = "dodge", alpha = 1, binwidth = 0.5) 0 100 200 300 1 2 3 4 5 rating1 count condition at−most fewer no−more slightly−less summary(items) ## materials participant item condition 2.3. RANDOM EFFECTS V EXPERIMENTECH 21 ## Length:1552 Min. : 1.00 1 : 97 Length:1552 ## Class :character 1st Qu.:25.00 10 : 97 Class :character ## Mode :character Median :50.00 11 : 97 Mode :character ## Mean :49.68 12 : 97 ## 3rd Qu.:74.00 13 : 97 ## Max. :98.00 14 : 97 ## (Other):970 ## position question.order rating1 content ## Min. : 3.00 Min. :1 Min. :1.000 Length:1552 ## 1st Qu.:10.75 1st Qu.:1 1st Qu.:1.000 Class :character ## Median :18.50 Median :1 Median :1.000 Mode :character ## Mean :18.41 Mean :1 Mean :1.823 ## 3rd Qu.:26.25 3rd Qu.:1 3rd Qu.:2.000 ## Max. :34.00 Max. :1 Max. :5.000 ## ## classAB ## Length:1552 ## Class :character ## Mode :character ## ## ## ## # linear model library(lmerTest) ## Loading required package: lme4 ## Loading required package: Matrix ## ## Attaching package: 'lmerTest' ## The following object is masked from 'package:lme4': ## ## lmer ## The following object is masked from 'package:stats': ## ## step items$condition <- as.factor(items$condition) items$condition <- relevel(items$condition, ref="at-most") m1 <- lmer(as.numeric(rating1) ~ condition + (1|participant) + (1|item), data=items) 22 CHAPTER 2. GAUSSOVA DISTRIBUCE summary(m1) ## Linear mixed model fit by REML. t-tests use Satterthwaite's method [ ## lmerModLmerTest] ## Formula: as.numeric(rating1) ~ condition + (1 | participant) + (1 | item) ## Data: items ## ## REML criterion at convergence: 4879.4 ## ## Scaled residuals: ## Min 1Q Median 3Q Max ## -2.1573 -0.6556 -0.1995 0.4451 3.6016 ## ## Random effects: ## Groups Name Variance Std.Dev. ## participant (Intercept) 0.1438 0.3793 ## item (Intercept) 0.1341 0.3662 ## Residual 1.2334 1.1106 ## Number of obs: 1552, groups: participant, 97; item, 16 ## ## Fixed effects: ## Estimate Std. Error df t value Pr(>|t|) ## (Intercept) 1.276e+00 1.142e-01 3.003e+01 11.176 3.2e-12 *** ## conditionfewer 1.231e+00 7.980e-02 1.437e+03 15.428 < 2e-16 *** ## conditionno-more 3.811e-02 7.980e-02 1.437e+03 0.478 0.633 ## conditionslightly-less 9.162e-01 7.983e-02 1.438e+03 11.477 < 2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Correlation of Fixed Effects: ## (Intr) cndtnf cndtnn## conditinfwr -0.349 ## conditnn-mr -0.349 0.500 ## cndtnslght- -0.349 0.500 0.500 library(emmeans) A konečně random effects # random effects m1.ranef <- ranef(m1) m1.ranef ## $participant ## (Intercept) 2.3. RANDOM EFFECTS V EXPERIMENTECH 23 ## 1 -0.047403661 ## 2 -0.047403661 ## 3 0.115362893 ## 4 -0.372936769 ## 5 0.033979616 ## 6 0.033979616 ## 7 0.725737470 ## 8 -0.169478577 ## 9 0.440896000 ## 10 -0.535703323 ## 11 0.115362893 ## 12 -0.535703323 ## 13 -0.413628407 ## 14 0.237437808 ## 15 -0.535703323 ## 16 0.033979616 ## 17 0.440896000 ## 18 -0.088095300 ## 19 0.033979616 ## 20 0.115362893 ## 21 -0.413628407 ## 22 -0.088095300 ## 23 0.156054531 ## 24 0.196746169 ## 25 -0.495011684 ## 26 0.847812385 ## 27 -0.210170215 ## 28 -0.047403661 ## 29 -0.047403661 ## 30 0.156054531 ## 31 -0.210170215 ## 33 -0.535703323 ## 34 -0.495011684 ## 35 0.278129446 ## 36 -0.047403661 ## 37 -0.047403661 ## 38 0.522279277 ## 39 -0.006712023 ## 40 0.033979616 ## 41 0.074671254 ## 42 0.237437808 ## 43 -0.372936769 ## 44 0.318821085 ## 45 -0.291553492 ## 46 0.156054531 ## 47 -0.454320046 24 CHAPTER 2. GAUSSOVA DISTRIBUCE ## 48 -0.210170215 ## 49 0.074671254 ## 50 0.115362893 ## 51 0.278129446 ## 52 -0.250861854 ## 53 0.440896000 ## 54 -0.006712023 ## 55 -0.413628407 ## 56 0.400204362 ## 57 -0.210170215 ## 58 0.074671254 ## 59 0.318821085 ## 60 -0.047403661 ## 61 -0.047403661 ## 62 0.278129446 ## 63 -0.128786938 ## 64 0.359512723 ## 65 0.074671254 ## 66 0.074671254 ## 67 -0.088095300 ## 68 -0.047403661 ## 69 -0.250861854 ## 70 -0.128786938 ## 71 0.522279277 ## 72 -0.169478577 ## 73 0.603662554 ## 74 0.400204362 ## 75 0.156054531 ## 76 -0.047403661 ## 77 0.400204362 ## 78 -0.006712023 ## 79 -0.210170215 ## 80 -0.006712023 ## 81 -0.535703323 ## 82 -0.372936769 ## 83 -0.047403661 ## 84 0.400204362 ## 85 -0.495011684 ## 86 -0.088095300 ## 87 -0.210170215 ## 88 0.278129446 ## 89 0.115362893 ## 90 -0.250861854 ## 91 -0.413628407 ## 92 -0.210170215 ## 93 0.278129446 2.3. RANDOM EFFECTS V EXPERIMENTECH 25 ## 94 -0.088095300 ## 95 0.033979616 ## 96 -0.088095300 ## 97 0.562970916 ## 98 0.033979616 ## ## $item ## (Intercept) ## 1 0.15475432 ## 10 -0.07879328 ## 11 0.19274689 ## 12 -0.56296464 ## 13 -0.51379691 ## 14 0.50501203 ## 15 -0.05207469 ## 16 -0.28989442 ## 2 0.10953101 ## 3 -0.31572869 ## 4 -0.04507284 ## 5 0.69147855 ## 6 0.31668773 ## 7 0.18333068 ## 8 0.04908930 ## 9 -0.34430505 ## ## with conditional variances for "participant" "item" m1.ranef$participant$`(Intercept)` ## [1] -0.047403661 -0.047403661 0.115362893 -0.372936769 0.033979616 ## [6] 0.033979616 0.725737470 -0.169478577 0.440896000 -0.535703323 ## [11] 0.115362893 -0.535703323 -0.413628407 0.237437808 -0.535703323 ## [16] 0.033979616 0.440896000 -0.088095300 0.033979616 0.115362893 ## [21] -0.413628407 -0.088095300 0.156054531 0.196746169 -0.495011684 ## [26] 0.847812385 -0.210170215 -0.047403661 -0.047403661 0.156054531 ## [31] -0.210170215 -0.535703323 -0.495011684 0.278129446 -0.047403661 ## [36] -0.047403661 0.522279277 -0.006712023 0.033979616 0.074671254 ## [41] 0.237437808 -0.372936769 0.318821085 -0.291553492 0.156054531 ## [46] -0.454320046 -0.210170215 0.074671254 0.115362893 0.278129446 ## [51] -0.250861854 0.440896000 -0.006712023 -0.413628407 0.400204362 ## [56] -0.210170215 0.074671254 0.318821085 -0.047403661 -0.047403661 ## [61] 0.278129446 -0.128786938 0.359512723 0.074671254 0.074671254 ## [66] -0.088095300 -0.047403661 -0.250861854 -0.128786938 0.522279277 ## [71] -0.169478577 0.603662554 0.400204362 0.156054531 -0.047403661 ## [76] 0.400204362 -0.006712023 -0.210170215 -0.006712023 -0.535703323 ## [81] -0.372936769 -0.047403661 0.400204362 -0.495011684 -0.088095300 26 CHAPTER 2. GAUSSOVA DISTRIBUCE ## [86] -0.210170215 0.278129446 0.115362893 -0.250861854 -0.413628407 ## [91] -0.210170215 0.278129446 -0.088095300 0.033979616 -0.088095300 ## [96] 0.562970916 0.033979616 summary(m1.ranef$participant$`(Intercept)`) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## -0.535703 -0.210170 -0.006712 0.000000 0.156055 0.847812 hist(m1.ranef$participant$`(Intercept)`) Histogram of m1.ranef$participant$‘(Intercept)‘ m1.ranef$participant$‘(Intercept)‘ Frequency −0.5 0.0 0.5 1.0 0510152025 shapiro.test(m1.ranef$participant$`(Intercept)`) ## ## Shapiro-Wilk normality test ## ## data: m1.ranef$participant$`(Intercept)` ## W = 0.97906, p-value = 0.1239 # normal enough mean(m1.ranef$participant$`(Intercept)`) ## [1] 9.104604e-16 sd(m1.ranef$participant$`(Intercept)`) ## [1] 0.3060158 2.3. RANDOM EFFECTS V EXPERIMENTECH 27 shapiro.test(rnorm(100, mean = 5, sd = 3)) ## ## Shapiro-Wilk normality test ## ## data: rnorm(100, mean = 5, sd = 3) ## W = 0.98962, p-value = 0.6337 A nakonec pro items ranef m1.ranef$item$`(Intercept)` ## [1] 0.15475432 -0.07879328 0.19274689 -0.56296464 -0.51379691 0.50501203 ## [7] -0.05207469 -0.28989442 0.10953101 -0.31572869 -0.04507284 0.69147855 ## [13] 0.31668773 0.18333068 0.04908930 -0.34430505 summary(m1.ranef$item$`(Intercept)`) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## -0.562965 -0.296353 0.002008 0.000000 0.185685 0.691479 hist(m1.ranef$item$`(Intercept)`) Histogram of m1.ranef$item$‘(Intercept)‘ m1.ranef$item$‘(Intercept)‘ Frequency −0.6 −0.4 −0.2 0.0 0.2 0.4 0.6 0.8 012345 shapiro.test(m1.ranef$item$`(Intercept)`) ## ## Shapiro-Wilk normality test ## 28 CHAPTER 2. GAUSSOVA DISTRIBUCE ## data: m1.ranef$item$`(Intercept)` ## W = 0.97494, p-value = 0.9109 # normal enough mean(m1.ranef$item$`(Intercept)`) ## [1] 5.646915e-14 sd(m1.ranef$item$`(Intercept)`) ## [1] 0.3499262 shapiro.test(rnorm(100, mean = 5, sd = 3)) ## ## Shapiro-Wilk normality test ## ## data: rnorm(100, mean = 5, sd = 3) ## W = 0.991, p-value = 0.7448 #t.test(m1.ranef$item$`(Intercept)`,m1.ranef$participant$`(Intercept)`) Chapter 3 Testování hypotéz 3.1 Historie • t.test • t-distribuce: typ normální distribuce s relativně malým počtem trials ve vzorku, známým mean a neznámou (v populaci) SD • autor: William Sealy Gosset, pseudonym Student • historie (Guinness brewery) • testování hypotéz: 1. Nulová hypotéza (data jsou čistě náhodná) 2. Experiment a testování toho, jak nepravděpodobný by byl výsledek měření, pokud by byla pravdivá nulová hypotéza 3. Zamítnutí nulové hypotézy, pokud je 𝑝 pod úrovní “threshold” – standardně 0.05 3.2 Příklad s mincí log_coin <- c(TRUE, FALSE) flips <- sample(log_coin, size = 100, replace = TRUE, prob = c(0.2, 0.8)) str(flips) ## logi [1:100] TRUE TRUE FALSE FALSE FALSE FALSE ... summary(flips) ## Mode FALSE TRUE 29 30 CHAPTER 3. TESTOVÁNÍ HYPOTÉZ ## logical 82 18 t.test(flips, mu=0.5) ## ## One Sample t-test ## ## data: flips ## t = -8.2875, df = 99, p-value = 5.789e-13 ## alternative hypothesis: true mean is not equal to 0.5 ## 95 percent confidence interval: ## 0.1033848 0.2566152 ## sample estimates: ## mean of x ## 0.18 hist(as.numeric(flips)) Histogram of as.numeric(flips) as.numeric(flips) Frequency 0.0 0.2 0.4 0.6 0.8 1.0 020406080 • jasně neférová mince • menší rozdíl: 60/40 log_coin <- c(TRUE, FALSE) flips <- sample(log_coin, size = 100, replace = TRUE, prob = c(0.4, 0.6)) str(flips) ## logi [1:100] FALSE FALSE FALSE TRUE TRUE TRUE ... 3.2. PŘÍKLAD S MINCÍ 31 summary(flips) ## Mode FALSE TRUE ## logical 54 46 t.test(flips, mu=0.5) ## ## One Sample t-test ## ## data: flips ## t = -0.79855, df = 99, p-value = 0.4265 ## alternative hypothesis: true mean is not equal to 0.5 ## 95 percent confidence interval: ## 0.3606089 0.5593911 ## sample estimates: ## mean of x ## 0.46 hist(as.numeric(flips)) Histogram of as.numeric(flips) as.numeric(flips) Frequency 0.0 0.2 0.4 0.6 0.8 1.0 01020304050 - někdy ano, někdy, v závislosti na náhodném generování - 53/47: log_coin <- c(TRUE, FALSE) flips <- sample(log_coin, size = 100, replace = TRUE, prob = c(0.47, 0.53)) str(flips) 32 CHAPTER 3. TESTOVÁNÍ HYPOTÉZ ## logi [1:100] FALSE TRUE TRUE FALSE FALSE TRUE ... summary(flips) ## Mode FALSE TRUE ## logical 56 44 t.test(flips, mu=0.5) ## ## One Sample t-test ## ## data: flips ## t = -1.2027, df = 99, p-value = 0.232 ## alternative hypothesis: true mean is not equal to 0.5 ## 95 percent confidence interval: ## 0.3410099 0.5389901 ## sample estimates: ## mean of x ## 0.44 hist(as.numeric(flips)) Histogram of as.numeric(flips) as.numeric(flips) Frequency 0.0 0.2 0.4 0.6 0.8 1.0 01020304050 • ne • nulová hypotéza: mince je férová • až v posledním případě byla nulová hypotéza potvrzena • v přechozích byla s 𝑝 … odmítnuta – tj. 𝑝 říká s jakou pravděpodobností by platila nulová hypotéza 3.3. LINGVISTICKÁ APLIKACE 33 3.3 Lingvistická aplikace • Baayen (2008) : chap. 4.2 • začíná už v 4.1: testy na normální rozložení • quantile-quantile grafy • Shapiro-Wilk test (s. 73) • 4.1.2: rozdíl mezi populací a mean • s. 75 library(languageR) t.test(durationsOnt$DurationPrefixNasal, mu = 0.053) ## ## One Sample t-test ## ## data: durationsOnt$DurationPrefixNasal ## t = -1.5038, df = 101, p-value = 0.1358 ## alternative hypothesis: true mean is not equal to 0.053 ## 95 percent confidence interval: ## 0.04561370 0.05401646 ## sample estimates: ## mean of x ## 0.04981508 data("durationsOnt") head(durationsOnt$DurationPrefixNasal) ## [1] 0.043905 0.065099 0.092454 0.064830 0.097262 0.085662 mean(durationsOnt$DurationPrefixNasal) ## [1] 0.04981508 library(languageR) simplex = ratings[ratings$Complex == "simplex", ] freqAnimals = simplex[simplex$Class == "animal", ]$Frequency freqPlants = simplex[simplex$Class == "plant", ]$Frequency t.test(freqAnimals, freqPlants) ## ## Welch Two Sample t-test ## 34 CHAPTER 3. TESTOVÁNÍ HYPOTÉZ ## data: freqAnimals and freqPlants ## t = 2.674, df = 57.545, p-value = 0.009739 ## alternative hypothesis: true difference in means is not equal to 0 ## 95 percent confidence interval: ## 0.193183 1.344315 ## sample estimates: ## mean of x mean of y ## 5.208494 4.439745 • data from chap. 2.2 3.4 Zpět k random efektům • napřed znovu načteme data A postupně se propracujeme k random-effectům (normální rozdělení) items <- items[-which(items$participant %in% c(32)),] items <- items %>% mutate(condition=replace(condition, condition=="item-méně_než", "fewer")) %>% mutate(condition=replace(condition, condition=="item-nanejvýš", "at-most")) %>% mutate(condition=replace(condition, condition=="item-ne_víc_než", "no-more")) %>% mutate(condition=replace(condition, condition=="item-trochu_méně", "slightly-less" as.data.frame() ddply(items, .(condition), summarise, Means = mean(rating1, na.rm=TRUE)) ## condition Means ## 1 at-most 1.265464 ## 2 fewer 2.507732 ## 3 no-more 1.314433 ## 4 slightly-less 2.203608 ddply(items, .(condition), summarise, Medians = median(rating1,na.rm=TRUE)) ## condition Medians ## 1 at-most 1 ## 2 fewer 2 ## 3 no-more 1 ## 4 slightly-less 2 library(ggplot2) data.to.plot <- items graph_to_plot <- ggplot(data.to.plot, aes(condition,rating1)) graph_to_plot + geom_boxplot() + stat_summary(fun.y=mean, geom="point", size=3) 3.4. ZPĚT K RANDOM EFEKTŮM 35 1 2 3 4 5 at−most fewer no−more slightly−less condition rating1 items$classAB <- "A" items$classAB[items$condition == "at-most"] <- "B" items$classAB[items$condition == "no-more"] <- "B" p <- ggplot(items, aes(condition, rating1, fill = classAB)) + stat_summary(geom = "bar", fun.y = mean, position = "dodge") + stat_summary(geom = "errorbar", fun.data = mean_se, size=.3, width=.2, position=position_dodge(.9)) p + labs(y = "acceptability") + theme(axis.text=element_text(size=15), axis.title=element_text(size=17,face="bold")) 36 CHAPTER 3. TESTOVÁNÍ HYPOTÉZ 0 1 2 at−most fewer no−more slightly−less condition acceptability classAB A B # histograms # first by condition at_most <- filter(items, condition=="at-most") head(at_most) ## materials participant item condition position question.order rating1 ## 1 Items 1 2 at-most 21 1 1 ## 2 Items 1 6 at-most 32 1 1 ## 3 Items 1 10 at-most 28 1 1 ## 4 Items 1 14 at-most 9 1 1 ## 5 Items 2 1 at-most 18 1 1 ## 6 Items 2 5 at-most 26 1 1 ## ## 1 Kontext: Aleš čte na ## 2 Kontext: Aleš si čte pracovní řád firmy, kde j ## 3 Kontext: Aleš se dívá na válečný dokument, kde uslyší následující větu: „ ## 4 Kontext: Aleš čte článek o jedné ## 5 Kontext: Aleš čte na obalu čokolády násle ## 6 Kontext: Aleš si čte studijní řád místní univerzity, kde vidí následující větu: "D ## classAB ## 1 B ## 2 B ## 3 B 3.4. ZPĚT K RANDOM EFEKTŮM 37 ## 4 B ## 5 B ## 6 B write.csv(at_most, "at_most.csv") ggplot(at_most, aes(x=rating1)) + geom_histogram() ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. 0 100 200 300 1 2 3 4 5 rating1 count # two conditions at_most_no_more <- filter(items, condition=="at-most" | condition == "fewer") head(at_most_no_more) ## materials participant item condition position question.order rating1 ## 1 Items 1 1 fewer 17 1 4 ## 2 Items 1 2 at-most 21 1 1 ## 3 Items 1 5 fewer 5 1 5 ## 4 Items 1 6 at-most 32 1 1 ## 5 Items 1 9 fewer 34 1 2 ## 6 Items 1 10 at-most 28 1 1 ## ## 1 Kontext: Aleš čte na obalu čokolády následující vět ## 2 Kontext: Aleš čte na obalu piv ## 3 Kontext: Aleš si čte studijní řád místní univerzity, kde vidí následující větu: "Diplomová p 38 CHAPTER 3. TESTOVÁNÍ HYPOTÉZ ## 4 Kontext: Aleš si čte pracovní řád firmy, kde j ## 5 Kontext: Aleš si čte zprávu valné hromady firmy EB, kde stojí následu ## 6 Kontext: Aleš se dívá na válečný dokument, kde uslyší následující větu: „ ## classAB ## 1 A ## 2 B ## 3 A ## 4 B ## 5 A ## 6 B ggplot(at_most_no_more, aes(x = rating1)) + geom_histogram(fill = "white", colour = "black") + facet_grid(condition ~ .) ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. at−mostfewer 1 2 3 4 5 0 100 200 300 0 100 200 300 rating1 count # all conditions # https://r-graphics.org/recipe-distribution-multi-hist ggplot(items, aes(x = rating1)) + geom_histogram(fill = "white", colour = "black") + facet_grid(condition ~ .) ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. 3.4. ZPĚT K RANDOM EFEKTŮM 39 at−mostfewerno−moreslightly−less 1 2 3 4 5 0 100 200 300 0 100 200 300 0 100 200 300 0 100 200 300 rating1 count # tests for normality shapiro.test(filter(at_most_no_more, condition == "at-most")$rating1) ## ## Shapiro-Wilk normality test ## ## data: filter(at_most_no_more, condition == "at-most")$rating1 ## W = 0.39231, p-value < 2.2e-16 # filter(at_most_no_more, condition == "at-most")$rating1 # filter(at_most_no_more, condition == "at-most") # not normal distribution # Wilcox test wilcox.test(filter(at_most_no_more, condition == "at-most")$rating1, + filter(at_most_no_more, condition == "fewer")$rating1) ## ## Wilcoxon rank sum test with continuity correction ## ## data: filter(at_most_no_more, condition == "at-most")$rating1 and +filter(at_most_no_more, co ## W = 42624, p-value < 2.2e-16 ## alternative hypothesis: true location shift is not equal to 0 40 CHAPTER 3. TESTOVÁNÍ HYPOTÉZ # quantile graphs qqnorm(filter(at_most_no_more, condition == "at-most")$rating1) −3 −2 −1 0 1 2 3 12345 Normal Q−Q Plot Theoretical Quantiles SampleQuantiles mean(filter(at_most_no_more, condition == "at-most")$rating1) ## [1] 1.265464 sd(filter(at_most_no_more, condition == "at-most")$rating1) ## [1] 0.7703438 # vs. Gauss distribution qqnorm(rnorm(length(filter(at_most_no_more, condition == "at-most")$rating1), 1.27, 0.7 3.4. ZPĚT K RANDOM EFEKTŮM 41 −3 −2 −1 0 1 2 3 −10123 Normal Q−Q Plot Theoretical Quantiles SampleQuantiles ggplot(items, aes(x = rating1, fill = condition)) + geom_histogram(position = "dodge", alpha = 1, binwidth = 0.5) 0 100 200 300 1 2 3 4 5 rating1 count condition at−most fewer no−more slightly−less summary(items) ## materials participant item condition 42 CHAPTER 3. TESTOVÁNÍ HYPOTÉZ ## Length:1552 Min. : 1.00 1 : 97 Length:1552 ## Class :character 1st Qu.:25.00 10 : 97 Class :character ## Mode :character Median :50.00 11 : 97 Mode :character ## Mean :49.68 12 : 97 ## 3rd Qu.:74.00 13 : 97 ## Max. :98.00 14 : 97 ## (Other):970 ## position question.order rating1 content ## Min. : 3.00 Min. :1 Min. :1.000 Length:1552 ## 1st Qu.:10.75 1st Qu.:1 1st Qu.:1.000 Class :character ## Median :18.50 Median :1 Median :1.000 Mode :character ## Mean :18.41 Mean :1 Mean :1.823 ## 3rd Qu.:26.25 3rd Qu.:1 3rd Qu.:2.000 ## Max. :34.00 Max. :1 Max. :5.000 ## ## classAB ## Length:1552 ## Class :character ## Mode :character ## ## ## ## # linear model library(lmerTest) items$condition <- as.factor(items$condition) items$condition <- relevel(items$condition, ref="at-most") m1 <- lmer(as.numeric(rating1) ~ condition + (1|participant) + (1|item), data=items) summary(m1) ## Linear mixed model fit by REML. t-tests use Satterthwaite's method [ ## lmerModLmerTest] ## Formula: as.numeric(rating1) ~ condition + (1 | participant) + (1 | item) ## Data: items ## ## REML criterion at convergence: 4879.4 ## ## Scaled residuals: ## Min 1Q Median 3Q Max ## -2.1573 -0.6556 -0.1995 0.4451 3.6016 3.4. ZPĚT K RANDOM EFEKTŮM 43 ## ## Random effects: ## Groups Name Variance Std.Dev. ## participant (Intercept) 0.1438 0.3793 ## item (Intercept) 0.1341 0.3662 ## Residual 1.2334 1.1106 ## Number of obs: 1552, groups: participant, 97; item, 16 ## ## Fixed effects: ## Estimate Std. Error df t value Pr(>|t|) ## (Intercept) 1.276e+00 1.142e-01 3.003e+01 11.176 3.2e-12 *** ## conditionfewer 1.231e+00 7.980e-02 1.437e+03 15.428 < 2e-16 *** ## conditionno-more 3.811e-02 7.980e-02 1.437e+03 0.478 0.633 ## conditionslightly-less 9.162e-01 7.983e-02 1.438e+03 11.477 < 2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Correlation of Fixed Effects: ## (Intr) cndtnf cndtnn## conditinfwr -0.349 ## conditnn-mr -0.349 0.500 ## cndtnslght- -0.349 0.500 0.500 library(emmeans) A konečně random effects # random effects m1.ranef <- ranef(m1) m1.ranef ## $participant ## (Intercept) ## 1 -0.047403661 ## 2 -0.047403661 ## 3 0.115362893 ## 4 -0.372936769 ## 5 0.033979616 ## 6 0.033979616 ## 7 0.725737470 ## 8 -0.169478577 ## 9 0.440896000 ## 10 -0.535703323 ## 11 0.115362893 ## 12 -0.535703323 44 CHAPTER 3. TESTOVÁNÍ HYPOTÉZ ## 13 -0.413628407 ## 14 0.237437808 ## 15 -0.535703323 ## 16 0.033979616 ## 17 0.440896000 ## 18 -0.088095300 ## 19 0.033979616 ## 20 0.115362893 ## 21 -0.413628407 ## 22 -0.088095300 ## 23 0.156054531 ## 24 0.196746169 ## 25 -0.495011684 ## 26 0.847812385 ## 27 -0.210170215 ## 28 -0.047403661 ## 29 -0.047403661 ## 30 0.156054531 ## 31 -0.210170215 ## 33 -0.535703323 ## 34 -0.495011684 ## 35 0.278129446 ## 36 -0.047403661 ## 37 -0.047403661 ## 38 0.522279277 ## 39 -0.006712023 ## 40 0.033979616 ## 41 0.074671254 ## 42 0.237437808 ## 43 -0.372936769 ## 44 0.318821085 ## 45 -0.291553492 ## 46 0.156054531 ## 47 -0.454320046 ## 48 -0.210170215 ## 49 0.074671254 ## 50 0.115362893 ## 51 0.278129446 ## 52 -0.250861854 ## 53 0.440896000 ## 54 -0.006712023 ## 55 -0.413628407 ## 56 0.400204362 ## 57 -0.210170215 ## 58 0.074671254 ## 59 0.318821085 3.4. ZPĚT K RANDOM EFEKTŮM 45 ## 60 -0.047403661 ## 61 -0.047403661 ## 62 0.278129446 ## 63 -0.128786938 ## 64 0.359512723 ## 65 0.074671254 ## 66 0.074671254 ## 67 -0.088095300 ## 68 -0.047403661 ## 69 -0.250861854 ## 70 -0.128786938 ## 71 0.522279277 ## 72 -0.169478577 ## 73 0.603662554 ## 74 0.400204362 ## 75 0.156054531 ## 76 -0.047403661 ## 77 0.400204362 ## 78 -0.006712023 ## 79 -0.210170215 ## 80 -0.006712023 ## 81 -0.535703323 ## 82 -0.372936769 ## 83 -0.047403661 ## 84 0.400204362 ## 85 -0.495011684 ## 86 -0.088095300 ## 87 -0.210170215 ## 88 0.278129446 ## 89 0.115362893 ## 90 -0.250861854 ## 91 -0.413628407 ## 92 -0.210170215 ## 93 0.278129446 ## 94 -0.088095300 ## 95 0.033979616 ## 96 -0.088095300 ## 97 0.562970916 ## 98 0.033979616 ## ## $item ## (Intercept) ## 1 0.15475432 ## 10 -0.07879328 ## 11 0.19274689 ## 12 -0.56296464 46 CHAPTER 3. TESTOVÁNÍ HYPOTÉZ ## 13 -0.51379691 ## 14 0.50501203 ## 15 -0.05207469 ## 16 -0.28989442 ## 2 0.10953101 ## 3 -0.31572869 ## 4 -0.04507284 ## 5 0.69147855 ## 6 0.31668773 ## 7 0.18333068 ## 8 0.04908930 ## 9 -0.34430505 ## ## with conditional variances for "participant" "item" m1.ranef$participant$`(Intercept)` ## [1] -0.047403661 -0.047403661 0.115362893 -0.372936769 0.033979616 ## [6] 0.033979616 0.725737470 -0.169478577 0.440896000 -0.535703323 ## [11] 0.115362893 -0.535703323 -0.413628407 0.237437808 -0.535703323 ## [16] 0.033979616 0.440896000 -0.088095300 0.033979616 0.115362893 ## [21] -0.413628407 -0.088095300 0.156054531 0.196746169 -0.495011684 ## [26] 0.847812385 -0.210170215 -0.047403661 -0.047403661 0.156054531 ## [31] -0.210170215 -0.535703323 -0.495011684 0.278129446 -0.047403661 ## [36] -0.047403661 0.522279277 -0.006712023 0.033979616 0.074671254 ## [41] 0.237437808 -0.372936769 0.318821085 -0.291553492 0.156054531 ## [46] -0.454320046 -0.210170215 0.074671254 0.115362893 0.278129446 ## [51] -0.250861854 0.440896000 -0.006712023 -0.413628407 0.400204362 ## [56] -0.210170215 0.074671254 0.318821085 -0.047403661 -0.047403661 ## [61] 0.278129446 -0.128786938 0.359512723 0.074671254 0.074671254 ## [66] -0.088095300 -0.047403661 -0.250861854 -0.128786938 0.522279277 ## [71] -0.169478577 0.603662554 0.400204362 0.156054531 -0.047403661 ## [76] 0.400204362 -0.006712023 -0.210170215 -0.006712023 -0.535703323 ## [81] -0.372936769 -0.047403661 0.400204362 -0.495011684 -0.088095300 ## [86] -0.210170215 0.278129446 0.115362893 -0.250861854 -0.413628407 ## [91] -0.210170215 0.278129446 -0.088095300 0.033979616 -0.088095300 ## [96] 0.562970916 0.033979616 summary(m1.ranef$participant$`(Intercept)`) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## -0.535703 -0.210170 -0.006712 0.000000 0.156055 0.847812 hist(m1.ranef$participant$`(Intercept)`) 3.4. ZPĚT K RANDOM EFEKTŮM 47 Histogram of m1.ranef$participant$‘(Intercept)‘ m1.ranef$participant$‘(Intercept)‘ Frequency −0.5 0.0 0.5 1.0 0510152025 shapiro.test(m1.ranef$participant$`(Intercept)`) ## ## Shapiro-Wilk normality test ## ## data: m1.ranef$participant$`(Intercept)` ## W = 0.97906, p-value = 0.1239 # normal enough mean(m1.ranef$participant$`(Intercept)`) ## [1] 9.104604e-16 sd(m1.ranef$participant$`(Intercept)`) ## [1] 0.3060158 shapiro.test(rnorm(100, mean = 5, sd = 3)) ## ## Shapiro-Wilk normality test ## ## data: rnorm(100, mean = 5, sd = 3) ## W = 0.98222, p-value = 0.1971 A nakonec pro items ranef 48 CHAPTER 3. TESTOVÁNÍ HYPOTÉZ m1.ranef$item$`(Intercept)` ## [1] 0.15475432 -0.07879328 0.19274689 -0.56296464 -0.51379691 0.50501203 ## [7] -0.05207469 -0.28989442 0.10953101 -0.31572869 -0.04507284 0.69147855 ## [13] 0.31668773 0.18333068 0.04908930 -0.34430505 summary(m1.ranef$item$`(Intercept)`) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## -0.562965 -0.296353 0.002008 0.000000 0.185685 0.691479 hist(m1.ranef$item$`(Intercept)`) Histogram of m1.ranef$item$‘(Intercept)‘ m1.ranef$item$‘(Intercept)‘ Frequency −0.6 −0.4 −0.2 0.0 0.2 0.4 0.6 0.8 012345 shapiro.test(m1.ranef$item$`(Intercept)`) ## ## Shapiro-Wilk normality test ## ## data: m1.ranef$item$`(Intercept)` ## W = 0.97494, p-value = 0.9109 # normal enough mean(m1.ranef$item$`(Intercept)`) ## [1] 5.646915e-14 3.5. DOMÁCÍ ÚKOL? (SPÍŠ NE) 49 sd(m1.ranef$item$`(Intercept)`) ## [1] 0.3499262 shapiro.test(rnorm(100, mean = 5, sd = 3)) ## ## Shapiro-Wilk normality test ## ## data: rnorm(100, mean = 5, sd = 3) ## W = 0.98584, p-value = 0.3643 t.test(m1.ranef$item$`(Intercept)`,m1.ranef$participant$`(Intercept)`) ## ## Welch Two Sample t-test ## ## data: m1.ranef$item$`(Intercept)` and m1.ranef$participant$`(Intercept)` ## t = 5.9846e-13, df = 18.976, p-value = 1 ## alternative hypothesis: true difference in means is not equal to 0 ## 95 percent confidence interval: ## -0.1943237 0.1943237 ## sample estimates: ## mean of x mean of y ## 5.646915e-14 9.104604e-16 3.5 Domácí úkol? (spíš ne) • t.test mezi dvěma podmínkami at_most <- filter(items, condition=="at-most") no_more <- filter(items, condition == "fewer") t.test(at_most$rating1,no_more$rating1) ## ## Welch Two Sample t-test ## ## data: at_most$rating1 and no_more$rating1 ## t = -13.745, df = 556.31, p-value < 2.2e-16 ## alternative hypothesis: true difference in means is not equal to 0 ## 95 percent confidence interval: ## -1.419801 -1.064735 ## sample estimates: ## mean of x mean of y ## 1.265464 2.507732 50 CHAPTER 3. TESTOVÁNÍ HYPOTÉZ • i když vyjde, tak to není přísně vzato statistický důkaz, protože odpovědi subjektů nejsou (s největší pravděpodobností) v normální distribuci • ukázat, zda mají podmínky normální distribuci: – graf – shapiro.test – případně použít wilcox.test a porovnat s t.testem – funguje i u dat, která nemají normální distribuci • cesta k lineárním modelům Bibliography Baayen, R. H. (2008). Analyzing linguistic data: A practical introduction to statistics using R. Cambridge University Press, Cambridge. 51