Czech swarms: experimental evidence
15-11-2018, UCL London
Mojmír Dočekal

Intro

Basic pattern
• swarm constructions (examples from Hoeksema (2009)):
(1) a. Termites are swarming in my kitchen. [A-Subject]
    b. My kitchen is swarming with termites. [L-Subject]

Outline
1) Swarm constructions
   a) basic properties
   b) two theories
   c) alleged PPI behaviour
2) Experiment on Czech swarms
3) Results: extreme degree construction
Slides: https://bit.ly/2QIHag2

Data

German swarms (Hoeksema 2009, ex. (6))
(2) a. Ameisen wimmeln in der Küche.
       ants swarm in the kitchen
       ‘Ants are swarming in the kitchen.’
    b. Die Küche wimmelt von Ameisen.
       the kitchen swarms with ants
       ‘The kitchen is swarming with ants.’
    c. Es wimmelt von Ameisen in der Küche.
       it swarms with ants in the kitchen
       ‘The kitchen is swarming with ants.’

Czech swarms:
(3) a. Na té louce bzučely včely.
       on the meadow swarmed.3PL bees.PL.NOM
       ‘The bees swarmed on the meadow.’
    b. Ta louka bzučela včelami.
       the meadow swarmed.3SG bees.PL.INSTR
       ‘The meadow was swarming with bees.’
    c. Na té louce to bzučelo včelami.
       on the meadow it swarmed.3SG bees.PL.INSTR
       ‘The meadow was swarming with bees.’

Dowty (2000)
5 classes of swarms:
1) small local movements (repeated): crawl, drip, bubble, dance, foam, rumble, pulsate, ...
2) animal (and other) sounds (repetitive): hum, buzz, whistle, resonate, echo, ...
3) kinds of light emission: beam, blaze, flame, glow, glitter, ...
4) smells and tastes: smell, taste, reek, ...
5) degree of occupancy/abundance: brim, teem, be rampant, ...

Two analyses
1) Dowty’s dynamic texture hypothesis
• locations described by predicates: small, frequently repeated events
• transfer of events to locations
(4) My kitchen is swarming with termites.
• many events → many subregions (+ many agents)
• texture perception
• many agents: cannot be counted
• analogous to the transfer of telicity in accomplishments
• nicely explains the constraint on objects: indefinites (mostly bare plurals and mass nouns: swarming with insect)
• Dowty (2000, 123):
(5) a. The room swarmed with mosquitoes.
    b. The room swarmed with a hundred mosquitoes.
    c. ??The room swarmed with seventy-three mosquitoes.
    d. My philodendron is crawling with dozens of snails.
    e. ??My philodendron is crawling with fifty-seven snails.

• A-construction is basic/default: “purely compositional”, “semantically unmarked”
• L-construction: semantically potent/marked
• Dowty: similar to middle alternations, conative alternation
• the basic construction (A-construction for swarms) is not distinctive/marked
• analogous to middle alternations, etc.:
(6) a. Janet broke the vase.
    b. Crystal vases break easily.

2) Hoeksema’s analysis
• impressive empirical evidence (Dutch, English, corpora, ...)
• against Dowty’s texture hypothesis: not all subjects are strictly locative
• but swarms always express a high degree:
(7) a. Q: Was John angry?
    b. A: He was foaming with fury.
(8) a. Q: Was the crowd loud?
    b. A: The walls were vibrating with their cheers.

• Hoeksema: swarms = causative degree constructions
• “the object of with causes the subject to exhibit a high degree of some property by completely affecting it”
• evidence: compatibility with high-degree adverbs, incompatibility with low-degree adverbs:
(9) a. The book is literally littered with typos.
    b. The yard was absolutely lousy with vermin.
    c. ??The book is somewhat littered with typos.
    d. ??The yard was a bit lousy with vermin.

Note
• similar ideas for extreme adjectives: Morzycki (2012)
• adjectives such as fantastic, fabulous, awesome, ... allow only modifiers of extreme degree:
(10) a. It was an absolutely/??somewhat fantastic concert.
     b. It was an utterly/??a bit awesome dinner.

• not easy to find decisive evidence (some experimental evidence below)
• Hoeksema claims for (11):
(11) The book is littered with typos.
• high degree doesn’t predict some typo to be on every page
• Dowty: completely affected (every page) but doesn’t predict high degree
• both approaches: only L-constructions have real swarm properties (restrictions on type of objects, high degree, markedness, ...)

Relation to PPIs
• Hoeksema (2018) lists swarms (among extreme adjectives, some idioms, etc.) as high degree predicates: special sub-type of PPIs
• Hoeksema (2018): some corpus evidence (Dutch, English) for the L-construction
• Hoeksema (2009): swarms avoid negation and DE contexts (similar observations for extreme degree adjectives: Morzycki (2012))
• shares “high degree” property with Krifka’s TONS of money

The question behind the experiment:
(12) To what extent are swarm constructions PPIs?

Corpus evidence from Czech
• Czech national corpus (120 748 715 positions)
• 2 of 3 prototypical swarm predicates show signs of PPI-hood

bzučet ‘buzz’ vs. zpívat ‘sing’
# [lemma="bzučet"&tag="V.........A.*"] ... 356
# [lemma="bzučet"&tag="V.........N.*"] ... 7
# [lemma="zpívat"&tag="V.........A.*"] ... 7741
# [lemma="zpívat"&tag="V.........N.*"] ... 253
challenge.df = matrix(c(356,7741,7,253), nrow = 2)
colnames(challenge.df) = c("pos","neg")
rownames(challenge.df) = c("bzučet","zpívat")
fisher.test(challenge.df)
##
##  Fisher's Exact Test for Count Data
##
## data:  challenge.df
## p-value = 0.2167
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.7866306 4.2080621
## sample estimates:
## odds ratio
##    1.66214

hemžit se ‘swarm’ vs. pohybovat se ‘move’
# [lemma="hemžit"&tag="V.........A.*"] ... 823
# [lemma="hemžit"&tag="V.........N.*"] ... 5
# [lemma="pohybovat"&tag="V.........A.*"] ... 12795
# [lemma="pohybovat"&tag="V.........N.*"] ... 227
challenge.df = matrix(c(823,12795,5,227), nrow = 2)
colnames(challenge.df) = c("pos","neg")
rownames(challenge.df) = c("hemžit","pohybovat")
fisher.test(challenge.df)
##
##  Fisher's Exact Test for Count Data
##
## data:  challenge.df
## p-value = 0.01096
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  1.227920 9.104574
## sample estimates:
## odds ratio
##   2.920073

třást se ‘tremble’ vs. hýbat ‘move’
# [lemma="třást"&tag="V.........A.*"] ... 3368
# [lemma="třást"&tag="V.........N.*"] ... 136
# [lemma="hýbat"&tag="V.........A.*"] ... 1786
# [lemma="hýbat"&tag="V.........N.*"] ... 918
challenge.df = matrix(c(3368,1786,136,918), nrow = 2)
colnames(challenge.df) = c("pos","neg")
rownames(challenge.df) = c("třást","hýbat")
fisher.test(challenge.df)
##
##  Fisher's Exact Test for Count Data
##
## data:  challenge.df
## p-value < 2.2e-16
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  10.51357 15.48580
## sample estimates:
## odds ratio
##   12.72324

• some properties resemble Krifka’s TONS of money
• but other properties are different:
1) swarms are lexically derived: many gaps (no Czech swarm construction for whistle, flame, taste, ...)
2) i/ani are totally grammaticalized (pure formal constraints, no lexical gaps)
3) Czech swarms are probably not emphatic (no or weaker positive (=¬¬) bias in questions vs. minimizers):
(13) You neg-have ani one drop? [minimizers]
     a. Yes, I had one beer.
     b. ??No, I didn’t have even radler.
(14) Neg-swarms.3SG it policemen.INST.PL? [swarms]
     a. ?No, neg-BE n-word policeman.
     b. Yes, be.3PL there many policemen.
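
As a hedged cross-check of the three corpus comparisons above (bzučet, hemžit se, třást se), the sketch below re-runs fisher.test on all three count tables in one pass, labelling each table with its own verb pair. The names make_table, counts, results and the column sample_or are illustrative and do not come from the slides; the counts are copied from the corpus queries above. Note that fisher.test reports the conditional maximum-likelihood estimate of the odds ratio, which can differ slightly from the plain cross-product ratio, so the sketch prints both.

```r
# Corpus counts from the slides: affirmative ("pos") vs. negated ("neg") forms.
# Each matrix is filled column-wise, exactly as in the original calls.
# make_table() is a hypothetical helper, not part of the original analysis.
make_table <- function(swarm, control, counts) {
  matrix(counts, nrow = 2,
         dimnames = list(c(swarm, control), c("pos", "neg")))
}

counts <- list(
  make_table("bzučet", "zpívat",    c(356, 7741, 7, 253)),
  make_table("hemžit", "pohybovat", c(823, 12795, 5, 227)),
  make_table("třást",  "hýbat",     c(3368, 1786, 136, 918))
)

# Two-sided Fisher exact test per table; collect the key numbers in one data frame.
results <- do.call(rbind, lapply(counts, function(m) {
  ft <- fisher.test(m)
  data.frame(
    swarm_verb   = rownames(m)[1],
    control_verb = rownames(m)[2],
    sample_or    = (m[1, 1] * m[2, 2]) / (m[1, 2] * m[2, 1]),  # cross-product ratio
    cond_mle_or  = unname(ft$estimate),                         # estimate printed by fisher.test
    p_value      = ft$p.value
  )
}))

print(results, digits = 4)
```

Going by the p-values reported on the slides, only the bzučet/zpívat pair should fail to reach significance at the 0.05 level, which is what the “2 of 3” summary refers to.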

Experiment

The experiment on swarms
• joint work with Iveta Šafratová
• acceptability task: 5-point Likert scale: 1-worst, 5-best
• 4x2 conditions design
• 32 items (+32 fillers)
• Latin-square design, IBEX farm
• 50 subjects, all passed fillers

Czech data (example item from the experiment)

Reference level condition
(15) Ta louka bzučela včelami. [L-construction]
     the meadow swarmed.3SG bees.PL.INSTR
     ‘The meadow was swarming with bees.’
(16) Na té louce bzučely včely. [A-construction]
     on the meadow swarmed.3PL bees.PL.NOM
     ‘The bees swarmed on the meadow.’

Degree condition
(17) Ta louka trochu bzučela včelami.
     the meadow slightly swarmed.3SG bees.PL.INSTR
(18) Na té louce trochu bzučely včely.
     on the meadow slightly swarmed.3PL bees.PL.NOM

Negation condition
(19) Ta louka nebzučela včelami.
     the meadow neg-swarmed.3SG bees.PL.INSTR
(20) Na té louce nebzučely včely.
     on the meadow neg-swarmed.3PL bees.PL.NOM

Rescuing condition
(21) Jestli to dnes na louce nebzučí včelami, tak zítra . . . bude.
     if it today on meadow neg-swarm.3SG bees.PL.INSTR, then
(22) Jestli dnes na louce nebzučí včely, tak zítra . . . budou.
     if today on meadow neg-swarm.3PL bees.PL.NOM, then

> ddply(data_part_1, .(Condition), summarise, Means = mean(Answer, na.rm=TRUE))
  Condition    Means
1  Deg-Inst 2.927083
2   Deg-Nom 3.541667
3  Neg-Inst 3.739583
4   Neg-Nom 4.239583
5  Ref-Inst 3.744792
6   Ref-Nom 4.619792
7 Resc-Inst 3.614583
8  Resc-Nom 4.078125
> ddply(data_part_1, .(Condition), summarise, Medians = median(Answer,na.rm=TRUE))
  Condition Medians
1  Deg-Inst       3
2   Deg-Nom       4
3  Neg-Inst       4
4   Neg-Nom       5
5  Ref-Inst       4
6   Ref-Nom       5
7 Resc-Inst       4
8  Resc-Nom       5
>

[Figure 1: Error-bars, Experiment. x-axis: Condition (Deg, Neg, Ref, Resc); y-axis: Answer; legend: Nom (No, Yes)]

Linear model for the experiment
> summary(m1)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: as.numeric(Answer) ~ Condition * Nom + (1 | Subj) + (1 | Item)
   Data: data_part_1
REML criterion at convergence: 4917.1
Scaled residuals:
    Min      1Q  Median      3Q     Max
-3.5572 -0.5778  0.1593  0.7057  2.3157
Random effects:
 Groups   Name        Variance Std.Dev.
 Subj     (Intercept) 0.1993   0.4464
 Item     (Intercept) 0.1897   0.4356
 Residual             1.2932   1.1372
Number of obs: 1536, groups:  Subj, 46; Item, 32

Fixed effects:
                     Estimate Std. Error         df t value Pr(>|t|)
(Intercept)         3.733e+00  1.306e-01  1.383e+02  28.587  < 2e-16 ***
ConditionDeg-Inst  -8.175e-01  1.163e-01  1.453e+03  -7.030 3.16e-12 ***
ConditionDeg-Nom   -1.808e-01  1.164e-01  1.454e+03  -1.554  0.12034
ConditionNeg-Inst   6.586e-03  1.165e-01  1.454e+03   0.057  0.95491
ConditionNeg-Nom    5.242e-01  1.164e-01  1.454e+03   4.505 7.18e-06 ***
ConditionRef-Nom    9.123e-01  1.166e-01  1.455e+03   7.824 9.78e-15 ***
ConditionResc-Inst -1.158e-01  1.163e-01  1.453e+03  -0.996  0.31958
ConditionResc-Nom   3.551e-01  1.166e-01  1.455e+03   3.046  0.00236 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
            (Intr) CndD-I CndD-N CndN-I CndN-N CndtnRf-N CndR-I
CndtnDg-Ins -0.445
CondtnDg-Nm -0.446  0.498
CndtnNg-Ins -0.446  0.501  0.498
CondtnNg-Nm -0.446  0.500  0.501  0.498
CondtnRf-Nm -0.447  0.499  0.502  0.502  0.500
CndtnRsc-In -0.445  0.498  0.500  0.501  0.498  0.501
CndtnRsc-Nm -0.447  0.501  0.500  0.502  0.502  0.503     0.499
fit warnings:
fixed-effect model matrix is rank deficient so dropping 8 columns / coefficients

Summary of the experiment
• reference level condition: Ref-Inst
• against PPI-status: no significant difference between Ref-Inst and Neg-Inst
• rescuing doesn’t work: Resc-Inst is numerically even lower than Ref-Inst (difference not significant)
• the construction is sensitive to degrees: Deg-Inst significantly worse
• Nominative conditions are always better (default)
• degree construction with no real PPI behaviour (against Hoeksema (2009) and Morzycki (2012))

Another model
• some hint of PPI-hood: Ref-Nom and Neg-Nom are significantly different:
> data_part_1$Condition <- relevel(data_part_1$Condition, ref="Neg-Nom")
> m1a <- lmer(as.numeric(Answer) ~ Condition * Nom + (1|Subj) + (1|Item), data=data_part_1)
fixed-effect model matrix is rank deficient so dropping 8 columns / coefficients
> summary(m1a)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: as.numeric(Answer) ~ Condition * Nom + (1 | Subj) + (1 | Item)
   Data: data_part_1
...
Fixed effects:
                   Estimate Std. Error        df t value Pr(>|t|)
(Intercept)          4.2567     0.1306  138.2625  32.601  < 2e-16 ***
ConditionNeg-Inst   -0.5176     0.1166 1455.3944  -4.439 9.73e-06 ***
ConditionRef-Inst   -0.5242     0.1164 1453.7258  -4.505 7.18e-06 ***
ConditionDeg-Inst   -1.3417     0.1164 1453.7258 -11.531  < 2e-16 ***
ConditionDeg-Nom    -0.7050     0.1163 1453.2580  -6.063 1.71e-09 ***
ConditionRef-Nom     0.3881     0.1165 1454.4890   3.332 0.000883 ***
ConditionResc-Inst  -0.6399     0.1166 1455.3944  -5.488 4.78e-08 ***
ConditionResc-Nom   -0.1690     0.1163 1453.2580  -1.453 0.146332
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Linguistic interpretation
• PPI-status very improbable
• reasons:
1) Krifka: PPIs need emphatic/focus alternatives which are ordered logically (or by likelihood)
   – even [¬ ... ONE ...]: alternatives {1,2,3,...}
   – results of the experiment: no logical ordering (INST >d NOM/INST >d NOM) between nominative and instrumental
   – no likelihood ordering
   – covert Emph.Assert/even wouldn’t produce scalar presupposition
2) the nature of swarm-alternatives differs from that of i/ani
3) interesting counter-evidence to Dowty’s claim:
   – A-construction (Nominative) is semantically marked even if default
   – clearly high degree constructions in both L/A realizations
   – if the mapping (to degrees) happens, then in both L/A realizations
   – problematic even for Hoeksema’s analysis

Research question:
(23) If both nominative and instrumental are high-degree constructions, what is the difference between them?

Thanks!

Appendix
• similar case pattern as in spray-load alternations:
• total affectedness ≈ Instrumental
• Instrumental: high degree, not prone to low-degree modifiers (trochu ‘a bit’)
(24) Petr naložil seno na vůz.
     Petr loaded.3SG hay.ACC on truck.ACC
     ‘Petr loaded the hay on the truck.’ (CHECK)
(25) Petr naložil vůz senem.
     Petr loaded truck.ACC hay.INST
     ‘Petr loaded the truck with hay.’

References
Dowty, David. 2000. “‘The Garden Swarms with Bees’ and the Fallacy of ‘Argument Alternation’.”
Hoeksema, Jack. 2009. “The Swarm Alternation Revisited.” Theory and Evidence in Semantics, no. 189: 53.
Hoeksema, Jack. 2018. “Positive Polarity Predicates.” Linguistics 56 (2): 361–400.
Morzycki, Marcin. 2012. “Adjectival Extremeness: Degree Modification and Contextually Restricted Scales.” Natural Language & Linguistic Theory 30 (2): 567–609.