Pavel Rychlý
pary@fi.muni.cz
17 March 2014
IB047

Morphological tagging

■ every token gets a tag
■ tagsets range from several dozen to thousands of tags (encoding grammatical categories)
■ Universal Tagset (Google): 12 tags, parts of speech only
■ stored as one column in the vertical format

Approaches to syntax

■ for every sentence we build a tree capturing relations between words and/or groups of words
■ phrase-structure (constituency): groups are built up step by step from words
■ dependency: dependencies are determined between individual words

Vojtěch Kovář, FI MU Brno
Automatic syntactic analysis for real-world applications

Outline: Introduction, State of the art, Bushbank, Sketch grammar, SET parser, Applications, Conclusions

Phrase structure formalism - example

[Figure: phrase structure tree; surviving node labels: saw, man, with, telescope]

Dependency formalism - example

[Figure: dependency tree; surviving labels: [root], adet]

Dependency vs. phrase-structure

■ Non-projectivity
  ■ disconnected phrases
  ■ not natural in the phrase structure notation
  ■ 20% of Czech sentences are reported to contain a non-projective dependency
■ Phrase structure - more fine-grained analysis
  ■ (new (queen of beauty))
  ■ (new generation)(of fighters)
■ Coordinations and other "flat" phenomena
  ■ not natural in the dependency notation
  ■ problem for dependency analysis

Non-projectivity - example

[Figure not recovered]

Non-projectivity in phrase structure formalism

[Figure, three slides: trees over the Czech sentence "Malou měl chaloupku" ("He had a small cottage", literally "Small had cottage"); the later slides show the word order "měl Malou chaloupku"]

Phrase structure expressivity

[Figure: two dependency trees. First tree: [root] above "queen", with "New" (modifier), "of" (label garbled in extraction, ends in "-attached") and "beauty" (prep-object). Second tree: [root] above "generation", with "New" (modifier), "of" (label garbled in extraction, ends in "-attached") and "fighters" (prep-object)]

Phrase structure expressivity

[Figure: phrase structure trees for "New queen of beauty" and "New generation of fighters"]

Coordinations - dependency structure

[Figure not recovered]

Coordinations - phrase structure

[Figure: phrase structure tree of the fragment "velmi těžký a rozměrný" ("very heavy and bulky")]
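
The non-projectivity notion from the slides above can be made concrete with a small check. The following is a minimal Python sketch, not the SET parser's method: the function name `is_projective` and the head-index encoding are assumptions made for illustration. The head assignment used for "Malou měl chaloupku" (Malou depends on chaloupku, chaloupku depends on měl) is also an assumed analysis, since the arcs on the slide did not survive extraction. The input is assumed to be a well-formed tree with no cycles.

```python
from typing import List


def is_projective(heads: List[int]) -> bool:
    """Return True if the dependency tree has no non-projective arc.

    heads[i] is the 1-based position of the head of token i + 1;
    0 marks a token attached to the artificial root.
    Assumes the heads form a tree (no cycles).
    """

    def dominated_by(node: int, ancestor: int) -> bool:
        # Follow head links upwards from `node` until we meet
        # `ancestor` or fall off the top of the tree at the root.
        while node != 0:
            if node == ancestor:
                return True
            node = heads[node - 1]
        return False

    for dep in range(1, len(heads) + 1):
        head = heads[dep - 1]
        if head == 0:
            continue  # arcs from the artificial root are not checked
        lo, hi = sorted((dep, head))
        # Every token strictly between head and dependent must be
        # a descendant of the head, otherwise the arc is non-projective.
        for between in range(lo + 1, hi):
            if not dominated_by(between, head):
                return False
    return True


# "Malou měl chaloupku": Malou -> chaloupku, měl -> root, chaloupku -> měl
print(is_projective([3, 0, 2]))  # False
# "měl malou chaloupku": měl -> root, malou -> chaloupku, chaloupku -> měl
print(is_projective([0, 3, 1]))  # True
```

In the first sentence the arc from "chaloupku" to "Malou" crosses "měl", which is not a descendant of "chaloupku", so the phrase "Malou chaloupku" is disconnected in the surface order. That is the kind of structure the slides describe as unnatural in phrase structure notation.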