Lecture 9
Syntactic Formalisms for Parsing Natural Languages
Aleš Horák, Miloš Jakubíček, Vojtěch Kovář
(based on slides by Juyeon Kang)
ia161@nlp.fi.muni.cz
Autumn 2013
IA161 Syntactic Formalisms for Parsing Natural Languages

Outline
- HPSG parser: Enju
  - parsing method
  - description of the parser
  - results
- CCG parser: C&C tools
  - parsing method
  - description of the parser
  - results

Theoretical background
- Lecture 3: HPSG parsing
- Lectures 6 and 7: CCG parsing and combinatory logic

Enju (Y. Miyao, J. Tsujii, 2004, 2008)
- syntactic parser for English
- developed by the Tsujii Laboratory of the University of Tokyo
- based on a wide-coverage probabilistic HPSG
- HPSG theory [Pollard and Sag, 1994]
- useful links:
  http://www-tsujii.is.s.u-tokyo.ac.jp/enju/demo.html
  http://www-tsujii.is.s.u-tokyo.ac.jp/enju/

Motivations
Parsing based on a proper linguistic formalism is one of the core research fields in CL and NLP. It has, however, been a monolithic, esoteric and inward-looking field, largely dissociated from real-world applications.

So why not!
The integration of linguistic grammar formalisms with statistical models can yield parsers that are robust, efficient, and open to eclectic sources of information other than syntactic ones.

Motivations: two main ideas
- development of wide-coverage linguistic grammars
- a deep parser that produces semantic representations (predicate-argument structures)

Parsing method
Application of a probabilistic model to the HPSG grammar and development of an efficient parsing algorithm:
- accurate deep analysis
- disambiguation
- wide coverage
- high speed
→ useful for high-level NLP applications

Parsing method 1: parsing based on HPSG
- mathematically well defined, with a sophisticated constraint-based system
- linguistically justified
- a deep syntactic grammar that provides semantic analysis

Difficulties in parsing based on HPSG
- difficult to develop a broad-coverage HPSG grammar
- difficult to disambiguate
- low efficiency: very slow

Solution: corpus-oriented development of an HPSG grammar
- the principal aim of grammar development is treebank construction
- the Penn Treebank is converted into an HPSG treebank
- a lexicon and a probabilistic model are extracted from the HPSG treebank

Approach
- develop grammar rules and an HPSG treebank
- collect lexical entries from the HPSG treebank

How to make an HPSG treebank?
Convert the Penn Treebank into HPSG: develop the grammar by restructuring the treebank in conformity with the HPSG grammar rules.

Parsing method
HPSG = lexical entries + grammar rules
The Enju grammar has 12 grammar rules and 3,797 lexical entries for 10,536 words (Miyao et al., 2004).

Overview of grammar development
1. Treebank conversion: modify the constituent structures by adding feature structures.
2. Grammar rule application: apply a grammar rule when the parse tree contains the correct analysis and the specified feature values are filled.
3. Lexical entry collection: collect the terminal nodes of the HPSG parse trees and assign predicate-argument structures.

Parsing method 2: probabilistic model and HPSG
Log-linear model for unification-based grammars (Abney 1997; Johnson et al. 1999; Riezler et al. 2000; Miyao et al. 2003; Malouf and van Noord 2004; Kaplan et al. 2004; Miyao and Tsujii 2005)

p(T|w), e.g. for w = "A blue eyes girl with white hair and skin walked" and a candidate parse tree T
[figure: one candidate parse tree T for w, built from NP, PP, VP and S nodes]

T1, T2, T3, T4, ..., Tn are all possible parse trees derived from w with the grammar.
For example, p(T3|w) is the probability of selecting T3 from among T1, T2, ..., Tn.

Log-linear model for unification-based grammars
Input sentence: w = w1/P1, w2/P2, . . . ,
wn/Pn (each word wi with its POS tag Pi)
Output: parse tree T

  p(T|w) = (1/Z_w) · exp( Σ_i λ_i · f_i(T, w) )

where
- Z_w is the normalization factor,
- λ_i is the weight of the feature function f_i,
- f_i is a feature function.

Description of the parser
Parsing proceeds in the following steps:
1. Preprocessing: the preprocessor converts the input sentence into a word lattice.
2. Lexicon lookup: the parser uses the predicate to find lexical entries for the word lattice.
3. Kernel parsing: the parser performs phrase analysis using the defined grammar rules.

Chart data structure
- a two-dimensional table
- each cell in the table is called a "CKY cell"

Example: let the input sentence be s = (w1, w2, ..., wn) with w1 = "I", w2 = "saw", w3 = "a", w4 = "girl", w5 = "with", w6 = "a", w7 = "telescope". For the sentence "I saw a girl with a telescope", the chart is arranged as follows.
  I     saw   a     girl  with  a     telescope
  0,1   1,2   2,3   3,4   4,5   5,6   6,7
  0,2   1,3   2,4   3,5   4,6   5,7
  0,3   1,4   2,5   3,6   4,7
  0,4   1,5   2,6   3,7
  0,5   1,6   2,7
  0,6   1,7
  0,7

System overview
1. supertagging
2. enumeration of assignments
3. deterministic disambiguation
[figure: the sentence "Mary loved John", with candidate HPSG lexical signs (feature structures with HEAD, SUBJ and COMPS features) assigned to each word by the supertagger and narrowed down by disambiguation]

Demonstration
http://www-tsujii.is.s.u-tokyo.ac.jp/enju/demo.html

Results
Fast, robust and accurate analysis:
- phrase structures
- predicate-argument structures
Accurate deep analysis: the parser outputs both phrase structures and predicate-argument structures. The accuracy of predicate-argument relations is around 90% for newswire articles and biomedical papers.
High speed: parsing takes less than 500 msec per sentence by default (faster than most Penn Treebank parsers), and less than 50 msec with the high-speed setting ("mogura").
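The CKY cell layout above can be sketched in a few lines of Python. This is only an illustration of the chart indexing (variable names are invented for the example, not taken from the Enju implementation): each cell (i, j) covers words w(i+1)..w(j), and a CKY-style parser fills the cells in order of increasing span length.

```python
# Enumerate the CKY cells for a 7-word sentence, in the order a
# chart parser fills them (shorter spans first).
words = ["I", "saw", "a", "girl", "with", "a", "telescope"]
n = len(words)

chart = {}
for length in range(1, n + 1):          # span length
    for i in range(0, n - length + 1):  # span start
        j = i + length                  # span end
        chart[(i, j)] = []              # edges (signs) found for this span
        # a real parser would combine the contents of chart[(i, k)] and
        # chart[(k, j)] for every split point k, using the grammar rules

print(len(chart))        # 28 cells: n*(n+1)//2 for n = 7
print(sorted(chart)[0])  # (0, 1), the cell covering the word "I"
```

The triangular shape of the table in the slide corresponds exactly to this enumeration: row r of the table lists all cells of span length r.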
C&C tools
- developed by Curran and Clark [Clark and Curran, 2002; Curran, Clark and Bos, 2007], University of Edinburgh
- a wide-coverage statistical parser based on CCG: the CCG parser
- a computational-semantics tool named Boxer
- useful links:
  http://svn.ask.it.usyd.edu.au/trac/candc
  http://svn.ask.it.usyd.edu.au/trac/candc/wiki/Demo

CCG parser [Clark, 2007]
Statistical parsing and CCG: advantages of CCG
- it provides a compositional semantics for the grammar
  → a completely transparent interface between syntax and semantics
- the recovery of long-range dependencies can be integrated into the parsing process in a straightforward manner

Parsing method
- Penn Treebank conversions exist for TAG, LFG, HPSG and CCG.
- CCGBank [Hockenmaier and Steedman, 2007]: the CCG version of the Penn Treebank
- Grammar used in the CCG parser:
  - CCGBank
  - some rules used as the grammar
  - the lexical category set
  - training data for the statistical models (supertagger, parser)

Parsing method: CCGBank
A corpus translated from the Penn Treebank; CCGBank contains
- syntactic derivations
- word-word dependencies
- predicate-argument structures
It is a semi-automatic conversion of the phrase-structure trees in the Penn Treebank into CCG derivations, and consists mainly of newspaper texts.
Grammar:
- lexical category set
- combinatory rules
- unary type-changing rules
- normal-form constraints
- punctuation rules

Parsing method: supertagging [Clark, 2002]
- uses conditional maximum entropy models
- implements a maximum entropy supertagger
[figure: supertagging example for a French sentence; each word of "... tout commentaire sur cette proposition et préfère avancer les ..." is assigned a POS tag (ADV, NOM, PRP, DET:ART, VER:pres, ...) and a set of candidate CCG lexical categories such as np/n, (n\n)/np, (np\s)/np]

Parsing method: supertagger
- a set of 425 lexical categories from CCGBank
- the per-word accuracy of the supertagger is around 92% on unseen WSJ text
  → using the multi-supertagger (assigning more than one category per word) increases the accuracy significantly, to over 98%, at only a small cost in increased ambiguity

Parsing method: supertagger
Log-linear models in NLP applications:
- POS tagging
- named entity recognition
- chunking
- parsing
→ also referred to as maximum entropy models and random fields

Parsing method: supertagger
Log-linear parsing models for CCG:
1. the probability of a dependency structure
2. the normal-form model: the probability of a single derivation
→ modelling 2 is simpler than modelling 1

1 is defined as
  P(π|S) = Σ_{d ∈ Δ(π)} P(d, π|S)
2 is defined using a log-linear form:
  P(ω|S) = (1/Z_S) · e^{λ·f(ω)},   where   Z_S = Σ_{ω′ ∈ ρ(S)} e^{λ·f(ω′)}

Features common to the dependency and normal-form models:

  Feature type             Example
  LexCat + Word            (S/S)/NP + Before
  LexCat + POS             (S/S)/NP + IN
  RootCat                  S[dcl]
  RootCat + Word           S[dcl] + was
  RootCat + POS            S[dcl] + VBD
  Rule                     S[dcl] → NP S[dcl]\NP
  Rule + Word              S[dcl] → NP S[dcl]\NP + bought
  Rule + POS               S[dcl] → NP S[dcl]\NP + VBD

Predicate-argument dependency features for the dependency model:

  Feature type             Example
  Word-Word                ⟨bought, (S\NP1)/NP2, 2, stake, (NP\NP)/(S[dcl]/NP)⟩
  Word-POS                 ⟨bought, (S\NP1)/NP2, 2, NN, (NP\NP)/(S[dcl]/NP)⟩
  POS-Word                 ⟨VBD, (S\NP1)/NP2, 2, stake, (NP\NP)/(S[dcl]/NP)⟩
  POS-POS                  ⟨VBD, (S\NP1)/NP2, 2, NN, (NP\NP)/(S[dcl]/NP)⟩
  Word + Distance(words)   ⟨bought, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 2
  Word + Distance(punct)   ⟨bought, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 0
  Word + Distance(verbs)   ⟨bought, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 0
  POS + Distance(words)    ⟨VBD, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 2
  POS + Distance(punct)    ⟨VBD, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 0
  POS + Distance(verbs)    ⟨VBD, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 0

Rule dependency features for the normal-form model:

  Feature type             Example
  Word-Word                ⟨company, S[dcl] → NP S[dcl]\NP, bought⟩
  Word-POS                 ⟨company, S[dcl] → NP S[dcl]\NP, VBD⟩
  POS-Word                 ⟨NN, S[dcl] → NP S[dcl]\NP, bought⟩
  POS-POS                  ⟨NN, S[dcl] → NP S[dcl]\NP, VBD⟩
  Word + Distance(words)   ⟨bought, S[dcl] → NP S[dcl]\NP⟩ + > 2
  Word + Distance(punct)   ⟨bought, S[dcl] → NP S[dcl]\NP⟩ + 2
  Word + Distance(verbs)   ⟨bought, S[dcl] → NP S[dcl]\NP⟩ + 0
  POS + Distance(words)    ⟨VBD, S[dcl] → NP S[dcl]\NP⟩ + > 2
  POS + Distance(punct)    ⟨VBD, S[dcl] → NP S[dcl]\NP⟩ + 2
  POS + Distance(verbs)    ⟨VBD, S[dcl] → NP S[dcl]\NP⟩ + 0

Description of the parser
[figure: system pipeline; the input sentence is processed by the C&C taggers (POS tagger, chunker, supertagger) and the parser, whose models are trained on CCGBank, and Boxer produces the semantic output]

Demonstration
http://svn.ask.it.usyd.edu.au/trac/candc/wiki/Demo

Results
Supertagger ambiguity and accuracy on section 00:

  β      k    CATS/WORD   ACC     SENT ACC   ACC (POS)   SENT ACC (POS)
  0.075  20   1.27        97.34   67.43      96.34       60.27
  0.030  20   1.43        97.92   72.87      97.05       65.50
  0.010  20   1.72        98.37   77.73      97.63       70.52
  0.005  20   1.98        98.52   79.25      97.86       72.24
  0.001  150  3.57        99.17   87.19      98.66       80.24

Results
Parsing accuracy on DepBank
DepBank: the PARC Dependency Bank [King et al., 2003]

                 CCG parser              CCGbank
  Relation       Prec   Rec    F         Prec   Rec    F        # GRs
  dependent      84.07  82.19  83.12     88.83  84.19  86.44    10,696
  aux            95.03  90.75  92.84     96.47  90.33  93.30       400
  conj           79.02  75.97  77.46     83.07  80.27  81.65       595
  ta             51.52  11.64  18.99     62.07  12.59  20.93       292
  det            95.23  94.97  95.10     97.27  94.09  95.66     1,114
  arg_mod        81.46  81.76  81.61     86.75  84.19  85.45     8,295
  mod            71.30  77.23  74.14     77.83  79.65  78.73     3,908
  ncmod          73.36  78.96  76.05     78.88  80.64  79.75     3,550
  xmod           42.67  53.93  47.64     56.54  60.67  58.54       178
  cmod           51.34  57.14  54.08     64.77  69.09  66.86       168
  pmod            0.00   0.00   0.00      0.00   0.00   0.00        12
  arg            85.76  80.01  82.78     89.79  82.91  86.21     4,387
  subj_or_dobj   86.08  83.08  84.56     91.01  85.29  88.06     3,127
  subj           84.08  75.57  79.60     89.07  78.43  83.41     1,363
  ncsubj         83.89  75.78  79.63     88.86  78.51  83.37     1,354
  xsubj           0.00   0.00   0.00     50.00  28.57  36.36         7
  csubj           0.00   0.00   0.00      0.00   0.00   0.00         2
  comp           86.16  81.71  83.88     89.92  84.74  87.25     3,024
  obj            86.30  83.08  84.66     90.42  85.52  87.90     2,328
  dobj           87.01  88.44  87.71     92.11  90.32  91.21     1,764
  obj2           68.42  65.00  66.67     66.67  60.00  63.16        20
  iobj           83.22  65.63  73.38     83.59  69.81  76.08       544
  clausal        77.67  72.47  74.98     80.35  77.54  78.92       672
  xcomp          77.69  74.02  75.81     80.00  78.49  79.24       381
  ccomp          77.27  70.10  73.51     80.81  76.31  78.49       291
  pcomp           0.00   0.00   0.00      0.00   0.00   0.00        24
  macroaverage   65.71  62.29  63.95     71.73  65.85  68.67
  microaverage   81.95  80.35  81.14     86.86  82.75  84.76
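The F column in the table above is the harmonic mean of the precision and recall columns. A quick Python sketch reproducing one reported value (the function name is ours, not from the C&C tools):

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, as reported in the F column."""
    return 2 * precision * recall / (precision + recall)

# 'dependent' row, CCG-parser columns: Prec = 84.07, Rec = 82.19
print(round(f1(84.07, 82.19), 2))  # 83.12, matching the reported F
```

Small rounding differences against the table are possible for other rows, since the published scores are computed from raw dependency counts rather than from the rounded percentages.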