Parsing with HPSG - Lecture 3
Syntactic formalisms for natural language parsing
FI MU autumn 2011

Overview on syntactic formalisms
● Unification based grammars: HPSG, LFG, TAG, UCG...
● Dependency based grammars: Tesnière model; Meaning-Text of Mel'čuk...
● Application based grammars: CG, CCG, ACG, CTL...

Heritage of HPSG
● GPSG
  – linear order / hierarchy order
  – feature structure for representation of information
● LFG
  – Lexicon contains lexical rules
● CG
  – Subcategorization

Key points of HPSG
● Monostratal theory without derivation
  – Sharing a given piece of information without movement or transformation
  – One representation for different levels of analysis: phonology, syntax, semantics
  – Constraint-based analyses
● Unification of given information
● Computational formalism

Syntactic representation in HPSG
● Typed feature structure
● consists of attribute/value pairs
● the types are organized into a hierarchy, e.g. sign > phrase, case > nominative
● a feature structure is a directed acyclic graph (DAG), with arcs representing features going between values

Features
● Basic element of structure in HPSG
● Should be appropriate to a type
● Most frequently used features
  – PHON, SYNSEM, LOC/NON-LOC, CAT, CONTEXT, CONTENT, HEAD, SUBJ, COMPS, S-ARG, QUANT, ...

Types
● Types are attributed to features -> typed features
  – sign, synsem, head, phrase, content, index, ...
● Each of these feature values is itself a complex object:
  – The type sign has the features PHON and SYNSEM appropriate for it
  – The feature SYNSEM has a value of type synsem
  – This type itself has relevant features (LOCAL and NON-LOCAL)

● sign is the basic type in HPSG used to describe lexical items (of type word) and phrases (of type phrase).
● All signs carry the following two features:
  – PHON encodes the phonological representation of the sign
  – SYNSEM encodes its syntax and semantics

● In attribute-value matrix (AVM) form, here is the skeleton of an object: [figure]

Structure of signs in HPSG
● synsem introduces the features LOCAL and NON-LOCAL
● local introduces CATEGORY (CAT), CONTENT (CONT) and CONTEXT (CONX)
● non-local will be discussed in connection with unbounded dependencies
● category includes the syntactic category and the grammatical arguments of the word/phrase

Description of an object in HPSG: lexical sign and phrasal sign [figure]

CATEGORY
● CATEGORY encodes the sign's syntactic category
  – Given via the feature [HEAD head], where head is the supertype for noun, verb, adjective, preposition, determiner, marker; each of these types selects a particular set of head features
  – Given via the feature [VALENCE ...], which makes it possible to combine the sign with other signs into larger phrases

Sub-categorization of head type [figure]

Description of an object in HPSG [figure]

Semantic representation: CONTENT (& CONTEXT) feature
● Semantic interpretation of the sign is given as the value of CONTENT
● nominal-object: an individual/entity (or a set of them), associated with a referring index, bearing agreement features → INDEX, RESTR
● parameterized-state-of-affairs (psoa): a partial situation; an event relation along with role names for identifying the participants of the event → BACKGR
● quantifier: some, all, every, a, the, ...
● Note: many of these have been reformulated by “Minimal Recursion Semantics (MRS)”, which allows underspecification of quantifier scopes, though an in-depth discussion of MRS is beyond the scope of this class

Sub-categorization of content type [figure]
Note: Semantic restrictions on the index are represented as the value of RESTR. RESTR is an attribute of a nominal object. The value of RESTR is a set of psoas. In turn, RESTR has the attribute REL, whose value can be either referential indices or psoas.
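The type hierarchy and the features appropriate to each type can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the names SUPERTYPE, APPROPRIATE, supertypes, subsumes and appropriate_features are hypothetical and do not come from any HPSG toolkit, and the hierarchy contains only the types and features named in the slides so far, so it is deliberately incomplete.

```python
# Hypothetical sketch of a type hierarchy with appropriateness conditions.
# Each type maps to its immediate supertype (None marks a root type).
SUPERTYPE = {
    "sign": None, "word": "sign", "phrase": "sign",
    "synsem": None, "local": None,
    "head": None, "noun": "head", "verb": "head", "adjective": "head",
    "preposition": "head", "determiner": "head", "marker": "head",
    "content": None, "nominal-object": "content",
    "psoa": "content", "quantifier": "content",
    "case": None, "nominative": "case",
}

# Features introduced by a type; subtypes inherit them.
APPROPRIATE = {
    "sign": {"PHON", "SYNSEM"},
    "synsem": {"LOCAL", "NON-LOCAL"},
    "local": {"CAT", "CONT", "CONX"},
    "nominal-object": {"INDEX", "RESTR"},
}


def supertypes(t):
    """Return t followed by all of its supertypes, most specific first."""
    chain = []
    while t is not None:
        chain.append(t)
        t = SUPERTYPE[t]
    return chain


def subsumes(general, specific):
    """True if `general` is `specific` itself or one of its supertypes."""
    return general in supertypes(specific)


def appropriate_features(t):
    """Collect the features appropriate to t, including inherited ones."""
    feats = set()
    for s in supertypes(t):
        feats |= APPROPRIATE.get(s, set())
    return feats


print(subsumes("sign", "phrase"))            # sign > phrase
print(sorted(appropriate_features("word")))  # inherited from sign
```

In this sketch a word or a phrase carries PHON and SYNSEM only because it is a subtype of sign, which is the sense in which a feature "should be appropriate to a type".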
Sub-categorization of index type [figure]

Lexical input of She [figure]

● Each phrase has a DTRS attribute which has a constituent-structure value
● This DTRS value corresponds to what we view in a tree as daughters (with additional grammatical role information, e.g. adjunct, complement, etc.)
● By distinguishing different kinds of constituent structures, we can define different kinds of constructions in a language

Structure of phrase [figure]

head-subject/complement structure [figure]

Questions! (1)
● How exactly did the last example work?
● drink has head information specifying that it is a finite verb and subcategorizes for a subject and an object
  – The head information gets percolated up (the HEAD feature principle)
  – The valence information gets “checked off” as one moves up in the tree (the VALENCE principle)
● Such principles are treated as linguistic universals in HPSG

HEAD-feature principle
● The value of the HEAD feature of any headed phrase is token-identical with the HEAD value of the head daughter

VALENCE principle
● In a headed phrase, for each valence feature F, the F value of the head daughter is the concatenation of the phrase’s F value with the list of the F-DTR’s SYNSEM values (Pollard and Sag, 1994:348)
● Note: the Valence Principle constrains the way in which information is shared between phrases and their head daughters.
  – F can be any one of SUBJ, COMPS, SPR
  – When the F-DTR is empty, the F valence feature of the head daughter will be copied to the mother phrase

Questions! (2)
● Note that agreement is handled neatly, simply by the fact that the SYNSEM values of a word’s daughters are token-identical to the items on the VALENCE lists
● How exactly do we decide on a syntactic structure?
● Why is the subject checked off at a higher point in the tree?

Immediate Dominance (ID) Principle
● Every headed phrase must satisfy exactly one of the ID schemata
  – The exact inventory of valid ID schemata is language specific
  – We will introduce a set of ID schemata for English

Immediate Dominance (ID) Schemata [figure]

head-adjunct structure [figure]

Semantic principle
● The CONTENT value of a headed phrase is token-identical to the CONTENT value of the semantic head daughter
● The semantic head daughter is identified as
  – The ADJ-DTR in a head-adjunct phrase
  – The HEAD-DTR in other headed phrases

SPEC principle
● In a headed phrase whose non-head daughter (either the MARK-DTR or COMP-DTR|FIRST) has a SYNSEM|LOCAL|CATEGORY|HEAD value of type functional, the SPEC value of that value must be token-identical with the phrase’s DTRS|HEAD-DTR|SYNSEM value

Example 2: Kim likes bagels [figures]
● Kim likes(1) bagels
● Kim likes(2) bagels
● head-complement schema
● head-complement schema headed by likes
● head-subject schema
● head-subject schema headed by likes bagels
● Tree of Kim likes bagels

Compare HPSG to CFG
● Each sign or HPSG rule consists of SYNSEM, DTRS, and PHON parts.
● The SYNSEM part specifies how the syntax and semantics of the phrase (or word) are constrained. It corresponds roughly to the left-hand side of CFG rules but contains much more information.
● The DTRS part specifies the constituents that make up the phrase (if it is a phrase). (Each of these constituents is a complete sign.) This corresponds to part of the information on the right-hand side of CFG rules, but not to ordering information.
● The PHON part specifies the ordering of the constituents in DTRS (where this is constrained) and the pronunciation of these (if this is specifiable). This corresponds to the ordering information on the right-hand side of CFG rules.

Simulation of Bottom-up parsing algorithm in HPSG
● Unify input lexical-signs with lexical-signs in the lexicon.
● Until no more such unifications are possible
  – Unify instantiated signs with the daughters of instantiated phrasal signs or with phrasal signs in the grammar.
● If all instantiated signs but one saturated one (S) are associated with daughters of other instantiated signs, and the PHON value of all instantiated signs is completely specified, return the complete S structure; else fail.

Example 2: processing of unification
Kim walks
The words in the sentence specify only their pronunciations and their positions.
1 [PHON ((0 1 kim))]
2 [PHON ((1 2 walks))]

STEP 1: Unifying 1 with the lexical entry for Kim gives
3 [PHON ((0 1 kim)) SYNSEM [CAT [HEAD noun SUBCAT ()] CONTENT [INDEX 1 [PER 3rd NUM sing]] CONTEXT [BACKGR {[RELN naming BEARER 1 NAME Kim]}]]]
We now know something about the meaning of Kim (it refers to somebody named Kim) and something about its syntactic properties (it is third person singular).

STEP 2: Unifying 2 with the lexical entry for walks gives
4 [PHON ((1 2 walks)) SYNSEM [CAT [HEAD [VFORM fin] SUBCAT ([CAT [HEAD noun SUBCAT ()] CONTENT [INDEX 1 [PER 3rd NUM sing]]])] CONTENT [RELN walk WALKER 1]]]
We know that walks refers to walking and that it requires a subject noun phrase which refers to the walker but doesn't require any object.

STEP 3: Unifying 4 with the HEAD-DTR of the HEAD-DTR rule shown after this step gives
5 [SYNSEM [CAT [HEAD [VFORM fin] SUBCAT 2([CAT [HEAD noun SUBCAT ()] CONTENT [INDEX 1 [PER 3rd NUM sing]]])] CONTENT 4[RELN walk WALKER 1]] DTRS [HEAD-DTR [SYNSEM [CAT [HEAD [VFORM fin] SUBCAT (2)] CONTENT 4] PHON 3((1 2 walks))] SUBJ-DTRS ()] PHON 3((1 2 walks))]
Now we have a VP with the intransitive verb walks as its head (and only constituent).
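Steps 1 to 3 all rely on a single operation, unification. The following is a minimal sketch of it, assuming that feature structures are plain nested dicts and that atomic values are strings or tuples. The function name unify and the value "plur" are assumptions made for illustration. The sketch leaves out types and structure sharing (the numbered tags), so it is much weaker than the unification an HPSG parser actually needs.

```python
def unify(a, b):
    """Unify two feature structures; return None if they clash.

    Dicts are merged feature by feature. Anything else is treated as an
    atomic value and unifies only with an equal value.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # copy so the inputs are not modified
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # clash somewhere below this feature
                result[feature] = merged
            else:
                result[feature] = value
        return result
    return a if a == b else None


# Input word as in the example: pronunciation and position only.
word_1 = {"PHON": ((0, 1, "kim"),)}

# Simplified lexical entry for Kim (CONTEXT omitted, no tags).
kim_entry = {
    "SYNSEM": {
        "CAT": {"HEAD": "noun", "SUBCAT": ()},
        "CONTENT": {"INDEX": {"PER": "3rd", "NUM": "sing"}},
    }
}

sign_3 = unify(word_1, kim_entry)
print(sign_3["SYNSEM"]["CAT"]["HEAD"])

# A clash: the index is singular, the constraint asks for plural.
print(unify({"PER": "3rd", "NUM": "sing"}, {"NUM": "plur"}))
```

Because unification either adds information or fails, the order in which the words and rules are unified does not affect the final structure.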
HEAD-DTR rule used in Step 3:
[SYNSEM [CAT [HEAD 1 SUBCAT (2)] CONTENT 4] DTRS [HEAD-DTR [SYNSEM [CAT [HEAD 1 SUBCAT (2)] CONTENT 4] PHON 3] SUBJ-DTRS ()] PHON 3]

HEAD-DTR rule
6 [SYNSEM [CAT [HEAD 1 SUBCAT ()] CONTENT 4] DTRS [HEAD-DTR [SYNSEM [CAT [HEAD 1 SUBCAT (2)] CONTENT 4] PHON 3] SUBJ-DTRS ([PHON 5 SYNSEM 2])] PHON (5 < 3)]

STEP 4: Unifying 5 with the HEAD-DTR of this rule gives
7 [SYNSEM [CAT [HEAD 1[VFORM fin] SUBCAT ()] CONTENT 4[RELN walk WALKER ]] DTRS [HEAD-DTR [SYNSEM [CAT [HEAD 1[VFORM fin] SUBCAT 2([CAT [HEAD noun SUBCAT ()] CONTENT [INDEX [PER 3rd NUM sing]]])] CONTENT [RELN walk WALKER 4]] PHON 3((1 2 walks))] SUBJ-DTRS ([PHON 5 SYNSEM 2[CAT [HEAD noun SUBCAT ()] CONTENT [INDEX ]]])] PHON (5 < 3((1 2 walks)))]

STEP 5: Unifying 3 with the SUBJ-DTR of 7 gives
8 [SYNSEM [CAT [HEAD [VFORM fin] SUBCAT ()] CONTENT [RELN walk WALKER [PER 3rd NUM sing]]] DTRS [HEAD-DTR [SYNSEM [CAT [HEAD [VFORM fin] SUBCAT ([CAT [HEAD noun SUBCAT ()] CONTENT [INDEX [PER 3rd NUM sing]]])] CONTENT [RELN walk WALKER [PER 3rd NUM sing]]] PHON ((1 2 walks))] SUBJ-DTRS ([PHON ((0 1 kim)) SYNSEM [CAT [HEAD noun SUBCAT ()] CONTENT [INDEX [PER 3rd NUM sing]]]])] PHON ((0 1 kim) (1 2 walks))]
Now the subject of the sentence is pronounceable, and we're done.

Phenomena covered by HPSG parsers
● Case assignment
● Word order: scrambling
● Long distance dependency
● Coordination
● Scope of adverbs and negation
● Topic drop
● Agreement
● Relative clause
● ...

Example 3: unbounded dependency construction
● An unbounded dependency construction
  – involves constituents with different functions
  – involves constituents of different categories
  – is in principle unbounded
● Two kinds of unbounded dependency constructions (UDCs)
  – Strong UDCs
  – Weak UDCs

Strong UDCs
● An overt constituent occurs in a non-argument position:
  – Topicalization: Kim_i, Sandy loves _i.
  – Wh-questions: I wonder [who_i Sandy loves _i].
  – Wh-relative clauses: This is the politician [who_i Sandy loves _i].
  – It-clefts: It is Kim_i [who_i Sandy loves _i].
  – Pseudoclefts: [What_i Sandy loves _i] is Kim_i.

Weak UDCs
● No overt constituent in a non-argument position:
  – Purpose infinitive (for-to clauses): I bought it_i for Sandy to eat _i.
  – Tough movement: Sandy_i is hard to love _i.
  – Relative clause without overt relative pronoun: This is [the politician]_i [Sandy loves _i].
  – It-clefts without overt relative pronoun: It is Kim_i [Sandy loves _i].

Using the feature SLASH
● To account for UDCs, we will use the feature SLASH (so named because it comes from notation like S/NP to mean an S missing an NP)
● This is a non-local feature which originates with a trace, gets passed up the tree, and is finally bound by a filler

The bottom of a UDC: Traces
● phonologically null, but structure-shares local and slash values

Traces
● Because the local value of a trace is structure-shared with the slash value, constraints on the trace will be constraints on the filler.
  – For example, hates specifies that its object be accusative, and this case information is local
  – So, the trace has [synsem|local|cat|head|case acc] as part of its entry, and thus the filler will also have to be accusative: *He_i / Him_i, John likes _i

The middle of a UDC: The Nonlocal Feature Principle (NFP)
● For each NON-LOCAL feature, the inherited value on the mother is the union of the inherited values on the daughters minus the to-bind value on the head daughter.
● In other words, the slash information (which is part of inherited) percolates “up” the tree
● This allows all the local information of a trace to “move up” to the filler

The top of a UDC: filler-head structures
● Example for a structure licensed by the filler-head schema [figure]

The analysis of the UDC example
● John_i, we know she likes _i [figure]

Example 4: John reads a new book [figures]
● Note: apply head-adjunct schema
● Completed analysis
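Looking back at the Nonlocal Feature Principle from the UDC discussion, the percolation of SLASH can be sketched as a set computation. This is a minimal sketch under stated assumptions: the function name mother_inherited is hypothetical, SLASH values are modelled as Python sets, and the string "NP[acc]" merely stands in for the local value that the trace shares with its filler. The tree shape follows the example John_i, we know she likes _i.

```python
def mother_inherited(daughters_inherited, head_to_bind):
    """INHERITED value on the mother: the union of the daughters'
    INHERITED values minus the TO-BIND value on the head daughter."""
    union = set()
    for value in daughters_inherited:
        union |= value
    return union - head_to_bind


# Bottom of the UDC: the trace contributes its local value to SLASH.
trace_slash = {"NP[acc]"}
nothing = set()

# Middle of the UDC: nothing is bound, so SLASH is passed up unchanged.
vp_likes = mother_inherited([nothing, trace_slash], nothing)  # likes + trace
s_she = mother_inherited([nothing, vp_likes], nothing)        # she + VP
vp_know = mother_inherited([nothing, s_she], nothing)         # know + S
s_we = mother_inherited([nothing, vp_know], nothing)          # we + VP

# Top of the UDC: in the filler-head structure the head daughter's
# TO-BIND value removes the element that the filler John satisfies.
top = mother_inherited([nothing, s_we], {"NP[acc]"})

print(s_we)
print(top)
```

The sentence is complete once the top node has an empty SLASH set, which is the set-based counterpart of the filler binding the trace.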