Lecture 6
Syntactic Formalisms for Parsing Natural Languages
Aleš Horák, Miloš Jakubíček, Vojtěch Kovář (based on slides by Juyeon Kang)
ia161@nlp.fi.muni.cz
Autumn 2013
IA161 Syntactic Formalisms for Parsing Natural Languages

Parsing with CCG

Outline
  1 A-B categorial system
  2 Lambek calculus
  3 Extended Categorial Grammar
      Variation based on Lambek calculus: Abstract Categorial Grammar, Categorial Type Logic
      Variation based on Combinatory Logic: Combinatory Categorial Grammar (CCG), Multi-modal Combinatory Categorial Grammar

Categorial Grammar is
  a lexicalized theory of grammar, alongside other theories of grammar such as HPSG, TAG, LFG, …
  linguistically and computationally attractive → language-invariant combination rules, highly efficient parsing

Main idea in CG and the application operation
  All natural language consists of operators and operands.
  Operator (functor) and operand (argument)
  Application: (operator(operand))
  Categorial type: typed operator and operand

1. A-B categorial system
  The product of Bar-Hillel's (1953) directional adaptation of Ajdukiewicz's calculus of syntactic connection (Ajdukiewicz, 1935).

  Definition 1 (AB categories). Given A, a finite set of atomic categories, the set of categories C is the smallest set such that:
    A ⊆ C
    (X\Y), (X/Y) ∈ C if X, Y ∈ C

  Categories (types): primitive categories and derived categories
    Primitive: S for sentence, N for nominal phrase
    Derived: S/N, N/N, (S\N)/N, NN/N, S/S, ...

  Forward (>) and backward (<) functional application
    a. X/Y  Y ⇒ X   (>)
    b. Y  X\Y ⇒ X   (<)

  The calculus on types in CG is analogous to algebraic operations:
    x/y y → x  ≈  3/5 ∗ 5 = 3

    Brazil   defeated   Germany
    n        (s\n)/n    n
             ------------------ >
             s\n
    --------------------------- <
    s

  Applicative tree of "Brazil defeated Germany"
    @ ((defeated(Germany))Brazil)
        @ defeated(Germany)
            defeated    operator
            Germany     operand
        Brazil          operand

Limitation of AB system
  1 Relative construction
      a.  team_i that t_i defeated Germany
      b.  team_i that Brazil defeated t_i
      a'. that: (n\n)/(s\n)    team [that]_(n\n)/(s\n) [defeated Germany]_s\n
      b'. that: (n\n)/(s/n)    team [that]_(n\n)/(s/n) [Brazil defeated]_s/n (?)

          team   that          Brazil   defeated
                 (n\n)/(s/n)   n        (s\n)/n
  3 Many other complex phenomena: coordination, object extraction, phrasal verbs, ...
  4 AB's generative power is too weak: context-free

2. Lambek calculus (Lambek, 1958, 1961)
  The calculus of syntactic types; still context-free.

  The axioms of the Lambek calculus are the following (axioms 1 and 2, with inference rules 3, 4 and 5):
    1 x → x
    2 (xy)z → x(yz) and x(yz) → (xy)z
    3 If xy → z then x → z/y; if xy → z then y → x\z
    4 If x → z/y then xy → z; if y → x\z then xy → z
    5 If x → y and y → z then x → z

  The rules obtained from the previous axioms are the following:
    1 Hypothesis: if x and y are types, then x/y and y\x are types.
    2 Application rules: (x/y)y → x, y(y\x) → x
        ex: Poor John works.
    3 Associativity rule: (x\y)/z ↔ x\(y/z)
        ex: John likes Jane.
    4 Composition rules: (x/y)(y/z) → x/z, (x\y)(y\z) → x\z
        ex: He likes him.    s/(n\s)  n\s/n
    5 Type-raising rules: x → y/(x\y), x → (y/x)\y

3. Combinatory Categorial Grammar
  Developed originally by M. Steedman (1988, 1990, 2000, ...)
  Combinatory Categorial Grammar (CCG) is a grammar formalism weakly equivalent to Tree Adjoining Grammar, i.e.
    it is lexicalized
    it is parsable in polynomial time (see Vijay-Shanker and Weir, 1990)
    it can capture cross-serial dependencies
  Just like TAG, CCG is used for grammar writing.
  CCG is especially suitable for statistical parsing.

  Several of the combinators which Curry and Feys (1958) use to define the λ-calculus and applicative systems in general are of considerable syntactic interest (Steedman, 1988).
  The relationships of these combinators to terms of the λ-calculus are defined by the following equivalences (Steedman, 2000b):
    a. Bfg ≡ λx.f(gx)     composition
    b. Tx ≡ λf.fx         type-raising
    c. Sfg ≡ λx.fx(gx)    substitution

CCG categories
  Atomic categories: S, N, NP, PP, TV…
  Complex categories are built recursively from atomic categories and slashes.
  Example complex categories for verbs:
    intransitive verb:   S\NP             walked
    transitive verb:     (S\NP)/NP        respected
    ditransitive verb:   ((S\NP)/NP)/NP   gave

Lexical categories in CCG
  An elementary syntactic structure, a lexical category, is assigned to each word in a sentence, e.g.:
    walked: S\NP    'give me an NP to my left and I return a sentence'
  Think of the lexical category for a verb as a function: NP is the argument, S the result, and the slash indicates the direction of the argument.

The typed lexicon item
  The CCG lexicon assigns categories to words, i.e. it specifies which categories a word can have.
  Furthermore, the lexicon specifies the semantic counterpart of each syntactic category, e.g.:
    love    (S\NP)/NP    λxλy.loves′xy
  Combinatory rules determine what happens with the category and the semantics on combination.

  Attribution of types to lexical items: examples
    Predicate
      ex: is
        as an identifier of a nominal
        as an operator of predication
          from a nominal        (S\NP)/NP
          from an adjective     (S\NP)/(N/N)
          from an adverb        (S\NP)/(S\NP)\(S\NP)
          from a preposition    (S\NP)/((S\NP)\(S\NP)/NP)
      ex: verbs
        unary      (S\NP)
        binary     (S\NP)/NP
        ternary    (S\NP)/NP/NP

    Adverbs
      Adverb of verb           (S\NP)/(S\NP)
                               (S\NP)/NP/(S\NP)/NP
      Adverb of adverb         (S\NP)/(S\NP)/(S\NP)/(S\NP)
                               (S\NP)/NP/(S\NP)/NP/(S\NP)/NP/(S\NP)/NP
      Adverb of adjective      (N/N)/(N/N)
                               (N\N)/(N\N)
      Adverb of proposition    S/S
      Adverb: operator of determination of type (X/X)

    Preposition
      Prep. 1: constructor of adverbial phrase     (S\NP)\(S\NP)/NP
                                                   (S/S)/NP
                                                   (S/S)/N
      Prep. 2: constructor of adjectival phrase    (N\N)/NP
                                                   (N\N)/N
      Preposition: constructor of determination of type (X/X)

Dictionary of typed words
  Syntactic categories   Syntactic types                   Lexical entries
  Nom.                   N                                 Olivia, apple…
  Completed nom.         NP                                an apple, the school
  Pron.                  NP                                She, he…
  Adj.                   (N/N), (N\N)                      pretty woman, …
  Adv.                   (N/N)/(N/N), (S\NP)\(S\NP)…       very delicious, …
  Vb                     (S\NP), (S\NP)/NP…                run, give…
  Prep.                  (S\NP)\(S\NP)/NP, (NP\NP)/NP…     run in the park, book of John, …
  Relative               (S\NP)/S…                         I believe that…

Combinatorial categorial rules
  Functional application (>, <)
  Functional composition (>B, <B)
  Type-raising (<T, >T)
  Distribution (<S, >S)
  Coordination (<Φ, >Φ)

Functional application (FA)
  X/Y : f   Y : a   ⇒ X : fa    (forward functional application, >)
  Y : a   X\Y : f   ⇒ X : fa    (backward functional application, <)
  Combine a function with its argument:

    John   likes        Mary
    NP     (S\NP)/NP    NP
           ----------------- >
           S\NP : (likes (Mary))
    ------------------------ <
    S : ((likes (Mary))John)

    Mary   sleeps
    NP     S\NP
    ------------- <
    S : (sleeps (Mary))

  The direction of the slash indicates the position of the argument with respect to the function.

Derivation in CCG
  The combinatorial rule used in each derivation step is usually indicated on the right of the derivation line.
  Note especially what happens with the semantic information.

    John         loves                        Mary
    NP : John′   (S\NP)/NP : λxλy.loves′xy    NP : Mary′
                 ---------------------------------------- >
                 S\NP : λy.loves′Mary′y
    ----------------------------------------------------- <
    S : loves′Mary′John′

Function composition (FC)
  Generalized forward composition (>Bn)
    X/Y : f   Y/Z : g   ⇒B X/Z : λx.f(gx)    (>B)
  Functional composition composes two complex categories (two functions):
    (S\NP)/PP   PP/NP       ⇒B (S\NP)/NP
    S/(S\NP)    (S\NP)/NP   ⇒B S/NP

    birds       like         bugs
    NP          (S\NP)/NP    NP
    -------- >T
    S/(S\NP)
    ---------------------- >B
    S/NP
    ------------------------------- >
    S

  Generalized backward composition (<Bn)
    Y\Z : g   X\Y : f   ⇒B X\Z : λx.f(gx)    (<B)

    The referee gave   Unsal   a card   and       Rivaldo   the ball
    (s/np)/np          np      np       (X\X)/X   np        np
                       ----------------------------------------------
                       s\((s/np)/np)
    ----------------------------------------------------------------- <
    s

Type-raising (T)
  Forward type-raising (>T)
    X : a   ⇒ T/(T\X) : λf.fa    (>T)
  Type-raising turns an argument into a function (e.g. for case assignment):
    NP ⇒ S/(S\NP)    (nominative)

    birds   fly
    NP      S\NP
    ------------ <
    S

    birds       fly
    NP          S\NP
    -------- >T
    S/(S\NP)
    ---------------- >
    S

  This must be used e.g. in the case of WH-questions.

Example of functional composition (>B) and type-raising (T)

    team  that          I          thought   that  Brazil     defeated
    n     (n\n)/(s/np)  np         (s\np)/s  s/s   np         (s\np)/np
                        ------ >T                  ------ >T
                        s/(s\np)                   s/(s\np)
                        ------------------ >B      --------------------- >B
                        s/s                        s/np
                                             --------------------------- >B
                                             s/np
                        ------------------------------------------------ >B
                        s/np
          -------------------------------------------------------------- >
          n\n
    -------------------------------------------------------------------- <
    n

  Backward type-raising (<T)
    X : a   ⇒ T\(T/X) : λf.fa    (<T)
  Type-raising turns an argument into a function (e.g. for case assignment):
    NP ⇒ (S\NP)\((S\NP)/NP)    (accusative)

    The referee gave    Unsal
    (s/np)/np           np
                        s\((s/np)/np)
    --------------------------------- <
    s

Coordination (&)
  X CONJ X ⇒Φ X    (coordination, Φ)

    give          a dog
    (VP/NP)/NP    VP\((VP/NP)/NP)
    ----------------------------- <
    VP

Substitution (S)
  Forward substitution (>S)
    (X/Y)/Z   Y/Z   ⇒S X/Z
  Application to a parasitic gap such as the following:
    a. team that I persuaded every detractor of to support

    team  that          I         persuaded            every detractor of  to support
          (n\n)/(s/np)  np        ((s\np)/(s\np))/np   np/np               (s\np)/np
                        ----- >T  --------------------------------------- >B
                        s/(s\np)  ((s\np)/(s\np))/np
                                  ---------------------------------------------------- >S
                                  (s\np)/np
                        -------------------------------------------------------------- >B
                        s/np
          ---------------------------------------------------------------------------- >
          n\n

  Backward crossed substitution (<S×)
    Y/Z   (X\Y)/Z   ⇒S X/Z
  Application to a parasitic gap such as the following:
    a. John watched without enjoying the game between Germany and Paraguay.
    b. game that John watched without enjoying

    game that John [watched]_(s\np)/np [without enjoying]_((s\np)\(s\np))/np

    game  that          John       watched     without enjoying
          (n\n)/(s/np)  np         (s\np)/np   ((s\np)\(s\np))/np
                        ----- >T   ------------------------------- <S×
                        s/(s\np)   (s\np)/np
                        ------------------------------------------ >B
                        s/np
          -------------------------------------------------------- >
          n\n

Limit on possible rules
  The Principle of Adjacency: Combinatory rules may only apply to entities which are linguistically realised and adjacent.
  The Principle of Directional Consistency: All syntactic combinatory rules must be consistent with the directionality of the principal function.
    ex: X\Y  Y ⇏ X
  The Principle of Directional Inheritance: If the category that results from the application of a combinatory rule is a function category, then the slash defining directionality for a given argument in that category will be the same as the one defining directionality for the corresponding arguments in the input functions.
    ex: X/Y  Y/Z ⇏ X\Z

Semantics in CCG
  CCG offers a syntax-semantics interface.
  The lexical categories are augmented with an explicit identification of their semantic interpretation, and the rules of functional application are accordingly expanded with an explicit semantics.
  Every syntactic category and rule has a semantic counterpart.
  The lexicon is used to pair words with syntactic categories and semantic interpretations:
    love    (S\NP)/NP ⇒ λxλy.loves′xy

  The semantic interpretation of all combinatory rules is fully determined by the Principle of Type Transparency:
    Categories: All syntactic categories reflect the semantic type of the associated logical form.
    Rules: All syntactic combinatory rules are type-transparent versions of one of a small number of semantic operations over functions including application, composition, and type-raising.
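As an illustration of type transparency, the following Python sketch pairs each category with a semantic value and lets every rule act on both at once. The names used here (Atom, Slash, Sign, forward_apply, backward_apply, forward_compose, type_raise) are assumptions made for this example and do not come from the lecture; semantic terms are modelled as curried Python functions that build strings. It is a minimal sketch of application, forward composition and forward type-raising, not a full CCG implementation.

```python
from dataclasses import dataclass
from typing import Any, Optional, Union


@dataclass(frozen=True)
class Atom:
    """Atomic category such as S or NP."""
    name: str

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Slash:
    """Complex category; direction is "/" (argument on the right)
    or a backslash (argument on the left)."""
    result: "Cat"
    direction: str
    arg: "Cat"

    def __str__(self) -> str:
        return f"({self.result}{self.direction}{self.arg})"


Cat = Union[Atom, Slash]


@dataclass(frozen=True)
class Sign:
    """A category paired with its semantics (a string or a curried function)."""
    cat: Cat
    sem: Any


def forward_apply(left: Sign, right: Sign) -> Optional[Sign]:
    # X/Y : f   Y : a  =>  X : f a   (>)
    c = left.cat
    if isinstance(c, Slash) and c.direction == "/" and c.arg == right.cat:
        return Sign(c.result, left.sem(right.sem))
    return None


def backward_apply(left: Sign, right: Sign) -> Optional[Sign]:
    # Y : a   X\Y : f  =>  X : f a   (<)
    c = right.cat
    if isinstance(c, Slash) and c.direction == "\\" and c.arg == left.cat:
        return Sign(c.result, right.sem(left.sem))
    return None


def forward_compose(left: Sign, right: Sign) -> Optional[Sign]:
    # X/Y : f   Y/Z : g  =>  X/Z : λx.f(g x)   (>B)
    f, g = left.cat, right.cat
    if (isinstance(f, Slash) and isinstance(g, Slash)
            and f.direction == "/" and g.direction == "/" and f.arg == g.result):
        return Sign(Slash(f.result, "/", g.arg), lambda x: left.sem(right.sem(x)))
    return None


def type_raise(sign: Sign, target: Cat) -> Sign:
    # X : a  =>  T/(T\X) : λf.f a   (>T)
    return Sign(Slash(target, "/", Slash(target, "\\", sign.cat)),
                lambda f: f(sign.sem))


S, NP = Atom("S"), Atom("NP")
TV = Slash(Slash(S, "\\", NP), "/", NP)           # (S\NP)/NP

# "John loves Mary" with application only.
john, mary = Sign(NP, "john'"), Sign(NP, "mary'")
loves = Sign(TV, lambda x: lambda y: f"loves' {x} {y}")
vp = forward_apply(loves, mary)                   # S\NP
sentence = backward_apply(john, vp)               # S
print(sentence.cat, ":", sentence.sem)

# "birds like bugs" with type-raising and composition.
birds, bugs = Sign(NP, "birds'"), Sign(NP, "bugs'")
like = Sign(TV, lambda x: lambda y: f"like' {x} {y}")
raised = type_raise(birds, S)                     # S/(S\NP)
birds_like = forward_compose(raised, like)        # S/NP
result = forward_apply(birds_like, bugs)          # S
print(result.cat, ":", result.sem)
```

Each rule checks only the categories and then performs the corresponding operation on the semantics, so the syntactic step and the semantic step cannot diverge. A pair of signs that no rule licenses simply yields `None`.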
Semantics in CCG
  proved := (S\NP3s)/NP : λxλy.prove′xy
  The semantic type of the reduction is the same as its syntactic type, here functional application.

    Marcel            proved                          completeness
    NP3sm : marcel′   (S\NP3s)/NP : λxλy.prove′xy     NP : completeness′
                      -------------------------------------------------- >
                      S\NP3s : λy.prove′completeness′y
    -------------------------------------------------------------------- <
    S : prove′completeness′marcel′

  CCG with semantics: Mary will copy and file without reading these articles
    Lexical categories:
      Mary will        S/VP             : λp.will′ p mary′
      copy             VP/NP            : copy′
      and              CONJ             : and′
      file             VP/NP            : file′
      without          (VP\VP)/VPing    : λp.λq.without′pq
      reading          VPing/NP         : read′
      these articles   NP               : articles′
    Derivation steps:
      without reading                          (>B)   (VP\VP)/NP : λx.λq.without′(read′x)q
      copy and file without reading                   VP/NP : λx.and′(without′(read′x)(file′x))(copy′x)
      ... these articles                       (>)    VP : and′(without′(read′articles′)(file′articles′))(copy′articles′)
      Mary will ...                            (>)    S : will′(and′(without′(read′articles′)(file′articles′))(copy′articles′))mary′

Parsing a sentence in CCG
  Step 1: tokenization
  Step 2: tagging the concatenated lexical items
  Step 3: calculating on the types attributed to the concatenated lexical items by applying the adequate combinatorial rules; eliminating the applied combinators (covered next week)
  Step 4: finding the parsing results, presented in the form of an operator/operand structure (predicate-argument structure)

  Example: I requested and would prefer musicals

  STEP 1: tokenization/lemmatization → e.g. POS tagger, tokenizer, lemmatizer
    a. I-requested-and-would-prefer-musicals
    b. I-request-ed-and-would-prefer-musical-s

  STEP 2: tagging the concatenated expressions → e.g. supertagger, inventory of typed words
    I            NP
    requested    (S\NP)/NP
    and          CONJ
    would        (S\NP)/VP
    prefer       VP/NP
    musicals     NP

  STEP 3: categorial calculus
    a. apply the type-raising rules
         Subject type-raising (>T):   NP : a ⇒ T/(T\NP) : Ta
    b. apply the functional composition rules
         Forward composition (>B):    X/Y : f   Y/Z : g ⇒ X/Z : Bfg
    c. apply the coordination rules
         Coordination (<&>):          X conj X ⇒ X

    7/ S
    6/ S/NP      NP                                           (>)
    5/ S/(S\NP)  (S\NP)/NP  NP                                (>B)
    4/ S/(S\NP)  (S\NP)/NP  NP                                (>Φ)
    3/ S/(S\NP)  (S\NP)/NP  CONJ  (S\NP)/NP  NP               (>B)
    2/ S/(S\NP)  (S\NP)/NP  CONJ  (S\NP)/VP  VP/NP   NP       (>T)
    1/ NP        (S\NP)/NP  CONJ  (S\NP)/VP  VP/NP   NP
       I         requested  and   would      prefer  musicals

  STEP 4: semantic representation (predicate-argument structure)
    7/ S : and′(will′(prefer′ musicals′) i′)(request′ musicals′ i′)
    6/ λy.and′(will′(prefer′ musicals′) y)(request′ musicals′ y)
    5/ λxλy.and′(will′(prefer′ x) y)(request′ x y)
    4/ λxλy.and′(will′(prefer′ x) y)(request′ x y)
    3/ λxλy.will′(prefer′ x) y
    2/ λf.f i′
    1/ i′   request′    and′   will′   prefer′   musicals′
       I    requested   and    would   prefer    musicals

Variation of CCG: Multi-modal CCG (Baldridge, 2002)
  Modalized CCG system
  Combination of Categorial Type Logic (CTL; Morrill, 1994; Moortgat, 1997) with CCG (Steedman, 2000)
  Rule restrictions by introducing the modalities: *, ×, •, ♢
  Modalized functional composition rules
    (>B)  X/♢Y   Y/♢Z   ⇒ X/♢Z
    (<B)  Y\♢Z   X\♢Y   ⇒ X\♢Z
  Recommended reading: the paper "Multi-Modal CCG" (Baldridge and Kruijff, 2003)

The positions of several formalisms on the Chomsky hierarchy
  Turing complete             Unrestricted CTL
  Context-sensitive           CTL with Non-expanding Rules, Multiset-CCG
  Mildly context-sensitive    CCG, TAG
  Context-free                AB, CTL Base Logic, Lambek Calculus
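Steps 2 and 3 of the parsing procedure for "I requested and would prefer musicals" can be sketched as a small CKY-style recognizer in Python. This is a hedged illustration, not the lecture's algorithm. The tuple encoding of categories, the function names (combine, parse, recognize, show), the restriction of type-raising to NP ⇒ S/(S\NP), and the binarized treatment of coordination (CONJ X ⇒ X[conj], then X X[conj] ⇒ X) are all assumptions made for this example. The lexicon is the one given in Step 2.

```python
from itertools import product

NP, S, VP, CONJ = "NP", "S", "VP", "CONJ"
IV = (S, "\\", NP)                       # S\NP, encoded as (result, slash, argument)

LEXICON = {
    "i": NP,
    "requested": (IV, "/", NP),
    "and": CONJ,
    "would": (IV, "/", VP),
    "prefer": (VP, "/", NP),
    "musicals": NP,
}


def is_function(cat):
    return isinstance(cat, tuple) and len(cat) == 3


def is_conj_marked(cat):
    # ("conj", X) stands for X[conj]: a conjunction already combined with X.
    return isinstance(cat, tuple) and len(cat) == 2


def show(cat):
    """Render a category in slide notation."""
    if is_function(cat):
        return f"({show(cat[0])}{cat[1]}{show(cat[2])})"
    if is_conj_marked(cat):
        return f"{show(cat[1])}[conj]"
    return cat


def combine(left, right):
    """Yield every category derivable from two adjacent categories."""
    if is_function(left) and left[1] == "/" and left[2] == right:
        yield left[0]                                    # >  : X/Y Y => X
    if is_function(right) and right[1] == "\\" and right[2] == left:
        yield right[0]                                   # <  : Y X\Y => X
    if (is_function(left) and is_function(right)
            and left[1] == "/" and right[1] == "/" and left[2] == right[0]):
        yield (left[0], "/", right[2])                   # >B : X/Y Y/Z => X/Z
    if left == CONJ and right != CONJ and not is_conj_marked(right):
        yield ("conj", right)                            # CONJ X => X[conj]
    if is_conj_marked(right) and right[1] == left:
        yield left                                       # X X[conj] => X


def parse(words):
    """Fill a CKY chart; chart[i][j] holds the categories spanning words[i:j]."""
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        cat = LEXICON[word.lower()]
        chart[i][i + 1].add(cat)
        if cat == NP:
            chart[i][i + 1].add((S, "/", IV))            # >T : NP => S/(S\NP)
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for left, right in product(chart[start][mid], chart[mid][end]):
                    chart[start][end].update(combine(left, right))
    return chart


def recognize(sentence):
    words = sentence.split()
    return S in parse(words)[0][len(words)]


print(recognize("I requested and would prefer musicals"))   # True
print(sorted(show(c) for c in parse("would prefer".split())[0][2]))
```

The chart reproduces the intermediate categories of the worked example: "would prefer" composes to (S\NP)/NP, coordination with "requested" keeps that category, and the type-raised subject composes with it to S/NP before the final application to "musicals". A recognizer of this kind only answers whether S is derivable; recovering the predicate-argument structure of Step 4 would require storing semantics or back-pointers in the chart.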