Lecture 9
Syntactic Formalisms for Parsing Natural Languages
Aleš Horák, Miloš Jakubíček, Vojtěch Kovář
(based on slides by Juyeon Kang)
ia161@nlp.fi.muni.cz
Autumn 2013
IA161 Syntactic Formalisms for Parsing Natural Languages

Outline
- HPSG parser: Enju
  - parsing method
  - description of the parser
  - results
- CCG parser: C&C tools
  - parsing method
  - description of the parser
  - results

Theoretical background
- Lecture 3: HPSG parsing
- Lectures 6 and 7: CCG parsing and combinatory logic

Enju (Y. Miyao, J. Tsujii, 2004, 2008)
- syntactic parser for English
- developed by the Tsujii Laboratory of the University of Tokyo
- based on a wide-coverage probabilistic HPSG
- HPSG theory [Pollard and Sag, 1994]
- useful links:
  http://www-tsujii.is.s.u-tokyo.ac.jp/enju/demo.html
  http://www-tsujii.is.s.u-tokyo.ac.jp/enju/

Motivations
Parsing based on a proper linguistic formalism is one of the core research fields in CL and NLP. It has, however, been a monolithic, esoteric and inward-looking field, largely dissociated from real-world applications.

So why not!
The integration of linguistic grammar formalisms with statistical models can yield parsers that are robust, efficient, and open to eclectic sources of information other than syntactic ones.

Motivations: two main ideas
- development of wide-coverage linguistic grammars
- a deep parser that produces semantic representations (predicate-argument structures)

Parsing method
Application of a probabilistic model to the HPSG grammar and development of an efficient parsing algorithm:
- accurate deep analysis
- disambiguation
- wide coverage
- high speed
→ useful for high-level NLP applications

Parsing method 1: parsing based on HPSG
- mathematically well defined, with a sophisticated constraint-based system
- linguistically justified
- a deep syntactic grammar that provides semantic analysis

Difficulties in parsing based on HPSG
- difficult to develop a broad-coverage HPSG grammar
- difficult to disambiguate
- low efficiency: very slow

Solution: corpus-oriented development of an HPSG grammar
- the principal aim of grammar development is treebank construction
- the Penn Treebank is converted into an HPSG treebank
- a lexicon and a probabilistic model are extracted from the HPSG treebank

Approach
- develop grammar rules and an HPSG treebank
- collect lexical entries from the HPSG treebank

How to make an HPSG treebank?
Convert the Penn Treebank into HPSG: develop the grammar by restructuring the treebank in conformity with the HPSG grammar rules.

Parsing method
HPSG = lexical entries + grammar rules
The Enju grammar has 12 grammar rules and 3,797 lexical entries for 10,536 words (Miyao et al., 2004).

Overview of grammar development
1. Treebank conversion: modify the constituent structures by adding feature structures.
2. Grammar rule application: apply a grammar rule when the parse tree contains the correct analysis and the specified feature values are filled.
3. Lexical entry collection: collect the terminal nodes of the HPSG parse trees and assign predicate-argument structures.

Parsing method 2: probabilistic model and HPSG
Log-linear model for unification-based grammars (Abney 1997; Johnson et al. 1999; Riezler et al. 2000; Miyao et al. 2003; Malouf and van Noord 2004; Kaplan et al. 2004; Miyao and Tsujii 2005)

p(T|w), e.g. for w = "A blue eyes girl with white hair and skin walked" and a candidate parse tree T
[figure: one candidate parse tree T for w, built from NP, PP, VP and S nodes]

T1, T2, T3, T4, ..., Tn are all possible parse trees derived from w with the grammar.
For example, p(T3|w) is the probability of selecting T3 from among T1, T2, ..., Tn.

Log-linear model for unification-based grammars
Input sentence: w = w1/P1, w2/P2, . . . ,
wn/Pn (each word wi with its POS tag Pi)
Output: parse tree T

  p(T|w) = (1/Z_w) · exp( Σ_i λ_i · f_i(T, w) )

where
- Z_w is the normalization factor,
- λ_i is the weight of the feature function f_i,
- f_i is a feature function.

Description of the parser
Parsing proceeds in the following steps:
1. Preprocessing: the preprocessor converts the input sentence into a word lattice.
2. Lexicon lookup: the parser uses the predicate to find lexical entries for the word lattice.
3. Kernel parsing: the parser performs phrase analysis using the defined grammar rules.

Chart data structure
- a two-dimensional table
- each cell in the table is called a "CKY cell"

Example: let the input sentence be s = (w1, w2, ..., wn) with w1 = "I", w2 = "saw", w3 = "a", w4 = "girl", w5 = "with", w6 = "a", w7 = "telescope". For the sentence "I saw a girl with a telescope", the chart is arranged as follows.
  I     saw   a     girl  with  a     telescope
  0,1   1,2   2,3   3,4   4,5   5,6   6,7
  0,2   1,3   2,4   3,5   4,6   5,7
  0,3   1,4   2,5   3,6   4,7
  0,4   1,5   2,6   3,7
  0,5   1,6   2,7
  0,6   1,7
  0,7

System overview
1. supertagging
2. enumeration of assignments
3. deterministic disambiguation
[figure: the sentence "Mary loved John", with candidate HPSG lexical signs (feature structures with HEAD, SUBJ and COMPS features) assigned to each word by the supertagger and narrowed down by disambiguation]

Demonstration
http://www-tsujii.is.s.u-tokyo.ac.jp/enju/demo.html

Results
Fast, robust and accurate analysis:
- phrase structures
- predicate-argument structures
Accurate deep analysis: the parser outputs both phrase structures and predicate-argument structures. The accuracy of predicate-argument relations is around 90% for newswire articles and biomedical papers.
High speed: parsing takes less than 500 msec per sentence by default (faster than most Penn Treebank parsers), and less than 50 msec with the high-speed setting ("mogura").
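The CKY cell layout above can be sketched in a few lines of Python. This is only an illustration of the chart indexing (variable names are invented for the example, not taken from the Enju implementation): each cell (i, j) covers words w(i+1)..w(j), and a CKY-style parser fills the cells in order of increasing span length.

```python
# Enumerate the CKY cells for a 7-word sentence, in the order a
# chart parser fills them (shorter spans first).
words = ["I", "saw", "a", "girl", "with", "a", "telescope"]
n = len(words)

chart = {}
for length in range(1, n + 1):          # span length
    for i in range(0, n - length + 1):  # span start
        j = i + length                  # span end
        chart[(i, j)] = []              # edges (signs) found for this span
        # a real parser would combine the contents of chart[(i, k)] and
        # chart[(k, j)] for every split point k, using the grammar rules

print(len(chart))        # 28 cells: n*(n+1)//2 for n = 7
print(sorted(chart)[0])  # (0, 1), the cell covering the word "I"
```

The triangular shape of the table in the slide corresponds exactly to this enumeration: row r of the table lists all cells of span length r.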
C&C tools
- developed by Curran and Clark [Clark and Curran, 2002; Curran, Clark and Bos, 2007], University of Edinburgh
- a wide-coverage statistical parser based on CCG: the CCG parser
- a computational-semantics tool named Boxer
- useful links:
  http://svn.ask.it.usyd.edu.au/trac/candc
  http://svn.ask.it.usyd.edu.au/trac/candc/wiki/Demo

CCG parser [Clark, 2007]
Statistical parsing and CCG: advantages of CCG
- it provides a compositional semantics for the grammar
  → a completely transparent interface between syntax and semantics
- the recovery of long-range dependencies can be integrated into the parsing process in a straightforward manner

Parsing method
- Penn Treebank conversions exist for TAG, LFG, HPSG and CCG.
- CCGBank [Hockenmaier and Steedman, 2007]: the CCG version of the Penn Treebank
- Grammar used in the CCG parser:
  - CCGBank
  - some rules used as the grammar
  - the lexical category set
  - training data for the statistical models (supertagger, parser)

Parsing method: CCGBank
A corpus translated from the Penn Treebank; CCGBank contains
- syntactic derivations
- word-word dependencies
- predicate-argument structures
It is a semi-automatic conversion of the phrase-structure trees in the Penn Treebank into CCG derivations, and consists mainly of newspaper texts.
Grammar:
- lexical category set
- combinatory rules
- unary type-changing rules
- normal-form constraints
- punctuation rules

Parsing method: supertagging [Clark, 2002]
- uses conditional maximum entropy models
- implements a maximum entropy supertagger
[figure: supertagging example for a French sentence; each word of "... tout commentaire sur cette proposition et préfère avancer les ..." is assigned a POS tag (ADV, NOM, PRP, DET:ART, VER:pres, ...) and a set of candidate CCG lexical categories such as np/n, (n\n)/np, (np\s)/np]

Parsing method: supertagger
- a set of 425 lexical categories from CCGBank
- the per-word accuracy of the supertagger is around 92% on unseen WSJ text
  → using the multi-supertagger (assigning more than one category per word) increases the accuracy significantly, to over 98%, at only a small cost in increased ambiguity

Parsing method: supertagger
Log-linear models in NLP applications:
- POS tagging
- named entity recognition
- chunking
- parsing
→ also referred to as maximum entropy models and random fields

Parsing method: supertagger
Log-linear parsing models for CCG:
1. the probability of a dependency structure
2. the normal-form model: the probability of a single derivation
→ modelling 2 is simpler than modelling 1

1 is defined as
  P(π|S) = Σ_{d ∈ Δ(π)} P(d, π|S)
2 is defined using a log-linear form:
  P(ω|S) = (1/Z_S) · e^{λ·f(ω)},   where   Z_S = Σ_{ω′ ∈ ρ(S)} e^{λ·f(ω′)}

Features common to the dependency and normal-form models:

  Feature type             Example
  LexCat + Word            (S/S)/NP + Before
  LexCat + POS             (S/S)/NP + IN
  RootCat                  S[dcl]
  RootCat + Word           S[dcl] + was
  RootCat + POS            S[dcl] + VBD
  Rule                     S[dcl] → NP S[dcl]\NP
  Rule + Word              S[dcl] → NP S[dcl]\NP + bought
  Rule + POS               S[dcl] → NP S[dcl]\NP + VBD

Predicate-argument dependency features for the dependency model:

  Feature type             Example
  Word-Word                ⟨bought, (S\NP1)/NP2, 2, stake, (NP\NP)/(S[dcl]/NP)⟩
  Word-POS                 ⟨bought, (S\NP1)/NP2, 2, NN, (NP\NP)/(S[dcl]/NP)⟩
  POS-Word                 ⟨VBD, (S\NP1)/NP2, 2, stake, (NP\NP)/(S[dcl]/NP)⟩
  POS-POS                  ⟨VBD, (S\NP1)/NP2, 2, NN, (NP\NP)/(S[dcl]/NP)⟩
  Word + Distance(words)   ⟨bought, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 2
  Word + Distance(punct)   ⟨bought, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 0
  Word + Distance(verbs)   ⟨bought, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 0
  POS + Distance(words)    ⟨VBD, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 2
  POS + Distance(punct)    ⟨VBD, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 0
  POS + Distance(verbs)    ⟨VBD, (S\NP1)/NP2, 2, (NP\NP)/(S[dcl]/NP)⟩ + 0

Rule dependency features for the normal-form model:

  Feature type             Example
  Word-Word                ⟨company, S[dcl] → NP S[dcl]\NP, bought⟩
  Word-POS                 ⟨company, S[dcl] → NP S[dcl]\NP, VBD⟩
  POS-Word                 ⟨NN, S[dcl] → NP S[dcl]\NP, bought⟩
  POS-POS                  ⟨NN, S[dcl] → NP S[dcl]\NP, VBD⟩
  Word + Distance(words)   ⟨bought, S[dcl] → NP S[dcl]\NP⟩ + > 2
  Word + Distance(punct)   ⟨bought, S[dcl] → NP S[dcl]\NP⟩ + 2
  Word + Distance(verbs)   ⟨bought, S[dcl] → NP S[dcl]\NP⟩ + 0
  POS + Distance(words)    ⟨VBD, S[dcl] → NP S[dcl]\NP⟩ + > 2
  POS + Distance(punct)    ⟨VBD, S[dcl] → NP S[dcl]\NP⟩ + 2
  POS + Distance(verbs)    ⟨VBD, S[dcl] → NP S[dcl]\NP⟩ + 0

Description of the parser
[figure: system pipeline; the input sentence is processed by the C&C taggers (POS tagger, chunker, supertagger) and the parser, whose models are trained on CCGBank, and Boxer produces the semantic output]

Demonstration
http://svn.ask.it.usyd.edu.au/trac/candc/wiki/Demo

Results
Supertagger ambiguity and accuracy on section 00:

  β      k    CATS/WORD   ACC     SENT ACC   ACC (POS)   SENT ACC (POS)
  0.075  20   1.27        97.34   67.43      96.34       60.27
  0.030  20   1.43        97.92   72.87      97.05       65.50
  0.010  20   1.72        98.37   77.73      97.63       70.52
  0.005  20   1.98        98.52   79.25      97.86       72.24
  0.001  150  3.57        99.17   87.19      98.66       80.24

Results
Parsing accuracy on DepBank
DepBank: the PARC Dependency Bank [King et al., 2003]

                 CCG parser              CCGbank
  Relation       Prec   Rec    F         Prec   Rec    F        # GRs
  dependent      84.07  82.19  83.12     88.83  84.19  86.44    10,696
  aux            95.03  90.75  92.84     96.47  90.33  93.30       400
  conj           79.02  75.97  77.46     83.07  80.27  81.65       595
  ta             51.52  11.64  18.99     62.07  12.59  20.93       292
  det            95.23  94.97  95.10     97.27  94.09  95.66     1,114
  arg_mod        81.46  81.76  81.61     86.75  84.19  85.45     8,295
  mod            71.30  77.23  74.14     77.83  79.65  78.73     3,908
  ncmod          73.36  78.96  76.05     78.88  80.64  79.75     3,550
  xmod           42.67  53.93  47.64     56.54  60.67  58.54       178
  cmod           51.34  57.14  54.08     64.77  69.09  66.86       168
  pmod            0.00   0.00   0.00      0.00   0.00   0.00        12
  arg            85.76  80.01  82.78     89.79  82.91  86.21     4,387
  subj_or_dobj   86.08  83.08  84.56     91.01  85.29  88.06     3,127
  subj           84.08  75.57  79.60     89.07  78.43  83.41     1,363
  ncsubj         83.89  75.78  79.63     88.86  78.51  83.37     1,354
  xsubj           0.00   0.00   0.00     50.00  28.57  36.36         7
  csubj           0.00   0.00   0.00      0.00   0.00   0.00         2
  comp           86.16  81.71  83.88     89.92  84.74  87.25     3,024
  obj            86.30  83.08  84.66     90.42  85.52  87.90     2,328
  dobj           87.01  88.44  87.71     92.11  90.32  91.21     1,764
  obj2           68.42  65.00  66.67     66.67  60.00  63.16        20
  iobj           83.22  65.63  73.38     83.59  69.81  76.08       544
  clausal        77.67  72.47  74.98     80.35  77.54  78.92       672
  xcomp          77.69  74.02  75.81     80.00  78.49  79.24       381
  ccomp          77.27  70.10  73.51     80.81  76.31  78.49       291
  pcomp           0.00   0.00   0.00      0.00   0.00   0.00        24
  macroaverage   65.71  62.29  63.95     71.73  65.85  68.67
  microaverage   81.95  80.35  81.14     86.86  82.75  84.76
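The F column in the table above is the harmonic mean of the precision and recall columns. A quick Python sketch reproducing one reported value (the function name is ours, not from the C&C tools):

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, as reported in the F column."""
    return 2 * precision * recall / (precision + recall)

# 'dependent' row, CCG-parser columns: Prec = 84.07, Rec = 82.19
print(round(f1(84.07, 82.19), 2))  # 83.12, matching the reported F
```

Small rounding differences against the table are possible for other rows, since the published scores are computed from raw dependency counts rather than from the rounded percentages.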