> NLP
Constituency Parsing
2024/07/19
455 words
3 mins

 

Constituency parsing aims to extract a constituency-based parse tree from a sentence that represents its syntactic structure according to a phrase structure grammar.

Example

             Sentence (S)
                 |
   +-------------+------------+
   |                          |
 Noun (N)                Verb Phrase (VP)
   |                          |
 John                 +-------+--------+
                      |                |
                    Verb (V)         Noun (N)
                      |                |
                    sees              Bill
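
The same tree can be derived from a toy phrase structure grammar. Below is a minimal sketch, assuming NLTK is available; the grammar rules cover only this one example sentence and are not taken from any of the parsers listed further down.

```python
import nltk

# Toy phrase structure grammar covering only the example sentence.
grammar = nltk.CFG.fromstring("""
    S  -> N VP
    VP -> V N
    N  -> 'John' | 'Bill'
    V  -> 'sees'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("John sees Bill".split()):
    print(tree)          # (S (N John) (VP (V sees) (N Bill)))
    tree.pretty_print()  # ASCII rendering similar to the diagram above
```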

(Somewhat) recent approaches convert the parse tree into a sequence via a depth-first traversal so that sequence-to-sequence models can be applied to it. The linearized version of the above parse tree is: (S (N) (VP V N)).
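
As a concrete illustration, here is a minimal sketch of such a depth-first linearization, assuming the tree is given as nested (label, children) tuples; the representation and function name are illustrative, not taken from a specific paper.

```python
# Linearize a parse tree into a bracketed sequence via depth-first traversal.

def linearize(node):
    """Return a bracketed string for a subtree via depth-first traversal."""
    label, children = node
    if not children:                     # leaf: emit the word itself
        return label
    inner = " ".join(linearize(child) for child in children)
    return f"({label} {inner})"

tree = ("S", [
    ("N", [("John", [])]),
    ("VP", [("V", [("sees", [])]),
            ("N", [("Bill", [])])]),
])

print(linearize(tree))
# -> (S (N John) (VP (V sees) (N Bill)))
# Dropping the words yields the label-only sequence quoted above:
# (S (N) (VP V N))
```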


Penn Treebank

The Wall Street Journal section of the Penn Treebank is used to evaluate constituency parsers. Section 22 is used for development and Section 23 for evaluation. Models are evaluated based on F1 score. Most of the models listed below incorporate external data or features. For a comparison of single models trained only on WSJ, refer to Kitaev and Klein (2018).
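
The F1 in the table is labeled bracketing F1, conventionally computed with the EVALB tool. The sketch below is a simplified stand-in that scores constituents as (label, start, end) spans; the span format and the example spans are mine, and EVALB's special cases (punctuation handling, multiset matching, label equivalences) are ignored.

```python
# Simplified labeled bracketing F1 over (label, start, end) constituent spans.

def bracket_f1(gold_spans, pred_spans):
    gold, pred = set(gold_spans), set(pred_spans)
    matched = len(gold & pred)
    precision = matched / len(pred) if pred else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Spans for "John sees Bill": gold tree vs. a prediction that mislabels "John".
gold = [("S", 0, 3), ("N", 0, 1), ("VP", 1, 3), ("V", 1, 2), ("N", 2, 3)]
pred = [("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3), ("V", 1, 2), ("N", 2, 3)]
print(round(bracket_f1(gold, pred), 2))  # 0.8
```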

| Model | F1 Score | Paper / Source | Code |
| --- | --- | --- | --- |
| Span Attention + XLNet (Tian et al., 2020) | 96.40 | Improving Constituency Parsing with Span Attention | Code |
| Label Attention Layer + HPSG + XLNet (Mrini et al., 2020) | 96.38 | Rethinking Self-Attention: Towards Interpretability for Neural Parsing | Code |
| Attach-Juxtapose Parser + XLNet (Yang and Deng, 2020) | 96.34 | Strongly Incremental Constituency Parsing with Graph Neural Networks | Code |
| Head-Driven Phrase Structure Grammar Parsing (Joint) + XLNet (Zhou and Zhao, 2019) | 96.33 | Head-Driven Phrase Structure Grammar Parsing on Penn Treebank | |
| Head-Driven Phrase Structure Grammar Parsing (Joint) + BERT (Zhou and Zhao, 2019) | 95.84 | Head-Driven Phrase Structure Grammar Parsing on Penn Treebank | |
| CRF Parser + BERT (Zhang et al., 2020) | 95.69 | Fast and Accurate Neural CRF Constituency Parsing | Code |
| Self-attentive encoder + ELMo (Kitaev and Klein, 2018) | 95.13 | Constituency Parsing with a Self-Attentive Encoder | Code |
| Model combination (Fried et al., 2017) | 94.66 | Improving Neural Parsing by Disentangling Model Combination and Reranking Effects | |
| LSTM Encoder-Decoder + LSTM-LM (Takase et al., 2018) | 94.47 | Direct Output Connection for a High-Rank Language Model | |
| LSTM Encoder-Decoder + LSTM-LM (Suzuki et al., 2018) | 94.32 | An Empirical Study of Building a Strong Baseline for Constituency Parsing | |
| In-order (Liu and Zhang, 2017) | 94.2 | In-Order Transition-based Constituent Parsing | |
| CRF Parser (Zhang et al., 2020) | 94.12 | Fast and Accurate Neural CRF Constituency Parsing | Code |
| Semi-supervised LSTM-LM (Choe and Charniak, 2016) | 93.8 | Parsing as Language Modeling | |
| Stack-only RNNG (Kuncoro et al., 2017) | 93.6 | What Do Recurrent Neural Network Grammars Learn About Syntax? | |
| RNN Grammar (Dyer et al., 2016) | 93.3 | Recurrent Neural Network Grammars | |
| Transformer (Vaswani et al., 2017) | 92.7 | Attention Is All You Need | |
| Combining Constituent Parsers (Fossum and Knight, 2009) | 92.4 | Combining constituent parsers via parse selection or parse hybridization | |
| Semi-supervised LSTM (Vinyals et al., 2015) | 92.1 | Grammar as a Foreign Language | |
| Self-trained parser (McClosky et al., 2006) | 92.1 | Effective Self-Training for Parsing | |