Joint Hebrew Segmentation and Parsing using a PCFG-LA Lattice Parser

Yoav Goldberg and Michael Elhadad
Ben Gurion University


Abstract

We extend a lattice parsing methodology for Hebrew (Goldberg and Tsarfaty, 2008; Goldberg et al., 2009) to use a stronger syntactic model: the PCFG-LA Berkeley Parser. The methodology proves very effective: using a small training set of about 5,500 trees, we construct a parser that jointly segments and parses unsegmented Hebrew text with an F-score of almost 80%, an error reduction of over 20% relative to the best previous result for this task. This result indicates that lattice parsing with the Berkeley Parser is an effective methodology for parsing over uncertain inputs.
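
The "error reduction" figure quoted above is presumably the usual relative measure: the share of the baseline's remaining error (100 minus its F-score) that the new system removes. The sketch below shows that arithmetic only. The function name and the baseline value of 75.0 are illustrative assumptions chosen to make the calculation easy to follow; they are not figures taken from the paper.

```python
def relative_error_reduction(baseline_f: float, new_f: float) -> float:
    """Return the fraction of the baseline's error (100 - F) removed by the new system.

    Both arguments are F-scores on a 0-100 scale.
    """
    baseline_error = 100.0 - baseline_f
    if baseline_error <= 0:
        raise ValueError("baseline F-score must be below 100")
    return (new_f - baseline_f) / baseline_error


# Hypothetical numbers for illustration only (NOT the paper's reported scores):
# moving from F = 75.0 to F = 80.0 removes 5 of the 25 remaining error points.
reduction = relative_error_reduction(75.0, 80.0)
print(f"relative error reduction: {reduction:.0%}")
```

Under this definition, a new F-score near 80% combined with an error reduction above 20% implies that the earlier best result was below 75%.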




Full paper: http://www.aclweb.org/anthology/P/P11/P11-2124.pdf