Parser Evaluation over Local and Non-Local Deep Dependencies in a Large Corpus
authors Emily Bender, Dan Flickinger, Stephan Oepen and Yi Zhang
venue Conference on Empirical Methods in Natural Language Processing
year 2011
abstract In order to obtain a fine-grained evaluation of parser accuracy over naturally occurring text, we study 100 examples each of ten reasonably frequent linguistic phenomena, randomly selected from a parsed version of the English Wikipedia. We construct a corresponding set of gold-standard target dependencies for these 1000 sentences, operationalize mappings to these targets from seven state-of-the-art parsers, and evaluate the parsers against this data to measure their level of success in identifying these dependencies.
