Polyglot NER: Massive Multilingual Named Entity Recognition

Abstract

We build a Named Entity Recognition (NER) system for 40 languages using only language-agnostic methods. Our system relies only on unsupervised methods for feature generation. We obtain training data for the NER task through a semi-supervised technique that does not rely on any language-specific or orthographic features. This approach allows us to scale to a large set of languages for which little human expertise and human-annotated training data are available.

Publication
In Proceedings of the 2015 SIAM International Conference on Data Mining (SDM 2015)