Experiments in natural language processing often involve constructing samples from large lexical semantic resources, such as WordNet, Wiktionary, and OmegaWiki, for evaluation and training purposes. The two most frequently recurring tasks are the extraction of synsets and of the semantic relations between words. BabelNet is a resource that combines and interlinks all the main lexical resources, providing unified access to them. In this paper, we present BabelNet Extract, an open-source tool that addresses these two recurring extraction tasks effectively and in a parallelized manner on the large-scale multilingual BabelNet semantic network. The tool extracts individual word senses and the synsets they form, as well as the semantic relations established between the synsets. We present its architecture, describe the output format, and discuss use cases of the tool.
This entry is part of the university bibliography.