Ontological Lexicon Enrichment: The Badea System For Semi-Automated Extraction Of Antonymy Relations From Arabic Language Corpora
Main Article Content
Abstract
language processing tools and applications; however, they are expensive to build, maintain, and extend. In this paper, we present the Badea system for the semi-automated extraction of lexical relations, specifically antonyms using a pattern-based approach to support the task of ontological lexicon enrichment. The approach is based on an ontology of “seed” pairs of antonyms in the Arabic language; we identify patterns in which the pairs occur and then use the patterns identified to find new antonym pairs in an Arabic textual corpora. Experiments are conducted on Badea using texts from three Arabic textual corpuses: KSUCCA, KACSTAC, and CAC. The system is evaluated and the patterns’ reliability and system performance is measured. The results from our experiments on the three Arabic corpora show that the pattern-based approach can be useful in the ontological enrichment task, as the evaluation of the system resulted in the ontology being updated with over 300 new antonym pairs, thereby enriching the lexicon and increasing its size by over 400%. Moreover, the results show important findings on the reliability of patterns in extracting antonyms for Arabic. The Badea system will facilitate the enrichment of ontological lexicons that can be very useful in any Arabic natural language processing system that requires semantic relation extraction.