Researchers from NYU Abu Dhabi (NYUAD) have developed an Online Readability Leveled Arabic Thesaurus. The work was conducted by Associate Professor of Practice of Arabic Language Muhamed Al Khalil in collaboration with Professor of Computer Science Nizar Habash, who also leads the Computational Approaches to Modeling Language (CAMeL) Lab.
For any Arabic word a user enters, the one-of-a-kind interface provides the possible roots, English glosses, related Arabic words and phrases, and a readability level on a five-level scale. It also connects multiple existing Arabic resources and processing tools, enabling Arabic speakers and learners to benefit from recent advances in Arabic computational linguistics technologies.
The interface is one of the products of the NYUAD-funded project Simplification of Arabic Masterpieces for Extensive Reading (SAMER), and a demo version is available for public use.
A collaboration between NYUAD’s Arabic Studies Program and CAMeL Lab, SAMER seeks to create a standard for simplifying modern Arabic fiction for school-age learners and to use this standard to simplify a number of Arabic fiction masterpieces.
Established in September 2014, the CAMeL Lab has a mission of research and education in artificial intelligence, focusing specifically on natural language processing, computational linguistics, and data science. The lab’s main research areas are Arabic natural language processing, machine translation, text analytics, and dialogue systems.
The interface was presented as part of the International Conference on Computational Linguistics (COLING) 2020. The paper “A Large-Scale Leveled Readability Lexicon for Standard Arabic,” presented at the 12th Language Resources and Evaluation Conference in Marseille, France, provides further research background on the thesaurus.