Automatic processing of Arabic is challenging for a number of reasons. First, Arabic words are morphologically rich. Second, undiacritized Arabic words are highly ambiguous. This is why morphological analysis, and diacritization in particular, is vital for Arabic Natural Language Processing applications.
Cooking recipes exist in abundance, but because of their unstructured text format they are hard to study quantitatively beyond treating them as simple bags of words. In this paper, we propose an ingredient-instruction dependency tree data structure to represent recipes.
The goal of this work is to build speech-based search engines for low-resource languages. There are several challenges in building such engines; this project focuses on two of them: mitigating the verbosity of spoken queries, and using speech processing methods that do not require a language model.