Cooking recipes exist in abundance, but because they are written as unstructured text, they are hard to study quantitatively beyond treating them as simple bags of words. In this paper, we proposed an ingredient-instruction dependency tree data structure to represent recipes. The proposed representation allows for more refined comparison of recipes and recipe parts, and is a step towards a semantic representation of recipes. Furthermore, we built a parser that maps recipes into the proposed representation. The parser’s edge prediction accuracy of 93.5% improved over a strong baseline of 85.7% (a 54.5% relative error reduction).
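
As a rough illustration of the representation and the reported metric, the sketch below models a recipe as a tree whose leaves are ingredients and whose internal nodes are instructions, then scores a predicted set of edges against a gold set. The names `Node`, `edges`, `edge_accuracy`, and `relative_error_reduction`, the toy recipe, and the (child, parent) edge convention are all assumptions made for this example; they are not taken from the paper or from the SIMMR corpus format.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A recipe node: an ingredient (leaf) or an instruction (hypothetical schema)."""
    label: str
    kind: str  # "ingredient" or "instruction"; labels assumed for illustration
    children: list["Node"] = field(default_factory=list)


def edges(root: Node) -> set[tuple[str, str]]:
    """Collect (child, parent) label pairs for every edge under `root`."""
    found = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.children:
            found.add((child.label, node.label))
            stack.append(child)
    return found


def edge_accuracy(predicted: set, gold: set) -> float:
    """Fraction of gold edges that also appear in the predicted edge set."""
    return len(predicted & gold) / len(gold)


def relative_error_reduction(baseline_acc: float, system_acc: float) -> float:
    """Share of the baseline's error that the system removes."""
    return (system_acc - baseline_acc) / (1.0 - baseline_acc)


# Toy gold tree: flour and eggs feed "mix", whose output feeds "bake".
gold_tree = Node("bake", "instruction", [
    Node("mix", "instruction", [
        Node("flour", "ingredient"),
        Node("eggs", "ingredient"),
    ]),
])

# Toy prediction: eggs are wrongly attached to "bake" instead of "mix".
predicted_tree = Node("bake", "instruction", [
    Node("mix", "instruction", [Node("flour", "ingredient")]),
    Node("eggs", "ingredient"),
])

toy_accuracy = edge_accuracy(edges(predicted_tree), edges(gold_tree))

# Accuracies quoted in the abstract: 85.7% baseline, 93.5% parser.
reported_reduction = relative_error_reduction(0.857, 0.935)
```

In this toy case two of the three gold edges are recovered. Applying the same error-reduction formula to the accuracies quoted above gives (0.935 - 0.857) / (1 - 0.857), roughly 0.545, which matches the 54.5% figure.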


Jermsurawong, Jermsak and Nizar Habash. Predicting the Structure of Cooking Recipes. In Proceedings of EMNLP, Lisbon, 2015.


  • Jermsak Jermsurawong
  • Nizar Habash

Request a Copy

To get a copy of the SIMMR corpus, please contact Nizar Habash.