Student Researcher Asks, "Why Can't Robots Cook Like Grandma?"

For his Capstone research project, Mick Jermsurawong (Class of 2015) wanted to find out if it's possible for computers to understand expressive information from recipes.

Computers can do a lot of remarkable things these days. Understanding a recipe, however, isn't among them. Recipes, as they're commonly known, are simply lists of measurable ingredients followed by step-by-step cooking instructions. They are relatively easy for humans to follow and, under normal circumstances, we are able to predict the outcome of combining specific ingredients and exposing them to heat or cold. So why can't computers?

Unlike us, computers can't interpret meaning from an ingredient list, or anticipate what will happen when you mix eggs with milk, or place muffin batter into the oven at 425 degrees.

To machines, recipes are nothing more than a collection of meaningless characters, like text dropped into Microsoft Word. The program doesn't know if you've written the world's most insightful poem or next week's shopping list. The same is true for a computer and a recipe.

For his final project at NYU Abu Dhabi, Jermsurawong set out to determine whether, and how, computers could be programmed to understand expressive information such as that contained in a recipe.

"The Ingredient Tree is a machine learning project," said the computer science graduate. "First, we represent the recipe using a flow chart model." (For ease of understanding, he said, imagine the flow chart as a tree whose leaves are the ingredients and whose root is the final cooked product.) "Then, we take text from a recipe, annotated to mirror the flow chart representation, and ask what insights a computer can extract from the labeled text."

A presentation slide shows a flowchart representation of The Ingredient Tree, a Capstone computer science project by Mick Jermsurawong, who graduated in May.

"The knowledge gained by the computer can help it automate construction of a new flow chart when given text recipes it hasn't seen before. It's an expressive representation," he explained further, "that goes beyond a simple list of words to describe ingredients and instructions. The structure shows how ingredients are organized along the cooking process."
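
As a rough illustration of the tree idea described above, here is a minimal Python sketch. It is not the project's actual model or code: the Node class, its fields, and the muffin example are all assumptions made up for this illustration. Raw ingredients sit at the leaves, each intermediate mixture is an internal node, and the finished dish is the root.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node in a hypothetical ingredient tree (illustrative only)."""
    label: str                      # ingredient, mixture, or final dish
    action: str = ""                # assumed field: the step that produced this node
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # A node with no children is a raw ingredient.
        return not self.children


def leaves(node: Node) -> List[str]:
    """Collect the raw ingredients under a node, left to right."""
    if node.is_leaf():
        return [node.label]
    found: List[str] = []
    for child in node.children:
        found.extend(leaves(child))
    return found


# A made-up muffin recipe, not taken from the project's data.
wet = Node("wet mix", "whisk", [Node("eggs"), Node("milk")])
dry = Node("dry mix", "combine", [Node("flour"), Node("sugar")])
batter = Node("batter", "fold", [wet, dry])
muffins = Node("muffins", "bake", [batter])

print(leaves(muffins))
```

Walking from any leaf down to the root retraces the steps that ingredient goes through, which is the structural information a flat ingredient list leaves out.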

Inspiration for the project, he said, came in part from his brother, who is a chef.

"As a shorthand for writing recipes, he would sometimes bunch ingredients together: A + B = C and so on. This additive combination of ingredient subgroups is the underlying structure for how he understands recipes. The way ingredients are combined then becomes a secondary detail that he can draw from in the future to cook the same recipe again."

Jermsurawong said that, based on the flow chart model, a computer could also learn to differentiate between Thai and Italian cuisine. For example, do Thai recipes use meat and vegetables more often, and much earlier in the cooking process, than typical Italian recipes? In what ways are pad thai and spaghetti bolognese similar?
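
One way such a question might be posed to a tree is sketched below in Python. This is an assumption-laden toy, not the method used in the project: the nested-tuple format, the leaf_depths function, and both recipes are invented for illustration. The idea is that an ingredient more steps away from the finished dish entered the cooking process earlier.

```python
from typing import Dict, Union

# Assumed toy format: a tree is either a raw ingredient (a string)
# or a (step_name, [subtrees]) pair. The outermost step is the root.
Tree = Union[str, tuple]


def leaf_depths(tree: Tree, depth: int = 0) -> Dict[str, int]:
    """Map each raw ingredient to its number of steps from the finished dish.

    If an ingredient appears twice, the later occurrence overwrites the earlier one.
    """
    if isinstance(tree, str):
        return {tree: depth}
    _step, children = tree
    depths: Dict[str, int] = {}
    for child in children:
        depths.update(leaf_depths(child, depth + 1))
    return depths


# Two invented recipes, deliberately not tied to any real cuisine.
recipe_a = ("serve", [("simmer", [("fry", ["meat", "garlic"]), "stock"]), "herbs"])
recipe_b = ("serve", [("mix", ["meat", ("simmer", ["tomato", "onion"])])])

depths_a = leaf_depths(recipe_a)
depths_b = leaf_depths(recipe_b)

# A larger number means the ingredient was added earlier in the process.
print("meat in recipe A:", depths_a["meat"])
print("meat in recipe B:", depths_b["meat"])
```

Averaging such depths over many recipes from each cuisine would be one simple, hypothetical way to compare when ingredients tend to be introduced.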

But there are many challenges to overcome, and a lot more research is required, such as applying the current work to larger and more diverse sets of recipes or even improving on the model itself. This project, he stressed, is only a small stepping stone toward computers being able to process expressive information from recipes in a more rigorous way.

Even so, The Ingredient Tree is already gaining international attention. The Capstone paper, written during Jermsurawong's senior year at NYUAD, has been accepted into the 2015 Conference on Empirical Methods in Natural Language Processing, to be held in Portugal in September. It is one of just 312 computer science research projects added to the conference program from more than 1,300 submissions.