Programming Prejudice
Has humanity accidentally programmed its own bias in machines?
By Naser Al Wasmi, NYU Abu Dhabi Public Affairs
Nizar Habash was attending a presentation by a leading computational linguistics researcher at a major IT company when he realized the industry was unintentionally programming bias. The presentation showed a system capable of translating entire webpages from Arabic to English while preserving their formatting. The demo ran smoothly at first, showing a mirror image of an Arabic news article in fluent English – quite impressive, since this was years before Google Translate became a household name – except that Habash spotted a puzzling detail that went unnoticed by everyone else in the room.
In English, the headline read: “Second Palestinian Suicide Bombing in Baghdad Today,” prompting the researcher running the demo to announce, “Isn’t that a beautifully fluent sentence?” But Habash, a Palestinian who was born in Baghdad but grew up all over the Middle East, was confused – not by the program, which he admits was impressive, but by the content it produced.
“There was not a single mention of the word 'Palestinian' in the Arabic article anywhere. It was inserted statistically by the program. The word 'Palestinian' and the phrase 'suicide bombing' occurred with such frequency in the news that the machine bet on including it because it increases the probability that the sentence is correct. That’s when I realized that the machine translation system successfully modeled human bias,” he said.
At the time, in the burgeoning field of computational linguistics, the industry called those mistakes hallucinations. Today, they’re known as bias.
Habash, the program head of computer science at NYU Abu Dhabi, who works on machine translation and Arabic language processing, has noticed that in attempting to create artificial intelligence modeled on human intelligence, the programming world has embedded prejudice, knowingly or not.
He said that machine learning in translation programs is based on frequency of occurrence. Translation systems learn patterns from sequences of words extracted from websites published in multiple languages. In this case, the machine’s proverbial Rosetta Stone is multilingual news sites, which inherently carry their own slant, or bias.
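To see how frequency alone can smuggle a word into a translation, consider a deliberately tiny sketch in Python – nothing like Habash’s system or any production engine. A bigram language model estimated from a small, slanted “news corpus” will prefer the candidate output containing the statistically expected extra word, even though the source sentence never contained it. The corpus, candidates, and scoring below are illustrative assumptions.

```python
# Toy sketch: a bigram language model built from a small, slanted corpus.
# Scoring candidate translations by average log-probability shows how a
# frequent collocation can push the model toward an output that adds a
# word the source never contained.
import math
from collections import Counter

# Hypothetical "news corpus" in which one phrasing dominates.
corpus = (
    "second palestinian suicide bombing reported today . "
    "palestinian suicide bombing in the news again . "
    "second suicide bombing in baghdad today ."
).split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
vocab_size = len(unigrams)

def bigram_prob(w1, w2):
    # Add-one smoothing so unseen pairs still get a small probability.
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + vocab_size)

def avg_logprob(sentence):
    words = sentence.split()
    pairs = list(zip(words, words[1:]))
    return sum(math.log(bigram_prob(a, b)) for a, b in pairs) / len(pairs)

# Two candidate outputs: one faithful to the source, one with the
# statistically "expected" extra word. With this toy corpus, the version
# containing the extra word edges ahead.
candidates = [
    "second suicide bombing in baghdad today",
    "second palestinian suicide bombing in baghdad today",
]
for c in sorted(candidates, key=avg_logprob, reverse=True):
    print(f"{avg_logprob(c):.3f}  {c}")
```

Real statistical translation systems combine such language-model scores with many other signals, but the underlying pressure is the same: whatever the training data says most often tends to win.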
“We know the world is full of bias; and data is an even more biased version of the world because those in power have more control over data creation; and all of our machine learning models have been developed with an assumption that recreating the data is the goal. So, it is no surprise that we end up recreating and sometimes magnifying those biases,” Habash explained.
He said, she said
A former NYUAD undergraduate student who was learning Arabic approached Habash, saying that she was constantly being corrected by native speakers. Often, when she was stuck, she would turn to Google Translate to say something in Arabic, only to be told that she was using the masculine form. He encouraged her to start a Capstone Project on the problem, and it has since grown into a full-fledged research project.
They ran a simulation that tested Google Translate’s preference for gender when translating from English, a largely gender-neutral language, to Arabic, which uses a two-gender system in its grammar. They found that nine times out of 10, Google Translate would output the masculine form of a sentence.
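An audit along those lines can be sketched in a few lines of Python. Nothing here is the Capstone project’s actual code: the English probes, the accepted Arabic forms, and the stand-in translator (which simply mimics the reported masculine default) are illustrative assumptions. To run the audit for real, one would replace the stub with calls to whatever translation service is being tested.

```python
# Rough sketch of a gender-bias audit: feed gender-neutral English
# sentences to a translation function and tally whether the Arabic
# output uses the masculine or the feminine form.
from collections import Counter

# Gender-neutral English prompts, each paired with the masculine and
# feminine Arabic renderings we would accept for it.
PROBES = [
    ("I am a doctor",    {"m": "أنا طبيب",  "f": "أنا طبيبة"}),
    ("I am a teacher",   {"m": "أنا معلم",  "f": "أنا معلمة"}),
    ("I am an engineer", {"m": "أنا مهندس", "f": "أنا مهندسة"}),
    ("I am a nurse",     {"m": "أنا ممرض",  "f": "أنا ممرضة"}),
]

def masculine_default_stub(text):
    # Stand-in translator that always returns the masculine form,
    # mirroring the tendency the study reports. Replace with a call to
    # a real MT system to run the audit for real.
    for english, forms in PROBES:
        if english == text:
            return forms["m"]
    raise ValueError(f"no canned output for: {text}")

def audit(translate):
    # Count how often the system picks each gendered form.
    tally = Counter()
    for english, forms in PROBES:
        output = translate(english)
        if output == forms["m"]:
            tally["masculine"] += 1
        elif output == forms["f"]:
            tally["feminine"] += 1
        else:
            tally["other"] += 1
    return tally

if __name__ == "__main__":
    results = audit(masculine_default_stub)
    total = sum(results.values())
    print(results, f"masculine share: {results['masculine'] / total:.0%}")
```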
Habash gave another example he encountered in his research: the sentences in a large dataset used for building multilingual machine translation systems were expressed exclusively in the masculine voice, with a single exception – the phrase “I am divorced” was translated in the feminine form.
“There’s a built-in prejudice that needs to be addressed. Sociologists have been telling us this for years; it’s part of human society to begin with. But as we work on AI, we need to ask how we minimize these effects. Do we want AI to be like us, or a better us?” he said.
Over the last 20 years, as attention went into simply making translation programs work, Habash said, the issue of gender and racial bias was systematically pushed down the priority list. Now that we have entered what he calls the “Golden Age” of AI, he argues, more needs to be done to address it.
“There’s a certain degree of trust in the machine that can be quite dangerous if left unaccounted for,” he says, pointing at his phone. “These machines – we’re trying to make them as close to us as possible, but we may not want them to represent the bias that’s already inherent in the world. We still have to set policies on how they behave.”