After the Turing Test
Nearly 100 years ago, a Czech science fiction play called R.U.R. (in which robots overthrow humans) first suggested the threat of an artificial intelligence (AI) takeover. Now that AI technologies control everything from phones to cars, the fear of a coup is more pertinent than ever. But whether you're concerned about a robot rebellion or simply want to monitor the evolution of AI, there is one common question: How can you actually tell if a machine has developed a human-like level of awareness?
In 1950, British mathematician Alan Turing proposed a hypothetical test to measure computer intelligence. "The Imitation Game" involved two respondents, one human and one machine, along with a human judge, who would communicate with each through a separate terminal. The judge would ask the two contestants questions for five minutes and, based on their answers, guess which was the human. The test would then be repeated. If the judge mistook the machine for a human more than 30% of the time, the computer was determined to have AI. Turing surmised that it would take about 50 years before computers had enough capacity to pass the test.
Many researchers now believe that the Turing Test was inherently flawed, because it relied too heavily on natural language skills and deception, issues that led to controversy in June 2014, when a chatbot called Eugene Goostman passed the Turing Test by tricking judges into believing that it was a 13-year-old Ukrainian boy. To address these and other concerns, the AI community has proposed alternatives to the classic, but antiquated, Turing Test.
One proposal that's gaining traction is the Winograd Schema Challenge, created by Hector Levesque, a computer science professor at the University of Toronto. The challenge requires an understanding of both language and common-sense knowledge to pass. Here's an example:
Statement: The trophy would not fit in the brown suitcase because it was too big.
Q: What was too big? The suitcase, or the trophy?
An AI might lack the everyday human experience needed to determine which object "it" refers to; a person knows immediately that the trophy, not the suitcase, was too big.
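What makes such schemas hard for machines is that the two possible answers flip when a single word changes, so surface statistics alone cannot resolve the pronoun. A minimal sketch (the function and its hand-coded rule are hypothetical, for illustration only) of the trophy/suitcase pair:

```python
# A Winograd schema comes in matched halves: swapping one "special"
# word (big/small) flips which noun the pronoun "it" refers to,
# so the answer requires common sense, not word statistics.

def schema(special_word):
    """Build one half of the trophy/suitcase schema and its answer."""
    sentence = (f"The trophy would not fit in the brown suitcase "
                f"because it was too {special_word}.")
    # Common-sense physics, hand-coded here for illustration only:
    # something too BIG is the object being fitted (the trophy);
    # something too SMALL is the container (the suitcase).
    referent = "trophy" if special_word == "big" else "suitcase"
    return sentence, referent

for word in ("big", "small"):
    sentence, referent = schema(word)
    print(f'{sentence} -> "it" = the {referent}')
```

A system that merely memorizes word co-occurrences answers both halves the same way and scores no better than chance across the pair.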
The first Winograd Schema Challenge, to be held in February 2016, offers a $25,000 prize for anyone whose AI solves the test.
Another modification to Turing's test is Lovelace 2.0, created by Georgia Tech professor Mark Riedl. Lovelace 2.0 focuses on creativity: The computer must follow a specific instruction, like rendering a picture of a pink ghost emerging from a square pumpkin. To pass, the AI would have to understand both individual words and hidden assumptions in the request. Riedl believes that many creative tasks, like creating problems, or even stacking LEGO bricks to make a shape, are at the nexus of our uniquely human intelligence.
But the best test, perhaps, was suggested by NYU cognitive scientist Gary Marcus; it requires computers to identify and interpret humor. As Marcus wrote in the New Yorker's Elements blog, "No existing program ... can currently come close to doing what any bright, real teenager can do: Watch an episode of The Simpsons, and tell us when to laugh."