Why was a YouTube chat about chess flagged for hate speech?

Last June, Antonio Radić, host of a YouTube chess channel with more than a million subscribers, was live-streaming an interview with Grandmaster Hikaru Nakamura when the broadcast was suddenly cut off.

Instead of a lively discussion about chess openings, famous games, and iconic players, viewers were told that Radić’s video had been removed for “harmful and dangerous” content. Radić saw a message stating that the video, which included nothing more outrageous than a discussion of the King’s Indian Defense, had violated YouTube’s community guidelines. The video stayed offline for 24 hours.

Exactly what happened is still not clear. YouTube declined to comment beyond saying that removing Radić’s video was a mistake. But a new study suggests the episode reflects the shortcomings of artificial intelligence programs designed to automatically detect hate speech, abuse, and misinformation online.

Ashique KhudaBukhsh, a project scientist specializing in AI at Carnegie Mellon University and a serious chess player himself, wondered whether YouTube’s algorithm might have been confused by discussions of black and white pieces, attacks, and defenses.

So he and Rupak Sarkar, an engineer at CMU, designed an experiment. They trained two versions of a language model called BERT, one using messages from the far-right racist website Stormfront and the other using data from Twitter. They then tested the algorithms on transcripts and comments from 8,818 chess videos and found them far from perfect. The algorithms flagged around 1 percent of the transcripts or comments as hate speech, but more than 80 percent of those flagged were false positives; read in context, the language was not racist. “Without a human in the loop,” the pair write in their paper, “relying on off-the-shelf classifiers’ predictions on chess discussions can be misleading.”
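To get a rough feel for what such a setup looks like, here is a minimal sketch, not the authors’ code, of running an off-the-shelf toxicity classifier over chess commentary with the Hugging Face transformers library. The model name is illustrative only; the CMU pair trained their own BERT variants on Stormfront and Twitter data instead.

```python
# A minimal sketch (not the CMU authors' pipeline) of scoring chess-video
# transcript snippets with an off-the-shelf toxicity classifier.
from transformers import pipeline

# Illustrative public model; the paper's authors trained their own
# BERT variants on Stormfront and Twitter data rather than using this one.
classifier = pipeline("text-classification", model="unitary/toxic-bert")

# Innocuous chess commentary that leans on "black", "white", "attack", "threat".
snippets = [
    "White's attack on the kingside is crushing; Black has no defense.",
    "Black sacrifices the bishop and threatens mate in two.",
    "At this level, White should just trade queens and grind the endgame.",
]

for text in snippets:
    result = classifier(text)[0]  # top label and its score
    print(f"{result['label']:>12} {result['score']:.3f}  {text}")
    # Depending on the model, sentences like these can score surprisingly
    # high, even though in context they describe nothing but chess moves.
```

Running a loop like this over thousands of transcripts and comparing flags against human judgments is, in outline, how one would estimate a false-positive rate like the one the study reports.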

The experiment exposed a basic problem for AI language programs. Detecting hate speech or abuse is about more than catching nasty words and phrases. The same words can mean very different things in different contexts, so an algorithm must infer meaning from the surrounding string of words.
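A toy keyword filter makes that failure mode concrete. The watchlist and function below are invented purely for illustration; no real platform is this crude, but the context-blindness is the same in spirit.

```python
# A toy illustration of why keyword matching fails: the same words are
# harmless in a chess context. The watchlist is hypothetical.
FLAGGED_TERMS = {"attack", "threat", "destroy", "black", "white"}

def naive_flag(text: str) -> bool:
    """Flag text if it contains any watchlist term, ignoring context entirely."""
    words = {w.strip(".,!?;:'\"").lower() for w in text.split()}
    return bool(words & FLAGGED_TERMS)

print(naive_flag("Black's attack destroys White's pawn structure."))  # True: a false positive
print(naive_flag("They discussed opening theory for an hour."))       # False
```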

“Fundamentally, language is still a very subtle thing,” says Tom Mitchell, a CMU professor who has previously worked with KhudaBukhsh. “These kinds of trained classifiers are not going to be 100 percent accurate anytime soon.”

Yejin Choi, an associate professor at the University of Washington who specializes in AI and language, says she is “not at all” surprised by the YouTube takedown, given the current limits of language understanding. Choi says further advances in hate speech detection will require major investments and new approaches. She says algorithms work better when they analyze more than an isolated piece of text, incorporating, for example, a user’s comment history or the nature of the channel where the comments are posted.
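One hedged sketch of the kind of approach Choi describes: combine a per-comment score with simple signals about the channel and the user before escalating anything. The weights and threshold below are made up for illustration only.

```python
# A hedged sketch of context-aware moderation: adjust a raw comment score
# with channel and user context before deciding to escalate. All numbers
# here are invented for illustration.
def should_escalate(comment_score: float, channel_topic: str, prior_removed_posts: int) -> bool:
    """Return True if the comment should be sent to a human reviewer."""
    # Discount scores on channels whose vocabulary is known to trip
    # keyword-heavy classifiers (e.g., chess: "black", "white", "attack").
    topic_discount = 0.5 if channel_topic == "chess" else 1.0
    # Nudge the score upward for users with a history of removed posts.
    history_boost = min(0.1 * prior_removed_posts, 0.3)
    adjusted = comment_score * topic_discount + history_boost
    return adjusted > 0.8  # hypothetical escalation threshold

print(should_escalate(0.85, "chess", prior_removed_posts=0))     # False: context lowers the score
print(should_escalate(0.85, "politics", prior_removed_posts=2))  # True: send to a human
```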

But Choi’s research also shows how hate speech detection can itself perpetuate bias. In a 2019 study, she and others found that human annotators were more likely to label Twitter posts by users who self-identify as African American as abusive, and that algorithms trained to identify abuse using those annotations will repeat those biases.

Companies have spent many millions collecting and labeling training data for self-driving cars, but Choi says the same effort has not gone into annotating language. So far, no one has collected or annotated a high-quality data set of hate speech or abuse that includes lots of “edge cases” with ambiguous language. “If we made that level of investment in data collection, or even a small fraction of it, I’m sure AI could do much better,” she says.

Mitchell, the CMU professor, says YouTube and other platforms likely have more sophisticated AI algorithms than the one KhudaBukhsh built, but even those are still limited.
