Defending your voice against deepfakes

Recent advances in generative artificial intelligence have spurred developments in realistic speech synthesis. While this technology can potentially improve lives through personalized voice assistants and accessibility-enhancing communication tools, it has also led to deepfakes, in which synthesized speech is misused to deceive humans and machines for nefarious purposes.

Overview of how AntiFake works. Image credit: Ning Zhang

In response to this evolving threat, Ning Zhang, an assistant professor of computer science and engineering at the McKelvey School of Engineering at Washington University in St. Louis, developed a tool called AntiFake, a novel defense mechanism designed to thwart unauthorized speech synthesis before it happens. Zhang presented AntiFake at the Association for Computing Machinery’s Computer and Communications Security Conference in Copenhagen, Denmark.

Unlike traditional deepfake detection methods, which are used to evaluate and uncover synthetic audio as a post-attack mitigation tool, AntiFake takes a proactive stance. It employs adversarial techniques to prevent the synthesis of deceptive speech by making it more difficult for AI tools to extract the characteristics they need from voice recordings. The code is freely available to users.

“AntiFake makes sure that when we put voice data out there, it’s hard for criminals to use that information to synthesize our voices and impersonate us,” Zhang said. “The tool uses a technique of adversarial AI that was originally part of the cybercriminals’ toolbox, but now we’re using it to defend against them. We mess up the recorded audio signal just a little bit, distort or perturb it just enough that it still sounds right to human listeners, but it’s completely different to AI.”
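The idea Zhang describes, a change small enough to go unnoticed by listeners but large from a model's point of view, can be illustrated with a toy sketch. This is not AntiFake's code or its actual method. It assumes a made-up linear "speaker encoder" (a fixed random matrix) in place of a real synthesis model, and uses a simple per-sample bound (`epsilon`) as a crude stand-in for "still sounds right to human listeners". All function names and parameters here are hypothetical.

```python
import math
import random


def embed(weights, signal):
    """Toy 'speaker encoder' (an assumption): a fixed linear map to an embedding."""
    return [sum(w * s for w, s in zip(row, signal)) for row in weights]


def squared_distance(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def perturb(weights, signal, epsilon=0.01, steps=20, step_size=0.002, seed=0):
    """Projected sign-gradient ascent on the embedding distance.

    Every sample of the result stays within +/- epsilon of the original.
    """
    rng = random.Random(seed)
    # Start from a small random offset: the gradient is exactly zero at delta = 0.
    delta = [rng.uniform(-epsilon, epsilon) for _ in signal]
    for _ in range(steps):
        # For a linear encoder, embed(x + delta) - embed(x) equals W @ delta.
        shift = embed(weights, delta)
        # Gradient of ||W @ delta||^2 with respect to delta is 2 * W^T @ (W @ delta).
        grad = [
            2 * sum(weights[i][j] * shift[i] for i in range(len(weights)))
            for j in range(len(signal))
        ]
        # Step along the gradient sign, then clip back into the allowed range.
        delta = [
            max(-epsilon, min(epsilon, d + step_size * math.copysign(1.0, g)))
            for d, g in zip(delta, grad)
        ]
    return [s + d for s, d in zip(signal, delta)]


# Hypothetical demo data: a short sine wave and a random 8-dimensional encoder.
rng = random.Random(42)
n_samples, n_dims = 64, 8
weights = [[rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(n_dims)]
signal = [math.sin(2 * math.pi * 5 * t / n_samples) for t in range(n_samples)]

protected = perturb(weights, signal)
max_change = max(abs(p - s) for p, s in zip(protected, signal))
shift = squared_distance(embed(weights, protected), embed(weights, signal))
print(f"largest per-sample change: {max_change:.4f}")
print(f"squared embedding shift:   {shift:.4f}")
```

The waveform changes by at most `epsilon` per sample, while the optimization pushes the toy embedding as far from the original as that budget allows. A real system would have to work against nonlinear speech models and would need a perceptual measure of audibility rather than a flat per-sample limit.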

To ensure AntiFake can stand up against an ever-changing landscape of potential attackers and unknown synthesis models, Zhang and first author Zhiyuan Yu, a graduate student in Zhang’s lab, built the tool to be generalizable and tested it against five state-of-the-art speech synthesizers. AntiFake achieved a protection rate of over 95%, even against unseen commercial synthesizers. They also tested AntiFake’s usability with 24 human participants to confirm the tool is accessible to diverse populations.

Currently, AntiFake can protect short clips of speech, aiming for the most common type of voice impersonation. But, Zhang said, nothing can stop this tool from being expanded to protect longer recordings or music in the ongoing fight against disinformation.

“Eventually, we want to be able to protect voice recordings fully,” Zhang said. “While I don’t know what will be next in AI voice tech — new tools and features are being developed all the time — I do think our strategy of turning adversaries’ techniques against them will continue to be effective. AI remains vulnerable to adversarial perturbations, even if the engineering specifics may need to shift to maintain this as a winning strategy.”

Source: Washington University in St. Louis