3 Questions: What you need to know about audio deepfakes

Audio deepfakes have had a recent bout of bad press after an artificial intelligence-generated robocall purporting to be the voice of Joe Biden hit up New Hampshire residents, urging them not to cast ballots. Meanwhile, spear-phishers — attackers who run phishing campaigns targeting a specific person or group, especially using information known to be of interest to the target — go fishing for money, and actors aim to preserve their audio likenesses.

What receives less press, however, are some of the uses of audio deepfakes that could actually benefit society. In this Q&A prepared for MIT News, postdoc Nauman Dawalatabad addresses concerns as well as potential upsides of the emerging tech. A fuller version of this interview can be seen in the video below.

Q: What ethical considerations justify the concealment of the source speaker’s identity in audio deepfakes, especially when this technology is used for creating innovative content?

A: Although the primary use of generative audio models is content creation in entertainment, for example, research into obscuring the identity of the source speaker raises real ethical considerations. Speech does not carry information only about who you are (identity) or what you are saying (content); it encapsulates a myriad of sensitive information, including age, gender, accent, current health, and even cues about future health conditions. For instance, our recent research paper on “Detecting Dementia from Long Neuropsychological Interviews” demonstrates the feasibility of detecting dementia from speech with considerably high accuracy. Moreover, multiple models can detect gender, accent, age, and other information from speech with very high accuracy. There is a need for technology that safeguards against the inadvertent disclosure of such private data. The endeavor to anonymize the source speaker’s identity is not merely a technical challenge but a moral obligation to preserve individual privacy in the digital age.

Q: How can we effectively maneuver through the challenges posed by audio deepfakes in spear-phishing attacks, taking into account the associated risks, the development of countermeasures, and the advancement of detection techniques?

A: The deployment of audio deepfakes in spear-phishing attacks introduces multiple risks, including the propagation of misinformation and fake news, identity theft, privacy infringements, and the malicious alteration of content. The recent circulation of deceptive robocalls in Massachusetts exemplifies the detrimental impact of such technology. We also recently spoke with The Boston Globe about this technology, and how easy and inexpensive it is to generate such deepfake audio.

Anyone without a significant technical background can easily generate such audio with multiple tools available online. Such fake news from deepfake generators can disturb financial markets and even electoral outcomes. The theft of one’s voice to access voice-operated bank accounts and the unauthorized utilization of one’s vocal identity for financial gain are reminders of the urgent need for robust countermeasures. Other risks include privacy violations, where an attacker can utilize a victim’s audio without their permission or consent. Attackers can also alter the content of the original audio, which can have a serious impact.

Two primary and prominent directions have emerged in designing systems to detect fake audio: artifact detection and liveness detection. When audio is generated by a generative model, the model introduces artifacts into the generated signal. Researchers design algorithms and models to detect these artifacts. However, this approach faces challenges due to the increasing sophistication of audio deepfake generators, and in the future we may see models that leave very small or almost no artifacts. Liveness detection, on the other hand, leverages the inherent qualities of natural speech, such as breathing patterns, intonations, or rhythms, which are challenging for AI models to replicate accurately. Some companies, like Pindrop, are developing such solutions for detecting audio fakes.
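To make the liveness-detection idea concrete, here is a toy Python sketch of one such natural-speech cue: real speech contains regular low-energy stretches (breaths and gaps between words), so a clip with almost no quiet samples can be one weak hint of synthesis. This is an illustrative heuristic only, not how Pindrop or any production system works; the signals and threshold are made up for the example.

```python
import math

def silence_ratio(samples, threshold=0.05):
    """Fraction of samples whose absolute amplitude falls below a small
    threshold. Natural speech pauses regularly (breaths, word gaps);
    fully synthetic audio sometimes does not."""
    quiet = sum(1 for s in samples if abs(s) < threshold)
    return quiet / len(samples)

# Toy "natural" clip: bursts of tone separated by silent pauses.
natural = []
for _ in range(5):
    natural += [0.8 * math.sin(0.3 * i) for i in range(200)]  # voiced burst
    natural += [0.0] * 50                                     # pause/breath

# Toy "synthetic" clip: a continuous tone with no pauses at all.
synthetic = [0.8 * math.sin(0.3 * i) for i in range(1250)]

print(silence_ratio(natural) > silence_ratio(synthetic))  # True
```

A real detector would combine many such cues — intonation contours, breathing rhythm, spectral artifacts — inside a trained model rather than rely on a single hand-set threshold.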

Additionally, strategies like audio watermarking serve as proactive defenses, embedding encrypted identifiers within the original audio to trace its origin and deter tampering. Despite other potential vulnerabilities, such as the risk of replay attacks, ongoing research and development in this arena offer promising solutions to mitigate the threats posed by audio deepfakes.
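As a rough illustration of the watermarking idea, the sketch below hides an 8-bit identifier in the least significant bits of the first eight 16-bit audio samples. Production schemes spread an encrypted identifier across the whole signal in ways that survive compression and resist removal; this minimal version only shows how an embedded mark can later be read back to trace an audio clip’s origin.

```python
def embed_watermark(samples, ident):
    """Write the 8 bits of `ident` into the least significant bits of
    the first 8 samples (16-bit integer PCM assumed)."""
    bits = [(ident >> i) & 1 for i in range(8)]
    marked = list(samples)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit  # clear the LSB, then set it to the bit
    return marked

def extract_watermark(samples):
    """Read the identifier back out of the first 8 samples' LSBs."""
    return sum((samples[i] & 1) << i for i in range(8))

audio = [1200, -845, 310, 77, -19, 5602, -4096, 733, 12, -7]  # toy PCM samples
marked = embed_watermark(audio, 0xA7)
print(extract_watermark(marked))  # prints 167 (0xA7)
```

Each sample changes by at most one quantization step, so the mark is inaudible — which also explains the replay-attack caveat above: a re-recording through a speaker and microphone destroys these low-order bits.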

Q: Despite their potential for misuse, what are some positive aspects and benefits of audio deepfake technology? How do you imagine the future relationship between AI and our experiences of audio perception will evolve?

A: Contrary to the predominant focus on the nefarious applications of audio deepfakes, the technology harbors immense potential for positive impact across various sectors. Beyond the realm of creativity, where voice conversion technologies enable unprecedented flexibility in entertainment and media, audio deepfakes hold transformative promise in the health care and education sectors. My current ongoing work in the anonymization of patient and doctor voices in cognitive health-care interviews, for instance, facilitates the sharing of crucial medical data for research globally while ensuring privacy. Sharing this data among researchers fosters development in the areas of cognitive health care. The application of this technology in voice restoration offers hope for individuals with speech impairments, for example, those with ALS or dysarthric speech, enhancing communication abilities and quality of life.

I am very positive about the future impact of audio generative AI models. The future interplay between AI and audio perception is poised for groundbreaking advancements, particularly through the lens of psychoacoustics — the study of how humans perceive sounds. Innovations in augmented and virtual reality, exemplified by devices like the Apple Vision Pro and others, are pushing the boundaries of audio experiences towards unparalleled realism. Recently, we have seen sophisticated new models appearing almost every month. This rapid pace of research and development promises not only to refine these technologies but also to expand their applications in ways that profoundly benefit society. Despite the inherent risks, the potential for audio generative AI models to revolutionize health care, entertainment, education, and beyond is a testament to the positive trajectory of this research field.

Audio Deepfakes Explained
Video: MIT CSAIL

Making the clean energy transition work for everyone

The clean energy transition is already underway, but how do we make sure it happens in a manner that is affordable, sustainable, and fair for everyone?

That was the overarching question at this year’s MIT Energy Conference, which took place March 11 and 12 in Boston and was titled “Short and Long: A Balanced Approach to the Energy Transition.”

Each year, the student-run conference brings together leaders in the energy sector to discuss the progress and challenges they see in their work toward a greener future. Participants come from research, industry, government, academia, and the investment community to network and exchange ideas over two whirlwind days of keynote talks, fireside chats, and panel discussions.

Several participants noted that clean energy technologies are already cost-competitive with fossil fuels, but changing the way the world works requires more than just technology.

“None of this is easy, but I think developing innovative new technologies is really easy compared to the things we’re talking about here, which is how to blend social justice, soft engineering, and systems thinking that puts people first,” Daniel Kammen, a distinguished professor of energy at the University of California at Berkeley, said in a keynote talk. “While clean energy has a long way to go, it is more than ready to transition us from fossil fuels.”

The event also featured a keynote discussion between MIT President Sally Kornbluth and MIT’s Kyocera Professor of Ceramics Yet-Ming Chiang, in which Kornbluth discussed her first year at MIT as well as a recently announced, campus-wide effort to solve critical climate problems known as the Climate Project at MIT.

“The reason I wanted to come to MIT was I saw that MIT has the potential to solve the world’s biggest problems, and first among those for me was the climate crisis,” Kornbluth said. “I’m excited about where we are, I’m excited about the enthusiasm of the community, and I think we’ll be able to make really impactful discoveries through this project.”

Fostering new technologies

Several panels convened experts in new or emerging technology fields to discuss what it will take for their solutions to contribute to deep decarbonization.

“The fun thing and challenging thing about first-of-a-kind technologies is they’re all kind of different,” said Jonah Wagner, principal assistant director for industrial innovation and clean energy in the U.S. Office of Science and Technology Policy. “You can map their growth against specific challenges you expect to see, but every single technology is going to face their own challenges, and every single one will have to defy an engineering barrier to get off the ground.”

Among the emerging technologies discussed was next-generation geothermal energy, which uses new techniques to extract heat from the Earth’s crust in new places.

A promising aspect of the technology is that it can leverage existing infrastructure and expertise from the oil and gas industry. Many newly developed techniques for geothermal production, for instance, use the same drills and rigs as those used for hydraulic fracturing.

“The fact that we have a robust ecosystem of oil and gas labor and technology in the U.S. makes innovation in geothermal much more accessible compared to some of the challenges we’re seeing in nuclear or direct-air capture, where some of the supply chains are disaggregated around the world,” said Gabrial Malek, chief of staff at the geothermal company Fervo Energy.

Another technology generating excitement — if not net energy quite yet — is fusion, the process of combining, or fusing, light atoms together to form heavier ones for a net energy gain, in the same process that powers the sun. MIT spinout Commonwealth Fusion Systems (CFS) has already validated many aspects of its approach for achieving fusion power, and the company’s unique partnership with MIT was discussed in a panel on the industry’s progress.

“We’re standing on the shoulders of decades of research from the scientific community, and we want to maintain those ties even as we continue developing our technology,” CFS Chief Science Officer Brandon Sorbom PhD ’17 said, noting that CFS is one of the largest company sponsors of research at MIT and collaborates with institutions around the world. “Engaging with the community is a really valuable lever to get new ideas and to sanity check our own ideas.”

Sorbom said that as CFS advances fusion energy, the company is thinking about how it can replicate its processes to lower costs and maximize the technology’s impact around the planet.

“For fusion to work, it has to work for everyone,” Sorbom said. “I think the affordability piece is really important. We can’t just build this technological jewel that only one class of nations can afford. It has to be a technology that can be deployed throughout the entire world.”

The event also gave students — many from MIT — a chance to learn more about careers in energy and featured a startup showcase, in which dozens of companies displayed their energy and sustainability solutions.

“More than 700 people are here from every corner of the energy industry, so there are so many folks to connect with and help me push my vision into reality,” says GreenLIB CEO Fred Rostami, whose company recycles lithium-ion batteries. “The good thing about the energy transition is that a lot of these technologies and industries overlap, so I think we can enable this transition by working together at events like this.”

A focused climate strategy

Kornbluth noted that when she came to MIT, a large percentage of students and faculty were already working on climate-related technologies. With the Climate Project at MIT, she wanted to help ensure the whole of those efforts is greater than the sum of its parts.

The project is organized around six distinct missions, including decarbonizing energy and industry, empowering frontline communities, and building healthy, resilient cities. Kornbluth says the mission areas will help MIT community members collaborate around multidisciplinary challenges. Her team, which includes a committee of faculty advisors, has begun to search for the leads of each mission area, and Kornbluth said she is planning to appoint a vice president for climate at the Institute.

“I want someone who has the purview of the whole Institute and will report directly to me to help make sure this project stays on track,” Kornbluth explained.

In his conversation about the initiative with Kornbluth, Yet-Ming Chiang said projects will be funded based on their potential to reduce emissions and make the planet more sustainable at scale.

“Projects should be very high risk, with very high impact,” Chiang explained. “They should have a chance to prove themselves, and those efforts should not be limited by resources, only by time.”

In discussing her vision of the climate project, Kornbluth alluded to the “short and long” theme of the conference.

“It’s about balancing research and commercialization,” Kornbluth said. “The climate project has a very variable timeframe, and I think universities are the sector that can think about the things that might be 30 years out. We have to think about the incentives across the entire innovation pipeline and how we can keep an eye on the long term while making sure the short-term things get out rapidly.”

Malicious open-source packages: Insights from Check Point’s Head of Data Science

Ori Abramovsky is the Head of Data Science of the Developer-First group at Check Point, where he leads the development and application of machine learning models to the source code domain. With extensive experience in various machine learning types, Ori specializes in bringing AI applications to life. He is committed to bridging the gap between theory and real-world application and is passionate about harnessing the power of AI to solve complex business challenges.

In this thoughtful and incisive interview, Check Point’s Developer-First Head of Data Science, Ori Abramovsky, discusses malicious open-source packages. While malicious open-source packages aren’t new, their popularity among hackers is increasing. Discover attack vectors, how malicious packages conceal their intent, and risk mitigation measures. The best prevention measure is… Read the interview to find out.

What kinds of trends are you seeing in relation to malicious open-source packages?

The main trend we’re seeing relates to the increasing sophistication and prevalence of malicious open-source packages. While registries are implementing stricter measures, such as PyPI’s recent mandate for users to adopt two-factor authentication, the advances of Large Language Models (LLMs) pose significant challenges to safeguarding against such threats. Previously, hackers needed substantial expertise in order to create malicious packages. Now, all they need is access to LLMs and to find the right prompts for them. The barriers to entry have significantly decreased.

While LLMs democratise knowledge, they also make it much easier to distribute malicious techniques. As a result, it’s fair to assume that we should anticipate an increasing volume of sophisticated attacks. Moreover, we’re already in the middle of that shift, seeing these attacks extending beyond traditional domains like NPM and PyPI, manifesting in various forms such as malicious VSCode extensions and compromised Hugging Face models. To sum it up, the accessibility of LLMs empowers malicious actors, indicating a need for heightened vigilance across all open-source domains. Exciting yet challenging times lie ahead, necessitating preparedness.

Are there specific attack types that are most popular among hackers, and if so, what are they?

Malicious open-source packages can be classified by the stage at which infection occurs: install (malicious code runs as part of the install process), first use (it runs once the package has been imported), and runtime (the infection is hidden inside some piece of functionality and activates only when the user invokes that functionality). Install and first-use attacks typically employ simpler techniques, prioritizing volume over complexity and aiming to remain undetected long enough to infect users (assuming that some users will mistakenly install them). In contrast, runtime attacks are typically more sophisticated, with hackers investing effort in concealing their malicious intent. As a result, these attacks are harder to detect but costlier to produce. They also last longer and therefore have a higher chance of becoming a zero-day that affects more users.

Malicious packages employ diverse methods to conceal their intent, from manipulating package structures (the simpler ones commonly include only the malicious code, while the more sophisticated ones can be an exact copy of a legitimate package) to various obfuscation techniques (from classic methods such as base64 encoding to more advanced ones, such as steganography). The downside of such concealment is that it can make a package more susceptible to detection, as many Yara detection rules specifically target these signs of obfuscation. Given the emergence of Large Language Models (LLMs), hackers have greater access to advanced techniques for hiding malicious intent, and we should expect to see more sophisticated and innovative concealment methods in the future.
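The base64 trick is also exactly what many simple detection rules key on. The Python sketch below mimics a Yara-style check: it scans source text for long base64 runs and flags any that decode to suspicious keywords. It is a deliberately naive illustration — real rule sets are far broader, and sophisticated packages evade precisely this kind of signature.

```python
import base64
import re

SUSPICIOUS = ("eval", "exec", "subprocess", "socket")

def flag_base64_payloads(source):
    """Flag base64-looking runs in `source` whose decoded text contains
    a suspicious keyword (a crude stand-in for a Yara rule)."""
    hits = []
    for match in re.finditer(r"[A-Za-z0-9+/]{16,}={0,2}", source):
        token = match.group()
        try:
            decoded = base64.b64decode(token, validate=True).decode("utf-8", "ignore")
        except Exception:
            continue  # long run of base64-ish characters, but not valid base64
        if any(keyword in decoded for keyword in SUSPICIOUS):
            hits.append(token)
    return hits

# A hidden "import subprocess" smuggled through base64, as a toy malicious snippet.
payload = base64.b64encode(b"import subprocess").decode()
snippet = 'exec(__import__("base64").b64decode("' + payload + '"))'
print(flag_base64_payloads(snippet) == [payload])  # True
```

Note that the rule fires on the encoded payload, not on the plain-text `exec` call around it — which is why, as discussed above, attackers who skip obvious obfuscation entirely can be harder to catch with signatures.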

Hackers tend to exploit opportunities where hacking is easier or more likely, with studies indicating a preference for targeting dynamic installation flows in registries like PyPI and NPM due to their simplicity in generating attacks. While research suggests a higher prevalence of such attacks in source code languages with dynamic installation flows, the accessibility of LLMs facilitates the adaptation of these attacks to new platforms, potentially leading hackers to explore less visible domains for their malicious activities.

How can organisations mitigate the risk associated with malicious open-source packages? How can CISOs ensure protection/prevention?

The foremost strategy for organisations to mitigate the risk posed by malicious open-source packages is through education. One should not use open-source code without properly knowing its origins. Ignorance in this realm does not lead to bliss. Therefore, implementing practices such as double-checking the authenticity of packages before installation is crucial. Looking into aspects like the accuracy of package descriptions, their reputation, community engagement (such as stars and user feedback), the quality of documentation in the associated GitHub repository, and its track record of reliability is also critical. By paying attention to these details, organisations can significantly reduce the likelihood of falling victim to malicious packages.
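A lightweight way to operationalize that checklist is to score a package’s metadata before approving it for use. The sketch below is purely illustrative: the field names are hypothetical stand-ins for signals you might gather from the registry page and the linked GitHub repository, and the thresholds are arbitrary examples, not vetted guidance.

```python
def vet_package(meta):
    """Return a list of warnings for a package's metadata.
    `meta` uses hypothetical field names standing in for registry
    and repository signals; thresholds are illustrative only."""
    warnings = []
    if meta.get("downloads_last_month", 0) < 1000:
        warnings.append("low download volume")
    if not meta.get("repository_url"):
        warnings.append("no linked source repository")
    if meta.get("stars", 0) < 10:
        warnings.append("little community engagement")
    if meta.get("days_since_first_release", 0) < 30:
        warnings.append("package is very new")
    if not meta.get("description"):
        warnings.append("missing or empty description")
    return warnings

established = {  # plausible profile of a mature package (values invented)
    "downloads_last_month": 250_000,
    "repository_url": "https://github.com/example/pkg",  # hypothetical URL
    "stars": 4200,
    "days_since_first_release": 900,
    "description": "HTTP client",
}
lookalike = {"downloads_last_month": 12, "days_since_first_release": 2}

print(vet_package(established))  # []
print(vet_package(lookalike))    # several warnings
```

A script like this cannot prove a package is safe — typosquats can accumulate stars, too — but it turns the "double-check before installing" habit into a repeatable gate that a CISO can enforce in CI.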

The fundamental challenge lies in addressing the ignorance regarding the risks associated with open-source software. Many users fail to recognize the potential threats and consequently, are prone to exploring and installing new packages without adequate scrutiny. Therefore, it is incumbent upon Chief Information Security Officers (CISOs) to actively participate in the decision-making process regarding the selection and usage of open-source packages within their organisations.

Despite best efforts, mistakes can still occur. To bolster defences, organisations should implement complementary protection services designed to monitor and verify the integrity of packages being installed. These measures serve as an additional layer of defence, helping to detect and mitigate potential threats in real-time.

What role does threat intelligence play in identifying and mitigating risks related to open-source packages?

Traditionally, threat intelligence has played a crucial role in identifying and mitigating risks associated with open-source packages. Dark web forums and other underground channels were primary sources for discussing and sharing malicious code snippets. This allowed security professionals to monitor and defend against these snippets using straightforward Yara rules. Additionally, threat intelligence facilitated the identification of suspicious package owners and related GitHub repositories, aiding in the early detection of potential threats. While effective for simpler cases of malicious code, this approach may struggle to keep pace with the evolving sophistication of attacks, particularly in light of advancements like Large Language Models (LLMs).

These days, with the rise of LLMs, it’s reasonable to expect hackers to innovate new methods through which to conduct malicious activity, prioritizing novel techniques over rehashing old samples that are easily identifiable by Yara rules. Consequently, while threat intelligence remains valuable, it should be supplemented with more advanced analysis techniques to thoroughly assess the integrity of open-source packages. This combined approach ensures a comprehensive defence against emerging threats, especially within less-monitored ecosystems, where traditional threat intelligence may be less effective.

What to anticipate in the future?

The emergence of Large Language Models (LLMs) is revolutionising every aspect of the software world, including the malicious domain. From the perspective of hackers, this development’s immediate implication equates to more complicated malicious attacks, more diverse attacks and more attacks, in general (leveraging LLMs to optimise strategies). Looking forward, we should anticipate hackers trying to target the LLMs themselves, using techniques like prompt injection or by trying to attack the LLM agents. New types and domains of malicious attacks are probably about to emerge.

Looking at the malicious open-source packages domain in general, a place we should probably start watching is GitHub. Historically, malicious campaigns have targeted open-source registries such as PyPI and NPM, with auxiliary support from platforms like GitHub, Dropbox, and Pastebin for hosting malicious components or publishing exfiltrated data. However, as these registries adopt more stringent security measures and become increasingly monitored, hackers are likely to seek out new “dark spots” such as extensions, marketplaces, and GitHub itself. Consequently, malicious code has the potential to infiltrate EVERY open-source component we utilise, necessitating vigilance and proactive measures to safeguard against such threats.

From Sketch to Platformer: Google Genie’s Artistic Approach to Game Generation

Genie, a remarkable creation by Google DeepMind, has captured the imaginations of researchers and gamers alike. Its full name, “GENerative Interactive Environment,” hints at its extraordinary abilities. Unlike an average AI model, Genie possesses the unique power to transform single images or text prompts into interactive,…

Ofer Ronen, Co-Founder and CEO of Tomato.ai – Interview Series

Ofer Ronen is the Co-Founder and CEO of Tomato.ai, a platform that offers an AI powered voice filter to soften accents for offshore agent voices as they speak, resulting in improved CSAT and sales metrics. Ofer previously sold three tech startups, two to Google, and one…

YOLO-World: Real-Time Open-Vocabulary Object Detection

Object detection has been a fundamental challenge in the computer vision industry, with applications in robotics, image understanding, autonomous vehicles, and image recognition. In recent years, groundbreaking work in AI, particularly through deep neural networks, has significantly advanced object detection. However, these models have a fixed…

Streamlining Live Production: NETGEAR and Panasonic Connect Partner fo – Videoguys

Discover how NETGEAR’s M4350 series switches seamlessly integrate with Panasonic Connect’s KAIROS IT/IP platform, revolutionizing live production workflows for broadcast and Pro AV industries.

Unlock the Future of Live Production with NETGEAR and Panasonic Connect
In a groundbreaking collaboration, NETGEAR and Panasonic Connect have joined forces to propel live production into the digital age. By integrating NETGEAR’s M4350 series switches with Panasonic Connect’s KAIROS IT/IP platform, the industry is witnessing a revolutionary shift in AV-over-IP solutions.

Elevating Live Production Capabilities
Panasonic Connect’s KAIROS platform empowers professionals with unparalleled control over content delivery, from broadcast to large screen displays and live streams. With the addition of ST 2110, KAIROS offers enhanced flexibility in input/output configurations and content creation, setting new standards for live production versatility.

Simplified Workflow, Enhanced Efficiency
NETGEAR’s AV-oriented switches, particularly the M4350 series, streamline network configurations with an intuitive interface. The NETGEAR AV OS simplifies setup with template-based approaches, reducing the learning curve for broadcast and Pro AV integrators. With enterprise-class hardware features like redundant power supplies and PoE capabilities, the M4350 series ensures reliability and ease of deployment.

Unprecedented Compatibility
NETGEAR switches boast compatibility with a wide array of audio, video, and lighting protocols, including SMPTE ST 2110. This ensures seamless integration with various systems, providing unmatched versatility for live production environments.

Empowering Broadcast and Pro AV Engineers
Tod Musgrave, senior broadcast BDM at NETGEAR, highlights the platform’s ease of configuration, making it the go-to choice for engineers transitioning to IP workflows. The interoperability between NETGEAR switches and the Panasonic KAIROS system marks a significant milestone in advancing AV-over-IP solutions, empowering professionals to push the boundaries of live production like never before.

Don’t Miss Out on the Future of Live Production
Join us as we redefine the landscape of live production with NETGEAR and Panasonic Connect. Experience seamless integration, unparalleled versatility, and unmatched efficiency in every aspect of your AV-over-IP workflow. The future is now – embrace it with NETGEAR and Panasonic Connect.

Read the full blog post from AV Network HERE


Get empowered, not overpowered by GenAI

Generative AI can create content that resembles human content, and this is what we’ve also been looking at in healthcare. Can generative AI be used by patients and caregivers to interact with our solutions in ways they haven’t been able to do before?…

UAE set to help fund OpenAI’s in-house chips

OpenAI’s ambitious plans to develop its own semiconductor chips for powering advanced AI models could receive a boost from the United Arab Emirates (UAE), according to a report by the Financial Times. The report states that MGX — a state-backed group in Abu Dhabi — is…