The scammers trying to steal our voices — and how to protect yourself

Scammers are increasingly recording people's voices, training AI and creating audio deepfakes to dupe and extort victims.

While voice clone frauds have long been used for catfishing, financial fraud and political sabotage, technological advances are making them harder to detect. Source: Getty / Skynesher

Key Points
  • Criminals are using increasingly sophisticated AI technologies to recreate people’s voices for their own gain.
  • Audio deepfakes can be weaponised in a range of ways, including financial fraud, defamation and cheap labour.
  • Experts urge people to be cautious when answering suspicious phone calls or sharing information online.
Phone scammers are stealing and cloning the voices of unsuspecting victims in a bid to replicate their identity, create audio deepfakes and extort their loved ones out of money, according to experts.

Voice clone frauds have been used for catfishing, financial fraud and political sabotage for years. But in a growing trend, criminals are adopting increasingly sophisticated technologies to record phone calls, train AI on the stolen audio and create deepfakes of their victims' voices.

The rise and effectiveness of such scams have drawn the attention of authorities, institutions and consumer advocacy groups — from National Australia Bank to the government of Western Australia — who are urging people to be vigilant.

"At the moment we worry about people that call us that try and get our information or tell us that we have to pay something like a toll or whatever," Monica Whitty, a professor of human factors in cybersecurity and head of the department of software systems and cybersecurity at Monash University, told SBS News.

"But now it could be that they try to garner bits of your voice that then can be used and trained with AI … to sound like yourself."

How do they work?

Whitty laid out a potential scenario in which a person’s voice might be stolen and repurposed for a criminal’s benefit.

First, someone might answer a call from an anonymous number and speak, however briefly, to the person on the other end — who is, unbeknownst to them, a scammer. A recording of that conversation can then be used to "train" AI on how to replicate the victim's voice, not just verbatim but in a variety of ways.

This replica, known as an audio deepfake, can in turn be used to scam others.

"They (the scammers) might ring a family member of mine using my voice to say, 'Look, I've had my wallet stolen, I'm in Spain — can you send me some money? It's really urgent. I've been kidnapped. They've got me for hostage,'" Whitty said.

"So they'll set up a fake scenario with your voice to legitimise that big scenario."

Both video and audio deepfakes can be used for more than just financial fraud. Source: AAP

While voice fraud isn't new, Whitty and others highlighted that technological advances have made it more diverse and effective.

"In scams like romance scams or investment scams, it used to be the case that you'd only bring in voice when it looked like the person was doubting the person creating the fake relationship … the voice helps develop more trust," Whitty said.

"It's evolved to the fact that the technology can mimic someone's voice that you actually know. So now that makes it even better for the criminal to use that in these scams."

Shahriar Kaisar, a cybersecurity researcher and a leader in RMIT's information security and business department, told SBS News that the proliferation of generative AI technology — artificial intelligence systems like ChatGPT — has seen the sophistication of these scams evolve to "a very different level, where it has become very difficult to distinguish between what is real and what is not."

"They use a machine learning model where they would collect voice samples or sometimes even video images from online public sources — it could be a YouTube video, it could be something that you have shared on TikTok," Kaisar explained.
That data is then fed into the system and dissected into "billions of milliseconds or nanoseconds … to produce something that looks exactly real, like how you'd probably be heard by other people."

"It is very difficult to distinguish this kind of technology," he said. "Whether it has been AI-generated or it is actually true."

Who are the victims?

Simon Kennedy hears stories of people being impacted by audio deepfakes every day. As president of the Australian Association of Voice Actors (AAVA), he belongs to an industry that is uniquely impacted by this proliferating technology.

"People come to us and say, 'Oh, I just found out that my voice is being used on a YouTube video episode that I had nothing to do with, and what can I do about it?'" he told SBS News.

"And the answer is, at this stage, not much, sadly, because the legislation hasn't caught up yet."

Kennedy said it's not uncommon for voice actors to discover that their voice has been cloned without their consent, or that they've lost work to a non-consensual clone of themselves. He's trying to get laws and regulations put in place to stop that.

Working alongside AAVA, Kennedy is talking to Australian parliamentarians, setting up meetings and putting forward the case that it should be illegal for someone to make a synthetic clone of somebody's voice without their consent — an act that can potentially affect anybody, not just voice actors.

"It seems like a very simple proposition, but it's not written into law yet," he said, suggesting that, if done right, legislation could help protect every Australian citizen's voice and image," Kennedy said.

"You can't make someone work for free against their will, and you shouldn't be able to use their likeness for free against their will for profit or deception either. So we see it as an extension of a moral right."

People are advised to only answer phone calls from contacts they recognise. Source: Getty / Rafael Abdrakhmanov/iStockphoto

Both video and audio deepfakes can also be used for more than just financial fraud and cheap labour. Kaisar explained there are several other malicious ways this technology can be deployed.

"It can be used for defamation. For example, it can be used to spread a rumour … We have also seen that it has been used in a political context as well," he said, citing a case where a deepfake video depicted Ukrainian President Volodymyr Zelenskyy telling his soldiers to lay down their arms and surrender to Russia.

"[It can] actually cause political civil unrest or political tension between countries on a national level as well," he said.

How can people protect themselves?

Kaisar warned that people should be careful about the kinds of content they share online, especially when it comes to personal content such as videos and voice recordings, to reduce the risk of falling prey to voice cloning.

"We would want some help from the platforms that are being used for developing deepfakes as well, and also platforms that are being used for sharing those," he added.

Whitty further urged people to remain vigilant — and to minimise the time spent talking to someone suspicious or unknown.

"If you've got a family member or someone that's ringing you for something urgent, it's better to stop, pause and … then ring that person up again or contact them in some other way just to make sure that it was them talking to you," Whitty said.

If possible, she added, people should only answer phone calls from contacts they know and recognise.

"If it says unknown, just ignore it altogether."

Published 18 May 2024 11:01am
By Gavin Butler
Source: SBS News


