Picture this: you come across a video on YouTube in which the President of the United States is addressing a press conference with another head of state. The press conference is so convincing that you decide to share the link with a friend. Your friend shares it with his contacts, and the video starts spreading. Soon, thousands have seen it. A while later, however, you learn that the video was not real: the President’s head had been superimposed on someone else’s body, and no press conference ever took place.
You might think this is too far-fetched to be possible. It is not. Just look at the widely popular video from CtrlShiftFace (a YouTube channel), which has amassed more than 11 million views so far.
The video features comedian Bill Hader, who shares a story about his encounters with actors Tom Cruise and Seth Rogen. As Hader, a skilled impressionist, does his best Cruise and Rogen, the faces of both seamlessly and rather frighteningly melt into his own. How seamless the transition is can be gauged from the comments on the video.
“The fade between faces is absolutely unnoticeable and it’s flipping creepy,” says one commenter. Another says, “I thought Bill Hader’s impression was so good that he turned himself into Tom Cruise.” Other comments highlight how frightening uses of this technology can be. The video, and more importantly its accuracy, shows how easy and potentially dangerous it is to manipulate video content. It is an expertly crafted deepfake, a form of synthetic media built on deep learning techniques such as the generative adversarial networks that Ian Goodfellow and colleagues introduced in 2014.
Two recent developments have heightened this fear: the Tom Cruise deepfakes and a looping video of Dr. Martin Luther King Jr. Created with powerful machine learning and artificial intelligence techniques, these deepfakes have earned millions of views on social media platforms such as TikTok, Twitter and Facebook.
The Tom Cruise deepfakes, a collaboration between Belgian visual effects artist Christopher Ume and Tom Cruise impersonator Miles Fisher, have amassed 1.7 million likes on TikTok. The clips are so realistic that even commercial deepfake-detection tools could not flag them as fake.
Ume has played down the fears, saying that the expertise required to use this particular technology makes it very hard to abuse.
Another development that generated a lot of interest was the online tool launched by the genealogy website MyHeritage. The tool digitally animates a photograph, producing a short looping video. Videos created with it can show people nodding or smiling without any hint that they aren’t real. Deep Nostalgia, as MyHeritage calls the tool, has so far been used to create more than 26 million images.
These are not isolated examples of how accurate deepfake technology has become. Last year, millions of viewers of South Korea’s MBN channel were stunned when a deepfake presented the news instead of regular news anchor Kim Joo-Ha. The resemblance between the fake and the anchor was so uncanny that many feared Kim would lose her job. A non-governmental organization in Mexico recently created a deepfake video of murdered Mexican journalist Javier Arturo Valdez Cardenas to call for justice. The parents of murdered teen Joaquin Oliver digitally resurrected him to promote gun safety legislation.
What is a deepfake?
Researchers and special effects studios have long been pushing the boundaries of what’s possible with video and image manipulation. Although the underlying technology had been around for a few years, deepfakes as we know them were born in late 2017, when a Reddit user by the same name posted pornographic videos generated with a face-swapping algorithm based on deep neural networks (DNNs). The videos swapped the faces of celebrities – Gal Gadot, Taylor Swift, Scarlett Johansson and others – on to porn performers. Since then, the term deepfake has been used more broadly to refer to all types of AI-generated impersonating videos.
How are they created?
It takes a few steps to make a face-swap video. First, you run thousands of face shots of the two people through an AI algorithm called an encoder. The encoder finds and learns similarities between the two faces, and reduces them to their shared common features, compressing the images in the process. A second AI algorithm called a decoder is then taught to recover the faces from the compressed images. Because the faces are different, you train one decoder to recover the first person’s face, and another decoder to recover the second person’s face. To perform the face swap, you simply feed encoded images into the “wrong” decoder. For example, a compressed image of person A’s face is fed into the decoder trained on person B. The decoder then reconstructs the face of person B with the expressions and orientation of face A. For a convincing video, this has to be done on every frame.
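The shared-encoder, two-decoder idea can be sketched in a few lines of plain Python. This is only a toy under loud assumptions: each "face" is three numbers (two standing in for identity, one for expression), the encoder and decoders are single linear layers trained by simple gradient descent, and every name and setting here (`make_faces`, `train`, the learning rate, the number of epochs) is made up for illustration. Real face-swap tools use deep convolutional networks on images.

```python
# Toy "faces": [identity_1, identity_2, expression]. Purely illustrative data.
def make_faces(identity, steps=11):
    return [identity + [i / (steps - 1)] for i in range(steps)]

FACES_A = make_faces([1.0, 0.0])   # person A with expressions from 0.0 to 1.0
FACES_B = make_faces([0.0, 1.0])   # person B with the same range of expressions

def encode(w, face):
    """Shared encoder: compress a face into a single number (the code)."""
    return sum(wi * xi for wi, xi in zip(w, face))

def decode(dec, code):
    """Person-specific decoder: expand a code back into a face."""
    return [u * code + b for u, b in zip(dec["u"], dec["b"])]

def total_loss(w, dec, faces):
    """Summed squared reconstruction error over a set of faces."""
    loss = 0.0
    for face in faces:
        out = decode(dec, encode(w, face))
        loss += sum((o - x) ** 2 for o, x in zip(out, face))
    return loss

def train(epochs=500, lr=0.02):
    w = [0.1, 0.1, 0.1]                         # one encoder shared by both people
    dec_a = {"u": [0.1] * 3, "b": [0.0] * 3}    # decoder trained only on person A
    dec_b = {"u": [0.1] * 3, "b": [0.0] * 3}    # decoder trained only on person B
    for _ in range(epochs):
        for face_a, face_b in zip(FACES_A, FACES_B):
            for dec, face in ((dec_a, face_a), (dec_b, face_b)):
                code = encode(w, face)
                out = decode(dec, code)
                err = [o - x for o, x in zip(out, face)]
                # Gradient of the squared error with respect to the code.
                grad_code = sum(2 * e * u for e, u in zip(err, dec["u"]))
                for i in range(3):
                    dec["u"][i] -= lr * 2 * err[i] * code
                    dec["b"][i] -= lr * 2 * err[i]
                    w[i] -= lr * grad_code * face[i]
    return w, dec_a, dec_b

w, dec_a, dec_b = train()

# The swap: encode a face of person A, then decode it with B's decoder.
face_of_a = [1.0, 0.0, 0.9]
swapped = decode(dec_b, encode(w, face_of_a))
print("face swap output:", [round(v, 2) for v in swapped])
```

Because decoder B has only ever learned to draw person B, whatever code it receives comes back out as a B-like face. In a real pipeline the same encode-then-decode-with-the-other-decoder step would be repeated for every frame of the video.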
Another way to make deepfakes uses what’s called a generative adversarial network, or GAN. A GAN pits two artificial intelligence algorithms against each other. The first algorithm, known as the generator, is fed random noise and turns it into an image. This synthetic image is then added to a stream of real images – of celebrities, say – that are fed into the second algorithm, known as the discriminator. At first, the synthetic images will look nothing like faces. But repeat the process countless times, with feedback on performance, and the discriminator and generator both improve. Given enough cycles and feedback, the generator will start producing utterly realistic faces of completely nonexistent celebrities.
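The tug-of-war between generator and discriminator can be sketched with numbers in place of images. The toy below is an assumption-heavy illustration, not a recipe: "real" samples are numbers drawn around 4, the generator is a single line `a * noise + b`, the discriminator is a one-weight logistic classifier, and the function name `train_gan`, the learning rate and the batch size are all invented for this example.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def train_gan(steps=500, lr=0.05, batch=32, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator: fake = a * noise + b
    c, d = 0.0, 0.0   # discriminator: P(real) = sigmoid(c * x + d)
    for _ in range(steps):
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]   # stand-in for real images
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: raise its score on real samples, lower it on fakes.
        grad_c = grad_d = 0.0
        for x in real:
            p = sigmoid(c * x + d)
            grad_c += (1 - p) * x
            grad_d += (1 - p)
        for x in fake:
            p = sigmoid(c * x + d)
            grad_c -= p * x
            grad_d -= p
        c += lr * grad_c / batch
        d += lr * grad_d / batch

        # Generator step: adjust a and b so the discriminator scores fakes higher.
        grad_a = grad_b = 0.0
        for z in noise:
            p = sigmoid(c * (a * z + b) + d)
            grad_a += (1 - p) * c * z
            grad_b += (1 - p) * c
        a += lr * grad_a / batch
        b += lr * grad_b / batch
    return a, b, c, d

a, b, c, d = train_gan()
print("generator now produces samples centred near", round(b, 2))
```

The generator never sees the real data directly; it only gets feedback through the discriminator's score, which is the "feedback on performance" described above. A setup this simple can overshoot or oscillate rather than settle, so it is meant to show the alternating updates, not stable training.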
What are they for?
Most deepfakes are pornographic. According to Deeptrace, an artificial intelligence firm, 15,000 deepfake videos were found online in September 2019, nearly double the number nine months earlier. A staggering 96% of them were pornographic, and 99% of those mapped faces from female celebrities on to porn stars. As newer techniques, and in many cases online applications, allow unskilled people to make deepfakes from just a handful of photos, fake videos are likely to spread beyond the celebrity world. There is genuine concern that, as the technology becomes easier to use, it will fuel revenge porn.
Although celebrities remain the prime target of such videos, lesser-known people with an internet presence have also been targeted. Recently, one creator on the discussion board 8chan made an explicit deepfake featuring the face of a German blogger who posts lifestyle videos; thousands of images of her face had been extracted from a hair tutorial she had recorded in 2014.
Is it only about videos?
No. Deepfake technology can create convincing but entirely fictional photos from scratch. Audio can be deepfaked too, to create “voice skins” or “voice clones” of public figures.
Can anyone create them?
It is almost impossible to make a good deepfake on a normal computer. Most deepfakes are created on high-end desktops with powerful graphics cards or better still with computing power in the cloud. This reduces the processing time from days and weeks to hours. But it takes expertise, too, not least to touch up completed videos to reduce flicker and other visual defects. That said, plenty of tools are now available to help people make deepfakes. Several companies will make them for you and do all the processing in the cloud. There’s even a mobile phone app, Zao, that lets users add their faces to a list of TV and movie characters on which the system has trained. A tool on the messaging app Telegram that allowed users to create simulated nude images from a single uploaded photo has already been used hundreds of thousands of times, according to BuzzFeed News.
The Dangers of Deepfake Videos
Deepfake videos can cause serious harm in a matter of seconds. Here are a few risks to watch out for:
• It’s hard to believe the truth: Some deepfake videos are made to favor a person or situation when the truth is less than favorable. Conversely, people may claim that a genuine video is a deepfake to escape its consequences.
• It could ruin reputations: Deepfake videos have been known to damage the reputations of political candidates by attributing untrue statements to them and fabricating events. Deepfake videos can harm private individuals too, if a video spreads to a large audience such as a university or a major corporation.
• More room for phishing and scams: The videos are an easy way to lure users into clicking links used in phishing attacks. These attacks can expose personally identifiable information (PII) and cause data leaks for companies and people who aren’t careful.
How to spot deepfakes?
Unfortunately, spotting a deepfake video isn’t easy, because deepfakes are designed to resemble reality as closely as possible. But there are a few ways to spot them.
• Pay close attention to whether the subject in the video blinks. If not, the video may be fake. Humans blink instinctively, but The Guardian reported that early deepfakes often failed to reproduce natural blinking. Newer deepfakes may blink normally, so treat this as a hint rather than proof.
• If the person’s features or surroundings seem edited, it may be a deepfake video.
• Look for lighting, sound and movement effects that don’t seem natural.
• Most importantly, trust your gut. If the video doesn’t feel real or features seem strange, report it right away to the site you’re visiting.
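The blink check from the first tip can be roughed out in code. The sketch below assumes that some other tool has already produced an eye-openness score for every frame of the clip (for example, from a facial-landmark tracker). The function names, the 0.2 "eyes closed" threshold and the cutoff of 5 blinks per minute are illustrative assumptions, not values taken from any published detector.

```python
def count_blinks(eye_openness, closed_below=0.2):
    """Count open-to-closed transitions in a per-frame eye-openness series."""
    blinks = 0
    was_closed = False
    for value in eye_openness:
        is_closed = value < closed_below
        if is_closed and not was_closed:
            blinks += 1          # a new blink starts on this frame
        was_closed = is_closed
    return blinks

def blinks_per_minute(eye_openness, fps=30.0, closed_below=0.2):
    """Convert the blink count into a rate, given the clip's frame rate."""
    if not eye_openness:
        return 0.0
    minutes = len(eye_openness) / fps / 60.0
    return count_blinks(eye_openness, closed_below) / minutes

def looks_suspicious(eye_openness, fps=30.0, min_rate=5.0):
    """Flag clips whose subject blinks less often than the assumed cutoff."""
    return blinks_per_minute(eye_openness, fps) < min_rate

# Two made-up 10-second clips at 30 frames per second.
natural = [0.3] * 300
for start in (60, 200):              # two short blinks, four frames each
    natural[start:start + 4] = [0.1] * 4
staring = [0.3] * 300                # eyes never close

print("natural clip suspicious?", looks_suspicious(natural))
print("staring clip suspicious?", looks_suspicious(staring))
```

A low blink rate is only one weak signal. A short clip of a real person may contain no blinks at all, and a well-made deepfake may blink normally, so a check like this is best combined with the other tips in the list.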
The challenge to detect deepfakes
The competition between making and detecting deepfakes will not end in the foreseeable future. We will see deepfakes that are easier to make, more realistic and harder to distinguish from real footage. The current bottleneck, a lack of detail in the synthesized faces, is likely to be overcome by combining existing methods with GAN models. Training and generation times will fall with advances in hardware and lighter-weight neural network structures. In the past few months we have seen new algorithms that deliver a much higher level of realism or run in near real time. The latest forms of deepfake video will go beyond simple face swapping to whole-head synthesis, joint audiovisual synthesis and even whole-body synthesis.
To curb the threat posed by increasingly sophisticated deepfakes, detection technology will also need to keep pace. As we try to improve overall detection performance, emphasis should also be put on making detection methods more robust to video compression, social media laundering and other common post-processing operations, as well as to intentional counter-forensic operations. On the other hand, given the propagation speed and reach of online media, even the most effective detection method will largely operate in a postmortem fashion, applicable only after deepfake videos emerge.
Needless to say, deepfakes are not only a technical problem, and now that Pandora’s box has been opened, they are not going to disappear in the foreseeable future. But with technical improvements in our ability to detect them, and increased public awareness of the problem, we can learn to co-exist with them and to limit their negative impact.