Identifying Deepfake Images from a Forensic Lens

When law enforcement locates an explicit image as part of an investigation, there is no longer a guarantee the image is real. There is a real and growing possibility it is an AI-generated deepfake image, and as a society we are in the very beginning stages of figuring out how this will all play out.

In Florida, two teenagers are facing criminal charges, accused of employing AI to generate explicit images of a classmate. At this writing, only 10 states have laws on the books to hold people criminally responsible, though more legislation is in the works. While the laws play catch-up and each U.S. state tackles this complicated matter from a criminal prosecution perspective, federal lawmakers are finding common ground on the issue from a civil lens – with bipartisan support for the DEFIANCE Act, which would allow victims portrayed in explicit deepfakes to sue the creators for damages.

While the societal implications of this new reality play out, we are working diligently to research the issue from a forensic lens. What is the digital footprint of an AI-generated image compared to a normal image? What are the character traits of the file?

Due to the nature of crime and the overwhelming amount of digital evidence labs are dealing with, examiners do not have a lot of time to dig into these questions. And many digital forensics professionals do not have the training needed to properly examine metadata and determine exactly where a file came from and when. That’s why we are making it a priority to understand how these deepfake images work, so you can cut right to the data and begin making the determinations needed to further your case.

Our Methodology of Understanding Deepfakes

We are creating a handful of deepfake images using various AI-generating applications on our Android and iOS test devices. Then we are capturing the same kinds of images through traditional means. We put them side by side and dig into the EXIF data of the images and the databases that track image artifact metadata.
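As a minimal sketch of what that side-by-side EXIF comparison can look like, the Python snippet below uses Pillow to dump the EXIF tags of a camera-original photo next to an AI-generated one. The file names are hypothetical placeholders, and this illustrates the general technique rather than our actual tooling.

```python
# A minimal sketch of a side-by-side EXIF comparison, using Pillow.
# The file names below are hypothetical placeholders.
from PIL import Image
from PIL.ExifTags import TAGS

def exif_dict(path):
    """Return a {tag_name: value} dict of the image's EXIF data, if any."""
    exif = Image.open(path).getexif()
    return {TAGS.get(tag_id, tag_id): value for tag_id, value in exif.items()}

camera = exif_dict("camera_original.jpg")  # captured with the device camera
generated = exif_dict("ai_generated.jpg")  # produced by an AI image app

# Tags present in the camera photo but absent from the generated file
# (Make, Model, DateTime, and similar) are often the first giveaway.
for tag in sorted(set(camera) | set(generated), key=str):
    print(f"{tag}: camera={camera.get(tag)!r}  ai={generated.get(tag)!r}")
```

Camera-original photos typically carry Make, Model, and DateTime tags that generated files lack; note that in recent Pillow versions the nested GPS data lives in a separate IFD, reachable via `exif.get_ifd(ExifTags.IFD.GPSInfo)`.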

We are looking at things like how the images can be associated with the specific device and account that created them, and tracing how they lead back to their creator. That is one layer of the research. A second layer is how the data behaves once the image gets circulated. We are hypothesizing that unless there’s some sort of “created by” stamp embedded in the file, all the metadata gets stripped out when it gets posted. Cellebrite is working to detect deepfake images in our software by digging into the file headers and file attributes. We are digging not only to identify differences in files, but to try to find the text prompt that caused the image to be created in the first place! This includes Midjourney-generated files, which contain the text prompt in the file name.
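As a hedged sketch of that prompt-hunting idea, the snippet below scans an image’s metadata for prompt-like entries. The key names are examples (some generators, such as Stable Diffusion front ends, are known to write the prompt into a PNG text chunk named “parameters”), not an exhaustive or authoritative list, and the evidence file name is a hypothetical placeholder.

```python
# A hedged sketch of hunting for an embedded prompt or "created by" marker.
# The metadata key names are illustrative examples, not a complete list.
from pathlib import Path
from PIL import Image

PROMPT_KEYS = ("prompt", "parameters", "description", "software", "comment")

def find_prompt_hints(path):
    """Return metadata entries whose key names suggest a generation prompt."""
    img = Image.open(path)
    img.load()  # ensures trailing PNG text chunks are parsed as well
    meta = dict(img.info)                  # metadata parsed at open time
    meta.update(getattr(img, "text", {}))  # PNG tEXt/iTXt chunks, if any
    hints = {k: v for k, v in meta.items()
             if any(word in k.lower() for word in PROMPT_KEYS)}
    # Per the Midjourney observation above, the prompt can also live in
    # the file name itself, so keep that alongside any metadata hits.
    hints["_file_name"] = Path(path).stem
    return hints

print(find_prompt_hints("suspect_image.png"))  # hypothetical evidence file
```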

We have been dealing with real images for years, and we know their character traits. This promising research is informing how we can spot the differences in an AI-generated image. That will make detecting these images even easier for law enforcement, particularly once it’s integrated into our digital investigative tools.

The Waiting Game

Law enforcement professionals are already having – and will have more – tough conversations with victims, particularly when they don’t have laws to fall back on. There will be a lot of “I am so sorry that this picture of you (or worse, your child) was created and circulated.”

Society will play a large role in policing some of these images on the platforms available today – people need to quickly report content that violates community standards, as happened on X (formerly Twitter) when the deepfakes of Taylor Swift circulated. Any time you see such an image – whether explicit or child erotica in nature – it should be reported to the platform immediately.

This is also a great time to underscore the importance of “Take It Down,” a service run by the National Center for Missing and Exploited Children. If a minor’s likeness appears in an AI-generated explicit image, you can report it. If you’re an investigator in law enforcement, you can report it even if you don’t know the child.

When this unfortunate new reality happened to the biggest celebrity on the planet, every parent realized it could certainly happen to their child. We are committed to doing everything we can from the forensic side to give police the information they need when investigating these cases. Digital forensics and the tools professionals have always used are going to work the way they’ve always worked – it’s just tackling a new use. In many ways, it’s using the good that AI can offer in our software to tackle the negative aspects of AI.

We will present our full findings at the Techno Security & Digital Forensics Conference 2024 in June and will publish them in Forensic Magazine in July.

About the Authors

Heather Mahalik Barnhart is the Senior Director of Community Engagement at Cellebrite, a global leader in premier Digital Investigative solutions for the public and private sectors. She educates and advises digital forensic professionals on cases around the globe. For more than 20 years, Heather’s worked on high-profile cases, investigating everything from child exploitation to Osama Bin Laden's digital media.

Jared Barnhart is the Customer Experience Team Lead at Cellebrite, a global leader in premier Digital Investigative solutions for the public and private sectors. A former detective and mobile forensics engineer, Jared is highly specialized in digital forensics, regularly training law enforcement and lending his expertise to help them solve cases and accelerate justice.

 
