Our reliance on digital multimedia content stems from its capacity to provide first-hand information about an event and from our propensity to believe what we see. It is precisely because of this reliance that we must ascertain that a given piece of digital content is in actuality what it purports to be: as influential as digital images and videos are, the picture they paint may not always be accurate.

Digital images and videos are incredibly easy to manipulate, and the possibility of manipulation is especially worrisome in scenarios where this digital content is treated as evidence for making decisions and judgments that have long-lasting repercussions. This engenders the need for the development of shrewd investigative procedures capable of establishing the authenticity and trustworthiness of the given digital content with a high degree of scientific certainty. These services are provided by the research field known as digital visual media forensics.

Digital visual media forensics

Digital visual media forensics (DVMF) is a branch of digital forensics that provides tools and techniques for authenticating digital images and videos by performing tasks such as source identification and tamper detection, so that when this content is used as evidence, we can be confident that it is genuine and reflects the truth.

The key to DVMF, like all other disciplines of forensic science, is Locard’s exchange principle, that is, “every contact leaves a trace.” Every operation that is applied to a digital image or video, and which alters its composition in some way, leaves behind subtle evidence of its occurrence. This evidence, generally referred to as a forensic artifact, is like a fingerprint of a particular processing operation, and thus serves as an identifying trace for that operation. DVMF relies on the discovery and examination of such traces to determine which modification operations the given content may have undergone during its lifetime. 

For nearly two decades now, the domain of digital visual media forensics has received much-deserved attention. Today, forensic analysts have a plethora of content-authentication tools and techniques at their disposal, each of which specializes in the detection of a particular kind of forgery. However, in a real-world forgery scenario, where the forensic analyst is largely unaware of the processing history of the given content, a single technique is rarely sufficient to authenticate the content. The most productive course of action in such cases is to work under the assumption that the content may have undergone more than one kind of manipulation: apply multiple tamper-detection techniques, each specializing in exposing a particular kind of forgery, and then combine the evidence provided by all of them to reach a final decision regarding the authenticity of the content.

Though combining evidence generated from multiple forensic techniques seems simple enough in theory, in reality, this task is much more complicated, especially considering the fact that existing forensic techniques are mired in uncertainties and sometimes proffer incomplete, ambiguous, or even conflicting evidence regarding the presence (or absence) of a particular forensic artifact in the given content.

The evidence theory

The evidence theory, also known as the Dempster–Shafer theory (DST) or the theory of belief functions, has recently emerged as a viable solution to the problem of combining multiple pieces of information generated by different forensic techniques into one final decision regarding content authenticity. Arthur P. Dempster first introduced this theory in the context of statistical inference in 1967; one of his students, Glenn Shafer, later developed it into a framework for modeling epistemic uncertainties (i.e., uncertainties that arise from uncertain information provided by experts) in 1976.

The evidence theory is a generalization of the Bayesian theory of subjective probabilities, and is both a mathematical theory of evidence and a theory of plausible reasoning. It allows one to combine evidence, i.e., pieces of information, from multiple sources and observations (where the evidence items may vary in terms of completeness, reliability, and/or precision) to arrive at a collective degree of belief that takes into account the information provided by all the evidence items. In spite of several criticisms, this theory remains one of the best-known strategies to provide a solution to the problem of combining ambiguous, contradictory or paradoxical evidence from multiple sources.

Although the evidence theory is not the only available decision fusion method, it is by far the most intuitive method by which incomplete or conflicting pieces of evidence can be combined to reach a final decision regarding content trustworthiness. One of the most interesting features of this theory is that it enables modeling of uncertainty and ignorance in a very straightforward manner, especially when compared to the traditional Bayesian probability theory.

The first step of evidence theory-based decision fusion is to enumerate the set of all possible conclusions that can be reached about some event of interest. Each element of this set is called a proposition; the propositions are exhaustive and mutually exclusive, and the set itself is called the frame of discernment. The next step is probability assignment. Unlike Bayesian probability theory, where probabilities are assigned to individual propositions, in evidence theory probability mass is assigned to subsets of the frame of discernment. This assignment, called a mass function, distributes the available, relevant evidence among the subsets it supports; assigning mass to a subset containing several propositions expresses support for that group without committing to any single member, which is how the theory models ignorance. The degree of belief in a subset is then the sum of the masses of all its own subsets; it can take any value from 0 to 1, where smaller values denote weaker support for that subset and larger values denote stronger support. Another central concept of the evidence theory is plausibility, which measures the extent to which the available evidence does not contradict a subset; it equals one minus the belief in that subset's complement.
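A minimal Python sketch makes these concepts concrete: a frame of discernment, a mass function over its subsets, and the derived belief and plausibility measures. The two-proposition frame and the mass values below are purely illustrative assumptions.

```python
# Hypothetical two-proposition frame for a tamper-detection verdict;
# the mass values below are purely illustrative.
frame = frozenset({"authentic", "tampered"})

# A mass function distributes evidence over subsets of the frame.
# Mass assigned to the whole frame represents ignorance
# (evidence that commits to neither proposition).
mass = {
    frozenset({"tampered"}): 0.6,  # evidence pointing at tampering
    frame: 0.4,                    # uncommitted ("don't know")
}

def belief(hypothesis, mass):
    """Sum of the masses of all subsets wholly contained in the hypothesis."""
    return sum(m for subset, m in mass.items() if subset <= hypothesis)

def plausibility(hypothesis, mass):
    """Sum of the masses of all subsets compatible with the hypothesis."""
    return sum(m for subset, m in mass.items() if subset & hypothesis)

tampered = frozenset({"tampered"})
print(belief(tampered, mass))        # 0.6
print(plausibility(tampered, mass))  # 1.0 - nothing contradicts it yet
```

Note the gap between belief (0.6) and plausibility (1.0): it is exactly the uncommitted mass, which Bayesian probabilities cannot represent directly.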

The following example helps illustrate the essence of the evidence theory in a simple manner:

While investigating a burglary in a bakery, Sherlock Holmes makes the following two deductions: the absence of any signs of forced entry indicates, with a high degree of certainty (say 85 percent), that the burglar was "an insider." Preliminary investigation also suggests, again with a high degree of certainty (say 75 percent), that the burglar was left-handed.

One of the employees in the bakery is left-handed, and Holmes must now determine the degree of certainty with which to accuse this particular employee of the burglary. To determine this degree of certainty (i.e., degree of belief), we first define two sets: set I, which represents the set of all insiders (in the context of the bakery), and set L, which represents the set of all left-handed people. Now, the first deduction suggests that the burglar belongs to set I with degree of belief 0.85, while the second deduction supports the hypothesis that the burglar belongs to set L with degree of belief 0.75. Collectively, these two deductions should suggest, with some degree of certainty, that the burglar was a left-handed insider. So, given the degrees of belief in sets I and L, we must now assign a degree of belief to their intersection I∩L (the set of left-handed insiders). According to Dempster's combination rule, I∩L is assigned the degree of belief 0.85 × 0.75 = 0.6375, or roughly 0.64. This implies that, according to the evidence theory, Holmes can accuse the suspected employee with a 64 percent degree of certainty.
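The Holmes example can be reproduced with a small, self-contained implementation of Dempster's combination rule. The four-member frame and the set names are illustrative assumptions chosen to make the insider and left-handed sets overlap in one member.

```python
def combine(m1, m2):
    """Dempster's rule: intersect focal sets, multiply masses,
    and renormalize away any conflicting (empty-intersection) mass."""
    combined, conflict = {}, 0.0
    for b1, w1 in m1.items():
        for b2, w2 in m2.items():
            inter = b1 & b2
            if inter:
                combined[inter] = combined.get(inter, 0.0) + w1 * w2
            else:
                conflict += w1 * w2
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Frame of discernment: every combination of insider/outsider and
# left-/right-handedness (member names are illustrative).
everyone = frozenset({"insider-left", "insider-right",
                      "outsider-left", "outsider-right"})
insiders = frozenset({"insider-left", "insider-right"})
lefties  = frozenset({"insider-left", "outsider-left"})

# Deduction 1: no forced entry, so an insider (belief 0.85);
# the remaining 0.15 stays on the whole frame (ignorance).
deduction1 = {insiders: 0.85, everyone: 0.15}
# Deduction 2: the burglar was left-handed (belief 0.75).
deduction2 = {lefties: 0.75, everyone: 0.25}

m = combine(deduction1, deduction2)
print(m[insiders & lefties])  # 0.85 * 0.75 ≈ 0.6375
```

The intersection of the two focal sets receives the product of their masses, which is precisely how the 0.6375 figure in the example arises.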

The results obtained with Dempster's combination rule are not affected by the order in which the pieces of evidence are combined, but combining the same evidence more than once counts it twice, which biases the belief assignments. This implies that the results obtained after the application of Dempster's combination rule are most accurate when the pieces of evidence being combined come from independent sources, or from experts whose opinions are not influenced by one another in any way.
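Both properties can be checked with a short self-contained sketch of the rule; the mass values are arbitrary illustrative numbers (chosen as exact binary fractions so the arithmetic is exact).

```python
# Self-contained sketch of Dempster's rule, used to demonstrate that the
# rule is order-insensitive but NOT idempotent.
def combine(m1, m2):
    combined, conflict = {}, 0.0
    for b1, w1 in m1.items():
        for b2, w2 in m2.items():
            inter = b1 & b2
            if inter:
                combined[inter] = combined.get(inter, 0.0) + w1 * w2
            else:
                conflict += w1 * w2
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

frame = frozenset({"tampered", "authentic"})
t = frozenset({"tampered"})
m1 = {t: 0.75, frame: 0.25}
m2 = {t: 0.5, frame: 0.5}

# Order of combination does not matter:
print(combine(m1, m2) == combine(m2, m1))  # True

# ...but combining the same evidence with itself counts it twice:
print(combine(m1, m1)[t])  # 0.9375, inflated from the original 0.75
```

This is why the rule should only fuse reports from independent sources: feeding it the same observation twice manufactures extra certainty out of nothing.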

Evidence theory in the image forensics domain

The first documented instance of the use of the evidence theory in the digital image forensics domain can be found in Hu et al. 2009, where the authors developed an evidence theory-based image trustworthiness evaluation model. This model used feature-level fusion and Dempster's combination rule to evaluate the degree of trustworthiness of a given image by detecting the presence (or absence) of four classes of forgeries: steganography (where data is secretly hidden within a digital image), splicing, aka copy-paste forgery (where regions or objects are inserted into or removed from a given image), double-compression (where a compressed image is re-compressed), and computer-generated imagery (CGI).

In this model, trustworthiness of the given image I was evaluated by measuring the plausibility of the proposition “A: I is trustworthy.” To measure this plausibility, the authors first generated the probabilities of presence of any of the aforementioned classes of forgeries in the given image, and then used those probabilities to calculate belief functions, which were then combined to assign a final degree of trustworthiness to the given image. The authors demonstrated that in the presence of conflicting evidence, evidence theory-based decision fusion helped measure the degree of trustworthiness of the given image with high accuracy, whereas ordinary image forensic models often failed to do so. A more elaborate version of this model was later presented in Hu et al. 2016.

Another evidence theory-based decision fusion framework was proposed in Fontani et al. 2011, aimed at examining the integrity of a known suspicious region of a given digital image. In this framework, three different tools, namely ToolA, ToolB, and ToolC, were developed to detect three kinds of copy-paste forgeries, and the reliability of each of these tools was evaluated separately. The presence (or absence) of a forgery was assessed by measuring the plausibility of the proposition "A: Image has undergone tampering detectable using ToolA." Analogous propositions were formed for ToolB and ToolC. Similarly, the reliability of ToolA was evaluated with the help of the proposition "TA: ToolA is reliable" (and likewise for ToolB and ToolC). Furthermore, to handle the uncertainty that arises when the three tools provide contradictory evidence regarding the presence or absence of a forgery, the framework employed an additional variable K, which measured the degree of conflict between the three tools. Finally, based on the obtained belief functions for the three tools as well as the degree of conflict, the given image was classified as authentic or tampered. The authors later extended this framework in Fontani et al. 2013.
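As a rough, self-contained illustration of this multi-clue idea (not the authors' actual formulation: the tool scores, reliabilities, discounting step, and decision threshold below are all assumptions), each tool's output can be turned into a mass function, discounted by the tool's reliability, and fused with Dempster's rule while tracking the conflict between tools:

```python
def discount(mass, reliability, frame):
    """Shafer's discounting: scale each mass by the tool's reliability
    and move the remainder onto the whole frame (ignorance)."""
    out = {s: reliability * w for s, w in mass.items()}
    out[frame] = out.get(frame, 0.0) + (1.0 - reliability)
    return out

def combine(m1, m2):
    """Dempster's rule; also returns the conflict mass K."""
    combined, conflict = {}, 0.0
    for b1, w1 in m1.items():
        for b2, w2 in m2.items():
            inter = b1 & b2
            if inter:
                combined[inter] = combined.get(inter, 0.0) + w1 * w2
            else:
                conflict += w1 * w2
    normalized = {k: v / (1.0 - conflict) for k, v in combined.items()}
    return normalized, conflict

frame = frozenset({"tampered", "authentic"})
T = frozenset({"tampered"})
A = frozenset({"authentic"})

# (supported hypothesis, detector score, tool reliability) -- all
# illustrative numbers; two tools suggest tampering, one disagrees.
tools = [(T, 0.9, 0.8), (T, 0.7, 0.9), (A, 0.5, 0.6)]

fused, max_conflict = {frame: 1.0}, 0.0
for focal, score, reliability in tools:
    mass = discount({focal: score, frame: 1.0 - score}, reliability, frame)
    fused, k = combine(fused, mass)
    max_conflict = max(max_conflict, k)  # track disagreement between tools

belief_tampered = fused.get(T, 0.0)
verdict = "tampered" if belief_tampered > 0.5 else "authentic"
print(round(belief_tampered, 2), verdict)  # ≈ 0.86 tampered
```

Here the dissenting third tool creates nonzero conflict K, yet after renormalization the fused belief still favors tampering; a practical system would also inspect K itself, since a large conflict signals that the tools' verdicts should not be trusted blindly.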

Evidence theory in the counter anti-forensics domain

Anti-forensics, simply stated, refers to the tactics used by a forger with the explicit intent to obstruct the digital investigation process. In the context of visual media forensics, an anti-forensic technique is a forgery technique that has been adapted and remodeled in a way that renders the unauthorized modification undetectable by contemporary tamper detection methods.

Counter anti-forensic strategies, as the name suggests, are strategies specifically designed to detect anti-forensically created forgeries, and the cornerstone principle for all of them is the same old Locard's exchange principle. While anti-forensic techniques are designed to hide the evidence of content manipulation, they inexorably leave behind certain identifying traces of their own. Therefore, in a scenario where the content in question is suspected of having been manipulated anti-forensically, it is generally advisable for the forensic analyst to integrate forensic and counter anti-forensic tools into a single forgery detection system. Such a system enables a more exhaustive analysis of the given content: while the forensic tool looks for the artifacts of the forgery, the counter anti-forensic tool looks for the artifacts of the anti-forensic operation. If neither tool detects any inconsistency in the given content, the authenticity of this content can be ascertained with a much higher degree of confidence.

This general idea was put to practical use in Fontani et al. 2014, where the authors developed a multi-clue framework to detect anti-forensically created copy-paste forgeries. The obtained results suggested that the forensic capabilities of an evidence theory-based multi-clue framework were significantly superior to the forensic capabilities of a simple forgery detection scheme.

Future prospects

Over the years, the domain of digital image forensics has received a lot of attention, and as a result, has been able to achieve several significant milestones. And though the evidence theory is a relatively recent addition to this domain, it has already established itself as a highly utilitarian decision fusion scheme.

The domain of digital video forensics, on the other hand, has yet to reach a desired level of proficiency, and as this research domain progresses further on its developmental path, it is bound to encounter several challenges, one of which would be the need to formulate effective strategies capable of handling uncertainties that arise when forensic analysts come across conflicting and/or ambiguous evidence vis-à-vis the presence (or absence) of a forgery in the given content. In circumstances riddled with such epistemic uncertainties, the theory of evidence could make for quite an advantageous decision fusion scheme.

The concept of uncertainty lacks a unanimously approved interpretation, but in a real-world or computational scenario, uncertainty can be described as a state of having incomplete, inconsistent, and/or imperfect knowledge about an event, process or system. In domains that involve critical decision-making processes, such as economics, statistics, logic, probability, game theory, computer science and artificial intelligence, certain decisions are more likely to be accurate (or inaccurate) than others, depending on how precisely we analyze and interpret the available information.

Digital visual media forensics is one such domain, where uncertainty caused by incomplete, conflicting, and/or misleading evidence could lead to inaccurate decisions regarding the authenticity of the given content, which could in turn have long-lasting, damaging effects. Accurate and definitive decision-making in this domain is therefore contingent on how effectively we handle such uncertainties, if and when they arise. The evidence theory is a mathematical formalization that allows us to combine contradictory or incomplete pieces of information from multiple heterogeneous sources and observations in order to reach a final decision regarding content trustworthiness. Despite having been around for decades, the faculties of the evidence theory have only recently been investigated in the context of digital visual media forensics. And while this theory has already made significant headway in the image forensics domain, the extent of its applicability in the video forensics domain remains to be seen.

Raahat Devender Singh is a Ph.D. student in the Department of Computer Science and Engineering in UIET, Panjab University, Chandigarh, India. She is currently working in the digital visual media forensics domain, with primary focus on digital video content authentication, forgery and tamper detection.