Visual Information Fidelity is a full reference image quality assessment index based on natural scene statistics and the notion of image information extracted by the human visual system. It was developed by Hamid R Sheikh and Alan Bovik at the Laboratory for Image and Video Engineering (LIVE) at the University of Texas at Austin in 2006. It is deployed in the core of the Netflix VMAF video quality monitoring system, which controls the picture quality of all encoded videos streamed by Netflix.
Images and videos of the three-dimensional visual environments come from a common class: the class of natural scenes. Natural scenes from a tiny subspace in the space of all possible signals, and researchers have developed sophisticated models to characterize these statistics. Most real-world distortion processes disturb these statistics and make the image or video signals unnatural. The VIF index employs natural scene statistical (NSS) models in conjunction with a distortion (channel) model to quantify the information shared between the test and the reference images. Further, the VIF index is based on the hypothesis that this shared information is an aspect of fidelity that relates well with visual quality. In contrast to prior approaches based on the human visual system (HVS) error-sensitivity and measurement of structure, this statistical approach uses in an information-theoretic setting, yields a full reference (FR) quality assessment (QA) method that does not rely on any HVS or viewing geometry parameter, nor any constants requiring optimization, and yet is competitive with state of the art QA methods.
Specifically, the reference image is modeled as being the output of a stochastic `natural' source that passes through the HVS channel and is processed later by the brain. The information content of the reference image is quantified as being the mutual information between the input and output of the HVS channel. This is the information that the brain could ideally extract from the output of the HVS. The same measure is then quantified in the presence of an image distortion channel that distorts the output of the natural source before it passes through the HVS channel, thereby measuring the information that the brain could ideally extract from the test image. This is shown pictorially in Figure 1. The two information measures are then combined to form a visual information fidelity measure that relates visual quality to relative image information.