Visual Quality Assessment (VQA)

We design algorithms that predict the visual quality of images and videos with respect to both technical and perceptual aspects, such as quality of experience (QoE). The tools of our trade include crowdsourcing, machine learning (in particular deep networks), and eye tracking. To support this work, we are creating massive multimedia databases that are suitable for training generic and accurate VQA models.

Project Description

The visual quality of images and videos is typically assessed by human experts, whose ratings are averaged into mean opinion scores (MOS). Because such studies are costly and time-consuming, methods for automated image/video quality assessment (IQA/VQA) are desirable for the multimedia and signal processing community at large. Current VQA methods are designed solely to capture aspects of the technical quality of displayed video streams. Beyond this technical quality, we aim at methods that characterize images and videos in terms of other perceptual aspects.

These aspects include the number and magnitude of eye movements required for viewing the content, the viewer's appreciation of the use of color, and the degree of interestingness. Several such components are combined with visual quality into an overall rating of perceptual quality. Moreover, by investigating the human perceptual process and understanding psychophysical phenomena, we will develop a saliency model that is based on a Markov model of eye movements.
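The text does not specify how the Markov model of eye movements is built, so the following is only a minimal sketch under stated assumptions: the image is split into a few regions (the states), recorded fixation sequences give region-to-region transition counts, and the stationary distribution of the resulting chain is read as a saliency score per region. The function names, the smoothing constant, and the toy scanpaths are all hypothetical.

```python
import numpy as np

def transition_matrix(scanpaths, n_regions, smoothing=1.0):
    """Estimate region-to-region transition probabilities from fixation sequences.

    Additive smoothing keeps every entry positive, so the chain is ergodic
    and has a unique stationary distribution.
    """
    counts = np.full((n_regions, n_regions), smoothing, dtype=float)
    for path in scanpaths:
        for src, dst in zip(path[:-1], path[1:]):
            counts[src, dst] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)

def stationary_distribution(P, n_iter=1000, tol=1e-12):
    """Power iteration for the row vector pi that satisfies pi = pi @ P."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(n_iter):
        nxt = pi @ P
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi

# Hypothetical scanpaths: each list is one viewer's sequence of fixated regions.
scanpaths = [[0, 1, 1, 2, 1], [3, 1, 2, 1, 0], [2, 1, 1, 3, 1]]
P = transition_matrix(scanpaths, n_regions=4)
saliency = stationary_distribution(P)  # one score per region, sums to 1
```

In this toy data most transitions lead into region 1, so it receives the largest stationary probability. A real model would need a principled choice of states and would likely condition transitions on image content.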

Additionally, we will bring the state of the art a step closer to reality by building and using media databases of authentic distortions and diverse content, in contrast to current scientific data sets, which contain only a small variety of content and 'artificial' distortions.

Given an image or video whose visual quality is to be assessed, the question arises as to which IQA/VQA algorithm should be applied. Instead of choosing an algorithm based on its performance on a fixed test database, we assume that a better quality assessment is possible when the algorithm is chosen based on a query-specific test database consisting only of images/videos similar to the query image/video.

This raises a number of questions:

  • What type of similarity is the most appropriate for this application? 
  • What statistical/perceptual features should be extracted to express similarity? 
  • How can the statistical/perceptual similarity of the input image and the test images be estimated? 
  • Should algorithms be combined to get more robust results?
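One possible way to make the idea concrete is sketched below. It assumes answers to the questions above that the text leaves open: similarity is taken to be Euclidean distance between feature vectors, the "similar" subset is the k nearest database items, and the candidate with the lowest mean absolute error against MOS on that subset is selected. The function name, the feature vectors, and the two candidate algorithms are hypothetical.

```python
import numpy as np

def select_algorithm(query_feat, db_feats, db_mos, db_preds, k=5):
    """Pick the candidate with the lowest mean absolute error on the
    k database items closest to the query in feature space.

    db_preds maps an algorithm name to its predicted scores for every
    database item, on the same scale as db_mos.
    """
    dists = np.linalg.norm(db_feats - query_feat, axis=1)
    neighbors = np.argsort(dists)[:k]
    errors = {
        name: float(np.mean(np.abs(preds[neighbors] - db_mos[neighbors])))
        for name, preds in db_preds.items()
    }
    best = min(errors, key=errors.get)
    return best, errors

# Toy database: two content clusters in a 2-D feature space.
rng = np.random.default_rng(0)
db_feats = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
                      rng.normal(5.0, 0.1, (20, 2))])
db_mos = rng.uniform(1.0, 5.0, 40)

# algo_a is exact on the first cluster, algo_b on the second.
offset = np.where(np.arange(40) < 20, 0.0, 1.0)
db_preds = {"algo_a": db_mos + offset, "algo_b": db_mos + (1.0 - offset)}

best_first, _ = select_algorithm(np.array([0.0, 0.0]), db_feats, db_mos, db_preds)
best_second, _ = select_algorithm(np.array([5.0, 5.0]), db_feats, db_mos, db_preds)
```

The returned per-algorithm errors could also serve as a starting point for combining algorithms, for example by weighting each prediction inversely to its local error, although whether that yields more robust results is exactly the open question.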

The MMSP VQA Database Collection

The KoNViD-1k Database

Subjective video quality assessment (VQA) strongly depends on semantics, context, and the types of visual distortions. Many existing VQA databases cover only small numbers of video sequences with artificial distortions. Newly developed Quality of Experience (QoE) models and metrics are commonly evaluated against subjective data from such databases, which are the result of perception experiments. However, since the aim of these QoE models is to accurately predict the quality of natural videos, these artificially distorted video databases are an insufficient basis for learning. Additionally, their small sizes make them only marginally usable for state-of-the-art learning systems, such as deep learning. To provide a better basis for the development and evaluation of objective VQA methods, we have created a larger dataset of natural, real-world video sequences with corresponding subjective mean opinion scores (MOS) gathered through crowdsourcing.
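Evaluating a metric against subjective data usually means correlating its predictions with the MOS values. The sketch below computes two widely used measures, the Pearson linear correlation coefficient (PLCC) and the Spearman rank-order correlation coefficient (SROCC), using NumPy only. The helper names and the five example scores are hypothetical and are not taken from the database.

```python
import numpy as np

def rankdata_average(x):
    """Ranks starting at 1; tied values share the average of their ranks."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    sorted_x = x[order]
    ranks = np.empty(len(x), dtype=float)
    i = 0
    while i < len(x):
        j = i
        # Extend j to the end of the current run of equal values.
        while j + 1 < len(x) and sorted_x[j + 1] == sorted_x[i]:
            j += 1
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0
        i = j + 1
    return ranks

def plcc(pred, mos):
    """Pearson linear correlation between predictions and MOS."""
    return float(np.corrcoef(pred, mos)[0, 1])

def srocc(pred, mos):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return plcc(rankdata_average(pred), rankdata_average(mos))

# Made-up scores: the metric is monotonic in MOS but not linear.
mos = np.array([1.8, 2.5, 3.1, 3.9, 4.6])
pred = np.array([0.10, 0.20, 0.40, 0.80, 1.60])

linear_agreement = plcc(pred, mos)
rank_agreement = srocc(pred, mos)
```

Here the rank agreement is perfect while the linear agreement is lower, which is why SROCC is often reported alongside PLCC when a metric's output scale differs from the MOS scale.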


The KonIQ-10k Database

The main challenge in applying state-of-the-art deep learning methods to predict image quality in the wild is the relatively small size of existing quality-scored datasets. Larger datasets are lacking because generating diverse and publishable content requires massive resources. To address this, we have created a large IQA database of natural, real-world images with corresponding mean opinion scores (MOS) gathered through crowdsourcing.
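As a rough illustration of how per-image MOS values can be derived from crowdsourced ratings, the sketch below averages the ratings of each image and attaches a 95% confidence half-width based on a normal approximation. The aggregation protocol actually used for the database is not described here, so the rating scale, the image identifiers, and the function name are assumptions, and worker screening is omitted entirely.

```python
import numpy as np

def mean_opinion_scores(ratings):
    """Map each item to (MOS, 95% confidence half-width).

    ratings: dict of item id -> list of individual ratings.
    The half-width uses a normal approximation and the sample standard
    deviation, so it is undefined (NaN) for items with a single rating.
    """
    result = {}
    for item, scores in ratings.items():
        s = np.asarray(scores, dtype=float)
        if len(s) > 1:
            half = 1.96 * s.std(ddof=1) / np.sqrt(len(s))
        else:
            half = float("nan")
        result[item] = (float(s.mean()), float(half))
    return result

# Hypothetical ratings on an assumed 1-to-5 scale.
ratings = {
    "img_001": [4, 5, 4, 3, 4],
    "img_002": [2, 1, 2, 3, 2],
}
mos_table = mean_opinion_scores(ratings)
```

With only five ratings per image the confidence intervals are wide, which hints at why a large crowd is needed to obtain reliable scores at this scale.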


Project Members


  • Varga, D., Szirányi, T., Saupe, D. - DeepRN: A content preserving deep architecture for blind image quality assessment, IEEE International Conference on Multimedia and Expo (ICME), 2018.
  • Wiedemann, O., Hosu, V., Lin, H., Saupe, D. - Disregarding the big picture: Towards local image quality assessment, 10th International Conference on Quality of Multimedia Experience (QoMEX), 2018.
  • Hosu, V., Lin, H., Saupe, D. - Expertise screening in crowdsourcing image quality, 10th International Conference on Quality of Multimedia Experience (QoMEX), 2018.
  • Men, H., Lin, H., Saupe, D. - Spatiotemporal Feature Combination Model for No-Reference Video Quality Assessment, 10th International Conference on Quality of Multimedia Experience (QoMEX), 2018.
  • Lin, H., Hosu, V., Saupe, D. - KonIQ-10K: Towards an ecologically valid and large-scale IQA database, arXiv preprint arXiv:1803.08489, 2018.
  • Hosu, V., Hahn, F., Jenadeleh, M., Lin, H., Men, H., Szirányi, T., Li, S., Saupe, D. - The Konstanz natural video database (KoNViD-1k), 9th International Conference on Quality of Multimedia Experience (QoMEX), 2017.
  • Men, H., Lin, H., Saupe, D. - Empirical evaluation of no-reference VQA methods on a natural video quality database, 9th International Conference on Quality of Multimedia Experience (QoMEX), 2017.
  • Hosu, V., Hahn, F., Zingman, I., Saupe, D. - Reported Attention as a Promising Alternative to Gaze in IQA Tasks, 5th International Workshop on Perceptual Quality of Systems 2016 (PQS 2016), Berlin, August 2016.
  • Saupe, D., Hahn, F., Hosu, V., Zingman, I., Rana, R., Li, S. - Crowd workers proven useful: A comparative study of subjective video quality assessment, Eighth International Workshop on Quality of Multimedia Experience (QoMEX 2016), Lisbon, June 2016.