Personnel (Past, Present): Mohammed Aabed, Gukyeong Kwon, Ghassan AlRegib
Goal: To objectively analyze video quality by studying the impact of network losses on HEVC videos and the resulting error propagation. Features such as optical flow descriptors, power spectral density (PSD), and SSIM serve as the basis for perceptual video quality assessment (PVQA) metrics.
Challenges: The superior coding performance of HEVC comes at the expense of a more complex encoding operation than that of AVC. HEVC introduces the coding tree unit (CTU) structure, which allows more flexibility in coding, transform, and prediction modes. These features make the bitstream and the decoded sequence more sensitive to errors and losses because of the higher level of data dependency. This in turn makes video quality assessment and monitoring, error concealment, and related tasks more challenging. In addition, the difference mean opinion score (DMOS) is used as ground truth to measure the accuracy of PVQA schemes and metrics. However, this abstracts a huge four-dimensional correlated signal (video) into a single floating-point number (the DMOS). Hence, interpreting the correlation between video features and perceptual visual quality becomes a tedious task with ambiguous operational blocks.
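To make the role of DMOS concrete, the sketch below shows one common way to score a quality metric against subjective ground truth: compute the Pearson (linear) and Spearman (rank) correlation between the metric's per-video outputs and the DMOS values. The function names and the toy numbers are illustrative assumptions, not code or measurements from the cited work.

```python
import math

def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Placeholder values for illustration only (not experimental data).
# Here a higher metric score means better quality, a higher DMOS means worse.
metric_scores = [0.91, 0.85, 0.60, 0.42, 0.30]
dmos = [12.0, 20.0, 41.0, 55.0, 70.0]

print("Pearson:", pearson(metric_scores, dmos))
print("Spearman:", spearman(metric_scores, dmos))
```

Both coefficients collapse an entire set of videos into one number, which illustrates the abstraction problem described above: a high correlation says little about which spatial or temporal features of the video drive the agreement.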
Our work: In [1], we estimate channel-induced distortion in the video assuming access to the decoded video only, with no access to the bitstream or the decoder. Our model makes no assumptions about the coding conditions, network loss patterns, or error concealment techniques. The proposed approach relies only on the temporal variations of the power spectrum across the decoded frames.
In [2, 5], we introduce a perceptual quality assessment framework for streamed videos using optical flow features. This approach is reduced-reference and pixel-based, and it relies only on the deviation of the optical flow of the corrupted frames. It compares an optical flow descriptor from the received frame against the descriptor obtained from the anchor frame, which makes it suitable for videos with complex motion patterns. As in [1], the technique makes no assumptions about the coding conditions, network loss patterns, or error concealment techniques.
In [3], we propose a perceptual video quality assessment (PVQA) metric for distorted videos by analyzing the power spectral density (PSD) of a group of pictures. This estimation approach relies on changes in video dynamics, calculated in the frequency domain, that are primarily caused by distortion. We obtain a feature map by processing a 3D PSD tensor obtained from a set of distorted frames. This is a full-reference tempospatial approach that considers both temporal and spatial PSD characteristics.
In [4], we investigate the challenge of distortion map feature selection and spatiotemporal pooling in PVQA. We analyze three distortion maps representing different visual features spatially and temporally: squared error, local pixel-level SSIM, and absolute difference of optical flow magnitudes. We examine the performance of each of these maps with different spatial and temporal pooling strategies across three databases.
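The pipeline studied in [4] (compute a per-pixel distortion map for each frame, pool it spatially into one value per frame, then pool those values over time) can be sketched as follows. This minimal illustration uses only the squared-error map; the SSIM and optical-flow maps would feed into the same pooling step. The function names and the particular pooling statistics (mean, maximum, 95th percentile) are assumptions chosen for illustration, not the strategies evaluated or recommended in [4].

```python
import numpy as np

def squared_error_maps(reference, distorted):
    """Per-pixel squared error for videos shaped (frames, height, width)."""
    ref = np.asarray(reference, dtype=np.float64)
    dist = np.asarray(distorted, dtype=np.float64)
    if ref.shape != dist.shape:
        raise ValueError("reference and distorted videos must have the same shape")
    return (ref - dist) ** 2

# Hypothetical pooling statistics; each reduces an array along the given axes.
POOLERS = {
    "mean": lambda a, axis: np.mean(a, axis=axis),
    "max": lambda a, axis: np.max(a, axis=axis),
    "p95": lambda a, axis: np.percentile(a, 95, axis=axis),
}

def pool_video(maps, spatial="mean", temporal="mean"):
    """Pool (frames, H, W) distortion maps spatially, then temporally."""
    per_frame = POOLERS[spatial](maps, axis=(1, 2))     # one value per frame
    return float(POOLERS[temporal](per_frame, axis=0))  # one value per video

# Toy example: a 3-frame video with a single corrupted pixel in frame 1.
ref = np.zeros((3, 4, 4))
dist = ref.copy()
dist[1, 0, 0] = 4.0
maps = squared_error_maps(ref, dist)

for spatial in POOLERS:
    for temporal in POOLERS:
        print(spatial, temporal, pool_video(maps, spatial, temporal))
```

Even in this toy case the pooled score depends strongly on the chosen statistics: mean pooling dilutes a localized error, while maximum pooling reports only the worst pixel in the worst frame.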
We identify the most effective spatial and temporal statistical pooling strategies for PVQA. We also show which spatial and temporal features correlate most strongly with perception for each distortion map.
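As an illustration of the spectral idea behind [1] and [3], the sketch below stacks the 2D power spectrum of each frame into a 3D tensor and measures how much the spectrum changes from one frame to the next. This is only a rough approximation under stated assumptions: the periodogram estimate of the PSD, the mean-absolute-difference measure, and the function names are hypothetical choices, and the cited papers define their own features and pooling.

```python
import numpy as np

def psd_tensor(frames):
    """Stack per-frame 2D periodograms into a (frames, H, W) tensor."""
    frames = np.asarray(frames, dtype=np.float64)
    spectrum = np.fft.fft2(frames, axes=(1, 2))  # 2D FFT of every frame
    h, w = frames.shape[1:]
    return np.abs(spectrum) ** 2 / (h * w)

def temporal_psd_variation(frames):
    """Mean absolute PSD change between consecutive frames (length frames - 1)."""
    psd = psd_tensor(frames)
    return np.mean(np.abs(np.diff(psd, axis=0)), axis=(1, 2))

# Toy example: a static random frame repeated four times,
# with a block zeroed out in frame 2 to mimic a lost region.
rng = np.random.default_rng(0)
frame = rng.random((8, 8))
video = np.stack([frame] * 4)
video[2, :4, :4] = 0.0

variation = temporal_psd_variation(video)
print(variation)
```

In this toy video the variation is zero between the two identical leading frames and rises at the transitions into and out of the corrupted frame, which is the kind of temporal signature a spectrum-based estimator can exploit without access to the bitstream.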
References:
[1] M. Aabed and G. AlRegib, “No-Reference Quality Assessment of HEVC Videos in Loss-Prone Networks,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, May 4-9, 2014.
[2] M. Aabed and G. AlRegib, “PeQASO: Perceptual Quality Assessment of Streamed Videos Using Optical Flow Features,” in IEEE Transactions on Broadcasting, 2018.
[3] M. A. Aabed, G. Kwon, and G. AlRegib, “Power of Tempospatially Unified Spectral Density for Perceptual Video Quality Assessment,” in IEEE International Conference on Multimedia and Expo (ICME), Hong Kong, 2017.
[4] M. Aabed and G. AlRegib, “Perceptual Video Quality Assessment: Spatiotemporal Pooling Strategies for Different Distortions and Visual Maps,” in IEEE International Workshop on Multimedia Signal Processing (MMSP), Montreal, Canada, Sep. 21-23, 2016.
[5] M. Aabed and G. AlRegib, “Reduced-Reference Perceptual Quality Assessment for Video Streaming,” in IEEE International Conference on Image Processing (ICIP), Québec City, Canada, Sep. 27-30, 2015.