Blind visual quality assessment on omnidirectional or 360° video (ProVQA)

Blind VQA for 360° Video via Progressively Learning from Pixels, Frames and Video

This repository contains the official PyTorch implementation of the following paper:

Blind VQA for 360° Video via Progressively Learning from Pixels, Frames and Video
Li Yang, Mai Xu, ShengXi Li, YiChen Guo and Zulin Wang (School of Electronic and Information Engineering, Beihang University)
Paper link: https://arxiv.org/abs/2111.09503
Abstract: Blind visual quality assessment (BVQA) on 360° video plays a key role in optimizing immersive multimedia systems. When assessing the quality of 360° video, humans tend to perceive its quality degradation from the viewport-based spatial distortion of each spherical frame to motion artifacts across adjacent frames, ending with the video-level quality score, i.e., a progressive quality assessment paradigm. However, the existing BVQA approaches for 360° video neglect this paradigm. In this paper, we take into account the progressive paradigm of human perception towards spherical video quality, and thus propose a novel BVQA approach (namely ProVQA) for 360° video via progressively learning from pixels, frames and video. Corresponding to the progressive learning of pixels, frames and video, three sub-nets are designed in our ProVQA approach, i.e., the spherical perception aware quality prediction (SPAQ), motion perception aware quality prediction (MPAQ) and multi-frame temporal non-local (MFTN) sub-nets. The SPAQ sub-net first models the spatial quality degradation based on the spherical perception mechanism of humans. Then, by exploiting motion cues across adjacent frames, the MPAQ sub-net properly incorporates motion contextual information for quality assessment on 360° video. Finally, the MFTN sub-net aggregates multi-frame quality degradation to yield the final quality score, via exploring long-term quality correlation from multiple frames. The experiments validate that our approach significantly advances the state-of-the-art BVQA performance on 360° video over two datasets. The code has been made public at https://github.com/yanglixiaoshen/ProVQA.
Note: Since this paper is under review, you can ask me for the paper to ease the implementation of this project, but you have no right to use this paper for any other purpose. Unauthorized use of this article will be investigated for legal responsibility. Contact me for access to the paper (Email: 13021041[at]buaa[dot]edu[dot]cn).

Preparation

Requirements

First, download the conda environment of ProVQA from ProVQA_dependency and create the conda environment <envs> on a Linux system (Ubuntu 18.04+) by running the following command:

conda env create -f ProVQA_environment.yaml

Second, install all dependencies by running the following command:

pip install -r ProVQA_environment.txt

If the above installation does not work, you can download the packed environment in .tar.gz format. Then, extract it into a directory (e.g., pro_env) in your home directory and activate the environment each time before running the code:

source activate /home/xxx/pro_env

Implementation

The architecture of the proposed ProVQA is shown in the following figure, which contains four novel modules, i.e., SPAQ, MPAQ, MFTN and AQR.

Dataset

We trained our ProVQA on the large-scale 360° VQA dataset VQA-ODV, which includes 540 impaired 360° videos derived from 60 reference 360° videos under equi-rectangular projection (ERP), split into 432 training and 108 testing videos. Besides, we also evaluate the performance of our ProVQA on 144 distorted 360° videos from the BIT360 dataset.

Training the ProVQA

Our network is implemented based on the PyTorch framework and runs on two NVIDIA Tesla V100 GPUs with 32 GB of memory. The number of sampled frames is 6 and the batch size is 3 per GPU for each iteration. The training set of the VQA-ODV dataset has been packed into an LMDB file, ODV-VQA_Train, which is used by our approach.
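
For readers unfamiliar with LMDB-packed training data, the following sketch shows how a single entry could be read with the lmdb package. The key naming and frame shape here are illustrative assumptions, not the exact conventions of this repository's data loader.

# Minimal LMDB read sketch; key format and frame shape are hypothetical.
import lmdb
import numpy as np

def read_frame(lmdb_path, key, shape=(240, 480, 3), dtype=np.uint8):
    env = lmdb.open(lmdb_path, readonly=True, lock=False, readahead=False)
    with env.begin(write=False) as txn:
        buf = txn.get(key.encode("ascii"))
        if buf is None:
            raise KeyError(f"key {key!r} not found in {lmdb_path}")
        # Decode the raw bytes into a frame array.
        frame = np.frombuffer(buf, dtype=dtype).reshape(shape)
    env.close()
    return frame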

First, run the training code as follows:

CUDA_VISIBLE_DEVICES=0,1 python ./train.py -opt ./options/train/bvqa360_240hz.yaml

Note that all settings for the dataset, training implementation and network can be found in "bvqa360_240hz.yaml". You can modify these settings to match your experimental environment; for example, the dataset path should be changed to your server path. For the final BVQA result, we choose the trained model at iter=26400, which can be downloaded from saved_model. Moreover, the corresponding training state can be obtained from saved_optimized_state.
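
As a sketch of such a modification, the snippet below loads the option file with PyYAML and overrides a dataset path before saving a local copy. The option key names used here ("datasets", "train", "dataroot") are assumptions and should be adapted to the actual structure of bvqa360_240hz.yaml.

# Hypothetical option override; inspect the real YAML file and adjust the keys.
import yaml

with open("./options/train/bvqa360_240hz.yaml") as f:
    opt = yaml.safe_load(f)

# Point the training data at a local LMDB path (key names are assumptions).
opt.setdefault("datasets", {}).setdefault("train", {})["dataroot"] = "/data/ODV-VQA_Train"

with open("./options/train/bvqa360_240hz_local.yaml", "w") as f:
    yaml.safe_dump(opt, f)

The modified file can then be passed to train.py through the -opt flag, as in the command above.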

Testing the ProVQA

Download the saved_model and put it in your own experimental directory. Then, run the following code to evaluate the BVQA performance on the testing set ODV-VQA_TEST. Note that all settings for the testing set, testing implementation and results can be found in "test_bvqa360_OURs.yaml". You can modify these settings to match your experimental environment.

CUDA_VISIBLE_DEVICES=0 python ./test.py -opt ./options/test/test_bvqa360_OURs.yaml

The predicted quality scores of all test 360° video frames can be found in All_frame_scores. Afterwards, run the following code to generate the final 108 scores corresponding to the 108 test 360° videos, which can be downloaded from predicted_DMOS.

python ./evaluate.py
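
The aggregation performed by evaluate.py is not detailed here; the sketch below shows one plausible implementation that simply averages the frame-level scores of each test video, given an array of frame scores and a parallel list of video IDs. The actual script may pool the scores differently.

# Hypothetical frame-to-video aggregation (simple mean per video).
import numpy as np

def aggregate_frame_scores(frame_scores, video_ids):
    """Return one quality score per video by averaging its frame scores."""
    frame_scores = np.asarray(frame_scores, dtype=np.float64)
    video_ids = np.asarray(video_ids)
    return {vid: float(frame_scores[video_ids == vid].mean())
            for vid in np.unique(video_ids)}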

Evaluate BVQA performance

We evaluate the BVQA performance for 360° videos with 5 general metrics: PLCC, SROCC, KROCC, RMSE and MAE. We employ a 4-parameter logistic function to fit the predicted quality scores to their corresponding ground truth, such that the fitted scores have the same scale as the ground-truth DMOS gt_dmos. Note that this fitting procedure is applied to our approach and to all compared approaches. Run the script bvqa360_metric1.m with the following command:

./bvqa360_metric1.m

Running this script yields the final results of PLCC=0.9209, SROCC=0.9236, KROCC=0.7760, RMSE=4.6165 and MAE=3.1136. The following tables show the comparison of BVQA performance between our approach and 13 other approaches over the VQA-ODV and BIT360 datasets.
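
For reference, a minimal Python sketch of this evaluation protocol (4-parameter logistic fitting followed by the five metrics) is given below, assuming numpy and scipy. It illustrates the computation and is not a drop-in replacement for the provided MATLAB script.

# Illustrative re-implementation of the evaluation protocol; the logistic form
# is the 4-parameter variant commonly used in VQA studies (an assumption here).
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import pearsonr, spearmanr, kendalltau

def logistic_4(x, b1, b2, b3, b4):
    # Monotone mapping from predicted scores to the DMOS scale.
    return (b1 - b2) / (1.0 + np.exp(-(x - b3) / np.abs(b4))) + b2

def evaluate(pred, dmos):
    pred = np.asarray(pred, dtype=np.float64)
    dmos = np.asarray(dmos, dtype=np.float64)
    # Initial guesses: output range from the ground truth, midpoint/scale from predictions.
    p0 = [dmos.max(), dmos.min(), pred.mean(), pred.std() + 1e-6]
    params, _ = curve_fit(logistic_4, pred, dmos, p0=p0, maxfev=20000)
    fitted = logistic_4(pred, *params)
    return {
        "PLCC": pearsonr(fitted, dmos)[0],
        "SROCC": spearmanr(pred, dmos)[0],   # rank metrics are unaffected by the fit
        "KROCC": kendalltau(pred, dmos)[0],
        "RMSE": float(np.sqrt(np.mean((fitted - dmos) ** 2))),
        "MAE": float(np.mean(np.abs(fitted - dmos))),
    }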

Tips

(1) We have summarized in detail how to run the compared algorithms in the file "compareAlgoPreparation.txt".
(2) Details of the pre-processing of the ODV-VQA and BIT360 datasets can be found in the file "pre_process_dataset.py".

Citation

If this repository helps your research, please cite the paper:

@misc{yang2021blind,
      title={Blind VQA on 360{\deg} Video via Progressively Learning from Pixels, Frames and Video}, 
      author={Li Yang and Mai Xu and Shengxi Li and Yichen Guo and Zulin Wang},
      year={2021},
      eprint={2111.09503},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

  1. https://github.com/xinntao/EDVR
  2. https://github.com/AlexHex7/Non-local_pytorch
  3. https://github.com/ChiWeiHsiao/SphereNet-pytorch

Please enjoy it and best wishes. Please contact me if you have any questions about the ProVQA approach.

My email address is 13021041[at]buaa[dot]edu[dot]cn
