Q1Net: Quality Level Prediction of Image Compression using Block-wise Confidence-aware CNN

Official implementation of the BMVC 2021 paper. Paper: https://bmva-archive.org.uk/bmvc/2021/conference/papers/paper_0813.html

Q1Net predicts the quality level of a compressed image (e.g. the JPEG quality factor) directly from the image, using a block-wise, confidence-aware CNN.

Highlights

Real-time: predicts the compression quality level in milliseconds, fast enough to run on mobile devices.
Accurate: over 99% accuracy in the paper's experiments.
Block-wise & confidence-aware: exploits the characteristic deformations transform coding leaves on small blocks, estimates a per-patch confidence, and fuses only the reliable patches instead of processing the whole image.
Deployable: exports to TensorFlow Lite for on-device inference.

How it works

Instead of looking at the whole image, Q1Net samples small patches around coding blocks, runs a lightweight CNN on each patch to predict a quality value together with a confidence, keeps only the high-confidence patches, and fuses them:

flowchart LR
    A[Input image] --> B[Sample small patches<br/>around coding blocks]
    B --> C[CNN backbone]
    C --> D[Per-patch:<br/>confidence + quality]
    D --> E{confidence above<br/>threshold?}
    E -- yes --> F[Keep patch]
    E -- no --> G[Discard patch]
    F --> H[Fuse by median<br/>= predicted quality]
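
The filter-and-fuse step at the end of the diagram can be sketched in plain Python. This is an illustrative sketch, not the repository's code: the function name `fuse_patch_predictions`, the 0.5 threshold, and the fallback to all patches when none passes the threshold are assumptions made for the example.

```python
from statistics import median


def fuse_patch_predictions(patches, threshold=0.5):
    """Fuse per-patch (confidence, quality) pairs into one quality estimate.

    `patches` is a list of (confidence, quality) tuples, one per patch.
    The threshold value is a placeholder, not the value used by Q1Net.
    """
    if not patches:
        raise ValueError("at least one patch prediction is required")

    # Keep only the quality values of patches the model is confident about.
    kept = [quality for confidence, quality in patches if confidence > threshold]

    # Assumption for this sketch: if no patch passes, use every patch.
    if not kept:
        kept = [quality for _, quality in patches]

    # The median is robust to the few confident-but-wrong patches that remain.
    return median(kept)


predictions = [(0.9, 20.0), (0.8, 21.0), (0.2, 75.0), (0.95, 19.0)]
print(fuse_patch_predictions(predictions))  # 20.0
```

The low-confidence patch that predicts 75 is discarded before fusion, so it cannot pull the estimate away from the consensus of the reliable patches.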

The per-patch backbone is a compact residual CNN operating on 16x16x3 patches:

flowchart LR
    I[16x16x3 patch] --> S["CBR + Bottleneck stages<br/>channels 8 - 16 - 32 - 64 - 32 - 16"]
    S --> CV[Conv 3x3, ReLU]
    CV --> P[Global average pooling]
    P --> O["Dense 2, sigmoid x100<br/>= confidence, quality"]

CBR denotes a Conv, BatchNorm, ReLU sequence; the bottleneck is a residual block made of 1x1, 3x3, and 1x1 convolutions. The confidence-aware loss down-weights unreliable patches during training.

Results

Confusion matrices over 10,000 compressed images spanning all 100 quality levels (Figure 4 from the paper). A sharper diagonal means more accurate quality prediction: Q1Net (c) produces a markedly tighter diagonal than MobileNetV2 (a) and JQE (b), staying accurate across the full quality range.

Authors

Kyuwon Kim (chammoru at gmail, q1.kim at samsung)
Chulju Yang (ijn9429 at gmail, chulju at samsung)

Citation

@InProceedings{kim2021q1net,
  title     = {Quality Level Prediction of Image Compression using Block-wise Confidence-aware CNN},
  author    = {Kim, Kyuwon and Yang, Chulju},
  booktitle = {Proceedings of the British Machine Vision Conference (BMVC)},
  month     = {November},
  year      = {2021}
}

Requirements

Python 3 (tested on 3.12)
The pinned packages in requirements.txt (TensorFlow 2.16, installed in the setup step below)

TensorFlow 2.16 defaults to Keras 3, but this project uses the Keras 2 API. env.sh exports TF_USE_LEGACY_KERAS=1 so the tf-keras (Keras 2) implementation is used.
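
When the code runs outside a shell that has sourced env.sh (for example in a notebook), the same switch can be set from Python. This is a minimal sketch, assuming it is enough to set the variable before TensorFlow is first imported; the helper name `enable_legacy_keras` is hypothetical and not part of this repository.

```python
import os
import sys


def enable_legacy_keras():
    """Request the Keras 2 (tf-keras) implementation behind tf.keras.

    Returns False if TensorFlow was already imported in this process,
    in which case the setting may come too late to take effect.
    """
    already_imported = "tensorflow" in sys.modules
    os.environ["TF_USE_LEGACY_KERAS"] = "1"
    return not already_imported


if enable_legacy_keras():
    print("TF_USE_LEGACY_KERAS =", os.environ["TF_USE_LEGACY_KERAS"])
else:
    print("TensorFlow was imported before the variable was set")

# Import TensorFlow only after the variable is set, for example:
# import tensorflow as tf
```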

Dataset

This project uses the DIV2K dataset.

Clone and setup

The pretrained model weights are tracked with Git LFS, so install it before cloning:

# install Git LFS once per machine, then enable it for your user
git lfs install

git clone https://github.com/chammoru/Q1Net.git
cd Q1Net

# (recommended) create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate

# install the pinned dependencies
pip install -r requirements.txt

# go to the source directory and set up the environment
# (adds the repo root to PYTHONPATH and exports TF_USE_LEGACY_KERAS=1)
cd classifier
. ./env.sh

If you already cloned the repository without Git LFS, fetch the weights with:

git lfs pull

Pretrained weights without Git LFS

If you cannot use Git LFS, download q1net-weights.zip from the Releases page and extract it at the repository root so that classifier/save/<comp_type>/best/ contains the checkpoint files (.index and .data-*). The commands that load the model print a clear error if the weights are missing or are still unresolved Git LFS pointers.
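
An unresolved Git LFS pointer is a small text file that starts with the line `version https://git-lfs.github.com/spec/v1` instead of the binary checkpoint data. The check below is a minimal sketch of how such a file can be detected; `is_lfs_pointer` is a hypothetical helper, not the repository's own check, and the file name in the demo is a placeholder.

```python
import os
import tempfile

# First line of every Git LFS pointer file.
LFS_SIGNATURE = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` looks like an unresolved LFS pointer."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_SIGNATURE))
    return head == LFS_SIGNATURE


# Demo on a temporary file that mimics a pointer (placeholder name and content).
with tempfile.TemporaryDirectory() as tmp:
    pointer_path = os.path.join(tmp, "placeholder.index")
    with open(pointer_path, "wb") as f:
        f.write(LFS_SIGNATURE + b"\noid sha256:0000\nsize 123\n")
    print(is_lfs_pointer(pointer_path))  # True
```

If a checkpoint file is reported as a pointer, run git lfs pull or use the release archive described above.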

The supported compression types (--comp_type) are jpeg_paper and jpeg_paper_k12.

Prediction

Predict the quality level of a single image:

python3 ./predict_cls.py --in_path ../sample_image/monarch_jpeg_q20.png --comp_type jpeg_paper

The sample image is JPEG quality 20, so the output is close to 20:

predicted quality 20.01, estimated in 0.135 seconds

Evaluation

Evaluate over a directory of images. Each image is compressed at every quality level, predicted, and compared against the ground truth; the mean absolute error is reported and a confusion matrix is saved to --out_path (default out/).

# Download the validation set
wget https://data.vision.ee.ethz.ch/cvl/DIV2K/DIV2K_valid_HR.zip
unzip DIV2K_valid_HR.zip

python3 evaluate_cls.py --comp_type jpeg_paper --in_path DIV2K_valid_HR
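
The two reported metrics can be sketched as follows. The sketch is illustrative only and does not reproduce evaluate_cls.py: the function names are hypothetical, and both the 1 to 100 level range and the rounding of predictions to the nearest integer level for the confusion matrix are assumptions.

```python
def mean_absolute_error(truths, predictions):
    """Average absolute difference between true and predicted quality."""
    if not truths or len(truths) != len(predictions):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(truths, predictions)) / len(truths)


def confusion_matrix(truths, predictions, levels=100):
    """Count matrix where matrix[t - 1][p - 1] is the number of images
    with true level t that were predicted as level p."""
    matrix = [[0] * levels for _ in range(levels)]
    for t, p in zip(truths, predictions):
        # Assumption: round to the nearest level and clamp into 1..levels.
        p_level = min(max(int(round(p)), 1), levels)
        matrix[t - 1][p_level - 1] += 1
    return matrix


truths = [20, 20, 50, 90]
predictions = [20.01, 22.4, 49.6, 90.3]

print("MAE:", mean_absolute_error(truths, predictions))
cm = confusion_matrix(truths, predictions)
print("true 20 predicted 20:", cm[19][19])
print("true 20 predicted 22:", cm[19][21])
```

Counts on the diagonal are correct predictions; counts away from it show how far a prediction landed from the true level.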

Training

# Download the training set
wget https://data.vision.ee.ethz.ch/cvl/DIV2K/DIV2K_train_HR.zip
unzip DIV2K_train_HR.zip

sh batch_train_jpeg_paper.sh

During training, gen_data.py generates an HDF5 file of training data that train.py then consumes.

Convert the model to TFLite

python3 ./to_tflite.py --comp_type jpeg_paper

Applications

Q1Net can benefit a wide range of applications, including:

Image/photo editors
(Streaming) video players and photo viewers
Web browsers
Video conferencing
Instant messaging apps
And many more

For example, knowing the compression quality of a photo (such as the ID photo in a mobile driver's-license app below) lets an app decide whether to enhance it before display:

Image source: Yonhap News (watermarked); used here for illustration only.

License

This code is released for non-commercial research and evaluation purposes only. The methods implemented here are covered by U.S. Patent No. 12,462,356 B2, owned by Samsung Electronics Co., Ltd.; no patent license is granted, and commercial use requires a separate license. See LICENSE for the full terms.
