Official implementation of the BMVC 2021 paper. Paper: https://bmva-archive.org.uk/bmvc/2021/conference/papers/paper_0813.html
Q1Net predicts the quality level of a compressed image (e.g. the JPEG quality factor) directly from the image, using a block-wise, confidence-aware CNN.
- Real-time: predicts the compression quality level in milliseconds, fast enough to run on mobile devices.
- Accurate: over 99% accuracy in the paper's experiments.
- Block-wise & confidence-aware: exploits the characteristic deformations transform coding leaves on small blocks, estimates a per-patch confidence, and fuses only the reliable patches instead of processing the whole image.
- Deployable: exports to TensorFlow Lite for on-device inference.
Instead of looking at the whole image, Q1Net samples small patches around coding blocks, runs a lightweight CNN on each patch to predict a quality value together with a confidence, keeps only the high-confidence patches, and fuses them:
flowchart LR
A[Input image] --> B[Sample small patches<br/>around coding blocks]
B --> C[CNN backbone]
C --> D[Per-patch:<br/>confidence + quality]
D --> E{confidence above<br/>threshold?}
E -- yes --> F[Keep patch]
E -- no --> G[Discard patch]
F --> H[Fuse by median<br/>= predicted quality]
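The filter-and-fuse step at the end of the flowchart can be sketched in a few lines of plain Python. This is only an illustration, not the repository's code: the function name `fuse_patch_predictions`, the default threshold of 50, and the fallback used when no patch passes the threshold are all assumptions made for the example.

```python
from statistics import median


def fuse_patch_predictions(patches, threshold=50.0):
    """Fuse per-patch (confidence, quality) pairs into one quality estimate.

    Both values are assumed to lie on a 0-100 scale. The threshold default
    is a hypothetical value chosen for illustration.
    """
    patches = list(patches)
    if not patches:
        raise ValueError("at least one patch prediction is required")

    # Keep only the patches the network is confident about.
    kept = [quality for confidence, quality in patches if confidence > threshold]

    # Assumption: if every patch is rejected, fall back to using all of them.
    if not kept:
        kept = [quality for _, quality in patches]

    # The median is robust to the few outlier patches that survive the filter.
    return median(kept)


predictions = [(90.0, 20.0), (95.0, 22.0), (10.0, 80.0), (85.0, 21.0)]
print(fuse_patch_predictions(predictions))
```

The low-confidence patch that predicts 80 is discarded before fusion, so it cannot pull the estimate away from the other three.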
The per-patch backbone is a compact residual CNN operating on 16x16x3 patches:
flowchart LR
I[16x16x3 patch] --> S["CBR + Bottleneck stages<br/>channels 8 - 16 - 32 - 64 - 32 - 16"]
S --> CV[Conv 3x3, ReLU]
CV --> P[Global average pooling]
P --> O["Dense 2, sigmoid x100<br/>= confidence, quality"]
CBR is Conv to BatchNorm to ReLU; the bottleneck is a 1x1 to 3x3 to 1x1 residual block. The confidence-aware loss down-weights unreliable patches during training.
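To make the output head concrete, the sketch below applies a sigmoid scaled by 100 to two raw values, which maps each of them into the range 0 to 100. It covers only the final activation, not the network itself; the function names are hypothetical, and the ordering (confidence first, then quality) follows the diagram above rather than the repository's code.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def head_outputs(logits):
    """Map the two raw head outputs to (confidence, quality) on a 0-100 scale.

    Assumption: the first value is the confidence and the second the quality.
    """
    confidence_logit, quality_logit = logits
    return 100.0 * sigmoid(confidence_logit), 100.0 * sigmoid(quality_logit)


confidence, quality = head_outputs((2.0, -1.4))
print(f"confidence {confidence:.1f}, quality {quality:.1f}")
```

Because the sigmoid is bounded, both outputs stay within the valid quality range no matter how large the raw values become.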
Confusion matrices over 10,000 compressed images spanning all 100 quality levels (Figure 4 from the paper). A sharper diagonal means more accurate quality prediction: Q1Net (c) produces a markedly tighter diagonal than MobileNetV2 (a) and JQE (b), staying accurate across the full quality range.
- Kyuwon Kim (chammoru at gmail, q1.kim at samsung)
- Chulju Yang (ijn9429 at gmail, chulju at samsung)
@InProceedings{kim2021q1net,
title = {Quality Level Prediction of Image Compression using Block-wise Confidence-aware CNN},
author = {Kim, Kyuwon and Yang, Chulju},
booktitle = {Proceedings of the British Machine Vision Conference (BMVC)},
month = {November},
year = {2021}
}
- Python 3 (tested on 3.12)
- The pinned packages in requirements.txt (TensorFlow 2.16, installed in the setup step below)
TensorFlow 2.16 defaults to Keras 3, but this project uses the Keras 2 API.
env.sh exports TF_USE_LEGACY_KERAS=1 so the tf-keras (Keras 2) implementation
is used.
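If you import the project from your own Python script without sourcing env.sh, the same switch can be set from Python. This is a minimal sketch and not part of the repository; it relies on the variable being set before TensorFlow is first imported and on the tf-keras package being installed.

```python
import os

# Select the legacy Keras 2 implementation (tf-keras).
# This must run before the first `import tensorflow` in the process.
os.environ["TF_USE_LEGACY_KERAS"] = "1"

# import tensorflow as tf  # import TensorFlow only after the variable is set

print(os.environ["TF_USE_LEGACY_KERAS"])
```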
This project uses the DIV2K dataset.
The pretrained model weights are tracked with Git LFS, so install it before cloning:
# install Git LFS once per machine, then enable it for your user
git lfs install
git clone https://github.com/chammoru/Q1Net.git
cd Q1Net
# (recommended) create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate
# install the pinned dependencies
pip install -r requirements.txt
# go to the source directory and set up the environment
# (adds the repo root to PYTHONPATH and exports TF_USE_LEGACY_KERAS=1)
cd classifier
. ./env.sh
If you already cloned the repository without Git LFS, fetch the weights with:
git lfs pull
If you cannot use Git LFS, download q1net-weights.zip from the
Releases page and extract it at the
repository root so that classifier/save/<comp_type>/best/ contains the checkpoint
files (.index and .data-*). The commands that load the model print a clear error
if the weights are missing or are still unresolved Git LFS pointers.
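For a quick manual check, an unresolved Git LFS pointer can be recognized by its first line, since pointer files are small text files that begin with `version https://git-lfs.github.com/spec/v1`. The sketch below uses that fact to list pointer files in a checkpoint directory. The helper names are hypothetical, and this is not the check the repository itself performs.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file is still a Git LFS pointer, not real data."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_unresolved_weights(checkpoint_dir):
    """List the names of files in checkpoint_dir that are LFS pointers."""
    return sorted(
        p.name
        for p in Path(checkpoint_dir).iterdir()
        if p.is_file() and is_lfs_pointer(p)
    )
```

Calling `find_unresolved_weights` on a directory such as `classifier/save/jpeg_paper/best` should return an empty list once `git lfs pull` has fetched the real checkpoint files.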
The supported compression types (--comp_type) are jpeg_paper and jpeg_paper_k12.
Predict the quality level of a single image:
python3 ./predict_cls.py --in_path ../sample_image/monarch_jpeg_q20.png --comp_type jpeg_paper
The sample image is JPEG quality 20, so the output is close to 20:
predicted quality 20.01, estimated in 0.135 seconds
Evaluate over a directory of images. Each image is compressed at every quality
level, predicted, and compared against the ground truth; the mean absolute error
is reported and a confusion matrix is saved to --out_path (default out/).
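The two metrics named above can be sketched in plain Python as follows. This is an illustration rather than the code in evaluate_cls.py: the function names are hypothetical, and it assumes quality levels are integers from 1 to 100 and that each prediction is rounded to the nearest level before being counted.

```python
def mean_absolute_error(truth, predicted):
    """Average absolute difference between true and predicted quality."""
    if not truth or len(truth) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(truth, predicted)) / len(truth)


def confusion_matrix(truth, predicted, levels=100):
    """Count predictions per (true level, predicted level) cell.

    Assumption: levels run from 1 to `levels`; predictions are rounded and
    clamped into that range. Rows are true levels, columns predicted levels.
    """
    matrix = [[0] * levels for _ in range(levels)]
    for t, p in zip(truth, predicted):
        column = min(max(round(p), 1), levels) - 1
        matrix[t - 1][column] += 1
    return matrix


truth = [20, 50, 90]
predicted = [20.01, 48.0, 93.0]
print(f"MAE: {mean_absolute_error(truth, predicted):.2f}")
```

A perfect predictor puts every count on the diagonal of the matrix, which is why a sharper diagonal indicates better accuracy.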
# Download the validation set
wget https://data.vision.ee.ethz.ch/cvl/DIV2K/DIV2K_valid_HR.zip
unzip DIV2K_valid_HR.zip
python3 evaluate_cls.py --comp_type jpeg_paper --in_path DIV2K_valid_HR
# Download the training set
wget https://data.vision.ee.ethz.ch/cvl/DIV2K/DIV2K_train_HR.zip
unzip DIV2K_train_HR.zip
sh batch_train_jpeg_paper.sh
During training, gen_data.py generates an HDF5 file of training data that train.py then consumes.
Export the model to TensorFlow Lite:
python3 ./to_tflite.py --comp_type jpeg_paper
Q1Net can benefit a wide range of applications, including:
- Image/photo editors
- (Streaming) video players and photo viewers
- Web browsers
- Video conferencing
- Instant messaging apps
- And many more
For example, knowing the compression quality of a photo (such as the ID photo in a mobile driver's-license app below) lets an app decide whether to enhance it before display:
Image source: Yonhap News (watermarked); used here for illustration only.
This code is released for non-commercial research and evaluation purposes only. The methods implemented here are covered by U.S. Patent No. 12,462,356 B2, owned by Samsung Electronics Co., Ltd.; no patent license is granted, and commercial use requires a separate license. See LICENSE for the full terms.

