HEVC: A coding standard that we (will) use
when watching TV/video-streams
David Svoboda
Centre for Biomedical Image Analysis
Faculty of Informatics, Masaryk University, Brno, Czech Republic
May 7, 2019
Content
1 Motivation
2 Basic facts
3 Specification
4 Pros and cons
5 Recommended sources
D. Svoboda (CBIA) CBIA seminar 2 / 19
Motivation
Mobile data and TV broadcasting are competitors
Capacity of mobile data . . . increasing
Quality of TV channels . . . increasing
The bandwidth of air (as a transmission medium) is limited!
A brief history of digital data transmission in the Czech Republic
2005 . . . start of digital transmission (DVB-T)
2011 . . . end of analogue transmission
2014 . . . mobile providers start LTE
2017 . . . start of digital transmission of G2 (DVB-T2)
2021 . . . end of digital transmission of G1 (DVB-T)
Basic facts
DVB-T = Digital Video Broadcasting (Terrestrial)
DVB-T . . . based on MPEG-2 or H.264/MPEG-4 AVC
DVB-T2 . . . based on H.265/HEVC (as deployed in the Czech Republic)
HEVC = High Efficiency Video Coding
HEVC addresses two key issues:
higher video resolution at the same bitrate, or better data
compression (bitrate reduced by up to 50%) at the same level of
video quality (compared to H.264)
increased use of parallel processing architectures
Specification
Basic building blocks of HEVC standard
Color representation
Picture partitioning
Intra prediction
Inter prediction (motion vectors)
Transform coding (of residuals)
Entropy coding (of transform coefficients)
Deblocking filters (part of reconstruction)
Color representation
Use of YCbCr model rather than RGB or CMYK
Human visual system less sensitive to color than to structure and
texture ⇒ full resolution luma, lower resolution chroma
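The separation into luma and chroma can be sketched as a per-pixel conversion. This is a minimal illustration that assumes full-range 8-bit values and the BT.601 weights used by JPEG/JFIF; the function name `rgb_to_ycbcr` is hypothetical, and HEVC streams may signal other conversion matrices.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (BT.601 weights, assumed)."""
    # Luma: weighted sum that follows the eye's sensitivity to each primary.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    # Chroma: scaled blue and red differences, centred on 128.
    cb = 128 + (b - y) / 1.772
    cr = 128 + (r - y) / 1.402
    # Round and clamp to the 8-bit range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


print(rgb_to_ycbcr(128, 128, 128))  # a grey pixel carries no chroma information
```

Grey pixels map to Cb = Cr = 128, so all structure of a grey-scale picture ends up in the luma plane.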
Color representation (cont’d)
Chroma sub-sampling 4:2:0 has been accepted as the standard
format for consumer video
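What 4:2:0 means for the amount of data can be sketched as follows. Averaging each 2 × 2 block is only one possible down-sampling filter, chosen here for simplicity; the helper names `subsample_420` and `samples_per_frame` are hypothetical.

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 blocks.

    `plane` is a list of rows with even width and height.
    """
    h, w = len(plane), len(plane[0])
    return [
        [(plane[y][x] + plane[y][x + 1]
          + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4  # +2 rounds to nearest
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]


def samples_per_frame(width, height):
    """Samples stored per frame in 4:2:0: full luma plus two quarter-size chroma planes."""
    return width * height + 2 * (width // 2) * (height // 2)


print(subsample_420([[10, 20, 30, 40], [10, 20, 30, 40]]))
print(samples_per_frame(1920, 1080))
```

A 4:2:0 frame holds 1.5 samples per pixel, half of what a 4:4:4 frame (3 samples per pixel) needs.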
Picture partitioning
The image is split into coding tree units (CTUs); the encoder selects the CTU size
(16 × 16, 32 × 32, or 64 × 64).
Each CTU can be further split into smaller coding units (CUs) using a
quadtree.
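The recursive splitting of a CTU can be sketched as below. The split criterion (pixel variance above a threshold) is an assumption made for illustration only; real encoders typically decide by comparing rate-distortion costs. The function name `split_quadtree` and its parameters are hypothetical.

```python
def split_quadtree(block, x=0, y=0, min_size=8, threshold=100.0):
    """Return the leaves (x, y, size) of a quadtree over a square block (list of rows)."""
    size = len(block)
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    # Stop when the block is smooth enough or the minimum CU size is reached.
    if size <= min_size or variance <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            sub = [row[dx:dx + half] for row in block[dy:dy + half]]
            leaves += split_quadtree(sub, x + dx, y + dy, min_size, threshold)
    return leaves


# A 64x64 CTU that is flat except for a detailed top-left quadrant.
ctu = [[255 * ((x + y) % 2) if x < 32 and y < 32 else 0 for x in range(64)]
       for y in range(64)]
print(len(split_quadtree(ctu)))
```

Smooth regions stay as large CUs, while detailed regions are split down to small ones.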
Picture partitioning (cont’d)
The image is initially split into slices and/or tiles.
Each slice is decoded independently, i.e. there is no prediction (dependency)
across slice boundaries.
The main purpose of slices is resynchronization after data losses.
Tiles are regular rectangular regions.
Their main purpose is to support parallel processing.
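How a tile grid enables parallel processing can be sketched as follows. The sketch assumes uniformly spaced tiles measured in CTUs; `tile_bounds`, `process_tiles` and `ctus_in_tile` are hypothetical helpers, and the per-tile work is only a stand-in for real encoding or decoding.

```python
from concurrent.futures import ThreadPoolExecutor


def tile_bounds(size_in_ctus, num_tiles):
    """Split one picture dimension (in CTUs) into nearly equal [start, end) ranges."""
    edges = [i * size_in_ctus // num_tiles for i in range(num_tiles + 1)]
    return list(zip(edges[:-1], edges[1:]))


def process_tiles(width_ctus, height_ctus, cols, rows, work):
    """Apply `work` to every tile (x0, y0, x1, y1); tiles are handled in parallel."""
    tiles = [(x0, y0, x1, y1)
             for (y0, y1) in tile_bounds(height_ctus, rows)
             for (x0, x1) in tile_bounds(width_ctus, cols)]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(work, tiles))


def ctus_in_tile(tile):
    """Stand-in for per-tile work: count the CTUs in the tile."""
    x0, y0, x1, y1 = tile
    return (x1 - x0) * (y1 - y0)


# A 1920x1080 picture with 64x64 CTUs is 30x17 CTUs; use a 4x2 tile grid.
print(tile_bounds(30, 4))
print(process_tiles(30, 17, 4, 2, ctus_in_tile))
```

Because no tile depends on data from another tile, the order in which the workers finish does not matter.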
Intra prediction
Each CU can be predicted from neighbouring image data in the same
video frame.
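A minimal sketch of intra prediction, assuming the simplest mode (DC), where the whole block is predicted as the mean of the already reconstructed pixels above and to the left. HEVC offers many more modes (planar and angular); the helper names are hypothetical.

```python
def dc_predict(top, left):
    """Predict an NxN block as the rounded mean of its top and left neighbours."""
    n = len(top)
    dc = (sum(top) + sum(left) + n) // (2 * n)  # +n rounds to nearest
    return [[dc] * n for _ in range(n)]


def residual(block, prediction):
    """Difference between the original block and its prediction."""
    return [[b - p for b, p in zip(brow, prow)]
            for brow, prow in zip(block, prediction)]


top = [100, 102, 98, 100]    # reconstructed row above the block
left = [101, 99, 100, 100]   # reconstructed column left of the block
block = [[101, 100, 99, 100]] * 4
pred = dc_predict(top, left)
print(residual(block, pred))
```

Only the residual, which has small values when the prediction is good, is passed on to transform coding.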
[Figures: original, prediction, residual]
Inter prediction – Motion vectors
Intra prediction . . . spatial prediction
Inter prediction . . . spatial & temporal prediction
Inter prediction . . . predicts from image data in one or two reference pictures
(before or after the current picture in display order), using
motion-compensated prediction.
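Finding a motion vector can be sketched as a block-matching search. This is a simplified illustration: an exhaustive search over integer displacements using the sum of absolute differences (SAD), with a single reference picture. Real encoders use faster search strategies and sub-pixel accuracy; the function names and parameters are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between a current block and a displaced reference block."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def find_motion_vector(cur, ref, bx, by, n=4, search=2):
    """Full search within +/- `search` pixels; returns (best SAD, dx, dy)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip displacements that would read outside the reference picture.
            if 0 <= by + dy and by + dy + n <= h and 0 <= bx + dx and bx + dx + n <= w:
                cost = sad(cur, ref, bx, by, dx, dy, n)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best


# Reference picture and a current picture whose content moved by one pixel.
ref = [[x * x + 10 * y for x in range(8)] for y in range(8)]
cur = [[ref[(y - 1) % 8][(x + 1) % 8] for x in range(8)] for y in range(8)]
print(find_motion_vector(cur, ref, 2, 2))
```

The encoder then transmits the motion vector and the residual between the block and its motion-compensated prediction.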
Transform coding
The discrete cosine transform applied to residuals is followed by
quantization.
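The two steps can be sketched with a floating-point DCT-II followed by uniform quantization. This is an illustration only: HEVC itself uses integer approximations of the transform, and the single quantization step `step` is a hypothetical simplification of the real quantizer.

```python
import math


def dct_1d(v):
    """Orthonormal DCT-II of a sequence."""
    n = len(v)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n)))
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform the rows, then the columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(c) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def quantize(coeffs, step):
    """Uniform quantization: divide by the step size and round."""
    return [[round(c / step) for c in row] for row in coeffs]


residual_block = [[8, 8, 8, 8]] * 4  # a flat residual
print(quantize(dct_2d(residual_block), step=8))
```

A smooth residual concentrates its energy in very few coefficients; after quantization most coefficients are zero, which is what makes the subsequent entropy coding effective.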
Entropy coding
CABAC – context-adaptive binary arithmetic coding
The quantized DCT coefficients are passed to arithmetic coding
The context lets the coder adapt its probability estimates while encoding the final bitstream
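The benefit of adaptation can be sketched with a toy model for a single context. This is not CABAC's actual probability state machine: the estimator below simply counts the bits seen so far (an assumption made for illustration) and reports the ideal arithmetic-coding cost, the sum of -log2 p over the coded bits. The name `adaptive_cost` is hypothetical.

```python
import math


def adaptive_cost(bits):
    """Ideal code length in bits when P(bit) is estimated from the bits seen so far."""
    counts = [1, 1]  # smoothed counts of zeros and ones
    total = 0.0
    for b in bits:
        p = counts[b] / (counts[0] + counts[1])
        total += -math.log2(p)  # cost of coding this bit with the current estimate
        counts[b] += 1          # adapt the model after coding the bit
    return total


skewed = [0] * 28 + [1] * 4
print(len(skewed), adaptive_cost(skewed))
```

With a fixed probability of 0.5 the 32 bits above would cost 32 bits; the adaptive model learns that zeros dominate and needs noticeably fewer.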
A simple example of entropy coding
Let’s suppose we have a block with 8 intensity levels, each coded with a fixed 3-bit binary code
i        p(i)   3-bit code   len(i)   new code   new len(i)
l0 = 0   0.19   000          3        11         2
l1 = 1   0.25   001          3        01         2
l2 = 2   0.21   010          3        10         2
l3 = 3   0.16   011          3        001        3
l4 = 4   0.08   100          3        0001       4
l5 = 5   0.06   101          3        00001      5
l6 = 6   0.03   110          3        000001     6
l7 = 7   0.02   111          3        000000     6
Using the new code gives a lower average number of bits per pixel than the fixed 3 bits:
L_avg = 2(0.19) + 2(0.25) + 2(0.21) + 3(0.16) + 4(0.08) +
5(0.06) + 6(0.03) + 6(0.02) = 2.7 bits
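The table and the average length can be checked with a short script. The probabilities and code words are taken from the table above; the helper names are hypothetical, and the decoder relies on the code being prefix-free (no code word is a prefix of another).

```python
import math

PROBS = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]
CODE = ["11", "01", "10", "001", "0001", "00001", "000001", "000000"]


def average_length(probs, code):
    """Expected number of bits per symbol."""
    return sum(p * len(c) for p, c in zip(probs, code))


def entropy(probs):
    """Shannon entropy: the lower bound on the average code length."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def encode(levels, code):
    return "".join(code[level] for level in levels)


def decode(bits, code):
    """Read bits until they form a code word, emit its level, and start over."""
    lookup = {c: i for i, c in enumerate(code)}
    levels, word = [], ""
    for b in bits:
        word += b
        if word in lookup:
            levels.append(lookup[word])
            word = ""
    return levels


print(round(average_length(PROBS, CODE), 2), round(entropy(PROBS), 2))
print(decode(encode([0, 3, 7, 1, 6], CODE), CODE))
```

The entropy of this distribution is about 2.65 bits, so the variable-length code with 2.7 bits per pixel is already close to the theoretical limit.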
Deblocking filters
Pros and cons
+ efficiency (up to 50% lower bitrate than H.264 at the same quality)
+ speed/parallelism
− computationally intensive (CPU usage)
Recommended sources
short (but illustrative) paper:
Gary J. Sullivan, Jens-Rainer Ohm, Woo-Jin Han, and Thomas
Wiegand. 2012. Overview of the High Efficiency Video Coding (HEVC)
Standard. IEEE Transactions on Circuits and Systems for Video
Technology 22, 12 (December 2012), 1649-1668.
book with complete specification:
Vivienne Sze, Madhukar Budagavi, and Gary J. Sullivan (eds.). 2014.
High Efficiency Video Coding (HEVC): Algorithms and Architectures.
Berlin: Springer.
encoder/decoder source code:
http://x265.org/, https://www.libde265.org/
software for video analysis (Win/Linux/Mac):
https://ient.github.io/YUView/
software for video transcoding (Win/Linux/Mac):
https://handbrake.fr/
currently installed on Alfa – you can try :-)
Thank you for your attention . . .