Академический Документы
Профессиональный Документы
Культура Документы
Let us consider the general requirements imposed on most multimedia systems: COMP3600 Multimedia Systems Storage multimedia elements require much more storage space than simple text. For example, a full screen true colour image is
uncompressed CD quality stereo audio stream needs bytes . A raw digitised PAL TV signal needs
Data Compression
Wai Wong
bytes
bytes
Interaction to support fast interaction, the end-to-end delay should be small. A face-to-face application, such as video conferencing, requires the delay to be less than . Furthermore, multimedia elements have to be accessed randomly. Conclusion:
frames
bytes
Multimedia elements are very large. We need to reduce the data size using
compression.
lossless
the compression process does not reduce the amount of information. The original can be reconstructed exactly.
Entropy coding
Compression Decompression
Source coding
D
not =
lossy
the compression process reduces the amount of information. Only an approximation of the original can be reconstructed.
Hybrid coding
Compression Decompression
Run-length coding Huffman coding Arithmetic coding Prediction DPCM DM Transformation FFT DCT Vector Quantisation JPEG MPEG H.261
Entropy coding is lossless. Source coding and hybrid coding are lossy.
Coding techniques
Since the code is shorter than the data pattern, compression is achieved. The popular zip application used this method to compress les.
Run-length coding
Digital images, audio and video data stream often contain sequences of the same bytes. For example:
divided into blocks of bytes (where ). A predened table contains a set of patterns is used to code the data blocks. capable of working on almost any type of data. It builds a data dictionary of data occurring in an uncompressed data stream. Patterns of data are identied and are matched to entries in the dictionary. When a match is found the code of the entry is output. Pattern Code Begin 1 End 2 tion 3 ... ... Input function Begin x := 3; . . . Output func@3 @1@4 . . .
compressed to
Differential
coding (also know as prediction or relative coding) The most known coding of this kind is DPCM (Differential Pulse Code Modulation). This method encodes the difference between the consecutive samples instead of the sample values. For example,
PCM 215 218 210 212 208 . . . DPCM 215 3 -8 2 -4 . . . DM (Delta Modulation) is a modication of DPCM. The difference is coded with a single bit.
8. Data Compression (199901) Slide 5
By replacing the sequence with its length, a substantial reduction of data can be achieved.
A special ag that does not occur in the data can be used to indicate the length of sequence.
In practice, there are many variances of this basic compression techniques. One particular variance, known as zero suppression, assumes that only one symbol in the data stream appear very often, for example, a 0 in scanned text. A sequence of zeros is replaced by a ag and the number of occurrences.
Tabulators
A further variation is known as tabulators. A table of symbols is dened to replace sequences of different lengths of identical bytes/bits. For example, the CCITT Group 3 compression scheme denes the coding table shown. For example, 11100000111100 is coded as 1011000110111 The compression ratio can be up to to . The codes for run-length of 0 to 64 are known as terminating codes. For runlength longer than 64 bits, it is encoded using more than one code. One or more makeup codes represent multiple of 64 bits of the runlength, and this is followed by a terminating code. For example, the run-length of 132 white pixels is encoded by the following two codes:
Huffman coding
The principle of Huffman coding is to assign shorter code for symbol that has higher probability of occurring in the data stream. The length of the Huffman code is optimal. For example, a data stream has only with the following ve symbols probabilities:
makeup code for 128 white pixels 10010 terminating code for 4 white pixels 1011 White 0 1 2 3 4 5 6 7 8 9 10
Codes Black 00110101 0 000111 1 0111 2 1000 3 1011 4 1100 5 1110 6 1111 7 10011 8 10100 9 00111 10
Codes 0000110111 010 11 10 011 0011 0010 00011 000101 000100 0000100
repeat
the two steps above until there is only one entry and a binary tree containing all characters is formed. right branches of the binary tree.
64 128 192
64 128 192
assign 0 to the left branches and 1 to the the Huffman code of each character can
be read from the tree starting from the root. A B C D E 011 1 000 010 001
8. Data Compression (199901) Slide 8
A huffman code tree is created using the following procedures: ties are combined to form a binary tree.
JPEG
p(AD) =0.29 p(CE) =0.20 p(D) =0.13 p(A)=0.16
JPEG
The JPEG standard have three levels of denition as follows:
(stands for Joint Photographic Experts Group) is a joint ISO and CCITT working group for developing standards for compressing still images became an international standard in 1992
Baseline
The JPEG image compression standard JPEG can be applied to colour or grayscale images
STEP 2
p(C) =0.09 p(E)=0.11
system must reasonably decompress colour images, maintain a high compression ratio, and handle from 4bits/pixel to 16bits/pixel.
encoding aspects such as variablelength encoding, progressive encoding, and hierarchical mode of encoding.
Extended system covers the various Special lossless function ensures that
at the resolution at which the image is compressed, decompression results in no loss of any detail the was in the original image.
p(B)=0.51 1
p(CE) =0.20 0
p(CE)=0.20 p(AD)=0.29
p(AD) =0.29 1 0 1
p(E)=0.11
p(D) =0.13
p(A)=0.16
JPEG Preparation
Preparation
Processing
Entropy encoding
Pixel
Predictor
Quantisation
Run-length
may have different number of pixels in the horizontal ( ) and vertical ( ) dimension. resolution of the individual plane may be different. pixel is represented by a number of bits where . specied in the standard.
... ... .
Each
..... .. .....
The meaning of the value in these planes is not The image is divided into
COMP3600 Multimedia Systems 8. Data Compression (199901) Slide 11 COMP3600 Multimedia Systems
..... .. ..... ..... ... ... ........ .. .....
blocks.
JPEG Quantisation
coefcient can be dropped to reduce the data size
The DCT coefcients of each blocks are calculated using the formula below.
The
Each element
other DCT coefcients are known as AC coefcients which represent the frequency components of the data block
in the table, known as quantum is used in the following formula to calculate the quantised coefcients :
On the left is quantum matrix for quality level 1, and on the right the result of quantising the example from previous page.
172 21 -9 -10 -8 4 4 0 -18 15 -34 24 -8 -4 6 -5 -2 -3 -2 -4 -3 -4 -8 -4 -8 -8 6 4 5 6 5 3 23 -10 -5 -4 -3 -4 6 2 -9 -14 11 14 4 3 4 2 3 4 4 2 3 1 1 4 19 7 -1 1 6 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
The A
DC-coefcients are treated separately from the AC-coefcients DC-coefcient is encoded as the difference between the current coefcient and the previous one
MPEG
MPEG
2. Processing 3. Quantisation 4. Entropy Encoding In the preparation stage, unlike JPEG, MPEG denes the format of the images.
(stands for Moving Picture Experts Group) is also a joint ISO and CCITT working group for developing standards for compressing still images became an international standard in 1993
The MPEG video compression standard MPEG uses technology dened in other
standards, such as JPEG and H.261
Each
DC
AC
01
AC 07
image consists of three components YUV many samplesin the horizontal and vertical axes as the other two components (known as colour sub-sampling
It
Block
The luminance component has twice as The resolution of the luminance component should not exceed eight bits pixels
denes a 1.2Mbits/sec
basic
data
rate
of
Block
It
AC 70 AC 77
is suitable for symmetric as well asymmetric compression It follows the reference scheme that consists of four stages of processing: 1. Preparation
Group of Pictures
The I-, P- and B-frames are often arranged into groups, known as Group of pictures (GOP). A typical GOP consists of twelve frames in the following sequence: IBBPBBPBBPBB When the video is transfered and decoded, the frames are in the following order: I P B B P B B P B B B B This is known as the transmission sequence. When random access function is required, the I-frames will be used as index, and the accessing to the video will be in a resolution of twelve frames.
B-frame
(Bi-directionally predictivecoded frame) It requires information of the previous and following I- and Pframes for encoding and decoding.
I-frame
(Intra-frame) it is a self contained frame, and it is coded without reference to any other frames.
D-frame
(DC-coded frame) Only the lowest frequency component of image is encoded. It is used in fast forward or fast rewind.
I-frame encoding
I-frames are compressed using JPEG compression technique, however, the compression/decompression processing must be carried out in real-time, i.e., on average for 30 frame/sec video stream.
Summary
MPEG-2 It
MPEG-2
pixels/line line/frame frames/sec 352 288 30 720 576 30 1920 1152 60
is a newer video encoding standard which builds on MPEG-1 supports higher video quality and higher data rate (up to 80 Mbits/sec)
Compression methods lossless vs. lossy Entropy coding run-length encoding, Huffman encoding Source coding prediction (DPCM, DM), transformation (DCT) hybrid coding JPEG, MPEG
Original frames 1 2 3 4 5 6 7 8 9
MPEG Compression