
Introduction to Image and Video Coding Algorithms
The need for image compression

512 x 512 pixel color image:
512 x 512 x 24 bits = 786 Kbytes

Videoconference QCIF (quarter common intermediate format):
(176 x 144 + 88 x 72 + 88 x 72) x 8 x 25 = 7.6 Mbits/s

Digital television:
(720 x 576 + 360 x 288 + 360 x 288) x 8 x 25 = 124 Mbits/s

High definition television (HDTV):
(1440 x 1152 + 720 x 576 + 720 x 576) x 8 x 25 = 497 Mbits/s

Multispectral images (satellite):
(6000 x 6000) x 8 x 6 = 216 Mbytes
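The arithmetic behind these numbers is simply samples x bits per sample x frames per second. A minimal sketch, assuming 8 bits per sample, 25 frames/s, and two half-resolution chroma planes (4:2:0) for the video formats:

```python
# Raw-rate arithmetic for the formats listed above (assumes 8 bits per sample,
# 25 frames/s, and 4:2:0 chroma subsampling, i.e. two half-resolution chroma planes).

def raw_bits_per_frame(luma_w, luma_h):
    """Luma plane plus two chroma planes at half resolution in each dimension."""
    return (luma_w * luma_h + 2 * (luma_w // 2) * (luma_h // 2)) * 8

for name, (w, h) in {"QCIF": (176, 144),
                     "Digital TV": (720, 576),
                     "HDTV": (1440, 1152)}.items():
    print(f"{name}: {raw_bits_per_frame(w, h) * 25 / 1e6:.1f} Mbit/s")

print("512 x 512 color image:", 512 * 512 * 24 / 8 / 1e3, "Kbytes")       # 786.4
print("Multispectral image:  ", 6000 * 6000 * 6 * 8 / 8 / 1e6, "Mbytes")  # 216.0
```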
Image coding

Objective: to find a way to represent the original image, with little or no distortion, using the minimum possible number of bits.

[Diagram: original image -> coder -> bit stream; the coder may perform lossless image coding or lossy image coding]
Lossless and lossy image coding

Lossless image coding:
The decoded image is pixel-by-pixel identical to the original.

Lossy image coding:
The decoded image is NOT pixel-by-pixel identical to the original. Depending on the bit rate, the distortion may be visually indistinguishable or visually distinguishable.

[Figure: original image passed through the coder at decreasing bit rates, giving first visually indistinguishable and then visually distinguishable reconstructions]
Image Data Properties

Entropy (information content):

H_e = -\sum_{k=0}^{G-1} P(k) \log_2 [P(k)]

where G = # of gray values and P(k) = probability of gray value k.

Information redundancy:

r = b - H_e

where b = # of bits that represent the image quantization levels (e.g. b = 8 for an image with 256 distinct gray values).

The estimated entropy:

\tilde{H}_e = -\sum_{k=0}^{2^b - 1} \tilde{P}(k) \log_2 [\tilde{P}(k)],  with  \tilde{P}(k) = h(k) / (MN)

where h(k) = # of occurrences of gray value k in an M x N image.

Compression ratio:

K = b / \tilde{H}_e

(Typically b = 8; for a satellite image \tilde{H}_e is about 4, so K is about 2.)
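A minimal sketch of these quantities for an 8-bit grayscale image held in a NumPy array (the random test image is only a stand-in; a real image has a lower entropy and therefore a higher K):

```python
# Estimated entropy, information redundancy, and compression ratio from the
# gray-value histogram of an M x N image (assumes NumPy is available).
import numpy as np

def image_stats(img, b=8):
    hist = np.bincount(img.ravel(), minlength=2 ** b)   # h(k)
    p = hist / img.size                                  # P~(k) = h(k) / (M*N)
    p = p[p > 0]                                         # drop empty bins (0 * log 0 = 0)
    H = -np.sum(p * np.log2(p))                          # estimated entropy H~_e
    return H, b - H, b / H                               # H~_e, redundancy r, ratio K

img = np.random.randint(0, 256, (512, 512), dtype=np.uint8)   # stand-in test image
H, r, K = image_stats(img)
print(f"H~_e = {H:.2f} bits/pixel, r = {r:.2f}, K = {K:.2f}")
```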
Lossless compression

Most popular methods:

Huffman encoding: represents data by codes of variable length; frequent data is represented by shorter codes. (After the lossy transform step, the JPEG standard uses this method to encode the data.)

Lempel-Ziv-Welch (LZW) encoding: represents data by pointers into a dictionary of symbols. (Traditionally used by the GIF standard; a small encoder sketch follows.)
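A minimal LZW encoder sketch for a byte string, to make the "pointers into a dictionary" idea concrete (illustrative only; the GIF variant adds clear/end codes and variable-width output codes, which are omitted here):

```python
# LZW encoding of a byte string: emit dictionary indices of the longest matches
# while growing the dictionary with each new phrase seen.
def lzw_encode(data):
    dictionary = {bytes([i]): i for i in range(256)}   # initial single-byte symbols
    w = b""
    codes = []
    for byte in data:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc                                     # extend the current match
        else:
            codes.append(dictionary[w])                # emit pointer to longest match
            dictionary[wc] = len(dictionary)           # add the new phrase
            w = bytes([byte])
    if w:
        codes.append(dictionary[w])
    return codes

print(lzw_encode(b"TOBEORNOTTOBEORTOBEORNOT"))
```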
Transform-based Image Coding

Input image -> Linear transform -> Quantization -> Entropy coding -> Binary bit stream

Modified from jpegmpeg.ppt by Yu Hen Hu (UW)
Linear Transform

If the signal is formatted as a vector, a linear transform can be formulated as a matrix-vector product that transforms the signal into a different domain.

Examples:
K-L expansion
Discrete Fourier transform
Discrete cosine transform
Discrete wavelet transform

Energy compaction property: the transformed signal vector has a few large coefficients and many nearly-zero small coefficients. These few large coefficients can be encoded efficiently with few bits while retaining the majority of the energy of the original signal (a small demonstration follows).
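A minimal sketch of energy compaction, assuming SciPy is available: take the DCT of a smooth signal and measure how much of its energy sits in the few largest coefficients.

```python
# Energy compaction demo: a smooth signal concentrates almost all of its DCT
# energy in a handful of coefficients.
import numpy as np
from scipy.fft import dct

n = 64
x = np.cos(2 * np.pi * np.arange(n) / n) + 0.1 * np.random.randn(n)   # smooth signal + small noise
X = dct(x, norm='ortho')                                              # orthonormal DCT-II

energy = np.cumsum(np.sort(X ** 2)[::-1]) / np.sum(X ** 2)            # cumulative energy, largest first
print(f"energy in the 8 largest of {n} coefficients: {energy[7]:.1%}")
```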
Block-based Image Coding

An image is a 2D signal of pixel intensities (including colors).

A block-based image coding scheme partitions the entire image into 8 by 8 or 16 by 16 (or other size) blocks; the coding algorithm is then applied to each block independently. Blocks may be overlapping or non-overlapping (a partitioning sketch follows).

Advantage: individual blocks can be processed in parallel, and on hand-held devices only one block needs to be loaded into main memory at a time.
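A minimal sketch of non-overlapping 8x8 partitioning with NumPy (assumes the image dimensions are multiples of the block size; JPEG pads the boundary otherwise):

```python
# Partition a grayscale image into non-overlapping size x size blocks.
import numpy as np

def to_blocks(img, size=8):
    h, w = img.shape
    return (img.reshape(h // size, size, w // size, size)
               .swapaxes(1, 2)                 # -> (block rows, block cols, size, size)
               .reshape(-1, size, size))       # -> one 8x8 block per entry

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
print(to_blocks(img).shape)   # (64, 8, 8): 64 independent blocks
```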
JPEG to JPEG2000

The Joint Photographic Experts Group (JPEG) is an ISO standards committee whose mission is the coding and compression of still images.

JPEG coding standard (1988): DCT (discrete cosine transform) based transform coding to compress bit-map images.

The JPEG2000 effort started in 1996 to use new methods such as fractals or wavelets. The target delivery date was the year 2000, hence the name.
A Tour of the JPEG Coding Standard

Key components:
- Transform: 8x8 DCT, boundary padding
- Quantization: uniform quantization
- Coding: DC/AC coefficients, zigzag scan, run-length/Huffman coding
JPEG Baseline Coder
[Example: an 8x8 block of pixel intensities (values roughly between 94 and 194) used as the running example in the steps below]
Tour Example
Step 1: Transform
DC level shifting
2D DCT
[Example: 128 is subtracted from every pixel of the 8x8 block (DC level shifting), and the shifted block is transformed with the 2D DCT; the result is an 8x8 array of DCT coefficients whose top-left (DC) entry is 313, with the remaining (AC) coefficients much smaller]
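A minimal sketch of this step, assuming SciPy is available (the random block is only a stand-in for the example block above):

```python
# JPEG step 1 on one 8x8 block: DC level shifting followed by the 2D DCT.
# For an 8x8 block the orthonormal 2D DCT matches the 1/4 * C(u) * C(v) normalization.
import numpy as np
from scipy.fft import dctn

block = np.random.randint(0, 256, (8, 8)).astype(np.float64)   # stand-in 8x8 pixel block
shifted = block - 128                                          # DC level shifting
coeffs = dctn(shifted, norm='ortho')                           # 8x8 2D DCT
print(coeffs[0, 0])                                            # DC coefficient (top-left)
```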
Step 2: Quantization

[Example: each DCT coefficient is divided by the corresponding entry of the 8x8 quantization table (Q-table) and rounded to the nearest integer; after quantization most coefficients of the block are zero, and only the DC value (20) and a few low-frequency AC values remain nonzero]

Why do the Q-table entries increase from the top-left to the bottom-right of the table?
Step 3: Entropy Coding

Zigzag Scan

[Example: the quantized 8x8 block is scanned in zigzag order into the 1D sequence]

(20, 5, -3, -1, -2, -3, 1, 1, -1, -1, 0, 0, 1, 2, 3, -2, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, EOB)

EOB (End Of Block): all following coefficients are zero.
Discrete Cosine Transform

8x8 two-dimensional separable DCT:

F(u,v) = \frac{1}{4} C(u) C(v) \sum_{m=0}^{7} \sum_{n=0}^{7} f(m,n) \cos\frac{(2m+1)u\pi}{16} \cos\frac{(2n+1)v\pi}{16},  0 \le u, v \le 7,

with C(0) = 1/\sqrt{2} and C(w) = 1 for w > 0; in particular F(0,0) = \frac{1}{8} \sum_{m=0}^{7} \sum_{n=0}^{7} f(m,n).

The DCT is chosen because it leads to superior energy compaction for natural images.

F(0,0), the DC coefficient, ranges over (-128 x 64/4, 127 x 16) and needs 12 bits to represent (including the sign bit). 12 bits are more than enough for the remaining AC coefficients (u > 0 or v > 0).
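The formula can be coded directly as a double loop; a minimal, unoptimized sketch (real coders use fast factorizations):

```python
# The 8x8 DCT computed directly from the formula above (illustrative only).
import numpy as np

def dct_8x8(f):
    C = lambda w: 1 / np.sqrt(2) if w == 0 else 1.0
    m = np.arange(8)
    F = np.zeros((8, 8))
    for u in range(8):
        for v in range(8):
            cu = np.cos((2 * m + 1) * u * np.pi / 16)      # cos((2m+1)u*pi/16)
            cv = np.cos((2 * m + 1) * v * np.pi / 16)      # cos((2n+1)v*pi/16)
            F[u, v] = 0.25 * C(u) * C(v) * (cu[:, None] * cv[None, :] * f).sum()
    return F

block = np.full((8, 8), 39.0)      # a flat level-shifted block with value 39
print(dct_8x8(block)[0, 0])        # DC = 64 * 39 / 8 = 312; all AC terms are ~0
```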
Inverse DCT (IDCT)

8x8 two-dimensional separable IDCT:

f(m,n) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u) C(v) F(u,v) \cos\frac{(2m+1)u\pi}{16} \cos\frac{(2n+1)v\pi}{16},  0 \le m, n \le 7,

with C(0) = 1/\sqrt{2} and C(w) = 1 for w > 0.

The IDCT can be computed using the same routine as the DCT.
DCT Basis Functions

[Figure: the 64 basis functions of the 8x8 DCT, with horizontal frequency increasing from left to right and vertical frequency increasing from top to bottom]
Quantization of DCT Coefficients

Typical quantization matrices are provided by the standard (they may be altered depending on the data).

Quantized term = round( Original term / Quantization factor )
Restored term  = Quantized term x Quantization factor
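A minimal sketch of these two formulas applied to an 8x8 block of DCT coefficients; the Q-table shown is the example luminance table from Annex K of the JPEG specification, which encoders may scale or replace.

```python
# Uniform quantization and dequantization of one 8x8 block of DCT coefficients.
import numpy as np

Q = np.array([[16, 11, 10, 16,  24,  40,  51,  61],
              [12, 12, 14, 19,  26,  58,  60,  55],
              [14, 13, 16, 24,  40,  57,  69,  56],
              [14, 17, 22, 29,  51,  87,  80,  62],
              [18, 22, 37, 56,  68, 109, 103,  77],
              [24, 35, 55, 64,  81, 104, 113,  92],
              [49, 64, 78, 87, 103, 121, 120, 101],
              [72, 92, 95, 98, 112, 100, 103,  99]])

def quantize(coeffs, Q=Q):
    return np.round(coeffs / Q).astype(int)   # quantized term = round(original / factor)

def dequantize(q_coeffs, Q=Q):
    return q_coeffs * Q                       # restored term = quantized * factor
```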
DPCM of DC coefficients

DC coding: the DC coefficients of all the 8 by 8 blocks of the image are combined into one sequence of DC coefficients.

Next, DPCM is applied:

DiffDC(block_i) = DC(block_i) - DC(block_{i-1})

The DiffDC values are then encoded with Huffman entropy coding.

Example:
Original:   1216 1232 1224 1248 1248 1208
After DPCM: 1216  +16   -8  +24    0  -40
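A minimal sketch of the DC DPCM step (the predictor starts at 0, so the first value passes through unchanged, matching the example):

```python
# DPCM of the block DC values: transmit differences between consecutive DCs.
def dpcm(dc_values):
    prev = 0
    diffs = []
    for dc in dc_values:
        diffs.append(dc - prev)   # DiffDC(block_i) = DC(block_i) - DC(block_{i-1})
        prev = dc
    return diffs

print(dpcm([1216, 1232, 1224, 1248, 1248, 1208]))   # [1216, 16, -8, 24, 0, -40]
```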
Huffman Entropy Coding

Entropy coding:
Task: assign a variable-length binary code to a finite set of symbols (an alphabet).
Goal: minimize the average length (number of bits) per symbol.
Approach: shorter codes for symbols that occur more frequently, longer codes for infrequent ones.
Optimal solution: the average code length approaches the entropy of the source.

Huffman coding:
Code words are derived from a (possibly unbalanced) binary tree; a small construction sketch follows. Arithmetic coding is another entropy coding method.
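A minimal sketch of Huffman code construction from symbol frequencies (illustrative; baseline JPEG ships predefined tables rather than building one per image, and the frequencies below are made up for the demonstration):

```python
# Build a Huffman code by repeatedly merging the two least frequent subtrees.
import heapq
from itertools import count

def huffman_codes(freqs):
    tiebreak = count()                      # avoids comparing subtrees of equal weight
    heap = [(w, next(tiebreak), [(sym, "")]) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)     # two least frequent subtrees
        w2, _, t2 = heapq.heappop(heap)
        merged = [(s, "0" + c) for s, c in t1] + [(s, "1" + c) for s, c in t2]
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return dict(heap[0][2])

print(huffman_codes({"A": 45, "B": 25, "C": 10, "D": 10, "E": 5, "F": 5}))
```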
Huffman Encoding of DC Coefficients

Encoding and decoding of a Huffman code is done via a look-up table.

In JPEG, DC coefficients (after DPCM) are first grouped according to their magnitudes. Each category is treated as a symbol, and a Huffman table is given for the categories. For example, -7 to -4 and 4 to 7 are listed as category 3, which has the code "00".

If the number is positive, the binary representation of the number is appended directly to the Huffman code of the category number. For example, 6 is encoded as 00 110. If the number is negative, the appended code is the 1's complement of that number. For example, -5 is encoded as 00 010.

Question: Given such a table, how would you devise dedicated hardware to implement the encoding procedure?
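A minimal sketch of this rule, using only the two base codes quoted in the examples on this and the next slide (category 3 -> "00", category 4 -> "101"); the full base-code table is not reproduced here.

```python
# DC codeword = Huffman base code of the category + appended magnitude bits
# (binary for positive values, 1's complement for negative values).
BASE_CODE = {3: "00", 4: "101"}                  # only the categories used in the examples

def dc_codeword(diff):
    category = abs(diff).bit_length()            # e.g. 4..7 -> 3, 8..15 -> 4
    bits = format(abs(diff), f"0{category}b")    # magnitude, 'category' bits wide
    if diff < 0:                                 # negative: append the 1's complement instead
        bits = "".join("1" if b == "0" else "0" for b in bits)
    return BASE_CODE[category] + bits

print(dc_codeword(6))    # '00'  + '110'  -> 00110
print(dc_codeword(-5))   # '00'  + '010'  -> 00010
print(dc_codeword(-9))   # '101' + '0110' -> 1010110
```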
JPEG Huffman Table: Categories

[Table: DC difference categories and their magnitude ranges, e.g. category 3 covers magnitudes 4 to 7 and category 4 covers magnitudes 8 to 15]
JPEG DC Entropy Coding

Example:
-9: category 4, hence base code = 101
1's complement of (-9) = 1C(1001) = 0110
Code word = 101 + 0110 = 1010110

Note that category 3 occurs most frequently and hence has the shortest base code word.
AC Coefficients

AC coefficients are first weighted with a quantization matrix,

C(i,j) / q(i,j) = C_q(i,j),

and then quantized (rounded).

They are then scanned in a zig-zag order into a 1D sequence that is subjected to AC Huffman encoding.

Question: Given an 8 by 8 array, how do you convert it into a vector according to the zig-zag scan order? What is the algorithm? (A sketch follows the scan-order table below.)
Zig-zag scan order (scan index of each position in the 8x8 block):

 1  2  6  7 15 16 28 29
 3  5  8 14 17 27 30 43
 4  9 13 18 26 31 42 44
10 12 19 25 32 41 45 54
11 20 24 33 40 46 53 55
21 23 34 39 47 52 56 61
22 35 38 48 51 57 60 62
36 37 49 50 58 59 63 64
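A minimal sketch of one possible answer: order the 64 (row, column) positions by anti-diagonal, alternating the traversal direction within each diagonal; numbering the positions in that order reproduces the table above.

```python
# Zig-zag scan by sorting positions on (anti-diagonal, direction within diagonal).
import numpy as np

def zigzag_indices(n=8):
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],                     # which anti-diagonal
                                  -rc[1] if (rc[0] + rc[1]) % 2 else rc[1]))

def zigzag_scan(block):
    return [block[r, c] for r, c in zigzag_indices(block.shape[0])]

# Numbering the positions 1..64 in scan order reproduces the table above.
table = np.zeros((8, 8), dtype=int)
for i, (r, c) in enumerate(zigzag_indices(), start=1):
    table[r, c] = i
print(table)
```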
AC Coefficients Huffman Encoding

The symbol used to encode an AC coefficient combines the number of significant bits of the coefficient (its category) with the run of 0s preceding that nonzero coefficient. For example,

5  0  2  0  0  -1   is encoded as:   100101 11100110 110110

according to the table below:

Number     Run/Category   Base code   Length   Final code
5          0/3            100         6        100 101
0, 2       1/2            111001      8        111001 10
0, 0, -1   2/1            11011       6        11011 0
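A minimal sketch of the (run, category) symbol formation plus appended bits for a zig-zag sequence of quantized AC coefficients (the ZRL code for runs of 16 zeros and the EOB symbol are omitted here):

```python
# Walk the AC sequence and emit ((run of zeros, category), appended bits) pairs.
def ac_symbols(ac):
    symbols, run = [], 0
    for c in ac:
        if c == 0:
            run += 1
            continue
        category = abs(c).bit_length()
        bits = format(abs(c), f"0{category}b")
        if c < 0:                                   # appended bits: 1's complement if negative
            bits = "".join("1" if b == "0" else "0" for b in bits)
        symbols.append(((run, category), bits))     # (run/category) indexes the AC Huffman table
        run = 0
    return symbols

print(ac_symbols([5, 0, 2, 0, 0, -1]))   # [((0, 3), '101'), ((1, 2), '10'), ((2, 1), '0')]
```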
Huffman Decoding

Decoding is a look-up table procedure. Challenge: how can decoding be performed fast?

Example: a Huffman table for six symbols is given below. The decoding process can be modeled as a finite state machine with the state diagram sketched after the table; it decodes one bit of the input bit stream per clock cycle.
Symbol Codeword
A 0
B 10
C 1100
D 1101
E 1110
F 1111
[State diagram: start in state a; in a, bit 0 emits A, bit 1 goes to b; in b, bit 0 emits B, bit 1 goes to c; in c, bit 0 goes to d, bit 1 goes to e; in d, bit 0 emits C and bit 1 emits D; in e, bit 0 emits E and bit 1 emits F; after emitting a symbol the machine returns to state a]
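A minimal Python sketch of that state machine (a hardware version would be a small lookup ROM clocked once per input bit):

```python
# Bit-per-cycle FSM decoder for the six-symbol Huffman table above.
# Each state maps an input bit to (next state, emitted symbol or None).
FSM = {
    "a": {"0": ("a", "A"), "1": ("b", None)},
    "b": {"0": ("a", "B"), "1": ("c", None)},
    "c": {"0": ("d", None), "1": ("e", None)},
    "d": {"0": ("a", "C"), "1": ("a", "D")},
    "e": {"0": ("a", "E"), "1": ("a", "F")},
}

def decode(bits):
    state, out = "a", []
    for bit in bits:                      # one bit per clock cycle
        state, symbol = FSM[state][bit]
        if symbol is not None:
            out.append(symbol)
    return "".join(out)

print(decode("0101100110111101111"))      # 'ABCDEF' (codewords 0,10,1100,1101,1110,1111 concatenated)
```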
