
INSTITUTE OF SCIENCE &

TECHNOLOGY

Dissertation Approval Sheet


THIS IS TO CERTIFY THAT THE DISSERTATION TITLED

IMAGE COMPRESSION TECHNIQUES


By
Raju Kumar
Rabbul Hussain
Md. Minhaj Alam
is approved for the degree of Bachelor of Technology (B.Tech)

In
COMPUTER SCIENCE AND ENGINEERING (C.S.E)

Ms. Pubali Das


Project Guide Faculty

CERTIFICATE
This is to certify that the project entitled Image Compression
Techniques, submitted by Raju Kumar (Roll No: 18200111025),
Rabbul Hussain (Roll No: 18200111024) and Md. Minhaj Alam
(Roll No: 18200111022) for the partial fulfillment of the
degree of Bachelor of Technology in Computer Science under
West Bengal University of Technology, is based upon their
group work under the supervision of

Ms. Pubali Das

Department of Computer Science and Information Technology,
Institute of Science & Technology, West Midnapore, West Bengal.
Neither this project nor any part of it has been submitted for
any degree, diploma or any other academic award anywhere before.

Signature of Project
Coordinator

Signature of Project Guide

Ms. Pubali Das

Signature of HOD
Mr. Gouranga Mondal
HOD, CSE&IT, IST

ACKNOWLEDGEMENT
We take this opportunity to express our profound gratitude and
deep regards to our guide Mr. Gouranga Mondal, HOD (Head of
the Department), Department of CSE & IT of the Institute of
Science & Technology, for his exemplary guidance, monitoring
and constant encouragement throughout the course of this
thesis. The blessing, help and guidance given by him from time
to time shall carry us a long way in the journey of life on
which we are about to embark. We would also like to thank
our project coordinator and our dear friends for helping us in
this project. Without their help we would not have been able
to complete it. Thanks once again.

Raju Kumar
Roll no-18200111025
Md.Minhaj Alam
Roll no-18200111022

Rabbul Hussain
Roll no-18200111024

DECLARATION
This is to certify that the project work entitled Image
Compression Techniques was performed by us in partial
fulfillment of the Bachelor's degree in Computer Science from
the Institute of Science & Technology, and that it comprises
our original work.

Raju kumar
Roll no-18200111025

Md.Minhaj Alam
Roll no-18200111022

Rabbul Hussain
Roll no-18200111024

ABSTRACT
Digital image compression technology is of special interest for
the fast transmission and real-time processing of digital image
information on the Internet. Although still-image compression
techniques have been developed over a long time, and several
approaches already improve the compression ratio and accelerate
computation, there is still much room to improve the efficiency
of compression. In this tutorial, several important image
compression algorithms in current use are examined, including
the DCT and the tools derived from it such as JPEG and JPEG
2000, fractal image compression, and wavelet transformation.
These approaches help make image processing smoother and faster
in different ways.
In the following tutorial, I would like to discuss the
background of image compression first, including when image
compression is needed, the categories of techniques, and their
properties. Secondly, I briefly introduce some common image
compression methods in use today, such as JPEG, JPEG 2000,
wavelet-based and fractal-based techniques, and several other
techniques including neural networks. When they can be applied,
how the algorithms are implemented, their advantages and
disadvantages, how they differ, and their development prospects
will be described as well.

CONTENT

Abstract

I. Introduction
1. Introduction to Image Compression
   1.1 Image
       Pixel
       RGB
       Grayscale
       YUV
   1.2 Image Compression
       Techniques
       Advantages of Image Compression Techniques
       Categories
2. Lossy Compression Techniques
   2.1 Introduction
   2.2 Techniques
       Transform Coding
       Vector Quantization
       Fractal Image Compression
       The Compression Fractal Can Achieve
       Encoding Images
3. Lossless Compression Techniques
   3.1 Introduction
   3.2 Techniques
       Run Length Encoding
       Huffman Encoding
       Area Coding

II. Commonly Used Techniques
4. JPEG
   4.1 JPEG
       JPEG Encoder
       The 2D 8x8 DCT
       Quantization
       Differential Coding of DC Coefficients

Summary

1. Introduction to Image Compression


Digital multimedia is popular nowadays because of its highly
perceptual effects and the advanced development of its
corresponding technology. However, it often requires a large
amount of data to store multimedia content due to the complex
information it may contain. Besides, the required resolution is
much higher than before, so that the data size of an image is
surprisingly large.
In other words, a still image is a sensory signal that contains a
significant amount of redundant information in its canonical form.
Image data compression is the technique of reducing the
redundancies in image data while maintaining a given quantity of
information. Therefore, how to improve image compression becomes
an important question. Great progress has been made in applying
digital signal processing and wavelet transform techniques in
this area.
There are two different groups of techniques, lossy compression
and lossless compression, depending on whether the information can
be recovered after compression. I would like to briefly introduce
the different methods included in these two groups. As for JPEG
and JPEG 2000, they are very popular compression tools, which will
be described in detail in the later chapters.

1.1 Image
An image is essentially a 2-D signal processed by the human visual
system. The signals representing images are usually in analog
form. However, for image processing, storage and transmission,
they are converted from analog to digital form. A digital image is
basically a 2-D array of pixels.
Images form a significant part of data, particularly in remote
sensing, biomedical and video-conferencing applications. As the
use of and dependence on information and computers continues to
grow, so does our need for efficient ways of storing and
transmitting large amounts of data.

Pixel
In a digital image, a pixel is a single point in a raster image.
It is the smallest unit of a picture that can be controlled, and
is the smallest addressable screen element, as shown in Fig. Each
pixel has its own address, which corresponds to its coordinates.
Pixels are usually arranged in a 2-D grid and are often
represented with dots or squares.
Each pixel is a sample of an original image. More samples typically
provide more accurate representations of the original. The intensity of
each pixel is variable. In color image systems, a color is typically
represented by three or four component intensities such as red, green,
and blue.

A pixel is the smallest element of an image

RGB
When the eye perceives an image on a computer monitor, it is
actually perceiving a large collection of finite color elements,
or pixels [1]. Each of these pixels is itself composed of three
dots of light: a green dot, a blue dot, and a red dot. The color
the eye perceives at each pixel is a result of varying intensities
of green, red, and blue light emanating from that location. A
color image can therefore be represented as three matrices of
values, each corresponding to the brightness of a particular color
in each pixel, and a full color image can be reconstructed by
superimposing these three RGB matrices.

A color image is made of 3 matrices

Grayscale
If an image is measured by an intensity matrix, with the relative
intensity represented as a shade between black and white, it
appears as a grayscale image.

A grayscale image

The intensity of a pixel is expressed within a given range between
a minimum and a maximum. This range is represented from 0 to 1
with incremental steps according to the bit depth of the image:
the greater the number of steps, the larger the bit depth. If each
intensity value is represented as an 8-bit number, then there are
256 variations. If the intensity values are represented as 16-bit
numbers, there are 65,536 variations between absolute black and
pure white. Fig. 4 demonstrates a black-to-white gradient in
4 bits of intensity.

4-bit black to white gradient

YUV
In the case of a color RGB picture, a point-wise transform is made
to the YUV (luminance, blue chrominance, red chrominance) color
space. This space is in some sense more efficient than the RGB
space and allows better quantization. The transform is given by

    Y =  0.299 R + 0.587 G + 0.114 B
    U = -0.1687 R - 0.3313 G + 0.5 B + 0.5            (1)
    V =  0.5 R - 0.4187 G - 0.0813 B + 0.5

and the inverse transform is

    R = Y + 1.402 (V - 0.5)
    G = Y - 0.34414 (U - 0.5) - 0.71414 (V - 0.5)     (2)
    B = Y + 1.772 (U - 0.5)
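The forward transform (1) and the inverse transform (2) can be sketched directly in code. This is a minimal sketch assuming channel values normalized to [0, 1]; the function names are illustrative:

```python
# Forward RGB -> YUV transform of equation (1);
# channel values are assumed normalized to [0, 1].
def rgb_to_yuv(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.1687 * r - 0.3313 * g + 0.5 * b + 0.5
    v = 0.5 * r - 0.4187 * g - 0.0813 * b + 0.5
    return y, u, v

# Inverse YUV -> RGB transform of equation (2).
def yuv_to_rgb(y, u, v):
    r = y + 1.402 * (v - 0.5)
    g = y - 0.34414 * (u - 0.5) - 0.71414 * (v - 0.5)
    b = y + 1.772 * (u - 0.5)
    return r, g, b
```

Because the coefficients are rounded, a round trip through the two functions recovers the original RGB values only approximately; note also that a gray pixel (R = G = B) maps to U = V = 0.5, i.e. zero chrominance.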

1.2 Image Compression Techniques


Image compression is an application of data compression that
reduces the quantity of image data. The block diagram of an image
coding system is shown in Fig.

[Figure: Camera captures the object in R-G-B coordinates; the
image is transformed to Y-Cb-Cr coordinates, the chrominance is
downsampled, and the encoder (guided by a rate-distortion
comparison) writes the result to storage (HDD); the decoder then
upsamples the chrominance and transforms back to R-G-B
coordinates for display on the monitor.]

The block diagram of the general image storage system.

The camera captures the reflected light from the surface of the
object, and the received light will be converted into three primary
color components R, G and B. These three primary color
components are processed by coding algorithms afterward.
Image compression addresses the problem of reducing the amount
of data required to represent a digital image. It is a process
intended to yield a compact representation of an image, thereby
reducing the image storage/transmission requirements.
Compression is achieved by the removal of one or more of the
following three basic data redundancies:
1. Coding redundancy
2. Inter-pixel redundancy
3. Perceptual redundancy

Coding redundancy occurs when the codes assigned to a set of
events, such as the pixel values of an image, have not been
selected to take full advantage of the probabilities of the
events.
Inter-pixel redundancy usually results from correlations between
the pixels. Due to the high correlation between the pixels, any
given pixel can be predicted from its neighboring pixels.
Perceptual redundancy is due to data that is ignored by the human
visual system. In other words, all the neighboring pixels in the
smooth region of a natural image have a high degree of similarity,
and this insignificant variation in the values of the neighboring

Techniques
Image compression techniques reduce the number of bits required
to represent an image by taking advantage of these redundancies.
An inverse process called decoding is applied to the compressed
data to get the reconstructed image. The objective of compression
is to reduce the number of bits as much as possible, while keeping
the resolution and the quality of the reconstructed image as close
to the original image as possible.
Image compression systems are composed of two distinct
structural blocks: an encoder and a decoder, as shown in Fig.

Image compression system

Image f(x, y) is fed into the encoder, which creates a set of
symbols from the input data and uses them to represent the image.
If we let n1 and n2 denote the number of information-carrying
units in the original and encoded images respectively, the
compression that is achieved can be quantified numerically via
the compression ratio, CR = n1/n2.
As shown in Fig, the encoder is responsible for reducing the
coding, inter-pixel and perceptual redundancies of the input
image. In the first stage, the mapper transforms the input image
into a format designed to reduce inter-pixel redundancies. In the
second stage, the quantizer block reduces the accuracy of the
mapper's output in accordance with a predefined criterion. In the
third and final stage, a symbol encoder creates a code for the
quantizer output and maps the output in accordance with the code.
The decoder's blocks perform, in reverse order, the inverse
operations of the encoder's symbol coder and mapper blocks. As
quantization is irreversible, an inverse quantization stage is
not included.

Advantages of Image Compression

The benefits of image compression can be listed as follows:
1. It provides potential cost savings when sending data over a
   switched telephone network, where the cost of a call is usually
   based on its duration.
2. It not only reduces storage requirements but also overall
   execution time.
3. It reduces transmission errors since fewer bits are
   transferred.
4. It also provides a level of security against illicit
   monitoring.

Categories
Image compression techniques are broadly classified into two
categories, depending on whether or not an exact replica of the
original image can be reconstructed from the compressed image.
These are:
1. Lossy techniques
2. Lossless techniques

2. Lossy Compression Techniques


2.1 Introduction
Lossy schemes provide much higher compression ratios than
lossless schemes. Lossy schemes are widely used since the
quality of the reconstructed images is adequate for most
applications. By this scheme, the decompressed image is not
identical to the original image, but reasonably close to it.

Lossy image compression

As shown in Fig, the prediction/transformation/decomposition
process is completely reversible. The quantization process results
in loss of information. The entropy coding after the quantization
step, however, is lossless. Decoding is the reverse process:
firstly, entropy decoding is applied to the compressed data to get
the quantized data; secondly, de-quantization is applied to it;
and finally the inverse transformation yields the reconstructed
image.
Major performance considerations of a lossy compression scheme
include:
1. Compression ratio
2. Signal-to-noise ratio
3. Speed of encoding and decoding

Lossy compression techniques include the following schemes:


1. Transform coding
2. Vector quantization
3. Fractal Image Compression

2.2 Techniques
Transform Coding
In this coding scheme, transforms such as the DFT (Discrete
Fourier Transform) and DCT (Discrete Cosine Transform) are used to
convert the pixels of the original image into frequency-domain
coefficients. These coefficients have several desirable
properties. One is the energy compaction property, which results
in most of the energy of the original data being concentrated in
only a few significant transform coefficients. Only those few
significant coefficients are selected and the remaining ones are
discarded. The selected coefficients are considered for further
quantization and entropy encoding. DCT coding has been the most
common approach to transform coding.

DCT
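The energy compaction idea above can be sketched with a small example. This is a minimal sketch, not JPEG itself, using the DFT from NumPy on a 1-D signal: transform, keep only the few largest-magnitude coefficients, discard the rest, and invert. For a smooth signal the reconstruction from a handful of coefficients is already close:

```python
import numpy as np

# Minimal transform-coding sketch: keep only the n_keep most
# significant (largest-magnitude) transform coefficients and
# zero out the rest before the inverse transform.
def transform_code(signal, n_keep):
    coeffs = np.fft.fft(signal)
    keep = np.argsort(np.abs(coeffs))[-n_keep:]  # indices of the largest
    compressed = np.zeros_like(coeffs)
    compressed[keep] = coeffs[keep]
    return compressed

def reconstruct(compressed):
    return np.real(np.fft.ifft(compressed))

# A smooth signal concentrates its energy in a few coefficients,
# so 4 of 64 coefficients suffice to reconstruct it here.
t = np.linspace(0, 1, 64, endpoint=False)
x = np.sin(2 * np.pi * t) + 0.5 * np.cos(2 * np.pi * 3 * t)
x_hat = reconstruct(transform_code(x, 4))
```

Here the signal is exactly a sum of two sinusoids, so its energy sits in exactly four DFT coefficients and the reconstruction is essentially lossless; for natural images the discarded coefficients carry a small but nonzero error.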

Vector Quantization
The basic idea in this technique is to develop a dictionary of
fixed-size vectors, called code vectors. As shown in Fig, a vector
is usually a block of pixel values. A given image is then
partitioned into non-overlapping blocks (vectors) called image
vectors. For each image vector, the closest matching code vector
in the dictionary is determined, and its index in the dictionary
is used as the encoding of the original image vector. Thus, each
image is represented by a sequence of indices that can be further
entropy coded.
(a)

(b)

(a) Vector quantization coding procedure (b) decoding procedure
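The encode/decode steps above can be sketched as follows. This is a minimal sketch with a tiny hand-picked codebook of 4-pixel blocks; in practice the codebook is trained (e.g. with the LBG / k-means algorithm), which is omitted here:

```python
import numpy as np

# A hand-picked codebook of 4-pixel code vectors (illustrative only).
codebook = np.array([
    [0, 0, 0, 0],          # flat dark block
    [255, 255, 255, 255],  # flat bright block
    [0, 0, 255, 255],      # edge block
])

def vq_encode(vectors):
    # Index of the nearest code vector (Euclidean distance) per vector.
    dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

def vq_decode(indices):
    # Decoding is a simple table lookup into the codebook.
    return codebook[indices]

blocks = np.array([[10, 5, 0, 8], [250, 240, 255, 251], [3, 1, 250, 249]])
indices = vq_encode(blocks)   # one small integer index per block
decoded = vq_decode(indices)
```

Each block is thus replaced by a single index, which is what gets entropy coded; decoding only approximates the original blocks by their nearest code vectors.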

Fractal Image Compression


The essential idea here is to decompose the image into segments
by using standard image processing techniques such as color
separation, edge detection, and spectrum and texture analysis.
Then each segment is looked up in a library of fractals. The
library actually contains codes called iterated function system
(IFS) codes, which are compact sets of numbers. Using a systematic
procedure, a set of codes for a given image is determined such
that, when the IFS codes are applied to a suitable set of image
blocks, they yield an image that is a very close approximation of
the original. This scheme is highly effective for compressing
images that have good regularity and self-similarity.

Original image and self-similar portions of image

Now, we want to find a map W which takes an input image and
yields an output image. If we want to know when W is contractive,
we will have to define a distance between two images. The distance
is defined as

    d(f, g) = sup_{(x, y) in P} | f(x, y) - g(x, y) |,

where f and g are the grey-level values of the pixels (for a
grayscale image), P is the space of the image, and x and y are the
coordinates of any pixel. This distance finds the position (x, y)
where the images f and g differ the most.
Natural images are not exactly self-similar. The Lena image, a
typical image of a face, does not contain the type of
self-similarity that can be found in the Sierpinski triangle. But
the next image shows that we can find self-similar portions of the
image: a part of her hat is similar to a portion of the reflection
of the hat in the mirror.

The Compression Fractal Can Achieve

The compression ratio for the fractal scheme is hard to measure,
since the image can be decoded at any scale. For example, the
decoded image in Figure 3 is a portion of a 5.7-to-1 compression
of the whole Lena image. It is decoded at 4 times its original
size, so the full decoded image contains 16 times as many pixels,
and hence the compression ratio is 91.2 to 1. This may seem like
cheating, but since the 4-times-larger image has detail at every
scale, it really is not.

Encoding Images
The previous theorems tell us that the transformation W will have
a unique fixed point in the space of all images. That is, whatever
image (or set) we start with, we can repeatedly apply W to it and
we will converge to a fixed image. Suppose we are given an image f
that we wish to encode. This means we want to find a collection of
transformations w1, w2, ..., wN and want f to be the fixed point
of the map W (see the Fixed Point Theorem). In other words, we
want to partition f into pieces to which we apply the
transformations wi, and get back the original image f. A typical
image of a face does not contain the type of self-similarity of
the fern in the figure, but it does contain other types of
self-similarity: a portion of the reflection of the hat in the
mirror is similar, though not identical, to the original. This is
the kind of self-similarity used here: the image will be formed
from copies of properly transformed parts of itself. These
transformed parts do not fit together, in general, to form an
exact copy of the original image, and so we must allow some error
in our representation of an image as a set of transformations.
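The fixed-point behavior described above can be illustrated with a toy example. This is not fractal image coding itself, just a minimal sketch of the convergence property: repeatedly applying a contractive map W to any starting "image" converges to the same fixed point.

```python
# Toy contractive map: average each pixel with a fixed pattern.
# This is a contraction with factor 0.5, so iterating it from ANY
# starting image converges to a unique fixed point (here, the
# pattern itself, since x = 0.5*x + 0.5*p implies x = p).
def W(image, pattern):
    return [0.5 * p + 0.5 * q for p, q in zip(image, pattern)]

pattern = [10.0, 40.0, 90.0, 160.0]
img = [0.0, 0.0, 0.0, 0.0]      # arbitrary starting image
for _ in range(50):
    img = W(img, pattern)
# img has now converged to the fixed point of W.
```

In fractal coding the map W is instead built from the transformations wi found during encoding, and the decoder recovers the image by exactly this kind of iteration.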

3. Lossless Compression Techniques


3.1 Introduction
In lossless compression techniques, the original image can be
perfectly recovered from the compressed image. These are also
called noiseless, since they do not add noise to the signal. This
is also known as entropy coding, since these techniques exploit
the statistics of the data to minimize redundancy.
The following techniques are included in lossless compression:
1. Run length encoding
2. Huffman encoding
3. Area coding

3.2 Techniques
Run Length Encoding
This is a very simple compression method used for sequential
data. It is very useful in repetitive data. This technique replaces
sequences of identical pixels, called runs by shorter symbols. The
run length code for a gray scale image is represented by a
sequence {Vi, Ri} where Vi is the intensity of pixel and Ri refers
to the number of consecutive pixels with the intensity Vi as
shown in the figure. If both Vi and Ri are represented by one byte,
this span of 12 pixels is coded using eight bytes yielding a
compression ration of 1: 5.

Run-length encoding

Huffman Encoding
This is a general technique for coding symbols based on their
statistical occurrence frequencies.

Huffman encoding

The pixels in the image are treated as symbols. The symbols that
occur more frequently are assigned a smaller number of bits,
while the symbols that occur less frequently are assigned a
relatively larger number of bits. The Huffman code is a prefix
code: the binary code of any symbol is not the prefix of the code
of any other symbol. Most image coding standards use lossy
techniques in the earlier stages of compression and use Huffman
coding as the final step.
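The code construction can be sketched with the classic greedy algorithm: repeatedly merge the two least frequent subtrees, prepending a bit to the codes in each. This is a minimal sketch using only the standard library; the pixel values are illustrative:

```python
import heapq
from collections import Counter

# Build a Huffman code: frequent symbols get shorter prefix-free codes.
def huffman_codes(symbols):
    freq = Counter(symbols)
    # Heap entries: (frequency, tie-breaker, {symbol: code-so-far}).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)   # two least frequent subtrees
        f2, i2, c2 = heapq.heappop(heap)
        # Prepend a distinguishing bit to every code in each subtree.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, i2, merged))
    return heap[0][2]

pixels = [0, 0, 0, 0, 255, 255, 128, 30]
codes = huffman_codes(pixels)
```

With these frequencies (4, 2, 1, 1) the most common pixel value 0 receives a 1-bit code and the rarest values receive 3-bit codes, and no code is a prefix of another.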

Area Coding
Area coding is an enhanced form of run length coding that reflects
the two-dimensional character of images. This is a significant
advance over the other lossless methods. For coding an image, it
does not make much sense to interpret it as a sequential stream,
as it is in fact an array of sequences building up a
two-dimensional object. The algorithms for area coding try to find
rectangular regions with the same characteristics. These regions
are coded in a descriptive form as an element with two points and
a certain structure. This type of coding can be highly effective,
but it bears the problem of being a nonlinear method that cannot
easily be implemented in hardware. Thus, the performance in terms
of compression time is not competitive, although the compression
ratio is.

4. JPEG
4.1 JPEG
The JPEG (Joint Photographic Experts Group) standard is briefly
described in this section, and its flow chart and algorithm are
introduced.

JPEG Encoder
The block diagram of the JPEG standard is shown in the figure.
The YCbCr color transform and the chrominance subsampling format
are not defined in the JPEG standard, but most JPEG software
performs this processing because it lets the JPEG encoder reduce
the data quantity more efficiently. However, we do not discuss
the basic concept of subsampling formats here, and will focus on
the JPEG encoder as follows.
[Figure: Source image → 8×8 DCT → quantization (using the
quantization table) → DC coefficients: differential coding →
entropy encoder (Huffman table for DC); AC coefficients:
zero-run-length coding → entropy encoder (Huffman table for AC) →
bitstream.]

The flow chart of the JPEG standard

2D 8x8 DCT
As we learned, the energy of natural images is concentrated in
the low frequencies, so we can use the DCT to separate the
low-frequency and high-frequency components, preserve the
low-frequency component as far as possible, and discard the
high-frequency component to reduce the bit rate. The encoder
performs the DCT on these macroblocks to achieve decorrelation
and frequency analysis. Because the transform is performed on 8×8
blocks, the forward 2-D DCT formula is defined in (3), where
x[m, n] and X[u, v] represent the input signal and the DCT
coefficients, respectively.
                     7   7
    X[u, v] = (1/4) C(u) C(v) Σ   Σ  x[m, n] cos[(2m+1)uπ/16] cos[(2n+1)vπ/16],
                    m=0 n=0

    for u = 0, ..., 7 and v = 0, ..., 7,                    (3)

where

    C(k) = 1/√2  for k = 0,
    C(k) = 1     otherwise.
The discrete cosine transform shown is closely related to the
Discrete Fourier Transform (DFT). Both take a set of points from
the spatial domain and transform them into an equivalent
representation in the frequency domain. The difference is that
while the DFT takes a discrete signal in one spatial dimension
and transforms it into a set of points in one frequency dimension,
the Discrete Cosine Transform (for an 8×8 block of values) takes
a 64-point discrete signal, which can be thought of as a function
of two spatial dimensions x and y, and turns it into 64 DCT
coefficients expressed in terms of the 64 unique orthogonal 2-D
basis functions.

The DCT coefficient values are the relative amounts of the 64
spatial frequencies present in the original 64-point input. The
element in the upper-left corner, corresponding to zero frequency
in both directions, is the DC coefficient, and the rest are called
AC coefficients.

64 two-dimensional spatial frequencies

Because pixel values typically vary slowly from point to point
across an image, the FDCT processing step lays the foundation for
achieving data compression by concentrating most of the signal in
the lower spatial frequencies. For a typical 8×8 sample block from
a typical source image, most of the spatial frequencies have zero
or near-zero amplitude and need not be encoded.
At the decoder, the IDCT reverses this processing step. It takes
the 64 DCT coefficients and reconstructs a 64-point output image
signal by summing the basis signals. Mathematically, the DCT is a
one-to-one mapping for 64-point vectors between the image and
frequency domains. In principle, the DCT introduces no loss to the
source image samples; it merely transforms them to a domain in
which they can be more efficiently encoded.

Quantization
After the DCT, the encoder performs quantization to reduce the
precision of the data and discard the less important
high-frequency coefficients. As mentioned above, the human eye is
more sensitive to the low-frequency components than to the
high-frequency components, so JPEG quantization assigns a large
quantization step size to the high-frequency components to
discard redundant information, and a small quantization step size
to the low-frequency components to preserve the significant
information. Fig. 18 shows the two quantization tables defined in
JPEG, where QY is the luminance quantization table and QC is the
chrominance quantization table.
    QY =
    16  11  10  16  24  40  51  61
    12  12  14  19  26  58  60  55
    14  13  16  24  40  57  69  56
    14  17  22  29  51  87  80  62
    18  22  37  56  68 109 103  77
    24  35  55  64  81 104 113  92
    49  64  78  87 103 121 120 101
    72  92  95  98 112 100 103  99

    QC =
    17  18  24  47  99  99  99  99
    18  21  26  66  99  99  99  99
    24  26  56  99  99  99  99  99
    47  66  99  99  99  99  99  99
    99  99  99  99  99  99  99  99
    99  99  99  99  99  99  99  99
    99  99  99  99  99  99  99  99
    99  99  99  99  99  99  99  99

Quantization tables

The quantization step size is smaller in the upper-left region to
preserve the low-frequency components. On the other hand, the
quantization step is larger in the lower-right region to reduce
the less important high-frequency components to zero. Since the
human eye is less sensitive to distortion of the high-frequency
features of an image, it is not easy for us to observe the
difference between the original image and the quantized image.
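The quantization and dequantization steps can be sketched as element-wise operations against a table. This is a minimal sketch using the luminance table QY from above; the example block, with one large low-frequency coefficient and one small high-frequency coefficient, is illustrative:

```python
# Luminance quantization table QY from the JPEG standard.
QY = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

# Quantization: divide each DCT coefficient by its table entry and round.
def quantize(dct_block, q_table):
    return [[round(dct_block[i][j] / q_table[i][j]) for j in range(8)]
            for i in range(8)]

# Dequantization multiplies back; the rounding error is the loss.
def dequantize(q_block, q_table):
    return [[q_block[i][j] * q_table[i][j] for j in range(8)]
            for i in range(8)]

block = [[0.0] * 8 for _ in range(8)]
block[0][0] = 800.0   # large low-frequency (DC) coefficient
block[7][7] = 30.0    # small high-frequency coefficient
q = quantize(block, QY)
deq = dequantize(q, QY)
```

After the round trip the DC coefficient survives exactly (800 / 16 = 50, then 50 × 16 = 800), while the small high-frequency coefficient is rounded to zero by its large step size of 99 and is lost, exactly the behavior described above.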

Differential Coding of DC Coefficients


After the 2-D DCT and quantization, we find that the AC terms of
an 8×8 block consist of many zeros, as in Fig. 19. The DC
coefficient is the mean of the corresponding block, and the
current DC coefficient is very similar to the DC coefficients of
its neighboring blocks. Thus, the JPEG encoder performs predictive
coding on the DC coefficients to reduce the redundancy, as shown
in Fig. 20. The differential coding of the DC coefficient is
defined as the difference of DCi and DCi-1.

[Figure 19: an 8×8 block after 2-D DCT and quantization; the DC
coefficient is 10, a few small AC values (-6, -1, -5) remain, and
the rest are zero.]

[Figure 20: differential coding of the DC coefficients of
neighboring blocks, Block i-1 (with DCi-1) and Block i (with DCi):

    Diffi = DCi - DCi-1 ]

Differential coding of DC coefficients
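This differential (predictive) step can be sketched in a few lines. A minimal sketch, with the first DC value predicted from zero and an illustrative sequence of DC values:

```python
# Differential coding of DC coefficients: each block stores
# Diff_i = DC_i - DC_(i-1), with the first DC predicted from 0.
def dc_differences(dc_values):
    diffs = []
    prev = 0
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs

# The decoder recovers the DC values by a running sum of the diffs.
def dc_reconstruct(diffs):
    dcs, prev = [], 0
    for d in diffs:
        prev += d
        dcs.append(prev)
    return dcs

dcs = [10, 12, 11, 9]        # similar DC values in neighboring blocks
diffs = dc_differences(dcs)  # small differences: [10, 2, -1, -2]
```

Because neighboring DC values are similar, the differences are small numbers clustered around zero, which the subsequent entropy coder can represent in fewer bits than the raw DC values.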

SUMMARY
This tutorial starts from an introduction of the background: why
we need image compression, how an image is formed, what kinds of
image processing tools exist, and what we can do with image
compression techniques, through to how each method is implemented.
I did not talk much about how to implement JPEG and JPEG 2000,
but focused more on introducing different kinds of compression
techniques. The drawback is that each cannot be discussed in
detail, but since the topic is the development of compression
techniques, I preferred to cover the subject broadly. Therefore,
different methods are covered.
There are still many other analysis tools with more complex
algorithms that have not been introduced here. Readers who are
interested can find more information in the references.
