ON
“VIDEO COMPRESSION”
Table of Contents
Chapter 1. Introduction
2.1.1 Introduction
2.1.2 Video Compression Technology
2.1.3 Compression Standards
2.1.4 MPEG-1
2.1.5 MPEG-2
2.1.6 MPEG-4
2.1.7 MPEG-7
2.1.8 H.261
2.2 Methodology Adopted
References
Chapter-1 Introduction
1.1. OBJECTIVE
1.1.1. NEED OF THE SYSTEM: Uncompressed video (and audio) data are huge. In
HDTV, the bit rate easily exceeds 1 Gbps, posing serious problems for storage and
network communication. For example, one of the formats defined for HDTV
broadcasting within the United States is 1920 pixels horizontally by 1080 lines
vertically, at 30 frames per second. Multiplying these numbers together, along with 8
bits for each of the three primary colors, gives a total data rate of approximately 1.5
Gb/s. Because of the 6 MHz channel bandwidth allocated, each channel will only
support a data rate of 19.2 Mb/s, which is further reduced to 18 Mb/s because the
channel must also carry audio, transport, and ancillary data. This restriction means
that the original signal must be compressed by a factor of approximately 83:1. That
figure is all the more impressive given that the intent is to deliver very high quality
video to the end user, with as few visible artifacts as possible.
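The arithmetic above can be checked with a short sketch; the 18 Mb/s usable channel rate is taken from the text.

```python
# Back-of-the-envelope check of the HDTV figures above: raw bit rate
# and the compression ratio needed to fit the channel budget.

def raw_bit_rate(width, height, fps, bits_per_pixel):
    """Uncompressed video bit rate in bits per second."""
    return width * height * fps * bits_per_pixel

raw = raw_bit_rate(1920, 1080, 30, 24)   # 8 bits x 3 primary colors
channel = 18e6                           # b/s left after audio/transport data

ratio = raw / channel
print(f"raw rate: {raw / 1e9:.2f} Gb/s, ratio needed: {ratio:.0f}:1")
# raw rate comes out at about 1.49 Gb/s, and the ratio at about 83:1
```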
1.1.2. OUTCOME OF THE SYSTEM:
rate from which they are put out at a slower, synchronous rate. The compressor adaptively
determines a rate-buffer-capacity control feedback component from the instantaneous data
content of the rate buffer memory relative to its capacity, and it controls the absolute
quantity of data resulting from the normalization step so that the buffer memory is never
completely emptied and never completely filled. During expansion, the system essentially
mirrors the steps performed during compression. An efficient, high-speed decoder forms an
important aspect of the present invention, and the compression system forms an important
element of the disclosed color broadcast compression system.
C/C++ and Fortran API Reference — Covers functions used by the MATLAB external
interfaces, providing information on syntax in the calling language, description, arguments,
return values, and examples
The MATLAB application can read data in various file formats, discussed in the following
sections:
Recommended Methods for Importing Data
Importing MAT-Files
Importing Text Data Files
Importing XML Documents
Importing Excel Spreadsheets
Importing Scientific Data Files
Importing Images
Importing Audio and Video
Importing Binary Data with Low-Level I/O
Chapter-2 Problem Analysis
The most commonly used method works by comparing each frame in the video with the
previous one. If the frame contains areas where nothing has moved, the system simply issues a
short command that copies that part of the previous frame, bit-for-bit, into the next one. If
sections of the frame move in a simple manner, the compressor emits a (slightly longer)
command that tells the decompressor to shift, rotate, lighten, or darken the copy — a longer
command, but still much shorter than intraframe compression. Interframe compression works
well for programs that will simply be played back by the viewer, but can cause problems if the
video sequence needs to be edited.
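The "copy unchanged areas" idea above can be sketched in a few lines of Python. The 2x2 block size and tiny frames are toy values chosen only to keep the example small; real coders work on 8x8 or 16x16 blocks and also emit the motion/adjustment commands described in the text.

```python
# A minimal conditional-replenishment sketch: compare each block of the
# current frame with the previous frame and emit either a short SKIP
# token (copy the block from the previous frame) or the raw pixels.

def encode_interframe(prev, curr, block=2):
    """Return a list of ('skip',) or ('raw', pixels) tokens, one per block."""
    tokens = []
    h, w = len(curr), len(curr[0])
    for y in range(0, h, block):
        for x in range(0, w, block):
            same = all(prev[y + j][x + i] == curr[y + j][x + i]
                       for j in range(block) for i in range(block))
            if same:
                tokens.append(('skip',))
            else:
                tokens.append(('raw', [curr[y + j][x + i]
                                       for j in range(block) for i in range(block)]))
    return tokens

prev = [[10, 10, 20, 20],
        [10, 10, 20, 20]]
curr = [[10, 10, 99, 20],
        [10, 10, 20, 20]]
tokens = encode_interframe(prev, curr)
# the left block is unchanged (skip); the right block is sent raw
```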
At its most basic level, compression is performed when an input video stream is analyzed and
information that is indiscernible to the viewer is discarded. Each event is then assigned a code:
commonly occurring events are assigned few bits, while rare events receive longer codes.
These steps are commonly called signal analysis, quantization and variable-length encoding
respectively. There are four common methods of compression: the discrete cosine transform
(DCT), vector quantization (VQ), fractal compression, and the discrete wavelet transform
(DWT):
The DWT operates on the image as a whole, in contrast to other methods (DCT) that work on
smaller pieces of the desired data. The result is a hierarchical representation of an image,
where each layer represents a frequency band.
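Of the four methods listed, the DCT is the one used by the MPEG and H.26x standards discussed below. A sketch of the 8x8 forward DCT-II, written directly from the textbook definition, is shown here in pure Python; real coders use fast factored implementations rather than this quadruple loop.

```python
# 2-D DCT-II of an 8x8 block: F(u,v) = (1/4) C(u) C(v)
#   sum over x,y of f(x,y) cos((2x+1)u*pi/16) cos((2y+1)v*pi/16)
import math

N = 8

def dct2(block):
    """Forward 2-D DCT of an NxN block given as a list of lists."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = sum(block[y][x]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    for x in range(N) for y in range(N))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out

flat = [[128] * N for _ in range(N)]
coeffs = dct2(flat)
# a flat block concentrates all its energy in the DC coefficient coeffs[0][0],
# which is why quantization can discard most AC coefficients on smooth areas
```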
MPEG stands for the Moving Picture Experts Group. MPEG is an ISO/IEC working group,
established in 1988 to develop standards for digital audio and video formats. There are four
MPEG standards being used or in development. Each compression standard was designed with
a specific application and bit rate in mind, although MPEG compression scales well with
increased bit rates. They include:
2.1.3.1 MPEG-1
Designed for bit rates up to 1.5 Mbit/s, this is the standard for the compression of moving
pictures and audio. It was based on CD-ROM video applications, and is a popular standard for
video on the Internet, transmitted as .mpg files. In addition, Layer 3 of MPEG-1 audio is the
most popular standard for digital compression of audio, known as MP3. MPEG-1 is the
compression standard for VideoCD, the most popular video distribution format throughout
much of Asia.
2.1.3.2 MPEG-2
Designed for bit rates between 1.5 and 15 Mbit/s, this is the standard on which digital
television set-top boxes and DVD compression are based. It is based on MPEG-1, but designed
for the compression and transmission of digital broadcast television. The most significant
enhancement over MPEG-1 is its ability to efficiently compress interlaced video. MPEG-2
scales well to HDTV resolution and bit rates, obviating the need for an MPEG-3.
2.1.3.3 MPEG-4
Standard for multimedia and Web compression. MPEG-4 is based on object-based
compression, similar in nature to the Virtual Reality Modeling Language. Individual objects
within a scene are tracked separately and compressed together to create an MPEG-4 file. This
results in very efficient compression that is highly scalable, from low bit rates to very high
ones. It also allows developers to control objects independently in a scene, and therefore to
introduce interactivity.
2.1.3.4 MPEG-7
This standard, currently under development, is also called the Multimedia Content
Description Interface. When released, the group hopes the standard will provide a
framework for multimedia content that will include information on content manipulation,
filtering and personalization, as well as the integrity and security of the content. Contrary to the
previous MPEG standards, which described actual content, MPEG-7 will represent information
about the content.
2.1.3.5 H.261
H.261 is an ITU standard designed for two-way communication over ISDN lines (video
conferencing) and supports data rates which are multiples of 64 kbit/s. The
algorithm is based on DCT and can be implemented in hardware or software and uses
intraframe and interframe compression. H.261 supports CIF and QCIF resolutions.
MPEG-4's advantages include high compression, low bit rate and motion-compensation
support. Its disadvantages are latency and blocking artifacts. JPEG, JPEG2000, and MPEG-4
have all been used in video surveillance systems, with the choice depending on what is most
important in that particular application. H.264 is an advanced compression scheme which is
also starting to find its way into video surveillance systems. H.264 offers higher compression
at the expense of additional hardware complexity; it is not examined in this paper, although
FPGA-based solutions for H.264 exist.
2.1.3.6 MPEG-21:
The MPEG-21 standard, from the Moving Picture Experts Group, aims at defining an open
framework for multimedia applications. MPEG-21 is ratified in the standards ISO/IEC 21000 -
Multimedia framework (MPEG-21).
Digital Items can be considered the kernel of the Multimedia Framework, and users can be
considered as those who interact with them inside the Multimedia Framework. At its most
basic level, MPEG-21 provides a framework in which one user interacts with another, and the
object of that interaction is a Digital Item. Accordingly, the main objective of MPEG-21 is to
define the technology needed to support users in exchanging, accessing, consuming, trading
or otherwise manipulating Digital Items in an efficient and transparent way.
The standard also covers the storage of an MPEG-21 Digital Item in a file format based on the
ISO base media file format, with some or all of the Digital Item's ancillary data (such as
movies, images or other non-XML data) within the same file.
2.1.3.7 H.263:
H.263 is a video compression standard originally designed as a low-bit rate compressed format
for video conferencing. It was developed by the ITU-T Video Coding Experts Group
(VCEG).
H.263 has since found many applications on the internet: much Flash video content (as used
on sites such as YouTube, Google Video, MySpace, etc.) used to be encoded in Sorenson Spark
format, though many sites now use VP6 or H.264 encoding.
H.263 was developed as an evolutionary improvement based on experience from H.261, the
previous ITU-T standard for video compression, and the MPEG-1 and MPEG-2 standards. Its
first version was completed in 1995 and provided a suitable replacement for H.261 at all bit
rates. It was further enhanced in projects known as H.263v2; MPEG-4 Part 2 is H.263
compatible in the sense that a basic H.263 bit stream is correctly decoded by an MPEG-4
Video decoder.
2.1.3.8 H.264:
The next enhanced codec developed by ITU-T VCEG (in partnership with MPEG) after H.263
is the H.264 standard, also known as AVC and MPEG-4 part 10. As H.264 provides a
significant improvement in capability beyond H.263, the H.263 standard is now considered a
legacy design. Most new videoconferencing products now include H.264 as well as H.263 and
H.261 capabilities. H.264 is used in such applications as players for Blu-ray Discs, videos from
YouTube and the iTunes Store, web software such as the Adobe Flash Player and Microsoft
Silverlight, broadcast services for DVB and SBTVD, direct-broadcast satellite television
services, cable television services, and real-time videoconferencing.
2.1.4 MPEG-1
The ‘Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to
about 1.5 Mbit/s’ (ISO/IEC 11172), or MPEG-1 as it is more commonly known, standardizes
the storage and retrieval of moving pictures and audio on storage media, and forms the basis
for the Video CD and MP3 formats.
This part of the specification describes the coded representation for the compression of video
sequences.
The basic idea of MPEG video compression is to discard any unnecessary information. An
MPEG-1 encoder analyses:
• how much movement there is in the current frame compared to the previous frame;
• what changes of color have taken place since the last frame;
• what changes in light or contrast have taken place since the last frame;
• what elements of the picture have remained static since the last frame.
The encoder then looks at each individual pixel to see if movement has taken place. If there
has been no movement, the encoder stores an instruction either to repeat the same frame or to
repeat the same frame moved to a different position.
1. I: intra frames
2. B: bidirectional frames
3. P: predicted frames
Audio, video and time codes are converted into one single stream.
MPEG-1 compression treats video as a sequence of separate images. ‘Picture elements’, often
referred to as ‘pixels’, are the elements of the image. Each pixel consists of three components:
one for luminance (Y) and two for chrominance (Cb and Cr). MPEG-1 encodes Y pixels at
full resolution, as the Human Visual System (HVS) is most sensitive to luminance.
Quantization
Predictive coding: the difference between the predicted pixel value and the real value is coded.
Motion compensation (MC) predicts the values of a block of pixels (1 block = 8x8 pixels) in
an image from those of a known block of pixels. A vector describes the two-dimensional
movement; if no movement takes place, the vector is 0.
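Finding that vector amounts to a block-matching search. The sketch below does an exhaustive search; the 2x2 blocks and +/-1 search range are toy values chosen only to keep it small, where real coders use 8x8 or 16x16 blocks and much larger windows.

```python
# Exhaustive block-matching motion estimation.

def sad(ref, cur, dx, dy, bx, by, bs):
    """Sum of absolute differences between a current block and a
    displaced block of the reference frame."""
    return sum(abs(ref[by + dy + j][bx + dx + i] - cur[by + j][bx + i])
               for j in range(bs) for i in range(bs))

def motion_vector(ref, cur, bx, by, bs, search):
    """Best (dx, dy) within +/-search that minimizes the SAD."""
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if (0 <= by + dy and by + dy + bs <= len(ref)
                    and 0 <= bx + dx and bx + dx + bs <= len(ref[0])):
                cost = sad(ref, cur, dx, dy, bx, by, bs)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best[1], best[2]

# a scene whose content shifted one pixel to the left between frames,
# so the best match for each block lies one pixel to the right in ref
ref = [[i + 6 * j for i in range(6)] for j in range(6)]
cur = [[ref[j][(i + 1) % 6] for i in range(6)] for j in range(6)]
mv = motion_vector(ref, cur, 2, 2, 2, 1)
```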
Interframe coding
Sequential coding
VLC (variable-length coding)
Image interpolation
Intra (I frames) are coded independently of other images.
MPEG codes images progressively: interlaced images need to be converted into a de-interlaced
format before encoding; the video is then encoded, and the encoded video is converted back
into an interlaced form.
To achieve a high compression ratio, an appropriate spatial resolution for the signal is chosen,
and block-based motion compensation is used to reduce the temporal redundancy.
Motion compensation is used for causal prediction of the current picture from a
previous picture, for non-causal prediction of the current picture from a future picture, or for
interpolative prediction from past and future pictures.
The difference signal, the prediction error, is further compressed using the discrete cosine
transform (DCT) to remove spatial correlation and is then quantized. Finally, the motion
vectors are combined with the DCT information, and coded using variable length codes.
Part 1 addresses the problem of combining one or more data streams from the video and audio
parts of the MPEG-1 standard with timing information to form a single stream. This is an
important function because, once combined into a single stream, the data are in a form well
suited to digital storage or transmission.
Part 2 specifies a coded representation that can be used for compressing video sequences -
both 625-line and 525-line - to bit rates around 1.5 Mbit/s. Part 2 was developed to operate
principally from storage media offering a continuous transfer rate of about 1.5 Mbit/s.
Nevertheless it can be used more widely than this because the approach taken is generic. A
number of techniques are used to achieve a high compression ratio. The first is to select an
appropriate spatial resolution for the signal. The algorithm then uses block-based motion
compensation to reduce the temporal redundancy. Motion compensation is used for causal
prediction of the current picture from a previous picture, for non-causal prediction of the
current picture from a future picture, or for interpolative prediction from past and future
pictures. The difference signal, the prediction error, is further compressed using the discrete
cosine transform (DCT) to remove spatial correlation and is then quantized. Finally, the motion
vectors are combined with the DCT information, and coded using variable length codes.
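The quantize-then-variable-length-code step can be sketched as follows. The step size and coefficient values are made up for illustration, and a real MPEG coder additionally zig-zag scans the 8x8 block and maps the (run, level) pairs through standardized VLC tables rather than the simplified pairing shown here.

```python
# Quantization collapses small coefficients to zero; run-level coding
# then collapses the resulting zero runs, which is what makes the
# variable-length entropy code effective.

def quantize(coeffs, step):
    """Divide each coefficient by the step size, truncating toward zero."""
    return [int(c / step) for c in coeffs]

def run_level(qcoeffs):
    """(zero-run, nonzero-level) pairs, as fed to a VLC table."""
    pairs, run = [], 0
    for q in qcoeffs:
        if q == 0:
            run += 1
        else:
            pairs.append((run, q))
            run = 0
    return pairs

q = quantize([905.0, -31.0, 7.0, 2.0, 0.0, 12.0], step=8)
pairs = run_level(q)
# the two mid-range coefficients quantize to zero and disappear into a run
```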
Part 3 specifies a coded representation that can be used for compressing audio sequences -
both mono and stereo. Input audio samples are fed into the encoder. The mapping creates a
filtered and subsampled representation of the input audio stream. A psychoacoustic model
creates a set of data to control the quantiser and coding. The quantiser and coding block creates
a set of coding symbols from the mapped input samples. The block 'frame packing' assembles
the actual bit stream from the output data of the other blocks, and adds other information (e.g.
error correction) if necessary.
Part 4 specifies how tests can be designed to verify whether bitstreams and decoders meet the
requirements as specified in parts 1, 2 and 3 of the MPEG-1 standard. These tests can be used
by:
• manufacturers of encoders, and their customers, to verify whether the encoder produces
valid bitstreams.
• manufacturers of decoders and their customers to verify whether the decoder meets the
requirements specified in parts 1, 2 and 3 of the standard for the claimed decoder
capabilities.
• applications to verify whether the characteristics of a given bitstream meet the
application requirements, for example whether the size of the coded picture does not
exceed the maximum value allowed for the application.
2.1.5 MPEG-2
MPEG-2 is an extension of the MPEG-1 international standard for digital compression of audio
and video signals. MPEG-2 is directed at broadcast formats at higher data rates; it provides
extra algorithmic 'tools' for efficiently coding interlaced video, supports a wide range of bit
rates and provides for multichannel surround sound coding.
1. INTRODUCTION:
The MPEG-2 standard [2] is capable of coding standard-definition television at bit rates from
about 3-15 Mbit/s and high-definition television at 15-30 Mbit/s. MPEG-2 extends the stereo
audio capabilities of MPEG-1 to multi-channel surround sound coding. MPEG-2 decoders will
also decode MPEG-1 bit streams.
2. VIDEO FUNDAMENTALS
Television services in Europe currently broadcast video at a frame rate of 25 Hz. Each frame
consists of two interlaced fields, giving a field rate of 50 Hz. The first field of each frame
contains only the odd numbered lines of the frame (numbering the top frame line as line 1).
The second field contains only the even numbered lines of the frame and is sampled in the
video camera 20 ms after the first field. It is important to note that one interlaced frame
contains fields from two instants in time. American television is similarly interlaced but with a
frame rate of just less than 30 Hz.
The red, green and blue (RGB) signals coming from a color television camera can be
equivalently expressed as luminance (Y) and chrominance (UV) components. The
chrominance bandwidth may be reduced relative to the luminance without significantly
affecting the picture quality. For standard definition video, CCIR recommendation 601 [3]
defines how the component (YUV) video signals can be sampled and digitized to form discrete
pixels. The terms 4:2:2 and 4:2:0 are often used to describe the sampling structure of the
digital picture. 4:2:2 means the chrominance is horizontally subsampled by a factor of two
relative to the luminance; 4:2:0 means the chrominance is both horizontally and vertically
subsampled by a factor of two relative to the luminance.
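The 4:2:0 case can be sketched by averaging each 2x2 neighbourhood of a chroma plane into one sample (averaging is one common choice of downsampling filter; the standards do not mandate a particular one), while the luminance plane keeps full resolution.

```python
# 4:2:0 chroma subsampling: halve a chroma plane in both directions.

def subsample_420(plane):
    """Average each 2x2 neighbourhood of a chroma plane into one sample."""
    h, w = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1]
              + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

cb = [[100, 102, 50, 52],
      [104, 106, 54, 56]]
small = subsample_420(cb)   # half the samples in each direction
```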
Spatial and temporal redundancy: Pixel values are not independent, but are correlated with
their neighbors both within the same frame and across frames. So, to some extent, the value of
a pixel is predictable given the values of neighboring pixels.
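That predictability is easy to demonstrate with simple DPCM, coding each pixel as the difference from its left neighbour; this is an illustration of the redundancy argument, not the prediction scheme MPEG-2 itself uses.

```python
# On smooth rows the differences are small and cluster near zero,
# which is exactly what makes them cheap to entropy-code.

def dpcm_encode(row):
    prev, out = 0, []
    for p in row:
        out.append(p - prev)   # difference from the previous pixel
        prev = p
    return out

def dpcm_decode(diffs):
    prev, out = 0, []
    for d in diffs:
        prev += d
        out.append(prev)
    return out

diffs = dpcm_encode([100, 101, 103, 103, 102])
# after the first sample, every difference has a small magnitude
```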
The human eye has a limited response to fine spatial detail [4], and is less sensitive to detail
near object edges or around shot-changes. Consequently, controlled impairments introduced
into the decoded picture by the bit rate reduction process should not be visible to a human
observer.
Two key techniques employed in an MPEG codec are intra-frame Discrete Cosine Transform
(DCT) coding and motion-compensated inter-frame prediction. These techniques have been
successfully applied to video bit rate reduction prior to MPEG, notably for 625-line video
contribution standards at 34 Mbit/s and video conference systems at bit rates below 2 Mbit/s.
4. MPEG-2 DETAILS
In an MPEG-2 system, the DCT and motion-compensated interframe prediction are combined.
The coder subtracts the motion-compensated prediction from the source picture to form a
'prediction error' picture. The prediction error is transformed with the DCT, the coefficients are
quantized and these quantized values coded using a VLC. The coded luminance and
chrominance prediction error is combined with 'side information' required by the decoder, such
as motion vectors and synchronizing information, and formed into a bit stream for
transmission. In the decoder, the quantized DCT coefficients are reconstructed and inverse
transformed to produce the prediction error. This is added to the motion-compensated
prediction generated from previously decoded pictures to produce the decoded output.
Picture types
In MPEG-2, three 'picture types' are defined. The picture type defines which prediction modes
may be used to code each block.
'Intra' pictures (I-pictures) are coded without reference to other pictures. Moderate compression
is achieved by reducing spatial redundancy, but not temporal redundancy. They can be used
periodically to provide access points in the bitstream where decoding can begin.
'Predictive' pictures (P-pictures) can use the previous I- or P-picture for motion compensation
and may be used as a reference for further prediction. Each block in a P-picture can either be
predicted or intra-coded. By reducing spatial and temporal redundancy, P-pictures offer
increased compression compared to I-pictures.
'Bidirectionally-predictive' pictures (B-pictures) can use the previous and next I- or P-pictures
for motion-compensation, and offer the highest degree of compression. Each block in a B-
picture can be forward, backward or bidirectionally predicted or intra-coded. To enable
backward prediction from a future frame, the coder reorders the pictures from natural 'display'
order to 'bitstream' order so that the B-picture is transmitted after the previous and next pictures
it references. This introduces a reordering delay dependent on the number of consecutive B-
pictures.
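The reordering described above can be sketched as a small function; the IBBP display pattern used in the example is just one common GOP layout, not a requirement of the standard.

```python
# Display-order to bitstream-order reordering: each B-picture must be
# transmitted after both of the anchor (I or P) pictures it predicts
# from, so the coder holds B's back until the next anchor is emitted.

def bitstream_order(display):
    """Reorder a list of 'I'/'P'/'B' picture types for transmission."""
    out, pending_b = [], []
    for pic in display:
        if pic == 'B':
            pending_b.append(pic)     # wait for the next anchor
        else:
            out.append(pic)           # I or P: an anchor picture
            out.extend(pending_b)     # now the held-back B's can follow
            pending_b = []
    return out + pending_b

order = bitstream_order(['I', 'B', 'B', 'P', 'B', 'B', 'P'])
# each P now precedes the two B's that reference it
```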
Buffer control: By removing much of the redundancy from the source images, the coder
outputs a variable bit rate. The bit rate depends on the complexity and predictability of the
source picture and the effectiveness of the motion-compensated prediction.
Part 1 of MPEG-2 addresses the combining of one or more elementary streams of video and
audio, as well as other data, into single or multiple streams which are suitable for storage or
transmission. This is specified in two forms: the Program Stream and the Transport Stream.
Each is optimized for a different set of applications. The Program Stream is similar to MPEG-1
Systems Multiplex. It results from combining one or more Packetized Elementary Streams
(PES), which have a common time base, into a single stream. The Program Stream is designed
for use in relatively error-free environments and is suitable for applications which may involve
software processing. Program stream packets may be of variable and relatively great length.
The Transport Stream combines one or more Packetized Elementary Streams (PES) with one
or more independent time bases into a single stream. Elementary streams sharing a common
timebase form a program. The Transport Stream is designed for use in environments where
errors are likely, such as storage or transmission in lossy or noisy media. Transport stream
packets are 188 bytes long.
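Splitting a PES payload into those fixed 188-byte packets can be sketched as below. The 4-byte header here carries only the 0x47 sync byte plus simplified PID/flag fields, and the 0xFF padding in the last packet is a simplification: the real standard pads short packets through the adaptation field and also carries a continuity counter.

```python
# Toy Transport Stream packetizer: fixed 188-byte packets, 4-byte header.

TS_SIZE = 188
HEADER = 4

def packetize(payload, pid):
    """Split a payload into simplified 188-byte TS packets."""
    packets = []
    chunk = TS_SIZE - HEADER
    for i in range(0, len(payload), chunk):
        body = payload[i:i + chunk]
        header = bytes([0x47, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
        body = body + b'\xff' * (chunk - len(body))   # simplified stuffing
        packets.append(header + body)
    return packets

pkts = packetize(b'\x00' * 400, pid=0x100)
# 400 payload bytes need 3 packets, since each carries 184 bytes
```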
Part 2 of MPEG-2 builds on the powerful video compression capabilities of the MPEG-1
standard to offer a wide range of coding tools. These have been grouped in profiles to offer
different functionalities. Since the final approval of MPEG-2 Video in November 1994, one
additional profile has been developed. This uses existing coding tools of MPEG-2 Video but is
capable of dealing with pictures having a colour resolution of 4:2:2 and a higher bitrate. Even
though MPEG-2 Video was not developed with studio applications in mind, a set of
comparison tests carried out by MPEG confirmed that MPEG-2 Video was at least as good as,
and in many cases better than, standards or specifications developed for high-bitrate or studio
applications.
The Multiview Profile (MVP) is an additional profile currently being developed. By using
existing MPEG-2 Video coding tools it is possible to efficiently encode two video sequences
issued from two cameras shooting the same scene with a small angle between them.
Part 3 of MPEG-2 - Digital Storage Media Command and Control (DSM-CC) is the
specification of a set of protocols which provides the control functions and operations specific
to managing MPEG-1 and MPEG-2 bitstreams. These protocols may be used to support
applications in both stand-alone and heterogeneous network environments. In the DSM-CC
model, a stream is sourced by a Server and delivered to a Client. Both the Server and the Client
are considered to be Users of the DSM-CC network. DSM-CC defines a logical entity called
the Session and Resource Manager (SRM) which provides a (logically) centralized
management of the DSM-CC Sessions and Resources.
Part 4 of MPEG-2 will be the specification of a multichannel audio coding algorithm not
constrained to be backwards-compatible with MPEG-1 Audio. The standard was approved in
April 1997.
Part 5 of MPEG-2 was originally planned to be coding of video when input samples are 10
bits. Work on this part was discontinued when it became apparent that there was insufficient
interest from industry for such a standard.
Part 6 of MPEG-2 is the specification of the Real-time Interface (RTI) to Transport Stream
decoders which may be utilised for adaptation to all appropriate networks carrying Transport
Streams.
2.1.6 MPEG-4
Introduction
The creation of the MPEG-4 specification arose because experts wanted higher compression
than MPEG-2, together with a standard that also worked well at low bit rates. Discussions
began at the end of 1992 and work on the standard started in July 1993.
MPEG-4 provides a standardized method of coding audio-visual objects.
Elementary Streams:
Each encoded media object has its own Elementary Stream (ES), which is sent to the decoder
and decoded individually, before composition. The following streams are created in
MPEG-4:
1 Scene Description Stream
2 Object Description Stream
3 Visual Stream
4 Audio Stream
When the data has been encoded, the data streams can be transmitted or stored separately and
need to be composed at the receiving end. Media objects are organized in a hierarchical
manner to form audio-visual scenes. Due to this organization, each media object can be
described and encoded independently of the other objects in the scene, e.g. the background.
MPEG-4/BiFS:
• Allows users to change their viewpoint in a 3D scene or to interact with media objects.
• Allows different objects in the same scene to be coded at different levels of quality.
Profiles have been developed to create conformance points for MPEG-4 tools and toolsets,
therefore interoperability of MPEG-4 products with the same Profiles and Levels can be
assured.
A Profile is a subset of the MPEG-4 Systems, Visual or Audio tool set and is used for specific
applications. It limits the tool set a decoder has to implement, since many applications only
need a portion of the MPEG-4 toolset. Profiles specified in the MPEG-4 standard include:
a. Visual Profile
b. Natural Profile
c. Synthetic & Natural/Synthetic Hybrid Profiles
d. Audio Profile
e. Graphic Profile
f. Scene Graph Profile
The systems part of MPEG-4 addresses the description of the relationship between the
audio-visual components that constitute a scene. The relationship is described at two main
levels.
• The Binary Format for Scenes (BIFS) describes the spatio-temporal arrangements of
the objects in the scene. Viewers may have the possibility of interacting with the
objects, e.g. by rearranging them on the scene or by changing their own point of view in
a 3D virtual environment. The scene description provides a rich set of nodes for 2-D
and 3-D composition operators and graphics primitives.
• At a lower level, Object Descriptors (ODs) define the relationship between the
Elementary Streams pertinent to each object (e.g. the audio and the video stream of a
participant in a videoconference). ODs also provide additional information.
2.1.7 MPEG-7
Introduction
The MPEG standards are an evolving set of standards for video and audio compression.
MPEG-7 technology covers the most recent developments in multimedia search and retrieval,
and is designed to standardize the description of multimedia content, supporting a wide range
of applications including DVD, CD and HDTV.
MPEG-7 is a seven-part specification, formally entitled ‘Multimedia Content Description
Interface’. It provides standardized tools for describing multimedia content, which will enable
searching, filtering and browsing of multimedia content.
The description can take two forms. Textual: XML, which allows editing, searching and
filtering of a multimedia description; the description can be located anywhere, not necessarily
with the content. Binary: a format suitable for storing, transmitting and streaming delivery of
the multimedia description.
MPEG-7 data is obtained from transport or storage and handed to the delivery layer. This
allows extraction of elementary streams (consisting of individually accessible chunks called
access units) by undoing the transport/storage-specific framing and multiplexing, and it
retains the timing information needed for synchronisation.
Elementary streams are then forwarded to the compression layer, where the schema streams
(schemas describing the structure of MPEG-7 data) and the partial or full description streams
(streams describing the content) are decoded.
MPEG-7 tools
• Description Schemes (DS): specify the structure and semantics of the relationships
between their components, which can be descriptors (D) or other description
schemes (DS).
• System tools: These tools deal with binarization, synchronization, transport and storage
of descriptors.
2.1.8 H.261
H.261 is an ITU-T video coding standard, ratified in November 1988. It was originally
designed for transmission over ISDN lines, on which data rates are multiples of 64 kbit/s. It is
one member of the H.26x family of video coding standards in the domain of the ITU-T Video
Coding Experts Group (VCEG). The coding algorithm was designed to be able to operate at
video bit rates between 40 kbit/s and 2 Mbit/s. The standard supports two video frame sizes:
CIF (352x288 luma with 176x144 chroma) and QCIF (176x144 luma with 88x72 chroma)
using a 4:2:0 sampling scheme. It also has a backward-compatible trick for sending still-
picture graphics with 704x576 luma resolution and 352x288 chroma resolution (added in a
later revision in 1993).
The different steps followed by this standard are:
1. Loop filter
The prediction process may be modified by a two-dimensional spatial filter (FIL) which
operates on pixels within a predicted 8 by 8 block. The filter is separable into one-dimensional
horizontal and vertical functions. Both are non-recursive with coefficients of 1/4, 1/2, 1/4
except at block edges where one of the taps would fall outside the block. In such cases the 1-D
filter is changed to have coefficients of 0, 1, 0. Full arithmetic precision is retained with
rounding to 8 bit integer values at the 2-D filter output. Values whose fractional part is one half
are rounded up. The filter is switched on/off for all six blocks in a macroblock according to the
macroblock type.
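The 1-D filter above (coefficients 1/4, 1/2, 1/4, switching to 0, 1, 0 at block edges) can be sketched along a single row of an 8-pixel block. This sketch keeps full float precision, whereas the standard specifies rounding to 8-bit integers with halves rounded up.

```python
# Apply the H.261-style loop filter along one row: interior pixels get
# the (1/4, 1/2, 1/4) kernel, edge pixels pass through unchanged (0, 1, 0).

def loop_filter_row(row):
    out = []
    for i, p in enumerate(row):
        if i == 0 or i == len(row) - 1:
            out.append(float(p))                             # edge: 0, 1, 0
        else:
            out.append(row[i - 1] / 4 + p / 2 + row[i + 1] / 4)
    return out

filtered = loop_filter_row([0, 0, 0, 8, 8, 8, 8, 8])
# the sharp 0 -> 8 step is smoothed into a ramp, softening block edges
```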
2. Transformer
Transmitted blocks are first processed by a separable two-dimensional discrete cosine
transform of size 8 by 8. The output from the inverse transform ranges from –256 to +255 after
clipping to be represented with 9 bits. The transfer function of the inverse transform is given
by:
f(x, y) = (1/4) * sum over u, v = 0..7 of C(u) C(v) F(u, v) cos((2x + 1)u * pi / 16) cos((2y + 1)v * pi / 16)
where C(u) = 1/sqrt(2) for u = 0 and C(u) = 1 otherwise (and similarly for C(v)).
NOTE – Within the block being transformed, x = 0 and y = 0 refer to the pel nearest the left
and top edges of the picture, respectively.
The arithmetic procedures for computing the transforms are not defined, but the inverse
transform should meet the specified error tolerance.
3. Quantization
The number of quantizers is 1 for the INTRA dc coefficient and 31 for all other coefficients.
Within a macroblock the same quantizer is used for all coefficients except the INTRA dc one.
The decision levels are not defined. The INTRA dc coefficient is nominally the transform
value linearly quantized with a step size of 8 and no dead-zone. Each of the other 31 quantizers
is also nominally linear but with a central dead-zone around zero and with a step size of an
even value in the range 2 to 62.
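The two quantizer shapes described above can be sketched as follows. The dead-zone rule shown is a simplified illustration: the standard deliberately leaves the decision levels undefined, so a real encoder chooses them itself.

```python
# INTRA dc: linear quantizer, step size 8, no dead-zone.
# Other coefficients: linear quantizer with a central dead-zone around zero.

def quant_intra_dc(value):
    return round(value / 8)            # step 8, no dead-zone

def quant_ac(value, step):
    if abs(value) < step:              # central dead-zone: small values -> 0
        return 0
    return int(value / step)           # truncation widens the zero bin

dc = quant_intra_dc(17)                # -> 2
ac = quant_ac(5, 8)                    # inside the dead-zone -> 0
```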
2.2 Methodology Adopted
H.261
H.261 is an ITU-T video coding standard, ratified in November 1988. It was originally
designed for transmission over ISDN lines, on which data rates are multiples of 64 kbit/s, and
it is one member of the H.26x family of video coding standards in the domain of the ITU-T
Video Coding Experts Group (VCEG). The coding algorithm was designed to operate at video
bit rates between 40 kbit/s and 2 Mbit/s. The standard supports two video frame sizes: CIF
(352x288 luma with 176x144 chroma) and QCIF (176x144 luma with 88x72 chroma), using a
4:2:0 sampling scheme. It also has a backward-compatible trick for sending still-picture
graphics with 704x576 luma resolution and 352x288 chroma resolution (added in a later
revision in 1993).
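As a rough arithmetic check (the 8-bit sample depth and 29.97 Hz frame rate below are assumptions, not stated above), the raw bit rate of a CIF 4:2:0 sequence dwarfs even the 2 Mbit/s upper end of the target range:

```python
# Raw CIF 4:2:0 bit rate: one 352x288 luma plane plus two 176x144
# chroma planes per frame, 8 bits per sample (assumed), ~29.97 frames/s.
luma_samples = 352 * 288
chroma_samples = 2 * (176 * 144)
bits_per_frame = (luma_samples + chroma_samples) * 8
mbit_per_s = bits_per_frame * 29.97 / 1e6
print(f"raw CIF bit rate: {mbit_per_s:.1f} Mbit/s")  # about 36.5 Mbit/s
```

Hitting even 2 Mbit/s therefore requires a compression ratio of roughly 18:1, and far more at the 40 kbit/s end.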
The coding steps of H.261 (loop filter, transform, and quantization) are as described in
Section 2.1.8 above.
Chapter -3 Project Estimation and Implementation Plan
3.1 Cost and Benefit Analysis
3.1.1 ECONOMICAL
Economic analysis is the most frequently used method for evaluating a candidate system.
More commonly known as cost-benefit analysis, the procedure is to determine the benefits
and savings expected from the candidate system and compare them with its costs. If the
benefits outweigh the costs, the decision is made to proceed with design and implementation;
otherwise, further justification or alterations are made to the proposed system.
This project does not have many hardware requirements, so installing the software costs
little overall.
From the point of view of economy alone, manual handling of hardware components is
cheaper than a computerized system, and this approach normally works well in an ordinary
organization. The major problems start when the number of hardware components grows over
time. A manual system needs various registers and books to record the daily complaint and
hardware entries, and if a hardware component is misplaced, the concerned registers have to
be searched to verify the status of that component. Maintaining all of this manually is very
cumbersome, whereas it is very easy to maintain in the proposed system.
• Moreover, hardware such as a Pentium-class PC and software such as MATLAB are
easily available in the market.
3.2 Schedule Estimate
Below is a chart of each activity and its estimated duration, which together are used to plan
the project.
Gantt charts may be simple versions created on graph paper or more complex automated
versions created using project management applications such as Microsoft Project or Excel.
A Gantt chart is constructed with a horizontal axis representing the total time span of the
project, broken down into increments (for example, days, weeks, or months) and a vertical axis
representing the tasks that make up the project (for example, if the project is outfitting your
computer with new software, the major tasks involved might be: conduct research, choose
software, install software). Horizontal bars of varying lengths represent the sequences, timing,
and time span for each task. Using the same example, you would put "conduct research" at the
top of the vertical axis and draw a bar on the graph that represents the amount of time you
expect to spend on the research, and then enter the other tasks below the first one and
representative bars at the points in time when you expect to undertake them. The bar spans may
overlap, as, for example, you may conduct research and choose software during the same time
span. As the project progresses, secondary bars, arrowheads, or darkened bars may be added to
indicate completed tasks, or the portions of tasks that have been completed. A vertical line is
used to represent the report date.
[Gantt chart: horizontal bars for Analysis, Design, Coding, Testing, and Documentation
plotted against a time axis from 0 to 35 days]
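The chart above can be sketched in text form; the task durations below are illustrative assumptions, not figures taken from the schedule:

```python
tasks = [
    ("Analysis",       0,  5),
    ("Design",         5,  5),
    ("Coding",        10, 12),
    ("Testing",       22,  6),
    ("Documentation", 25, 10),   # overlaps Testing, as Gantt bars may
]

def gantt(tasks, width=35):
    """Render one bar per task: leading spaces for the start offset,
    '#' marks for the duration, on a fixed-width time axis."""
    lines = []
    for name, start, length in tasks:
        bar = " " * start + "#" * length
        lines.append(f"{name:<14}|{bar:<{width}}|")
    return "\n".join(lines)

print(gantt(tasks))
```

Project management tools such as Microsoft Project draw the same structure graphically, with secondary bars added as tasks complete.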
References
[1] HUFFMAN, D. A. (1952). A method for the construction of minimum-redundancy codes.
Proceedings of the IRE, 40(9), pp. 1098-1101.
[2] CAPON, J. (1959). A probabilistic model for run-length coding of pictures. IRE
Transactions on Information Theory, IT-5(4), pp. 157-163.
[3] APOSTOLOPOULOS, J. G. (2004). Video Compression. Streaming Media Systems Group.
[4] The Moving Picture Experts Group home page. (3 Feb. 2006)
[5] CLARKE, R. J. (1995). Digital compression of still images and video. London: Academic
Press.
[6] http://www.irf.uka.de/seminare/redundanz/vortrag15/. (3 Feb. 2006)
[7] PEREIRA, F. The MPEG-4 Standard: Evolution or Revolution?
[8] MANNING, C. The digital video site.
[9] SEFERIDIS, V. E., GHANBARI, M. (1993). General approach to block-matching motion
estimation. Optical Engineering, (32), pp. 1464-1474.
[10] GHARAVI, H., MILLS, M. (1990). Block-matching motion estimation algorithms: new
results. IEEE Transactions on Circuits and Systems, (37), pp. 649-651.
[11] CHOI, W. Y., PARK, R. H. (1989). Motion vector coding with conditional transmission.
Signal Processing, (18), pp. 259-267.
[12] Institut für Informatik – Universität Karlsruhe.