
General Definition

Multimedia is the field concerned with the computer-controlled integration of
text, graphics, drawings, still and moving images (video), animation, audio, and
any other media where every type of information can be represented, stored, transmitted
and processed digitally.

Why Multimedia?
 Information can often be better represented using audio/video/animation
rather than using text, images and graphics alone.
 Collaboration and virtual environments.
 Potential for improving our lives (e.g., learning, entertainment, and work).
 Convergence of computers, telecommunication, and TV.
 Dramatic increase in CPU processing power.
 High-speed fiber-optic and gigabit networks.

Multimedia Applications
 World Wide Web
 Multimedia Authoring, e.g. Adobe/Macromedia Director
 Hypermedia courseware
 Video-on-demand
 Interactive TV
 Computer Games
 Virtual reality
 Digital video editing and production systems
 Multimedia Database systems

What is Hypertext and Hypermedia?

Hypertext is a text which contains links to other texts.

Hypermedia definition
Hypermedia is not constrained to be text-based. It can include other media, e.g.,
graphics, images, and especially continuous media such as sound and video.
Example Hypermedia Applications?
 The World Wide Web is a clear example of a hypermedia application.
 PowerPoint
 Adobe Acrobat (or other PDF software)
 Adobe Flash etc.

Multimedia System Definition

A Multimedia System is a system capable of processing multimedia data and
applications.

A Multimedia System is characterized by the processing, storage, generation,
manipulation and rendition of multimedia information.
Characteristics of a Multimedia System
 Multimedia systems must be computer controlled
 Multimedia systems are integrated
 The information they handle must be represented digitally
 The interface to the final presentation of media is usually interactive

Challenges for Multimedia Systems

 Distributed Networks
 Temporal relationship between data

Key Issues for Multimedia Systems

 How to represent and store temporal information.
 How to strictly maintain the temporal relationships on playback or retrieval.
 What processes are involved in the above.
 Data has to be represented digitally: analog-to-digital conversion, sampling, etc.
 Large data requirements: bandwidth, storage.
Desirable Features for a Multimedia System
 Efficient and High I/O: Input and output to the file subsystem needs to be
efficient and fast. Needs to allow for real-time recording as well as playback
of data. e.g. Direct to Disk recording systems.
 Special Operating System: To allow access to file system and process data
efficiently and quickly. Needs to support direct transfers to disk, real-time
scheduling, fast interrupt processing, I/O streaming etc.
 Storage and Memory: Large storage units (of the order of hundreds of TB if
not more) and large memory (several GB or more). Large caches are also required,
along with high-speed buses for efficient management.
 Network Support: Client-server systems are common, as are distributed systems.
 Software Tools: User friendly tools needed to handle media, design and
develop applications, deliver media.
Components of a Multimedia System
 Capture devices: Video Camera, Video Recorder, Audio Microphone,
Keyboards, mice, graphics tablets, 3D input devices, tactile sensors, VR
devices. Digitizing Hardware
 Storage Devices: Hard disks, CD-ROMs, DVD-ROM, etc
 Communication Networks: Local Networks, Intranets, Internet, Multimedia
or other special high speed networks.
 Computer Systems: Multimedia Desktop machines, Workstations,
 Display Devices: CD-quality speakers, HDTV, SVGA, hi-res monitors,
color printers, etc.

Applications of Multimedia System

 World Wide Web
 Hypermedia courseware
 Video conferencing
 Video-on-demand
 Interactive TV
 Groupware
 Home shopping
 Games
 Virtual reality
 Digital video editing and production systems

Variation of Media
MHEG (Multimedia and Hypermedia Experts Group) differentiates the term
media to distinguish between:

 Perception Media: Perception media refers to the nature of information
perceived by humans.
 Representation Media: Representation media refers to how information is
represented internally to the computer.
 Presentation Media: Presentation media refers to the physical means used by
systems to reproduce information for humans.
 Storage Media: Storage media refers to various physical means for storing
computer data.
 Transmission Media: Transmission media refers to the physical means that
allow the transmission of telecommunication signals.
 Information Exchange Media: Information exchange media include all data
media used to transport information, e.g. storage and transmission media.

Presentation Dimensions
 Discrete Media: Refers to text, graphics, and images. These are time-independent
information items that can be displayed at any time or in any sequence and still
be meaningful.
 Continuous Media: Refers to sound or motion video. There is a time dependency
between information items: if the timing is changed or the sequence is
modified, the meaning is altered. A further consequence is that networks are
required to respect this time dependency.

Key Properties of a Multimedia System

 Handle discrete and continuous media: A multimedia application should
process at least one discrete and one continuous medium. Example: Microsoft
Word or another word processor.
 Independent Media: Media used in a multimedia system should be
independent. Example: combined text and graphics blocks.
 Computer-Controlled Systems: A system that is capable of processing media in a
computer-controlled way.
 Integration: Synchronization of time, space, and content between
computer-controlled independent media streams. Examples of highly integrated
multimedia systems: voice recognition, face detection.

Characterizing Data Streams

 Asynchronous: Sender and receiver do not need to coordinate before data is
transmitted. Example: email.
 Synchronous: Sender and receiver need to coordinate before data is
transmitted. Example: telephone systems.
 Isochronous: Combination of both synchronous and asynchronous
transmission. Example: Voice and digital video transmission.

Data stream characteristics for continuous Media

Static or Discrete Media:
Media that are time independent are called static media, such as normal
data, text, single images and graphics.
Continuous Media:
Media that are time dependent are called continuous media, such as
video, animation and audio.

Analog-to-Digital-to-Analog Pipeline
The pipeline begins with the conversion of the analog input to digital data and ends
with the conversion of the processing system's output back to an analog output.

Digital to Analog Converter (DAC)

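Digital to Analog Converter (DAC) is a device that transforms digital data into an analog
signal. According to the Nyquist-Shannon sampling theorem, the original signal can be
reconstructed from the sampled data provided its bandwidth is below half the sampling rate.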

Analog to Digital Converter (ADC)

Analog to Digital Converter (ADC) is a device that transforms an analog signal into
digital data.
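
The two conversions can be illustrated with a minimal sketch. It assumes a hypothetical 440 Hz test tone, an 8 kHz sampling rate and 8-bit codes; the function names adc and dac are illustrative only, and the dac here simply rescales codes without any reconstruction filter.

```python
import math

def adc(signal, sample_rate, n_samples, bits):
    """Sample a signal (function of time in seconds, range -1..1) and
    quantize each sample to a signed integer code of the given bit depth."""
    levels = 2 ** (bits - 1) - 1              # e.g. 127 for 8 bits
    return [round(signal(n / sample_rate) * levels) for n in range(n_samples)]

def dac(codes, bits):
    """Map integer codes back to amplitudes in -1..1."""
    levels = 2 ** (bits - 1) - 1
    return [c / levels for c in codes]

def tone(t):
    """Hypothetical analog input: a 440 Hz sine wave."""
    return math.sin(2 * math.pi * 440 * t)

SAMPLE_RATE = 8000                            # well above 2 * 440 Hz
codes = adc(tone, SAMPLE_RATE, 80, bits=8)    # 10 ms of signal
restored = dac(codes, bits=8)

# Quantization error is bounded by half a quantization step.
max_error = max(abs(tone(n / SAMPLE_RATE) - v) for n, v in enumerate(restored))
print(max_error)
```

Raising the bit depth shrinks the quantization step and therefore the error; raising the sampling rate widens the band of frequencies that can be represented.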
Multimedia Data
Digital Audio
 Digital Audio Synthesis
 MIDI—Synthesis and Compression Control
 Digital Audio Signal Processing/Audio Effects
Graphics/Image Formats
 Color Representation/Human Color Perception
Digital Video
 Chroma Subsampling

Text and Static Data

 Source: keyboard, optical character recognition, data stored on disk.
 Storage: 1 byte per character (text or format character), e.g. ASCII; more
bytes for Unicode.
 Formatted Text: Raw text or formatted text, e.g. HTML, Word, or programming
language source (Java, Python, MATLAB etc.)
 Compression: convenient to bundle files for archiving and transmission of
larger files. E.g. Zip, RAR, 7-zip.

Graphics

 Format: constructed by the composition of primitive objects such as lines,
polygons, circles, curves and arcs.
 Input: Graphics are usually generated by a graphics editor program (e.g.
Illustrator, Freehand) or automatically by a program (e.g. Postscript).
 Graphics input devices: keyboard (for text and cursor control), mouse,
trackball or graphics tablet.
 Graphics standards: OpenGL (Open Graphics Library), a standard
specification defining a cross-language, cross-platform API for writing
applications that produce 2D/3D graphics.
 Animation: can be generated via a sequence of slightly changed graphics.

Images

 Still pictures which (uncompressed) are represented as a bitmap (a grid of
pixels).
 Input: photographs or pictures scanned with a digital scanner, or taken from a
digital camera.
 Stored at 1 bit per pixel (Black and White), 8 Bits per pixel (Grey Scale, Color
Map) or 24 Bits per pixel (True Color)
 Size: a 512x512 Grey scale image takes up 1/4 MB, a 512x512 24 bit image
takes 3/4 MB with no compression.

Audio

 Audio signals are continuous analog signals.
 Input: captured by microphones, then digitised and stored.
 CD-quality audio requires 16-bit sampling at 44.1 kHz. Even higher
audiophile rates exist (e.g. 24-bit, 96 kHz).
 1 minute of mono CD-quality (uncompressed) audio = 5 MB.
 1 minute of stereo CD-quality (uncompressed) audio = 10 MB.
 Usually compressed (e.g. MP3, AAC, FLAC, Ogg Vorbis).
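
The audio sizes quoted above follow from simple arithmetic, sketched here. The helper name pcm_bytes is hypothetical, and 1 MB is taken as 2**20 bytes, which is an assumption about how the notes round.

```python
def pcm_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Uncompressed PCM size: samples/s * bytes/sample * channels * seconds."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * seconds

MB = 2 ** 20
mono = pcm_bytes(44_100, 16, 1, 60)      # 5_292_000 bytes
stereo = pcm_bytes(44_100, 16, 2, 60)    # 10_584_000 bytes
print(f"mono: {mono / MB:.2f} MB, stereo: {stereo / MB:.2f} MB")
```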

Video

 Input: Analog video is usually captured by a video camera and then digitised,
although digital video cameras now essentially perform both tasks.
 Raw video can be regarded as being a series of single images. There are
typically 25, 30 or 50 frames per second.
 A 512x512 monochrome video takes 25 * 0.25 = 6.25 MB per second to store
uncompressed.
 Typical PAL digital video (720 × 576 pixels per colour frame) takes
≈ 1.2 × 25 = 30 MB per second to store uncompressed.
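
The per-second video figures can be checked the same way. This sketch assumes 1 byte per pixel for monochrome frames, 3 bytes per pixel for colour frames, 25 frames per second, and 1 MB = 2**20 bytes; the function name is illustrative.

```python
def video_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Uncompressed video rate: one full frame per tick, fps ticks per second."""
    return width * height * bytes_per_pixel * fps

MB = 2 ** 20
mono_rate = video_bytes_per_second(512, 512, 1, 25)   # 8-bit greyscale frames
pal_rate = video_bytes_per_second(720, 576, 3, 25)    # 24-bit colour frames
print(f"mono: {mono_rate / MB:.2f} MB/s, PAL: {pal_rate / MB:.2f} MB/s")
```

The PAL result comes out slightly under 30 MB/s because a single frame is a little under 1.2 MB.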

What is Data Compression?

Data compression is the process of coding that effectively reduces the total number
of bits needed to represent certain information.

Advantages of Data Compression:

 The file is smaller and occupies less memory.
 Reading and writing the data is faster.
 File transfer is faster.
 Variable dynamic range (difference between largest and the smallest values).

Disadvantages of Data Compression:

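 Compression and decompression add complexity.
 There is a chance of transmission errors.
 The relationship between bytes or pixels is no longer apparent in the compressed data.
 All preceding data must be decompressed to reach a given part of the data.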

Lossy compression vs. lossless compression

Lossy compression
 Permanently eliminates bits of data that are redundant, unimportant or
imperceptible.
 Used for a broad range of multimedia data.
 Also known as irreversible compression.
 Reduces the quality.
 Data reduction is higher.
 The resultant file is smaller.
 Commonly used to compress multimedia data such as audio, video and image
files.

Lossless compression
 Enables the restoration of a file to its original state, without the loss of a
single bit of data, when the file is uncompressed.
 Not used for as broad a range of multimedia data.
 Also known as reversible compression.
 Does not reduce the quality.
 Data reduction is lower.
 The resultant file is not as small.
 Used for text, data files, audio and images.
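
The defining property of lossless compression, exact restoration, can be demonstrated with a short sketch using Python's standard zlib module (DEFLATE). The sample data is a made-up, highly redundant byte string, so the ratio shown is not representative of real files.

```python
import zlib

original = b"multimedia " * 200              # deliberately redundant sample data
compressed = zlib.compress(original, 9)      # 9 = strongest compression level
restored = zlib.decompress(compressed)

ratio = len(original) / len(compressed)
print(len(original), len(compressed), restored == original)
```

A lossy codec such as JPEG or MP3 would not pass the equality check, because it discards information during encoding.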

How does the human eye sense color?

The human eye and brain together translate light into color. Light receptors within
the eye transmit messages to the brain, which produces the familiar sensations of
color.

Characteristics of human visual system

 The human eye is much more sensitive to intensity than to color.
 The eye is basically similar to a camera.
 It has a lens to focus light onto the retina of the eye.
 The retina is full of neurons.
 Each neuron is either a rod or a cone.
 Rods are not sensitive to color.

Color Lookup Table

A color lookup table (CLUT) is a way to transform a range of input colors into another range of colors.

How it is used to represent color

Each pixel stores an index into the color lookup table rather than a full color
value. On display, the table converts each index into the physical color that is
visible on a computer monitor or display device.

Advantage of Color Lookup Table

 Saves memory.
 Colors can be changed quickly by editing table entries.
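
A minimal sketch of an indexed image illustrates both advantages. The 4-entry palette and the 2 x 4 image are invented for illustration; real color maps typically hold up to 256 entries.

```python
# Hypothetical palette: index -> (R, G, B)
palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (255, 255, 255)]

# Indexed image: each pixel stores a small palette index, not a full RGB triple.
indexed = [
    [0, 1, 1, 0],
    [2, 3, 3, 2],
]

def to_rgb(indexed_image, clut):
    """Resolve every pixel index through the color lookup table."""
    return [[clut[i] for i in row] for row in indexed_image]

rgb = to_rgb(indexed, palette)

# Changing one table entry recolors every pixel that uses index 1.
palette[1] = (0, 0, 255)
recolored = to_rgb(indexed, palette)
print(rgb[0], recolored[0])
```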

File type property JPG TIFF PNG GIF BMP

Lossy Compression Yes
Lossless Compression Yes Yes Yes
Uncompressed option Yes Yes Yes
Gray scale Yes Yes Yes Yes
RGB color Yes Yes Yes
8-bit color (24-bit data) Yes Yes Yes Yes
16-bit color option Yes Yes Yes
Indexed color option Yes Yes Yes
Transparency option Yes Yes
Animation option Yes

Chroma subsampling:
Chroma subsampling is the practice of encoding images by implementing less
resolution for chroma information than for luma information.
Why Chroma Subsampling?
 Chroma subsampling reduces the color resolution in JPEG
images in order to save bandwidth.
 The color component information (chroma) is reduced by sampling it at a
lower rate than the brightness (luma).
 Although color information is discarded, the loss is hard to see because human
eyes are much more sensitive to variations in brightness than in color.
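
As a rough sketch of the idea, the following halves the resolution of one chroma plane in both directions (a 4:2:0-style layout) by averaging each 2 x 2 block. Averaging is only one possible choice (some encoders simply drop samples), and the 4 x 4 Cb plane is made-up data.

```python
def subsample_420(chroma):
    """Average each 2x2 block of a chroma plane (even width and height assumed)."""
    h, w = len(chroma), len(chroma[0])
    return [
        [
            (chroma[y][x] + chroma[y][x + 1]
             + chroma[y + 1][x] + chroma[y + 1][x + 1]) // 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]

cb = [
    [100, 102, 200, 202],
    [104, 106, 204, 206],
    [50, 50, 60, 60],
    [50, 50, 60, 60],
]
cb_small = subsample_420(cb)   # one chroma value per 2x2 block of luma pixels
print(cb_small)
```

The luma plane is left at full resolution, so each chroma plane carries a quarter of the samples it had before.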

Color model conversion?

What is entropy of information
The entropy of an information source is the minimum average number of bits per symbol needed to encode its output without loss.
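
A short sketch of the Shannon entropy formula, H = -sum(p_i * log2(p_i)), estimated from symbol frequencies in a sample. The function name and the sample strings are illustrative.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Shannon entropy in bits per symbol, using observed symbol frequencies."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

h_uniform = entropy_bits("abcd")       # four equally likely symbols
h_skewed = entropy_bits("aaaaaaab")    # one symbol dominates
print(h_uniform, h_skewed)
```

The skewed source has much lower entropy, which is why data with a dominant symbol compresses well.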

How the Discrete Cosine Transform (DCT) operates

The Discrete Cosine Transform (DCT), a widely used transform coding technique,
is able to perform decorrelation of the input signal in a data-independent manner.
Because of this, it has gained tremendous popularity.

Why DCT is useful in compression?

 DCT helps separate the image into parts (or spectral sub-bands)
 It transforms image from the spatial domain to the frequency domain
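
The move from the spatial domain to the frequency domain can be sketched with a direct (slow) 2D DCT-II on one 8 x 8 block. The scaling below is the 8 x 8 orthonormal form commonly given for JPEG, and the flat test block is invented; production codecs use fast factorizations instead of four nested loops.

```python
import math

N = 8  # block size assumed throughout; the 0.25 factor below is specific to N = 8

def dct_2d(block):
    """Direct 2D DCT-II of an 8x8 block with orthonormal scaling."""
    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for i in range(N):
                for j in range(N):
                    s += (block[i][j]
                          * math.cos((2 * i + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * j + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out

flat = [[100] * N for _ in range(N)]   # a block with no spatial variation
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))          # all energy lands in the DC coefficient
```

For a smooth block almost all of the energy ends up in a few low-frequency coefficients, which is what makes the later quantization step effective.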

What is CMYK color model

The CMYK color model (process color, four color) is a subtractive color model,
used in color printing, and is also used to describe the printing process itself.
CMYK refers to the four inks used in some color printing: cyan, magenta, yellow,
and key (black).

Three-level hierarchical JPEG image compression

The Major Steps in JPEG image compression

Transform RGB to YCbCr and subsample color
 JPEG makes use of [Y,Cb,Cr] model instead of [R,G,B] model.
 To the human eye, a loss of precision in color matters less than a loss of
precision in contours (which are based on luminance).

Simple color space model: [R,G,B] per pixel

JPEG uses [Y, Cb, Cr] Model
Y = Brightness
Cb = Color blueness
Cr = Color redness
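
A sketch of the conversion for 8-bit values follows. The coefficients are the full-range ones commonly used with JPEG/JFIF files; other standards use slightly different constants, so treat them as an assumption here.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF-style constants)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b

    def clamp(x):
        # Round to the nearest integer and keep the result in 0..255.
        return max(0, min(255, round(x)))

    return clamp(y), clamp(cb), clamp(cr)

white = rgb_to_ycbcr(255, 255, 255)
black = rgb_to_ycbcr(0, 0, 0)
red = rgb_to_ycbcr(255, 0, 0)
print(white, black, red)
```

Greys map to Cb = Cr = 128, so all of their information sits in Y; this is what allows the chroma planes to be subsampled afterwards.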

DCT on Image Blocks

Each image is divided into 8 × 8 blocks. The 2D DCT (Eq. 8.17) is applied to each
block image f(i, j), with the output being the DCT coefficients F(u, v) for each block.

The quantization step in JPEG is aimed at reducing the total number of bits needed
for a compressed image [2]. It consists of simply dividing each entry in the
frequency space block by an integer, then rounding:

$$\hat{F}(u, v) = \mathrm{round}\left(\frac{F(u, v)}{Q(u, v)}\right)$$

Here, F(u, v) represents a DCT coefficient, Q(u, v) is a quantization matrix entry,
and F̂(u, v) is the resulting quantized DCT coefficient.
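
The divide-and-round step, and the matching multiply on the decoder side, can be sketched on a 2 x 2 corner of a block. The coefficient and step values are illustrative only. Note that Python's round uses round-half-to-even, which may differ from a codec's rounding on exact ties.

```python
def quantize(F, Q):
    """Divide each DCT coefficient by its quantization step and round."""
    return [[round(f / q) for f, q in zip(f_row, q_row)]
            for f_row, q_row in zip(F, Q)]

def dequantize(Fq, Q):
    """Decoder side: multiply back; the rounding loss is not recoverable."""
    return [[fq * q for fq, q in zip(fq_row, q_row)]
            for fq_row, q_row in zip(Fq, Q)]

F = [[800.0, -30.2],
     [12.7, 3.1]]      # illustrative DCT coefficients
Q = [[16, 11],
     [12, 12]]         # illustrative quantization steps

Fq = quantize(F, Q)
F_approx = dequantize(Fq, Q)
print(Fq, F_approx)
```

Small coefficients quantize to zero, which is where the loss and most of the compression gain come from.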

Zig-Zag Scan
• Maps 8 x 8 matrix to a 1 x 64 vector.
• Why zigzag scanning?
– To group low frequency coefficients at the top of the vector and high frequency
coefficients at the bottom.
• In order to exploit the presence of the large number of zeros in the quantized
matrix, a zigzag scan of the matrix is used.
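
The scan order can be generated rather than tabulated. This sketch walks the anti-diagonals of the block and alternates direction on each one; the sample block is filled with its own row-major positions so the visiting order is easy to read.

```python
def zigzag_indices(n=8):
    """Return (row, col) pairs of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):                       # s = row + col of a diagonal
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # Odd diagonals run top-right to bottom-left, even ones the reverse.
        order.extend(diag if s % 2 else reversed(diag))
    return order

def zigzag(block):
    """Flatten a square block into a 1-D list following the zigzag order."""
    return [block[i][j] for i, j in zigzag_indices(len(block))]

order = zigzag_indices()
block = [[r * 8 + c for c in range(8)] for r in range(8)]
scan = zigzag(block)
print(scan[:10])
```

After quantization the non-zero values cluster at the start of this vector, leaving a long run of zeros at the end for the entropy coder.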

Entropy Coding
The DC and AC coefficients finally undergo an entropy coding step.

The DCT transform coding method in JPEG relies on three major observations:

Observation 1. Useful image contents change relatively slowly across the image—
that is, it is unusual for intensity values to vary widely several times in a small
area—for example, in an 8 × 8 image block.

Observation 2. Psychophysical experiments suggest that humans are much less
likely to notice the loss of very high-spatial frequency components than lower
frequency components.
Observation 3. Visual acuity (accuracy in distinguishing closely spaced lines) is
much greater for gray (“black and white”) than for color.

What is Predictive coding

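Predictive coding is an approach that achieves good compression without significant
overhead. Each value is predicted from previously coded values, and only the
prediction error is encoded. The two types of predictive coding are lossless and lossy.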
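
A minimal sketch of the lossless variant, assuming the simplest possible predictor (the previous sample). The function names and sample values are illustrative; practical schemes use better predictors and entropy-code the residuals.

```python
def predictive_encode(samples):
    """Encode each sample as its difference from the previous sample."""
    prev = 0
    residuals = []
    for s in samples:
        residuals.append(s - prev)   # prediction error
        prev = s
    return residuals

def predictive_decode(residuals):
    """Rebuild the samples by accumulating the prediction errors."""
    prev = 0
    out = []
    for r in residuals:
        prev += r
        out.append(prev)
    return out

signal = [100, 102, 103, 103, 101, 98]
residuals = predictive_encode(signal)
decoded = predictive_decode(residuals)
print(residuals, decoded == signal)
```

Apart from the first value, the residuals are small numbers clustered around zero, which take fewer bits to encode than the raw samples. A lossy variant would quantize the residuals before storing them.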

The JPEG-LS Standard

Generally, we would likely apply a lossless compression scheme to images that are
critical in some sense, say medical images of a brain, or perhaps images that are
difficult or costly to acquire. A scheme in competition with the lossless mode
provided in JPEG2000 is the JPEG-LS standard, specifically aimed at lossless
encoding [11]. The main advantage of JPEG-LS over JPEG2000 is that JPEG-LS is
based on a low complexity algorithm. JPEG-LS is part of a larger ISO effort aimed
at better compression of medical images.