
Real-Time Video Image Processing in Software

Video DSP

■ Allows a wide range of video processing to be implemented in software
■ Highly parallel processing provides 4.3 GOPS of arithmetic processing
■ Architecture appropriate for converting between image display formats

The Sony DSP introduced here includes two SIMD*1 highly parallel, linear-array processor sets, each consisting of 1080 individual processor elements. It was designed for video applications that require enormous arithmetic processing capability, on the order of the 4.3 GOPS*2 this device provides. The DSP supports flexible processing of high-speed, real-time video data in TV and personal computer applications. In particular, it allows a wide range of signal-processing functions, as well as conversions between transmission methods (composite/component) and between display formats (e.g. 1080i, XGA), to be implemented in software.

ASIC and DSP
While a wide variety of digital video products are already on the market, the majority are implemented using ICs whose structures are specific to the application, in other words, ASICs. These are referred to as “hard-wired structures.” The most common implementation technique in the audio area, running the application as a program on a general-purpose DSP or similar chip, is almost never used in the video area.

*1 SIMD: Single instruction stream/multiple data stream
*2 GOPS: Giga operations per second

TV
• Format conversion
• I/P* conversion
• Zooming effects
• Noise reduction
• Image enhancement
• Color space conversion
* Interlace/Progressive

Computer Displays (LCD and other displays)
• Arbitrary pixel size conversion
• I/P conversion
• Gamma correction (simplified processing)
• Multiformat monitors (including SVGA, XGA, NTSC, HDTV, and others)

Digital Cameras
• Camera signal processing
• Gamma correction (simplified processing)
• NTSC encoding
• NTSC encoding

This DSP supports arbitrary conversion between a wide variety of video signal formats
and a wide range of video signal processing.

■ Figure 1 Video DSP Applications


Software Processing of Video Signals
Program-based implementation is rare in video because of the difference in the bandwidths of video and audio signals: video data rates are three orders of magnitude higher than those required for audio processing. For example, a normal NTSC TV signal has a bandwidth of about 4 MHz, and a sampling rate of 13.5 or 14.318 MHz is normally selected. To process this data, a program would have to execute several instructions on each data point in the extremely short interval of about 70 ns before the next data point arrives. For a normal DSP that has only a single data path and arithmetic unit, this would require an operating speed in the GHz range or higher, which is extremely difficult to achieve with current technology. (See table 1.)

Linear Array Structure
While there have been impressive advances in semiconductor technology, a parallel processing structure of some sort will remain, for the near future at least, indispensable for processing video signals in real time in software. Thus an arithmetic block with multiple data paths is required.
There are several parallel processor architectures, and each has its own advantages and disadvantages. The video DSP described here adopts a linear array structure (see figure 2), which is the most appropriate for TV video signal processing and for conversion between various video formats, including personal computer video outputs.
The linear array structure arranges a large number of parallel processor elements in a one-dimensional array. In video applications, processor elements are allocated to pixels on the horizontal scanning line in a one-to-one relationship.

■ Table 1 Bandwidths and Sampling Frequencies Used for Real-Time Signal Processing

                Bandwidth        Sampling frequency
Voice           3 kHz            8 kHz
Audio           20 kHz           44.1 kHz, 48.0 kHz, etc.
TV telephone    Up to 1 MHz      3.1 MHz
Standard TV     4 to 5 MHz       14.318 MHz, 13.50 MHz, etc.
HDTV            20 to 30 MHz     74.25 MHz, 48.00 MHz, etc.
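
The time budget implied by table 1 can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original article: the function names are hypothetical, and the figure of 100 instructions per sample is an illustrative assumption, as is the simplification that a single-data-path processor executes one instruction per clock cycle.

```python
# Sampling frequencies in Hz, taken from table 1 (one value per row).
SAMPLING_HZ = {
    "Voice": 8_000,
    "Audio": 48_000,
    "Standard TV": 13_500_000,
    "HDTV": 74_250_000,
}


def sample_interval_ns(sampling_hz):
    """Time available before the next sample arrives, in nanoseconds."""
    return 1e9 / sampling_hz


def required_clock_hz(sampling_hz, instructions_per_sample):
    """Clock rate a single-data-path processor would need.

    Assumes one instruction per clock cycle (an illustrative simplification).
    """
    return sampling_hz * instructions_per_sample


for name, hz in SAMPLING_HZ.items():
    print(f"{name}: {sample_interval_ns(hz):.1f} ns per sample")

# With an assumed 100 instructions per standard TV sample, the required
# clock is already above 1 GHz.
print(required_clock_hz(SAMPLING_HZ["Standard TV"], 100) / 1e9, "GHz")
```

At 13.5 MHz the interval is about 74 ns, and at 14.318 MHz it is about 70 ns, which matches the interval quoted in the text.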

(Diagram: a SIMD controller drives a linear array of processor elements, each assigned to one pixel of the horizontal scanning line. SIMD: single instruction stream, multiple data stream.)

■ Figure 2 Linear Array Structure
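
The one-element-per-pixel arrangement of figure 2 can be modeled in software as a list with one entry per processor element, with every element executing the same broadcast instruction. This is only an illustrative sketch: the function names are hypothetical, and the left-neighbor transfer is an assumed communication pattern that the article does not describe in detail.

```python
def broadcast(instruction, *operand_lines):
    """Apply one instruction to every processor element's operands at once.

    Each operand line holds one value per pixel on the scanning line.
    """
    return [instruction(*operands) for operands in zip(*operand_lines)]


def shift_from_left_neighbor(line, fill=0):
    """Each element receives the value held by its left-hand neighbor.

    Assumed neighbor-transfer pattern; the first element receives `fill`.
    """
    return [fill] + line[:-1]


def horizontal_average(line):
    """Two-tap horizontal filter: average each pixel with its left neighbor."""
    left = shift_from_left_neighbor(line)
    return broadcast(lambda pixel, neighbor: (pixel + neighbor) // 2, line, left)


scan_line = [10, 20, 30, 40]
print(horizontal_average(scan_line))
```

The point of the sketch is that the filter is written once and applies to the whole line, regardless of how many pixels the line contains.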


SIMD Program Control

This DSP adopts the SIMD technique as its basic program control strategy. This refers to the technique of using parallel processors with multiple data paths and arithmetic units and linking them together under the control of a single program control unit. Note that the term MIMD*3 refers to systems in which each of the parallel data path and arithmetic units has its own independent program control unit.
When a video DSP that has the structure shown in figure 2 processes, for example, a normal NTSC TV signal, which has a horizontal scan period of about 63.6 µs, the program processing period (or cycle) will be that of the input signal. This means that if the DSP has a clock frequency of 50 MHz, it will be able to perform over 3000 instruction execution cycles per horizontal scan period in real time.
Although SIMD control is a simple structure, and special techniques are required to perform separate operations in the individual unit processors, most of the processing in image processing applications, which require enormous amounts of iterated calculation, can be performed without problems under SIMD control. In particular, the SIMD control technique can be said to be optimal for consumer applications, since it allows the implementation of parallel processors with excellent price/performance characteristics.

*3 MIMD: Multiple instruction stream, multiple data stream

Architectural Overview

Figure 3 presents an overview of the architecture of this DSP. The Sony video DSP consists of two linear array architecture processor sets connected in series. Each processor set uses the SIMD control technique for program control, and an MIMD control technique is adopted overall for the two processor sets. Each processor set handles the data for every pixel in a horizontal scanning line at the same time, and the transfer of data between processors also functions in this manner.
This DSP provides memory units before and after the processor sets. There are two input buffer memory units (serial-to-parallel converters) that convert a single scanning line period of 32-bit wide time series video data to parallel, and a single 32-bit wide output buffer memory (parallel-to-serial converter) that performs the reverse operation. Each processor set consists of 1080 processors, a number adequate not just for standard TV signals, but also for video signals that conform to the VGA, SVGA, and XGA VESA standards. That is, the input buffer memory, the two processor sets, and the output buffer memory implement an extremely wide bus bandwidth.

Processor Elements
The Sony video DSP processor elements are 1-bit processors that consist of a 1-bit arithmetic unit and 256 bits of memory, as shown in figure 4. While the 1-bit arithmetic units include circuits such as registers and selectors, they consist essentially of a full-adder ALU and the logic circuits required for multiplication based on the second-order Booth algorithm, and as such are extremely simple devices. Three-port memories are used as the 256-bit memory units so that two-argument operations can be executed efficiently.

(Diagram labels: 1080 PE; Input1 and Input2, each with a 32-bit IR; two processor sets, each with a PC (1.5kW), 256-bit LM, and 1-bit ALU; 32-bit OR to Output. The processor element diagram shows the local memory and the 1-bit ALU.)

PE: Processor Element
IR: Input Register (256 bits)
LM: Local Memory
OR: Output Register
PC: Program Controller

■ Figure 3 Structural Overview
■ Figure 4 Processor Element
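
To make the idea of a 1-bit processor element concrete, the sketch below models a word-wide addition carried out one bit position per cycle, with every element using a full adder in lockstep. It is an illustration under stated assumptions rather than a description of the actual instruction set: the function names, the least-significant-bit-first storage order, and the per-element carry register are all hypothetical.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def to_bits(value, width):
    """Unsigned value as a list of bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def bit_serial_add(xs, ys, width):
    """Add two lines of unsigned words, one bit position per cycle.

    The outer loop stands for the broadcast bit instructions; the inner
    loop stands for all processor elements acting simultaneously.
    """
    x_bits = [to_bits(x, width) for x in xs]
    y_bits = [to_bits(y, width) for y in ys]
    carries = [0] * len(xs)                      # assumed carry register per element
    sums = [[0] * (width + 1) for _ in xs]
    for bit in range(width):                     # one bit instruction per cycle
        for pe in range(len(xs)):                # conceptually in parallel
            sums[pe][bit], carries[pe] = full_adder(
                x_bits[pe][bit], y_bits[pe][bit], carries[pe]
            )
    for pe in range(len(xs)):
        sums[pe][width] = carries[pe]            # final carry becomes the top bit
    return [from_bits(bits) for bits in sums]


print(bit_serial_add([1, 200, 255], [2, 100, 255], width=8))
```

In this model an 8-bit addition costs eight bit cycles no matter how many pixels are on the line, which is why the cycle budget per scanning line is counted in bit instructions.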


Program Processing of Video Data
Since the processor elements that correspond to each pixel in the horizontal scanning line are 1-bit processors, programs process one bit of data in each instruction cycle. The parallel data transfers between the input buffer memories, the two processor sets, and the output buffer memory are also performed 1 bit at a time. That is, the 3000 instructions mentioned earlier are counted in bit instruction units. Although programming at this low level may seem inconvenient, it actually turns out to be a highly efficient programming technique in which no arithmetic unit cycles are wasted. Furthermore, many operations can be written as word instructions and expanded to bit instructions by the macro assembler.

TV Video Signal Processing Applications
A wide range of video signal-processing functions for TV video signals can be implemented and freely combined by programs for this Sony DSP. These functions include filter processing such as bandwidth limiting, noise exclusion, and compensation calculations, and image synthesis operations such as color matrix processing and chroma key processing. High-resolution video signals that require more than 1080 pixels, and thus more than 1080 processor elements, can be handled easily by connecting multiple chips, due to the excellent expandability of this design.

Display Format Conversion Applications
To handle a wide variety of display formats, such as DTV and the VGA, SVGA, and XGA formats used by personal computers, applications must be able to freely convert between these formats. Resolution conversion (i.e. pixel count conversion) is especially indispensable for applications that use displays with fixed numbers of pixels, such as LCD panels. Special efforts have been taken in each of the circuit blocks in this video DSP to assure that it is optimal for resolution conversion.

Future Developments

Sony provides a macro assembler based integrated program development environment, and is now developing a macro program library that will provide a wide range of video processing functions. Although the current version of this chip achieves a performance of 4.3 GOPS, which is adequate to provide at least 120 multiply and accumulate operations, Sony will continue to aim for improved performance in future products.
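
The resolution conversion described under Display Format Conversion Applications amounts to resampling each scanning line to a new pixel count. The sketch below shows one common way to do this, linear interpolation between neighboring input pixels. It is an assumption chosen for illustration: the article does not state which interpolation method the DSP programs use, and the function name is hypothetical.

```python
def resample_line(line, out_width):
    """Resample one scanning line to out_width pixels by linear interpolation.

    Illustrative method only; the end pixels of the input map onto the
    end pixels of the output.
    """
    if not line or out_width < 1:
        raise ValueError("need a non-empty line and a positive output width")
    in_width = len(line)
    if in_width == 1 or out_width == 1:
        return [float(line[0])] * out_width
    scale = (in_width - 1) / (out_width - 1)
    output = []
    for i in range(out_width):
        position = i * scale                      # position in input coordinates
        left = int(position)
        right = min(left + 1, in_width - 1)
        fraction = position - left
        output.append((1 - fraction) * line[left] + fraction * line[right])
    return output


# Widen a three-pixel line to five pixels.
print(resample_line([0, 10, 20], 5))

# Convert a 640-pixel line to 1024 pixels, as a VGA-to-XGA width change would need.
print(len(resample_line(list(range(640)), 1024)))
```

Each output pixel depends only on two nearby input pixels, so the same calculation can in principle run for every output position at once, which fits the one-element-per-pixel arrangement described earlier.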

■ Table 2 Main Characteristics


Process                  0.4 µm CMOS, triple-metal
Number of Transistors    600 million
Package                  208-pin QFP
Power Supply             3.3 V
I/O Data                 75 MHz, 3.3 V
Input Data               64 bits
Output Data              32 bits

Product 1 (developed) Product 2 (developed)

(Announced at the 1996 ISSCC Conference.)

■ Figure 5 Chip Photographs
