
Liver Tumor Detection Using MatLab

A PROJECT REPORT ON

LIVER TUMOR
DETECTION USING
MATLAB

Dept. of ECE UCET Page 1


Liver Tumor Detection Using MatLab

CHAPTER 1

INTRODUCTION TO IMAGE PROCESSING

1.1 IMAGE
An image is a two-dimensional picture that has a similar appearance to some subject, usually a physical object or a person.

An image may be two-dimensional, such as a photograph or screen display, or three-dimensional, such as a statue. Images may be captured by optical devices such as cameras, mirrors, lenses, telescopes, and microscopes, or by natural objects and phenomena, such as the human eye or water surfaces.

The word image is also used in the broader sense of any two-dimensional figure such as a map, a graph, a pie chart, or an abstract painting. In this wider sense, images can also be rendered manually, such as by drawing, painting, or carving; rendered automatically by printing or computer graphics technology; or developed by a combination of methods, especially in a pseudo-photograph.

Figure 1.1: Color and grayscale images.

An image is a rectangular grid of pixels. It has a definite height and a definite width counted in pixels. Each pixel is square and has a fixed size on a given display; however, different computer monitors may use different-sized pixels. The pixels that constitute an image are ordered as a grid (columns and rows); each pixel consists of numbers representing magnitudes of brightness and color.


Figure 1.2: Pixel segmentation of an image.

Each pixel has a color. The color is a 32-bit integer. The first eight bits determine the redness of
the pixel, the next eight bits the greenness, the next eight bits the blueness, and the remaining eight bits
the transparency of the pixel.
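The 32-bit packing just described can be illustrated with a short sketch. The report's implementation is in MATLAB; the Python below is only a stand-in, and the byte order (red in the first, most significant, eight bits) is taken from the text:

```python
def unpack_pixel(value):
    """Split a 32-bit pixel into (red, green, blue, alpha) bytes.

    Assumes red occupies the most significant eight bits, followed by
    green, blue, and transparency, as described in the text.
    """
    red   = (value >> 24) & 0xFF
    green = (value >> 16) & 0xFF
    blue  = (value >> 8)  & 0xFF
    alpha = value & 0xFF
    return red, green, blue, alpha

# A fully red, fully opaque pixel:
print(unpack_pixel(0xFF0000FF))  # -> (255, 0, 0, 255)
```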

• We can think of an image as a function f: R² → R

  – f(x, y) gives the intensity at position (x, y)

  – Realistically, we expect the image only to be defined over a rectangle, with a finite range: f: [a, b] × [c, d] → [0, 1]

• A color image is just three functions pasted together. We can write this as a "vector-valued" function:

  f(x, y) = [ r(x, y),  g(x, y),  b(x, y) ]ᵀ


Figure 1.3: Image as a Function.

1.2 IMAGE FILE SIZES


Image file size is expressed as the number of bytes that increases with the number of pixels
composing an image, and the color depth of the pixels. The greater the number of rows and columns, the
greater the image resolution, and the larger the file. Also, each pixel of an image increases in size when its color depth increases: an 8-bit pixel (1 byte) can store 256 colors and a 24-bit pixel (3 bytes) can store 16 million colors; the latter is known as true color. Image compression uses algorithms to decrease the size of a file.
High resolution cameras produce large image files, ranging from hundreds of kilobytes to megabytes, per
the camera's resolution and the image-storage format capacity. High resolution digital cameras record 12-megapixel (1 MP = 1,000,000 pixels) images, or more, in true color. For example, consider an image recorded by a 12 MP camera: since each pixel uses 3 bytes to record true color, the uncompressed image would occupy 36,000,000 bytes of memory, a great amount of digital storage for one image, given that cameras must record and store many images to be practical. Faced with large file sizes, both within the camera and on a storage disc, image file formats were developed to store such large images.
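The arithmetic behind the 36,000,000-byte figure is simply pixel count times bytes per pixel. A small sketch (Python used here purely for illustration):

```python
def uncompressed_size_bytes(megapixels, bytes_per_pixel=3):
    """Uncompressed image size: pixel count times bytes per pixel.

    bytes_per_pixel defaults to 3, i.e. 24-bit true color.
    """
    return int(megapixels * 1_000_000 * bytes_per_pixel)

# The 12 MP true-color example from the text:
print(uncompressed_size_bytes(12))  # -> 36000000
```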

1.3 IMAGE FILE FORMATS

Image file formats are standardized means of organizing and storing images. This entry is about
digital image formats used to store photographic and other images. Image files are composed of either
pixel or vector (geometric) data that are rasterized to pixels when displayed (with few exceptions) in a
vector graphic display. Including proprietary types, there are hundreds of image file types. The PNG,
JPEG, and GIF formats are most often used to display images on the Internet.

In addition to straight image formats, Metafile formats are portable formats which can include
both raster and vector information. The metafile format is an intermediate format. Most Windows
applications open metafiles and then save them in their own native format.

1.3.1 RASTER FORMATS

These formats store images as bitmaps (also known as pixmaps).

● JPEG/JFIF

JPEG (Joint Photographic Experts Group) is a compression method. JPEG compressed images
are usually stored in the JFIF (JPEG File Interchange Format) file format. JPEG compression is lossy
compression. Nearly every digital camera can save images in the JPEG/JFIF format, which supports 8
bits per color (red, green, blue) for a 24-bit total, producing relatively small files. Photographic
images may be better stored in a lossless non-JPEG format if they will be re-edited, or if small
"artifacts" are unacceptable. The JPEG/JFIF format also is used as the image compression algorithm
in many Adobe PDF files.

● EXIF

The EXIF (Exchangeable image file format) format is a file standard similar to the JFIF format with
TIFF extensions. It is incorporated in the JPEG writing software used in most cameras. Its purpose is to
record and to standardize the exchange of images with image metadata between digital cameras and
editing and viewing software. The metadata are recorded for individual images and include such things as
camera settings, time and date, shutter speed, exposure, image size, compression, name of camera, color
information, etc. When images are viewed or edited by image editing software, all of this image information can be displayed.

● TIFF

The TIFF (Tagged Image File Format) format is a flexible format that normally saves 8 bits or 16 bits per color (red, green, blue) for 24-bit and 48-bit totals, respectively, usually using either the TIFF or TIF filename extension. TIFF supports both lossy and lossless compression; some variants offer relatively good lossless compression for bi-level (black & white) images. Some digital cameras can save in TIFF format, using the LZW compression algorithm for lossless storage. The TIFF image format is not widely supported by web browsers. TIFF remains widely accepted as a photograph file standard in the printing business. TIFF can handle device-specific color spaces, such as the CMYK defined by a particular set of printing press inks.

● PNG

The PNG (Portable Network Graphics) file format was created as the free, open-source successor to the GIF. The PNG file format supports true color (16 million colors), while the GIF supports only 256 colors. The PNG file excels when the image has large, uniformly colored areas. The lossless PNG format is best suited for editing pictures, and lossy formats, like JPG, are best for the final distribution of photographic images, because JPG files are smaller than PNG files. PNG is an extensible file format for the lossless, portable, well-compressed storage of raster images. PNG provides a patent-free replacement for GIF and can also replace many common uses of TIFF. Indexed-color, grayscale, and true color images are supported, plus an optional alpha channel. PNG is designed to work well in online viewing applications, such as the World Wide Web. PNG is robust, providing both full file integrity checking and simple detection of common transmission errors.


● GIF

GIF (Graphics Interchange Format) is limited to an 8-bit palette, or 256 colors. This makes the
GIF format suitable for storing graphics with relatively few colors such as simple diagrams, shapes, logos
and cartoon style images. The GIF format supports animation and is still widely used to provide image
animation effects. It also uses a lossless compression that is more effective when large areas have a single
color, and ineffective for detailed images or dithered images.

● BMP

The BMP file format (Windows bitmap) handles graphics files within the Microsoft Windows OS.
Typically, BMP files are uncompressed, hence they are large. The advantage is their simplicity and wide
acceptance in Windows programs.

1.4 IMAGE PROCESSING

Digital image processing, the manipulation of images by computer, is a relatively recent development in terms of man's ancient fascination with visual stimuli. In its short history, it has been applied to practically every type of image with varying degrees of success. The inherent subjective appeal of pictorial displays attracts perhaps a disproportionate amount of attention from scientists and also from the layman. Digital image processing, like other glamour fields, suffers from myths, misconceptions, misunderstandings, and misinformation. It is a vast umbrella under which fall diverse aspects of optics, electronics, mathematics, photography, graphics, and computer technology. It is a truly multidisciplinary endeavor plagued with imprecise jargon.

Several factors combine to indicate a lively future for digital image processing. A major factor is the declining cost of computer equipment. Several new technological trends promise to further promote digital image processing. These include parallel processing made practical by low-cost microprocessors, the use of charge-coupled devices (CCDs) for digitizing, storage during processing, and display, and large, low-cost image storage arrays.


1.5 FUNDAMENTAL STEPS IN DIGITAL IMAGE PROCESSING

Figure 1.4: Fundamental steps of image processing.

1.5.1 Image Acquisition

Image acquisition is the process of acquiring a digital image. Doing so requires an image sensor and the capability to digitize the signal produced by the sensor. The sensor could be a monochrome or color TV camera that produces an entire image of the problem domain every 1/30 sec. The image sensor could also be a line scan camera that produces a single image line at a time; in this case, the object's motion past the line scanner produces a two-dimensional image. If the output of the camera or other imaging sensor is not in digital form, an analog-to-digital converter digitizes it. The nature of the sensor and the image it produces are determined by the application.

1.5.2 Image Enhancement

Image enhancement is among the simplest and most appealing areas of digital image processing. Basically, the idea behind enhancement techniques is to bring out detail that is obscured, or simply to highlight certain features of interest in an image. A familiar example of enhancement is increasing the contrast of an image because "it looks better." It is important to keep in mind that enhancement is a very subjective area of image processing.


1.5.3 Image restoration

Image restoration is an area that also deals with improving the appearance of an image. However,
unlike enhancement, which is subjective, image restoration is objective, in the sense that restoration
techniques tend to be based on mathematical or probabilistic models of image degradation.

Figure 1.5 : The Basic Example for Image Enhancement.


Enhancement, on the other hand, is based on human subjective preferences regarding what constitutes a "good" enhancement result. For example, contrast stretching is considered an enhancement technique because it is based primarily on the pleasing aspects it might present to the viewer, whereas removal of image blur by applying a deblurring function is considered a restoration technique.
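Contrast stretching, mentioned above, can be sketched as a linear remapping of the image's intensity range onto the full [0, 255] range. This is an illustrative Python/NumPy sketch, not the project's MATLAB code:

```python
import numpy as np

def contrast_stretch(img):
    """Linearly remap the image's intensity range onto [0, 255]."""
    img = img.astype(float)
    lo, hi = img.min(), img.max()
    if hi == lo:                      # flat image: nothing to stretch
        return np.zeros(img.shape, dtype=np.uint8)
    return np.round((img - lo) / (hi - lo) * 255).astype(np.uint8)

narrow = np.array([[100, 110], [120, 130]])   # low-contrast 2x2 "image"
print(contrast_stretch(narrow).tolist())      # -> [[0, 85], [170, 255]]
```

The narrow band of values 100 to 130 is spread across the whole displayable range, which is exactly why the stretched image "looks better."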

1.5.4 Color image processing

The use of color in image processing is motivated by two principal factors. First, color is a
powerful descriptor that often simplifies object identification and extraction from a scene. Second,
humans can discern thousands of color shades and intensities, compared to about only two dozen shades
of gray. This second factor is particularly important in manual image analysis.


1.5.5 Wavelets and multi resolution processing

Wavelets are the foundation for representing images in various degrees of resolution. Although the Fourier transform has been the mainstay of transform-based image processing since the late 1950s, a more recent transformation, called the wavelet transform, is now making it even easier to compress, transmit, and analyze many images. Unlike the Fourier transform, whose basis functions are sinusoids, wavelet transforms are based on small waves, called wavelets, of varying frequency and limited duration.

Wavelets were first shown to be the foundation of a powerful new approach to signal processing and analysis called multiresolution theory. Multiresolution theory incorporates and unifies techniques from a variety of disciplines, including subband coding from signal processing, quadrature mirror filtering from digital speech recognition, and pyramidal image processing.


1.5.6 Compression

Compression, as the name implies, deals with techniques for reducing the storage required to save an image, or the bandwidth required to transmit it. Although storage technology has improved significantly over the past decade, the same cannot be said for transmission capacity. This is particularly true of uses of the Internet, which are characterized by significant pictorial content. Image compression is familiar to most users of computers in the form of image file extensions, such as the jpg file extension used in the JPEG (Joint Photographic Experts Group) image compression standard.

1.5.7 Morphological processing

Morphological processing deals with tools for extracting image components that are useful in the
representation and description of shape. The language of mathematical morphology is set theory. As such,
morphology offers a unified and powerful approach to numerous image processing problems. Sets in
mathematical morphology represent objects in an image. For example, the set of all black pixels in a
binary image is a complete morphological description of the image.

In binary images, the sets in question are members of the 2-D integer space Z², where each element of a set is a 2-D vector whose coordinates are the (x, y) coordinates of a black (or white) pixel in the image. Gray-scale digital images can be represented as sets whose components are in Z³. In this case, two components of each element of the set refer to the coordinates of a pixel, and the third corresponds to its discrete gray-level value.
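The set-theoretic view above can be made concrete: if a binary image and a structuring element are both represented as sets of (x, y) coordinates in Z², dilation is the set of all vector sums of an image pixel and a structuring-element offset. A minimal Python sketch (the function and variable names are illustrative, not from the report):

```python
def dilate(pixels, struct):
    """Dilation of a binary image, with both the image and the
    structuring element represented as sets of (x, y) coordinates:
    the result is every vector sum of a pixel and an offset."""
    return {(x + dx, y + dy) for (x, y) in pixels for (dx, dy) in struct}

# A single foreground pixel dilated by a 3x3 cross grows into the cross:
cross = {(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)}
print(sorted(dilate({(5, 5)}, cross)))
# -> [(4, 5), (5, 4), (5, 5), (5, 6), (6, 5)]
```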

1.5.8 Segmentation

Segmentation procedures partition an image into its constituent parts or objects. In general,
autonomous segmentation is one of the most difficult tasks in digital image processing. A rugged segmentation procedure brings the process a long way toward successful solution of imaging problems that require objects to be identified individually.

On the other hand, weak or erratic segmentation algorithms almost always guarantee eventual
failure. In general, the more accurate the segmentation, the more likely recognition is to succeed.

1.5.9 Representation and description

Representation and description almost always follow the output of a segmentation stage, which
usually is raw pixel data, constituting either the boundary of a region (i.e., the set of pixels separating one
image region from another) or all the points in the region itself. In either case, converting the data to a
form suitable for computer processing is necessary. The first decision that must be made is whether the
data should be represented as a boundary or as a complete region. Boundary representation is appropriate
when the focus is on external shape characteristics, such as corners and inflections.

Regional representation is appropriate when the focus is on internal properties, such as texture or
skeletal shape. In some applications, these representations complement each other. Choosing a
representation is only part of the solution for transforming raw data into a form suitable for subsequent
computer processing. A method must also be specified for describing the data so that features of interest
are highlighted. Description, also called feature selection, deals with extracting attributes that result in
some quantitative information of interest or are basic for differentiating one class of objects from another.

1.5.10 Object recognition

The last stage involves recognition and interpretation. Recognition is the process that assigns a
label to an object based on the information provided by its descriptors. Interpretation involves assigning
meaning to an ensemble of recognized objects.


1.5.11 Knowledge base

Knowledge about a problem domain is coded into an image processing system in the form of a knowledge database. This knowledge may be as simple as detailing regions of an image where the information of interest is known to be located, thus limiting the search that has to be conducted in seeking that information. The knowledge base can also be quite complex, such as an interrelated list of all major possible defects in a materials inspection problem, or an image database containing high-resolution satellite images of a region in connection with change-detection applications. In addition to guiding the operation of each processing module, the knowledge base also controls the interaction between modules. The system must be endowed with the knowledge to recognize the significance of the location of a string with respect to other components of an address field. This knowledge guides not only the operation of each module, but also aids in feedback operations between modules through the knowledge base. We implemented preprocessing techniques using MATLAB.

1.6 COMPONENTS OF AN IMAGE PROCESSING SYSTEM

As recently as the mid-1980s, numerous models of image processing systems being sold
throughout the world were rather substantial peripheral devices that attached to equally substantial host
computers. Late in the 1980s and early in the 1990s, the market shifted to image processing hardware in
the form of single boards designed to be compatible with industry standard buses and to fit into
engineering workstation cabinets and personal computers. In addition to lowering costs, this market shift
also served as a catalyst for a significant number of new companies whose specialty is the development of
software written specifically for image processing.

Figure 1.6: Components of an image processing system.


Although large-scale image processing systems still are being sold for massive imaging
applications, such as processing of satellite images, the trend continues toward miniaturizing and blending
of general-purpose small computers with specialized image processing hardware. Figure 1.6 shows the
basic components comprising a typical general-purpose system used for digital image processing. The
function of each component is discussed in the following paragraphs, starting with image sensing.

● Image sensors

With reference to sensing, two elements are required to acquire digital images. The first is a
physical device that is sensitive to the energy radiated by the object we wish to image. The second, called
a digitizer, is a device for converting the output of the physical sensing device into digital form. For
instance, in a digital video camera, the sensors produce an electrical output proportional to light intensity.
The digitizer converts these outputs to digital data.

● Specialized image processing hardware

Specialized image processing hardware usually consists of the digitizer just mentioned, plus
hardware that performs other primitive operations, such as an arithmetic logic unit (ALU), which
performs arithmetic and logical operations in parallel on entire images. One example of how an ALU is
used is in averaging images as quickly as they are digitized, for the purpose of noise reduction. This type
of hardware sometimes is called a front-end subsystem, and its most distinguishing characteristic is speed.
In other words, this unit performs functions that require fast data throughput (e.g., digitizing and averaging video images at 30 frames/s) that the typical main computer cannot handle.

● Computer

The computer in an image processing system is a general-purpose computer and can range from a
PC to a supercomputer. In dedicated applications, sometimes specially designed computers are used to
achieve a required level of performance, but our interest here is on general-purpose image processing
systems. In these systems, almost any well-equipped PC-type machine is suitable for offline image
processing tasks.

● Image processing software

Software for image processing consists of specialized modules that perform specific tasks. A
well-designed package also includes the capability for the user to write code that, as a minimum, utilizes the specialized modules. More sophisticated software packages allow the integration of those modules
and general-purpose software commands from at least one computer language.

● Mass storage

Mass storage capability is a must in image processing applications. An image of size 1024*1024
pixels, in which the intensity of each pixel is an 8-bit quantity, requires one megabyte of storage space if
the image is not compressed. When dealing with thousands, or even millions, of images, providing
adequate storage in an image processing system can be a challenge. Digital storage for image processing applications falls into three principal categories: (1) short-term storage for use during processing, (2) online storage for relatively fast recall, and (3) archival storage, characterized by infrequent access. Storage is measured in bytes (eight bits), Kbytes (one thousand bytes), Mbytes (one million bytes), Gbytes (meaning giga, or one billion, bytes), and Tbytes (meaning tera, or one trillion, bytes). One method of providing short-term storage is computer memory. Another is specialized boards, called frame buffers, that store one or more images and can be accessed rapidly, usually at video rates. The latter method allows virtually instantaneous image zoom, as well as scroll (vertical shifts) and pan (horizontal shifts). Frame buffers usually are housed in the specialized image processing hardware unit. Online storage
generally takes the form of magnetic disks or optical-media storage. The key factor characterizing on-line
storage is frequent access to the stored data. Finally, archival storage is characterized by massive storage
requirements but infrequent need for access. Magnetic tapes and optical disks housed in “jukeboxes” are
the usual media for archival applications.

● Image displays
Image displays in use today are mainly color (preferably flat screen) TV monitors. Monitors are
driven by the outputs of image and graphics display cards that are an integral part of the computer system.
Seldom are there requirements for image display applications that cannot be met by display cards
available commercially as part of the computer system. In some cases, it is necessary to have stereo
displays, and these are implemented in the form of headgear containing two small displays embedded in
goggles worn by the user.

● Hardcopy
Hardcopy devices for recording images include laser printers, film cameras, heat-sensitive devices,
inkjet units, and digital units, such as optical and CD-ROM disks. Film provides the highest possible
resolution, but paper is the obvious medium of choice for written material. For presentations, images are
displayed on film transparencies or in a digital medium if image projection equipment is used. The latter
approach is gaining acceptance as the standard for image presentations.

● Network

Networking is almost a default function in any computer system in use today. Because of the large
amount of data inherent in image processing applications, the key consideration in image transmission is
bandwidth. In dedicated networks, this typically is not a problem, but communications with remote sites
via the Internet are not always as efficient. Fortunately, this situation is improving quickly as a result of
optical fiber and other broadband technologies.


CHAPTER 2
IMAGE SEGMENTATION

2.1 What is Image Segmentation?

Segmentation is a process that divides an image into regions or objects that have similar features or characteristics. Some examples of image segmentation are:

1. In automated inspection of electronic assemblies, the presence or absence of specific objects can be determined by analyzing images.

2. Analyzing aerial photos to classify terrain into forests, water bodies, etc.

3. Analyzing MRI and X-ray images in medicine to classify body organs.

Some figures which show segmentation are

Figure 1.7: Image Segmentation.

Segmentation has no single standard procedure, and it is very difficult in non-trivial images. The extent to which segmentation is carried out depends on the problem specification. Segmentation algorithms are based on two properties of intensity values: discontinuity and similarity. The first category partitions an image based on abrupt changes in intensity; the second partitions the image into regions that are similar according to a set of predefined criteria.


In this report some of the methods for determining discontinuity will be discussed, and other segmentation methods will also be attempted. There are three basic techniques for detecting gray-level discontinuities in a digital image: points, lines, and edges. Another segmentation technique is thresholding. It is based on the fact that different types of regions can be classified by applying a range function to the intensity values of image pixels. The main assumption of this technique is that different objects will have distinct frequency distributions and can be discriminated on the basis of the mean and standard deviation of each distribution.
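The mean-and-standard-deviation assumption above suggests a simple statistical threshold: flag pixels that lie well above the image mean. A minimal sketch (the parameter k and the function name are illustrative, not from the report):

```python
import numpy as np

def threshold_segment(img, k=1.0):
    """Label pixels more than k standard deviations above the image
    mean as object (1), everything else as background (0)."""
    t = img.mean() + k * img.std()
    return (img > t).astype(np.uint8)

# One bright pixel on a dark background is separated from it:
img = np.array([[10, 10, 10],
                [10, 200, 10],
                [10, 10, 10]])
print(threshold_segment(img).tolist())
# -> [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
```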

Segmentation based on the third property is region processing. In this method an attempt is made to partition or group regions according to common image properties. These image properties consist of intensity values from the original image, textures that are unique to each type of region, and spectral profiles that provide multidimensional image data. A very brief introduction to morphological segmentation will also be given. This method combines most of the positive attributes of the other image segmentation methods.

2.2 Segmentation using discontinuities


There are several techniques for detecting the three basic gray-level discontinuities in a digital image: points, lines, and edges. The most common way to look for discontinuities is by spatial filtering methods. The idea of point detection is to isolate a point whose gray level differs significantly from its background. A 3 x 3 point-detection mask has weights

w1 = w2 = w3 = w4 = w6 = w7 = w8 = w9 = -1,  w5 = 8.

The response is R = w1·z1 + w2·z2 + … + w9·z9, where zi is the gray level of the i-th pixel under the mask. Based on the response calculated from this equation, we can find the desired points.
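The point-detection mask above can be sketched as a direct 3 x 3 convolution followed by a threshold on the response magnitude. Python/NumPy is used here as a stand-in for the report's MATLAB, and the test image is invented:

```python
import numpy as np

def point_detect(img, threshold):
    """Apply the 3x3 point-detection mask (center weight 8, all
    neighbors -1) and flag pixels whose response magnitude exceeds
    the threshold. Border pixels are left unflagged for simplicity."""
    mask = np.array([[-1, -1, -1],
                     [-1,  8, -1],
                     [-1, -1, -1]])
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.uint8)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            r = np.sum(mask * img[y - 1:y + 2, x - 1:x + 2])
            if abs(r) > threshold:
                out[y, x] = 1
    return out

# An isolated bright point on a uniform background:
img = np.full((5, 5), 10)
img[2, 2] = 100
print(point_detect(img, threshold=100).tolist())
# Only the isolated point at (2, 2) is flagged.
```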


Line detection is the next level of complexity beyond point detection; the lines could be vertical, horizontal, or at ±45 degree angles. Responses are calculated for each of the line-detection masks, and based on the values we can detect lines and their orientation.
Edge detection

An edge is regarded as the boundary between two objects (two dissimilar regions), or perhaps a boundary between light and shadow falling on a single surface. Differences in pixel values between regions can be computed by considering gradients. The edges of an image hold much of the information in that image. The edges tell where objects are, their shape and size, and something about their texture. An edge is where the intensity of an image moves from a low value to a high value or vice versa.

There are numerous applications for edge detection, which is often used for various
special effects. Digital artists use it to create dazzling image outlines. The output of an edge
detector can be added back to an original image to enhance the edges. Edge detection is often the
first step in image segmentation. Image segmentation, a field of image analysis, is used to group
pixels into regions to determine an image's composition. A common example of image
segmentation is the "magic wand" tool in photo editing software. This tool allows the user to
select a pixel in an image. The software then draws a border around the pixels of similar value.
The user may select a pixel in a sky region and the magic wand would draw a border around the
complete sky region in the image. The user may then edit the color of the sky without worrying
about altering the color of the mountains or whatever else may be in the image.


Edge detection is also used in image registration. Image registration aligns two images that may
have been acquired at separate times or from different sensors.

Figure 1.8: Different edge profiles.

There are an infinite number of edge orientations, widths, and shapes (Figure 1.8). Some edges are straight while others are curved with varying radii. There are many edge detection techniques to go with all these edges, each having its own strengths. Some edge detectors may work well in one application and perform poorly in others. Sometimes it takes experimentation to determine the best edge detection technique for an application.

The simplest and quickest edge detectors determine the maximum value from a series of pixel subtractions. The homogeneity operator subtracts each of the 8 surrounding pixels from the center pixel of a 3 x 3 window, as in Figure 1.9. The output of the operator is the maximum of the absolute value of each difference.

Figure 1.9: How the homogeneity operator works.

new pixel = maximum{½ 1111½ , ½ 1113½ , ½ 1115½ , ½ 1116½ ,½ 1111½ ,


½ 1116½ ,½ 1112½ ,½ 1111½ } = 5


Similar to the homogeneity operator is the difference edge detector. It operates more quickly because it requires four subtractions per pixel, as opposed to the eight needed by the homogeneity operator. The subtractions are upper left - lower right, middle left - middle right, lower left - upper right, and top middle - bottom middle (Figure 1.10).

Figure 1.10: How the difference operator works.

new pixel = maximum{½ 1111½ , ½ 1312½ , ½ 1516½ , ½ 1116½ } = 5

First order derivative for edge detection

If we are looking for horizontal edges, it would seem sensible to calculate the difference between one pixel value and the next pixel value, either up or down from the first (called the crack difference), i.e., assuming a top-left origin:

Hc = Y_difference(x, y) = value(x, y) - value(x, y + 1)

In effect this is equivalent to convolving the image with a 2 x 1 template. Likewise,

Hr = X_difference(x, y) = value(x, y) - value(x - 1, y), which uses the template [-1 1].


Hc and Hr are column and row detectors. Occasionally it is useful to plot both
X_difference and Y_difference, combining them to create the gradient magnitude (i.e. the
strength of the edge). Combining them by simply adding them could mean two edges canceling
each other out (one positive, one negative), so it is better to sum absolute values (ignoring the
sign) or sum the squares of them and then, possibly, take the square root of the result.

It is also possible to divide the Y_difference by the X_difference to identify a gradient
direction (the angle of the edge between the regions).

The amplitude can be determined by computing the magnitude of the vector (Hr, Hc):

magnitude = sqrt(Hr² + Hc²)

Sometimes, for computational simplicity, the magnitude is computed as

magnitude = |Hr| + |Hc|

The edge orientation can be found by

angle = arctan(Hc / Hr)
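Assuming a top-left origin, a small Python illustration of the crack differences, the gradient magnitude (exact and approximate) and the orientation might look like this (function and variable names are ours):

```python
import math

def gradient(image, x, y):
    """Crack differences at (x, y): Hr = X_difference, Hc = y_difference,
    plus the gradient magnitude and orientation derived from them."""
    hr = image[y][x] - image[y][x - 1]   # Hr = value(x, y) - value(x-1, y)
    hc = image[y][x] - image[y + 1][x]   # Hc = value(x, y) - value(x, y+1)
    mag = math.hypot(hr, hc)             # sqrt(Hr^2 + Hc^2)
    approx = abs(hr) + abs(hc)           # cheaper |Hr| + |Hc| approximation
    angle = math.atan2(hc, hr)           # edge orientation
    return hr, hc, mag, approx, angle

img = [[10, 10, 10],
       [10, 50, 10],
       [10, 10, 10]]
hr, hc, mag, approx, angle = gradient(img, x=1, y=1)
# hr = 40, hc = 40, approx = 80
```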

In a real image, the lines are rarely so well defined; more often the change between regions is
gradual and noisy. The following image represents a typical real edge. A larger template is needed
to average the gradient over a number of pixels, rather than looking at only two.


2.3 Sobel edge detection

The Sobel operator is more sensitive to diagonal edges than vertical and horizontal edges.
The Sobel 3 x 3 templates are normally given as:

X-direction:

-1 0 1
-2 0 2
-1 0 1

Y-direction:

 1  2  1
 0  0  0
-1 -2 -1

[Figure: the original image and the combined gradient image, absA + absB]


[Figure: the result after thresholding at 12]
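Putting the Sobel templates and the absA + absB combination together, a minimal Python sketch (not the report's MATLAB implementation; the convolution here keeps only the valid interior region) could be:

```python
def convolve3(image, mask):
    """Convolve a 2-D list with a 3x3 mask (valid interior region only)."""
    rows, cols = len(image), len(image[0])
    out = [[0] * (cols - 2) for _ in range(rows - 2)]
    for r in range(rows - 2):
        for c in range(cols - 2):
            out[r][c] = sum(mask[i][j] * image[r + i][c + j]
                            for i in range(3) for j in range(3))
    return out

# the standard Sobel templates (X- and Y-direction)
SX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SY = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]

def sobel_edges(image, threshold):
    """abs(A) + abs(B) gradient image, then thresholding as in the text."""
    a, b = convolve3(image, SX), convolve3(image, SY)
    return [[1 if abs(pa) + abs(pb) > threshold else 0
             for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]

# a vertical step edge between gray levels 0 and 9
img = [[0, 0, 0, 9, 9],
       [0, 0, 0, 9, 9],
       [0, 0, 0, 9, 9]]
edges = sobel_edges(img, threshold=12)  # -> [[0, 1, 1]]
```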

Other first order operators

The Roberts operator has a smaller effective area than the other masks, making it more
susceptible to noise.

The Prewitt operator is more sensitive to vertical and horizontal edges than to diagonal edges.

The Frei-Chen mask

In many applications, edge width is not a concern. In others, such as machine vision, it is
a great concern. The gradient operators discussed above produce a large response across an area
where an edge is present. This is especially true for slowly ramping edges. Ideally, an edge
detector should respond only at the center of an edge. This is referred to as localization. If
an edge detector creates an image map with edges several pixels wide, it is difficult to locate the
centers of the edges. It becomes necessary to employ a process called thinning to reduce the edge
width to one pixel. Second order derivative edge detectors provide better edge localization.


Example. In an image such as

The basic Sobel vertical edge operator (as described above) will yield a value right across the
image. For example if

is used then the result is

Implementing the same template on this "all eight image" would yield

This is not unlike applying the differentiation operator to a straight line, e.g. y = 3x − 2.

Once we have the gradient, if the gradient is then differentiated and the result is zero, it
shows that the original line was straight. Images often come with a gray level "trend" on them,
i.e. one side of a region is lighter than the other, but there is no "edge" to be discovered in the
region; the shading is even, indicating a light source that is stronger at one end, or a gradual
color change over the surface.


Another advantage of second order derivative operators is that the edge contours detected
are closed curves. This is very important in image segmentation. Also, there is no response to
areas of smooth linear variations in intensity.

The Laplacian is a good example of a second order derivative operator. It is distinguished
from the other operators because it is omnidirectional: it will highlight edges in all directions.
The Laplacian operator will produce sharper edges than most other techniques. These highlights
include both positive and negative intensity slopes.

The edge Laplacian of an image can be found by convolving with masks such as

 0 -1  0
-1  4 -1
 0 -1  0

or

-1 -1 -1
-1  8 -1
-1 -1 -1

The Laplacian set of operators is widely used. Since it effectively removes the general
gradient of lighting or coloring from an image it only discovers and enhances much more
discrete changes than, for example, the Sobel operator. It does not produce any information on
direction which is seen as a function of gradual change. It enhances noise, though larger
Laplacian operators and similar families of operators tend to ignore noise.
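As a small illustration of the claims above, here is a Python sketch applying one standard Laplacian mask at a single pixel; note the zero response on both a flat area and a linear ramp:

```python
# one of the standard Laplacian masks mentioned above
LAPLACIAN = [[0, -1, 0],
             [-1, 4, -1],
             [0, -1, 0]]

def laplacian_at(image, r, c):
    """Laplacian response at interior pixel (r, c) of a 2-D list."""
    return sum(LAPLACIAN[i][j] * image[r - 1 + i][c - 1 + j]
               for i in range(3) for j in range(3))

flat_img = [[10, 10, 10],
            [10, 10, 10],
            [10, 10, 10]]
flat = laplacian_at(flat_img, 1, 1)   # -> 0: no response on smooth areas

ramp = [[0, 10, 20],
        [0, 10, 20],
        [0, 10, 20]]
linear = laplacian_at(ramp, 1, 1)     # -> 0: no response to linear ramps
```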

2.4 Determining zero crossings

The method of determining zero crossings with some desired threshold is to pass a 3 x 3
window across the image, determining the maximum and minimum values within that window. If
the difference between the maximum and minimum values exceeds the predetermined threshold,
an edge is present. Notice the larger number of edges with the smaller threshold. Also notice that
all the edges are one pixel wide.
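The window test just described can be sketched as follows (Python for illustration; `lap` is assumed to be a Laplacian-filtered image):

```python
def zero_crossings(lap, threshold):
    """Mark an edge wherever the (max - min) range of a 3x3 window over
    the Laplacian image exceeds the threshold, as described above."""
    rows, cols = len(lap), len(lap[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            vals = [lap[r + i][c + j] for i in (-1, 0, 1) for j in (-1, 0, 1)]
            if max(vals) - min(vals) > threshold:
                out[r][c] = 1
    return out

# a sign change (zero crossing) running down the middle of the image
lap = [[0, 0, 0, 0],
       [0, 6, -6, 0],
       [0, 6, -6, 0],
       [0, 0, 0, 0]]
edges = zero_crossings(lap, threshold=10)
```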

A second order derivative edge detector that is less susceptible to noise is the Laplacian
of Gaussian (LoG). The LoG edge detector performs Gaussian smoothing before application of
the Laplacian. Both operations can be performed by convolving with a mask of the form

LoG(x, y) = ((2s² − x² − y²) / s⁴) · exp(−(x² + y²) / (2s²))


where x and y represent the row and column of the image, and s is a dispersion value that
controls the effective spread.

Due to its shape, the function is also called the Mexican hat filter. Figure 1.11 shows the cross
section of the LoG edge operator with different values of s. The wider the function, the wider the
edge that will be detected. A narrow function will detect sharp edges and more detail.

Figure 1.11: Cross section of LoG with various s.

The greater the value of s, the wider the convolution mask necessary. The first zero
crossing of the LoG function is at a radius of s√2 from the center. The width of the positive
center lobe is twice that. To have a convolution mask that contains the nonzero values of the
LoG function requires a width three times the width of the positive center lobe.

Edge detection based on the Gaussian smoothing function reduces the noise in an image.
That will reduce the number of false edges detected and also detects wider edges.

Most edge detector masks are seldom greater than 7 x 7. Due to the shape of the LoG
operator, it requires much larger mask sizes. The initial work in developing the LoG operator
was done with a mask size of 35 x 35.
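For illustration, the LoG function can be sampled into a mask like this. This Python sketch uses the unnormalized form ((2s² − x² − y²)/s⁴)·e^(−(x²+y²)/(2s²)) with a positive center lobe; the sign and normalization conventions vary between texts:

```python
import math

def log_kernel(size, s):
    """Sample the LoG function on a size x size grid centered at the
    origin; s controls the effective spread."""
    half = size // 2
    return [[((2 * s * s - x * x - y * y) / s ** 4)
             * math.exp(-(x * x + y * y) / (2 * s * s))
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]

k = log_kernel(7, s=1.0)
center = k[3][3]   # positive peak at the center
# the first sign change occurs at radius s * sqrt(2), e.g. at (x, y) = (1, 1)
```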

Because of the large computation requirements of the LoG operator, the Difference of
Gaussians (DoG) operator can be used as an approximation to the LoG. The DoG can be shown
as


The DoG operator is performed by convolving an image with a mask that is the result of
subtracting two Gaussian masks with different s values. The ratio s2/s1 = 1.6 results in a good
approximation of the LoG. Figure 1.12 compares a LoG function (s = 12.35) with a DoG
function (s1 = 10, s2 = 16).

Figure 1.12: LoG vs. DoG functions.

One advantage of the DoG is the ability to specify the width of edges to detect by varying the
values of s1 and s2. Here are a couple of sample masks. The 9 x 9 mask will detect wider edges
than the 7x7 mask.
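A sketch of building a DoG mask by subtracting a wide Gaussian from a narrow one (Python; the function and parameter names are ours for illustration):

```python
import math

def gaussian(x, y, s):
    """Normalized 2-D Gaussian with spread s."""
    return (1.0 / (2 * math.pi * s * s)) * math.exp(-(x * x + y * y) / (2 * s * s))

def dog_kernel(size, s1, s2):
    """Difference-of-Gaussians mask: narrow Gaussian (s1) minus wide
    Gaussian (s2), with s2/s1 = 1.6 approximating the LoG."""
    half = size // 2
    return [[gaussian(x, y, s1) - gaussian(x, y, s2)
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]

k = dog_kernel(7, s1=1.0, s2=1.6)
# positive center lobe, negative surround (same shape as the LoG)
```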

For 7x7 mask, try


For 9 x 9 mask, try

2.5 Segmentation using thresholding

Thresholding is based on the assumption that the histogram has two dominant modes,
for example light objects on a dark background. The objects can then be extracted by selecting
a threshold T that separates the two modes: a pixel (x, y) belongs to the object if f(x, y) > T
and to the background otherwise. Depending on the kind of problem to be solved we could also
have multilevel thresholding. Based on the region of thresholding we could have global
thresholding, where the threshold function is applied to the entire image, and local thresholding,
which involves only a certain region. In addition, if the threshold T depends on the spatial
coordinates, it is known as dynamic or adaptive thresholding.

Let us consider a simple example to explain thresholding.


Figure 1.13: Hypothetical frequency distribution of intensity values for fat, muscle and
bone.

A hypothetical frequency distribution f(I) of intensity values I(x,y) for fat, muscle and bone, in a
CT image. Low intensity values correspond to fat tissues, whereas high intensity values
correspond to bone. Intermediate intensity values correspond to muscle tissue. F+ and F- refer to
the false positives and false negatives; T+ and T- refer to the true positives and true negatives.

2.6 Basic global thresholding technique:

In this technique the entire image is scanned pixel by pixel, and each pixel is labeled as
object or background depending on whether its gray level is greater or less than the
threshold T. Success depends on how well the histogram is constructed. The technique is
very successful in controlled environments, and finds its applications primarily in the industrial
inspection area.


The algorithm for global thresholding can be summarized in a few steps:

1) Select an initial estimate for T.

2) Segment the image using T. This will produce two groups of pixels: G1, consisting of all
pixels with gray level values > T, and G2, consisting of pixels with values <= T.

3) Compute the average gray level values mean1 and mean2 for the pixels in regions G1 and G2.

4) Compute a new threshold value T = (1/2)(mean1 + mean2).

5) Repeat steps 2 through 4 until the difference in T in successive iterations is smaller than a
predefined parameter T0.
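The five steps can be sketched directly. This is an illustrative Python version operating on a flat list of gray levels (the report's implementation is in MATLAB and is not shown in this chapter):

```python
def global_threshold(pixels, t0=0.5):
    """Iterative global thresholding, following steps 1-5 above."""
    t = sum(pixels) / len(pixels)               # step 1: initial estimate (mean)
    while True:
        g1 = [p for p in pixels if p > t]       # step 2: split into two groups
        g2 = [p for p in pixels if p <= t]
        mean1 = sum(g1) / len(g1) if g1 else t  # step 3: group means
        mean2 = sum(g2) / len(g2) if g2 else t
        new_t = 0.5 * (mean1 + mean2)           # step 4: new threshold
        if abs(new_t - t) < t0:                 # step 5: convergence test
            return new_t
        t = new_t

# toy bimodal "image": dark background around 40, bright object around 200
pixels = [38, 40, 42, 45, 41, 39, 198, 202, 205, 200]
t = global_threshold(pixels)  # settles between the two modes
```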

Basic adaptive thresholding technique:

Images having uneven illumination are difficult to segment using a single histogram. In
this case we have to divide the image into many sub-images and then compute a different
threshold to segment each sub-image. The key issues are how to divide the image into sub-
images and how to utilize a different threshold to segment each sub-image.


The major drawback to threshold-based approaches is that they often lack the sensitivity and
specificity needed for accurate classification.

Optimal thresholding technique:


In the above two sections we described what global and adaptive thresholding mean.
Below we illustrate how to obtain the minimum segmentation error.

Let us consider an image with two principal gray level regions. Let z denote the gray level
values. Treating these values as random quantities, the histogram may be considered an estimate
of their probability density P(z). The overall density function is the sum, or mixture, of two
densities, one for the light region and one for the dark region: P(z) = P1 p1(z) + P2 p2(z),
where P1 and P2 are the probabilities of the two classes of pixels, with P1 + P2 = 1.

The overall probability of error is E(T) = P2 E1(T) + P1 E2(T), where E1 and E2 are the
probabilities of misclassifying a background pixel and an object pixel, respectively. We need
to find the threshold value T that minimizes


the error E(T). Differentiating E(T) with respect to T gives P1 p1(T) = P2 p2(T). Using
Gaussian probability density functions for p1 and p2 we can obtain the value of T:

T = (μ1 + μ2)/2 + (σ²/(μ1 − μ2)) · ln(P2/P1)

where μ1 and μ2 are the class means and σ² is the common variance of the Gaussian
functions for the two classes.
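For a quick numeric check of the formula above (Python; the parameter values are made up for illustration):

```python
import math

def optimal_threshold(mu1, mu2, sigma2, p1, p2):
    """Optimal threshold for two equal-variance Gaussian classes:
    T = (mu1 + mu2)/2 + (sigma^2 / (mu1 - mu2)) * ln(p2 / p1)."""
    return (mu1 + mu2) / 2 + (sigma2 / (mu1 - mu2)) * math.log(p2 / p1)

# with equal priors the log term vanishes and T lies midway between the means
t = optimal_threshold(mu1=60.0, mu2=160.0, sigma2=100.0, p1=0.5, p2=0.5)
# -> 110.0
```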

The other method for finding the minimum error is to minimize the mean square error
between the mixture density p(z) and the gray-level PDF estimated from the image
histogram, h(z):

Ems = (1/n) ∑ (p(zi) − h(zi))²,  for i = 1 to n

where n is the number of points in the histogram. This measures how well the assumed
mixture density fits the observed histogram.

Region based segmentation

We have seen two techniques so far, one dealing with gray level values and the other with
thresholds. In this section we will concentrate on regions of the image.

Formulation of the regions:

An entire image is divided into sub-regions, and they must be in accordance with rules
such as:
1. The union of the sub-regions is the entire region.
2. All pixels in a sub-region are connected in some predefined sense.


3. The sub-regions must be disjoint; no two are the same.
4. Some property must be satisfied by the pixels in a segmented region, e.g. P(Ri) = TRUE if all
pixels have the same gray level.
5. Two adjacent sub-regions should satisfy different predicates.

Segmentation by region splitting and merging:

The basic idea of splitting is, as the name implies, to break the image into many disjoint
regions which are coherent within themselves. Take into consideration the entire image, and then
group the pixels in a region if they satisfy some kind of similarity constraint. This is like a divide
and conquer method.

Figure 1.14: Image tree split –merge.


Segmentation by region growing


The region growing approach is the opposite of split and merge.
1. An initial set of small areas is iteratively merged based on similarity constraints.
2. Start by choosing an arbitrary seed pixel and compare it with its neighboring pixels.
3. The region is grown from the seed pixel by adding in neighboring pixels that are similar,
increasing the size of the region.
4. When the growth of one region stops, we simply choose another seed pixel which does not yet
belong to any region and start again.
5. This whole process is continued until all pixels belong to some region.
6. It is a bottom-up method.

Some of the undesirable effects of region growing are:

 Current region dominates the growth process -- ambiguities around edges of adjacent
regions may not be resolved correctly.
 Different choices of seeds may give different segmentation results.
 Problems can occur if the (arbitrarily chosen) seed point lies on an edge.
However starting with a particular seed pixel and letting this region grow completely before
trying other seeds biases the segmentation in favor of the regions which are segmented first.
To counter the above problems, simultaneous region growing techniques have been developed.


 Similarities of neighbouring regions are taken into account in the growing process.
 No single region is allowed to completely dominate the proceedings.
 A number of regions are allowed to grow at the same time.
o similar regions will gradually coalesce into expanding regions.
 Control of these methods may be quite complicated but efficient methods have been
developed.
 Easy and efficient to implement on parallel computers.
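A minimal single-seed region-growing sketch, in Python for illustration (4-connectivity and an absolute gray-level tolerance against the seed value are assumptions; simultaneous multi-seed growing is not shown):

```python
from collections import deque

def region_grow(image, seed, tol=10):
    """Grow a region from `seed` (row, col): add 4-connected neighbors
    whose gray level differs from the seed pixel by at most `tol`."""
    rows, cols = len(image), len(image[0])
    seed_val = image[seed[0]][seed[1]]
    region = {seed}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region \
                    and abs(image[nr][nc] - seed_val) <= tol:
                region.add((nr, nc))
                frontier.append((nr, nc))
    return region

# toy image: a bright 2x2 "object" in a dark background
img = [
    [10, 12, 11, 10],
    [11, 90, 92, 10],
    [10, 91, 93, 12],
    [11, 10, 12, 11],
]
obj = region_grow(img, seed=(1, 1))  # grows over the four bright pixels
```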

Segmentation by Morphological watersheds:

This method combines the positive aspects of many of the methods discussed earlier. The
basic idea is to embed the objects in “watersheds” so that the objects are segmented. Below,
only the basics of this method are illustrated, without going into greater detail.

The concept of watersheds:


It is the idea of visualizing an image in three dimensions: two spatial coordinates versus gray
level. In such a topographic interpretation, every point is either:
1. belonging to a regional minimum;
2. a point at which a drop of water would fall with certainty to a single minimum; or
3. a point at which water would be equally likely to fall to more than one minimum.

The set of points satisfying the second condition forms the catchment basin, or watershed, of
the corresponding regional minimum.


Watershed lines: Imagine that a hole is punched in each regional minimum and water is
poured in at a constant rate. The level of water rises and fills the regions uniformly. When the
water in neighboring regions is about to merge, we build dams; the dams are the boundaries.
The idea is most clearly illustrated with the help of diagrams, in which the heights of the
structures are proportional to the gray level intensity, and the entire structure is enclosed by a
dam higher than the greatest dam height. In the last figure we can see that the water almost
fills the dams, until the highest gray


level in the image is reached. The final dams correspond to the watershed lines, which are the
desired segmentation result. The principal application of the method is the extraction of nearly
uniform objects from the background. Such regions are characterized by small variations in gray
level, and hence small gradient values, so the method is usually applied to the gradient of the
image rather than to the image itself; the regional minima then correlate with the small gradient
values corresponding to the objects of interest.

Use of Motion in segmentation:

The motion of objects can be a very important tool to exploit when the background detail is
irrelevant. This technique is very common in sensing applications. Let us consider two image
frames taken at times t1 and t2, f(x,y,t1) and f(x,y,t2), and compare them pixel by pixel. One
method of comparison is to take the difference of the pixels:

D12(x, y) = 1 if |f(x, y, t1) − f(x, y, t2)| > T
          = 0 otherwise

where T is a threshold value.


The threshold signifies that only when there is an appreciable change in gray level are the
pixels considered different. In dynamic image processing, D12 is set to 1 where the pixels
differ, to signify that the objects are in motion.
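The frame-differencing rule above can be sketched as follows (illustrative Python; the frames here are made-up toy data):

```python
def frame_difference(f1, f2, t):
    """Binary difference image D12: 1 where the absolute gray-level change
    between the two frames exceeds threshold t, 0 otherwise."""
    return [[1 if abs(a - b) > t else 0 for a, b in zip(r1, r2)]
            for r1, r2 in zip(f1, f2)]

frame1 = [[10, 10, 10],
          [10, 10, 10]]
frame2 = [[10, 80, 10],    # a bright "object" has moved into the middle
          [10, 82, 10]]
d12 = frame_difference(frame1, frame2, t=20)
# d12 == [[0, 1, 0], [0, 1, 0]]
```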
Image segmentation using edge flow techniques:
A region-based method usually proceeds as follows: the image is partitioned into
connected regions by grouping neighboring pixels of similar intensity levels. Adjacent regions
are then merged under some criterion involving perhaps homogeneity or sharpness of region
boundaries.
Over-stringent criteria create fragmentation; lenient ones overlook blurred boundaries and
over-merge. Hybrid techniques using a mix of the methods above are also popular. A
connectivity-preserving, relaxation-based segmentation method, usually referred to as the active
contour model, was proposed recently. The main idea is to start with some initial boundary
shape represented in the form of spline curves, and to iteratively modify it by applying various
shrink/expansion operations according to some energy function. Although the energy-minimizing
model is not new, coupling it with the maintenance of an ``elastic'' contour model
gives it an interesting new twist. As usual with such methods, getting trapped in a local
minimum is a risk against which one must guard; this is no easy task. The authors create a
combined method that integrates the edge flow vector field into the curve evolution framework.

Theory and algorithm of Edge flow and curve evolution:

Active contours and curve evolution methods usually define an initial contour C0 and
deform it towards the object boundary. The problem is usually formulated using partial
differential equations (PDE). Curve evolution methods can utilize edge information, regional
properties or a combination of them. Edge-based active contours try to fit an initial closed
contour to an edge function generated from the original image. The edges in this edge function
are not connected, so they don't identify regions by themselves. An initial closed contour is
slowly modified until it fits on the nearby edges.
Let C(ϕ): [0,1] → R² be a parameterization of a 2-D closed curve. A fairly general curve
evolution can be written as:

∂C/∂t = α N + β κ N + (S · N) N    (1)

where κ is the curvature of the curve, N is the normal vector to the curve, α and β are
constants, and S is an underlying velocity field whose direction and strength depend on the time
and position but not on the curve front itself. This equation will evolve the curve in the normal
direction. The first term is a constant speed parameter that expands or shrinks the curve, second
term uses the curvature to make sure that the curve stays smooth at all times and the third term
guides the curve according to an independent velocity field. In their independent and parallel
works, Caselles et al.and Malladi et al. initialize a small curve inside one of the object regions
and let the curve evolve until it reaches the object boundary. The evolution of the curve is
controlled by the local gradient. This can be formulated by modifying (1) as:

∂C/∂t = g (F + ε κ) N    (2)


where F and ε are constants, and g = 1/(1 + |∇I|), with I the Gaussian-smoothed image. This is
a pure geometric approach, and the edge function, g, is the only connection to the image.
Edge flow image segmentation is a recently proposed method that is based on filtering
and vector diffusion techniques. Its effectiveness has been demonstrated on a large class of
images. It features multiscale capabilities and uses multiple image attributes such as intensity,
texture or color. As a first step, a vector field is defined on the pixels of the image grid. At each
pixel, the vector’s direction is oriented towards the closest image discontinuity at a predefined
scale. The magnitude of the vectors depends on the strength and the distance of the discontinuity.
After generating this vector field, a vector diffusion algorithm is applied to detect the edges. This
step is followed by edge linking and region merging to achieve a partitioning of the image.
Two key components shaping the curve evolution are the edge function g and the external
force field F. The purpose of the edge function is to stop or slow down the evolving contour
when it is close to an edge. So g is defined to be close to 0 on the edges and 1 on homogeneous
areas. The external force vectors F ideally attract the active contour towards the boundaries. At
each pixel, the force vectors point towards the closest object boundary on the image.
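A toy Python sketch of the edge function g = 1/(1 + |∇I|) using central differences (the border clamping here is an assumption; it is not specified in the text):

```python
def edge_function(image):
    """Edge-stopping function g = 1 / (1 + |grad I|) on a 2-D list,
    with the gradient estimated by central differences."""
    rows, cols = len(image), len(image[0])
    g = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # central differences, clamped at the image borders
            gx = (image[r][min(c + 1, cols - 1)] - image[r][max(c - 1, 0)]) / 2.0
            gy = (image[min(r + 1, rows - 1)][c] - image[max(r - 1, 0)][c]) / 2.0
            g[r][c] = 1.0 / (1.0 + (gx ** 2 + gy ** 2) ** 0.5)
    return g

img = [[0, 0, 100, 100],
       [0, 0, 100, 100]]
g = edge_function(img)
# g is 1 in flat areas and close to 0 across the 0 -> 100 step
```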


Summary
Image segmentation forms the basis of pattern recognition and scene analysis problems. The
segmentation techniques are numerous, and the choice of one technique over another depends
only on the application or the requirements of the problem being considered. In this report we
have illustrated a few techniques; the number of techniques is so large that they cannot all be
addressed.


CHAPTER 3

IMAGE CLUSTERING
3.1 Introduction

Computer vision tries to understand a scene with the help of image processing and machine
learning. Image processing manipulates an image in order to analyze it, to understand it better,
and to achieve the required results. The focus of digital image processing is to build a digital
system, using efficient algorithms and techniques, that is capable of processing an image. The
input of such a system is an image, and the output is an image or an attribute of the image. The
best-known example of this is the widely used image processing software Adobe Photoshop.
Digital Image Processing (DIP) deals with an image by transforming it into a digital image
using operations, algorithms, and techniques. DIP is used to enhance and improve images and
to extract relevant information from an image, which can then be used for further analysis.
Digital image processing algorithms are used to remove noise, prepare images for display,
convert the signal from an image sensor into a digital image, compress images for storage and
transmission, resize images, extract features, and so on.

Digital image processing has many methods for processing an image. Segmentation is one of
these methods: an image is partitioned into multiple parts. Image segmentation is used to identify
boundaries and objects in an image. The main objective of segmentation is to change the
representation of an image in a way that makes it simpler, more meaningful, and easier to
analyze. Segmentation has two main objectives: the first is to divide an image into parts for
further analysis, and the second is to change its representation. Image segmentation refers to
breaking an image into two or more regions that are similar in characteristics such as intensity,
texture, color, etc. Real-life applications of segmentation range from computer graphics, object
identification, criminal investigation, satellite imagery (roads, forests, etc.), airport security
systems, and MPEG-4 video object (VO) segmentation to medical imaging applications (tumor
localization, computer-guided surgery, etc.). The field of image segmentation attracts many
researchers. Thousands of techniques have been developed over the years,


but there is not a single technique that applies to all types of images. Segmentation techniques
are basically divided into two categories:
• Detecting discontinuities: an image is partitioned based on sudden changes in intensity. Edge
detection is an example of detecting discontinuities.
• Detecting similarities: an image is partitioned based on similarity according to a predefined
criterion. Techniques based on similarity include thresholding, region growing, and region
splitting and merging.

Segmentation methods divide an image into multiple parts to get useful information, using
many techniques to extract a region of interest: edge-based, region-based, thresholding,
matching, fuzzy-based, k-nearest neighbour and k-means techniques, among others. The
k-nearest neighbour and k-means techniques are discussed in this report. The k-nearest
neighbour technique is mainly based on nearest-neighbour classification; the k-means technique
decides on a number of clusters with which to segment an image.

What is Clustering?

Clustering can be considered the most important unsupervised learning problem; like every
other problem of this kind, it deals with finding structure in a collection of unlabeled data.
A loose definition of clustering could be “the process of organizing objects into groups whose
members are similar in some way”. A cluster is therefore a collection of objects which are
“similar” to each other and “dissimilar” to the objects belonging to other clusters. We can
show this with a simple graphical example:


In this case we easily identify the 4 clusters into which the data can be divided; the similarity
criterion is distance: two or more objects belong to the same cluster if they are “close” according
to a given distance (in this case geometrical distance). This is called distance-based clustering.
Another kind of clustering is conceptual clustering: two or more objects belong to the same
cluster if it defines a concept common to all those objects. In other words, objects are
grouped according to their fit to descriptive concepts, not according to simple similarity
measures.

The Goals of Clustering

So, the goal of clustering is to determine the intrinsic grouping in a set of unlabeled data.
But how do we decide what constitutes a good clustering? It can be shown that there is no
absolute “best” criterion that is independent of the final aim of the clustering. Consequently, it
is the user who must supply this criterion, in such a way that the result of the clustering will
suit their needs.
For instance, we could be interested in finding representatives for homogeneous groups (data
reduction), in finding “natural clusters” and describe their unknown properties (“natural” data
types), in finding useful and suitable groupings (“useful” data classes) or in finding unusual data
objects (outlier detection).


Possible Applications
Clustering algorithms can be applied in many fields, for instance:

 Marketing: finding groups of customers with similar behavior given a large database of
customer data containing their properties and past buying records;
 Biology: classification of plants and animals given their features;
 Libraries: book ordering;
 Insurance: identifying groups of motor insurance policy holders with a high average
claim cost; identifying frauds;
 City-planning: identifying groups of houses according to their house type, value and
geographical location;
 Earthquake studies: clustering observed earthquake epicenters to identify dangerous
zones;
 WWW: document classification; clustering weblog data to discover groups of similar
access patterns.

Requirements
The main requirements that a clustering algorithm should satisfy are:

 scalability;
 dealing with different types of attributes;
 discovering clusters with arbitrary shape;
 minimal requirements for domain knowledge to determine input parameters;
 ability to deal with noise and outliers;
 insensitivity to order of input records;
 high dimensionality;
 Interpretability and usability.

Problems
There are a number of problems with clustering. Among them:

 current clustering techniques do not address all the requirements adequately (and
concurrently);


 dealing with large number of dimensions and large number of data items can be
problematic because of time complexity;
 the effectiveness of the method depends on the definition of “distance” (for distance-
based clustering);
 if an obvious distance measure doesn’t exist we must “define” it, which is not always
easy, especially in multi-dimensional spaces;
 the result of the clustering algorithm (that in many cases can be arbitrary itself) can be
interpreted in different ways.

3.2 Clustering Algorithms

Classification
Clustering algorithms may be classified as listed below:

 Exclusive Clustering
 Overlapping Clustering
 Hierarchical Clustering
 Probabilistic Clustering

In the first case data are grouped in an exclusive way, so that if a certain datum belongs to a
definite cluster then it cannot be included in another cluster. A simple example of this is
shown in the figure below, where the separation of points is achieved by a straight line on a
two-dimensional plane.

On the contrary the second type, the overlapping clustering, uses fuzzy sets to cluster data, so
that each point may belong to two or more clusters with different degrees of membership. In this
case, data will be associated to an appropriate membership value.


Instead, a hierarchical clustering algorithm is based on the union between the two nearest
clusters. The beginning condition is realized by setting every datum as a cluster. After a few
iterations it reaches the final clusters wanted. Finally, the last kind of clustering uses a
completely probabilistic approach.

Here we discuss four of the most used clustering algorithms:

 K-means
 Fuzzy C-means
 Hierarchical clustering
 Mixture of Gaussians

Each of these algorithms belongs to one of the clustering types listed above: K-means is an
exclusive clustering algorithm, Fuzzy C-means is an overlapping clustering algorithm,
hierarchical clustering is obviously hierarchical, and lastly Mixture of Gaussians is a
probabilistic clustering algorithm. We discuss each clustering method in the following
paragraphs.

Distance Measure

An important component of a clustering algorithm is the distance measure between data
points. If the components of the data instance vectors are all in the same physical units then it is
possible that the simple Euclidean distance metric is sufficient to successfully group similar data
instances. However, even in this case the Euclidean distance can sometimes be misleading.
The figure below illustrates this with an example of the width and height measurements of an


object. Despite both measurements being taken in the same physical units, an informed decision
has to be made as to the relative scaling. As the figure shows, different scalings can lead to
different clusterings. Notice however that this is not only a graphic issue: the problem arises
from the mathematical formula used to combine the distances between the single components of
the data feature vectors into a unique distance measure that can be used for clustering purposes.
Different formulas lead to different clusterings. Again, domain knowledge must be used to guide
the formulation of a suitable distance measure for each particular application.

Minkowski Metric

For higher-dimensional data, a popular measure is the Minkowski metric,

d_p(x_i, x_j) = ( Σ_{k=1}^{d} |x_{i,k} − x_{j,k}|^p )^{1/p}

where d is the dimensionality of the data. The Euclidean distance is the special case p = 2, while
the Manhattan metric has p = 1. However, there are no general theoretical guidelines for
selecting a measure for any given application.

It is often the case that the components of the data feature vectors are not immediately
comparable. It can be that the components are not continuous variables, like length, but nominal
categories, such as the days of the week. In these cases again, domain knowledge must be used to
formulate an appropriate measure.
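The metrics above can be sketched in MATLAB (a minimal illustration; pdist is assumed to be available from the Statistics and Machine Learning Toolbox, and the two points are our own toy data):

```matlab
% Compare Euclidean, Manhattan and general Minkowski distances between two points.
X = [0 0; 3 4];                      % two 2-D points
dEuc  = pdist(X, 'euclidean');       % (3^2 + 4^2)^(1/2) = 5
dMan  = pdist(X, 'cityblock');       % |3| + |4| = 7
dMink = pdist(X, 'minkowski', 3);    % (3^3 + 4^3)^(1/3), the p = 3 case
% The same value computed directly from the Minkowski formula:
p = 3;
dDirect = sum(abs(X(1,:) - X(2,:)).^p)^(1/p);
```

Changing p changes which points count as "close", which is exactly the scaling issue discussed above.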

K-means Clustering Algorithm

What is k-Means Clustering?

K-means clustering, or Lloyd’s algorithm, is an iterative, data-partitioning algorithm that


assigns n observations to exactly one of k clusters defined by centroids, where k is chosen before
the algorithm starts.

K-means (MacQueen, 1967) is one of the simplest unsupervised learning algorithms that solve
the well-known clustering problem. The procedure follows a simple and easy way to classify a
given data set through a certain number of clusters (assume k clusters) fixed a priori. The main
idea is to define k centroids, one for each cluster. These centroids should be placed in a cunning
way, because different locations cause different results; the better choice is to place them as far
away from each other as possible. The next step is to take each point belonging to the given data
set and associate it with the nearest centroid. When no point is pending, the first step is
completed and an early grouping is done. At this point we need to re-calculate k new centroids
as barycenters of the clusters resulting from the previous step. After we have these k new
centroids, a new binding has to be done between the same data set points and the nearest new
centroid. A loop has been generated, and as a result of this loop we may notice that the k
centroids change their location step by step until no more changes are made; in other words, the
centroids do not move any more. Finally, this algorithm aims at minimizing an objective
function, in this case a squared error function. The objective function

J = Σ_{j=1}^{k} Σ_{i=1}^{n} || x_i^(j) − c_j ||^2, where || x_i^(j) − c_j ||^2 is a chosen distance
measure between a data point x_i^(j) and the cluster centre c_j, is an indicator of the distance of
the n data points from their respective cluster centres.

The algorithm is composed of the following steps:

1. Place K points into the space represented by the objects that are being
clustered. These points represent initial group centroids.
2. Assign each object to the group that has the closest centroid.
3. When all objects have been assigned, recalculate the positions of the K
centroids.
4. Repeat Steps 2 and 3 until the centroids no longer move. This produces a
separation of the objects into groups from which the metric to be minimized
can be calculated.
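The four steps can be sketched in a few lines of MATLAB (a minimal illustration, not the toolbox kmeans function; the toy data and variable names are our own, and pdist2 is assumed from the Statistics and Machine Learning Toolbox):

```matlab
% Minimal k-means sketch: X is an n-by-d data matrix, K the number of clusters.
rng(0);
X = [randn(50,2); randn(50,2) + 4];      % toy data: two well separated blobs
K = 2;
C = X(randperm(size(X,1), K), :);        % step 1: K random points as initial centroids
for iter = 1:100
    D = pdist2(X, C);                    % distance of every point to every centroid
    [~, idx] = min(D, [], 2);            % step 2: assign each point to nearest centroid
    Cnew = zeros(K, size(X,2));
    for j = 1:K
        Cnew(j,:) = mean(X(idx == j, :), 1);   % step 3: recompute centroids
    end
    if isequal(Cnew, C), break; end      % step 4: stop when centroids no longer move
    C = Cnew;
end
```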

Although it can be proved that the procedure will always terminate, the k-means algorithm does
not necessarily find the optimal configuration corresponding to the global minimum of the
objective function. The algorithm is also significantly sensitive to the initial randomly selected
cluster centres, and can be run multiple times to reduce this effect. K-means is a simple
algorithm that has been adapted to many problem domains. As we are going to see, it is a good
candidate for extension to work with fuzzy feature vectors.

An example

Suppose that we have n sample feature vectors x1, x2, ..., xn all from the same class, and we
know that they fall into k compact clusters, k < n. Let mi be the mean of the vectors in cluster i.
If the clusters are well separated, we can use a minimum-distance classifier to separate them.
That is, we can say that x is in cluster i if || x - mi || is the minimum of all the k distances. This
suggests the following procedure for finding the k means:

 Make initial guesses for the means m1, m2, ..., mk


 Until there are no changes in any mean
o Use the estimated means to classify the samples into clusters
o For i from 1 to k
 Replace mi with the mean of all of the samples for cluster i
o end_for
 end_until

Here is an example showing how the means m1 and m2 move into the centers of two clusters.

Remarks
This is a simple version of the k-means procedure. It can be viewed as a greedy algorithm for
partitioning the n samples into k clusters so as to minimize the sum of the squared distances to
the cluster centers. It does have some weaknesses:


 The way to initialize the means was not specified. One popular way to start is to
randomly choose k of the samples.
 The results produced depend on the initial values for the means, and it frequently happens
that suboptimal partitions are found. The standard solution is to try a number of different
starting points.
 It can happen that the set of samples closest to mi is empty, so that mi cannot be updated.
This is an annoyance that must be handled in an implementation, but that we shall ignore.
 The results depend on the metric used to measure || x - mi ||. A popular solution is to
normalize each variable by its standard deviation, though this is not always desirable.
 The results depend on the value of k.

This last problem is particularly troublesome, since we often have no way of knowing how many
clusters exist. In the example shown above, the same algorithm applied to the same data
produces the following 3-means clustering. Is it better or worse than the 2-means clustering?

Unfortunately there is no general theoretical solution for finding the optimal number of clusters
for any given data set. A simple approach is to compare the results of multiple runs with
different k values and choose the best one according to a given criterion (for instance the
Schwarz Criterion - see Moore's slides), but we need to be careful: increasing k results in smaller
error function values by definition, but also in an increasing risk of overfitting.
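In MATLAB's kmeans, sensitivity to initialization is commonly reduced with the 'Replicates' option, which reruns the algorithm from several random starts and keeps the solution with the lowest total sum of point-to-centroid distances (a sketch; the toy data are our own):

```matlab
rng(0);                                   % for reproducibility
X = [randn(100,2); randn(100,2) + 5];     % toy data with two clusters
[idx, C, sumd] = kmeans(X, 2, 'Replicates', 10);   % best of 10 random starts
totalWithin = sum(sumd);                  % objective value of the best replicate
```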

Simple MATLAB Program:

load fisheriris
X = meas(:,3:4);              % use petal length and petal width only

figure;
plot(X(:,1),X(:,2),'k*','MarkerSize',5);
title 'Fisher''s Iris Data';
xlabel 'Petal Lengths (cm)';
ylabel 'Petal Widths (cm)';

rng(1);                       % for reproducibility
[idx,C] = kmeans(X,3);        % cluster into k = 3 groups

x1 = min(X(:,1)):0.01:max(X(:,1));
x2 = min(X(:,2)):0.01:max(X(:,2));
[x1G,x2G] = meshgrid(x1,x2);
XGrid = [x1G(:),x2G(:)];      % defines a fine grid on the plot
% Assign each grid point to its nearest centroid (one iteration, fixed starts).
idx2Region = kmeans(XGrid,3,'MaxIter',1,'Start',C);

figure;
gscatter(XGrid(:,1),XGrid(:,2),idx2Region);   % colour the cluster regions
hold on;
plot(X(:,1),X(:,2),'k*','MarkerSize',5);      % overlay the data points
hold off;

Figure 1.15: Fisher’s Iris Data.


Figure 1.16: Output of fisher’s iris image using K-means clustering algorithm.


CHAPTER 4
IMAGE TRANSFORM TECHNIQUES
4.1 Introduction
The choice of a particular transform in a given application depends on the amount of
reconstruction error that can be tolerated and the computational resources available.
Compression is achieved during the quantization of the transformed coefficients, not during the
transformation step. Image modeling or transformation is aimed at exploiting the statistical
characteristics of the image (i.e. high correlation and redundancy).

4.2 Some transform techniques

 Fourier Transform (FFT, DFT, WFT)


 Discrete Cosine Transform (DCT)
 Walsh-Hadamard Transform (WHT)
 Wavelet Transform (CWT, DWT, FWT)

For the Fourier Transform and DCT, the basis images are fixed, i.e. they are input
independent and sinusoidal (cosines and sines) in nature. They provide a frequency view:
frequency information is available, but temporal information is lost in the transformation
process. The WHT is non-sinusoidal in nature and easy to implement (frequency domain).
Wavelet Transforms provide a time-frequency view, i.e. both frequency and temporal
(localization) information; wavelets give a time-scale viewpoint and exhibit multiresolution
characteristics. Fourier is good for periodic or stationary signals, but wavelets are good for
transients, i.e. for non-stationary data. The localization property allows wavelets to give an
efficient representation of transients.

Fourier Transform

Since the Fourier Transform is widely used in analyzing and interpreting signals and
images, we first survey it before going further to the Wavelet Transform. The tool which
converts a spatial (real space) description of an image into one in terms of its frequency
components is called the Fourier transform. Through the Fourier Transform, it is possible to
compose a signal by superposing a series of sine and cosine functions. These sine and cosine
functions are known as basis functions (Figure 2.2.1) and are mutually orthogonal. The
transform decomposes the signal into the basis functions, which means that it determines the
contribution of each basis function in the structure of the original signal. These individual
contributions are called the Fourier coefficients. Reconstruction of the original signal from its
Fourier coefficients is accomplished by multiplying each basis function with its corresponding
coefficient and adding them up together, i.e. a linear superposition of the basis functions.

Fourier Analysis and Orthogonality

Fourier analysis is one of the most widely used tools in spectral analysis. The basis for
this analysis is the Fourier Integral, which computes the amplitude spectral density F(ω) of a
time-domain signal f(t):

F(ω) = ∫ f(t) exp(−jωt) dt

F(ω) is actually complex, so one obtains the amplitude spectral density A(f) and phase
spectral density φ(f) as a function of frequency. Another way of looking at the Fourier transform
is that it answers the question: what continuous distribution of sine waves A(f) cos(ωt + φ(f)),
when added together on a continuous basis, best represents the original time signal? We call
these distributions the amplitude and phase spectral densities (or spectra). Complex exponentials
are popular basis functions because in many engineering and science problems the relevant
signals are sinusoidal in nature. It is noticed that when signals are not sinusoidal in nature, a
wide spectrum of the basis function is needed in order to represent the time signal accurately.
An important property of any family of basis functions ψ(t) is that it is orthogonal.

The basis functions in the Fourier Transform are ψ(t) = exp(+jωt), so the Fourier Transform
could be more generally written as

F(ω) = ∫ f(t) ψ*(t) dt

where * denotes complex conjugate. The test for orthogonality is done as follows:

∫ ψ_m(t) ψ_n*(t) dt = k when m = n, and 0 when m ≠ n.

For complex exponentials, because they are infinite in duration, one ends up with k = ∞ when
m = n, so it is necessary to define the orthogonality test in a different way:


For complex exponentials this becomes a time average, lim_{T→∞} (1/2T) ∫_{−T}^{T}
exp(jω_m t) exp(−jω_n t) dt, which equals k when m = n and 0 otherwise. When the constant
k = 1, the functions are said to be orthonormal.

Various types of signals can be analyzed with the Fourier Transform. If f(t) is periodic,
then the amplitude spectral density clusters at discrete frequencies that are harmonics (integer
multiples) of the fundamental frequency. One needs to invoke Dirac delta functions if the
Fourier Transform is used; otherwise Fourier series coefficients can be computed and the same
result obtained. If f(t) is deterministic and discrete, the discrete-time Fourier Transform (DTFT)
may be used to generate a periodic frequency response. If f(t) is assumed to be both periodic and
discrete, then the discrete Fourier Transform (DFT), or its fast numeric equivalent the FFT, may
be applied to compute the spectrum. If f(t) is random, then in general one will have a difficult
time computing the Fourier Integral of the random data. Hence one treats the input as data and
uses an FFT, but the result is a random spectrum. This single random spectrum can give an idea
of the frequency response, but in many instances it can be misleading. A better approach is to
average the random spectra; this leads to the formulation of the power spectral density, which is
an average over the squared FFT magnitude spectrum.

In certain signals, both random and deterministic, we are interested in the spectrum as a
function of time in the signal. This suggests finding the spectrum over a limited time bin, moving
the bin (sometimes with overlap, sometimes without), re-computing the spectrum, and so on.
This method is known as the short-time Fourier Transform (STFT), or the Gabor Transform.

The Discrete Fourier Transform (DFT) is an estimate of the Fourier Transform which uses a
finite number of sample points of the original signal. The computational cost of the DFT is of
order O(n^2), where n is the length of the signal.


The Fast Fourier Transform (FFT) is an efficient implementation of the Discrete Fourier
Transform, which can be applied to the signal if the samples are uniformly spaced. The FFT
reduces the computational complexity to the order of O(n log n) by taking advantage of
self-similarity properties of the DFT. If the input is a non-periodic signal, the superposition of
the periodic basis functions does not accurately represent the signal. One way to overcome this
problem is to extend the signal at both ends to make it periodic.
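As a small sketch in base MATLAB (the test signal is our own), the magnitude spectrum of a uniformly sampled signal is obtained with fft:

```matlab
Fs = 1000;                                  % sampling frequency in Hz
t  = 0:1/Fs:1-1/Fs;                         % one second of samples
x  = sin(2*pi*50*t) + 0.5*sin(2*pi*120*t);  % 50 Hz and 120 Hz components
N  = numel(x);
X  = fft(x);                                % O(n log n) transform
f  = (0:N-1)*Fs/N;                          % frequency axis
mag = abs(X)/N;                             % amplitude spectrum
plot(f(1:N/2), 2*mag(1:N/2));               % peaks appear near 50 Hz and 120 Hz
xlabel('Frequency (Hz)'); ylabel('Amplitude');
```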

Another solution is to use the Windowed Fourier Transform (WFT). In this method the
signal is multiplied with a window function (see the figure below) prior to applying the Fourier
transform. The window function localizes the signal in time by putting the emphasis in the
middle of the window and attenuating the signal to zero towards both ends.

Figure: A set of Fourier basis functions, a window function, and a windowed signal.

Discrete Cosine Transform (DCT)

The discrete cosine transform (DCT) helps separate the image into parts (or spectral sub-bands)
of differing importance (with respect to the image’s visual quality). The DCT is similar to the
discrete Fourier transform: it transforms a signal or image from the spatial domain to the
frequency domain.

With an input image, A, the coefficients for the output "image," B, are:

B(k1,k2) = Σ_{i=0}^{N1−1} Σ_{j=0}^{N2−1} 4 A(i,j) cos[ (π k1 / (2 N1)) (2i + 1) ]
cos[ (π k2 / (2 N2)) (2j + 1) ]

The input image is N2 pixels wide by N1 pixels high; A(i,j) is the intensity of the pixel in row i
and column j; B(k1,k2) is the DCT coefficient in row k1 and column k2 of the DCT matrix. All

DCT multiplications are real. This lowers the number of required multiplications compared to
the discrete Fourier transform. The DCT input is an 8 by 8 array of integers. This array contains
each pixel's gray-scale level; 8-bit pixels have levels from 0 to 255. The output array of DCT
coefficients contains integers, which can range from -1024 to 1023. For most images, much of
the signal energy lies at low frequencies; these appear in the upper-left corner of the DCT. The
lower-right values represent higher frequencies, and are often small - small enough to be
neglected with little visible distortion. It is computationally easier and more efficient to regard
the DCT as a set of basis functions which, given a known input array size (8 x 8), can be
pre-computed and stored. This involves simply computing values for a convolution mask (8 x 8
window) that gets applied by multiplying the window values by the pixels it overlaps, summing,
and moving the window across all rows/columns of the image. The values are simply calculated
from the DCT formula. There are 64 (8 x 8) DCT basis functions. Most software
implementations use fixed-point arithmetic; some fast implementations approximate the
coefficients so that all multiplies become shifts and adds.
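This energy compaction can be sketched with dct2/idct2 (assumed to be available from the Image Processing Toolbox; the test block is our own):

```matlab
% 2-D DCT of an 8x8 block: most energy concentrates in the upper-left corner.
A = double(repmat(0:31:217, 8, 1));   % a smooth 8x8 test block (horizontal ramp)
B = dct2(A);                          % forward 2-D DCT
mask = zeros(8); mask(1:4,1:4) = 1;   % keep only the low-frequency corner
Arec = idct2(B .* mask);              % approximate reconstruction
% For smooth blocks Arec stays visually close to A even with 3/4 of the
% coefficients discarded.
```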

DCT Vs Fourier:

 The DCT is similar to the Fast Fourier Transform (FFT), but can approximate lines
well with fewer coefficients.
 The DCT (Discrete Cosine Transform) is effectively a cut-down version of the FFT,
i.e. it uses only the real part of the FFT.
 The DCT is computationally simpler than the FFT and very effective for multimedia
compression.
 The DCT gives a much lower MSE value in comparison to the others.
 The DCT has the best information-packing ability.
 The DCT minimizes the block-like appearance (blocking artifacts) that results when
the boundaries between the sub-images become visible, whereas the DFT gives rise to
boundary discontinuities.



Wavelet Transform:

Wavelet means 'small wave', so wavelet analysis is about analyzing a signal with
short-duration, finite-energy functions. These functions transform the signal under investigation
into another representation which presents the signal in a more useful form. This transformation
of the signal is called the Wavelet Transform [1,7,10]; i.e. Wavelet Transforms are based on
small waves, called wavelets, of varying frequency and limited duration. Unlike the Fourier
transform, we have a variety of wavelets that can be used for signal analysis, and the choice of a
particular wavelet depends on the type of application at hand. Wavelet Transforms provide a
time-frequency view, i.e. both frequency and temporal (localization) information, and exhibit
multiresolution characteristics. Fourier is good for periodic or stationary signals, while wavelets
are good for transients; the localization property allows wavelets to give an efficient
representation of transients. In wavelet transforms a signal can be converted and manipulated
while keeping resolution across the entire signal and still being based in time, i.e. wavelets have
the special ability to examine signals simultaneously in both time and frequency. Wavelets are
mathematical functions that satisfy certain criteria, such as a zero mean, and are used for
analyzing and representing signals or other functions. A set of dilations and translations of a
chosen mother wavelet is used for the spatial/frequency analysis of an input signal. The Wavelet
Transform uses overlapping functions of variable size for analysis. The overlapping nature of
the transform alleviates blocking artifacts, as each input sample contributes to several samples of
the output. The variable size of the basis functions, in addition, leads to superior energy
compaction and good perceptual quality of the decompressed image. The Wavelet Transform is
based on the concept of sub-band coding.

The current applications of wavelets include statistical signal processing, image processing,
climate analysis, financial time series analysis, heart monitoring, seismic signal de-noising,
de-noising of astronomical images, audio and video compression, compression of medical image
stacks, fingerprint analysis, fast solution of partial differential equations, computer graphics and
so on.


Wavelets Vs Fourier and DCT:

 Fourier and DCT transforms convert a signal from time vs. amplitude to frequency vs.
amplitude, i.e. they provide only frequency information, and temporal information is
lost during the transformation process. Wavelet transforms provide both frequency
and temporal (localization) information.
 In the Fourier transform and DCT the basis functions are sinusoids (sines and
cosines) and cosines respectively, but in the Wavelet Transform the basis functions
are various wavelets.
 Since Wavelet Transforms are both computationally efficient and inherently local
(i.e. their basis functions are limited in duration), subdivision of the original image
before applying the transformation is not required, as it is in the DCT and others.
 The removal of the subdivision step in the Wavelet Transform eliminates the blocking
artifact from which the FFT suffers. This property also characterizes DCT-based
approximation at higher compression ratios.
 Wavelets provide an unconditional basis for a large signal class. Wavelet coefficients
drop off sharply, which is good for compression, de-noising, detection and
recognition.
 Fourier is good for periodic or stationary signals; wavelets are good for transients.
The localization property allows wavelets to give an efficient representation of
transients.
 Wavelets give a local description and separation of signal characteristics. Fourier puts
localization information in the phase in a complicated way, and the STFT cannot give
localization and orthogonality.
 Wavelets can be adjusted or adapted to the application.
 Computation of wavelet coefficients is well suited to the computer: no derivatives or
integrals are needed, as are required in Fourier and DCT, and the computation turns
out to be a digital filter bank.


CHAPTER 5
DISCRETE WAVELET TRANSFORM

5.1 INTRODUCTION
Wavelet is a “small wave”. It is a special kind of function which exhibits oscillatory
behaviour for a short period of time and then dies out. In signal processing using Fourier
transform, the signal is decomposed into a series of sines or cosines. It is impossible to know
simultaneously the exact frequency and the exact time of occurrence of that particular frequency
in a signal. In order to know the frequency, the signal must be spread in time, or vice versa. A
solution is to split the signal up into components that are not sine or cosine waves. A single
function and its dilations and translations may be used to generate a set of orthonormal basis
functions to represent a signal. This would help to condense the information in both the time and
frequency domains. This idea led to the introduction of wavelets. Figure 1.17 shows the example
of a wavelet.

Figure 1.17: Morlet Wavelet

Wavelets (Soman, Ramachandran, & Resmi, 2010) can be manipulated in two ways – by
translation and by scaling. In translation, the central position of the wavelet is changed along the
time axis. In scaling, its frequency is changed.

Figures 1.18 and 1.19 show the translated and scaled versions of a wavelet.


Figure 1.18: Translation of a Wavelet

Figure 1.19: Scaling of a Wavelet
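The two manipulations can be sketched in base MATLAB using a real-valued Morlet-style wavelet psi(t) = exp(-t^2/2)*cos(5t) (this exact formula is our illustrative choice):

```matlab
% Translation shifts a wavelet along the time axis; scaling stretches or
% compresses it (and changes its frequency content).
psi = @(t) exp(-t.^2/2) .* cos(5*t);   % a real Morlet-style mother wavelet
t = -8:0.01:8;
b = 2;                                 % translation amount
a = 0.5;                               % scale: a < 1 compresses, a > 1 stretches
plot(t, psi(t), t, psi(t - b), t, psi(t/a)/sqrt(a));
legend('mother', 'translated', 'scaled');
```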

References:
1. Rafael C. Gonzalez and Richard E. Woods, "Digital Image Processing," 2nd Ed., 2002.
2. B. Sumengen, B. S. Manjunath, C. Kenney, "Image Segmentation using Curve Evolution and
Flow Fields," Proceedings of the IEEE International Conference on Image Processing (ICIP),
Rochester, NY, USA, September 2002.
3. W. Ma, B. S. Manjunath, "Edge Flow: a technique for boundary detection and image
segmentation," Trans. Image Proc., pp. 1375-88, Aug. 2000.
4. Venugopal, "Image segmentation," written reports, 2003.
5. Jin Wang, "Image segmentation using curve evolution and flow fields," written reports, 2003.
6. http://vision.ece.ucsb.edu/segmentation


CHAPTER 6
LITERATURE SURVEY

Liver tumors, or hepatic tumors, are tumors or growths on or in the liver. Several distinct
types of tumors can develop in the liver because the liver is made up of various cell types. Liver
cancer tends to occur in livers damaged by birth defects, alcohol, or chronic infection with
diseases such as hepatitis B and C, or hemochromatosis (a hereditary disease associated with too
much iron in the liver) [1].

There were an estimated 39,230 new liver cancer cases (28,410 men and 10,820 women) in the
United States in 2015, and 27,170 deaths due to liver cancer (18,280 men and 8,890 women).
Survival statistics are reported in terms of survival rates: a person who has survived for five
years is counted in the five-year survival rate.

The one-year survival rate for people with liver cancer is 44%, and the five-year survival rate is
17%. Liver cancer is one of the most common cancers and causes a large number of deaths
every year; the chances of liver cancer in men and women have increased to 40% and 23%
respectively. Image segmentation is another useful tool in the field of image processing. Its
main purpose is to capture the salient characteristics of an image and make them more visible
than they were before. Segmentation of images is used to provide information such as the
structure of organs and to identify regions of interest, i.e. locating tumors, abnormalities etc.
The liver is the largest gland and the largest internal organ in the human body; it is a dark red,
wedge-shaped gland approximately eight and a half inches long.

Early detection and accurate analysis of liver cancer is an important issue in practical radiology.
Liver lesions are injuries, wounds, diseases or tumors of the liver tissue. A CT scan can identify
liver lesions by the difference in pixel intensity from that of the liver. Manual segmentation is a
very time-consuming and tedious process, whereas automatic segmentation is a very challenging
task due to factors such as the indefinite shape of the lesions, the low intensity contrast between
lesions and nearby tissues, and the irregularity of liver shape and size between patients. Various
automatic/semi-automatic techniques for liver tumor segmentation have been developed.

R. Rajagopal et al. [2] proposed a novel system for detecting and segmenting liver lesions. It
utilizes Otsu's thresholding and median filtering together with mathematical morphology.
Morphological filtering (only erosion is used) is applied to extract region shapes, i.e. edges, and
a Gabor transform filter is used for the edge detection process. It yields accurate results for
different types of liver tumors with ease and without manual interaction.

It can also be improved by neural networks and fuzzy algorithms. Gang Chen et al. [3] used
multiple-initialization, multiple-step level set methods (LSM). The multiple initialization curves
are first evolved separately using the fast marching method (FMM) and LSM, and are combined
with a convex hull (CH) algorithm to obtain a rough liver contour. Parallel propagation using
FMM and LSM based on these initial curves is implemented, the partial segmentation results are
combined using the CH algorithm, and the primary liver contour is smoothed using LSM.
Multiple-initialization LSM is much faster, can cover more liver regions, and overcomes the
leakage and over-segmentation problems.

An automated perfusion analysis method was proposed to automatically compute liver perfusion
curves; the under-segmentation problem still exists in lower sharp corner regions due to the low
gradient definition of the lower half of the liver region in abdominal MRIs. Chen Zhaoxue et al.
[4] used a simple line search method for plane-domain segmentation to extract a binary image,
composed of isolated white pixel clusters mainly from the liver part, based on the histogram
distribution and spatial characteristics of the liver. A Gaussian blurring technique is introduced
to connect the isolated pixel clusters, and the blurred image is thresholded after the
post-processing steps of mending holes and size filtering. Liver image registration between
slices increases the accuracy of the perfusion computation and measurement. S. Luo et al. [5]
proposed a three-step liver segmentation algorithm. Texture analysis is applied to abdominal CT
images to extract pixel-level features; two other main features, wavelet coefficients and Haralick
texture descriptors, are also used. An SVM is implemented to classify the data pixel-wise into
liver or non-liver, and a morphological operation is designed as a processing step to remove
noise and to delineate the liver. It has been shown that wavelet features give better classification
than Haralick texture descriptors when SVMs are used, and the combination of a morphological
operation with a pixel-wise SVM classifier can delineate the volumetric liver accurately.
Shraddha Sangewar et al. [6] based their segmentation on combining a modified K-means
segmentation method with a special localized contouring algorithm; to divide the image, five
separate regions are identified on the CT image frames. It provides fast and accurate liver
segmentation and 3D rendering, as well as delineation of tumor regions. O. Fekry Abd-Elaziz et
al. [7] used a combination of intensity analysis, region growing and pre-processing steps for
automatic segmentation of the liver, and a second region-growing process for tumor
segmentation; this method for automatic segmentation of liver tumors decreases the computation
time by removing the regions of other structures. In most techniques the liver was segmented
using a region-growing method
that started from an automatically selected seed point. Wenhan Wang et al. [8] used a
morphological feature of the liver region under various window-level settings and applied a
region-growing algorithm to remove other tissues such as the skeleton, kidney and stomach, so
that discrete points of the liver region can be acquired. Gradient-based edge correction and
three-dimensional restoration are adopted to optimize the recovered liver image; the method has
a lower time complexity, but over-segmentation is likely. Ina Singh et al. [9] discussed the
standard k-means clustering algorithm and analyzed its shortcomings: the k-means algorithm has
to calculate the distance between each data object and all cluster centers in each iteration, which
makes clustering inefficient. The paper proposes an improved k-means algorithm to address this,
requiring a simple data structure to store some information in every iteration, which is then used
in the next iteration. The improved method avoids repeatedly computing the distance of each
data object to the cluster centers, saving running time. Experimental results show that the
improved method can effectively improve the speed and accuracy of clustering, reducing the
computational complexity of k-means. Gambino, O. et al. [10] proposed an automatic
texture-based volumetric region-growing method for liver segmentation. 3D seeded region
growing is based on texture features, with automatic selection of the seed voxel inside the liver
and automatic computation of the threshold value for the region-growing stop condition.
Co-occurrence 3D texture features are extracted from CT abdominal volumes, and the seeded
region-growing algorithm is based on statistics in the feature space.

Chung-Ming Wu et al. [1] proposed a texture feature called the Multiresolution Fractal (MF)
feature to distinguish normal, hepatoma and cirrhotic livers using ultrasonic liver images, with
an accuracy of 90%. Yasser M. Kadah et al. [2] extracted first-order gray-level parameters such
as the mean and first percentile, and second-order gray-level parameters such as Contrast,
Angular Second Moment, Entropy and Correlation, and trained a Functional Link Neural
Network for automatic diagnosis of diffuse liver diseases such as fatty liver and cirrhosis using
ultrasonic images, showing that very good diagnostic rates can be obtained using unconventional
classifiers trained on actual patient data. Aleksandra Mojsilovic et al. [3] investigated the
application and advantages of nonseparable wavelet transform features for diffuse liver tissue
characterization using B-scan liver images and compared the approach with other texture
measures such as SGLDM (Spatial Gray Level Dependence Matrices), fractal texture measures
and Fourier measures. The classification accuracy was 87% for the SGLDM, 82% for Fourier
measures, 69% for fractal texture measures and 90% for the wavelet approach.

E-Liang Chen, et al. [4] used a Modified Probabilistic Neural Network (MPNN) on CT abdominal
images in conjunction with feature descriptors generated from fractal feature information and the
gray-level co-occurrence matrix, and classified liver tumors into hepatoma and haemangioma
with an accuracy of 83%. Pavlopoulos, et al. [5] proposed a CAD system based on texture
features estimated from Gray Level Difference Statistics (GLDS), SGLDM and Fractal Dimension
(FD), together with a novel fuzzy neural network classifier, to classify liver ultrasound images into
normal, fatty and cirrhosis, with an accuracy on the order of 82.7%. Jae-Sung Hong, et al. [6]
proposed a CAD system based on Fuzzy C-Means clustering for liver tumor extraction, with an
accuracy of 91%, using features like area, circularity and minimum distance from the liver boundary
to the tumor, and a Bayes classifier for classifying normal and abnormal slices. The CAD system
proposed by Gletsos Miltiades, et al. [7] consists of two basic modules: the feature extraction
module and the classifier module. In their work, regions of interest (liver tumors) were identified
manually from the CT liver images and then fed to the feature extraction module. The total
performance of the system was 97% for the validation set and 100% for the testing set. The Haralick
transform and a Hopfield Neural Network were used by John E. Koss, et al. [8] to segment 90% of
the liver pixels correctly from CT abdominal images. However, texture-based segmentation results
in coarse, block-wise contours, leading to poor boundary accuracy. Chien-Cheng Lee, et al. [9]
identified the liver region using fuzzy descriptors and fuzzy rules constructed from features like
location, distance, intensity, area, compactness and elongatedness from CT abdominal images.


CHAPTER 7
PROPOSED METHODOLOGY
7.1 Problem Statement
Worldwide, cancer is the fifth leading cause of death; therefore, the detection and treatment of
cancer are of great significance because of the widespread incidence of the disease, recurrence
after treatment and the high death rate. Among the different types of cancer, liver cancer ranks
third as a cause of death. This cancer is also known as hepatic cancer. This type of cancer is
one that starts in the liver and then spreads further if not diagnosed early. Cancer that
starts in some other organ and travels to the liver is not treated as liver cancer. Liver
cancer consists of malignant hepatic growths, called tumors, on or inside the liver.
Therefore, early detection of liver cancer is a challenging task in practical radiology. A
number of computer-aided diagnostic methods have been designed using image processing
techniques for accurate early detection. Early-stage detection of liver cancer helps to prevent it
completely through proper treatment.
Liver cancer is one of the fastest growing cancers in the world. The early
detection and diagnosis of liver tumor growth is vital for its prevention.
More than 30% of cancer deaths may be prevented by avoiding risk factors, early detection,
accurate diagnosis and effective treatment. Segmentation of the liver from medical images of the
abdominal region is vital for the diagnosis of tumors and for surgical procedures. Accurate detection
of the type of liver abnormality is essential for treatment planning, which may minimize
fatal outcomes.

The major issues with image-processing-based techniques are efficiency, processing time
and accuracy of detection. Designing a time-efficient, highly accurate and simple detection
method is the main research problem. The choice of cluster and threshold values is
justified by checking whether the threshold falls within the same range estimated for each image.
This work contributes a computer-aided diagnostic system for the diagnosis of liver
cancer using images obtained from CT scans of patients. The proposed system is used
to segment the tumor with remarkable satisfaction, and the results are evaluated with radiologists.
In this project, liver lesion detection and enhancement are performed on CT images.

7.2 Proposed System Algorithm

In total, our method consists of three stages:


Step 1. Preprocessing (including feature extraction and feature reduction);
Step 2. Training the kernel SVM;
Step 3. Submitting new liver images to the trained kernel SVM and outputting the prediction.

As shown in Fig. 1.20, this flowchart follows a canonical and standard
classification pipeline. We will explain the detailed procedures of the preprocessing in the
following subsections.

Figure 1.20: Block diagram of the proposed methodology (liver image → DWT feature
extraction → PCA feature reduction → kernel SVM → normal/abnormal output).

Feature Extraction

The most conventional tool of signal analysis is the Fourier transform (FT), which breaks
down a time-domain signal into constituent sinusoids of different frequencies, thus transforming
the signal from the time domain to the frequency domain. However, the FT has a serious drawback:
it discards the time information of the signal. For example, an analyst cannot tell when a particular
event took place from a Fourier spectrum. Thus, the quality of the classification decreases as
time information is lost. Gabor adapted the FT to analyze only a small section of the signal at a
time. The technique, called windowing or the short-time Fourier transform (STFT), applies a
window of a particular shape to the signal. The STFT can be regarded as a compromise between
time information and frequency information: it provides some information about both the time and
frequency domains. However, the precision of the information is limited by the size of the
window.

The wavelet transform (WT) represents the next logical step: a windowing technique with variable
size. Thus, it preserves both the time and frequency information of the signal. Another advantage
of the WT is that it adopts "scale" instead of the traditional "frequency"; that is, it does not produce
a time-frequency view but a time-scale view of the signal. The time-scale view is a different way
to view data, but it is a more natural and powerful one, because, compared to "frequency", "scale"
is commonly used in daily life. Meanwhile, "in large/small scale" is more easily understood than
"in high/low frequency".

Discrete Wavelet Transform

The discrete wavelet transform (DWT) is a powerful implementation of the WT using dyadic
scales and positions.

Figure 1.21: Image compression with the DWT.

In the case of 2D images, the DWT is applied to each dimension separately. Fig. 1.22 illustrates
the schematic diagram of the 2D DWT. As a result, there are four sub-band images (LL, LH, HL
and HH) at each scale, and the LL sub-band is used for the next level of 2D DWT.
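The single-level decomposition described above can be sketched with plain NumPy. This is only an illustrative Haar filter pair for an even-sized grayscale array, not the report's MATLAB wavelet-toolbox implementation, and the sub-band naming convention (first letter horizontal, second vertical) is an assumption:

```python
import numpy as np

def haar_dwt2(img):
    """One level of the 2D Haar DWT, returning the four sub-bands.

    Convention assumed here: the first letter is the filter applied
    horizontally (along columns), the second vertically (along rows)."""
    # Horizontal pass: average / difference of adjacent column pairs
    lo = (img[:, 0::2] + img[:, 1::2]) / np.sqrt(2)
    hi = (img[:, 0::2] - img[:, 1::2]) / np.sqrt(2)
    # Vertical pass: the same filter pair applied to adjacent row pairs
    LL = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)
    LH = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)
    HL = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)
    HH = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)
    return LL, LH, HL, HH

img = np.random.rand(8, 8)        # a toy even-sized "image"
LL, LH, HL, HH = haar_dwt2(img)
print(LL.shape)                   # (4, 4): each sub-band is half-size
```

Because the Haar transform is orthonormal, the total energy of the four sub-bands equals the energy of the input image, which is a convenient sanity check.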

Figure 1.22: Schematic diagram of the 2D DWT and a 3-level wavelet decomposition tree.

The LL sub-band can be regarded as the approximation component of the image, while the LH,
HL and HH sub-bands can be regarded as the detail components of the image. As the level of
decomposition increases, a more compact but coarser approximation component is obtained.
Thus, wavelets provide a simple hierarchical framework for interpreting the image
information. In our algorithm, a level-3 decomposition via the Haar wavelet was utilized to extract
features. Border distortion is a technical issue related to the digital filters commonly
used in the DWT: as we filter the image, the mask extends beyond the image at the edges, so
the solution is to pad the pixels outside the image. In our algorithm, the symmetric padding method
was utilized to calculate the boundary values.
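A quick arithmetic check of the sub-band sizes: assuming a 256 × 256 input (which is consistent with the 65536 → 1024 feature counts quoted in the results chapter), each decomposition level halves both dimensions, so the level-3 LL sub-band has 32 × 32 = 1024 coefficients:

```python
h = w = 256                 # assumed input size: 256 * 256 = 65536 pixels
assert h * w == 65536
for level in range(3):      # three decomposition levels
    h, w = h // 2, w // 2   # each level halves both dimensions
print(h, w, h * w)          # 32 32 1024 retained LL coefficients
```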

Feature Reduction

Excessive features increase computation time and storage memory. Furthermore, they
sometimes make classification more complicated, a problem known as the curse of dimensionality.
It is therefore necessary to reduce the number of features. PCA is an efficient tool for reducing the
dimension of a data set consisting of a large number of interrelated variables while retaining most
of the variation. This is achieved by transforming the data set to a new set of variables, ordered
according to their variance or importance. The technique has three effects: it orthogonalizes the
components of the input vectors so that they are uncorrelated with each other; it orders the resulting
orthogonal components so that those with the largest variation come first; and it eliminates the
components contributing least to the variation in the data set. It should be noted that the
input vectors must be normalized to have zero mean and unit variance before performing PCA.
This normalization is a standard procedure.
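The normalization and variance-ordering steps above can be sketched with NumPy's eigendecomposition. This is a minimal illustration under the stated assumptions (zero-mean, unit-variance inputs; components ranked by eigenvalue), not the report's MATLAB implementation:

```python
import numpy as np

def pca_reduce(X, k):
    """Project an N x p feature matrix onto its first k principal components.

    Features are first normalized to zero mean and unit variance, as the
    text requires before performing PCA."""
    Xn = (X - X.mean(axis=0)) / X.std(axis=0)
    cov = np.cov(Xn, rowvar=False)        # p x p covariance matrix
    vals, vecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(vals)[::-1]        # largest variance first
    return Xn @ vecs[:, order[:k]]

rng = np.random.default_rng(0)
X = rng.random((100, 50))                 # 100 samples, 50 features
Z = pca_reduce(X, 10)
print(Z.shape)                            # (100, 10)
```

Because the retained directions are eigenvectors of the covariance matrix, the projected features are mutually uncorrelated and their variances decrease from the first component to the last.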

Kernel SVM

The introduction of the support vector machine (SVM) is a landmark in the field of machine
learning. The advantages of SVMs include high accuracy, elegant mathematical tractability and
direct geometric interpretation. Recently, multiple improved SVMs have developed rapidly, among
which the kernel SVMs are the most popular and effective. Kernel SVMs have the following
advantages:

(1) they work very well in practice and have been remarkably successful in such diverse fields as
natural language categorization, bioinformatics and computer vision;

(2) they have few tunable parameters;

(3) their training often involves convex quadratic optimization. Hence, solutions are global and
usually unique, thus avoiding the convergence to local minima exhibited by other statistical
learning systems, such as neural networks.

Motivation

Suppose some prescribed data points each belong to one of two classes, and the goal is to
classify which class a new data point will fall into. Here a data point is viewed as a p-
dimensional vector, and our task is to create a (p − 1)-dimensional hyperplane. There are many
possible hyperplanes that might classify the data successfully. One reasonable choice for the best
hyperplane is the one that represents the largest separation, or margin, between the two classes,
since we can expect better behavior in response to data unseen during training, i.e., better
generalization performance.

Therefore, we choose the hyperplane so that the distance from it to the nearest data point on each
side is maximized. Fig. 1.23 shows the geometric interpretation of linear SVMs. Here H1, H2 and
H3 are three hyperplanes which can classify the two classes successfully; however, H2 and H3
do not have the largest margin, so they will not perform well on new test data. H1 has the
maximum margin to the support vectors (S11, S12, S13, S21, S22 and S23), so it is chosen as
the best classification hyperplane.

Principles of Linear SVMs

Given a p-dimensional, N-size training dataset of the form


{(xn, yn) | xn ∈ Rp, yn ∈ {−1, +1}}, n = 1, . . . , N

Figure 1.23: The geometric interpretation of linear SVMs (H denotes a hyperplane, S
denotes a support vector).

where yn is either −1 or 1, corresponding to class 1 or class 2, and each xn is a p-dimensional
vector.

The maximum-margin hyperplane which divides class 1 from class 2 is the support vector
machine we want. Considering that any hyperplane can be written in the form

w • x − b = 0

where • denotes the dot product and w the normal vector to the hyperplane, we want to choose
w and b to make the margin between the two parallel hyperplanes as large as possible while
still separating the data. So we define the two parallel hyperplanes by the equations

w • x − b = ±1
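For the two parallel hyperplanes w • x − b = ±1, the margin width is 2/||w||, so maximizing the margin is equivalent to minimizing ||w||. A tiny numeric illustration (the weight vector, bias and test point here are hypothetical values chosen only to make the arithmetic easy to follow):

```python
import numpy as np

# Hypothetical weight vector and bias, for illustration only
w = np.array([3.0, 4.0])
b = 1.0

# Distance between the planes w.x - b = +1 and w.x - b = -1
margin = 2.0 / np.linalg.norm(w)
print(margin)                # 0.4, since ||w|| = 5

# A point's side of the decision boundary is the sign of w.x - b
x = np.array([1.0, 1.0])
print(np.sign(w @ x - b))    # 1.0: classified as the +1 class
```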

Kernel SVMs

Traditional SVMs construct a hyperplane to classify data, so they cannot deal with
classification problems in which the different classes of data lie on different sides of a
hypersurface. For such problems, the kernel strategy is applied to SVMs. The resulting algorithm
is formally similar, except that every dot product is replaced by a nonlinear kernel function. The
kernel is related to the transform ϕ(xi) by the equation k(xi, xj) = ϕ(xi) • ϕ(xj).
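The Gaussian radial basis (GRB) kernel used in the experiments can be evaluated for all pairs of points without ever forming ϕ explicitly; this is the kernel trick. A minimal NumPy sketch (the γ value is an arbitrary placeholder, not a tuned parameter from the report):

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """Gaussian radial basis kernel k(x, y) = exp(-gamma * ||x - y||^2),
    evaluated for all pairs of rows in X and Y without computing phi."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    sq = np.maximum(sq, 0.0)   # guard against tiny negative round-off
    return np.exp(-gamma * sq)

X = np.random.rand(5, 3)
K = rbf_kernel(X, X)
print(K.shape)                       # (5, 5) Gram matrix
print(np.allclose(np.diag(K), 1.0))  # True: k(x, x) = exp(0) = 1
```

The resulting Gram matrix is symmetric with ones on the diagonal, and every entry lies in (0, 1]; a kernel SVM uses this matrix in place of the dot products of the linear formulation.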

CHAPTER 8

RESULTS & DISCUSSION

EXPERIMENTS AND DISCUSSIONS

The experiments were carried out on a P4 IBM platform with a 3 GHz processor
and 2 GB RAM, running the Windows XP operating system. The algorithm was developed
in-house using the wavelet toolbox and the biostatistics toolbox of Matlab 2011b (The
MathWorks). We downloaded an open SVM toolbox, extended it to the kernel SVM, and applied it
to the classification of the liver images. The programs can be run or tested on any computer
platform where Matlab is available.

Figure 1.24: Liver Images Data Set.

The three levels of wavelet decomposition greatly reduce the input image size. As stated
above, the number of extracted features was reduced from 65536 to 1024. However, this is still
too large for calculation, so PCA is used to further reduce the dimensionality of the features.
The number of retained components was chosen from the curve of the cumulative sum of
variance versus the number of principal components. We tested four SVMs with different
kernels (LIN, HPOL, IPOL and GRB).

In the case of the linear kernel, the KSVM degrades to the original linear SVM. We ran
hundreds of simulations in order to estimate the optimal parameters of the kernel functions,
such as the order d in the HPOL and IPOL kernels and the scaling factor γ in the GRB kernel.
The results showed that the proposed DWT+PCA+KSVM method obtains excellent results on
both the training and validation images. Computation time is another important factor in
evaluating a classifier; the time for SVM training was not considered, since the parameters
of the SVM remain unchanged after training.

CHAPTER 9

CONCLUSION & FUTURE SCOPE


9.1 CONCLUSION

In this project, we have developed a novel DWT+PCA+KSVM method to distinguish between
normal and abnormal liver images. We evaluated four different kernels: LIN, HPOL, IPOL and
GRB. The experiments demonstrate that the GRB kernel SVM obtained 99.38% classification
accuracy on the liver image set, higher than the LIN, HPOL and IPOL kernels and other popular
methods in the recent literature.

9.2 FUTURE SCOPE

Future work should focus on the following four aspects. First, the proposed SVM-based method
could be employed for MR images with other contrast mechanisms, such as T1-weighted, proton-
density-weighted and diffusion-weighted images. Second, the computation time could be reduced
by using advanced wavelet transforms such as the lifting wavelet. Third, multi-class classification,
focusing on specific disorders studied using liver images, could be explored. Fourth, novel kernels
could be tested to increase the classification accuracy. The DWT can efficiently extract the
information from the original CT images with little loss. The advantage of the DWT over the
Fourier transform is its spatial resolution: the DWT captures both frequency and location
information. In this study we chose the Haar wavelet, although there are other outstanding
wavelets such as the Daubechies series; we will compare the performance of different wavelet
families in future work. Other research directions include the stationary wavelet transform and
the wavelet packet transform.

CHAPTER 10
REFERENCES

1. Zhang, Y., L. Wu, and S. Wang, “Magnetic resonance brain image classification
by an improved artificial bee colony algorithm,” Progress In Electromagnetics
Research, Vol. 116, 65-79, 2011.
2. Mohsin, S. A., N. M. Sheikh, and U. Saeed, “MRI induced heating of deep brain
stimulation leads: Effect of the air-tissue interface,” Progress In Electromagnetics
Research, Vol. 83, 81-91, 2008.
3. Golestanirad, L., A. P. Izquierdo, S. J. Graham, J. R. Mosig, and C. Pollo,
“Effect of realistic modeling of deep brain stimulation on the prediction of volume
of activated tissue,” Progress In Electromagnetics Research, Vol. 126, 1-16, 2012.
4. Mohsin, S. A., “Concentration of the specific absorption rate around deep
brain stimulation electrodes during MRI,” Progress In Electromagnetics Research, Vol.
121, 469-484, 2011.
5. Oikonomou, A., I. S. Karanasiou, and N. K. Uzunoglu, “Phased-array near field
radiometry for brain intracranial applications,” Progress In Electromagnetics Research,
Vol. 109, 345-360, 2010.
6. Scapaticci, R., L. Di Donato, I. Catapano, and L. Crocco, “A feasibility
study on microwave imaging for brain stroke monitoring,” Progress In
Electromagnetics Research B, Vol. 40, 305-324, 2012.
7. Asimakis, N. P., I. S. Karanasiou, P. K. Gkonis, and N. K. Uzunoglu,
“Theoretical analysis of a passive acoustic brain monitoring system,” Progress In
Electromagnetics Research B, Vol. 23, 165-180, 2010.
8. Chaturvedi, C. M., V. P. Singh, P. Singh, P. Basu, M. Singaravel, R. K. Shukla,
A. Dhawan, A. K. Pati, R. K. Gangwar, and S. P. Singh, “2.45 GHz (CW)
microwave irradiation alters circadian organization, spatial memory, DNA structure in
the brain cells and blood cell counts of male mice, mus musculus,” Progress In
Electromagnetics Research B, Vol. 29, 23-42, 2011.
9. Emin Tagluk, M., M. Akin, and N. Sezgin, “Classification of sleep apnea by
using wavelet transform and artificial neural networks,” Expert Systems with
Applications, Vol. 37, No. 2, 1600-1607, 2010.

10. Zhang, Y., L. Wu, and G. Wei, “A new classifier for polarimetric SAR images,”
Progress in Electromagnetics Research, Vol. 94, 83-104, 2009.
11. Camacho, J., J. Picó, and A. Ferrer, “Corrigendum to ‘The best approaches in
the on-line monitoring of batch processes based on PCA: Does the modelling structure
matter?’ [Anal. Chim. Acta Volume 642 (2009) 59-68],” Analytica Chimica Acta,
Vol. 658, No. 1, 106-106, 2010.
12. Chaplot, S., L. M. Patnaik, and N. R. Jagannathan, “Classification of magnetic
resonance brain images using wavelets as input to support vector machine and neural
network,” Biomedical Signal Processing and Control, Vol. 1, No. 1, 86-92, 2006.
13. Cocosco, C. A., A. P. Zijdenbos, and A. C. Evans, “A fully automatic and
robust brain MRI tissue classification method,” Medical Image Analysis, Vol. 7, No. 4,
513-527, 2003.
14. Zhang, Y. and L. Wu, “Weights optimization of neural network via improved
BCO approach,” Progress In Electromagnetics Research, Vol. 83, 185-198, 2008.
15. Yeh, J.-Y. and J. C. Fu, “A hierarchical genetic algorithm for
segmentation of multi-spectral human-brain MRI,” Expert Systems with
Applications, Vol. 34, No. 2, 1285-1295, 2008.
16. Patil, N. S., et al., “Regression models using pattern search
assisted least square support vector machines,” Chemical Engineering Research
and Design, Vol. 83, No. 8, 1030-1037, 2005.
17. Wang, F.-F. and Y.-R. Zhang, “The support vector machine for dielectric
target detection through a wall,” Progress In Electromagnetics Research Letters,
Vol. 23, 119-128, 2011.
18. Xu, Y., Y. Guo, L. Xia, and Y. Wu, “A support vector regression based
nonlinear modeling method for SiC MESFET,” Progress In Electromagnetics Research
Letters, Vol. 2, 103-114, 2008.
19. Li, D., W. Yang, and S. Wang, “Classification of foreign fibers in cotton lint
using machine vision and multi-class support vector machine,” Computers and
Electronics in Agriculture, Vol. 4, No. 2, 274-279, 2010.

20. Gomes, T. A. F., et al., “Combining meta-learning and search techniques to
select parameters for support vector machines,” Neurocomputing, Vol. 75, No. 1, 3-13,
2012.
21. Hable, R., “Asymptotic normality of support vector machine variants and
other regularized kernel methods,” Journal of Multivariate Analysis, Vol. 106, 92-
117, 2012.
22. Ghosh, A., B. Uma Shankar, and S. K. Meher, “A novel approach to neuro-
fuzzy classification,” Neural Networks, Vol. 22, No. 1, 100-109, 2009.
23. Gabor, D., “Theory of communication. Part 1: The analysis of information,”
Journal of the Institution of Electrical Engineers, Part III: Radio and Communication
Engineering, Vol. 93, No. 26, 429-441, 1946.
24. Zhang, Y. and L. Wu, “Crop classification by forward neural network with
adaptive chaotic particle swarm optimization,” Sensors, Vol. 11, No. 5, 4721-4743,
2011.
25. Zhang, Y., S. Wang, and L. Wu, “A novel method for magnetic resonance brain
image classification based on adaptive chaotic PSO,” Progress In Electromagnetics
Research, Vol. 109, 325-343, 2010.
26. Ala, G., E. Francomano, and F. Viola, “A wavelet operator on the
interval in solving Maxwell’s equations,” Progress In Electromagnetics Research
Letters, Vol. 27, 133-140, 2011.
27. Iqbal, A. and V. Jeoti, “A novel wavelet-Galerkin method for modeling radio
wave propagation in tropospheric ducts,” Progress In Electromagnetics Research B,
Vol. 36, 35-52, 2012.
28. Messina, A., “Refinements of damage detection methods based on wavelet
analysis of dynamical shapes,” International Journal of Solids and Structures, Vol. 45,
Nos. 14-15, 4068-4097, 2008.
29. Martiskainen, P., et al., “Cow behaviour pattern recognition using a three-
dimensional accelerometer and support vector machines,” Applied Animal Behaviour
Science, Vol. 119, Nos. 1-2, 32-38, 2009.
30. Bermejo, S., B. Monegal, and J. Cabestany, “Fish age categorization
from otolith images using multi-class support vector machines,” Fisheries Research,
Vol. 84, No. 2, 247-253, 2007.
