
SECTION – C (20 Marks Each)

Module-1

1. Explain the convolution property in the 2D discrete Fourier transform. How do you obtain the 2D DFT
of a digital image? Discuss the time complexity involved in an image processing system. (assignment)

2. Explain the different transforms used in digital image processing and discuss the most
advantageous one in detail, along with their equations or symbolic representations. (assignment)
3. Define optical illusion and state the problems encountered when viewing such images constantly
for longer periods of time. (assignment)
4. Why is the KL transform called the optimal transform? Discuss how the Fast Fourier Transform
(FFT) differs from the KL transform, with the help of the symbolic notation of image transforms.
The KL transform is also known as the Hotelling transform or the eigenvector transform. It is based on the
statistical properties of the image and has several important properties that make it useful for image processing, particularly for
image compression.
The main purpose of image compression is to store the image in fewer bits than the original image. Data from
neighboring pixels in an image are highly correlated, and more compression can be achieved by de-correlating this data; the
KL transform does this de-correlation, thus facilitating a higher degree of compression.
There are four major steps in finding the KL transform (a sketch follows this list):
(I) Find the mean vector and covariance matrix of the given image x.
(II) Find the eigenvalues and then the eigenvectors of the covariance matrix.
(III) Create the transformation matrix T, such that the rows of T are the eigenvectors.
(IV) Find the KL transform.
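A minimal NumPy sketch of these four steps; the function name kl_transform and the column-per-vector layout are illustrative assumptions, not from the source:

```python
import numpy as np

def kl_transform(X):
    """KL transform sketch. X holds one input vector per column."""
    m = X.mean(axis=1, keepdims=True)         # (I) mean vector
    C = (X - m) @ (X - m).T / X.shape[1]      #     covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)      # (II) eigenvalues/eigenvectors (C is symmetric)
    order = np.argsort(eigvals)[::-1]         # principal components first
    T = eigvecs[:, order].T                   # (III) rows of T are eigenvectors
    Y = T @ (X - m)                           # (IV) the KL transform
    return T, Y, eigvals[order]
```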

The Fast Fourier Transform (FFT) is any efficient algorithm for computing the Discrete Fourier Transform (DFT); in fact, it is a
family of such algorithms. It is easy to confuse the two: a phrase like "take the FFT of this sequence" really means to take the
DFT of that sequence, using an FFT algorithm to do it efficiently. The DFT establishes a relationship between the time-domain
and frequency-domain representations, whereas the FFT is an implementation of the DFT. The computational complexity of the
direct DFT is O(M^2), while the FFT computes the same result in only O(M log M) steps, where M is the data size; for large
sequences this constitutes quite a substantial gain. A comparison is sketched below.
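A brief sketch of the complexity contrast, assuming NumPy; naive_dft builds the full M x M DFT matrix (O(M^2) work), while np.fft.fft computes the identical result in O(M log M):

```python
import numpy as np

def naive_dft(x):
    """Direct DFT: an M x M matrix-vector product, hence O(M^2) operations."""
    M = len(x)
    n = np.arange(M)
    W = np.exp(-2j * np.pi * np.outer(n, n) / M)   # twiddle-factor matrix
    return W @ x

x = np.random.rand(256)
print(np.allclose(naive_dft(x), np.fft.fft(x)))    # True: same result, different cost
```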

5. Obtain the forward KL transform for the given vectors: X1 = [1 0 0], X2 = [1 0 1], X3 = [1 1 0]
(transpose these vectors), and analyse how the principal components are used for remote
sensing applications.
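A short worked continuation of the kl_transform sketch above, applied to the given vectors stacked as columns. In remote sensing, the same principal-component idea de-correlates multispectral bands, so most of the scene's variance is packed into the first few components:

```python
X = np.array([[1, 1, 1],
              [0, 0, 1],
              [0, 1, 0]], dtype=float)   # columns are X1, X2, X3 (transposed)
T, Y, eigvals = kl_transform(X)
print(eigvals)   # approx. [1/3, 1/9, 0]: the first principal component carries most variance
```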

Module-2

1. Describe how homomorphic filtering is used to separate the illumination and reflectance components,
with a suitable example. (assignment)
2. Explain the following concepts, along with an application and a suitable example of each. (assignment)
a. Image Negatives b. Contrast Stretching
c. Bit-plane Slicing d. Histogram Processing
3. Compare the spatial-domain and frequency-domain methods used for performing image enhancement in DIP.
DIFFERENCE BETWEEN SPATIAL DOMAIN AND FREQUENCY DOMAIN

In the spatial domain, we deal with the image as it is: the values of the pixels change with respect to the scene. In the frequency
domain, we deal with the rate at which the pixel values change in the spatial domain.

SPATIAL DOMAIN

In the spatial domain, we operate directly on the image matrix itself.

FREQUENCY DOMAIN

In the frequency domain, we first transform the image into its frequency distribution. A processing system (a "black box") then
performs whatever processing it has to perform; its output is not an image but a transform. After an inverse transformation, the
result is converted back into an image, which is then viewed in the spatial domain. A sketch of this pipeline follows.
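A minimal sketch of this transform-process-inverse pipeline, assuming NumPy; the Gaussian low-pass "black box" is just an illustrative choice:

```python
import numpy as np

def lowpass_frequency_domain(img, sigma=20.0):
    F = np.fft.fftshift(np.fft.fft2(img))              # image -> frequency distribution
    rows, cols = img.shape
    y, x = np.indices(img.shape)
    H = np.exp(-((x - cols // 2) ** 2 + (y - rows // 2) ** 2) / (2 * sigma ** 2))
    G = F * H                                          # the "black box": attenuate high frequencies
    return np.real(np.fft.ifft2(np.fft.ifftshift(G)))  # inverse transform -> spatial domain
```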

4. What is the requirement of image sampling and quantization? Explain the significance of spatial
resolution with an example.
Sampling and quantization

In order to become suitable for digital processing, an image function f(x, y) must be digitized both spatially and in amplitude. Typically, a frame grabber or
digitizer is used to sample and quantize the analogue video signal. Hence, to create a digital image, we need to convert continuous data into
digital form. This is done in two steps:

 Sampling

 Quantization

The sampling rate determines the spatial resolution of the digitized image, while the quantization level determines the number of grey levels in the digitized
image. The magnitude of the sampled image is expressed as a digital value, and the transition between continuous values of the image function and their
digital equivalents is called quantization.
The number of quantization levels should be high enough for human perception of fine shading details in the image. The occurrence of false contours is the
main problem in an image that has been quantized with insufficient brightness levels, as the sketch below illustrates.
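A small sketch of uniform quantization, assuming 8-bit input; dropping to very few levels (e.g. 4) makes the false contours described above visible in smooth gradients:

```python
import numpy as np

def quantize(img, levels):
    """Map 8-bit intensities onto `levels` equally spaced grey levels."""
    step = 256 / levels
    return (np.floor(img / step) * step + step / 2).astype(np.uint8)

gradient = np.tile(np.arange(256, dtype=np.uint8), (64, 1))  # smooth test ramp
coarse = quantize(gradient, 4)   # only 4 grey levels: false contours appear
```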

Spatial resolution cannot be determined from the pixel count alone; the number of pixels in an image does not
by itself tell us how clear it is. Spatial resolution can be defined as the smallest discernible detail in an image
(Digital Image Processing, Gonzalez and Woods, 2nd Edition), or, put another way, as the number of independent
pixel values per inch. In short, we cannot compare two different types of images to see which one is clearer: to
compare two images for clarity or spatial resolution, we have to compare two images of the same size.

Measuring spatial resolution

Since spatial resolution refers to clarity, different measures are used for different devices. For example:

 Dots per inch (DPI) - usually used for monitors.

 Lines per inch (LPI) - usually used for laser printers.

 Pixels per inch (PPI) - used for devices such as tablets, mobile phones, etc.

5. Describe histogram equalization. Obtain histogram equalization for the following image
segment of size 5 x 5. Write the inference on the image segment before and after equalization.
20 20 20 18
16 15 15 16
18 15 15 15
19 15 17 16
17 19 18 16
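A minimal histogram-equalization sketch for this segment (treated as 8-bit data; names are illustrative). Before equalization the values crowd the narrow 15-20 range; after it they spread across 0-255:

```python
import numpy as np

def equalize(img, levels=256):
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = hist.cumsum() / img.size                   # cumulative distribution function
    return np.round((levels - 1) * cdf[img]).astype(np.uint8)

segment = np.array([[20, 20, 20, 18],
                    [16, 15, 15, 16],
                    [18, 15, 15, 15],
                    [19, 15, 17, 16],
                    [17, 19, 18, 16]])
print(equalize(segment))   # the crowded 15-20 range is stretched over the full scale
```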

Module-3
1. Explain why noise probability density functions are important. Give the properties and relations
of each of the following noises:
a. Gaussian Noise b. Gamma Noise c. Exponential Noise
d. Rayleigh Noise e. Uniform Noise
2. Enumerate the differences between image enhancement and image restoration, along with
suitable examples.

3. Explain the restoration filters used when the image degradation is due to noise only. Justify
your answer using various mean filters.

Image restoration is an emerging field of image processing in which the focus is on recovering an
original image from a degraded image. It can be defined as the process of removing or reducing
degradation in an image through linear or non-linear filtering. Degradation is usually incurred during
the acquisition of the image itself. Just as in image enhancement, the ultimate goal in restoration is to
improve an image; however, enhancement is a subjective process while restoration is an objective one.
Degradation can be due to:
1. Image sensor noise,
2. Blur due to misfocus,
3. Noise from the transmission channel, etc.

If the degradation present in an image is due to noise only, then

g(x, y) = f(x, y) + η(x, y)
G(u, v) = F(u, v) + N(u, v)

The restoration filters used in this case are:
1. Mean filters
2. Order-statistic filters
3. Adaptive filters
Mean filter

The mean filter is a simple sliding-window spatial filter that replaces the center
value in the window with the average (mean) of all the pixel values in the window.
The window, or kernel, is usually square but can be any shape. An example of
mean filtering of a single 3x3 window of values is shown below.

unfiltered values

5 3 6
2 1 9
8 4 7

5 + 3 + 6 + 2 + 1 + 9 + 8 + 4 + 7 = 45;  45 / 9 = 5

mean filtered

* * *
* 5 *
* * *

The center value (previously 1) is replaced by the mean of all nine values (5). A code sketch follows.
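The same computation in code, with scipy as an assumed dependency for filtering a whole image:

```python
import numpy as np
from scipy.ndimage import uniform_filter

window = np.array([[5, 3, 6],
                   [2, 1, 9],
                   [8, 4, 7]], dtype=float)
print(window.mean())          # 5.0: the value written to the center pixel

# For a full image, the same 3x3 average slides over every pixel:
# filtered = uniform_filter(img, size=3)
```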
Order-statistic filter:
This type of filter is based on order statistics: the output is a ranked (ordered) value of the pixels in the window, such as the
min (first order statistic), the max (largest order statistic), or the median.

Adaptive filter:-

An adaptive filter is a system with a linear filter that has a transfer function controlled by variable parameters and a means to adjust those
parameters according to an optimization algorithm. Because of the complexity of the optimization algorithms, almost all adaptive filters are digital
filters. Adaptive filters are required for some applications because some parameters of the desired processing operation (for instance, the
locations of reflective surfaces in a reverberant space) are not known in advance or are changing. The closed loop adaptive filter uses feedback
in the form of an error signal to refine its transfer function.
4. What are the two approaches used for blind image restoration? How do you differentiate indirect
estimation from direct measurement in the degradation process?
Blind Image Restoration

The goal of image restoration is to reconstruct the original (ideal) scene from a degraded observation. The recovery process is critical to many image
processing applications. Ideally, image restoration aims to undo the image degradation process during image acquisition and processing. If degradation is
severe, it may not be possible to completely recover the original scene, but partial recovery may be plausible.

Typical forms of degradation during image acquisition involve blurring and noise. The blurring may be from, for example, sensor motion or out-of-focus
cameras. In such a situation, the blurring function (called a point-spread function) must be known prior to image restoration. When this blurring function is
unknown, the image restoration problem is called blind image restoration.

Blind image restoration is the process of simultaneously estimating both the original image and the point-spread function, using partial
information about the imaging process and possibly about the original image. The various approaches that have been proposed depend upon the
particular degradation and image models.

Degradation may be difficult to measure or may be time varying in an unpredictable manner. In such cases
information about the degradation must be extracted from the observed image either explicitly or implicitly. This
task is called blind image restoration.
Indirect estimation Direct measurement approaches for blind image restoration

Indirect estimation method employ temporal or spatial averaging to either obtain a restoration or to
obtain key elements of an image restoration algorithm.

5. Geometric spatial transformation is somewhat different from spatial transformation. Justify your
answer with a suitable example.
A spatial transformation defines a geometric relationship between each point in the input and output
images. An input image consists entirely of reference points whose coordinate values are known
precisely, while the output image is comprised of the observed (warped) data. The general mapping function
can be given in two forms: relating the output coordinate system to that of the input, or vice versa.

A geometric transformation is any bijection of a set having some geometric structure to itself or another such set. Specifically, "A geometric transformation
is a function whose domain and range are sets of points. Most often the domain and range of a geometric transformation are both R2 or both R3. Often
geometric transformations are required to be 1-1 functions, so that they have inverses." [1] The study of geometry may be approached via the study of these
transformations. [2]

Geometric transformations can be classified by the dimension of their operand sets (thus distinguishing between planar transformations and those of space).

Module-4

1. What is meant by optimal thresholding? How do you obtain the threshold for image processing
tasks? Write the morphological concepts applicable to image processing.

Optimal thresholding is a technique that approximates the histogram using a weighted sum of
distribution functions, and then sets a threshold in such a way that the number of incorrectly
segmented pixels (as predicted from the approximation) is minimized.
2. Why is edge detection useful in image processing? Discuss the various gradient operators used in
detecting edge points.
Edge detection is a process that detects the presence and location of edges constituted by sharp changes
in the intensity of an image. Edges define the boundaries between regions in an image, which helps with
segmentation and object recognition. Edge detection significantly reduces the amount of data and filters
out useless information while preserving the important structural properties of an image. The general
method is to study the intensity changes of the pixels in an area and use the variation of the first-order or
second-order derivative in the edge neighborhood to detect the edge. Common techniques include
differential-operator methods such as the Sobel operator, the Prewitt operator, and the Canny technique,
as well as morphological edge detection.
Various gradient operators used in detecting edge points (a convolution sketch follows this list):
Prewitt Operator - used for detecting edges horizontally and vertically.
Sobel Operator - very similar to the Prewitt operator; it is also a derivative mask and calculates edges in both
the horizontal and vertical directions.
Robinson Compass Masks - also known as direction masks. One mask is taken and rotated in all eight major
compass directions to calculate the edges in each direction.
Kirsch Compass Masks - also a derivative mask used for finding edges; like the Robinson masks, Kirsch masks
calculate edges in all directions.
Laplacian Operator - a second-order derivative mask used to find edges in an image; it can be further divided
into the positive Laplacian and the negative Laplacian.
All these masks find edges: some horizontally and vertically, some in one direction only, and some in all
directions.
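A sketch of applying these kernels by convolution (scipy assumed; only the Prewitt and Sobel masks are shown):

```python
import numpy as np
from scipy.ndimage import convolve

prewitt_x = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]])
sobel_x   = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])

def gradient_magnitude(img, kernel_x):
    gx = convolve(img.astype(float), kernel_x)     # horizontal gradient
    gy = convolve(img.astype(float), kernel_x.T)   # vertical gradient
    return np.hypot(gx, gy)                        # edge strength per pixel
```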

3. Differentiate between parametric and non-parametric decision making. Explain two non-
parametric decision-making methods.

Two non-parametric decision-making methods:

1. KNN algorithm - K-Nearest Neighbors is one of the most basic yet essential classification algorithms in
machine learning. It belongs to the supervised learning domain and finds intense application in pattern
recognition, data mining, and intrusion detection. It is widely applicable in real-life scenarios because it is
non-parametric: it makes no underlying assumptions about the distribution of the data. A minimal sketch follows.
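A minimal k-NN sketch in NumPy (names are illustrative); note that no distribution parameters are estimated, which is what makes the method non-parametric:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    dists = np.linalg.norm(X_train - x, axis=1)          # distance to every training sample
    nearest = np.argsort(dists)[:k]                      # the k closest samples
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                     # majority vote
```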

2. Decision tree - the decision tree is a powerful and popular tool for classification and prediction. A
decision tree is a flowchart-like tree structure, where each internal node denotes a test on an attribute, each
branch represents an outcome of the test, and each leaf (terminal) node holds a class label. Decision trees
classify instances by sorting them down the tree from the root to some leaf node, which provides the
classification of the instance: an instance is classified by starting at the root node, testing the attribute
specified by this node, then moving down the tree branch corresponding to the value of the attribute, and
repeating this process for the subtree rooted at the new node.
 Decision trees are able to generate understandable rules.
 Decision trees perform classification without requiring much computation.
 Decision trees are able to handle both continuous and categorical variables.
 Decision trees provide a clear indication of which fields are most important for prediction or
classification.

4. How is the MPEG standard different from JPEG? Draw and explain the block diagram of the MPEG
encoder. How can quality be achieved using both standards?
 JPEG is mainly used for image compression. JPEG stands for Joint Photographic Experts Group; the file
extension for a JPEG image is .jpg or .jpeg. JPEG is the most commonly used format for photographs. It is
specifically good for color photographs or for images with many blends or gradients; however, it is not
the best with sharp edges and might lead to a little blurring. JPEG is a method of lossy compression for
digital photography. An advantage of the JPEG format is that, due to compression, a JPEG image
takes up only a few MB of data. Due to its popularity, JPEG is accepted in most if not all
programs, and it is widely used for web hosting of images, by amateur and average photographers, in digital
cameras, etc.

 MPEG stands for the Moving Picture Experts Group. It is a working group of experts that was formed
in 1988 by ISO and IEC. The aim of MPEG was to set standards for audio and video compression and
transmission. The standards set by MPEG consist of different Parts, each covering a certain aspect
of the whole specification. MPEG has standardized the following compression formats and ancillary
standards:

 MPEG-1: Coding of moving pictures and associated audio for digital storage media.
 MPEG-2: Generic coding of moving pictures and associated audio information (ISO/IEC 13818).
 MPEG-3: dealt with standardizing scalable and multi-resolution compression and was intended for
HDTV compression, but was found to be redundant and was merged into MPEG-2.
 MPEG-4: Coding of audio-visual objects; it includes MPEG-4 Part 14 (MP4).

(Figure: block diagram of the MPEG encoder.)
5. The table below shows different symbols with their corresponding probabilities of occurrence.

Symbol Probability
a 0.05
b 0.2
c 0.1
d 0.05
e 0.3
f 0.2
g 0.1

Create a Huffman tree and a Huffman table for the symbols above. The table should show the code
word for each symbol and the corresponding code-word length. A code sketch follows.
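A hedged sketch of the construction using Python's heapq; tie-breaking can produce different but equally optimal codes, so the code words may differ from a hand-built tree while the code-word lengths remain optimal:

```python
import heapq

probs = {'a': 0.05, 'b': 0.2, 'c': 0.1, 'd': 0.05, 'e': 0.3, 'f': 0.2, 'g': 0.1}
heap = [[p, i, {s: ''}] for i, (s, p) in enumerate(probs.items())]
heapq.heapify(heap)
tie = len(heap)
while len(heap) > 1:
    lo = heapq.heappop(heap)                      # two least probable nodes
    hi = heapq.heappop(heap)
    merged = {s: '0' + c for s, c in lo[2].items()}
    merged.update({s: '1' + c for s, c in hi[2].items()})
    heapq.heappush(heap, [lo[0] + hi[0], tie, merged])
    tie += 1
for s, code in sorted(heap[0][2].items()):
    print(s, code, len(code))                     # symbol, code word, code-word length
```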

Module-5

1. How do you classify objects while performing object recognition? State and prove Bayes'
theorem as applied to pattern recognition. Discuss in detail.
Object recognition is a computer vision technique for identifying objects in images or videos. It is a
key output of deep learning and machine learning algorithms. When humans look at a photograph or
watch a video, we can readily spot people, objects, scenes, and visual details.

Bayes' theorem describes the probability of an event based on prior knowledge of the conditions that
might be related to the event. If we know the conditional probability P(B|A), we can use Bayes' rule to
find the reverse probability P(A|B):

P(A|B) = P(B|A) P(A) / P(B)

This is the general statement of Bayes' rule. We can generalize the formula further: if multiple events Ai
form an exhaustive set with another event B, we can write

P(Ai|B) = P(B|Ai) P(Ai) / Σj P(B|Aj) P(Aj)

A small numeric check is sketched below.
2. Explain the following concepts with an example of each –


Feature Vector - A vector is a series of numbers, like a matrix with only one row but multiple columns (or
only one column but multiple rows), for example [1, 2, 3, 5, 6, 3, 2, 0].
A feature vector is simply a vector that contains information describing an object's important characteristics.
In image processing, features can take many forms. A simple feature representation of an image is the raw
intensity value of each pixel; however, more complicated feature representations are also possible. For facial
expression analysis, SIFT (scale-invariant feature transform) descriptors are often used; these features capture
the prevalence of different line orientations.
As an example, in a color image the color of an object can be represented as a feature f = [r, g, b], where
r, g, b are the values of a pixel in the three color planes. Likewise, a collection of other features, which may
relate to texture, shape, etc., is considered a feature vector.

Random Variables - A random variable, usually written X, is a variable whose possible values are numerical
outcomes of a random phenomenon. There are two types of random variables, discrete and continuous.
A discrete random variable is one which may take on only a countable number of distinct values such as
0,1,2,3,4,........ Discrete random variables are usually (but not necessarily) counts. If a random variable can take
only a finite number of distinct values, then it must be discrete. Examples of discrete random variables include
the number of children in a family, the Friday night attendance at a cinema, the number of patients in a doctor's
surgery, the number of defective light bulbs in a box of ten.
A continuous random variable is one which takes an infinite number of possible values. Continuous random
variables are usually measurements. Examples include height, weight, the amount of sugar in an orange, the
time required to run a mile.

Conditional Probability - The conditional probability of an event B is the probability that the event will occur
given the knowledge that an event A has already occurred. This probability is written P(B|A), notation for
the probability of B given A. In the case where events A and B are independent (where event A has no effect on
the probability of event B), the conditional probability of event B given event A is simply the probability of
event B, that is, P(B). If events A and B are not independent, then the probability of the intersection of A and
B (the probability that both events occur) is defined by

P(A and B) = P(A) P(B|A)

from which the conditional probability P(B|A) is obtained by dividing by P(A):

P(B|A) = P(A and B) / P(A)

Image Analysis - Image analysis is a technique often used to obtain quantitative data from tissue samples using
analysis software that segments pixels in a digital image based on features such as color (i.e., RGB), density, or
texture. A limitation of image analysis is that it often requires assumptions to be made and only provides
measurements of relative changes to the object(s) of interest in tissues. Even with its recognized
limitations, image analysis is a powerful tool when used correctly to obtain quantitative data, and it is a
major research and development task. Without model-based techniques that make segmentation as robust,
reproducible, and efficient as possible, interactive visualization would not be possible in routine clinical
practice.
Voice Analysis - Voice analysis is a voice biometrics program used for law enforcement and criminal
identification. It analyzes audio evidence accurately by applying voice biometrics technology in a way that
makes it easier to work with audio evidence, assisting forensics experts and security organizations in
completing voice treatment and speaker identification processes accurately. With the straightforward
identification it provides, forensic voice analysis contributes to criminal investigation and the prosecution of
suspects. It involves gender detection, format verification, speaker identification, speech/silence detection,
and likelihood-ratio calculation.
Pattern Classes - A pattern is an arrangement of descriptors (or features). A pattern class is a family of patterns
that share some common properties; pattern classes are denoted w1, w2, ..., wn, where n is the number of classes.
Pattern recognition by machine involves techniques for assigning patterns to their respective classes
automatically, with as little human intervention as possible. Pattern recognition consists of two steps:
1. Feature selection (extraction)
2. Matching (classification)
3. Explain the real-world applications of pattern recognition and also write how one can identify
facial expressions using image analysis approaches.
Applications –
 Image processing, segmentation and analysis
Pattern recognition is used to give machines the human-like recognition intelligence required in image
processing.
 Computer vision
Pattern recognition is used to extract meaningful features from given image/video samples and is used in
computer vision for various applications like biological and biomedical imaging.
 Seismic analysis
Pattern recognition approach is used for the discovery, imaging and interpretation of temporal patterns in
seismic array recordings. Statistical pattern recognition is implemented and used in different types of
seismic analysis models.
 Radar signal classification/analysis
Pattern recognition and Signal processing methods are used in various applications of radar signal
classifications like AP mine detection and identification.
 Speech recognition
The greatest success in speech recognition has been obtained using pattern recognition paradigms. It is
used in various speech recognition algorithms that try to avoid the problems of a phoneme-level
description by treating larger units, such as words, as patterns.
 Fingerprint identification
The fingerprint recognition technique is a dominant technology in the biometric market. A number of
recognition methods have been used to perform fingerprint matching, among which pattern recognition
approaches are widely used.
Identifying facial expressions using image analysis approaches:
1. Convert the video into frames.
2. Read the input video frame image.
3. Convert the image into a grayscale image.
4. Enhance the input image with median, Wiener, and Gaussian filters.
5. Find the best filter based on PSNR and RMSE values.
6. Apply the Viola-Jones algorithm to detect the face region.
7. Use the bounding-box method and crop the face region.
8. Use a threshold value to extract non-skin regions.
9. Apply morphological operations to extract continuous boundaries of the non-skin region.
10. Mask the boundary from the original image.
11. Extract the mouth region.
12. Calculate the area of the mouth region.
13. Recognize facial emotions based on the value of the area.

4. “Suppose you have recorded the voices of 10 people. Now your job is to identify the voice of a
particular person among those individuals.” Write all the steps you would perform to identify
those voice samples.

 The first stage of the speech recognition process is preprocessing. In order for any speech recognition
system to operate at a reasonable speed, the amount of data used as input must be kept to a minimum.
The inherent challenge in this is to remove the "bad" data, such as noise, without losing or distorting
the critical data needed to identify what has been said. Two of the more common ways of reducing this
data are sampling, where "snapshots" of the data are taken at regular time intervals, and filtering, where
only data in a certain frequency range is kept.

 These analog samples are then converted to digital form. Most approaches group the samples into small
time intervals called frames. The preprocessor then extracts acoustic patterns from each frame, as well
as the changes that occur between frames. This process is called spectral analysis because it focuses on
individual frequency elements (Markowitz).

 Once preprocessing is completed, the input data moves to the recognition stage, where the primary
work involved in speech recognition is accomplished. There are two main approaches to attacking the
speech recognition problem: a knowledge-based approach and a data-based approach. In the
knowledge-based approach, the goal is to express what people know about speech in terms of specific
rules (whether they are phonetic, syntactic, or some other type) that can then be used to analyze the
input. The two main problems with this approach are the depth of linguistic knowledge and the large
amount of manual labor needed to establish a fast, accurate system. As a result, the data-based
approaches are the dominant technique used in commercial speech recognition products on the market
today

 Data-based approaches become more accurate as they encounter more data; that is, the data
encountered gradually improves the models used to analyze the data. Hidden Markov Models are an
example of a data-based approach to speech recognition.

 Although the speech recognition equipment cannot determine with certainty which words occurred
earlier in the input signal it received, it can use the feature information to determine which words had
the greatest probability of occurring.

 The final stage in the speech recognition process is the communication stage. In this stage, the software
system acts upon the voice input it has received and translated. Applications of speech recognition
systems are usually grouped into four categories: Command-and-Control, Data Entry, Data
Access/Information Retrieval, and Dictation
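A hedged sketch of the preprocessing and matching stages for the ten-speaker task, assuming the librosa library is available; a production system would model each speaker with HMMs/GMMs rather than a single averaged template:

```python
import numpy as np
import librosa

def voiceprint(path):
    y, sr = librosa.load(path, sr=16000)                 # sampling/filtering (preprocessing)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # per-frame spectral analysis
    return mfcc.mean(axis=1)                             # crude averaged acoustic pattern

def identify(unknown_path, enrolled):
    """enrolled: dict mapping each of the 10 people to an enrollment recording path."""
    probe = voiceprint(unknown_path)
    return min(enrolled, key=lambda name: np.linalg.norm(voiceprint(enrolled[name]) - probe))
```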

5. Explain the concept of morphology and discuss the morphological algorithms used in digital
image processing; take a suitable example to justify your answer.

Morphology - In general, morphology is the identification, analysis, and description of structure; in image
processing it is a theory and technique for the analysis and processing of geometric structures:
– Based on set theory, lattice theory, topology, and random functions
– Extracts image components useful in the representation and description of region shape, such as
boundaries, skeletons, and the convex hull
– Input is in the form of images; output is in the form of attributes extracted from those images
– Attempts to extract the meaning of the images
1. Erosion and dilation
∗ Erosion shrinks or thins objects in a binary image; it acts as a morphological filter in which image details
smaller than the structuring element (SE) are filtered/removed from the image.
∗ Dilation grows or thickens objects in a binary image and can bridge gaps in broken characters. Unlike
lowpass filtering, which produces a grayscale image, this morphological operation produces a binary image.

2. Opening and closing

– Opening smoothes the contours of an object, breaks narrow isthmuses, and eliminates thin
protrusions. It can be expressed as A ∘ B = ∪{(B)z : (B)z ⊆ A}, the union of all translates of B that fit into A.
– Closing smoothes sections of contours, fuses narrow breaks and long thin gulfs, eliminates
small holes, and fills gaps in the contour.
3. Hit-or-miss transformation - a basic tool for shape detection in a binary image
– Uses the morphological erosion operator and a pair of disjoint sets
– The first set fits in the foreground of the input image; the second set misses it completely
– The pair of sets is called a composite structuring element
A short sketch of these operations follows.
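A short sketch of these operations on a toy binary image, with scipy as an assumed dependency:

```python
import numpy as np
from scipy import ndimage

A = np.zeros((9, 9), dtype=bool)
A[2:7, 2:7] = True                                   # a small square object
B = np.ones((3, 3), dtype=bool)                      # 3x3 structuring element

eroded  = ndimage.binary_erosion(A, structure=B)     # shrinks/thins the object
dilated = ndimage.binary_dilation(A, structure=B)    # grows/thickens the object
opened  = ndimage.binary_opening(A, structure=B)     # erosion followed by dilation
closed  = ndimage.binary_closing(A, structure=B)     # dilation followed by erosion
```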

The morphological algorithms used in digital image processing:

1. Boundary extraction - extracting the boundary of an object is often useful.
The boundary of a set A, denoted β(A), is extracted by eroding A by a suitable structuring element B and
computing the set difference between A and its erosion:
β(A) = A − (A ⊖ B)
Using a larger structuring element yields a thicker boundary.
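Continuing the sketch above, the boundary formula in one line:

```python
boundary = A & ~ndimage.binary_erosion(A, structure=B)   # β(A) = A − (A ⊖ B)
```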

2. Hole filling
 A hole is a background region surrounded by a connected border of foreground pixels.
 The algorithm is based on set dilation, complementation, and intersection.
 Let A be a set whose elements are 8-connected boundaries, each boundary enclosing a background
region (a hole). Given a point in each hole, we want to fill all holes.
 Start by forming an array X0 of 0s of the same size as A; the locations in X0 corresponding to the
given point in each hole are set to 1.
 Let B be the symmetric structuring element with 4-connected neighbors to the origin:

0 1 0
1 1 1
0 1 0

 Compute Xk = (Xk−1 ⊕ B) ∩ Aᶜ, k = 1, 2, 3, ...
 The algorithm terminates at iteration step k if Xk = Xk−1.
 Xk contains all the filled holes; Xk ∪ A contains all the filled holes and their boundaries.
 The intersection with Aᶜ at each step limits the result to the region of interest; this is also called
conditional dilation. A sketch follows.
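A sketch of this conditional-dilation loop (scipy assumed; seeds marks one point inside each hole):

```python
import numpy as np
from scipy import ndimage

def fill_holes(A, seeds):
    """A: boolean boundary image; seeds: boolean array, one True point per hole."""
    B = np.array([[0, 1, 0],
                  [1, 1, 1],
                  [0, 1, 0]], dtype=bool)           # symmetric 4-connected SE
    X = seeds.copy()
    while True:
        X_next = ndimage.binary_dilation(X, structure=B) & ~A   # Xk = (Xk-1 ⊕ B) ∩ Aᶜ
        if (X_next == X).all():                                  # Xk == Xk-1: converged
            return X | A                 # filled holes together with their boundaries
        X = X_next
```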
3. Extraction of connected components
– Let A be a set containing one or more connected components.
– Form an array X0 of the same size as A; all elements of X0 are 0 except for one point in each connected
component, which is set to 1.
– Select a suitable structuring element B, possibly the 8-connected neighborhood
1 1 1
1 1 1
1 1 1
– Start with X0 and find all connected components using the iterative procedure Xk = (Xk−1 ⊕ B) ∩ A, k = 1, 2, 3, ...
– The procedure terminates when Xk = Xk−1; Xk then contains all the connected components in the input image.
– The only difference from the hole-filling algorithm is the intersection with A instead of Aᶜ: here we search
for foreground points, while in hole filling we looked for background points (holes).

4. Convex hull
– A set A is convex if the straight line segment joining any two points in A lies entirely within A.
– The convex hull H of an arbitrary set of points S is the smallest convex set containing S.
– The set difference H − S is called the convex deficiency of S.
– The convex hull and convex deficiency are useful for describing objects.
– Algorithm to compute the convex hull C(A) of a set A: let Bi, i = 1, 2, 3, 4, represent four structuring
elements, where Bi is a clockwise rotation of Bi−1 by 90°. Implement the equation
Xi_k = (Xi_k−1 ⊛ Bi) ∪ A, for i = 1, 2, 3, 4 and k = 1, 2, 3, ..., with Xi_0 = A,
where ⊛ denotes the hit-or-miss transform. Apply hit-or-miss with B1 until Xk = Xk−1, then repeat with B2,
B3, and B4 over the original A. The procedure converges when Xi_k = Xi_k−1, and we let Di = Xi_k. The
convex hull of A is then
C(A) = ∪(i=1..4) Di
– The convex hull can grow beyond the minimum dimensions required to guarantee convexity; this may be
fixed by limiting growth so that it does not extend past the bounding box of the original set of points.
5. Thinning
– Transformation of a digital image into a simpler, topologically equivalent image: selected foreground
pixels are removed from the binary image. Thinning is used to tidy up the output of edge detectors by
reducing all lines to single-pixel thickness.
– The thinning of a set A by a structuring element B is denoted A ⊗ B.
– It is defined in terms of the hit-or-miss transform (⊛) as A ⊗ B = A − (A ⊛ B) = A ∩ (A ⊛ B)ᶜ
– Only pattern matching with the set is needed; no background operation is required in the hit-or-miss transform.
– A more useful expression thins A symmetrically, based on a sequence of structuring elements
{B} = {B1, B2, ..., Bn}, where Bi is a rotated version of Bi−1.
– Thinning by a sequence of structuring elements is defined as A ⊗ {B} = ((...((A ⊗ B1) ⊗ B2)...) ⊗ Bn)
– The procedure is iterated until convergence.

6. Thickening
– The morphological dual of thinning, defined by A ⊙ B = A ∪ (A ⊛ B), where the structuring elements are
the set complements of those used for thinning.
– Thickening can also be defined as the sequential operation A ⊙ {B} = ((...((A ⊙ B1) ⊙ B2)...) ⊙ Bn).
The usual practice is to thin the background and take the complement; this may result in disconnected
points, which are removed by post-processing.
7. Skeletons
– The skeleton S(A) of a set A satisfies two deductions: (1) if z is a point of S(A) and (D)z is the largest disk
centered at z and contained in A, one cannot find a larger disk (not necessarily centered at z) containing (D)z
and included in A; (D)z is called a maximum disk; (2) the disk (D)z touches the boundary of A at two or more
different places.
– The skeleton can be expressed in terms of erosions and openings as
S(A) = ∪(k=0..K) Sk(A), where Sk(A) = (A ⊖ kB) − (A ⊖ kB) ∘ B
– Here (A ⊖ kB) indicates k successive erosions of A: (A ⊖ kB) = ((...((A ⊖ B) ⊖ B)...) ⊖ B), and K is the
last iterative step before A erodes to an empty set: K = max{k | (A ⊖ kB) ≠ ∅}.
– S(A) can be obtained as the union of the skeleton subsets Sk(A), and A can be reconstructed from the
subsets using A = ∪(k=0..K) (Sk(A) ⊕ kB), where (Sk(A) ⊕ kB) denotes k successive dilations of Sk(A):
(Sk(A) ⊕ kB) = ((...((Sk(A) ⊕ B) ⊕ B) ⊕ ...) ⊕ B)
8. Pruning
– A complement to the thinning and skeletonizing algorithms, used to remove unwanted parasitic components.
– Example: automatic recognition of hand-printed characters by analyzing the shape of the skeleton of each
character. Skeletons are characterized by "spurs" (parasitic components), caused during erosion by
non-uniformities in the strokes. Assume that the length of a spur does not exceed a specified number of pixels.
– For the skeleton of a hand-printed "a", suppress a parasitic branch by successively eliminating its end
points (assumption: any branch with ≤ 3 pixels will be removed). This is achieved by thinning an input set A
with a sequence of structuring elements designed to detect only end points: X1 = A ⊗ {B}
– After applying this thinning three times, restore the character to its original form with the parasitic
branches removed: form a set X2 containing all end points in X1, X2 = ∪(k=1..8) (X1 ⊛ Bk), then dilate the
end points three times using A as a delimiter, X3 = (X2 ⊕ H) ∩ A, where H is a 3 × 3 structuring element of 1s
and the intersection with A is applied after each step. The final result is X4 = X1 ∪ X3
