

CHAPTER 1

INTRODUCTION

1.1 Introduction to the project

Interactive systems have attracted a lot of research interest in recent years, especially for content-based image retrieval systems. Contrary to the early systems, which focused on fully automatic strategies, recent approaches have introduced human-computer interaction. In this project, we focus on the retrieval of concepts within a large image collection. We assume that a user is looking for a set of images, the query concept, within a database. The aim is to build a fast and efficient strategy to retrieve the query concept. In content-based image retrieval (CBIR), the search may be initiated using a query as an example. The top-ranked similar images are then presented to the user. The interactive process then allows the user to refine the request as much as necessary in a relevance feedback loop. Many kinds of interaction between the user and the system have been proposed, but most of the time, user information consists of binary labels indicating whether or not the image belongs to the desired concept.

1.2 Organization profile

Software Solutions is an IT solution provider for a dynamic environment where business and technology strategies converge. Their approach focuses on new ways of doing business, combining IT innovation and adoption while also leveraging an organization's current IT assets. They work with large global corporations and on new products or services to implement prudent business and technology strategies in today's environment.

Image Retrieval’s range of expertise includes:


• Software Development Services
• Engineering Services
• Systems Integration

• Customer Relationship Management


• Product Development
• Electronic Commerce
• Consulting
• IT Outsourcing
We apply technology with innovation and responsibility to achieve two broad objectives:
• Effectively address the business issues our customers face today.
• Generate new opportunities that will help them stay ahead in the future.

This approach rests on:


• A strategy where we architect, integrate and manage technology services and
solutions - we call it AIM for success.
• A robust offshore development methodology and reduced demand on customer
resources.
• A focus on the use of reusable frameworks to provide cost and time benefits.
They combine the best people, processes and technology to achieve excellent results - consistently. We offer customers the advantages of:
Speed:
They understand the importance of timing, of getting there before the
competition. A rich portfolio of reusable, modular frameworks helps jump-start projects.
Tried and tested methodology ensures that we follow a predictable, low-risk path to achieve results. Our track record is testimony to complex projects delivered within and even before schedule.

Expertise:
Our teams combine cutting edge technology skills with rich domain expertise.
What’s equally important - they share a strong customer orientation that means they
actually start by listening to the customer. They’re focused on coming up with solutions
that serve customer requirements today and anticipate future needs.

A Full Service Portfolio:


They offer customers the advantage of being able to architect, integrate and manage technology services. This means that they can rely on one fully accountable source instead of trying to integrate disparate multi-vendor solutions.

1.3 Proposed System

The proposed system uses the following methods to improve retrieval efficiency:
• We implemented our models in a CBIR system for a specific application domain, the retrieval of coats of arms. We implemented altogether 19 features, including a color histogram and symmetry features; a minimal sketch of one such feature follows this list.
• Content-based image retrieval uses the visual contents of an image, such as color, shape, texture, and spatial layout, to represent and index the image.
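
As an illustration of this kind of low-level feature, the sketch below computes a coarse RGB colour histogram from a System.Drawing.Bitmap. The 4-bins-per-channel quantization and the class name are assumptions made for illustration only; the project's actual 19 features are not reproduced here.

// Sketch: a coarse RGB colour histogram (assumed 4 bins per channel, 64 bins in total).
using System.Drawing;

static class ColorHistogram
{
    public static double[] Compute(Bitmap image, int binsPerChannel)
    {
        double[] histogram = new double[binsPerChannel * binsPerChannel * binsPerChannel];
        int step = 256 / binsPerChannel;

        for (int y = 0; y < image.Height; y++)
        {
            for (int x = 0; x < image.Width; x++)
            {
                Color c = image.GetPixel(x, y);   // slow but simple; sufficient for a sketch
                int bin = (c.R / step) * binsPerChannel * binsPerChannel
                        + (c.G / step) * binsPerChannel
                        + (c.B / step);
                histogram[bin]++;
            }
        }

        // Normalise so that images of different sizes remain comparable.
        double total = (double)image.Width * image.Height;
        for (int i = 0; i < histogram.Length; i++)
            histogram[i] /= total;
        return histogram;
    }
}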

1.4 Existing system


In the existing system, the CBIR method faced a number of disadvantages in image retrieval. The main disadvantages arise in the medical field: medical image description is an important problem in content-based medical image retrieval, and hierarchical medical image semantic feature description models are currently proposed according to the main sources of semantic features. Hence we propose a new algorithm to overcome the limitations of the existing system.
In the existing system, images were first annotated with text and then searched using a text-based approach from traditional database management systems.

1.5 Need for computation


We all know the importance of computerization. The world is moving ahead at lightning speed and everyone is running short of time. One always wants to get the information and perform a desired task within a short period of time and with a good degree of efficiency and accuracy. The application areas for computerization have been selected on the basis of the following factors:

• Minimizing the manual records kept at different locations.
• Improving data integrity.
• Facilitating the display of desired information very quickly by retrieving it for users.
• Facilitating various statistical information which helps in decision-making.
• Reducing manual effort in activities that involve repetitive work.
• Making updating and deletion of such a huge amount of data easier.

CHAPTER 2

PROBLEM STATEMENT

In the existing system, the CBIR method faced a number of disadvantages in image retrieval. The main disadvantages arise in the medical field: medical image description is an important problem in content-based medical image retrieval, and hierarchical medical image semantic feature description models are currently proposed according to the main sources of semantic features. Hence we propose a new algorithm to overcome the limitations of the existing system. In the existing system, images were first annotated with text and then searched using a text-based approach from traditional database management systems.

CHAPTER 3

THE SURVEY OF LITERATURE

Initially developed in document retrieval (Salton 1989), relevance feedback was transformed and introduced into content-based multimedia retrieval, mainly content-based image retrieval (CBIR), during the early and mid-1990s (Kurita and Kato 1993; Picard et al. 1996; Rui et al. 1998). Interestingly, it appears to have attracted more attention in the new field than in the previous one; a variety of solutions has been proposed within a short period and it remains an active research topic. The reasons can be that more ambiguities arise when interpreting images than words, which makes user interaction more of a necessity; and in addition, judging a document takes time while an image reveals its content almost instantly to a human observer, which makes the feedback process faster and more sensible for the end user. The need for the user-in-the-loop in CBIR stems from the fact that images reside in a continuous representation space, in which semantic concepts are best described in discriminative subspaces. Cars are of a certain shape, while "sunset" is more describable by color. More importantly, different users at different times may have different interpretations or intended usages for the same image, which makes off-line learning unfeasible in general. Fully automated off-line preprocessing (e.g., clustering, classification) makes sense for some specific applications with well-defined image classes. But for many others, the best answer does not exist, and ignoring the user's individuality can be as senseless as trying to determine the world's greatest color. A straightforward way of getting the user into the loop is to ask the user to tune the system parameters during the retrieval process, but that is too much of a burden for a common user (Rui et al. 1998). A more feasible form of interaction is to ask the user to provide feedback regarding the (ir)relevance of the current retrieval results. The system then learns from these training examples to achieve improved performance in the next round, iteratively if the user so desires. Relevance feedback algorithms have been shown to provide a dramatic performance boost in retrieval systems (MacArthur et al. 2000; Picard et al. 1996; Rui et al. 1998; Tieu and Viola 2000).

The Characteristics of the Relevance Feedback Problem

Since the general assumption is that every user's need is different (Kurita and Kato 1993) and time-varying, the database cannot adopt a fixed clustering structure; and the total number of classes and the class membership are not available beforehand, since these are assumed to be user-dependent and time-varying as well. Of course, these rather extreme assumptions can be relaxed in a real-world application to the degree of choice.
A typical scenario for relevance feedback in content-based image retrieval is as follows:

Step 1. Machine provides initial retrieval results, through query-by-keyword, sketch, or example, etc.

Step 2. User provides judgment on the currently displayed images as to whether, and to
what degree, they are relevant or irrelevant to her/his request;

Step 3. Machine learns and tries again. Go to step 2.
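
The three steps above can be written down as a simple interactive loop. The sketch below is only an outline; the ISearchEngine and IUser interfaces and their method names are hypothetical placeholders, not part of any particular system described here.

// Sketch of the relevance-feedback loop (Steps 1-3); all names are hypothetical.
using System.Collections.Generic;

public class Feedback
{
    public List<int> Relevant = new List<int>();     // image ids the user marked relevant
    public List<int> Irrelevant = new List<int>();   // image ids the user marked irrelevant
    public bool Satisfied;                           // true when the user wants to stop
}

public interface ISearchEngine
{
    IList<int> InitialQuery(string query);                          // Step 1
    IList<int> Refine(IList<int> relevant, IList<int> irrelevant);  // Step 3
}

public interface IUser
{
    Feedback Judge(IList<int> currentResults);                      // Step 2
}

public static class FeedbackLoop
{
    public static IList<int> Run(ISearchEngine engine, IUser user, string query)
    {
        IList<int> results = engine.InitialQuery(query);            // Step 1
        while (true)
        {
            Feedback feedback = user.Judge(results);                // Step 2
            if (feedback.Satisfied)
                return results;
            results = engine.Refine(feedback.Relevant,              // Step 3, then back to Step 2
                                    feedback.Irrelevant);
        }
    }
}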

If each image/region is represented by a point in a feature space, relevance feedback with


only positive (i.e., relevant) examples can be cast as a density estimation (Ishikawa et al.
1998; Meilhac and Nastar 1999) or novelty detection (Chen et al. 2001; Scholkopf et al.
2000) problem; while with both positive and negative training examples it becomes a
classification problem, or an on-line learning problem in a batch mode, but with the
following characteristics associated with this specific application scenario:

Small sample issue. The number of training examples is small (typically fewer than 20 per round of interaction) relative to the dimension of the feature space (from dozens to hundreds, or even more), while the number of classes is large for most real-world image databases. For such a small sample size, some existing learning machines such as support vector machines (SVM) (Vapnik 1995) cannot give stable or meaningful results (Tieu and Viola 2000; Zhou and Huang 2001), unless more training samples can be elicited from the user (Tong and Chang 2001).

Asymmetry in training sample. The desired output of information retrieval is not


necessarily a binary decision on each point as given by a classifier, but rather a rank-ordered list of top-k returns. This is a less demanding task since the rank or configuration of the
irrelevant classes/points is of no concern as long as they are well beyond the top-k
returns. Most classification or learning algorithms, e.g., discriminant analysis (Duda and
Hart 1973) or SVM (Vapnik 1995), treat the positive and negative examples
interchangeably and assume that both sets represent the true distributions equally well.
However, in reality, the small number of negative examples is unlikely to be
representative for all the irrelevant classes; thus, an asymmetric treatment may be
necessary (Zhou and Huang 2001).

Real-time requirement. Finally, since the user is interacting with the machine in real time, the algorithm should be sufficiently fast and, if possible, avoid heavy computations over the whole dataset.

The Variants of Relevance Feedback Algorithms

It is not the intention of this section to list all existing techniques but rather to point out the major variants and compare their merits. We would emphasize that, under the same notion of "relevance feedback", different methods might have adopted different assumptions or problem settings and are thus not directly comparable.
Content-based image retrieval (CBIR) deals with finding images based on their content, without using any additional textual information. CBIR has been investigated for quite some time, with the main focus on system-centered methods [10]. In textual information retrieval, it was shown that relevance feedback, i.e. having a user judge retrieved documents and using these judgements to refine the search, can lead to a significant performance improvement [8]. In image retrieval (content-based, as well as using textual information), relevance feedback has been discussed in some papers, but in contrast to the text retrieval domain, where the Rocchio relevance feedback method can be considered a reasonable and well-established baseline, no standard method is defined. A survey on relevance feedback techniques for image retrieval until 2002 was presented in [12]. Most approaches use the marked images as individual queries and combine the retrieval results. More recent approaches follow a query-instance-based approach [4] or use support vector machines to learn a two-class classifier. The approach presented here is similar to the approach presented in [4], because it also follows a nearest neighbor search for each query image, but
instead of using only the best matching query/database image combination, we consider
all query images (positive and negative) jointly. The nearest-neighbor approach is
appealing, because nowadays, most CBIR systems are based on nearest neighbor
searches on arbitrary vectorial features rather than restricting the features to sparse
binary feature spaces as they are used in the GNU Image Finding Tool (GIFT) [6] and
therefore the technique can easily be integrated into most image retrieval systems. Here,
we present a technique that has both advantages: the individual queries (i.e. positive and
negative images from relevance feedback) are considered independently and in a nearest
neighbor-like manner, but machine-learning techniques are applied to optimize the
search criterion for the nearest neighbors.
The learning step consists of tuning the weights of the distance function used in the nearest neighbor search. The learning of weights for nearest-neighbor searches (using the L2-norm) was investigated in [7], where a weight for each component of the vectors to be compared is determined from labeled training data. In contrast to this work, here we use the L1-norm, which is known to be better suited for histograms. Additionally, here the amount of available training data is much smaller than normally used for distance learning [7].
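
As a concrete sketch of the distance function meant here, the weighted L1 distance between two histogram feature vectors can be written as below. Only the distance itself is shown; the rule for learning the per-component weights from the feedback images is not reproduced.

// Sketch: weighted L1 distance between two equal-length feature vectors.
// The weights are assumed to come from the learning step described in the text.
using System;

static class Distances
{
    public static double WeightedL1(double[] a, double[] b, double[] weights)
    {
        if (a.Length != b.Length || a.Length != weights.Length)
            throw new ArgumentException("Feature vectors and weights must have the same length.");

        double distance = 0.0;
        for (int i = 0; i < a.Length; i++)
            distance += weights[i] * Math.Abs(a[i] - b[i]);
        return distance;
    }
}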
Relevance Feedback
In the following, we present our new methods for relevance feedback and compare them to the relevance score [4] and Rocchio's method [8]. The methods presented can use negative feedback, but also work without it. The number of feedback images is not limited, so a user is free to mark as many images as relevant/non-relevant as he likes; a user who marks many images is therefore likely to obtain better results than a user who only marks a few images.
In our setup, given an initial query, the user is presented with the top-ranked 20 images from the database, and can choose to mark images as relevant, non-relevant, or not mark them at all. These judgements are then sent back to the retrieval system, which can use the provided information to refine the results and to provide the user with a new set of (hopefully) better results. In each iteration of relevance feedback, the system can use machine learning techniques to refine its similarity measure and thus improve the results. In Section 3, we present the new distance learning technique, which can be used in combination with all retrieval schemes.
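
To make the combination of positive and negative feedback images concrete, one plausible nearest-neighbour scoring scheme is sketched below: each database image is ranked by its distance to the nearest relevant image relative to its distance to the nearest irrelevant image. This is only an illustrative scheme in the spirit of the relevance-score approach cited above, not the exact formula from [4].

// Sketch: rank database images using the nearest positive and nearest negative feedback image.
// Assumes at least one positive example (e.g. the original query image).
using System;
using System.Collections.Generic;
using System.Linq;

static class FeedbackRanking
{
    public static IEnumerable<int> Rank(
        IList<double[]> database,
        IList<double[]> positives,
        IList<double[]> negatives,
        Func<double[], double[], double> distance)
    {
        return Enumerable.Range(0, database.Count)
            .OrderBy(i =>
            {
                double dPos = positives.Min(p => distance(database[i], p));
                // Without negative examples the method degenerates to plain nearest-neighbour search.
                double dNeg = negatives.Count > 0
                    ? negatives.Min(n => distance(database[i], n))
                    : 1.0;
                return dPos / (dPos + dNeg + 1e-12);   // smaller score = considered more relevant
            });
    }
}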

CHAPTER 4

SYSTEM ANALYSIS

4.1 Introduction
After analyzing the requirements of the task to be performed, the next step is to analyze the problem and understand its context. The first activity in this phase is studying the existing system, and the other is understanding the requirements and domain of the new system. Both activities are equally important, but the first activity serves as the basis for giving the functional specifications and then the successful design of the proposed system. Understanding the properties and requirements of a new system is more difficult and requires creative thinking; understanding the existing running system is also difficult, and improper understanding of the present system can lead to diversion from the solution.

4.2 Analysis Model

Systems Development Life Cycle (SDLC) is any logical process used by a


systems analyst to develop an information system, including requirements, validation, training and user ownership. Any SDLC should result in a high-quality system that meets or exceeds customer expectations, reaches completion within time and cost estimates, works effectively and efficiently in the current and planned information technology infrastructure, and is inexpensive to maintain and cost-effective to enhance.

Computer systems have become more complex and often (especially with the advent of Service-Oriented Architecture) link multiple traditional systems potentially supplied by different software vendors. To manage this level of complexity, a number of system development life cycle (SDLC) models have been created: "waterfall", "fountain", "spiral", "build and fix", "rapid prototyping", and "synchronize and stabilize".

SDLC models can be described along a spectrum of agile to iterative to sequential.


Agile methodologies, such as XP and Scrum, focus on lightweight processes which allow for rapid changes along the development cycle. Iterative methodologies, such as the Rational Unified Process and DSDM, focus on limited project scopes and on expanding or improving products through multiple iterations. Sequential or big-design-up-front (BDUF) models, such as waterfall, focus on complete and correct planning to guide large projects and manage risks towards successful and predictable results.

Some agile and iterative proponents confuse the term SDLC with sequential or "more traditional" processes; however, SDLC is an umbrella term for all methodologies for the design, implementation, and release of software.

In project management, a project has both a life cycle and a "systems development life cycle", during which a number of typical activities occur. The project life cycle (PLC) encompasses all the activities of the project, while the systems development life cycle focuses on realizing the product requirements.

Fig 4.1 Software development life cycle

4.3 System Requirements


Hardware Requirements:
• SYSTEM : Pentium IV 2.4 GHz
• HARD DISK : 40 GB
• FLOPPY DRIVE : 1.44 MB
• MONITOR : 15 VGA colour
• MOUSE : Logitech.
• RAM : 256 MB
• KEYBOARD : 110 keys enhanced.

Software Requirements:
• Operating system : Windows XP Professional
• Front End : Microsoft Visual Studio .Net 2008
• Coding Language : C#.

4.4 Input and Output


The major inputs, outputs and functions of the system are as follows:
Inputs:
During execution, the inputs taken are a set of images and a query image to compare with the set.
Outputs:
The output is the set of relevant images matching the user's search. One can see that we have selected concepts of different levels of complexity. The performance ranges from a few percent mean average precision to 89%. The concepts that are the most difficult to retrieve are very small and/or have a highly diversified visual content.
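
For reference, mean average precision can be computed from a ranked result list as in the sketch below, assuming the ground-truth relevance of each returned image is known; MAP is simply the mean of the per-query average precisions.

// Sketch: average precision for one query over a ranked result list.
using System.Collections.Generic;
using System.Linq;

static class Metrics
{
    public static double AveragePrecision(IList<bool> isRelevantAtRank, int totalRelevant)
    {
        double hits = 0.0, sumPrecision = 0.0;
        for (int rank = 0; rank < isRelevantAtRank.Count; rank++)
        {
            if (!isRelevantAtRank[rank]) continue;
            hits++;
            sumPrecision += hits / (rank + 1);   // precision at each rank where a relevant image appears
        }
        return totalRelevant > 0 ? sumPrecision / totalRelevant : 0.0;
    }

    public static double MeanAveragePrecision(IEnumerable<double> perQueryAveragePrecision)
    {
        return perQueryAveragePrecision.DefaultIfEmpty(0.0).Average();
    }
}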

CHAPTER 5

FEASIBILITY REPORT

Preliminary investigation examines project feasibility, the likelihood that the system will be useful to the organization. The main objective of the feasibility study is to test the technical, operational and economical feasibility of adding new modules and debugging the old running system. Any system is feasible given unlimited resources and infinite time. There are three aspects in the feasibility study portion of the preliminary investigation:
• Technical Feasibility
• Operational Feasibility
• Economical Feasibility

5.1 Economic Feasibility

Economic feasibility attempts to weigh the costs of developing and implementing a new system against the benefits that would accrue from having the new system in place. This feasibility study gives the top management the economic justification for the new system. A simple economic analysis which gives the actual comparison of costs and benefits is much more meaningful in this case. In addition, it proves to be a useful point of reference to compare actual costs as the project progresses. There could be various types of intangible benefits on account of automation. These could include increased customer satisfaction, improvement in product quality, better decision making, timeliness of information, expediting of activities, improved accuracy of operations, better documentation and record keeping, faster retrieval of information, and better employee morale.

5.2 Operational Feasibility

A proposed project is beneficial only if it can be turned into an information system that will meet the organization's operating requirements. Simply stated, this test of feasibility asks whether the system will work when it is developed and installed. Are there major barriers to implementation? Here are questions that will help test the operational feasibility of a project:

Is there sufficient support for the project from management and from users? If the current system is well liked and used to the extent that persons will not be able to see reasons for change, there may be resistance.

Are the current business methods acceptable to the users? If they are not, users may welcome a change that will bring about a more operational and useful system.

Have the users been involved in the planning and development of the project? Early involvement reduces the chances of resistance to the system in general and increases the likelihood of a successful project.

Since the proposed system was built to help reduce the hardships encountered in the existing manual system, the new system was considered operationally feasible.

5.3 Technical Feasibility

Evaluating the technical feasibility is the trickiest part of a feasibility study. This is because, at this point in time, not many detailed designs of the system are available, making it difficult to assess issues like performance and costs (on account of the kind of technology to be deployed). A number of issues have to be considered while doing a technical analysis.

We must understand the different technologies involved in the proposed system: before commencing the project we have to be very clear about which technologies are required for the development of the new system. We must also find out whether the organization currently possesses the required technologies. Is the required technology available within the organization?

CHAPTER 6

SOFTWARE REQUIREMENT SPECIFICATION

6.1 Introduction
Purpose: The main purpose of preparing this document is to give a general insight into the analysis and requirements of the existing system or situation and to determine the operating characteristics of the system.
Scope: This document plays a vital role in the software development life cycle (SDLC) as it describes the complete requirements of the system. It is meant for use by the developers and will be the basis for the testing phase. Any changes made to the requirements in the future will have to go through a formal change approval process.

Developer's responsibilities overview:


The developer is responsible for:
• Developing the system, which meets the SRS and solves all the requirements of the system.
• Demonstrating the system and installing the system at client's location after the
acceptance testing is successful.
• Submitting the required user manual describing the system interfaces to work on
it and also the documents of the system.
• Conducting any user training that might be needed for using the system.
• Maintaining the system for a period of one year after installation.

6.2 Functional Requirements


Output Design
Outputs from computer systems are required primarily to communicate the
results of processing to users. They are also used to provide a permanent copy of the
results for later consultation. The various types of outputs in general are:
• External outputs, whose destination is outside the organization.
• Internal outputs, whose destination is within the organization and which serve as the user's main interface with the computer.
• Operational outputs, whose use is purely within the computer department.
• Interface outputs, which involve the user in communicating directly with the system.

Output Definition

The outputs should be defined in terms of the following points:

• Type of the output
• Content of the output
• Format of the output
• Location of the output
• Frequency of the output
• Volume of the output
• Sequence of the output
It is not always desirable to print or display data as it is held on a computer. It should be decided which form of the output is the most suitable.

Input Design
Input design is a part of the overall system design. The main objectives during input design are as given below:
• To produce a cost-effective method of input.
• To achieve the highest possible level of accuracy.
• To ensure that the input is acceptable and understood by the user.

Input Stages:
The main input stages can be listed as below:
• Data recording
• Data transcription
• Data conversion
• Data verification
• Data control
• Data transmission

• Data validation
• Data correction

Input types:
It is necessary to determine the various types of inputs. Inputs can be categorized as
follows:
• External inputs, which are prime inputs for the system.
• Internal inputs, which are user communications with the system.
• Operational, which are the computer department's communications to the system.
• Interactive, which are inputs entered during a dialogue.

Input media:
At this stage a choice has to be made about the input media. To conclude about the input media, consideration has to be given to:
• Type of input
• Flexibility of format
• Speed
• Accuracy
• Verification methods
• Rejection rates
• Ease of correction
• Storage and handling requirements
• Security
• Easy to use
• Portability
Keeping in view the above description of the input types and input media, it can be said that most of the inputs are of the internal and interactive form. As input data is to be directly keyed in by the user, the keyboard can be considered the most suitable input device.

Error Avoidance
At this stage care is to be taken to ensure that the input data remains accurate from the stage at which it is recorded up to the stage at which the data is accepted by the system. This can be achieved only by means of careful control each time the data is handled.

Error Detection
Even though every effort is made to avoid the occurrence of errors, a small proportion of errors is still likely to occur. These types of errors can be discovered by using validations to check the input data.

Data Validation
Procedures are designed to detect errors in data at a lower level of detail. Data validations have been included in the system in almost every area where there is a possibility for the user to commit errors. The system will not accept invalid data. Whenever invalid data is keyed in, the system immediately prompts the user, and the user has to key in the data again; the system will accept the data only if it is correct. Validations have been included where necessary.
The system is designed to be a user-friendly one. In other words, the system has been designed to communicate effectively with the user. The system has been designed with popup menus.
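
A minimal sketch of the kind of validation described here, rejecting an image path that does not exist or cannot be loaded, is given below; the exact checks and messages used in the application are not reproduced.

// Sketch: validate a user-supplied image path before the system accepts it.
using System;
using System.Drawing;
using System.IO;

static class InputValidation
{
    public static bool TryLoadImage(string path, out Bitmap image, out string error)
    {
        image = null;
        error = null;

        if (string.IsNullOrEmpty(path) || !File.Exists(path))
        {
            error = "The file does not exist. Please key in a valid path.";
            return false;
        }

        try
        {
            image = new Bitmap(path);   // throws if the file cannot be read as an image
            return true;
        }
        catch (Exception)
        {
            error = "The file could not be read as an image. Please try again.";
            return false;
        }
    }
}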

User Interface Design


It is essential to consult the system users and discuss their needs while designing
the user interface:

User interface systems can be broadly classified as:

1. User-initiated interfaces, in which the user is in charge, controlling the progress of the user/computer dialogue.
2. Computer-initiated interfaces, in which the computer selects the next stage in the interaction and guides the progress of the user/computer dialogue. Information is displayed, and based on the user's response the computer takes action or displays further information.

User Initiated interfaces


User-initiated interfaces fall into two approximate classes:
1. Command-driven interfaces: In this type of interface the user inputs commands or queries which are interpreted by the computer.
2. Forms-oriented interfaces: The user calls up an image of the form on his/her screen and fills in the form. The forms-oriented interface was chosen because it is the best choice for this system.

Computer initiated interfaces


The following computer-initiated interfaces were used:
1. The menu system, in which the user is presented with a list of alternatives and chooses one of them.
2. The question-answer type dialog system, in which the computer asks a question and takes action on the basis of the user's reply.
Right from the start the system is menu driven: the opening menu displays the available options, and choosing one option gives another popup menu with more options. In this way every option leads the user to a data entry form where the user can key in the data.

Error message Design


The design of error messages is an important part of the user interface design. As the user is bound to commit some errors while using the system, the system should be designed to be helpful by providing the user with information regarding the error he/she has committed. This application must be able to produce output at different modules for different inputs.

6.3 Performance Requirements


Performance is measured in terms of the output provided by the application. Requirement specification plays an important part in the analysis of a system. Only when the requirement specifications are properly given is it possible to design a system which will fit into the required environment. It rests largely with the users of the existing system to give the requirement specifications, because they are the people who will finally use the system. This is because the requirements have to be known during the initial stages so that the system can be designed according to those requirements. It is very difficult to change the system once it has been designed and, on the other hand, designing a system which does not cater to the requirements of the user is of no use. The requirement specifications for any system can be broadly stated as given below:
• The system should be able to interface with the existing system
• The system should be accurate
• The system should be better than the existing system
The existing system is completely dependent on the user to perform all the duties.

CHAPTER 7

SOFTWARE PROFILE

7.1 INTRODUCTION TO .NET FRAMEWORK

The Microsoft .NET Framework is a software technology that is available with


several Microsoft Windows operating systems. It includes a large library of pre-coded
solutions to common programming problems and a virtual machine that manages the
execution of programs written specifically for the framework. The .NET Framework is a
key Microsoft offering and is intended to be used by most new applications created for
the Windows platform.

The pre-coded solutions that form the framework's Base Class Library cover a large
range of programming needs in a number of areas, including user interface, data access,
database connectivity, cryptography, web application development, numeric
algorithms, and network communications. The class library is used by programmers,
who combine it with their own code to produce applications.

Programs written for the .NET Framework execute in a software environment that
manages the program's runtime requirements. Also part of the .NET Framework, this
runtime environment is known as the Common Language Runtime (CLR). The CLR
provides the appearance of an application virtual machine so that programmers need
not consider the capabilities of the specific CPU that will execute the program. The CLR
also provides other important services such as security, memory management, and
exception handling. The class library and the CLR together compose the .NET
Framework.

PRINCIPAL DESIGN FEATURES:

Interoperability
Because interaction between new and older applications is commonly required,
the .NET Framework provides means to access functionality that is implemented in
programs that execute outside the .NET environment. Access to COM components is
provided in the System.Runtime.InteropServices and System.EnterpriseServices
namespaces of the framework; access to other functionality is provided using the
P/Invoke feature.

Common Runtime Engine


The Common Language Runtime (CLR) is the virtual machine component of
the .NET framework. All .NET programs execute under the supervision of the CLR,
guaranteeing certain properties and behaviors in the areas of memory management,
security, and exception handling.

Base Class Library


The Base Class Library (BCL), part of the Framework Class Library (FCL), is a
library of functionality available to all languages using the .NET Framework. The BCL
provides classes which encapsulate a number of common functions, including file
reading and writing, graphic rendering, database interaction and XML document
manipulation.

Simplified Deployment
Installation of computer software must be carefully managed to ensure that it
does not interfere with previously installed software, and that it conforms to security
requirements. The .NET framework includes design features and tools that help address
these requirements.

Security
The design is meant to address some of the vulnerabilities, such as buffer
overflows, that have been exploited by malicious software. Additionally, .NET provides
a common security model for all applications.

Portability
The design of the .NET Framework allows it to theoretically be platform
agnostic, and thus cross-platform compatible. That is, a program written to use the
framework should run without change on any type of system for which the framework is
implemented. Microsoft's commercial implementations of the framework cover
Windows, Windows CE, and the Xbox 360. In addition, Microsoft submits the
specifications for the Common Language Infrastructure (which includes the core class
libraries, Common Type System, and the Common Intermediate Language), the C#
language, and the C++/CLI language to both ECMA and the ISO, making them available
as open standards. This makes it possible for third parties to create compatible
implementations of the framework and its languages on other platforms.

Architecture

Fig 7.1 Visual overview of the Common Language Infrastructure (CLI)

Common Language Infrastructure


The core aspects of the .NET framework lie within the Common Language
Infrastructure, or CLI. The purpose of the CLI is to provide a language-neutral platform
for application development and execution, including functions for exception handling,
garbage collection, security, and interoperability. Microsoft's implementation of the CLI
is called the Common Language Runtime or CLR.

Assemblies
The intermediate CIL code is housed in .NET assemblies. As mandated by
specification, assemblies are stored in the Portable Executable (PE) format, common on
the Windows platform for all DLL and EXE files. The assembly consists of one or more
files, one of which must contain the manifest, which has the metadata for the assembly.
The complete name of an assembly (not to be confused with the filename on disk)
contains its simple text name, version number, culture, and public key token. The public
key token is a unique hash generated when the assembly is compiled, thus two
assemblies with the same public key token are guaranteed to be identical from the point
of view of the framework. A private key can also be specified known only to the creator
of the assembly and can be used for strong naming and to guarantee that the assembly is
from the same author when a new version of the assembly is compiled (required to add
an assembly to the Global Assembly Cache).
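
The parts of an assembly name described above can be inspected at run time through reflection, for example:

// Sketch: print an assembly's full name (simple name, version, culture, public key token).
using System;
using System.Reflection;

static class AssemblyNameDemo
{
    static void Main()
    {
        Assembly assembly = typeof(string).Assembly;   // mscorlib in the .NET Framework
        Console.WriteLine(assembly.FullName);
        // e.g. "mscorlib, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
    }
}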

Metadata
All CIL code is self-describing through .NET metadata. The CLR checks the metadata
to ensure that the correct method is called. Metadata is usually generated by language
compilers but developers can create their own metadata through custom attributes.
Metadata contains information about the assembly, and is also used to implement the
reflective programming capabilities of .NET Framework.
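
A short sketch of developer-created metadata, a custom attribute defined and then read back through reflection, is shown below; the attribute name is purely illustrative.

// Sketch: define a custom attribute and read it back via reflection.
using System;

[AttributeUsage(AttributeTargets.Class)]
public class AuthorAttribute : Attribute
{
    public string Name { get; private set; }
    public AuthorAttribute(string name) { Name = name; }
}

[Author("Image Retrieval Team")]
public class ImageIndexer { }

public static class MetadataDemo
{
    public static void Main()
    {
        object[] attributes = typeof(ImageIndexer).GetCustomAttributes(typeof(AuthorAttribute), false);
        foreach (AuthorAttribute a in attributes)
            Console.WriteLine("Author recorded in metadata: " + a.Name);
    }
}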

Security
.NET has its own security mechanism with two general features: Code Access
Security (CAS), and validation and verification. Code Access Security is based on
evidence that is associated with a specific assembly. Typically the evidence is the source
of the assembly (whether it is installed on the local machine or has been downloaded
from the intranet or Internet). Code Access Security uses evidence to determine the
permissions granted to the code. Other code can demand that calling code is granted a
specified permission. The demand causes the CLR to perform a call stack walk: every
assembly of each method in the call stack is checked for the required permission; if any
assembly is not granted the permission a security exception is thrown.
When an assembly is loaded the CLR performs various tests. Two such tests are
validation and verification. During validation the CLR checks that the assembly contains
valid metadata and CIL, and whether the internal tables are correct. Verification is not so
exact. The verification mechanism checks to see if the code does anything that is 'unsafe'.
The algorithm used is quite conservative; hence occasionally code that is 'safe' does not
pass. Unsafe code will only be executed if the assembly has the 'skip verification'
permission, which generally means code that is installed on the local machine.
.NET Framework uses appdomains as a mechanism for isolating code running in
a process. Appdomains can be created and code loaded into or unloaded from them
independent of other appdomains. This helps increase the fault tolerance of the
application, as faults or crashes in one appdomain do not affect the rest of the application.
Appdomains can also be configured independently with different security privileges.
This can help increase the security of the application by isolating potentially unsafe code.
The developer, however, has to split the application into subdomains; it is not done by
the CLR.
Class library
Namespaces in the BCL
System
System.CodeDom
System.Collections
System.Diagnostics
System.Globalization
System.IO
System.Resources
System.Text
System.Text.RegularExpressions

Microsoft .NET Framework includes a set of standard class libraries. The class library is organized in a hierarchy of namespaces. Most of the built-in APIs are part of either the System.* or Microsoft.* namespaces. It encapsulates a large number of common functions, such as file reading and writing, graphic rendering, database interaction, and XML document manipulation, among others. The .NET class libraries are available to all .NET languages. The .NET Framework class library is divided into two parts: the Base Class Library and the Framework Class Library. The Base Class Library (BCL) includes a small subset of the entire class library and is the core set of classes that serve as the basic API of the Common Language Runtime. The classes in mscorlib.dll and some of the classes in System.dll and System.Core.dll are considered to be a part of the BCL. The BCL classes are available in the .NET Framework as well as in its alternative implementations, including the .NET Compact Framework, Microsoft Silverlight and Mono. The Framework Class Library (FCL) is a superset of the BCL classes and refers to the entire class library that ships with the .NET Framework. It includes an expanded set of libraries, including WinForms, ADO.NET, ASP.NET, Language Integrated Query, Windows Presentation Foundation and Windows Communication Foundation, among others. The FCL is much larger in scope than the standard libraries of languages like C++, and comparable in scope to the standard libraries of Java.

Memory management
The .NET Framework CLR frees the developer from the burden of managing
memory (allocating and freeing up when done); instead it does the memory management
itself. To this end, the memory allocated to instantiations of .NET types (objects) is done
contiguously from the managed heap, a pool of memory managed by the CLR. As long
as there exists a reference to an object, which might be either a direct reference to an
object or via a graph of objects, the object is considered to be in use by the CLR. When
there is no reference to an object, and it cannot be reached or used, it becomes garbage.
However, it still holds on to the memory allocated to it. .NET Framework includes a
garbage collector which runs periodically, on a separate thread from the application's
thread, that enumerates all the unusable objects and reclaims the memory allocated to
them.
The .NET Garbage Collector (GC) is a non-deterministic, compacting, mark-and-
sweep garbage collector. The GC runs only when a certain amount of memory has been
used or there is enough pressure for memory on the system. Since it is not guaranteed
when the conditions to reclaim memory are reached, the GC runs are non-deterministic.
Each .NET application has a set of roots, which are pointers to objects on the managed
heap (managed objects). These include references to static objects and objects defined as
local variables or method parameters currently in scope, as well as objects referred to by
CPU registers. When the GC runs, it pauses the application, and for each object referred
to in the root, it recursively enumerates all the objects reachable from the root objects
and marks them as reachable. It uses .NET metadata and reflection to discover the
objects encapsulated by an object, and then recursively walks them. It then enumerates all
the objects on the heap (which were initially allocated contiguously) using reflection. All
objects not marked as reachable are garbage. This is the mark phase. Since the memory
held by garbage is not of any consequence, it is considered free space. However, this
leaves chunks of free space between objects which were initially contiguous. The objects
are then compacted together, by using memcpy to copy them over to the free space to
make them contiguous again. Any reference to an object invalidated by moving the
object is updated to reflect the new location by the GC. The application is resumed after
the garbage collection is over.
The GC used by .NET Framework is actually generational. Objects are assigned
a generation; newly created objects belong to Generation 0. The objects that survive a
garbage collection are tagged as Generation 1, and the Generation 1 objects that survive
another collection are Generation 2 objects. The .NET Framework uses up to Generation
2 objects. Higher generation objects are garbage collected less frequently than lower
generation objects. This helps increase the efficiency of garbage collection, as older
objects tend to have a larger lifetime than newer objects. Thus, by removing older (and
thus more likely to survive a collection) objects from the scope of a collection run, fewer
objects need to be checked and compacted.
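
The generational behaviour described above can be observed with the GC class, as in the sketch below; the generation numbers printed are typical, not guaranteed, since collections are non-deterministic.

// Sketch: observing object generations (forcing a collection is for illustration only).
using System;

static class GcDemo
{
    static void Main()
    {
        object o = new object();
        Console.WriteLine(GC.GetGeneration(o));   // 0: newly created objects start in Generation 0

        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine(GC.GetGeneration(o));   // typically 1 after surviving one collection

        GC.Collect();
        Console.WriteLine(GC.GetGeneration(o));   // typically 2, the highest generation used
    }
}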
Versions
Microsoft started development on the .NET Framework in the late 1990s
originally under the name of Next Generation Windows Services (NGWS). By late 2000
the first beta versions of .NET 1.0 were released.

Fig 7.2 The .NET Framework stack.

Version   Version Number   Release Date
1.0       1.0.3705.0       2002-01-05
1.1       1.1.4322.573     2003-04-01
2.0       2.0.50727.42     2005-11-07
3.0       3.0.4506.30      2006-11-06
3.5       3.5.21022.8      2007-11-09

Fig 7.3 Different versions

7.2 C#.NET

The Relationship of C# to .NET:


C# is a new programming language, and is significant in two respects:
• It is specifically designed and targeted for use with Microsoft's .NET Framework
(a feature rich platform for the development, deployment, and execution of
distributed applications).

• It is a language based upon the modern object-oriented design methodology, and when designing it Microsoft was able to learn from the experience of all the other similar languages that have been around in the 20 years or so since object-oriented principles came to prominence.

One important thing to make clear is that C# is a language in its own right. Although it is designed to generate code that targets the .NET environment, it is not itself part of .NET. There are some features that are supported by .NET but not by C#, and you might be surprised to learn that there are features of the C# language that are not supported by .NET (for example, some instances of operator overloading).

However, since the C# language is intended for use with .NET, it is important for us
to have an understanding of this Framework if we wish to develop applications in C#
effectively.

The Common Language Runtime:


Central to the .NET framework is its run-time execution environment, known as
the Common Language Runtime (CLR) or the .NET runtime. Code running under the
control of the CLR is often termed managed code.
However, before it can be executed by the CLR, any source code that we develop (in C#
or some other language) needs to be compiled. Compilation occurs in two steps in .NET:
1. Compilation of source code to Microsoft Intermediate Language (MS-IL)
2. Compilation of IL to platform-specific code by the CLR
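
A minimal program makes the two steps concrete; the command names in the comments assume the standard .NET Framework tools (csc.exe for step 1, with step 2 performed automatically by the CLR at run time).

// Sketch of the two-step pipeline.
// Step 1: "csc Program.cs" compiles this C# source into MSIL inside Program.exe.
// Step 2: when Program.exe runs, the CLR JIT-compiles the IL to native code for the local CPU.
using System;

class Program
{
    static void Main()
    {
        Console.WriteLine("Compiled first to IL, then JIT-compiled by the CLR.");
    }
}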

At first sight this might seem a rather long-winded compilation process. Actually,
this two-stage compilation process is very important, because the existence of the
Microsoft Intermediate Language (managed code) is the key to providing many of the
benefits of .NET. Let's see why.

Advantages of Managed Code:


Microsoft Intermediate Language (often shortened to "Intermediate Language",
or "IL") shares with Java byte code the idea that it is a low-level language with a simple
syntax (based on numeric codes rather than text), which can be very quickly translated
into native machine code. Having this well-defined universal syntax for code has significant advantages.

Platform Independence
First, it means that the same file containing byte code instructions can be placed
on any platform; at runtime the final stage of compilation can then be easily
accomplished so that the code will run on that particular platform. In other words, by
compiling to Intermediate Language we obtain platform independence for .NET, in much
the same way as compiling to Java byte code gives Java platform independence.

You should note that the platform independence of .NET is only theoretical at
present because, at the time of writing, .NET is only available for Windows. However,
porting .NET to other platforms is being explored (see for example the Mono project, an
effort to create an open source implementation of .NET, at http://www.go-mono.com/).

Performance Improvement
Although we previously made comparisons with Java, IL is actually a bit more
ambitious than Java byte code. Significantly, IL is always Just-In-Time compiled,
whereas Java byte code was often interpreted. One of the disadvantages of Java was that,
on execution, the process of translating from Java byte code to native executable resulted
in a loss of performance (apart from more recent cases, where Java is JIT-compiled on certain platforms).
Instead of compiling the entire application in one go (which could lead to a slow
start-up time), the JIT compiler simply compiles each portion of code as it is called (just-
in-time). When code has been compiled once, the resultant native executable is stored
until the application exits, so that it does not need to be recompiled the next time that
portion of code is run. Microsoft argues that this process is more efficient than compiling
the entire application code at the start, because of the likelihood that large portions of
any application code will not actually be executed in any given run. Using the JIT
compiler, such code will never get compiled.

This explains why we can expect that execution of managed IL code will be
almost as fast as executing native machine code. What it doesn't explain is why
Microsoft expects that we will get a performance improvement. The reason given for this
is that, since the final stage of compilation takes place at run time, the JIT compiler will
know exactly what processor type the program will run on. This means that it can
optimize the final executable code to take advantage of any features or particular
machine code instructions offered by that particular processor.
Traditional compilers will optimize the code, but they can only perform
optimizations that will be independent of the particular processor that the code will run
on. This is because traditional compilers compile to native executable before the
software is shipped. This means that the compiler doesn't know what type of processor
the code will run on beyond basic generalities, such as that it will be an x86-compatible
processor or an Alpha processor. Visual Studio 6, for example, optimizes for a generic
Pentium machine, so the code that it generates cannot take advantage of hardware
features of Pentium III processors. On the other hand, the JIT compiler can do all the
optimizations that Visual Studio 6 can, and in addition to that it will optimize for the
particular processor the code is running on.

Language Interoperability
We have seen how the use of IL enables platform independence, and how JIT compilation should improve performance. However, IL also facilitates language interoperability.
Simply put, you can compile to IL from one language, and this compiled code should
then be interoperable with code that has been compiled to IL from another language.

Intermediate Language
From what we learned in the previous section, Intermediate Language obviously
plays a fundamental role in the .NET Framework. As C# developers, we now understand
that our C# code will be compiled into Intermediate Language before it is executed
(indeed, the C# compiler only compiles to managed code). It makes sense, then, that we
should now take a closer look at the main characteristics of IL, since any language that
targets .NET would logically need to support the main characteristics of IL too.

Here are the important features of the Intermediate Language:

• Object-orientation and use of interfaces


• Strong distinction between value and reference types
• Strong data typing
• Error handling through the use of exceptions
• Use of attributes

Support of Object Orientation and Interfaces


The language independence of .NET does have some practical limits. In
particular, IL, however it is designed, is inevitably going to implement some particular
programming methodology, which means that languages targeting it are going to have to
be compatible with that methodology. The particular route that Microsoft has chosen to
follow for IL is that of classic object-oriented programming, with single implementation
inheritance of classes.
Besides classic object-oriented programming, Intermediate Language also brings in the idea of interfaces, which saw their first implementation under Windows with COM. .NET interfaces are not the same as COM interfaces; they do not need to support any of the COM infrastructure (for example, they are not derived from IUnknown, and they do not have associated GUIDs). However, they do share with COM interfaces the
idea that they provide a contract, and classes that implement a given interface must
provide implementations of the methods and properties specified by that interface.
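
A small sketch of this contract idea in C# is given below; the IFeatureExtractor interface and its members are hypothetical names chosen for illustration.

// Sketch: a .NET interface is a contract with no IUnknown or GUIDs involved;
// the implementing class must supply every member the interface declares.
public interface IFeatureExtractor
{
    string Name { get; }
    double[] Extract(string imagePath);
}

public class DummyExtractor : IFeatureExtractor
{
    public string Name { get { return "Dummy"; } }

    public double[] Extract(string imagePath)
    {
        // A real extractor would compute features from the image; this only illustrates the contract.
        return new double[0];
    }
}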

Object Orientation and Language Interoperability


Working with .NET means compiling to the Intermediate Language, and that in
turn means that you will need to be programming using traditional object-oriented
methodologies. That alone is not, however, sufficient to give us language
interoperability. After all, C++ and Java both use the same object-oriented paradigms,
but they are still not regarded as interoperable. We need to look a little more closely at
the concept of language interoperability.
An associated problem was that, when debugging, you would still have to
independently debug components written in different languages. It was not possible to
step between languages in the debugger. So what we really mean by language
interoperability is that classes written in one language should be able to talk directly to
classes written in another language. In particular:
• A class written in one language can inherit from a class written in another
language
• The class can contain an instance of another class, no matter what the
languages of the two classes are
• An object can directly call methods against another object written in
another language
• Objects (or references to objects) can be passed around between methods
• When calling methods between languages we can step between the method calls in the debugger, even where this means stepping between source code written in different languages
This is all quite an ambitious aim, but amazingly, .NET and the Intermediate
Language have achieved it. For the case of stepping between methods in the debugger,
this facility is really offered by the Visual Studio .NET IDE rather than from the CLR
itself.

Strong Data Typing


One very important aspect of IL is that it is based on exceptionally strong data
typing. What we mean by that is that all variables are clearly marked as being of a
particular, specific data type (there is no room in IL, for example, for the Variant data
type recognized by Visual Basic and scripting languages). In particular, IL does not
normally permit any operations that result in ambiguous data types.
For instance, VB developers will be used to being able to pass variables around
without worrying too much about their types, because VB automatically performs type
conversion. C++ developers will be used to routinely casting pointers between different
types. Being able to perform this kind of operation can be great for performance, but it
breaks type safety. Hence, it is permitted only in very specific circumstances in some of
the languages that compile to managed code. Indeed, pointers (as opposed to references)
are only permitted in marked blocks of code in C#, and not at all in VB (although they
are allowed as normal in managed C++). Using pointers in your code will immediately
cause it to fail the memory type safety checks performed by the CLR.
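
As a sketch, the marked blocks mentioned above look like this in C# (the assembly must also be compiled with the /unsafe option):

// Sketch: pointers are only permitted inside code explicitly marked "unsafe".
public static class UnsafeDemo
{
    public static unsafe int ReadFirstElement(int[] data)   // data assumed non-empty
    {
        fixed (int* p = data)   // pin the array so the garbage collector cannot move it
        {
            return *p;          // raw pointer dereference, outside the normal type-safe checks
        }
    }
}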

You should note that some languages compatible with .NET, such as VB.NET,
still allow some laxity in typing, but that is only possible because the compilers behind
the scenes ensure the type safety is enforced in the emitted IL.
Although enforcing type safety might initially appear to hurt performance, in
many cases this is far outweighed by the benefits gained from the services provided
by .NET that rely on type safety. Such services include:
• Language Interoperability

• Garbage Collection

• Security

• Application Domains

Common Type System (CTS)


This data type problem is solved in .NET through the use of the Common Type
System (CTS). The CTS defines the predefined data types that are available in IL, so
that all languages that target the .NET framework will produce compiled code that is
ultimately based on these types.
The CTS doesn't merely specify primitive data types, but a rich hierarchy of
types, which includes well-defined points in the hierarchy at which code is permitted to
define its own types. The hierarchical structure of the Common Type System reflects the
single-inheritance object-oriented methodology of IL, and looks like this:

Fig 7.4 Common type system

Common Language Specification (CLS)


The Common Language Specification works with the Common Type System to
ensure language interoperability. The CLS is a set of minimum standards that all
compilers targeting .NET must support. Since IL is a very rich language, writers of most
compilers will prefer to restrict the capabilities of a given compiler to only support a
subset of the facilities offered by IL and the CTS. That is fine, as long as the compiler
supports everything that is defined in the CLS.
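
A compiler can be asked to check CLS compliance of an assembly's public surface with the CLSCompliant attribute, as in the sketch below; the uint example is illustrative.

// Sketch: requesting CLS-compliance checking for the whole assembly.
using System;

[assembly: CLSCompliant(true)]

public class Calculator
{
    public int Add(int a, int b) { return a + b; }   // int is a CLS-compliant type

    // public uint Add(uint a, uint b) { ... }        // uint is not CLS-compliant; exposing it
    // publicly would produce a compiler warning under [assembly: CLSCompliant(true)].
}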

Garbage Collection
The garbage collector is .NET's answer to memory management, and in
particular to the question of what to do about reclaiming memory that running
applications ask for. Up until now there have been two techniques used on the Windows platform for deallocating memory that processes have dynamically requested from the system:
• Make the application code do it all manually

• Make objects maintain reference counts

The .NET runtime relies on the garbage collector instead. This is a program whose
purpose is to clean up memory. The idea is that all dynamically requested memory is
allocated on the heap (that is true for all languages, although in the case of .NET, the
CLR maintains its own managed heap for .NET applications to use). Every so often,
when .NET detects that the managed heap for a given process is becoming full and
therefore needs tidying up, it calls the garbage collector. The garbage collector runs
through variables currently in scope in your code, examining references to objects stored
on the heap to identify which ones are accessible from your code – that is to say which
objects have references that refer to them. Any objects that are not referred to are
deemed to be no longer accessible from your code and can therefore be removed. Java
uses a similar system of garbage collection to this.
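The reachability idea can be illustrated with the hedged C# sketch below; the explicit call to GC.Collect() is for demonstration only, since in normal applications the CLR decides when a collection is needed.

using System;

class GcDemo
{
    class Buffer
    {
        // A reasonably large allocation so the effect is visible on the managed heap.
        public byte[] Data = new byte[1024 * 1024];
    }

    static void Main()
    {
        Buffer buffer = new Buffer();   // reachable: a live reference exists
        Console.WriteLine("Allocated: {0} bytes in use", GC.GetTotalMemory(false));

        buffer = null;                  // no references remain -> eligible for collection

        GC.Collect();                   // demonstration only
        GC.WaitForPendingFinalizers();

        Console.WriteLine("Collected: {0} bytes in use", GC.GetTotalMemory(true));
    }
}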

Security
.NET can really excel in terms of complementing the security mechanisms
provided by Windows because it can offer code-based security, whereas Windows only
really offers role-based security.
Role-based security is based on the identity of the account under which the
process is running, in other words, who owns and is running the process. Code-based
security on the other hand is based on what the code actually does and on how much the
code is trusted. Thanks to the strong type safety of IL, the CLR is able to inspect code
before running it in order to determine required security permissions. .NET also offers a
mechanism by which code can indicate in advance what security permissions it will
require to run.
The importance of code-based security is that it reduces the risks associated with
running code of dubious origin (such as code that you've downloaded from the Internet).
For example, even if code is running under the administrator account, it is possible to use
code-based security to indicate that that code should still not be permitted to perform
certain types of operation that the administrator account would normally be allowed to
do, such as read or write to environment variables, read or write to the registry, or to
access the .NET reflection features.
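As a hedged illustration of code-based security under the .NET Framework's Code Access Security model (the permission and variable chosen here are only examples), the fragment below demands read access to an environment variable before using it; if the code has not been granted that permission, the CLR throws a SecurityException regardless of the Windows account under which the process runs.

using System;
using System.Security;
using System.Security.Permissions;

class SecurityDemo
{
    static void Main()
    {
        try
        {
            // Demand read access to the TEMP environment variable.
            // The demand walks the call stack; if any caller lacks the
            // permission, a SecurityException is thrown.
            EnvironmentPermission permission =
                new EnvironmentPermission(EnvironmentPermissionAccess.Read, "TEMP");
            permission.Demand();

            Console.WriteLine("TEMP = " + Environment.GetEnvironmentVariable("TEMP"));
        }
        catch (SecurityException)
        {
            Console.WriteLine("This code is not trusted to read environment variables.");
        }
    }
}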

.Net Framework Classes


The .NET base classes are a massive collection of managed code classes that
have been written by Microsoft, and which allow you to do almost any of the tasks that
were previously available through the Windows API. These classes follow the same
object model as used by IL, based on single inheritance. This means that you can either
instantiate objects of whichever .NET base class is appropriate, or you can derive your
own classes from them.
The great thing about the .NET base classes is that they have been designed to be
very intuitive and easy to use. For example, to start a thread, you call the Start() method
of the Thread class. To disable a TextBox, you set the Enabled property of a TextBox
object to false. This approach will be familiar to Visual Basic and Java developers,
whose respective libraries are just as easy to use. It may however come as a great relief
to C++ developers, who for years have had to cope with such API functions as
GetDIBits(), RegisterWndClassEx(), and IsEqualIID(), as well as a whole plethora of
functions that required Windows handles to be passed around.
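The two examples just mentioned take only a few lines of C#; the thread body and control instance below are illustrative only.

using System.Threading;
using System.Windows.Forms;

class BaseClassDemo
{
    static void Worker()
    {
        // Work performed on the background thread would go here.
    }

    static void Main()
    {
        // Starting a thread: wrap a method in a Thread and call Start().
        Thread worker = new Thread(new ThreadStart(Worker));
        worker.Start();

        // Disabling a TextBox: simply set its Enabled property to false.
        TextBox textBox = new TextBox();
        textBox.Enabled = false;

        worker.Join();   // wait for the background thread to finish
    }
}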

Name Spaces
Namespaces are the way that .NET avoids name clashes between classes. They
are designed, for example, to avoid the situation in which you define a class to represent
a customer, name your class Customer, and then someone else does the same thing (quite
a likely scenario – the proportion of businesses that have customers seems to be quite
high).
A namespace is no more than a grouping of data types, but it has the effect that
the names of all data types within a namespace automatically get prefixed with the name
of the namespace. It is also possible to nest namespaces within each other. For example,
most of the general-purpose .NET base classes are in a namespace called System. The
base class Array is in this namespace, so its full name is System.Array.
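The following short C# fragment (the company and class names are hypothetical) shows how a namespace prefixes the types it contains and how namespaces can be nested.

using System;

namespace AcmeSales
{
    namespace Billing
    {
        // Full name of this type: AcmeSales.Billing.Customer
        public class Customer
        {
            public string Name;
        }
    }
}

class NamespaceDemo
{
    static void Main()
    {
        // The namespace prefix keeps this Customer distinct from any
        // other Customer class defined elsewhere.
        AcmeSales.Billing.Customer customer = new AcmeSales.Billing.Customer();
        customer.Name = "Example Ltd";

        // System.Array is the base class Array inside the System namespace.
        Console.WriteLine(typeof(System.Array).FullName);   // prints "System.Array"
    }
}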
Creating .Net Application using C#
C# can be used to create console applications: text-only applications that run in a
DOS window. You'll probably use console applications when unit testing class libraries,
and for creating Unix/Linux daemon processes. However, more often you'll use C# to
create applications that use many of the technologies associated with .NET. In this
section, we'll give you an overview of the different types of application that you can
write in C#.

Creating Windows Forms


Although C# and .NET are particularly suited to web development, they still
offer splendid support for so-called "fat client" apps, applications that have to be
installed on the end-user's machine where most of the processing takes place. This
support is from Windows Forms.
A Windows Form is the .NET answer to a VB 6 Form. To design a graphical
window interface, you just drag controls from a toolbox onto a Windows Form. To
determine the window's behavior, you write event-handling routines for the form's
controls. A Windows Form project compiles to an EXE that must be installed alongside
the .NET runtime on the end user's computer. Like other .NET project types, Windows
Form projects are supported by both VB.NET and C#.
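A minimal Windows Form written by hand looks roughly like the sketch below (in practice the Visual Studio designer generates this code); the window caption, button text, and handler are illustrative only.

using System;
using System.Windows.Forms;

public class MainForm : Form
{
    public MainForm()
    {
        Text = "Image Retrieval";          // window caption

        Button searchButton = new Button();
        searchButton.Text = "Search";
        searchButton.Click += new EventHandler(OnSearchClick);   // event-handling routine

        Controls.Add(searchButton);
    }

    private void OnSearchClick(object sender, EventArgs e)
    {
        MessageBox.Show("Search clicked");
    }

    [STAThread]
    static void Main()
    {
        Application.Run(new MainForm());   // start the message loop for the form
    }
}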

Windows Control
Although Web Forms and Windows Forms are developed in much the same way,
you use different kinds of controls to populate them. Web Forms use Web Controls, and
Windows Forms use Windows Controls.
A Windows Control is a lot like an ActiveX control. After a Windows Control is
implemented, it compiles to a DLL that must be installed on the client's machine. In fact,
the .NET SDK provides a utility that creates a wrapper for ActiveX controls, so that they
can be placed on Windows Forms. As is the case with Web Controls, Windows Control
creation involves deriving from a particular class, System.Windows.Forms.Control.

Windows Services
A Windows Service (originally called an NT Service) is a program that is
designed to run in the background in Windows NT/2000/XP (but not Windows 9x).
Services are useful where you want a program to be running continuously and ready to
respond to events without having been explicitly started by the user. A good example
would be the World Wide Web Service on web servers, which listens for web requests
from clients.
It is very easy to write services in C#. There are .NET Framework base classes
available in the System.ServiceProcess namespace that handle many of the boilerplate
tasks associated with services, and in addition, Visual Studio .NET allows you to create a
C# Windows Service project, which starts you out with the Framework C# source code
for a basic Windows service.
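A hedged minimal sketch of such a service is given below; a real service also needs an installer class and must be registered with the Service Control Manager, which is omitted here, and the service name is purely illustrative.

using System.ServiceProcess;

public class ImageIndexService : ServiceBase
{
    public ImageIndexService()
    {
        ServiceName = "ImageIndexService";   // name shown in the Services console
    }

    protected override void OnStart(string[] args)
    {
        // Start background work here, e.g. begin listening for requests.
    }

    protected override void OnStop()
    {
        // Release resources and stop background work here.
    }

    static void Main()
    {
        // Hand control to the Service Control Manager.
        ServiceBase.Run(new ImageIndexService());
    }
}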

The Role of C# In .Net Enterprise Architecture


C# requires the presence of the .NET runtime, and it will probably be a few years
before most clients – particularly most home machines – have .NET installed. In the
meantime, installing a C# application is likely to mean also installing the .NET
redistributable components. Because of that, it is likely that the first place we will see
many C# applications is in the enterprise environment. Indeed, C# arguably presents an
outstanding opportunity for organizations that are interested in building robust, n-tiered
client-server applications.

Fig 7.5 .Net Enterprise Architecture

CHAPTER 8

SYSTEM DESIGN
8.1 Introduction

Software design sits at the technical kernel of the software engineering process
and is applied regardless of the development paradigm and area of application. Design is
the first step in the development phase for any engineered product or system. The
designer’s goal is to produce a model or representation of an entity that will later be
built. Once system requirements have been specified and analyzed, system design is the
first of the three technical activities - design, code, and test - that are required to build
and verify software.
The importance can be stated with a single word “Quality”. Design is the place
where quality is fostered in software development. Design provides us with
representations of software that can be assessed for quality. Design is the only way that we
can accurately translate a customer’s view into a finished software product or system.
Software design serves as a foundation for all the software engineering steps that follow.
Without a strong design we risk building an unstable system – one that will be difficult
to test, one whose quality cannot be assessed until the last stage.
During design, progressive refinements of the data structure, program structure, and
procedural details are developed, reviewed, and documented. System design can be
viewed from either a technical or a project management perspective. From the technical
point of view, design comprises four activities - architectural design, data structure
design, interface design, and procedural design.

8.2 SYSTEM WORKFLOW

MODULE DESCRIPTION:
1) RGB Projections:

The RGB color model is an additive color model in which red, green, and blue
light are added together in various ways to reproduce a broad array of colors. The name
of the model comes from the initials of the three additive primary colors: red, green, and
blue. The main purpose of the RGB color model is the sensing, representation, and
display of images in electronic systems, and it has also been used in conventional
photography. In this module, the RGB projections are used to find the size of the image
vertically and horizontally.
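A possible implementation sketch of this module is given below (assumed, not the project's exact code): each row and each column of the bitmap is summed over its red, green, and blue components using System.Drawing, giving one vertical and one horizontal projection profile of the image.

using System.Drawing;

class RgbProjections
{
    // Returns the sum of R+G+B over each row (vertical profile)
    // and over each column (horizontal profile) of the bitmap.
    public static void Compute(Bitmap image, out long[] rowProfile, out long[] columnProfile)
    {
        rowProfile = new long[image.Height];
        columnProfile = new long[image.Width];

        for (int y = 0; y < image.Height; y++)
        {
            for (int x = 0; x < image.Width; x++)
            {
                Color pixel = image.GetPixel(x, y);
                int intensity = pixel.R + pixel.G + pixel.B;
                rowProfile[y] += intensity;      // accumulate along the row
                columnProfile[x] += intensity;   // accumulate along the column
            }
        }
    }
}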

2) Image Utility:

While minimizing the classification error is of interest for CBIR, this criterion does not
completely reflect user satisfaction. Other utility criteria closer to it, such as precision,
should provide more efficient selections.
3) Comparable Image:
In this module, a reselection technique is used to speed up the selection process, which
leads to a computational complexity that is negligible compared to the size of the database
for the whole active learning process. All these components are integrated in our retrieval
system, called RETIN. The user gives new labels for images, and they are compared to the
current classification. If the user mostly gives relevant labels, the system should propose
new images for labeling around a higher rank, in order to get more irrelevant labels.
4) Similarity measure:
The results are reported in terms of mean average precision according to the training set
size (we omit the KFD, which gives results very close to inductive SVMs). One can see
that the classification-based methods give the best results, showing the power of
statistical methods over geometrical approaches, such as the similarity refinement
method reported here.
5) Graph:
This module is used to determine relationships between two images. The precision and
recall values are measured by simulating a retrieval scenario. For each simulation, an
image category is randomly chosen. Next, 100 images are selected using active learning
and labeled according to the chosen category. These labeled images are used to train a
classifier, which returns a ranking of the database. The average precision is then
computed using this ranking. These simulations are repeated 1000 times, and all values
are averaged to get the mean average precision (MAP). Finally, we repeat these
simulations ten times to get the mean and the standard deviation of the MAP.
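For clarity, the sketch below (assumed, not taken from the project code) shows how the average precision of a single simulated ranking can be computed; the MAP is then simply the mean of this value over the 1000 simulations.

class EvaluationMetrics
{
    // ranking: database indices ordered by the classifier, best first.
    // isRelevant: ground-truth flag for every image in the database.
    public static double AveragePrecision(int[] ranking, bool[] isRelevant)
    {
        int totalRelevant = 0;
        foreach (bool flag in isRelevant)
        {
            if (flag) totalRelevant++;
        }

        int relevantSoFar = 0;
        double sumOfPrecisions = 0.0;

        for (int rank = 0; rank < ranking.Length; rank++)
        {
            if (isRelevant[ranking[rank]])
            {
                relevantSoFar++;
                // Precision at the position of each relevant image.
                sumOfPrecisions += (double)relevantSoFar / (rank + 1);
            }
        }

        return totalRelevant == 0 ? 0.0 : sumOfPrecisions / totalRelevant;
    }
}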

CHAPTER 9

UML DIAGRAMS

9.1 Use Case Diagram



Fig 9.1 Use case diagram

9.2 Class Diagram


Fig 9.2 Class Diagram

9.3 Activity Diagram



Fig 9.3 Activity Diagram

9.4 Sequence Diagram



Fig 9.4 Sequence Diagram



CHAPTER 10

SYSTEM TESTING

10.1 Introduction
Software testing is a critical element of software quality assurance and represents
the ultimate review of specification, design and coding. In fact, testing is the one step in
the software engineering process that could be viewed as destructive rather than
constructive.

A strategy for software testing integrates software test case design methods into a
well-planned series of steps that result in the successful construction of software. Testing
is the set of activities that can be planned in advance and conducted systematically. The
underlying motivation of program testing is to affirm software quality with methods that
can be applied economically and effectively to both large-scale and small-scale
systems.

10.2 Strategic approach to software testing


The software engineering process can be viewed as a spiral. Initially system
engineering defines the role of software and leads to software requirement analysis
where the information domain, functions, behavior, performance, constraints and
validation criteria for software are established. Moving inward along the spiral, we come
to design and finally to coding. To develop computer software we spiral in along
streamlines that decrease the level of abstraction on each turn.

A strategy for software testing may also be viewed in the context of the spiral.
Unit testing begins at the vertex of the spiral and concentrates on each unit of the
software as implemented in source code. Testing progresses by moving outward along the
spiral to integration testing, where the focus is on the design and the construction of the
software architecture. Taking another turn outward on the spiral, we encounter
validation testing, where requirements established as part of software requirements
analysis are validated against the software that has been constructed. Finally, we arrive at
system testing, where the software and other system elements are tested as a whole.

Fig 10.1 Testing cycle (component testing: unit testing, module testing, and sub-system
testing; integration testing: system testing; user testing: acceptance testing)

10.3 Unit Testing

Unit testing focuses verification effort on the smallest unit of software design, the
module. The unit testing carried out here is white-box oriented, and for some modules the
steps are conducted in parallel.

1. White box testing

This type of testing ensures that


• All independent paths have been exercised at least once
• All logical decisions have been exercised on their true and false sides
• All loops are executed at their boundaries and within their operational bounds
• All internal data structures have been exercised to assure their validity.
To follow the concept of white-box testing, we have tested each form we created
independently, to verify that the data flow is correct, that all conditions are exercised to
check their validity, and that all loops are executed on their boundaries.
2. Basic path testing
The established technique of flow graphs with cyclomatic complexity was used to
derive test cases for all the functions. The main steps in deriving the test cases were:
• Use the design of the code and draw the corresponding flow graph.
• Determine the cyclomatic complexity of the resultant flow graph, using one of the formulas:
V(G) = E - N + 2, or
V(G) = P + 1, or
V(G) = number of regions,
where V(G) is the cyclomatic complexity, E is the number of edges, N is the number of
flow graph nodes, and P is the number of predicate nodes.
• Determine the basis set of linearly independent paths (a small worked example is given below).
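As a small worked example (the method below is hypothetical and not taken from the project), a routine with two predicate nodes has cyclomatic complexity V(G) = P + 1 = 3, so its basis set contains three linearly independent paths.

class ComplexityExample
{
    // Two predicate nodes (the if and the else-if), so:
    //   V(G) = P + 1 = 2 + 1 = 3
    // Equivalently, one valid flow graph for this method has E = 7 edges
    // and N = 6 nodes, giving V(G) = E - N + 2 = 3.
    public static string Classify(int value)
    {
        if (value < 0)            // predicate node 1
        {
            return "negative";
        }
        else if (value == 0)      // predicate node 2
        {
            return "zero";
        }
        return "positive";
    }
}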

3. Conditional testing
In this part of the testing, each of the conditions was tested for both its true and false
outcomes, and all the resulting paths were tested, so that each path that may be generated
by a particular condition is traced to uncover any possible errors.

4. Dataflow testing
This type of testing selects the paths of the program according to the locations of the
definitions and uses of variables. This kind of testing was used only where local
variables were declared. The definition-use chain method was used in this type of testing.
It was particularly useful in nested statements.

5. Loop Testing
In this type of testing, all the loops are tested to all the limits possible. The following
exercise was adopted for all loops:
• All the loops were tested at their limits, just above them and just below them.
• All the loops were skipped at least once.
• For nested loops, test the innermost loop first and then work outwards.
• For concatenated loops, the values of dependent loops were set with the help of the
connected loop.
• Unstructured loops were resolved into nested loops or concatenated loops and tested
as above.
Each unit has been separately tested by the development team itself, and all the
inputs have been validated.

CHAPTER 11

SAMPLE SCREENS

Fig 11.1 Main window



Fig 11.2 Opening the file menu



Fig 11.3 On clicking select folder to search



Fig 11.4 Browsing folder from the system



Fig 11.5 Browsing Image file from the folder



Fig 11.6 On clicking the search button



Fig 11.7 View graph



VARIATIONS BETWEEN SOURCE AND DESTINATION CAN BE OBSERVED THROUGH THE GRAPH:

Fig 11.8 Variation between two graphs



Fig 11.9 Variation between two graphs



Fig 11.10 Variation between two graphs



Fig 11.11 Variation between two graphs



Fig 11.12 Variation between two graphs



CONCLUSION

The classification framework for CBIR has been studied, and powerful classification
techniques for the information retrieval context have been selected. The main contributions
concern, firstly, the boundary correction that makes the retrieval process more robust and,
secondly, the introduction of a new criterion for image selection that better represents the
CBIR objective of database ranking. Other improvements, such as batch processing and a
speed-up of the selection process, are proposed and discussed. Our strategy leads to a fast
and efficient active learning scheme for the online retrieval of query concepts from a
database. Experiments on large databases show that it gives very good results in comparison
with several other active strategies. The framework introduced in this article may be
extended. We are currently working on kernel functions for object-class retrieval, based on
bags of features: each image is no longer represented by a single global vector, but by a set
of vectors.

REFERENCES

[1] R. Veltkamp, “Content-based image retrieval system: A survey,” Tech. Rep., Univ.
Utrecht, Utrecht, The Netherlands, 2002.

[2] Y. Rui, T. Huang, S. Mehrotra, and M. Ortega, “A relevance feedback architecture
for content-based multimedia information retrieval systems,” in Proc. IEEE Workshop
Content-Based Access of Image and Video Libraries, 1997, pp. 92–89.

[3] E. Chang, B. T. Li, G. Wu, and K. Goh, “Statistical learning for effective visual
information retrieval,” in Proc. IEEE Int. Conf. Image Processing, Barcelona, Spain,
Sep. 2003, pp. 609–612.

[4] S. Aksoy, R. Haralick, F. Cheikh, and M. Gabbouj, “A weighted distance approach to
relevance feedback,” in Proc. IAPR Int. Conf. Pattern Recognition, Barcelona, Spain,
Sep. 3–8, 2000, vol. IV, pp. 812–815.

[5] J. Peng, B. Bhanu, and S. Qing, “Probabilistic feature relevance learning for content-
based image retrieval,” Comput. Vis. Image Understand., vol. 75, no. 1–2, pp. 150–164,
Jul.–Aug. 1999.

[6] N. Doulamis and A. Doulamis, “A recursive optimal relevance feedback scheme for
CBIR,” presented at the Int. Conf. Image Processing, Thessaloniki, Greece, Oct. 2001.

[7] J. Fournier, M. Cord, and S. Philipp-Foliguet, “Back-propagation algorithm for
relevance feedback in image retrieval,” in Proc. Int. Conf. Image Processing,
Thessaloniki, Greece, Oct. 2001, vol. 1, pp. 686–689.

[8] O. Chapelle, P. Haffner, and V. Vapnik, “SVMs for histogram based image
classification,” IEEE Trans. Neural Netw., vol. 10, pp. 1055–1064, 1999.

[9] N. Vasconcelos, “Bayesian models for visual information retrieval,” Ph.D.
dissertation, Mass. Inst. Technol., Cambridge, 2000.

[10] S.-A. Berrani, L. Amsaleg, and P. Gros, “Recherche approximative de plus proches
voisins: Application à la reconnaissance d’images par descripteurs locaux,” Tech. Sci.
Inf., vol. 22, no. 9, pp. 1201–1230, 2003.

[11] N. Najjar, J. Cocquerez, and C. Ambroise, “Feature selection for semi supervised
learning applied to image retrieval,” in Proc. IEEE Int. Conf. Image Processing,
Barcelona, Spain, Sep. 2003, vol. 2, pp. 559–562.

[12] X. Zhu, Z. Ghahramani, and J. Lafferty, “Semi-supervised learning using Gaussian
fields and harmonic functions,” presented at the Int. Conf. Machine Learning, 2003.

APPENDIX-A
D. Mahidhar and M. V. S. N. Maheswar, “A Novel Approach for Retrieving an Image
Using CBIR,” IJCSIT, vol. 02, no. 06, September 2010, pp. 2160–2162.
