1 Introduction
1.1 Need
1.2 Basic Concepts
2 Literature Survey
2.1 Related Work Done
2.2 Existing System
5.2 Software Testing
5.3 Test Cases
7 Application
8 Conclusion
9 Future Scope
10 Glossary
List of Figures
1 Working of JSP
2 Layers of .Net
3 System Architecture
4 Data Flow Diagram (Level 0)
5 Data Flow Diagram (Level 1)
6 ER Diagram
7 Class Diagram
8 Use-case Diagram
9 Activity Diagram
10 Sequence Diagram
11 Component Diagram
12 Deployment Diagram
13 State Chart Diagram
14 Communication Diagram
15 Test Case 1
16 Test Case 2
17 Test Case 3
18 Test Case 4
19 Test Case 5
20 Login
21 Screen 2
22 Screen 3
23 Diagram
24 History
BE(IT) Procuring ER Diagram using NLP
List of Tables
1 Hardware Interface
2 Software Interface
1 Introduction
1.1 Need
Database modelling can be a daunting task for students and designers alike because of its abstract
nature and technicality. Much research has attempted to apply natural language processing to
extract knowledge from requirements specifications with the aim of designing databases. However,
research on the formation and use of heuristics to aid the construction of logical databases
from natural language has been scarce.
1.2 Basic Concepts
Entities: An entity type refers to a set of real-world objects of the same type that share the
same properties. An entity type is represented by a rectangle containing its name.
Attributes: Attributes are descriptive properties of an entity type or a relationship type.
Attributes are classified as simple or composite, and as single-valued or multivalued. They are
represented by ovals and are connected to the entity with a line. Each oval contains the name of
the attribute it represents. Attributes also have a domain. A domain is the attribute's set of
possible values.
The first step in designing a database application is to understand what information the
database must store. This step is known as requirements analysis. The information gathered
in this step is used to develop a high-level description of the data to be stored in the database.
This second step is referred to as conceptual design, and it is often carried out using the ER
model. ER models are built around the basic concepts of entities, attributes, relationships and
cardinality.
An entity is an object that exists in the real world and is distinguishable from other objects.
Entities are typically derived from nouns. Examples of entities include a "student", an
"employee" and a "book". A collection of similar entities is called an entity set. An entity is
described using a set of attributes. The attributes of an entity reflect the level of detail at
which we wish to represent information about entities. Attributes may be derived from adjectives
and adverbs. For example, the Student entity set may have "ID_number", "Name", "Address",
"Course" and "Year" as its attributes.
A relationship is an association among two or more entities. Relationships can typically be
derived from verbs. For example, we may obtain a relationship from this sentence: A student may
"take" many courses. Here "take" implies a relationship between the entities "student" and
"course". Cardinality represents the key constraint in a relationship. In the previous example,
the cardinality is said to be many-to-many, to indicate that a student can take many courses and
a course can be taken by many students.
In an ER diagram, an entity is normally represented by a rectangle. An ellipse usually represents
an attribute, while a diamond shape shows a relationship. Cardinality is represented by 1 for the
one side and M for the many side.
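The noun and verb heuristics described above can be illustrated with a minimal sketch. It is not the tool's implementation: the hand-tagged token list, the Relationship class, and the functions singular and extract are all hypothetical names introduced here, and the tags are assumed to follow the Penn Treebank style (NN for nouns, VB for verbs).

```python
from dataclasses import dataclass

# Hand-tagged input stands in for a real POS tagger (assumption).
# Tags follow the Penn Treebank style: NN/NNS noun, VB verb, MD modal.
TAGGED = [("A", "DT"), ("student", "NN"), ("may", "MD"),
          ("take", "VB"), ("many", "JJ"), ("courses", "NNS")]


@dataclass
class Relationship:
    name: str          # taken from the verb
    left: str          # entity before the verb
    right: str         # entity after the verb
    right_side: str    # "M" if the sentence contains "many", else "1"


def singular(noun):
    # Naive singularisation; enough for this example only.
    return noun[:-1] if noun.endswith("s") else noun


def extract(tagged):
    """Return (entities, relationship) using the noun and verb heuristics."""
    entities, verb, many = [], None, False
    for word, tag in tagged:
        lowered = word.lower()
        if tag.startswith("NN"):
            entities.append(singular(lowered))
        elif tag.startswith("VB"):
            verb = lowered
        elif lowered == "many":
            many = True
    relationship = None
    if verb is not None and len(entities) >= 2:
        relationship = Relationship(verb, entities[0], entities[1],
                                    "M" if many else "1")
    return entities, relationship


entities, relationship = extract(TAGGED)
print(entities)
print(relationship)
```

A single sentence only reveals one side of the cardinality. The many-to-many result in the example requires the reverse statement as well (a course can be taken by many students).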
2 Literature Survey
2.1 Related Work Done
ANNAPURNA [5] is a project that aimed to provide a computerized environment for semi-automatic
database design, from knowledge acquisition up to generating an optimal database schema for a
given database management system. ANNAPURNA concentrated on the phases concerned with
acquiring the terminological rules. The first step in acquiring the terminological knowledge
involves extracting the knowledge from queries and rules that have the form of natural language
expressions. The knowledge obtained is then put into the form of S-diagrams.
An S-diagram is a graphical data model which can be used to specify classes (for example room
and door), subclass connections between classes (for example rooms and doors are physical
objects) and attributes. The limitation of this work is that S-diagrams perform best when the
complexity is small.
DMG [10] is a rule-based design tool which maintains rules and heuristics in several knowledge
bases. A parsing algorithm which accesses the information in a grammar and a lexicon is
designed to meet the requirements of the tool. During the parsing phase, the sentence is parsed
by retrieving the necessary information from the grammar, represented by syntactic rules, and the
lexicon. The parsing results are processed further by rules and heuristics which set up a
relationship between linguistic and design knowledge. DMG has to interact with the user
if a word does not exist in the lexicon or the input of the mapping rules is ambiguous. The
linguistic structures are then transformed by heuristics into EER concepts. Though DMG proposed
a large number of heuristics to be used in the transformation from natural language to
EER models, the tool has not yet been developed into a practical system.
CM-Builder is a natural-language-based CASE tool which aims at supporting the analysis
stage of software development in an object-oriented framework. The tool uses natural language
processing techniques to analyze software requirements documents and produces initial
conceptual models represented in the Unified Modelling Language. The system uses discourse
interpretation and frequency analysis in producing the conceptual models. CM-Builder still has
some limitations in its linguistic analysis. For example, attachment of post-modifiers such as
prepositional phrases and relative clauses is limited. Other shortcomings include the state of the
knowledge bases, which are static and neither easily updatable nor adaptive.
ER-Converter is a tool which transforms a natural language text input into an ER model.
It is a heuristics-based tool which employs syntactic heuristics during the transformation.
In order to achieve the desired result, new and existing heuristics are applied during the
process. Though this is a semi-automatic transformation process, the tool aims to require minimal
human intervention.
• Only valid data must be input.
Fault Tolerance: Data should not become corrupted in case of a system crash or power failure.
An Internet connection is required. A skilled person is needed to give the input. Human
intervention is necessary in ambiguous situations.
2. Functionality Testers :
3. Project Testers :
Java is a set of computer software and specifications developed by Sun Microsystems,
later acquired by Oracle Corporation, that provides a system for developing application
software and deploying it in a cross-platform computing environment. Java is used in a wide
variety of computing platforms, from embedded devices and mobile phones to enterprise servers
and supercomputers. While less common, Java applets run in secure, sandboxed environments
to provide many features of native applications and can be embedded in HTML pages.
JavaServer Pages (JSP) is a technology that helps software developers create dynamically
generated web pages based on HTML, XML, or other document types. JSP may be viewed as
a high-level abstraction of Java servlets. JSPs are translated into servlets at runtime; each JSP
servlet is cached and re-used until the original JSP is modified. JSP allows Java code and certain
pre-defined actions to be interleaved with static web markup content, with the resulting page
being compiled and executed on the server to deliver a document. The compiled pages, as well as
any dependent Java libraries, use Java bytecode rather than a native software format. Like any
other Java program, they must be executed within a Java virtual machine (JVM) that integrates
with the server's host operating system to provide an abstract platform-neutral environment.
.NET
.NET is an integral part of many applications running on Windows and provides common
functionality for those applications to run. For developers, the .NET Framework provides a
comprehensive and consistent programming model for building applications with rich user
interfaces and secure communication.
.NET Framework (pronounced dot net) is a software framework developed by Microsoft that
runs primarily on Microsoft Windows. It includes a large class library known as the Framework
Class Library (FCL) and provides language interoperability (each language can use code written
in other languages) across several programming languages. Programs written for .NET
Framework execute in a software environment (as contrasted to a hardware environment) known
as the Common Language Runtime (CLR), an application virtual machine that provides services
such as security, memory management, and exception handling. FCL and CLR together constitute
.NET Framework.
Oracle 11g
The Oracle RDBMS stores data logically in the form of tablespaces and physically in the
form of data files ("datafiles"). Tablespaces can contain various types of memory segments,
such as Data Segments, Index Segments, etc. Segments in turn comprise one or more extents.
Extents comprise groups of contiguous data blocks. Data blocks form the basic units of data
storage.
Bootstrap
Bootstrap is a free and open-source collection of tools for creating websites and web
applications. It contains HTML- and CSS-based design templates for typography, forms, buttons,
navigation and other interface components, as well as optional JavaScript extensions. Bootstrap
provides a set of stylesheets that give basic style definitions for all key HTML components.
These provide a uniform, modern appearance for formatting text, tables and form elements.
Bootstrap is modular and consists essentially of a series of LESS stylesheets that implement
the various components of the toolkit. A stylesheet called bootstrap.less includes the component
stylesheets. Developers can adapt the Bootstrap file itself, selecting the components they
wish to use in their project.
• The details related to the product, customer and service transaction provided manually.
Dependencies:
A feasibility study is made to see whether the project, on completion, will serve the purpose of
the organization for the amount of work, effort and time spent on it. A feasibility study lets
the developer foresee the future of the project and its usefulness. It evaluates a system
proposal according to its workability, that is, its impact on the organization, its ability to
meet user needs and its effective use of resources. Thus, when a new application is proposed,
it normally goes through a feasibility study before it is approved for development.
This document describes the feasibility of the project being designed and lists the
areas that were considered carefully during the feasibility study.
Technical Feasibility:
The system must be evaluated from the technical point of view first. The assessment of this
feasibility must be based on an outline design of the system requirements in terms of input,
output, programs and procedures. Having identified an outline system, the investigation must
go on to suggest the type of equipment, the method of developing the system, and the method of
running the system once it has been designed.
The technical issue raised during the investigation is: Is the existing technology sufficient for
the suggested system? The project should be developed such that the necessary functions and
performance are achieved within the constraints. The project is developed with the latest
technology, so there are minimal constraints involved. The system has been developed using
various technologies and the project is technically feasible for development.
Economic feasibility looks at the financial aspects of the project. It is concerned with the
returns from the investments in a project and determines whether it is worthwhile
to invest money in the proposed system. It is not worthwhile spending a lot of money
on a project for no returns. To carry out an economic feasibility study for a system, it is
necessary to place actual money values against any purchases or activities needed to implement
the project.
Risk Management
Figure 3 depicts the architecture of ER-Converter. The natural language processing
involved in translating the database specifications into ER elements is purely
based on syntactic analysis. The process begins by reading a plain input text file containing
a requirements specification of a database problem in English. For this purpose, a parser is
required to parse the English sentences and obtain their part-of-speech (POS) tags before further
processing. Part-of-speech tagging assigns each word in an input sentence its proper part of
speech, such as noun, verb or determiner, to reflect the word's syntactic category [1]. The
parser used here is the Memory-Based Shallow Parser (MBSP) [4,12]. The parsed text is then
fed into ER-Converter to identify suitable data modeling elements from the specification.
The task requires several steps to be carried out in order to achieve the desired ER model
from the natural language input, each of which is listed as follows:
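To make the tagging step concrete, the following is a minimal sketch of a lookup-based tagger. It is only a stand-in for illustration and does not use or reproduce MBSP: the LEXICON table, the default tag, and the functions tokenize and tag are assumptions introduced here.

```python
import re

# Tiny hand-written lexicon (assumption); a real parser such as the one
# named in the text learns its tags from annotated corpora instead.
LEXICON = {
    "a": "DT", "the": "DT", "each": "DT",
    "employee": "NN", "department": "NN",
    "works": "VBZ", "in": "IN", "one": "CD",
}


def tokenize(text):
    # Keep alphabetic words (and underscores); drop punctuation.
    return re.findall(r"[A-Za-z_]+", text)


def tag(text, default="NN"):
    """Pair every word with a POS tag; unknown words get the default tag."""
    return [(word, LEXICON.get(word.lower(), default))
            for word in tokenize(text)]


tagged = tag("Each employee works in one department.")
for word, pos in tagged:
    print(f"{word}/{pos}")
```

Tagging unknown words as nouns is a common fallback in simple taggers, since unseen words in a requirements text are most often domain nouns. The resulting list of (word, tag) pairs is the kind of input a heuristic converter would consume.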
Algorithm:
1. START
• The registration form includes fields such as FirstName, LastName, Email Id and Password.
• After filling in all the details, the user must click the submit button.
3. The registered user can now log in using the login page.
4. The logged-in user can then give a problem statement as input.
7. Log out.
8. STOP
1. Description: The user interface must be capable of sending the data to the intended
destination.
2. Criticality: This issue is essential to the overall system. All the modules provided with
the software must fit into this graphical user interface and conform to the defined
standard.
3. Technical issues: In order to satisfy this requirement, the design should be simple and all
the different interfaces should follow a standard template.
• Only authorized users should have access rights to add, delete, and modify the data.
Security
Security covers the factors that protect the software from accidental or malicious access, use,
modification, destruction, or disclosure. Security can be ensured because the project involves
authenticating the users.
Maintainability
All the modules must be clearly separated to allow different user interfaces to be developed in
future. Through thoughtful and effective software engineering, all steps of the software
development process will be well documented to ensure maintainability of the product throughout
its lifetime.
Portability
Portability defines the ability of the system to run under different computing environments. The
environment types can be either hardware or software, but are usually a combination of both.
Supportability
The system will support the transfer of different file formats.
4.4.3 ER Diagram
Description: An entity–relationship model is a systematic way of describing and defining a
business process. The process is modeled as components (entities) that are linked with each
other by relationships that express the dependencies and requirements between them. Entities
may have various properties (attributes) that characterize them. Diagrams created to represent
these entities, attributes, and relationships graphically are called entity–relationship diagrams.
Figure 6: ER Diagram
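As a rough illustration of how entities, attributes and relationships might be held in memory before a diagram is drawn, the sketch below uses plain data classes. The class names (Attribute, Entity, Relationship, ERModel) and their methods are assumptions made for this example and are not taken from the project's code.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    multivalued: bool = False


@dataclass
class Entity:
    name: str
    attributes: list = field(default_factory=list)


@dataclass
class Relationship:
    name: str
    entities: tuple            # names of the participating entities
    cardinality: str = "M:N"


@dataclass
class ERModel:
    entities: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add_entity(self, name, *attribute_names):
        # Reuse the entity if it already exists, then add any new attributes.
        entity = self.entities.setdefault(name, Entity(name))
        known = {a.name for a in entity.attributes}
        for attr in attribute_names:
            if attr not in known:
                entity.attributes.append(Attribute(attr))
                known.add(attr)
        return entity

    def relate(self, name, left, right, cardinality="M:N"):
        # A relationship may only link entities that are already in the model.
        for end in (left, right):
            if end not in self.entities:
                raise KeyError(f"unknown entity: {end}")
        relationship = Relationship(name, (left, right), cardinality)
        self.relationships.append(relationship)
        return relationship


model = ERModel()
model.add_entity("Student", "ID_number", "Name")
model.add_entity("Course")
model.relate("take", "Student", "Course")
print(model)
```

Keeping relationships as references to entity names, rather than nested objects, keeps the model easy to serialise and makes dangling references detectable when a relationship is added.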
The implementation stage involves careful planning, investigation of the existing system and
its constraints on implementation, design of methods to achieve changeover, and evaluation
of changeover methods.
Implementation is the process of converting a new system design into operation. It is the
phase that focuses on user training, site preparation and file conversion for installing a candidate
system. The important factor that should be considered here is that the conversion should not
disrupt the functioning of the organization.
Testing is the process of executing a program with the intent of finding an error. A good test is
one that uncovers an as yet undiscovered error. Testing was carried out at the following levels:
• Unit testing
• Integration testing
• System testing
• Validation testing
Unit Testing
Unit testing enables a programmer to detect errors in coding. A unit test focuses verification
on the smallest unit of software design. This testing was carried out during the coding itself. In
this testing step, each module is checked to confirm that it works satisfactorily and produces
the expected output.
Project aspect:
The front-end design consists of various forms. They were tested for data acceptance. Similarly,
the back-end was tested for successful acceptance and retrieval of data.
Integration Testing
Though each program works individually, the programs should also work after being linked together.
This is referred to as interfacing. Data may be lost across the interface, one module can have an
adverse effect on another, and subroutines, after linking, may not perform the function expected
by the main routine. Integration testing is a systematic technique for constructing the program
structure while at the same time conducting tests to uncover errors associated with the interface.
Using the integration test plan prepared in the design phase of system development as a guide,
the integration test was carried out. All the errors found in the system were corrected before the
next testing step.
Project Aspect
After connecting the back-end and the front-end as a whole module, the data entered in the
front-end and submitted were successfully stored in the database. On request, data were
successfully retrieved into the forms.
System Testing
After performing the integration testing, the next step is output testing of the proposed
system. No system can be useful if it does not produce the required output in the specified format.
The outputs generated are displayed to the user. Here the output format is considered in two
ways: one is on screen and the other is in printed format.
Project aspect:
Validation Testing
The user has to work with the system and check whether the project meets their needs. During
validation testing, the user works with the beta version of the software.
Project aspect:
The user entered the appropriate data, and the results were checked and validated.
User acceptance of a system is a key factor in the success of any system. The system under
consideration was tested for user acceptance by running a prototype of the software.
Project aspect:
The primary objective of test case design methods is to derive a set of tests that has the highest
likelihood of uncovering defects. To accomplish this objective, two categories of test case
design techniques are used: black box testing and white box testing.
White box testing is a test case design method that uses the control structure of the procedural
design to derive test cases. Using white box testing methods, we can derive test cases that
• Guarantee that all independent paths within a module have been exercised at least once
• Exercise all logical decisions on their true and false sides
• Execute all loops at their boundaries and within their operational bounds
• Exercise internal data structures to ensure their validity
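The white box criteria listed above can be illustrated on a small function. This is a sketch only: count_capitalised is a hypothetical helper invented for the example, not part of the project, and the cases are chosen by reading its control structure (one loop, two decisions).

```python
def count_capitalised(words, limit=3):
    """Count capitalised words, stopping once `limit` have been seen."""
    count = 0
    for word in words:                # loop: zero, one or many iterations
        if word[:1].isupper():        # decision 1: true and false sides
            count += 1
            if count == limit:        # decision 2: early exit at the bound
                break
    return count


def run_white_box_cases():
    # Loop skipped entirely (zero iterations).
    assert count_capitalised([]) == 0
    # Single iteration, decision 1 false.
    assert count_capitalised(["student"]) == 0
    # Single iteration, decision 1 true, decision 2 false.
    assert count_capitalised(["Student"]) == 1
    # Decision 2 true: the loop stops at the limit, ignoring later words.
    assert count_capitalised(["A", "B", "C", "D"]) == 3
    # Just below the limit: loop runs to completion.
    assert count_capitalised(["A", "b", "C"]) == 2
    return "all white box cases passed"


print(run_white_box_cases())
```

Each case is tied to a path through the code rather than to a requirement, which is what distinguishes it from a black box case derived from the specification alone.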
Black box testing
Black box testing methods focus on the functional requirements of the software. That is,
black box testing enables us to derive sets of input conditions that will fully exercise all
functional requirements of the program. Black box testing attempts to find errors in the following
categories:
6.1.2 Registration
An unregistered user is redirected to this page after clicking the new user link. Here the user
fills in their details and clicks the submit button to register.
6.1.3 ER-converter
This page consists of three fields: name, sentence and attribute. The name of the ER diagram to
be built is given in the name field. The main input of the project is given through this page.
The problem statement is divided into two subsections. The part of the problem statement related
to the relations is written in the sentence field, and the part related to the attributes
is written in the attribute field. The provided problem statement is then checked using the
CHECK button.
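The text does not say what the CHECK button verifies, so the sketch below only illustrates one plausible kind of check on the three fields. The function check_input and all of its rules (non-empty fields, sentence ending in a full stop) are assumptions made for illustration.

```python
def check_input(name, sentence, attribute):
    """Return a list of problems found in the three form fields.

    The rules here are hypothetical; the actual checks performed by the
    page are not described in the report.
    """
    problems = []
    if not name.strip():
        problems.append("name is empty")
    if not sentence.strip():
        problems.append("sentence is empty")
    elif not sentence.strip().endswith("."):
        problems.append("sentence should end with a full stop")
    if not attribute.strip():
        problems.append("attribute is empty")
    return problems


print(check_input("Library", "A student takes many courses.",
                  "Student has ID_number and Name."))
print(check_input("", "A student takes", ""))
```

Returning a list of messages, rather than a single boolean, lets the page report every problem to the user in one pass.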
6.1.4 Diagram
This page shows the output for the problem statement, i.e. the ER diagram.
6.1.5 History
This page is used by the user to view the already created ER-diagrams.
7 Application
ER-Converter is a tool that transforms a natural language text input
into an ER model. It is a heuristics-based tool which employs syntactic heuristics during the
transformation. In order to achieve the desired result, new and existing heuristics are applied
during the process. Though this is a semi-automatic transformation process, the tool aims to
require minimal human intervention. The tool can also be used in industry
and in colleges to reduce the tedious work of drawing ER diagrams.
8 Conclusion
The ER-Converter tool aims at helping the database designer in the database development
process. It makes use of an analysis technique to translate sentences to LF, which are then used
as a basis for identifying entities and relationships. It gives a semi-automated approach to
generating an ER model. The developed model produced very satisfying results and offers features
not provided by existing commercial CASE tools. These features include automatic analysis of user
requirements using NLP and custom model database generation.
9 Future Scope
Future research aims to develop a system that can handle complex sentences and
more structured objects such as tables. The same type of approach can be used for procuring UML
diagrams using NLP. Voice input could also be accepted to generate ER diagrams. Further work is
required on automatic recognition of ternary relationships, composite and derived attributes,
relationship attributes, and specialization/generalization, and on detecting errors in an ER
diagram.
10 Glossary
• NLP - Natural Language Processing
• ER - Entity Relationship
References
[1] Brill, E.: A Simple Rule-Based Part of Speech Tagger. In: Proceedings of the Third
Conference on Applied Natural Language Processing, ACL, Trento, Italy (1992) 152-155
[2] Buchholz, E., Cyriaks, H., Dusterhoft, A., Mehlan, H. and Thalheim, B.: Applying a
Natural Language Dialogue Tool for Designing Databases. In: Proceedings of the First
Workshop on Applications of Natural Language to Databases (NLDB'95), Versailles, France
(1995) 119-133
[3] Chen, P.P.: English Sentence Structure and Entity-Relationship Diagram. Information
Sciences, Vol. 1, No. 1, Elsevier (1983) 127-149
[4] Daelemans, W., Zavrel, J., Berck, P. and Gillis, S.: MBT: A memory-based part of speech
tagger generator. In: Ejerhed, E. and Dagan, I. (eds.), Proc. of Fourth Workshop on Very
Large Corpora, Philadelphia, USA (1996) 14-27
[5] Eick, C.F. and Lockemann, P.C.: Acquisition of Terminology Knowledge Using Database
Design Techniques. Proceedings ACM SIGMOD Conference, Austin, USA (1985) 84-94
[6] Gomez, F., Segami, C. and Delaune, C.: A system for the semiautomatic generation of
ER models from natural language specifications. Data and Knowledge Engineering 29 (1)
(1999) 57-81
[7] Grishman, R. and Sundheim, B.: Message Understanding Conference-6: A Brief History.
Proceedings of the 16th International Conference on Computational Linguistics,
Copenhagen, Denmark (1996) 466-471
[8] Harmain, H.M. and Gaizauskas, R.: CM-Builder: A Natural Language-Based CASE Tool
for Object-Oriented Analysis. Automated Software Engineering 10 (2) (2003) 157-181
[9] Storey, V.C. and Goldstein, R.C.: A Methodology for Creating User Views in Database
Design. ACM Transactions on Database Systems 13 (3) (1988) 305-338
[10] Tjoa, A.M. and Berger, L.: Transformations of Requirements Specifications Expressed in
Natural Language into an EER Model. Proceedings of the 12th International Conference
on Entity-Relationship Approach, Arlington, Texas, USA (1993) 206-217
[11] Zanakis, S.H. and Evans, J.R.: Heuristic 'Optimization': Why, When and How to Use It.
Interfaces 11 (5) (1981) 84-91
[12] Zavrel, J. and Daelemans, W.: Recent Advances in Memory-Based Part-of-Speech
Tagging. In: Actas del VI Simposio Internacional de Communicacion Social, Santiago de
Cuba, Cuba (1999) 590-597
[13] Allen, J.: Natural Language Understanding, 2nd Edition. Addison Wesley (1994)
[14] Cockburn, A.: Using Natural Language as a Metaphoric Basis for Object-Oriented
Modeling and Programming. IBM Technical Report TR-36.0002 (1992)
[15] Azar, B.: Fundamentals of English Grammar, 2nd Edition. Prentice Hall (1992)
[16] Booch, G., Rumbaugh, J. and Jacobson, I.: Unified Modeling Language User Guide.
Addison-Wesley Professional (1999)
[18] Buchholz, E. and Dusterhoft, A.: Using Natural Language for Database Design. Proceedings
Deutsche Jahrestagung für KI 1994, Workshop: Reasoning about Structured Objects:
Knowledge Representation meets Databases (1994)