Project Advisor
Noman Hasany
Assistant Professor
SSUET
Submitted by
Sadaf Nasim 2005-CE-187
Mariam Firdous 2005-CE-182
Faiza Urooj 2005-CE-189
Mahvash Iftikhar Querashi 2005-CE-204
The Business Intelligent Reporting System (BIRS) supports the HR department, which keeps and
manages employee records and runs the recruitment process. BIRS not only speeds
up HR processing but also makes it convenient for candidates to apply
for a job, and it lets top-level management query the system in natural language.
BIRS makes it easy to query the database and to get useful results from a search because it spares
users from writing laborious database queries. A manager can easily access any personnel information and
ACKNOWLEDGMENT
First of all, we are cordially thankful to Almighty Allah, who blessed us with the ability
and strength of character to complete the project we were assigned. Secondly, we
are grateful to our internal advisor, Mr. Noman Hasany, who helped us in this project.
We also thank all of our teachers, who have been a great help and an enormous source of
inspiration. During the project, when we got into awkward positions where nothing seemed to
make sense, our class lectures and lab instructions led us through those immensely dark
tunnels. Most of our respected teachers helped us when we needed it most. This does not
concern the project alone; it has been so over the last four years, and we will never be able to
thank them enough. We also acknowledge our departmental staff, the university staff, and everyone
else who supported us.
Last but not least, we would like to thank all our well-wishers.
CERTIFICATE OF COMPLETION
_____________
Name of Advisor
Designation SSUET
ABSTRACT
This project describes a natural language interface to databases which consists of two parts: a
Natural Language Processing (NLP) component and a database application program. The NLP component is a
general-purpose language processor which builds a formal representation of the meaning of the English utterances it
is given. The database application is a module which builds a query in order to extract information
from the database. This approach yields an interface which is both extremely robust and portable.
A major benefit of using natural language to access the information in a data base is that it stops
Project aims to build a platform from which structured and textual information about HR domain
TABLE OF CONTENTS
PREFACE
ACKNOWLEDGEMENTS
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
Introduction
6.1.3 Design
LIST OF FIGURES
Chapter 1
Figure 1.1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Figure 5.3
Chapter 6
Figure 6.3
Chapter 7
Figure 7.3
Chapter 8
LIST OF TABLES
Chapter 8
Introduction
To maintain the database simply.
The system must be convenient, easy to set up, and efficient in storage. It should not take
up much memory and should provide fast access. The system must be low
enough in cost to be acceptable to most business users.
The system must require no technical expertise to run beyond what
anyone able to manage a simple setup can be expected to have. Preparing the system for a specific
domain must require only basic knowledge.
BIRS must aim to have the properties that any business
application should have:
Integrity
Security
Reliability
Flexibility
BIRS is domain-based software which needs to capture the demand and provide a
solution to the market requirement. It helps the user by taking natural language as
input in free-form text that expresses the user's requirement, and it also gives
applicants a convenient way to submit their resumes. We aim to provide the manager with a syntax-free environment
where he or she can easily inquire about anything related to the HR database. Thanks to this, even a
layman can take advantage of the information stored in BIRS.
The value to our society of being able to communicate with computers in everyday
"natural" language cannot be overstated. Imagine asking your computer "Does this
candidate have a good record on the environment?" or "When is the next televised
National League baseball game?" Or being able to tell your PC "Please format my
homework the way my English professor likes it." Commercial products can already
do some of these things, and AI scientists expect many more in the next decade.
market research and of course HR. Get in touch to learn more about the many
ways NLP can help you and your organization.
Google and other search engines often receive queries in the form of questions,
for example: “what is a mortgage backed security” and “where should I go on
vacation”. We researched question-specific key phrases relevant to our domain to
answer such questions, providing value and relevance to the user, whose query is automatically
passed to BIRS.
A major reason for using the object-oriented (OO) approach is to remove some of the
flaws encountered with the procedural approach. OO treats data as a critical
element and does not allow it to flow freely. It binds data closely to the functions
that operate on it and protects it from accidental modification by outside
functions. OO allows decomposition of a problem into a number of entities called
objects and then builds data and functions around these objects. A major
advantage of OOP is code reusability.
Person object to hold the data related to a person and even provide some
functionality that this person may be capable of.
Object Oriented Programming has long been used in games to represent the
objects such as a User or an Enemy, or even a Weapon. This amazing way of
programming has proven just as useful in software and web development.
Inheritance
Polymorphism
Above are the basic concepts of OOP. Here are some more advanced
techniques [1]:
Serialization
PHP doesn't support persistent objects directly. In OOP, persistent objects are
objects that keep their state and functionality across multiple invocations of the
application. This means having the ability to save the object to a file or
database and then load the object back. The mechanism is known as
serialization.
PHP has a serialize() function which can be called on objects; it
returns a string representation of the object. However, serialize()
saves the data members of the object but not the methods.
In PHP4, if you serialize the object to string $s, then destroy the object, then
unserialize the string to $obj, you might still be able to access the object's methods. Relying on this is
not recommended because:
The documentation doesn't guarantee this behavior, so in future versions it
might not work.
It might create 'illusions' if you save the serialized version to disk and exit
the script. In future runs of the script you can't unserialize the string to an
object and expect the methods to be there unless the class definition is loaded first, because the string representation
doesn't have the methods.
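The behaviour described above can be illustrated with a minimal sketch. The Applicant class, its properties, and the sample values are hypothetical and are not taken from the BIRS code. The sketch shows that the serialized string carries the class name and data members only, and that the methods work after unserialize() only because the class definition is present in the same script.

```php
<?php
// Hypothetical class, used only for illustration.
class Applicant
{
    public $name;
    public $skill;

    public function __construct($name, $skill)
    {
        $this->name = $name;
        $this->skill = $skill;
    }

    public function summary()
    {
        return $this->name . ' (' . $this->skill . ')';
    }
}

$applicant = new Applicant('Ayesha', 'PHP');
$s = serialize($applicant);   // string holds the class name and properties only
unset($applicant);            // destroy the original object

// The object is rebuilt from the string plus the class definition above.
$obj = unserialize($s);
echo $s, "\n";
echo $obj->summary(), "\n";
```

If the string were saved to a file and loaded by a script that does not define Applicant, the data would come back but summary() would not be callable.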
o OOP provides a clear modular structure for programs which makes it good
for defining abstract data types where implementation details are hidden and
the unit has a clearly defined interface.
o OOP makes it easy to maintain and modify existing code as new objects can
be created with small differences to existing ones.
o OOP provides a good framework for code libraries where supplied software
components can be easily adapted and modified by the programmer. This is
particularly useful for developing graphical user interfaces.
Figure 1.1
1.5.2 Tools
We used several software tools to build BIRS and integrate its modules.
The tools are as follows:
1. Xampp
Operating system
Apache server
MySQL database
PHP
XAMPP
Sir Syed University of Engineering And Technology Page 21
BUSINESS INTELLIGENT REPORTING SYSTEM
XAMPP is a free and open source cross-platform web server package; Apache is
primarily used to serve both static content and dynamic Web pages on the World
Wide Web. Many web applications are designed expecting the environment and
features that Apache provides. Apache is the web server component of the popular
LAMP web server application stack, alongside MySQL, and the PHP/Perl/Python
programming languages consisting mainly of the
Features:
Virtual hosting allows one Apache installation to serve many different actual
websites.
MySQL
MySQL, the most popular Open Source SQL database management system, is
developed, distributed, and supported by MySQL AB. MySQL AB is a commercial
company, founded by the MySQL developers. It is a second generation Open
Source company that unites Open Source values and methodology with a
successful business model [3].
A relational database stores data in separate tables rather than putting all
the data in one big storeroom. This adds speed and flexibility. The SQL part
of “MySQL” stands for “Structured Query Language.” SQL is the most
common standardized language used to access databases and is defined by
the ANSI/ISO SQL Standard. The SQL standard has been evolving since
1986 and several versions exist. In this manual, “SQL-92” refers to the
standard released in 1992, “SQL:1999” refers to the standard released in
1999, and “SQL:2003” refers to the current version of the standard. We use
the phrase “the SQL standard” to mean the current version of the SQL
Standard at any time.
Open Source means that it is possible for anyone to use and modify the
software. Anybody can download the MySQL software from the Internet and
use it without paying anything. The MySQL software uses the GPL (GNU
General Public License) to define what you may and may not do with the
software in different situations. If you feel uncomfortable with the GPL or
If that is what you are looking for, you should give it a try. MySQL also has a
practical set of features developed in close cooperation with our users. You
can find a performance comparison of MySQL with other database
managers.
Fully multi-threaded using kernel threads. It can easily use multiple CPUs if
they are available.
SQL functions are implemented using a highly optimized class library and
should be as fast as possible. Usually there is no memory allocation at all
after query initialization.
Sql Database
Integrity
This section looks at the concepts SQL uses to restrict the information that
can be added to the database. Restrictions are usually thought of as negative
(constraints, limitations, confines, etc.). When they are applied to data integrity, they
do a positive job, i.e. that of ensuring you do not inadvertently add junk data to the
database. Data integrity restrictions, in effect, act as policemen for the database. They
are responsible for protecting the overall integrity of the database from rogue data
that may be introduced by INSERT and UPDATE [5].
Security
Most SQL-based systems operate in a multi-user environment. This means that at
any time, several different users can access the same database to query, insert,
update, or delete data. Such an environment requires safeguards that are
built into the DBMS itself and that prevent users from inadvertently corrupting
data.
Features of PHP
Speed
Open Source
PHP is open source: users are given a free license to
modify or recode PHP as they wish.
Multi Platform
PHP supports various platforms, which means PHP can be installed on almost every
operating system, such as Windows, Linux, etc.
Easy Syntax
PHP syntax is quite easy to code, all the syntax are similar to the C language
syntax, if you are very new to the programming environment then it
The biggest advantage of PHP over Perl is that PHP was designed for scripting for
the web, whereas Perl was designed to do a lot more and can, because of this, get very
complicated. The flexibility and complexity of Perl make it easy to write code that
another author or coder has a hard time reading. PHP has a less confusing and
stricter format without losing flexibility. PHP is easier to integrate into existing HTML
than Perl. PHP has pretty much all the 'good' functionality of Perl (constructs,
syntax and so on) without being as complicated as Perl can be. Perl is a
tried and true language that has been around since the late eighties, but PHP is
maturing very quickly.
ASP is not really a language in itself; it is an acronym for Active Server Pages, and
the actual language used to program ASP is Visual Basic Script or JScript. The
biggest drawback of ASP is that it is a proprietary system that is natively used only
on Microsoft Internet Information Server (IIS). This limits its availability to Win32
based servers additional components.
PHP is a relatively simpler language to use than ASP.net. PHP has much
better support for the MySQL database management system. In fact, the
very popular blogging platform WordPress uses the
combination of PHP and MySQL for its content management system,
which handles hundreds of thousands of blog posts every single day.
Another very popular and frequently updated service that uses the
combination of PHP and MySQL is Wikipedia. ASP.net can also support
MySQL, but PHP is widely praised
for its support for this database management system.
People who use both PHP and ASP.net also maintain their opinion that
PHP is better for embedded support with another database management
system SQLite is described as a relational database management system
and since it is contained in a C programming library, PHP can provide
PHP also has very good support for object-oriented programming, on
which many scripting languages are being built nowadays. ASP.net also
provides capable support for OOP.
When it comes to support, PHP wins over ASP.net. The main reason for this is
that PHP is open source. Hence, the support can come freely from all over the
world. In most cases, PHP fixes are made instantly.
PHP can also be run from the command line to perform many everyday tasks. For example,
the PHP command line is useful for manipulating
many files at once and for putting files into multiple directories at once.
These are just some of the uses of PHP's command line.
ASP.net code is compiled to binary form, whereas PHP is not compiled
ahead of time in this way; it is interpreted at run time. It is sometimes
claimed that this makes PHP faster, but interpretation in itself does not make
code run faster than compiled code, and in practice the difference depends
largely on how the application is written. It
must be said that both PHP and ASP.net can run at high speed and
efficiency when they are coded expertly.
Since PHP is older, there are many people who claim that it is much more
A trend with a lot of new Web Designers is to blindly go for ASP or JSP for creating
dynamic and interactive websites.
PHP is indeed a great choice for dynamic websites and a lot of popular websites do
use PHP as a scripting language.
PHP runs on different platforms such as Windows, Linux, UNIX, etc. PHP is easy
PHP is compatible with almost all servers used today (Apache, IIS, etc.). PHP
supports many databases such as MySQL, Oracle, and PostgreSQL etc. PHP with
MySQL database and Apache Server is a very good and popular combination. PHP
Microsoft's ASP and Sun Microsystems JSP. PHP is perfectly suited for Web
development and can be embedded directly into the HTML code. PHP is often used
together with Apache (web server) on various operating systems. It can be also
Usage scenario
Scenario 1:
This scenario details the login steps of the manager. It is necessary to gain the
Scenario 2:
The session is created through scenario 1. The end user has two ways
of using the system: one is searching through NL and another
Scenario 2.1
End user adds the applicant through the new applicant panel.
working experience and the reference. The database saves the detailed information
of the applicant.
Scenario 2.2
After the session is created, the manager inquires about HR information.
The manager types the NL query into the query text box and then searches for the
Scenario 3:
The manager can take following three actions on the resume of the applicant
1. Approve
2. Interview schedule
3. Reject
Scenario 4:
An applicant creates an electronic profile, which gives not only the common contact
information but also records the qualifications, work experience, and other relevant
details.
Managers
New applicant
The manager is the one who controls the resources and expenditures of the
company and is therefore the authorized user of the system. The user is
verified as a manager when the user id and password are verified. The manager is
able to access BIRS from both the admin panel and the applicant panel. Only
managers are authorized to access the admin panel, where the manager types
the NL query to retrieve the required output. The managers can keep an eye on the
An applicant is not an authorized user and cannot access the admin panel.
Only the new applicant panel is available to the applicant. Applicants only fill in
the C.V.
Manager and applicant are the only intended users of the system.
2.2 Use-cases
Login
[Use-case diagram. Elements: manager, login, Verification (verifying), System; outcomes: valid, invalid; relations shown: «uses», «extends», «include».]
Figure 2.1
Description:
This use case defines the session creation scenario. The manager creates the
session by logging in with a user id and password. The information given by the manager is then
checked by the verification system. After verification the user is either valid or
invalid. An unauthorized person won't be able to access the functionality of the
system.
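The verification step in this use case can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how BIRS stores or checks passwords, so the in-memory $managers array, the sample credentials, and the function names verifyManager() and login() are hypothetical. The sketch uses PHP's built-in password_hash() and password_verify() and models the session as a plain array.

```php
<?php
// Hypothetical manager store; in a real system this would come from the database.
$managers = [
    'manager1' => password_hash('s3cret', PASSWORD_DEFAULT),
];

/**
 * Returns true when the user id exists and the password matches its stored hash.
 */
function verifyManager(array $managers, $userId, $password)
{
    if (!isset($managers[$userId])) {
        return false;               // unknown user id: invalid
    }
    return password_verify($password, $managers[$userId]);
}

/**
 * Creates a "session" (modelled here as a plain array) only for a valid manager.
 */
function login(array $managers, $userId, $password)
{
    if (!verifyManager($managers, $userId, $password)) {
        return null;                // invalid: no access to the admin panel
    }
    return ['user' => $userId, 'role' => 'manager'];
}

var_dump(login($managers, 'manager1', 's3cret') !== null);  // valid credentials
var_dump(login($managers, 'manager1', 'wrong') !== null);   // invalid credentials
```

In a web application the returned array would typically be stored in $_SESSION after calling session_start().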
Session started
[Use-case diagram. Elements: manager/applicant, query text, new applicant, database.]
Figure 2.2
Description:
Once the session is created, the manager is able to access the functions
of BIRS. The manager can enter an NL query to inquire about an employee.
The applicant fills in the C.V. without creating a session. The manager after inquiring
Entering query
[Use-case diagram. Elements: manager, query text, keyword extractor and search, query generator, System; relation shown: «include».]
Figure 2.3
Description:
The manager types the NL query in the query box. When the search function runs, the query is taken up by the
keyword extractor. The NL query is converted into
an SQL query, which retrieves the output from the database.
Applicant form
[Use-case diagram. Elements: user, personal information, work experience, reference.]
Figure 2.4
Description:
The applicant fills in the C.V. without creating a session. The data is then
Manager’s action
[Use-case diagram. Elements: manager, take action, storage system; relation shown: «include».]
Figure 2.5
Description:
The manager takes action on applicants on the basis of the applicant's
skills and/or educational details. The manager can approve, reject or schedule the
Usage specification
Limitation
A server is required for BIRS, as it is a desktop-based application.
The whole natural language (English) cannot be covered, as it is a huge
Our system is mainly designed to serve applicants and HR domain users, so
BIRS focuses on making things easy for both. The system is modeled in two halves: one
is the admin panel and the other is the applicant panel. The administrator logs in to the BIRS
account with an authorized user id and password and is then redirected to
the control panel, where he or she can ask queries related to the HR database and can
update, insert, delete, and retrieve data. The applicant, in turn, can directly upload
his or her resume to the HR department. The manager can find the desired resume by
The administrator gives free-form text input to ask a query or to extract information
from the database. Keywords are extracted from this text according to the user's
focus or intention, that is, what he or she needs to do. These keywords are then matched against the words
stored in the database to obtain the related SQL words, from which an appropriate SQL
query is generated. The query is then executed and its response is shown in the form of
reports.
BIRS architecture is given below which defines concrete working of BIRS and its
[Architecture diagram. Elements: Applicant, CV, BIRS, natural language, Administrator.]
Figure 3.1
BIRS provides an interface for both the applicant and the administrator. It has two
perspectives: one from the applicant end and the other from the admin end.
Admin Panel:
The higher management of any organization can use the BIRS in order to retrieve
information and manage the database. He/she can ask query in free form text or in
Applicant panel:
Any one can submit the resume by using BIRS in very convenient and easy way by
[Block diagram. Elements: Keyword extraction, Response from DB, Report generation.]
Figure 3.2
Input
This is text input in plain English, also called free-form text input. The user
can ask the desired query or perform an intended action to manage the database.
Keyword extraction
From the input text we extracted the keywords which contain action, field, table and
Response from DB
Report generation
The modules are simple. When the user enters a query, it is first processed by the
Keyword Extraction module, which takes out the specific keywords related to the
user's requirement. These keywords are then passed on to the next module, the SQL
Query Generation module, which uses them to generate an SQL query. Once
the queries are created, they are executed by the SQL Query Execution module to
select the required database tables and fields. In the end, the Report Generation
Database,
Name ,
Table name,
Field name.
Database name: The database name is a major part of the system.
Key words: These are the words required by our algorithm. They are
the table names, field names, indicators such as conditions, or any
aggregate word.
Table name: The information extracted from the query will contain the table
name, used to verify that the data requested by the manager exists in the database.
Column name: Once the table name is verified, the internal flow will search
Figure 3.3
BIRS is a server-based system, so it can easily interface with the outside world. We
compiled modules which extend the core functionality. Virtual hosting allows one
Module Description
This chapter describes the different modules of BIRS in detail. Section 4.1
covers keyword extraction, which defines how specific keywords are taken out of the input text
query; 4.2 describes how we generate the
SQL query by matching the keywords extracted in the first module against
our query bank; 4.3 describes the execution of SQL queries; and in 4.4 we try to
A typical information retrieval (IR) system responds to the user‘s query by selecting
IR system is able to filter out extraneous information and return only relevant
supply enough information for the system to determine what the user is looking for,
for example: queries for a World Wide Web search engine almost never exceed 4
words. An IR system well suited for general use would be able to process very
natural form queries. [6] We designed a system that starts with a query and returns
the database tables and fields that are relevant to the
original query. The system extracts keywords from the original query and chooses
tables and fields that match the user's expectations. Real examples
often have many lines of query, and these are usually stored in a database-
The module selects the specific text or keywords that summarize the original query.
Extracting keywords requires recognizing the characters that are input and
removing unwanted punctuation, symbols, numbers, and stop words. Our very first
task was to take out specific words from the user query. The user can enter
any query in natural language (English) as long as it is related to human resources;
our aim is to pick out the related words from that input. Keyword extraction starts by
reading the input into a text string. When the user enters a query in text form it immediately
special symbol characters, currency signs, and other math characters. However,
stripping away punctuation joins sentences together into a single long string of
words. This is fine for extracting keywords, but not phrases. The words at the end
of one sentence blend into those at the start of the next sentence, creating odd
word pairings that can give unexpected phrase-search results. Removing
Keyword extraction module use natural language queries using PHP on Apache
For example, when user writes a query “view me the salaries of employees along
We define a function that reads this query as a text string and saves it as an
array of words. This array is then searched for words like View, Salary, Employees,
Name and the value A. Keywords are extracted accordingly to build the syntax of the
SQL command for the database. View can be treated as SELECT,
Salary and Name as field names, and Employee as a table in the database.
This selection relies on a query bank in which we saved more than 40
words. These words were selected on the basis of their part-of-speech tags: we collected
distinctive words such as select, view, and display for manipulating a table; employee and
applicant for selecting a table; and name, salary, email id, etc. as field names.
Before that, we remove the blank space from the left and right of the search string by
$ltable=trim($ltable," ");
$rtable=trim($rtable," ");
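The keyword extraction and mapping described above can be sketched as follows. This is a minimal illustration, not the BIRS implementation: the $queryBank array stands in for the query bank table, its entries and the sample sentence are hypothetical, and extractKeywords() is a name chosen for this sketch.

```php
<?php
// Hypothetical, tiny query bank: word => [SQL word, kind].
// The real query bank is a database table holding more than 40 words.
$queryBank = [
    'view'      => ['SELECT',    'action'],
    'display'   => ['SELECT',    'action'],
    'employee'  => ['employee',  'table'],
    'employees' => ['employee',  'table'],
    'applicant' => ['applicant', 'table'],
    'salary'    => ['salary',    'field'],
    'salaries'  => ['salary',    'field'],
    'name'      => ['name',      'field'],
    'names'     => ['name',      'field'],
];

function extractKeywords($text, array $queryBank)
{
    // Lower-case the input, then blank out everything except letters and spaces.
    $clean = preg_replace('/[^a-z\s]/', ' ', strtolower(trim($text)));
    // Split on runs of whitespace, dropping empty pieces.
    $words = preg_split('/\s+/', $clean, -1, PREG_SPLIT_NO_EMPTY);

    $found = ['action' => [], 'table' => [], 'field' => []];
    foreach ($words as $word) {
        if (isset($queryBank[$word])) {
            list($sqlWord, $kind) = $queryBank[$word];
            if (!in_array($sqlWord, $found[$kind], true)) {
                $found[$kind][] = $sqlWord;   // keep each SQL word only once
            }
        }
    }
    return $found;
}

$keywords = extractKeywords('View me the salaries of employees along with their names.', $queryBank);
print_r($keywords);
```

Words that are not in the query bank (me, the, of, and so on) are simply ignored, which plays the role of stop-word removal in this sketch.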
Wouldn't it be nice if users simply understood SQL and could pose questions to
the database themselves? Obviously, not many users are going to invest the time
necessary to learn SQL, but we offer the next best thing: the ability to execute
English keyword extraction has two components: the query bank table and the
runtime engine. The query bank (QB) table makes it possible for the user to enter
different keywords in the query. The QB table also allows you to test it against queries
you think users will pose. For example, you might try asking the database “display
statements that define the meaning of the question and identify the corresponding
database objects.
The flow for searching a query for a keyword might be shown as:
If the search fails at step 3, then the user is redirected back to the search screen at
step 1.
[Flow diagram. Elements: User / admin, Text query, Keyword extraction, Selected keywords to next module.]
Figure 4.1
Language, also since our domain is specific one can only ask question related to
processed.
use of controlled query bank. Database comes with a root dictionary which already
bank that can define as many words as needed, so that this restriction can be overcome. We
designed our query bank on an ontological foundation: it groups
words of the same type that have similar meanings, for example education, qualification,
Once this collection is done, we need to build the SQL query. So far, we have
extracted specific text in order to get answers from the database. To do this, we enter
short text queries typed directly into the query pane. We apply the search
to an existing table. Visitors enter one or more keywords in the
search box, and we have to develop a query based on the visitor's selection.
Here we discuss how to create an SQL query to apply to the database with
Once the keywords are located, the query generation module comes next. In this module a
basic SQL query is generated from these words. For this we need
to tell the module where to find the equivalent SQL syntax. We do this by
defining a special table named IsKeyword in the query bank.
The visitor is limited to searching for an exact match or an anywhere match on the table.
For that we use our QB table and apply the search to the word in the IsKeyword table
of our MySQL database. Based on the selected type of keyword, the function then checks
whether it is a table name in the IsTable column or a field name in the Isfield column after
When the words match a table name or field name exactly, we create the
query using a simple WHERE condition, searching for the SQL-equivalent word in
the sqlword column.
function findop($data)
{
    // The statement that builds $sql (a lookup in the query bank) is missing from the source text.
    $result=mysql_query($sql);
    if(mysql_num_rows($result)>0)
    {
        // Take the SQL-equivalent word from the first matching row.
        $data=mysql_result($result,0,'sqlword');
        //echo "<script>alert('".$data."');</script>";
    }
    else
    {
        $data="";
    }
    return $data;
}
function istable($data)
{
    $data1=false;
    // As above, the assignment to $sql is not shown in the source text.
    $result=mysql_query($sql);
    if(mysql_num_rows($result)>0)
    {
        // The keyword names a table only when its istable flag is set.
        if(mysql_result($result,0,'istable')==1)
        {
            $data1=true;
            //echo "<script>alert('table');</script>";
            $table=mysql_result($result,0,'sqlword');
        }
        else
        {
            $data1=false;
        }
    }
    return $data1;
}
Otherwise, if a different kind of keyword matching is requested, we read the search term and
break it into an array of keywords using the split command. We then loop through all the
elements of the array and build the SQL command using a LIKE clause for
each word. Here is the code for this.
// One LIKE condition per non-empty keyword (foreach replaces each(), which was removed in PHP 8).
foreach($kt as $key => $val){
if($val<>" " and strlen($val) > 0){$q .= " name like '%$val%' or ";}
}
We have broken the text using the split command and then looped through the
keywords. Here, using one if condition, we have taken care that blank spaces are
$q=substr($q,0,(strLen($q)-3));
In the above line we first calculate the length of the string using strlen and
then use that value inside the substr function after subtracting 3 from it. We subtract 3
because the trailing "or" plus one blank space is 3 characters long. This way we will get the
function generateQuery($pieces)
Figure 4.2
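The LIKE-based query construction described in this section can also be written without trimming the trailing "or". The sketch below is an alternative formulation, not the authors' code: buildLikeQuery() is a hypothetical helper, the table and field names are assumed to come from the trusted query bank, and the search words are returned as parameters for a prepared statement instead of being pasted into the SQL text.

```php
<?php
/**
 * Builds "SELECT * FROM <table> WHERE <field> LIKE ? OR ..." together with its parameters.
 * $table and $field are assumed to come from the trusted query bank, never from the user.
 */
function buildLikeQuery($table, $field, $searchText)
{
    // Split on whitespace and drop empty pieces, so blank keywords never reach the query.
    $keywords = preg_split('/\s+/', trim($searchText), -1, PREG_SPLIT_NO_EMPTY);

    $conditions = [];
    $params = [];
    foreach ($keywords as $word) {
        $conditions[] = "$field LIKE ?";
        $params[] = '%' . $word . '%';
    }

    $sql = "SELECT * FROM $table";
    if ($conditions) {
        // implode() places OR only between conditions, so nothing has to be cut off afterwards.
        $sql .= ' WHERE ' . implode(' OR ', $conditions);
    }
    return [$sql, $params];
}

list($sql, $params) = buildLikeQuery('applicant', 'name', 'ali  khan');
echo $sql, "\n";
print_r($params);
```

Keeping the search words out of the SQL string avoids the injection risk that comes with interpolating user input directly into a LIKE clause.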
the translation from text does not handle truly random ad hoc questions. Our
implementation has focused more on providing greater usability through the use of
Our limitation here is that update and insert queries are difficult to implement.
Queries whose generation involves an inner join are also complex to achieve, for
example if the user asks to select many tables and fields in order to update them.
Here we execute the SQL query. Once the query is generated, it is passed on to this
module. We collect records from the MySQL database using PHP. Here we
use PHP's MySQL functions to interact with and collect data from the MySQL database.
These functions will be available for use if MySQL support is enabled in PHP
We use PHP functions to interact with SQL. The connection to the database is already
established. We use the function mysql_query() to execute the SQL select query.
function getprimaryfield($ltable,$rtable)
$result=mysql_query($sql);
The mysql_fetch_array() function returns one record as an array on each call. We will use while loop
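Here is the complete code to display the records from a MySQL table.
$dbusername='';
$dbpassword='';
$servername='';
// Connect to the server (the database selection and the query text are not shown in the source).
$link= mysql_connect($servername,$dbusername,$dbpassword);
$qt=mysql_query($query);
// Fetch one row per iteration until no rows remain.
while($nt=mysql_fetch_array($qt)){
echo "<br>";
}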
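The mysql_* functions used above were removed in PHP 7.0, so the same execute-and-fetch loop is sketched here with PDO. This is an illustration under assumptions, not the BIRS code: it uses an in-memory SQLite database (which assumes the pdo_sqlite extension is installed) so that it runs without a MySQL server, and the employee table and its rows are hypothetical.

```php
<?php
// Assumption: the pdo_sqlite extension is available. BIRS itself used MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table and rows, for illustration only.
$pdo->exec('CREATE TABLE employee (name TEXT, salary INTEGER)');
$pdo->exec("INSERT INTO employee VALUES ('Ali', 50000), ('Sara', 65000)");

// A prepared statement keeps user-supplied values out of the SQL text.
$stmt = $pdo->prepare('SELECT name, salary FROM employee WHERE salary >= ? ORDER BY name');
$stmt->execute([60000]);

// Fetch one row per iteration, as mysql_fetch_array() did in the original loop.
$rows = [];
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    $rows[] = $row;
    echo htmlspecialchars($row['name']), ': ', $row['salary'], "<br>\n";
}
```

For a MySQL server only the connection string would change, for example to a `mysql:` DSN with host and database name; the prepare, execute, and fetch calls stay the same.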
If there are more records in a table ( say more than 50) then it will not look nice to
display all the records in one page and asking visitors to scroll down to see all the
records. This will also slow the process of loading of records. So it is better to break
all the collected records into different pages with a fix number of records per page
(say ten records per page) .We have to give navigational link at left and right side
saying previous and next page. We also have to give links at the center so the
The menu can be improved further if there are more records. With 1000
records and 10 records per page, 100 links have to be shown at the
bottom of the page. This can be simplified in a more advanced script by showing the links
in groups.
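The paging arithmetic above can be sketched in Python as follows. This is only an illustration under assumptions: the function names and the group size of 10 links are hypothetical and are not taken from the project's PHP script.

```python
import math


def page_count(total_records, per_page):
    """Number of pages needed to show all records."""
    return math.ceil(total_records / per_page)


def page_slice(records, page, per_page):
    """Records that belong on the given 1-based page."""
    start = (page - 1) * per_page
    return records[start:start + per_page]


def grouped_links(current_page, total_pages, group_size=10):
    """Page numbers in the link group that contains current_page."""
    group_start = ((current_page - 1) // group_size) * group_size + 1
    group_end = min(group_start + group_size - 1, total_pages)
    return list(range(group_start, group_end + 1))


total_pages = page_count(1000, 10)        # 1000 records, 10 per page
links = grouped_links(27, total_pages)    # links shown while viewing page 27
```

With 1000 records and 10 per page this gives 100 pages, and while page 27 is displayed only the links for pages 21 to 30 are shown instead of all 100.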
BIRS takes natural language input once an authorized user is logged in. For security
reasons, applicants and employees have no access to it. Applicants are hired or approved
by the managers on the basis of their skills and education. An applicant submits a resume
for the required job. The managers select applicants, invite them for an interview, and
can call them for a second interview if needed. The manager then decides whether to hire
or decline the applicant. If an applicant is not interested in the job, the manager searches
for other applicants who fit the job from his own end, which means he does not have to wait
for another person to inform him. The manager can search directly through BIRS using
natural language. The manager can also review the employees' records: their salaries,
departments, designations, attendance, leave, etc.
5.1.1 Events/interrupts
The events that cause behavioral change within the system are presented from two
perspectives: one from the admin point of view and the other from the applicant point of view.
Login
Logout
New applicants
Personal information
Educational detail
Work experience
Reference
5.1.2 States
Take action
Login
New applicant
Entering NL
Control panel
Logout
The state transition diagram contains two modules: one from the manager’s perspective and
the second from the applicant’s perspective.
Manager’s perspective:
Figure 5.1 (state transition diagram; labels: Login, Control panel, Not verified, Logout)
When the login event is triggered, the system enters the user id and password state. Once
the user is verified, the manager types the natural language query in the
query text box. When the search event is triggered, the output state occurs. When
the login fails because of an invalid user id or password, the user has to log in again. The
enter query state can take input repeatedly, searching and
providing output each time.
Figure 5.2 (labels: Not a member, No query panel)
When the control panel is triggered, the system checks whether the user is authentic. If the user
is verified, the query panel is displayed. If the user is not a member, the query panel is not shown. The
applicant has access only to the new applicant event.
Applicant’s perspective:
The applicant fills in his or her C.V. through the new applicant event. The flow can be
represented as follows.
Figure 5.3 (labels: Applicant’s personal information, submit)
When the applicant triggers the new applicant event, the system enters the
applicant’s personal information state. Here the applicant fills in his/her name, father’s
name, address, country, email, phone, city, POB, religion, etc. After the submit event,
the system is in the educational detail state.
Figure 5.4 (labels: Educational detail, add, Add data, Go to step 3)
In the educational detail state the applicant enters his/her educational history, such as degree
name, degree, name of the institute, year of passing and grade. The applicant triggers the add
event to add the entry and can then fill in further educational details, if any; otherwise the applicant
triggers the go to step 3 event, which puts the system in the work experience state.
Figure 5.5 (labels: Work experience, add, Add data, Go to step 4)
In the work experience state the applicant fills in his/her work experience: the
period of work, where he/she worked, the name of the company, the
reason for leaving the company, and the salary last drawn. After entering this
information the applicant can trigger the add event and can also add further experience, if
any; otherwise the applicant triggers the go to step 4 event and the system enters the add
reference state.
Figure 5.6 (labels: Add reference, done, View information)
In the add reference state the applicant gives the details of the person who referred
him/her: the name, phone number, email id, etc. of the referring person. At the end, when the
applicant triggers the done event, the system comes into the view information state, where
the applicant can view his/her C.V.
Figure 5.7 (complete applicant flow; labels: Applicant’s personal information, submit, Educational detail, add, Add data, Go to step 3, Work experience, add, Add data, Go to step 4, Add reference, done, View information)
Every part of the processing is managed by the system. Only the input is required
from the authorized user. The front end is controlled by the login event and the new
applicant event. Members can access the login panel only by providing a valid user id
and password. The managers can also access the new applicant panel from the admin panel.
Applicants have no right to access the login panel; only the new applicant panel is
available to them.
We used the Linear Sequential Model for BIRS, which is sometimes also called the "classical life
approach to software development that begins at the system level and progresses
when the BIRS interface with other elements such as hardware, people and other
resources. System is the basic and very critical requirement for the existence of
BIRS. To extract the maximum output we re-engineered and spruced up the BIRS.
their search process in their specific domain that is HR in our case. Proper
understanding of all the factors made easy to achieve our goal. We started with
segmentations within HR. Our approach is to find keywords with high search
continuous process.
extraction and also analyzed that how they maintain their databases since our
department of any organization. This analysis helped us more in order to have the
traditional onsite and offsite project consulting in all major HR subject areas, interim
1. Personnel hiring
3. Employee Management
Figure 6.5
Personnel Hiring.
Figure 6.2
Figure 6.6
Employee Management:
review his/her record from different aspects. He can manage the employees on the
basis of their personal information, which he can view easily by using natural
language.
A record showing for each employee his or her gross pay, deductions, and net pay.
The payroll may also include details of the employer's associated employment
costs.
6.1.3 Design
BIRS design is actually a multistep process that focuses on four distinct attributes
Data structure:
Database name
Table's name
Columns name
Key words
BIRS architecture:
In the architecture we defined the whole flow of BIRS, from taking natural language
input to processing it into the output the user requires. The NL input is broken into chunks, which are
then used to generate an SQL query; this query is then executed and extracts the
Interface representation.
The user interface of BIRS is designed in PHP and HTML and follows an object-oriented
style. Because of this interface, even a non-specialist can easily use the software
to get the required output. The interface of BIRS with the outside world is represented by
Figure 6.7 (labels: Business Intelligent Reporting System, Administrator)
Procedure.
procedure for breaking a natural language which actually follows the following:
Free-form natural language text must be broken into small pieces such as action, field, table
and condition (if there is one). The sequence may vary; it is not consistent. Field or
BIRS contains a QUERY BANK which stores words and the corresponding SQL word.
It is a big task to maintain such a table because natural language is itself very
ambiguous: one word can have many different meanings. To make the query flexible
we did not restrict it to any special syntax, but since to capture natural language at this
This HR database could be used in many different ways. We have created queries
The first type is simple Query which can show the all fields of database on
The second type is the conditional query, which covers queries with a "where" clause.
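The two query types can be illustrated with a small Python sketch of keyword extraction followed by query generation. It is a simplified stand-in for the project's PHP modules, not their actual logic: the contents of QUERY_BANK, TABLES and FIELDS, the function names, and the restriction to a single "field is value" condition are all assumptions made for the example.

```python
# Hypothetical query bank: natural language action words and their SQL word.
QUERY_BANK = {"display": "SELECT", "show": "SELECT", "list": "SELECT"}
# Hypothetical table and field vocabularies for an HR database.
TABLES = {"employee": "employees", "employees": "employees"}
FIELDS = {"name", "salary", "designation", "department"}


def extract_pieces(text):
    """Break free-form text into action, fields, table and condition."""
    words = text.lower().replace("?", "").split()
    if "where" in words:
        idx = words.index("where")
        head, tail = words[:idx], words[idx + 1:]
    else:
        head, tail = words, []
    pieces = {"action": None, "fields": [], "table": None, "condition": None}
    for word in head:
        if word in QUERY_BANK and pieces["action"] is None:
            pieces["action"] = QUERY_BANK[word]
        elif word in TABLES and pieces["table"] is None:
            pieces["table"] = TABLES[word]
        elif word in FIELDS and word not in pieces["fields"]:
            pieces["fields"].append(word)
    # Only the simple pattern "<field> is <value>" is recognized here.
    if len(tail) >= 3 and tail[0] in FIELDS and tail[1] == "is":
        pieces["condition"] = (tail[0], " ".join(tail[2:]))
    return pieces


def generate_query(pieces):
    """Build a parameterized select statement from the extracted pieces."""
    if pieces["action"] != "SELECT" or pieces["table"] is None:
        raise ValueError("this sketch supports only select queries on a known table")
    columns = ", ".join(pieces["fields"]) or "*"
    sql = f"SELECT {columns} FROM {pieces['table']}"
    params = ()
    if pieces["condition"]:
        field, value = pieces["condition"]
        sql += f" WHERE {field} = ?"
        params = (value,)
    return sql, params


simple = generate_query(extract_pieces("Show all employees"))
conditional = generate_query(
    extract_pieces("Display salary of employees where designation is executive manager")
)
```

The first call yields a simple query over all fields and the second a conditional query. Words that are not in any vocabulary (such as "all" or "of") are simply ignored, which is what lets the word order vary.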
Both are used to provide user interface. But PHP is also used for key word
2. JSP
of the same ideas found in Java, the compiled object-oriented programming derived
from C++. JavaScript code can be embedded in HTML pages and interpreted by the
Although best known for its use in websites (as client-side JavaScript), JavaScript
Prototype-based
Prototypes
Functions double as object constructors along with their typical role. Prefixing a
function call with new creates a new object and calls that function with its local this
keyword bound to that object for that invocation. The function's prototype property
Functions as methods
definition and a method definition. Rather, the distinction occurs during function
method of an object, the function's local this keyword is bound to that object for that
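<?php
// Open a connection to the local MySQL server.
$con = mysql_connect("localhost","root","");
if (!$con) {
  // Stop when the connection cannot be opened.
  die('Could not connect: ' . mysql_error());
} else {
  // Close the connection once the work is done.
  mysql_close($con);
}
?>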
// Create table
mysql_select_db("my_db", $con);
FirstName varchar(15),
LastName varchar(15),
Age int
)";
// Execute
<?php
$con = mysql_connect("localhost","peter","abc123");
if (!$con)
mysql_select_db("hr_db", $con);
while($row = mysql_fetch_array($result))
mysql_close($con);
?>
<?php
$con = mysql_connect("localhost","peter","abc123");
if (!$con)
mysql_select_db("my_db", $con);
WHERE FirstName='Peter'");
while($row = mysql_fetch_array($result))
?>
6.1.5 Testing
We performed all required testing to verify the working of BIRS .we perform
Like.
Display the record of the employees having age > ('50') years and calculate his
total salary from his joining date and find out his total attendance of the year
('2003').
Lexical ambiguity.
It means that the query does not clearly specify what the user requires.
Here it includes the semantic ambiguity .This is often the case, for example, with
idiomatic expressions whose definitions are rarely or never well-defined, and are
required data.
For designing the prototype of BIRS we required tools that provide a
convenient interface to the user; we used PHP and HTML for prototyping.
This section provides cost estimates for the project. Cost and time analysis is a
very important phase of the project and different organizations have different
methods of calculating cost of the project. The experience of experts available is
also important in order to calculate the true cost of project. We have tried to
calculate cost of our project on the basis of two measures.
Time: Time to complete the project
Personnel working: number of developers available to work on the project
1. Cost of Utilized Resources
Following are the estimations made before the start of the project:
Total Estimated Time to complete the project = 26 weeks
(Note: This time includes programming + research + finding solution to our
problems by consulting experts.)
Total number of hours that each developer works on the project per week =
15 hrs
Total number of working hours for the entire project duration = 26 * 15 = 390 hrs
Total number of developers available to develop the project = 4 persons
Therefore all the workers will collectively work for 390 * 4 = 1560 hrs
Estimated amount of money while considering all three measures such as
LOC, time, and cost of utilized resources.
Therefore, the total cost of the project is Rs. 25,000, out of which 40%
is the cost of utilized resources, which includes electricity, systems software,
transportation, etc., and 60% is the cost of other expenses that we might have
in the course of the project. For software project estimation, effort
estimation is as important as cost and time estimation. Effort is
measured in person-months.
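The figures in this estimate can be recomputed with a few lines of Python. The sketch uses only the numbers stated in this section (26 weeks, 15 hours per week, 4 developers, Rs. 25,000, 40% and 60%) and assumes that the 15 hours per week apply to each developer; the variable names are illustrative.

```python
WEEKS = 26                # estimated project duration in weeks
HOURS_PER_WEEK = 15       # working hours per developer per week (assumed per developer)
DEVELOPERS = 4            # number of developers
TOTAL_COST_RS = 25_000    # total estimated cost in rupees
RESOURCE_PERCENT = 40     # share of the cost spent on utilized resources

hours_per_developer = WEEKS * HOURS_PER_WEEK          # 26 * 15
total_hours = hours_per_developer * DEVELOPERS        # all developers together
resource_cost = TOTAL_COST_RS * RESOURCE_PERCENT // 100
other_cost = TOTAL_COST_RS - resource_cost

print(hours_per_developer, total_hours, resource_cost, other_cost)
```

Under these assumptions each developer works 390 hours, the team works 1560 hours in total, and the Rs. 25,000 splits into Rs. 10,000 for utilized resources and Rs. 15,000 for other expenses.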
The third phase involved deployment of the design space, evaluation of potential
solutions and refining these solutions as necessary until we were satisfied with the
demonstrated improvements.
Software cost and effort estimation can never be an exact science. Too many
variables (human, technical, environmental, political) affect the ultimate cost of
software and the effort applied to develop it. The model we used for cost and effort
estimation is the COCOMO 2 [3] model. The COCOMO cost estimation model is used by thousands of software
project managers, and is based upon a study of hundreds of software projects.
Unlike other cost estimation models, COCOMO is an open model, so all of the
details are published, including:
The underlying cost estimation equations
Every definition (e.g. the precise definition of the Product Design phase of a
project)
Because COCOMO [3] is well defined, and because it doesn't rely upon proprietary
estimation algorithms,
on proprietary models
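As an illustration of the published equations, the COCOMO II effort equation can be sketched in Python. This is a hedged example and not the estimate produced for BIRS: it assumes the COCOMO II.2000 calibration constants (A = 2.94, B = 0.91) and nominal scale factor ratings, and the 10 KSLOC project size is a hypothetical input.

```python
A = 2.94   # effort constant (COCOMO II.2000 calibration, assumed here)
B = 0.91   # base exponent (COCOMO II.2000 calibration, assumed here)

# Nominal ratings of the five COCOMO II scale factors (assumed values).
NOMINAL_SCALE_FACTORS = {
    "PREC": 3.72,  # precedentedness
    "FLEX": 3.04,  # development flexibility
    "RESL": 4.24,  # architecture / risk resolution
    "TEAM": 3.29,  # team cohesion
    "PMAT": 4.68,  # process maturity
}


def effort_exponent(scale_factors=NOMINAL_SCALE_FACTORS):
    """E = B + 0.01 * (sum of scale factors)."""
    return B + 0.01 * sum(scale_factors.values())


def effort_person_months(ksloc, scale_factors=NOMINAL_SCALE_FACTORS,
                         effort_multipliers=()):
    """PM = A * Size^E * product of effort multipliers (all 1.0 when nominal)."""
    product = 1.0
    for multiplier in effort_multipliers:
        product *= multiplier
    return A * ksloc ** effort_exponent(scale_factors) * product


estimate = effort_person_months(10)  # hypothetical 10 KSLOC project
print(round(estimate, 1))
```

Because the exponent is above 1 under nominal ratings, effort grows slightly faster than size, so doubling the code size more than doubles the estimated person-months.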
Figure 7.2
Figure 7.3
Figure 7.4
Figure 7.5
Figure 7.6
Figure 7.7
Figure 7.8
Figure 7.9
Figure 7.10
Our actual results are the same as our earlier estimates, because we did not use any
external resources.
HARDWARE
RAM,
Hard disk,
Processor speed
Operating system
SOFTWARE
XAMPP server
SQLyog
Test Plan
We want to produce a bug-free product and make sure that there is no defect in our
project, so we spent a large amount of time on testing. We provide a description of
the testing procedure and strategy.
Overview
system.
The following contingencies exist in order to successfully
complete the testing activities as outlined in this document:
Table 8.1
This section provides an overview of the entire test document. In BIRS we need to
test whether each individual module runs properly, so we performed unit testing. We
have three modules:
Keyword extraction
SQL query generation
Query execution and report generation
The above modules were tested in unit testing.
After completing unit testing we successfully carried out integration testing to
integrate these individual modules.
On the recruitment side we performed validation testing to ensure that all fields in
the curriculum vitae take valid values.
Security testing applies to the administration panel because we are working on the
database. To secure the database we restrict access to it: a user name and password are required, and
only the administrator can log in and perform actions.
Stress testing also applies to the administration panel. When the administrator is logged in to access the
database and enters a natural language query, we submit a query of maximum length and
analyze how many words can be taken at a time.
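The validation and stress checks described above can be illustrated with a short Python sketch. It is an assumption-laden example, not the project's test code: the email pattern, the limit of 50 words and the function names are hypothetical.

```python
import re

# Simplified email pattern for illustration; real addresses allow more forms.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")
MAX_QUERY_WORDS = 50  # hypothetical upper limit on the query length


def is_valid_email(value):
    """Validation check: the whole field must match the email pattern."""
    return EMAIL_PATTERN.fullmatch(value) is not None


def within_word_limit(query, max_words=MAX_QUERY_WORDS):
    """Stress check: does the natural language query stay within the word limit?"""
    return len(query.split()) <= max_words


checks = {
    "valid email": is_valid_email("aaa-aa@hotmail.com"),
    "invalid email": is_valid_email("aaa aa@hotmail"),
    "short query": within_word_limit("show all employees"),
    "long query": within_word_limit("word " * 51),
}
```

Here the first and third checks pass while the second and fourth fail, which is the behavior a validation test and a stress test of this kind would look for.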
This is the format which we apply and it extracts the keywords according to the
following pattern.
Value: Value:
executive executive
manager manager
Value: Value:
manager manager
lastname,, lastname,,
dateofbirth dateofbirth
Table 8.2
Test case 2
Extracted keywords: Action: display; Table: employee; Field: salary; Value: executive manager
Generated query (identical in both result columns): select salary from employees where designation is executive manager
Remarks: Here, against "display", it generates a select query.
Test case 3
Extracted keywords: Action: show; Table: applicant; Value: hired applicants
Generated query (identical in both result columns): Select * from hired applicants;
Remarks: "Show" also indicates a select query; it select hired applicants are absent then run select or display the names.
Value: manager
lastname,,
dateofbirth
Table 8.3
masters
Table 8.4
Phase 1
First we integrated the keyword extraction and query generation modules. To
analyze the required output we created test cases and alternate cases to check whether these
two modules integrate properly. The input of this phase is natural language and the output
is an SQL query.
Table 8.5
Phase 2:
After successfully integrating these two modules, we integrated the result
with the third module, which extracts information from the database
and generates the report. The input of this phase is a natural language query and the output is
the generated report.
manager
Table 8.6
assign integer or special character `–` and follow the following pattern: aaa-aa@hotmail.com
Table 8.7
Figure 8.1
Figure 8.2
5 Multi query run Display Only show Only show the pass
at a time by the salary the salary of salary of
using of manager manager
innerjoin.first executive
search and manager
display
6 Simple select List down Search the Search the cv's pass
query display the cv's of cv's of hired of hired
all records from hired applicants applicants
Table 8.8
We did not use any testing software or any special tool for testing. We
decided to use a simple method for testing. Each programmer tests the
components or functions he or she created separately and hands them over to the lead
tester. The lead tester tests each component and notes the result in the test result
table. Once the product is completely developed, all members of the software
project team test the software together.
Table 8.9
We did not use any testing tool. All necessary testing was performed manually, or at the
time of implementation.
For the average reader, the workings of keyword search engines are fairly obvious.
NLP and KB engines are similar in that they both are semantic search techniques
and they understand the meaning of a search query. However, they differ in their
approaches. NLP engines use natural language processing to understand
meaning, while KB engines apply a broader suite of knowledge with NLP being a
potential part of it. It is most similar to how we as people understand the meaning of
a sentence. We use a variety of knowledge bases, some vertical or domain specific
and others general. We might use our knowledge of banking services, alternate
fuels, basic math, or conversion rules (like three feet is one yard), and so on.
Before we start analyzing each of the techniques let us understand what really
matters in the business of search. Three things that top the list are
(3) The scope or how broadly it can be applied across multiple domains.
Let's compare the keyword, NLP and KB search techniques across these key
metrics.
With regard to learning, major search engines have over time come to use a variety
of sophisticated techniques like searching for word clusters, making spelling
suggestions, looking for phrases, and so on to produce very good search results
compared with a basic keyword search. Such enhancements can continue to
improve the quality of search results, but not significantly more so than the current
poor-quality results that are widely agreed to be under 5% relevant.
NLP Engines
NLP-based search engines can certainly improve the quality of search results by
using computational linguistics to understand the meaning of the search query and
match it to content. It is realistic to expect that the quality of results can go into the
20-25% range.
Although NLP search engines cannot be universally applied like keyword engines,
they can cover as much as 50-60% of current search content.
As for learning, NLP search engines will not offer any better search results over
time using the same linguistics processors. It's not like our knowledge of language
will be so much better next year and the year after.
Overall, given the leap in quality of results and the broad applicability across
language-based content, NLP engines have the capability to produce the next big
leap in general Internet search.
KB engines will most closely emulate human capabilities and can continually
improve on search results as the knowledge base grows through community input.
KB engines are constrained by the lack of readily available general and vertical-
focused knowledge bases and the time it takes to develop them. Lastly, like NLP
engines, KB engines do not have any advantage over keyword engines when the
search content or user queries are just keywords or unconnected words.
The Future
Look for NLP engines to offer the next big leap in general Internet search given
their broad applicability and improved quality of results over keyword search
engines. Also, in the near future, KB search engines will offer the highest-quality
search results across verticals. With knowledge bases built and shared across
more verticals, KB search engines can become the best of the new generation of
search tools.
We can be certain that future search engines will be driven not just by keyword
search techniques but by a combination of keyword, NLP and KB search
techniques. KB engines will be used when knowledge bases are available. NLP
engines will kick in when content is language-based but no knowledge base is
available. Finally, keyword engines will be used for all other content and serve as
the search engines of last resort.
Conclusion / Summary
We concluded that by designing software like BIRS we can give the end user the
freedom to do anything by using natural language. We also focused on the business domain,
where high-level management needs to manage its database, and so provided a syntax-free
environment to query and manage the database. Moreover, we also facilitated the applicant
We observed during the design of BIRS that it is difficult to capture the whole language,
since NL is itself very ambiguous and versatile, so a point came where we needed to limit the
free-form text in order to get more efficient results. Moreover, care must be taken about the
semantics and syntax of the language, because it is the most important feature to describe any
A.APPENDICES
I. Project Schedule
Sadaf Nasim: chapters 1, 3, 6, 9, 10 and the starting pages
Mahvash Iftikhar Qurashi: chapters 2, 3 and 5, the snapshots of the system
and the class flow diagram
2. Flowchart (labels): start, User login, verifying (YES/NO), Text input, Keyword extraction, SQL generation, correct (YES/NO), SQL execution, output
Flowchart (labels): start, Personal information, Educational details, Work experience, reference, submit; each step is linked to the HR database
B. References
[1] www.devarticles.com/c/a/PHP/Object-Oriented-Programming-in-PHP/4/
[2] http://searchsoa.techtarget.com/sDefinition/0,,sid26_gci212681,00.html
[3] MySQL 5_1 Reference Manual 1_3_1 what is MySQL.mht
[4] http://www.mysql.com/company/legal/licensing/.
Techniques
[7] http://www.fhwa.dot.gov/programadmin/mega/cefinal.cfm
[8] http://en.wikipedia.org/wiki/Project_plan
[9] http://www.softstarsystems.com/cocomo2.htm
[10] http://www.softstarsystems.com/overview.htm
[11] http://hissa.nist.gov/HHRFdata/Artifacts/ITLdoc/235/chapter7.htm(int)
[12] http://searchsoftwarequality.techtarget.com/sDefinition/0,,sid92_gci1243430,00.html
[13] http://infolab.stanford.edu/~burback/watersluice/node22.html
[14] http://www.extremeprogramming.org/rules/unittests.html
[15] http://diveintopython.org/unit_testing/index.html
[16] http://www.waldentesting.com/backup/services/validation.htm
[17] http://searchsoftwarequality.techtarget.com/topics/0,295493,sid92_tax306128,00.html
[18] http://www.testingeducation.org/k04/TestMatricesExamples.html
[19] http://www.sqaforums.com/showthreaded.php?Number=302637
C. Glossary
Ambiguity
Statements or arguments used in a work that may have more than one meaning or
interpretation. Ambiguity refers to the ways words or phrases can connote a range
of meanings. Ambiguity points to the openness of language to different.
Artificial intelligence
applies to a computer system that is able to operate in a manner similar to that of
human intelligence; that is, it can understand natural language and is capable of
solving problems, learning, adapting, recognizing, classifying, self-improvement,
and reasoning.
Bug
A software bug is an error, flaw, mistake, failure, or fault in a computer program that
prevents it from behaving as intended (e.g., producing an incorrect or unexpected
result).
Chunks
A part of something that has been separated; A representative of a substance at
large, often large and irregular; To break into large pieces.
Cocomo ii
The COCOMO cost estimation model. COCOMO is an open model. Because COCOMO is
well defined, and because it doesn't rely upon proprietary estimation algorithms,
COCOMO estimates are more objective and repeatable. COCOMO can be
calibrated to reflect your software development environment, and to produce more
accurate estimates.
Configuration
an arrangement of parts or elements. To set up or arrange something in such a
way that it is ready for operation for a particular purpose.
Costar:
Data Base
Data types
In programming languages a data type is an attribute of a datum which tells the
computer (and the programmer) something about the kind of datum it.
Domain
A domain is a set of allowable values for one or more attribute.
Event
Something that happens at a given place and time.
Frame work
A framework is a basic conceptual structure used to solve or address complex
issues. This very broad definition has allowed the term to be used as a buzzword,
especially in a software context. A software framework is "the skeleton of an
application that can be customized by an application developer".
Hardware
Human resources
Interface.
An interface defines the communication boundary between two entities, such as a
piece of software, a hardware device, or a user. The point of interconnection
between two systems or subsystems.
Key word
Keyword is a word which occurs in a text more often than we would expect to occur
by chance alone. A significant word or phrase in the title, subject headings,
contents notes, abstract, or text of a record in an online catalog or database which
can be used as a search term in a free-text search to retrieve all the records
containing it. Keywords are searched in any order.
Knowledge base
The facts, relationships, and procedures that constitute the knowledge about a
given domain or task; the database of an expert (or knowledge based) system.
Model
A physical model is a smaller or larger physical copy of an object. A representation
of a set of components of a process, system, or subject area, generally developed
for understanding, analysis, improvement, and/or replacement of the process.
Object oriented
In programming, a combination of code, which is a sequence of instructions
referred to as functions, along with data units, referred to as structures. A design
methodology decomposing problems into objects rather than procedures.
Open source
Computer software for which the source code is freely available. Source code of a
computer program that is disclosed to the public. Software registered as open
source can be accessed, modified and adapted to new.
Part-of-Speech
One of the traditional categories of words intended to reflect their functions in a
grammatical context
Personnel
Persons collectively in the employ of a business.
Performance testing
In the computer industry, software performance testing is used to determine the
speed or effectiveness of a computer, network, software program or device.
Platform
A platform describes some sort of hardware architecture or software framework.
Procedure
Query
A search request submitted to a database or search engine. Used to find specific
content and files. a stored question about information in a database; when you
create a query, you ask a computer to quickly find information that answers a
question that you specify.
Resume
A curriculum vitae; an account of one's employment history and qualifications (often
for presentation to a potential future employer when applying for a job).
Scripting language
A scripting language differs from other typical languages in that it is
usually simpler to learn and use and does not need to be compiled. The
language is interpreted at run-time, so you can execute instructions immediately.
Search engines
Web sites which allow users to query a database of other sites, e.g. Google, Yahoo,
MSN Search.
Semantics
A relationship between words, phrases or any other allowable constraint and their
actual meaning.
Server
A computer or application, that provides a service to client software on other
computers. Servers are used for web hosting and other web applications.
Software
Software is a general term for the various kinds of programs used to operate
computers and related devices.
Software testing
SQLyog.
It is software for designing SQL databases.
Security testing
Process to determine that an IS (Information System) protects data and maintains
functionality as intended. The six basic security concepts that need to be covered
by security testing are: confidentiality, integrity, authentication, authorization,
availability and non-repudiation.
State
The way something is with respect to its main attributes.
Syntax
Systematic
A set of orderly, structurally inter-related steps based on a network of concepts,
principles and rules.
Technique
XAMPP server
XAMPP is a free and open source cross-platform web server package, consisting
mainly of the Apache HTTP Server, MySQL database, and interpreters for scripts
written in the PHP and Perl programming languages.