

ARTHA
(Artificially Responding THinking Agent)
An automated call centre response system
Category: Artificial Intelligence

N. Bhanu Teja
Semester: 6th
KLS Gogte Institute of Technology, Belgaum-590008.
Email: nicky.rocky@gmail.com

Akash Sankpal
Semester: 6th
KLS Gogte Institute of Technology, Belgaum-590008.
Email: akash.sankpal25@gmail.com

1. ABSTRACT:

Imagine a day when you are able to "ask" your computer the solution to a problem. The computer would be able to "think" and give you the solution. "Asking" a computer does not mean typing a query, nor a string given to a search engine. It means "speaking" with the computer through a microphone and getting a vocal answer.

This white paper aims to give a good overview of Artificial Intelligence applied to call centre applications. If implemented successfully, this can have huge business payoffs, as it helps in completely automating the customer call centre. But the complexity involved in implementation calls for a huge development effort and a lot of research in related areas. The paper discusses only an overview of the architecture, not the detailed design.

The paper concentrates on the call centre application of such a system. The "Knowledge Server", built with the ARTHA (Artificially Responding THinking Agent) architecture, will answer vocal queries from clients verbally.

The ARTHA architecture proposes to make this possible by combining already existing speech processing techniques with an Intelligence Engine (BRAIN). The Intelligence Engine is the core of ARTHA; it makes the computer first "learn" the product specifications and then suggest the answer to a problem.

This technology, if perfected, would produce spin-offs in a variety of areas such as automobiles, aeronautics, medicine, and many more.

2. Keywords:

Artificial Intelligence, Natural Language Processing.

3. Why THINKING machines?

The answer is the requirement of the problem itself. The questions asked by the customers at a call centre are never the same, i.e. for the same problem there can be N ways of asking it. Also, the users of the product may range from novice to expert. The questions may range from "How do I turn my PC on?" to "How do I fix Exception No. 2010 raised by the system?" So the KM should be able to derive the "meaning" of a particular question. This is not possible with the simple "programmed" machines we are currently using. A thinking machine could "understand" the question and try to answer it.

4. Overview of ARTHA

ARTHA is an architecture which couples voice communication with "knowledge". The main components of ARTHA are:

The speech path: This is a conventional voice path, which could be realized with already available real-time protocols such as RTP.

Speech to text: Converting the voice to text. This is a key module, as it involves successfully resolving differences in accent and slang used at different geographic locations.

Text to machine-understandable grammar: The computer does not understand text as it is. To derive a "meaning" from the text, it has to know the grammar of that language and figure out the intended meaning. The annotated text representation used for this is known as a "corpus".

BRAIN: The heart of the entire ARTHA. This module "thinks". It has to understand the corpus and derive the meaning of the question asked by the customer. Then it has to work out the answer to that question based on its learning, and finally give a grammatically correct answer.

Text to speech: This is the answering part. The machine output is converted back into speech.

Knowledge Base: This is the "memory", or database, of all the information that the Knowledge Server needs to know about the product.

Learner: The learning mechanism of ARTHA. The Learner is first fed with the technical details of the product to be handled by ARTHA.

Monitor: A module which helps in monitoring the "mental status" of ARTHA. It keeps track of the intelligence level and other book-keeping information about ARTHA.

[Fig 1. Block diagram of the ARTHA architecture: a customer query (e.g. "Hello there.") passes through speech recognition to a text representation, then to a grammar tree; grammar understanding is done by the BRAIN (Artificial Intelligence), supported by the Knowledge Base, the LEARNER (fed with product info), the TUTOR and the MONITOR; the text output (e.g. "Hi, may I help you?") is converted back to speech.]

5. Description

The speech path:

Libraries can be built to support different mechanisms of speech input. Some of the possible ways are:

• Speech input through the network via RTP.

• Speech input through a microphone.
• Just a simple typed text query.
• Speech path through a telephone.

Speech to text:

Spoken language is a collection of acoustic signals, which can be digitized. These samples can be converted into a sentence of distinct words by matching segments of the incoming signal against a stored library of phonemes. There are readily available libraries which can do this; "Dragon Speech" is one such library. But these libraries have limitations too: they need to take a sample of the speaker's voice before properly converting the speech, whereas in a typical call centre scenario the speech must be converted immediately, from the first word itself. This can be overcome if some probability of error is accepted for the first few sentences.

The phoneset is the list of 'phones', or speech sounds, that the engine can recognize. When you build acoustic models and pronunciations for words, they can be made to use any set of units, but they must be the same units. The acoustic models search for the speech sounds (phones), and the word pronunciations are also given in terms of the phones in the phone set.

The default phoneset for American English that comes with Sphinx2 contains the following phones: AA AE AH AO AW AX AXR AY B CH D DH DX EH ER EY F G HH IH IX IY JH K L M N NG OW OY P R S SH T TH UH UW V W Y Z ZH. There is also the silence phone, SIL, and a number of "noise" phones: +BREATH+ +COUGH+ +LAUGH+ +SMACK+ +UH+ +UHUM+ +UM+.

A dictionary can be formed based on the phone set. The dictionary may contain many ways of pronouncing a word, e.g.:

1. ELEVEN      AX L EH V AX N
2. ELEVEN(2)   IY L EH V AX N
3. EXIT        EH G Z AX T
4. EXIT(2)     EH K S AX T
5. EXPLORE     IX K S P L AO R
6. FIFTEEN     F IH F T IY N

This is a sample dictionary; with it, words are derived from the recognized phones.

Text to grammar:

The computer does not understand the text obtained from speech conversion by itself. We need a way of extracting the meaning of the text. The given text is broken down into words, and these words can be matched against a predefined grammar tree so that the machine can arrive at its meaning.

To achieve this we first represent the text in annotated form. This annotated form is called a corpus. In principle, any collection of more than one text can be called a corpus (corpus being Latin for "body", hence a corpus is any body of text). But the term "corpus", when used in the context of modern linguistics, tends most frequently to have more specific connotations than this simple definition.
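As a minimal sketch of the word-matching step described above (the tags follow the paper's pen/table example; the tag for "is" is an assumption, not taken from the paper's tables):

```python
# Sketch: breaking text into words and annotating each against a small,
# hand-built corpus of part-of-speech tags.
CORPUS = {
    "the": "det",    # determiner
    "pen": "n",      # noun
    "table": "n",    # noun
    "on": "pr",      # preposition
    "is": "v",       # verb (assumed tag, for illustration only)
}

def tag(sentence):
    """Break a sentence into words and annotate each with its POS tag."""
    words = sentence.lower().rstrip(".").split()
    return [(w, CORPUS.get(w, "unknown")) for w in words]
```

With this, `tag("The pen is on the table")` yields the annotated word list that a grammar tree can then be built over; unknown words fall out as "unknown" and would need a larger corpus.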

The more specific sense is a machine-readable corpus. Machine-readable corpora possess the following features:

• They can be searched and manipulated at speed.
• They can easily be enriched with extra information.

For example, the form "gives" contains the implicit part-of-speech information "third person singular present tense verb", but in normal reading this is only retrieved by recourse to our pre-existing knowledge of the grammar of English. However, in an annotated corpus the form "gives" might appear as "gives VVZ", with the code VVZ indicating that it is a third person singular present tense (Z) form of a lexical verb (VV). Such annotation makes it quicker and easier to retrieve and analyze information about the language contained in the corpus.

Assume that we need to understand the sentence "Sun gives light". In one of the available corpora, the "SUSANNE Corpus", we have it defined as:

Sun    N   - noun
Light  N   - noun
Gives  VVZ - third person singular

Based on this, a lexical tree can be built to arrive at the meaning.

Let us take an example of forming a grammar tree. Suppose it is required to form a tree for the sentence "The pen is on the table", and we expect the response system to give the answer "Remove the pen." Suppose our basic corpus contains the following entries:

pen   - n   - noun
table - n   - noun
on    - pr  - preposition
the   - det - determiner

The sentence is made of a noun phrase and a verb phrase:

"The pen"         - noun phrase
"is on the table" - verb phrase

The grammar for a sentence is given as:

s  → np, vp
np → det, n
vp → [is], pp
pp → pr, np

Expanding the given sentence according to the above grammar yields the tree of Fig 2.

[Fig 2. Parse tree for "The pen is on the table": s branches into np (det "the", n "pen") and vp ("is" and pp), where pp branches into pr "on" and np (det "the", n "table").]

Representing a sentence in tree form also helps in determining the correctness of the sentence. We will now see how this tree can be used to answer questions.

The corpus entries above are extended with some more information, such as:

pen   - n   - noun        -- Object
table - n   - noun        -- Object
on    - pr  - preposition -- meaning
the   - det - determiner

The added annotations (after the double dash) form the custom corpus, which helps in response systems. By looking into the tree and comparing with the corpus, we can write the sentence as:

The pen    is on      the table
Object1    Meaning    Object2

This reduces to object1-on-object2. For such a sentence, the answer given would be "remove the object1", i.e. "remove the pen".

This is a simple case consisting of a few words. To handle a large combination of such conditions, we can implement a handler-driven application as shown in Table 1:

Table 1:
Pattern                  | Handler
Object1 – on – Object2   | Remove_it()
Object1 – with – Object2 | Send_it()

The handler could be implemented as shown:

Remove_it ()
{
    Print ("remove the ", object1);
}

producing the output: "remove the pen".

This was a case where there was no need for AI. If the same response is to be generated using AI, we implement the handler accordingly, as shown in Table 2:

Table 2:
Pattern                  | AI Handler
Object1 – on – Object2   | AI_handler1()
Object1 – with – Object2 | AI_handler2()

AI_handler1 ()
{
    ..
    ---BRAIN---
    ..
}

The AI handler can be implemented in any of the different methods available.

The BRAIN:

BRAIN stands for Binary Reconfigurable Artificial Intelligence Network. It is a collection of different AI modules, which enables it to "understand" the sentence. There are thousands of different AI techniques in use, and they are constantly being improved; a lot of research is being done in this area.

Some of the techniques used in intelligent agents are: Belief Networks (also Bayesian Networks), Backward Chaining, Fuzzy Logic, Heuristics, Information Extraction, Data Mining, etc.

This paper does not go into the details of any of these techniques, as that would be a long discussion and is out of scope. However, for the implementation of the BRAIN, some or all of the above methods could be used. Implementation of such software/hardware may require a huge amount of man-hours.

The BRAIN uses the Knowledge Base for querying the stored information about the product.
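The reduction-and-dispatch flow of this section (a tagged sentence reduced to object1-relation-object2, then routed through the handler table) can be sketched as follows; all function names are illustrative, not part of ARTHA:

```python
# Sketch of the handler-driven dispatch of Table 1 (hypothetical names).
# A tagged sentence is reduced to an (object1, relation, object2) pattern,
# and the relation selects the handler that produces the response.

def remove_it(object1, object2):
    # Handler for the "Object1 - on - Object2" pattern.
    return f"remove the {object1}"

def send_it(object1, object2):
    # Handler for the "Object1 - with - Object2" pattern.
    return f"send the {object1}"

# Pattern table: relation word -> handler (cf. Table 1).
HANDLERS = {"on": remove_it, "with": send_it}

def reduce(tagged):
    """Keep only nouns and prepositions, dropping determiners and the
    copula, as in 'The pen is on the table' -> pen-on-table."""
    content = [w for w, t in tagged if t in ("n", "pr")]
    object1, relation, object2 = content
    return object1, relation, object2

def answer(tagged):
    """Dispatch the reduced pattern to its handler, or return None."""
    object1, relation, object2 = reduce(tagged)
    handler = HANDLERS.get(relation)
    return handler(object1, object2) if handler else None
```

For the tagged sentence of the pen example, `answer` returns "remove the pen"; an unknown relation yields None, which would correspond to a failed query being logged with the MONITOR.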

The BRAIN tries to get the optimal answer for the question asked by the customer. If it fails to find an answer, it produces a negative reply to the customer and logs the discussion as failed with the MONITOR. If the BRAIN finds a solution for the question, it gives a textual output, which can be converted back into speech or just displayed as text.

[Fig 3. The design of BRAIN: a grammar tree feeds the AI kernel, whose handlers draw on belief networks, heuristics, backward chaining, propagation networks, fuzzy logic, data mining and search algorithms; an interpreter exchanges queries and results with the knowledge base via data acquisition, under the MONITOR.]

The "knowledge" is stored in a form which is understood by the BRAIN. Neural networks are well suited for data mining tasks due to their ability to model complex, multi-dimensional data. As data availability has magnified, so has the dimensionality of the problems to be solved, thus limiting many traditional techniques such as manual examination of the data and some statistical methods. Although there are many techniques and algorithms that can be used for data mining, some of which can be used effectively in combination, neural networks offer the following desirable qualities:

• Automatic search of all possible interrelationships among key factors.
• Automatic modeling of complex problems without prior knowledge of the level of complexity.
• Ability to extract key findings much faster than many other tools.

The interpreter of the BRAIN queries a DBMS (Data Base Management System) using common query languages such as SQL. The generic interface helps in improving the flexibility and scalability of the knowledge base.

[Fig 4. The Knowledge Base: the interpreter translates queries into SQL against the DBMS and returns results, supported by data mining, search algorithms and data acquisition.]

The LEARNER:

This module "learns" about the product. Different types of learning methods are: batch, incremental, on-line, off-line, deterministic, stochastic, adaptive, instantaneous, pattern, constructive, and sequential learning. Based on the complexity, a particular type of technique can be selected. The Learner learns the product specification and stores it in the knowledge base for retrieval by the BRAIN. The input to the Learner can again be through speech or pure text.
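A minimal sketch of this write/read path, assuming an SQLite store and a hypothetical one-table schema (`product_info`); the LEARNER writes product facts and the BRAIN's interpreter reads them back over SQL:

```python
import sqlite3

# Sketch: the Knowledge Base as a DBMS queried over SQL, as described
# above. The schema, table and topic names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product_info (topic TEXT PRIMARY KEY, answer TEXT)")

def learn(topic, answer):
    """LEARNER: store a product fact in the knowledge base."""
    conn.execute(
        "INSERT OR REPLACE INTO product_info VALUES (?, ?)", (topic, answer)
    )

def query(topic):
    """BRAIN interpreter: fetch the stored answer, or None if unanswered
    (a failed query would then be logged with the MONITOR)."""
    row = conn.execute(
        "SELECT answer FROM product_info WHERE topic = ?", (topic,)
    ).fetchone()
    return row[0] if row else None
```

A generic interface like this keeps the BRAIN independent of the particular DBMS, which is the flexibility and scalability point made above.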

At later stages, the LEARNER can be improved to accept just the datasheet of the product and prepare the knowledge base from it.

MONITOR:

This is a book-keeping module which keeps track of ARTHA's performance. Any unanswered query gets logged in the MONITOR, where it can be viewed by the administrator, and inputs about it can be given to the LEARNER. The MONITOR also monitors the "health" and "intelligence" level of the BRAIN.

6. Further Enhancements

ARTHA could be further improved and used for research purposes such as automated design, where ARTHA is asked to design or enhance the capabilities of a product based on the current design, or to build a completely new system altogether. However, that needs a lot of research in this field.

Related work

The research on artificial response systems started long back, leading to some very capable systems. The major research institutes at the forefront are MIT, CMU and Stanford. There is a lot of research going on in military applications as well. Some similar systems that have been built are VOYAGER and JUPITER.

The performance issues

The speed of processing is a key issue in ARTHA. As it is a call centre application, the processing time needs to be within a few seconds. But the neural networks and other techniques used in ARTHA involve huge computational complexity. Hence, suitable methods should be designed to reduce the complexity of the computations.

7. Conclusions

By looking at the above examples, it is clear that such automatic response systems are not far from being implemented on a large scale in the near future. But building a generic system involves a lot of innovation and a huge amount of man-hours for the implementation. However, nothing is impossible if we apply thought and are innovative in our approach.

8. References:

SUSANNE Corpus and Analytic Scheme. http://www.cogs.susx.ac.uk/users/geoffs/RSue.html

Spoken Language Systems, MIT. http://www.sls.lcs.mit.edu/sls/whatwedo/architecture.html

A Logical Approach to Computational Corpus Linguistics. http://www.ling.gu.se/~lager/taglog.html

American Association for Artificial Intelligence. http://www.aaai.org/AITopics/index.html

CHRISTINE Corpus, Stage I: Documentation. http://www.cogs.susx.ac.uk/users/geoffs/ChrisDoc.html

Journal of Artificial Intelligence Research. http://www.cs.washington.edu/research/jair/home.html

Neural Networks FAQ. ftp://ftp.sas.com/pub/neural/FAQ.html

Neural Networks and Data Mining, Z-Solutions. http://www.zsolutions.com/sowhy.htm
