
Natural Language Processing, the Crown Jewel of AI

Abstract:
Often called the crown jewel of AI, Natural Language Processing is an important field of study in computer science, with far-reaching results critical to future human development. Natural language processing is a set of methods through which computers can extract and analyze meaning from human language in ways beneficial to humans. It enables engineers to complete tasks including translation, summarization, and text analysis, and allows machines to understand how humans speak. This article attempts to answer questions such as how basic natural language processing tasks such as processing and synthesizing speech work, why they are important for developers and users, and what their current limitations are.

The birth of computers and electronics has completely revamped modern humans' lives, and too often we take the deep and complex ideas behind these technologies for granted. Thus, this article attempts to educate the reader on Natural Language Processing (NLP), its fundamentals, applications, and limitations: a research area in computer science that genuinely advances machine intelligence and benefits our day-to-day lives through machine translation and automated assistants such as Siri and Cortana.
First off, what is natural language processing? NLP is an important sector of computer science and artificial intelligence. It focuses on developing communication between human language (or natural language) and a language that computers can understand. It is worth mentioning that NLP is an interdisciplinary subject, involving both the humanities and the sciences, notably linguistics, computer science, and mathematics. While NLP is closely related to linguistics, rather than studying the use of language between humans, NLP focuses on achieving communication between humans and computers. i
Next, what are the implications of this technology? It is undeniable that humans need to communicate with computers for many needs and wants. However, the medium through which humans communicate with one another is very different from the medium through which humans communicate with computers. Harvard professor of psychology Steven Pinker reveals the purpose of language in his article The Logic of Indirect Speech. He believes that "language has two functions: to convey information and to negotiate the type of relationship holding between speaker and hearer." ii Pinker's succinct summary demonstrates the exact challenges lying behind natural language processing: computers have to be able to extract the message the speaker is conveying, and to filter out or use the negotiation to enhance their own calculations and responses. Thus, NLP is critically important to developing fully intelligent machines capable of interpreting human emotions, among the most complex ideas to have ever existed. Moreover, NLP has a bright future with broad applications across the board. For instance, humans rely on word recognition, text verification, speech understanding, and translation on a regular, daily basis. These are often laborious tasks that, if done effectively by a computer, will greatly enhance human productivity. The following paragraphs go into more depth about the kinds of occasions where NLP comes in handy.
Next, we will establish the challenges and limitations that natural language processing currently faces. These difficulties include the hazy, ambiguous definitions of words and sentences, the fact that the same word may have several different meanings when placed alongside different words, and human proneness to error, that is, how to deal with mistakes in the input. These challenges arise mostly from the complexity of human languages, and they must be overcome in order to make the technology accessible and efficient for human use. Common solutions already put into action include a technique named "production," created by the study of computational linguistics. A production is a rule by which a recursively performed symbol substitution can be used to create new symbol sequences. A finite set of productions specifies a formal grammar (also called a generative grammar). This helps with both sentence ambiguity and input mistakes, because by aligning each word with its grammatical tags, the computer can better understand what the speaker is attempting to convey, as the sketch below illustrates.
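To make productions concrete, here is a minimal sketch using the NLTK library; the grammar and the sentence are toy examples invented for illustration, not drawn from any system discussed in this article.

```python
# Toy context-free grammar: each rule below is a "production" that
# rewrites a symbol into a sequence of symbols. The grammar and the
# sentence are invented for illustration.
import nltk

grammar = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> Det N
    VP -> V NP
    Det -> 'the' | 'a'
    N -> 'dog' | 'ball'
    V -> 'chased' | 'saw'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("the dog chased a ball".split()):
    tree.pretty_print()  # shows the grammatical tag assigned to each word
```

Each rule such as S -> NP VP is one production; the parser applies them recursively until every word carries a tag, which is exactly the substitution process described above.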
Furthermore, statistical techniques can also help computer scientists deal with these problems, using tools such as Laplace smoothing, Good-Turing smoothing, and interpolation smoothing. iii As described in a paper by Chen and Goodman, smoothing techniques are relatively effortless mathematical tools that can be used to deal with data sparsity, specifically when events in the test data have not appeared in the training data, which can be a huge problem due to human proneness to error. Chen and Goodman argue that:
Whenever data sparsity is an issue, smoothing can help performance, and data sparsity is
almost always an issue in statistical modeling. In the extreme case where there is so
much training data that all parameters can be accurately trained without smoothing, one
can almost always expand the model, such as by moving to a higher n-gram model, to
achieve improved performance. With more parameters data sparsity becomes an issue
again, but with proper smoothing the models are usually more accurate than the original
models. Thus, no matter how much data one has, smoothing can almost always help
performance, and for a relatively small effort. iv
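To make the idea concrete, here is a minimal sketch of add-one (Laplace) smoothing for bigram probabilities; the tiny corpus and the example words are invented for illustration.

```python
# Add-one (Laplace) smoothing for bigram probabilities on a toy corpus.
from collections import Counter

corpus = "the cat sat on the mat".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
vocab_size = len(unigrams)

def p_laplace(w1: str, w2: str) -> float:
    # Unseen bigrams get a small but nonzero probability.
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + vocab_size)

print(p_laplace("the", "cat"))  # seen bigram
print(p_laplace("the", "dog"))  # unseen bigram, still greater than zero
```

Without smoothing, the unseen bigram would receive probability zero and veto any sentence containing it; add-one smoothing shifts a little probability mass from seen events to unseen ones.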
With the challenges covered, we can now explore the applications of natural language processing, which include typing and word segmentation (in languages such as Chinese), word search, question answering, text summarization and filtering (useful for emails or for shortening contracts and papers), keyword tracking, sentiment analysis, and machine translation.
Consider a specific product: Siri. Siri stands for Speech Interpretation & Recognition Interface, and it is a product released by Apple that can respond to users' voices and carry out simple commands such as setting an alarm, creating a reminder, or telling the weather. While Siri is not a killer app that disruptively enhances its users' productivity, it does incorporate several technologies with far-reaching implications.
First and foremost is speech recognition. Simply put, speech recognition is the ability of software to recognize speech in any, or one specified, language. Speech is a complex phenomenon that is both dynamic and difficult to distinguish. Speech recognition technology can often be used to perform an action based on a command given by the user. However, speech recognition systems need to be trained by storing speech patterns and vocabulary of their language in the system. Theoretically, with enough training, a speech recognition system can learn to understand anything a speaker says. The first step of speech recognition is breaking the speech down into its bare structure, identifying phones, diphones, triphones, fillers, and utterances. The recognition process then becomes a mathematical one with heavy involvement of linear algebra: for example, extracting features and feature vectors, and then matching these features against an acoustic model, a phonetic dictionary, and a language model. Other concepts that have proven useful include lattices, N-best lists, word confusion networks, speech databases, and text databases. v A sketch of the feature-extraction step appears below.
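As an illustration of the feature-extraction step, here is a minimal sketch using the librosa audio library; the file name is a hypothetical placeholder, and a real recognizer would pass the resulting vectors to a decoder rather than just printing them.

```python
# Turn a waveform into a sequence of feature vectors (MFCCs), the
# representation a decoder matches against acoustic and language models.
import librosa

# "utterance.wav" is a hypothetical input file.
waveform, sample_rate = librosa.load("utterance.wav", sr=16000)

# 13 MFCC coefficients per short frame is a common starting point.
mfcc = librosa.feature.mfcc(y=waveform, sr=sample_rate, n_mfcc=13)
print(mfcc.shape)  # (13, number_of_frames): one feature vector per frame
```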
Next is speech synthesis. Speech synthesis is a tool especially valuable for Chinese-language users. Speech synthesis is a form of output in which a computer or some other machine (called a speech synthesizer) creates an artificial production of human speech and reads it aloud; this technology is also often called text-to-speech. As the same word may be read differently under different circumstances, a computer has to learn to synthesize speech and arrive at the most likely correct output through statistical probability techniques (Hidden Markov Models) or neural networks. vi A toy illustration of this disambiguation follows.
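Here is a toy sketch of that idea: picking the most likely pronunciation of the heteronym "read" from its context. The probabilities are invented for illustration; real systems use HMMs or neural networks rather than a lookup table.

```python
# Toy pronunciation disambiguation with made-up context probabilities.
CONTEXT_PROBS = {
    # P(pronunciation | previous word), invented numbers
    ("will", "read"): {"/ri:d/": 0.95, "/rEd/": 0.05},
    ("had", "read"): {"/ri:d/": 0.10, "/rEd/": 0.90},
}

def pronounce(prev_word: str, word: str) -> str:
    probs = CONTEXT_PROBS.get((prev_word, word), {"/ri:d/": 0.5, "/rEd/": 0.5})
    return max(probs, key=probs.get)  # pick the most likely pronunciation

print(pronounce("will", "read"))  # /ri:d/
print(pronounce("had", "read"))   # /rEd/
```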
Siri's searching capabilities originate from Wolfram Alpha's engine, developed by Wolfram Alpha LLC, a subsidiary of Wolfram Research. Wolfram Alpha is not a regular search engine such as Google or Baidu; instead, it is a computational knowledge engine capable of computing answers to factual questions from externally curated data, for example the CIA World Factbook. vii When a classical search engine such as Google is used, the results the user receives are web pages found by crawling, indexing, and ranking the available pages. By contrast, if you ask Wolfram Alpha what an augmented D major chord is, it will reply by playing the chord; and if you ask it where the International Space Station (ISS) is, or when the next solar eclipse will occur, it can compute the location or the time, respectively. viii A minimal query sketch appears below.
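As a sketch of what such a query looks like programmatically, here is a minimal example against Wolfram Alpha's Short Answers API; the AppID is a hypothetical placeholder (a real one comes from the Wolfram Alpha developer portal).

```python
# Minimal sketch of querying Wolfram Alpha's Short Answers API.
import requests

APP_ID = "YOUR-APPID"  # hypothetical placeholder

def ask(question: str) -> str:
    response = requests.get(
        "https://api.wolframalpha.com/v1/result",
        params={"appid": APP_ID, "i": question},
    )
    response.raise_for_status()
    return response.text  # a short plain-text answer

print(ask("where is the International Space Station"))
```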
Wolfram Alpha embodies two powerful ideas that are critical to Siri's success and that provide a useful model for future NLP software. The first is its knowledge base: Wolfram Alpha curates and updates data gathered from various sources, converts it into information useful to its computations, and maintains a repository of knowledge, data, and algorithms in every field. ix The second is answer recommendation. While technically not a technology in itself, recommendation matches the user's question type with the answer that best fits the question asked. One key difference between Wolfram Alpha and other natural language processing systems, however, is that Wolfram Alpha processes queries as language fragments rather than grammatical sentences; a toy sketch of fragment-based matching follows. In short, Wolfram Alpha offers a good model for searching, computing, and responding, and it becomes even more powerful when combined with a speech recognizer and a speech synthesizer, as demonstrated by Siri.
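The following toy sketch, with invented handlers, shows the flavor of matching a query fragment to an answer type by keyword; real systems are far more sophisticated than this.

```python
# Toy fragment-to-handler matching with made-up handlers.
HANDLERS = {
    "weather": lambda q: "Looking up the forecast...",
    "alarm": lambda q: "Setting an alarm...",
    "where": lambda q: "Computing a location...",
}

def dispatch(query: str) -> str:
    # No full sentence parse: just scan the fragment for known keywords.
    for keyword, handler in HANDLERS.items():
        if keyword in query.lower():
            return handler(query)
    return "Sorry, I don't understand."

print(dispatch("where is the ISS"))  # Computing a location...
```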
Natural language capabilities also mean that computer science and artificial intelligence will become increasingly powerful in more and more fields, reaching into areas such as law, languages, literature, history, and philosophy. One of the most powerful tools is summarization. For citizens of the 21st century, the amount of information presented to an average human being is clearly too much to absorb in full. Therefore, the demand for quality over quantity, for brevity and conciseness, is becoming increasingly important. Few users of digital devices ever read user license agreements, and from an ethical standpoint, the fact that license agreements are unnecessarily long makes them unreadable and harmful to users' rights. With the aid of text summarization, however, natural language processing systems can extract the points most important to an individual and summarize the terms of agreement in a few bullet points, thereby protecting the interests of both the consumer and the producer. Another application is using text summarization to extract key information from publications and papers, giving readers a useful and efficient way to absorb them. When one only wants to know the conclusion of an experiment, wading through the methodology and the analysis of results is not only a distraction but also an unnecessary waste of effort. In a nutshell, Natural Language Processing has very promising applications in almost every academic field and is a worthwhile technology that may become a great boon to humanity; a toy sketch of extractive summarization follows.
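Here is a toy frequency-based extractive summarizer, a deliberately simplified sketch of one classical approach and not the method of any product mentioned above.

```python
# Toy extractive summarizer: score sentences by average word frequency
# and keep the top-scoring ones, in their original order.
import re
from collections import Counter

def summarize(text: str, n_sentences: int = 2) -> str:
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> float:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return " ".join(s for s in sentences if s in top)

text = ("NLP is a field of computer science. It studies human language. "
        "Summarization is one application of NLP. The weather is nice.")
print(summarize(text, n_sentences=2))
```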
It is beyond question that the computer is one of the greatest inventions of the century. Artificial Intelligence, one of the most popular recent technologies, is developing at a rapid pace, attempting to rid mankind of many laborious and challenging duties. As Xuedong Huang put it, "Speech and language is the crown jewel of AI." x Natural Language Processing is an exciting field with a great number of applications, implications, and future potential. Because of the many challenges it faces, NLP is currently rather limited compared with the fashionable fields of Machine Learning and Deep Learning. This also means that there is all the more opportunity in Natural Language Processing for future engineers.

References
i. "Natural Language Processing," Wikipedia; Zhifang Sui, 计算语言学导论 (Introduction to Computational Linguistics), in 信息科学技术导论 (Introduction to Information Science and Technology).
ii. Steven Pinker, "The Logic of Indirect Speech," Proceedings of the National Academy of Sciences of the United States of America, 2007.
iii. Stanley F. Chen and Joshua Goodman, "An Empirical Study of Smoothing Techniques for Language Modeling," Harvard Computer Science Group Technical Report TR-10-98.
iv. Ibid.
v. CMUSphinx, "Basic Concepts of Speech Recognition," Tutorial Concepts, 2019.
vi. Chris Woodford, "Speech Synthesizers," Explain That Stuff, 2018.
vii. Chris Pollette, "How Wolfram Alpha Works," HowStuffWorks, 2009.
viii. Erick Schonfeld, "After Being Upstaged by Google, Wolfram Alpha Fires Back with a Leaked Screenshot," TechCrunch, 2009.
ix. "Wolfram Knowledgebase," Wolfram Alpha.
x. Microsoft blog editor, Microsoft Research Podcast, 2019.
