The Answer To Anything: Tom Arnoldussen Diederik Beker Celine Van Der Geer

42
The Answer to Anything

Tom Arnoldussen
University of Amsterdam
tom.arnoldussen@gmail.com

Diederik Beker
University of Amsterdam
diederik.beker@live.nl

Celine van der Geer
University of Amsterdam
celine.nvdg@gmail.com

Didi de Hooge
University of Amsterdam
dididehooge@gmail.com

Thomas Hoornstra
University of Amsterdam
thomas@hoornstra.org

ABSTRACT
This report proposes a mobile Q&A application based on user features. The app is designed to create a platform where people can ask questions, which are then forwarded to people who possess knowledge related to the query.
The application is a smart mobile system that makes use of natural language processing. It creates tags from every query, which are then matched to people based on location and on the skills and interests provided in their profile.
This report presents the related work that is relevant for the application. It describes the detailed approach, methods used, interaction design, and system description. Lastly, salient elements of the discussion are examined and suggestions for further research are provided.

Keywords
Q&A, application, mobile system, natural language processing, intelligent interactive systems

1. INTRODUCTION
When people are looking for a specific piece of information, they usually consult a search engine. However, a typical search engine is not designed to come up with answers to specific questions, as it does not have the capability to combine several pieces of information located in different parts of the knowledge base (Zadeh, 2003). In addition, the results from search engines are often biased, advertised, and not personal or specific enough. Looking for the right answer to a specific question can take quite some time. People often turn to their friends, families, and colleagues when they have questions, which results in the use of social network status messages to ask questions (Morris, Teevan, & Panovich, 2010). However, there are certain questions people don't want to share with their social network, or the right knowledge might not be available within their own network. As an alternative, questions can be asked on online forums, where people have conversations in the form of posted messages. Unfortunately, forums are mostly used within specific domains, so people can only ask domain-specific questions, as the members only have knowledge about that domain. This can lead to a time-consuming hunt for the right forum, and it is uncertain how long it will take for someone to answer the question and whether the question will be answered at all.
The application that this paper proposes is a mobile question and answer application based on user features. The application can be used for asking and answering questions within every domain by means of speech and/or text queries.
The system creates a profile by extracting likes from social media accounts. When a user asks a question, the system generates tags from the question. Subsequently, the question is pushed to a specific number of people whose skills and interests (likes) best match the query.

2. RELATED WORK
2.1 Q&A online tools/applications
Question and answer (Q&A) sites such as Yahoo! Answers, Quora.com, and Answerbag.com are places where communities of volunteers respond to questions. Every day a large number of questions are asked and answered on these websites. The questions vary widely: some users search for technical information, while others search for a good present for their one-year anniversary. Q&A sites allow people to ask and answer questions on a broad range of topics. Harper et al. (2008) identify three types of Q&A sites: digital reference services, ask-an-expert services, and community Q&A. All have the same goal of helping users ask and answer questions, but they achieve it in different ways. Digital reference services deliver information online, provided by library professionals, to users who cannot or do not want to have face-to-face communication. They are about helping people find useful information or providing it to them, for example by email. An ask-an-expert service is an online community staffed with experts in a particular topic area. The interactions are topic-oriented discussions. For
example, some systems divide questions into categories beforehand, which determines which expert will respond; others allow experts to declare which questions they will answer. Community Q&A, in contrast, has little structure or role-based organization. Some sites have moderators or users with earned privileges based on earlier contributions. On these sites, users often have their own profiles and usernames and hold off-topic discussions in which they can reply to one another.

Types of questions
There are many ways of asking a question. Harper et al. (2009) describe two types: informational questions and conversational questions. With the first, the user wants to receive factual or advice-oriented information. The second is more about the intention of starting a discussion, where users are interested in an opinion or have the need to express themselves. In 2010, Harper et al. took the classification of question types a step further by looking at the three species of Aristotle's rhetorical framework: deliberative (future-focused), epideictic (present-focused), and forensic (past-focused). Their taxonomy divides every species of this framework into two subdivisions, and they found that the questioner seeks guidance, quality, or facts when asking a question that fits in the framework.

Influence of anonymity
Paskuda and Lewkowicz (2016) studied how anonymity influences user participation on an online question-and-answer platform, in this case Quora.com. They developed a model with factors that influence participation and surveyed users about their use of the anonymity feature in Quora. The main finding was that, with anonymous answers, social appreciation correlated with the answer's length. A second result of the study was a link between anonymity and politeness: anonymous comments were often interpreted as impolite by the receivers. Questioners have other reasons to go anonymous; one user said: 'I go anonymous when I'm revealing something that my family members wouldn't want other people to know about.' Paskuda and Lewkowicz (2016) also used the model to get a better perspective on the quality of anonymous answers given by users. The receivers found them impolite, and half of the surveyed users did not appreciate the anonymous answers. On the other hand, users who wanted to stay anonymous when responding to a question wouldn't have answered if the anonymity feature had not existed. Overall, however, answers given without the anonymity feature scored higher in the survey.

2.2 Matching search queries with results
Search engines such as Google, Yahoo!, and Bing often provide direct, valuable knowledge to users and organize information in such a way that it is easy to navigate. The focus of these commercial search engines lies on one-way, single-user scenarios. The information retrieval process begins when the user enters a query. A query is a formal statement of an information need and appears in various types; the algorithm needs to identify several objects that match the query. The search engine result page (SERP) presents results in two ways: organic results, which are found by the algorithm, and sponsored results, which are paid to appear at the top of the ranking (Croft, Metzler and Strohman, 2009). Users want to see the results that are most relevant to their query at the highest rank of the returned document list. The list returned by the search engine consists of different kinds of results, such as documents, pictures, texts, articles, and more. For most search engines the result page contains a ranked list of document summaries, with links to the actual documents or web pages (Kurland & Krikon, 2011). The summaries are produced by an algorithm that uses a significance factor to select the top sentences for a summary. This technique started with H. P. Luhn in the 1950s (Luhn, 1958) and has been developed further since. It calculates the significance factor of a sentence based on the occurrence of significant words. Significant words are words of medium frequency in a document, meaning that their frequency lies between a low and a high threshold. Sentences containing these words are considered for the summary. However, a limit applies within each sentence: typically no more than four non-significant words may separate two significant words.
The results of a search engine can vary widely, especially when the query is formulated ambiguously. The results can then reflect different interpretations of the given query. This can be frustrating for users if it takes a long time to find the preferred topic or solution in the lists. A clustering technique is needed to resolve this problem. By clustering the results that are similar to one another, the user can quickly scan through the results and find topics relevant to their query. Clustering needs to be efficient and specific to each query, based on the top-ranked documents for that query. The algorithm should be easy to understand and should cluster using labels, for example with a monothetic classifier (Croft, Metzler and Strohman, 2009).

2.3 Speech search queries
Since the rise of smartphones and cloud-based computing, user expectations and needs have changed. People access and use information in a different way and expect constant access to the information and services the web has to offer. Given the nature of delivery devices and the increased range of usage scenarios, speech technology has taken on new importance in accommodating user needs for ubiquitous mobile access - any time, any place, any usage scenario, as part of any type of activity (Neustein, 2010). Over the past thirty years speech recognition has improved and become more accurate, which has resulted in more effective human-machine interaction (Deng & Huang, 2004).
Speech recognition systems can be used for different applications. An example of such an application is spoken queries within a search engine. Voice search is the technology underlying many spoken dialog systems (SDSs) that provide users with the information they request with a spoken query (Franz & Milch, 2002). Many companies have developed speech-to-text conversion systems in order to transcribe human speech (Bijl & Hyde-Thomson, 2001). Over the last few years the number of spoken web search queries has increased. At the Linguistic Society of America's meeting in 2016, Timothy Tuttle, the founder and CEO of MindMeld, stated that within one year the use of spoken queries went from a statistical zero to 10 percent of all search volume. Google also stated that 20 percent of all searches were voice searches (Young, 2016).

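The significance-factor summarization described in Section 2.2 can be illustrated with a minimal Python sketch. It follows the formulation commonly attributed to Luhn: within a sentence, a span bracketed by significant words (with at most `max_gap` non-significant words between neighbours) is scored as the squared number of significant words divided by the span length. The function names, the tokenizer, and the default thresholds `low` and `high` are assumptions made for illustration and are not taken from this report.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def find_significant_words(sentences, low, high):
    """Return words whose document frequency lies between low and high."""
    counts = Counter(word for s in sentences for word in tokenize(s))
    return {word for word, n in counts.items() if low <= n <= high}


def significance_factor(words, significant, max_gap=4):
    """Score one sentence: best (significant count)^2 / span length."""
    positions = [i for i, w in enumerate(words) if w in significant]
    if not positions:
        return 0.0
    best = 0.0
    start = prev = positions[0]
    count = 1
    for pos in positions[1:]:
        if pos - prev - 1 <= max_gap:
            count += 1  # still inside the same bracketed span
        else:
            best = max(best, count * count / (prev - start + 1))
            start, count = pos, 1  # gap too large: start a new span
        prev = pos
    return max(best, count * count / (prev - start + 1))


def summarize(text, low=2, high=5, top_n=1):
    """Return the top_n sentences ranked by significance factor."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    significant = find_significant_words(sentences, low, high)
    scored = [(significance_factor(tokenize(s), significant), s) for s in sentences]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:top_n]]
```

In practice the two frequency thresholds would be tuned to the document length; the fixed defaults here only keep the sketch short.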
2.4 Natural language processing
Natural language processing (NLP) is an area of research and application that explores how to analyze and understand human language in text or speech. Computational techniques analyze and represent naturally occurring texts at one or more levels with the purpose of performing desired tasks. There are multiple computational techniques from which to choose to accomplish a particular part of language analysis. The naturally occurring text can be of any language, mode, genre, etc. The text is either written or spoken; in the latter case it needs to be in a language used by humans to communicate with one another. The text needs to be as natural as possible, and not specifically designed for the purpose of this analysis. The notion of levels in the analysis refers to the various types of linguistic analysis; it is known that multiple types of language processing are at work when humans produce language. People usually utilize all of these levels by themselves. NLP systems, however, utilize different levels or a subset of these levels. Therefore, some NLP systems can be considered weak, while others covering more levels are considered strong (Liddy, 2001). Basic techniques in the linguistic processing of NLP applications, such as tokenization, Part of Speech (POS) tagging, Named Entity Recognition (NER), and classification, can undertake these complex tasks, which need to be performed as accurately and efficiently as possible. For example, POS tagging is the process of assigning part-of-speech tags to words in a text. Typically the tag set includes tags such as nouns, verbs, and adjectives. NER, in turn, finds these nouns and determines to what type of entity they refer. This function is useful for location tagging. Brusting et al. (2016) proposed a high-precision GeoTextTagger using an NLP approach. Working with text and speech taggers as well as NAM, they found blocks of text that referred to locations. By use of a knowledge base, a list of possible locations is compiled. Finally, a location is chosen by assigning distance-based scores to every block of text.

2.5 Gamification
Gamification refers to implementing game mechanisms and techniques in a non-game environment to engage and motivate users to participate in (online) voluntary activities. Often it involves a reward system, which encourages users as they progress through levels to reach a higher score or level. However, there is limited research arguing that gamification works in the long term (Cavusoglu, Li, and Huang, 2015). Hamari et al. (2014) examine what the psychological and behavioral outcomes of gamification are and what kinds of motivational affordances are needed. The main results suggest that gamification does work; however, the phenomenon itself is more manifold than most studies assumed. Influences on the outcomes include the role of the context being gamified and the qualities of the users. Most of the studies evaluated behavioral and psychological outcomes by use of questionnaires or interviews, which often yielded positive results. In the context of motivational affordances, most studies tested a large variety of different elements used in reward systems. Elements like points, leaderboards, and badges were the most commonly used variants in reward systems. For example, the Stack Overflow Q&A site relies on a hierarchical badge system to engage users in making meaningful contributions. Cavusoglu et al. (2015) confirm the value of badges and the effectiveness of gamification in stimulating voluntary participation.

3. INTERACTION DESIGN
This section describes the human interaction with the system. This includes the intended users, the concept and application area, and the scenario and use cases.

3.1 Human-system interaction
A user is expected to interact with the system in a few ways. The user is able to use text or speech for a query. When using text, the user has to type in the question. For speech, the user can use the microphone to record the question.
When registered to the app, the user will receive notifications. There are different situations in which the user will receive notifications, for example when a new question is asked that matches the user's profile or when an asked question has been answered by another user.
The user is able to create a profile within the app. When registering, information will be extracted from social media accounts. This information can be manually edited by the user.

3.2 Intended users
The application 42 can be used by everyone with a smartphone. The app does not address a specific gender or age group. However, application usage statistics show that the age group that spends the most time on apps is people between the ages of 18 and 24¹. The intended users are people who are curious and want questions to be answered by others and/or like to answer questions that match their own interests and skills.
¹ http://www.businessofapps.com/app-usage-statistics-2015/

3.3 Concept and application area
42 is an application that offers users a platform for asking questions by means of text or speech queries. The system makes use of natural language processing in order to create tags related to the question asked. Using these tags, the system searches for a specific number of users with skills and interests related to this topic or within the area of the generated tags. The users' interests and skills are extracted from their LinkedIn profile, as they have to log in with their account when they register for the app. The users who receive the question are able to answer or decline the question. When the question is answered, the user who asked the question is able to select the best answer. If a user declines to answer a question, the system searches for the next most suitable user within the area of interest. A user is able to decline a question if they don't know the answer or when they don't feel like answering. When they don't know the answer, they can choose the option to receive the answer that is selected as the best answer. The system tracks all the actions of the user within the application; these are saved and used to automatically update the skills and interests in their profile. To conclude, the application provides a platform where people can anonymously ask anything, anywhere, and at any moment.

3.4 Scenario & use cases
3.4.1 Scenario 1: Asking a question
John is in Amsterdam with his girlfriend for two days. On Friday night they want to go out for dinner. John's girlfriend is lactose intolerant, which means they cannot just go to any restaurant. John opens the app store to install the app 42. He registers with his LinkedIn profile so his skills can be extracted, then manually adds some of his interests and topics he has knowledge about, and saves his settings. After that he opens the home page and writes the following text query: 'Does anyone know a romantic restaurant that knows how to deal with lactose intolerance in the centre of Amsterdam?' Within five minutes John receives a notification that there are three responses to his question. All three responses recommend different restaurants. John selects the recommendation of a restaurant where everything is lactose-free as the best answer.
3.4.2 Scenario 2: Asking a question that has already been answered
Peter invited all his friends for dinner on Saturday evening. On Friday he already starts preparing the dinner. He wants to make a special Thai dish but does not know exactly which ingredients he needs, so he uses the internet to find a good recipe. When he has collected all the needed ingredients, he notices that there is one he has never heard of, and he doesn't know where to buy it in Rotterdam. He already has the app 42 installed, so he grabs his phone and opens the app. He uses the speech function and asks where he can buy that specific ingredient in Rotterdam. The app creates tags from his question and recognizes the combination of tags from already asked questions. The app then shows the related questions to Peter, and he finds the same question as the one he asked, immediately gaining the answer.
3.4.3 Scenario 3: Answering a question
It is Wednesday night and Nina is watching TV on her couch. She gets a notification from the app 42: there is a question that matches her skills and interests. Nina opens the app and looks at the new question. She knows the answer, so she uses text to answer the question. She looks at her TV again, feeling satisfied to have helped someone.
3.4.4 Scenario 4: Not being able to answer a question
Claire is on her way to work when she gets a notification about a new question in the 42 app that matches her skills and interests. When Claire opens the question she realises that she is not able to answer it but would be interested in the answer. She selects the option to skip the question and to receive the answer. A few minutes later she receives a notification that the question has been answered, an interesting answer!

4. SYSTEM DESCRIPTION
This section describes the system and its components. This includes the architecture, design, data structures, APIs, and algorithms that are used. As one might quickly notice, this application requires a very thorough analysis of the questions and answers provided by the users. For the prototype shown, this analysis is still in its basic form; however, the application will be described with the components that would be implemented in a final system.
In order to create the prototype for the application described, the Ionic framework was used². This was done in order to create a working mobile application with integration of the Facebook API, as well as a self-created API to handle database calls and to analyze the questions in order to create tags.
² http://www.ionicframework.com/
To be able to understand the choices made within this section, a short description of the process of using the application will first be given. The application can be split into two parts: the first part is where a user is able to ask questions, and the second part is where a user is able to answer questions. For both of these parts, the application relies heavily on the profile of the user.
At the start of the application, the user has the option to log in with a social media account or with their email address. After logging in with a social media account, the application will process all their likes and history to make up a profile of what they are interested in. If the user logs in without a social media account, they are prompted with an option to select categories of interest. Based on their interests, the application will make up a personal profile.
Figure 1. Login using social media to automatically generate your interests.
After setting up the personal profile, the user is able to ask questions. When the user asks a question, the system analyzes the question, using natural language processing, and creates relevant tags that belong to the question. The user is able to remove generated tags if they are irrelevant to the question. Once the user saves the question, the created tags are run through a comparison algorithm to see if similar questions have already been asked. If one or more questions that appear similar have already been asked, the user is presented with these and asked if this is what they want to know. This question comparison is an important step and requires proper analysis of the tags; therefore this algorithm will be explained later in this section. When the question is new and the user actually asks the question, the system determines which users would be able to best answer the question and only
pushes the question to the people that are relevant (that is, the people that may have knowledge about that question, based on their profile). The way this matching is done is explained later in this section.
Figure 2. Asking a question, tags are automatically generated.
In order to make the system fair, at sign-up, people are allowed to ask only two questions. After that, they have to perform tasks like answering questions to gain points. Points are allocated per category; thus, if a user answers a question regarding gardening, the user gains points in that category only. The combined points of all categories, however, determine the user's level in the application, and thus the number of questions they are allowed to ask.
Figure 3. Users gain points per category, points can be earned as seen in the figure.
The other part of the application is where users answer a question. In between these two parts is where the magic of the application happens, and this will be explained later in this section. When a user is selected to answer a question, they get a notification on their smartphone. When they open the application, they can view the questions that they have been selected to answer. If they know the answer to a question, they can fill it in, and the answer is pushed to the user who asked the question. It can of course happen that the user does not know the answer, in which case they can indicate so. When doing this, the user can select the option to have the answer pushed to them as well, giving the user the opportunity to learn from others in turn. During the use of the application, user profiles are updated constantly; this will be explained later in this section.
Figure 4. Users can answer questions that they are given.
When a question gets answered, the user who asked the question gets notified, as do any users who have selected to be notified, as stated in the previous paragraph. The user can wait for more responses or select the best answer from the answers given.
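The fairness rule described above (two questions at sign-up, points earned per category, combined points determining the level and hence the question allowance) could be sketched as follows. The starting allowance of two questions comes from the text; `POINTS_PER_LEVEL`, `QUESTIONS_PER_LEVEL`, and the class and method names are assumptions chosen for illustration, since the report does not specify how points map to levels.

```python
from collections import Counter
from dataclasses import dataclass, field

INITIAL_QUESTIONS = 2    # from the text: two questions allowed at sign-up
POINTS_PER_LEVEL = 10    # assumption: combined points needed per level
QUESTIONS_PER_LEVEL = 1  # assumption: extra questions unlocked per level


@dataclass
class UserProfile:
    points: Counter = field(default_factory=Counter)  # points per category
    questions_asked: int = 0

    def award(self, category, amount=1):
        """Add points in a single category, e.g. after answering a question."""
        self.points[category] += amount

    @property
    def level(self):
        """The level depends on the combined points of all categories."""
        return sum(self.points.values()) // POINTS_PER_LEVEL

    @property
    def question_allowance(self):
        return INITIAL_QUESTIONS + self.level * QUESTIONS_PER_LEVEL

    def can_ask(self):
        return self.questions_asked < self.question_allowance
```

Keeping the points in a per-category counter means the same structure can later serve as the proficiency score when selecting users to answer a question.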
Figure 5. A user can accept an answer as the best answer. On
the overview page, a user can view their activity.

4.1 Analyzing the question
As the previous section shows, there is a lot that needs to be considered when creating such an application. The most difficult part is to understand the questions the users are asking; this is also the most important part. Over time and with better technology, the understanding of the questions being asked can improve, which in turn improves the application. A key requirement is actually understanding the content of a question. This understanding is used in multiple components of the application, as can be seen in the algorithms shown in the next sections.
Another way the application uses the analysis of the question is by asking for missing information when a user asks a question. The question 'How can I fix my television?' lacks a lot of information, such as what type of television it is, what is currently not working, etc. By understanding the content of the question, the question can be improved by the user before asking it.
The content of the question is also important for detecting questions that warrant a warning, such as 'I cough up blood and have a headache, what should I do?'. The user should be warned that the answer they receive is in no way a reassurance or sound medical advice, and that they should visit a doctor.

4.2 Already asked questions
Within this process, it is important to analyze the question properly. There are multiple things to consider, first of all location. As questions may be location specific, it is important to distinguish questions that look similar but relate to a specific location. The question 'Where can I find a romantic restaurant in Amsterdam?' looks pretty similar to 'Where can I find a romantic restaurant in New York?'; however, the answer to the one question is obviously not going to be the same as for the other.
Another important aspect is temporal, since a question asked previously may no longer be relevant. Take, for example, the question 'How can I watch a movie on the internet?'. Five years ago this question would have generated different answers than today, so this needs to be analyzed as well. There are multiple options for handling this. One could be to simply state that an answer is valid for a specific amount of time. A second option could be to implement a machine learning algorithm that learns from the system by having people who answer the question, or people who ask similar questions, mark the answers as outdated. We chose, however, to analyze the content of the question to see if it concerns a quickly changing field, and dependent on that decide the timeout of an answer. As we can determine that 'the internet' is a relatively quickly changing field, an answer given five years ago is almost certainly outdated today.

4.3 Linking questions to users
In order to send the right questions to matching users who should be able to answer them, the question needs to be analyzed and compared to the user interests. As mentioned, first the tags of the question are determined. Within this determination it is important to consider the actual meaning of the question. Questions like 'How do I patrol in the snow at night?' and 'How do I listen to Snow Patrol at night?' should yield different tags and select different users to ask the question to.
Another important aspect of analyzing the content of the question within this process is to determine the number of people that receive the question. It is important to differentiate between questions that should reach just a few people who would know the answer and questions that should reach anyone in the vicinity. As an example, consider the question 'Who has locked their bike to mine?'. This question is not answerable by a person who knows a lot about bikes or locks, but rather by the single person who has locked their bike to another. This question should therefore be pushed to anyone who could have done this, such as the entire neighbourhood.
One more important aspect to consider is the response time required by the question. A question could warrant an immediate response, and the users who receive the question should therefore be able to answer swiftly. A way to do this could be to analyze the response times of previous actions of users. This could determine moments when they respond quickly and moments when they respond slowly. Another way is to analyze the calendars of the users in order to determine if they would be able to answer immediately. The question could also be pushed to more users, as this would increase the chance of a faster response.
All of the above is taken into consideration while selecting users that will receive the question, but the most important part is to make sure the user would actually be able to answer the question. To make that likelihood as high as possible, the tags generated for the question are compared to the categories in a user's profile. Since a user gains points per category, their proficiency in a category is known. This can be used to select the best matching users for the question. An example could be 'Where can I find the best italian food near me?', generating the tags 'italian food' and 'near Amsterdam Noord'. All users that have the categories 'italian food' and 'Amsterdam Noord' are retrieved, and their scores on both categories are added together. One user could have a score of five in 'italian food' and ten in 'Amsterdam Noord', and another could have a score of six in 'italian food' and sixteen in 'Amsterdam Noord'; their respective scores would be 15 and 22. The users are then ranked according to these scores and filtered according to the different aspects mentioned in the paragraphs above. The top X users, as determined, are given the question to answer and receive a notification.

4.4 Updating user profiles
During the use of the application, the user profiles are constantly updated when a user performs an action. This is especially important when answering questions. As the system selects users based on their knowledge, it is important to update this knowledge along the way. If a user is unable to answer a question, it is important to refrain from asking that user a similar question again. However, this is not easy, as similar questions might still be answerable by the user. The combination of certain tags needs to be analyzed in order to determine which part of the question is an unknown area for the user, and thus in which area their level needs to be adjusted. To do this it is, again, important to understand the content of the question.
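The check for already asked questions in Section 4.2 can be sketched as a tag comparison that treats location as a hard constraint, so that the Amsterdam and New York restaurant questions are never considered the same. The report does not specify its comparison algorithm; Jaccard overlap of tag sets, the `threshold` of 0.5, and all names below are assumptions used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Question:
    text: str
    tags: frozenset
    location: Optional[str] = None  # None for questions without a location


def jaccard(a, b):
    """Overlap of two tag sets: |intersection| / |union|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_similar(new, asked, threshold=0.5):
    """Return earlier questions with the same location and enough tag overlap."""
    matches = []
    for old in asked:
        if old.location != new.location:
            continue  # similar wording, different place: not the same question
        score = jaccard(new.tags, old.tags)
        if score >= threshold:
            matches.append((score, old))
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [question for _, question in matches]
```

The temporal aspect could be handled in the same loop by skipping earlier questions whose answers have passed a field-specific timeout.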
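The scoring example in Section 4.3 can be reproduced with a small Python sketch: users that have all categories matching the question tags are retrieved, their scores on those categories are summed, and the top X are selected. The function name, the user identifiers, and the data layout are hypothetical, and the additional filtering on audience size and response time mentioned in the text is left out.

```python
def rank_users(question_tags, user_scores, top_x=3):
    """Rank users by their summed points on the question's categories."""
    totals = {}
    for user, scores in user_scores.items():
        # Only users that have every matching category are retrieved.
        if all(tag in scores for tag in question_tags):
            totals[user] = sum(scores[tag] for tag in question_tags)
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_x]


# The two users from the example in the text, plus one who lacks a category.
users = {
    "user_a": {"italian food": 5, "Amsterdam Noord": 10},
    "user_b": {"italian food": 6, "Amsterdam Noord": 16},
    "user_c": {"italian food": 9},
}
selected = rank_users(["italian food", "Amsterdam Noord"], users)
```

Here `selected` holds `user_b` with a combined score of 22 followed by `user_a` with 15, matching the figures in the text; `user_c` is not retrieved because the 'Amsterdam Noord' category is missing from that profile.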
4.5 Categories & Facebook API also needs to deeply understand the asked question. To prototype
The ultimate goal for the application is to combine data of various this, Google's Cloud Natural Language API4 is used. Our system
social media platforms to make the best possible profile of the feeds the asked question to Google's API, after which important
users. For the prototype, Facebook was used to build a profile. keywords, categories and places are returned. So for those
The profile consists of categories that are generated by scraping keywords, not only the specific keyword, but also the
all the likes of the user. These likes are only for the pages and superordinate category is returned. So when a user asks a question
items that the user liked. In addition to likes, the users name, about pasta, the keyword italian food is also returned. This
gender and profile picture are scraped. In an ideal application helps our system to assign relevant tags to the asked question,
additional features such as: comments, likes on messages, work gaining better knowledge about the context of the question. The
history, videos, user tagged places, religion, relationships, photos, generated tags are then compared to the categories of users like
posts, location, hometown, events, education and their birthday; explained earlier, linking questions to relevant users.
could be used to build a more complete profile. The scraping of
information has been done through the use of the Facebook Graph
API3. The Facebook API enables users to login and connect 5. DISCUSSION
Facebook to the prototype. The prototype automatically fetches While designing the system and testing the prototype a few points
their likes and these will be added to their profile as categories. of discussion came up. An obvious point to consider is the privacy
The categories are searchable for all the users, and by doing so; concern that comes into play. As the system scrapes and uses
the database generates categories (a collection) that could be used information from social media accounts, people could be
for all the users. In this prototype, the likes (and multiple terms) concerned for their privacy. The data, however, is always
will be split in separate categories. The same will be done for changeable by the user, as they can edit their categories. The
questions of users. The categories that were generated by asking a categories are used in the creation of a widespread category
question are also stored in the same searchable table as the profile database, however, this is not linked to a single person.
categories. Talking with users testing the system about this privacy concern,
the most mentioned aspect was anonymity. Two main topics arose
from this discussion, the first being that a user should be able to
4.6 Datamodel ask questions anonymously. For some users it was of importance
In order to save the data generated by users a database is used. to ask certain questions without others knowing who they were,
The communication to the database is done via an API that can this could be beneficial for questions regarding taboo topics in a
handle calls to save and retrieve the data. The database model is certain culture, topics that include private information, etc. The
shown below. In the application itself the data is handled as json second discussion point was centered around the idea of not
object. This makes it easily readable, editable where necessary, trusting anonymous answers. When asking users if they would
and it is easy to transfer between views. want the system to be fully anonymous the answer was a decisive
no. They expressed that they would not fully trust answers from
an anonymous person. For this reason, when improving this
prototype, users should get the option to ask and answer question
anonymously, however, users who answer anonymously should
be warned about the consequence that it lowers the trust in the
answer. Another important thing to note here is that the
anonymity can only be created towards other users. Anonymity
towards the system will only cause the system to perform badly,
as it relies heavily on understanding the question and the user
(categories) to perform properly.
One more thing that came up during testing was users trolling the
system. While trying to have people ask and answer questions,
they would put in fake answers, sometimes because it was for the
sake of testing, but sometimes just to be funny. This is a real
possibility within an actual working application as well.
Therefore, it is important to, again, analyze the content of the
answer to see if it is an answer to troll the system or whether it is
an actual real answer.
Figure 6. A scheme of the database. During the creation of the application, in a conversation with
people important to the project, the reward system first designed
for the application was discussed. This reward system was
4.7 Question analysis comprised of users gaining levels for their actions within the
Natural language processing plays an important role in the application. Just as explained in this paper, users would gain
question analysis of our system. For the system to determine levels by performing certain actions, which would allow them to
which users should be able to answer a certain question, the ask a certain amount of questions in the application. The levels
system not only needs to know everything about every user, but gained here, however, were not based on categories, not allowing
3 4
https://developers.facebook.com/docs/graph-api/ https://cloud.google.com/natural-language/
for the application to properly match questions to users. In the
discussion, the point was brought up, and the reward system was
changed so that people gain levels per category, allowing for a more
logical view of a user's skills, as well as a better matching
between questions and users.

Something to consider when deploying this application in a
real-life environment is the cost of running such a system, as the
system uses and generates a lot of data. If the system is used on a
large scale, the amount of data generated is vast. This does bring
added benefits, as this data can be used for all kinds of interesting
data analysis or research, as described in Future Work. This could
also be the side of the application that generates the revenue to
keep it up and running.

6. CONCLUSION
Typical search engines are not designed to come up with answers
to specific questions. They are not able to combine several pieces
of information located in different parts of the knowledge base.
As an alternative, forums can be used, but those are usually very
domain specific. In recent years question asking on social media
has increased, but not all questions are suitable for such
platforms, nor is all knowledge available within one's network.

This report described an application that provides a Q&A platform
for question asking and answering. Within the application,
questions concerning every domain can be asked. The questions are
forwarded to a number of people with the proper knowledge included
in their profile, who thus should be able to answer the question.
Besides matching skills and interests, the system is able to
identify locations and forward questions to people within the same
area when requested by the query.

User profiles are created by extracting likes from social media
accounts. These skills and interests can also be manually adjusted.
In addition, the system tracks all the actions of the user within
the application and automatically updates user profiles in order
to improve the matching of questions to skills and interests.

In order to make sure users will not only ask questions without
helping other people by answering theirs, a reward system is used.
By signing up, people are allowed to ask two questions. After using
the two questions, the system motivates the users by making use of
levels. Within the application, points are allocated per category.
The combined points of all categories determine the user level in
the application and the number of questions a user is allowed to
ask.

As shown in this paper, the application relies heavily on
understanding the content of the questions and answers provided
by users. As the possibilities in this field grow, the application
becomes better at providing the user with what they need.

7. FUTURE WORK
As people usually separate personal and work accounts, linking a
single social media account will not provide all their interests
and knowledge. A possible extension of the application described
could be to have a user link multiple accounts, creating the
possibility to generate categories over more of their knowledge,
providing a more extensive profile.

As shown in this paper, understanding the content of text can
create great technological advances, helping people to find
answers to questions quickly and efficiently. The current
understanding of such content by machines is, however, still in its
early stages. As this field improves and machines gain a better
understanding of the content of text, more applications for its use
will come to light. In this paper we have proposed one such
application; future research could, however, indicate many more
possibilities. Answering questions directly from information found
online could be one of them, although this requires much work.

One interesting possibility, after the creation of this application,
lies within its data. As much data is collected about user interests,
knowledge, questions, and answers, this can be used to analyze
and detect patterns. An example of this could be to see where
certain knowledge is lacking: if a certain location generally asks
questions about a certain topic, and has gained levels in certain
categories, we can conclude that in that area one category is
well known, while another is lacking. This could be interesting to
examine, as it could, for example, show the knowledge
diversification around the world and determine a culture's ability
to share this knowledge. The data collected by the system could
be used in many other ways, and this would be a very interesting
aspect to research.

8. REFERENCES
[1] Bijl, D. & Hyde-Thomson, H. (2001). Speech to text
conversion. US.

[2] Brunsting, S., Dolman, R., Sterck, H.D., & Sprundel,
T.V. (2016). GeoTextTagger: High-Precision Location
Tagging of Textual Documents using a Natural Language
Processing Approach. CoRR, abs/1601.05893.

[3] Cavusoglu, H., Li, Z., & Huang, K. (2015). Can
Gamification Motivate Voluntary Contributions?
Proceedings Of The 18th ACM Conference Companion On
Computer Supported Cooperative Work & Social Computing
- CSCW'15 Companion.
http://dx.doi.org/10.1145/2685553.2698999

[4] Croft, W., Metzler, D. and Strohman, T. (2009).
Information retrieval in practice. 1st ed. Upper Saddle River,
N.J.: Pearson Education.

[5] Deng, L. & Huang, X. (2004). Challenges in adopting
speech recognition. Communications Of The ACM, 47(1), 69.
http://dx.doi.org/10.1145/962081.962108

[6] Deterding, S. (2012). Gamification: designing for
motivation. interactions, 19(4), 14-17. doi:
10.1145/2212877.2212883

[7] Franz, A. & Milch, B. (2002). Searching the Web by
voice. Proceedings Of The 19th International Conference
On Computational Linguistics, 2.
http://dx.doi.org/10.3115/1071884.1071887
[8] Gazan, R. (2011). Social Q&A. Journal of the American
Society for Information Science and Technology, 62(12),
pp. 2301-2312.

[9] Hamari, J., Koivisto, J., & Sarsa, H. (2014). Does
Gamification Work? A Literature Review of Empirical
Studies on Gamification. 2014 47th Hawaii International
Conference On System Sciences.
http://dx.doi.org/10.1109/hicss.2014.377

[10] Harper, F., Moy, D., & Konstan, J. (2009). Facts or
friends? Proceedings Of The 27th International Conference
On Human Factors In Computing Systems - CHI '09.
http://dx.doi.org/10.1145/1518701.1518819

[11] Harper, F., Raban, D., Rafaeli, S., & Konstan, J. (2008).
Predictors of answer quality in online Q&A sites. Proceeding
Of The Twenty-Sixth Annual CHI Conference On Human
Factors In Computing Systems - CHI '08.
http://dx.doi.org/10.1145/1357054.1357191

[12] Harper, F., Weinberg, J., Logie, J. and Konstan, J.
(2010). Question types in social Q&A sites. First Monday,
15(7).

[13] Kurland, O. (2008). The opposite of smoothing.
Proceedings Of The 31st Annual International ACM SIGIR
Conference On Research And Development In Information
Retrieval - SIGIR '08.
http://dx.doi.org/10.1145/1390334.1390366

[14] Lewis, Z., Swartz, M., & Lyons, E. (2016). What's the
Point?: A Review of Reward Systems Implemented in
Gamification Interventions. Games For Health Journal, 5(2),
93-99. http://dx.doi.org/10.1089/g4h.2015.0078

[15] Liddy, E. D. (2001). Natural language processing. In
Encyclopedia of Library and Information Science, 2nd Ed.
NY. Marcel Decker, Inc.
http://surface.syr.edu/cgi/viewcontent.cgi?article=1043&context=istpub

[16] Luhn, H. P. (1958). The automatic creation of literature
abstracts. IBM Journal of Research and Development, 2(2),
159-165.

[17] Morris, M., Teevan, J., & Panovich, K. (2010). What do
people ask their social networks, and why? Proceedings Of
The 28th International Conference On Human Factors In
Computing Systems - CHI '10.
http://dx.doi.org/10.1145/1753326.1753587

[18] Neustein, A. (2010). Advances in Speech Recognition
(1st ed.). Dordrecht: Springer.

[19] Paskuda, M. and Lewkowicz, M. (2016). Anonymity
interacting with participation on a Q&A site. AI & SOCIETY.

[20] Young, W. (2016). The voice search explosion and how
it will change local search. Search Engine Land. Retrieved 5
March 2017, from
http://searchengineland.com/voice-search-explosion-will-change-local-search-251776

[21] Zadeh, L. (2003). From search engines to question-
answering systems - the need for new tools. The 12th IEEE
International Conference On Fuzzy Systems, 2003. FUZZ
'03. http://dx.doi.org/10.1109/fuzz.2003.1206586
