The ISTN 731 chatroom is a discussion forum on HipChat hosted by Craig
Blewett and Rosemary Quilling, lecturers of Special Topics, a second-semester
module for the 2014 ISTN honours students of the University of KwaZulu-Natal.
A few weeks before the session described below, Craig and Rose gave the class
an assignment that involved using Twitter as a tool for knowledge management.
Personally, I have had a Twitter account since 2012 and had never considered
it a tool for knowledge management.
On Tuesday, 16 September 2014, we held a chatroom session about the use of
microblogging, specifically Twitter, for knowledge management. Alan Kyeswa
opined that the good thing about Twitter is the possibility of sharing your
thoughts and knowledge with an unlimited audience who are themselves
tweeters and bloggers; the limitation, however, is the restriction of each
post to 140 characters. While most students agreed that microblogging is an
effective tool for knowledge management, Dhevesh Parumasur thought that it
may fail as a tool for knowledge management in organizations. Employees use
their knowledge to get ahead of each other for promotions and other
advantages and may therefore be unwilling to share it, so formal knowledge
management systems may fail. He further said that if companies use public
microblogging sites such as Twitter, integrity might be an issue: simply by
following the company, anyone would be able to obtain knowledge that should
be confidential and that gives the company a competitive advantage. He
suggested that looking at an enterprise microblogging solution within the
organization could increase the integrity of microblogging and the
confidentiality of information.
Another challenge, which was unanimously agreed upon, is that microblogging
sites such as Twitter carry continuous streams of information, and it can be
very hard to mine out relevant data when needed. One solution to this problem
is to use the hashtag (#) when posting information. The challenge that
remains is that some individuals do not tag their posts, and tags, when used,
are user curated, so different people may tag the same topic differently.
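
The idea of mining a stream by hashtag can be sketched in a few lines of
Python. This is only a minimal illustration: the sample posts are invented,
the helper names `hashtags` and `filter_by_tag` are hypothetical, and no real
Twitter API is called. It also shows the weakness noted above, since the
untagged post is silently missed.

```python
import re

# A hashtag is "#" followed by letters, digits or underscores.
HASHTAG = re.compile(r"#(\w+)")

def hashtags(post):
    """Return the set of lower-cased hashtags found in a post."""
    return {tag.lower() for tag in HASHTAG.findall(post)}

def filter_by_tag(posts, tag):
    """Keep only the posts that carry the given hashtag (case-insensitive)."""
    wanted = tag.lstrip("#").lower()
    return [p for p in posts if wanted in hashtags(p)]

# Invented sample stream (assumption, not real tweets).
posts = [
    "Microblogging for knowledge management #UKZN2014 #km",
    "Great session in the chatroom today",  # untagged, so it is missed
    "Enterprise microblogging keeps data inside the firewall #ukzn2014",
]

tagged = filter_by_tag(posts, "#ukzn2014")
for post in tagged:
    print(post)
```

Matching is done on lower-cased tags so that #UKZN2014 and #ukzn2014 count as
the same tag; differently worded tags for the same topic would still be
missed.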

Micro-blogs allow users to exchange small elements of content such as short
sentences, individual images, or video links.
1. Ambient Awareness: In combination, different tweets sent out over
time can paint a very accurate picture of a person's activities. Several
tweets together can generate a strong feeling of closeness and intimacy.
Due to ambient awareness, applications such as Twitter result in
relatively high levels of social presence, defined as the acoustic, visual,
and physical contact that can be achieved between two individuals
(Short, Williams, & Christie, 1976); and media richness, defined as the
amount of information that can be transmitted in a given time interval
(Daft & Lengel, 1986).
2. Push-Push-Pull Communication: Following implies that an author's
tweets are automatically pushed onto the Twitter main page of all
followers. Following another user creates an additional element of
convenience, as it reduces the effort associated with accessing this
information. In some cases, the receiver of the message might find the
news so interesting and intriguing that they decide to give it an
additional push by re-tweeting it to their own followers. Specifically,
when the initial message has been sent out by a company, this
transformation of a commercial message into buzz can substantially
increase its impact and credibility (Katz & Lazarsfeld, 1955). Once the
message has been pushed and pushed again through the whole
network, it may motivate some users to go out and pull additional
information on the subject from other sources.
3. Virtual Exhibitionism and Voyeurism: Because tweets are public by
default, unless otherwise requested, any message sent through Twitter
automatically becomes public knowledge within minutes of its
publication. This creates the perfect environment for virtual
exhibitionism and voyeurism. Together, these two factors represent the
third reason behind the growth of microblogging applications.
1. Security: Some informants reported that they might share even more
work-related updates, and would like to be aware of similar things from
co-workers, if Twitter could be a safe place to post inside the company's
firewall. Although people hesitate to mention project- or client-specific
information on a public feed, some still worried whether it would be a
safe place to explicitly discuss business-sensitive information. Care
needs to be taken about what is said, as one never knows whether
someone from outside the company is spying on the conversation.
2. Privacy: On Twitter, subscribing to someone's updates is open and needs
no approval, and the system sends a user's updates to all of his or her
subscribers. An employee may have concerns about what to post if the
boss is on the subscription list. A manager may hesitate to post
updates because he or she may not want all team members to know what
he or she has been doing.
3. Integration: People don't only use Twitter for knowledge management;
some use it to keep up to date with their friends, family and even
business partners. Many people therefore find it difficult to integrate
these two functions.
4. Filtering and grouping: Another problem with the use of microblogs for
knowledge management is that some people could end up following too
many people. This could result in cognitive overload, where users have a
hard time monitoring a large number of people and following their
updates. I recently had to unfollow a pastor on Twitter because he
overloaded my page with so many sermons and motivational posts that I
could hardly see anything else.
Kaplan and Haenlein (2010, p. 61) define social media, of which micro-blogs
are one form, as a group of Internet-based applications that build on the
ideological foundations of Web 2.0 and that allow the creation and exchange
of user-generated content. Such applications keep users in contact with each
other without necessarily incurring excessive cost (Bughin, Chui & Miller,
2009). This enhances communication with a wide audience and consequently
increases collaboration and overall productivity in organizations (Hoong,
Ming, Aripin & Aun, 2012).

1. Bughin, J., Chui, M., & Miller, A. (2009). How companies are benefiting
from Web 2.0. McKinsey Global Survey Results.
2. Daft, R. L., & Lengel, R. H. (1986). Organizational information
requirements, media richness, and structural design. Management
Science, 32(5), 554-571.
3. Hoong, A. L. S., Ming, L. T., Aripin, R., & Aun, J. L. R. (2012). A
comparison study on the use of knowledge management systems and enterprise
microblogging systems for organizational knowledge sharing. 2nd
International Conference on Management (2nd ICM 2012) Proceeding.
4. Kaplan, A. M., & Haenlein, M. (2010). Users of the world, unite! The
challenges and opportunities of social media. Business Horizons, 53(1),
5. Katz, E., & Lazarsfeld, P. F. (1955). Personal influence: The part played by
people in the flow of mass communications. Glencoe, IL: The Free Press.
6. Short, J., Williams, E., & Christie, B. (1976). The social psychology of
telecommunications. Hoboken, NJ: John Wiley & Sons, Ltd.

Good things, they say, do not come easy, and neither does data mining. Just
as gold is mined, refined and turned into an asset, data needs to be dug out
of the sea of knowledge that exists in the world of Web 2.0. As microblogs
grow more popular, microblog services have become information providers on a
web scale (Zhang & Sun, 2012).
Microblogging has become a popular social networking service in recent times
for the following reasons:
Ease of use and convenience
Real-time platform to update personal status
Possibility to publish microblog over multiple delivery channels (web
interface, cell phones, etc.)
Kang, Lerman & Plangprasopchok (2010) divided microblogs into three categories:
Conversations or
Retweet messages
Whatever the category, however, effectively digging out latent topics and
internal semantic structure from large-scale data is a very challenging issue
(Zhang & Sun, 2012).
More than ever before, organizations and businesses have realized that
information is as much an asset as their financial and infrastructural
assets, if not more, and that the overall business strategy and capability
must be well aligned with the IT strategy and capabilities to produce value
for all stakeholders. Most of them have created an online presence and taken
to social media and microblogging, resulting in streams of information that
form a robust body of knowledge on Web 2.0. Though microblogs contain
structured information, they are short texts (140 characters) and carry
limited information. This makes it very difficult to mine out relevant topics.
To solve this problem, the microblog tag feature, for example the hashtag (#)
feature on Twitter, has become immensely useful (Gunther & Krasnova 2009). It
has made it easier to find relevant information when mining a microblog than
with traditional knowledge mining. For example, I used the hashtag feature to
mine out posts by the 2014 ISTN honours class on Twitter, using #ukzn2014,
and in seconds I had the posts of the 2014 ISTN honours students that had
been tagged, as shown below:

Zhang & Sun (2012) also proposed a novel probabilistic generative model based
on LDA, called MB-LDA, which takes contactor relevance and document relevance
into consideration to improve topic mining in microblogs.
Traditional mining for knowledge management usually works on plain text, and
quite a few methods, models and algorithms are used. Clustering methods are
an early solution: they convert unstructured text data to vectors using the
Vector Space Model (VSM) and cluster them with traditional methods such as
K-means (Xu & Wunsch, 2005). The documents in each resulting cluster are
usually considered to share the same topic. However, the major disadvantage
of clustering methods is that many of these algorithms depend on distance
functions for pairwise distance measurements, which are difficult to define
in large-scale corpora. Besides, the clustering results carry no semantic
information and need further analysis to extract topics.
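
To make the clustering approach concrete, here is a minimal pure-Python
sketch of VSM plus K-means. It is an illustration under stated assumptions,
not the method of any cited paper: the four tiny documents are invented, the
vectors are raw term counts, Euclidean distance is used, and the initial
centroids are chosen by hand through the hypothetical `seeds` parameter so
the run is deterministic.

```python
import math
from collections import Counter

def vectorize(docs):
    """Vector Space Model: one term-count vector per document."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        vectors.append([float(counts[w]) for w in vocab])
    return vocab, vectors

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(vectors, k, seeds, iterations=10):
    """Plain K-means; `seeds` are indices of the initial centroid vectors."""
    centroids = [list(vectors[i]) for i in seeds]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Invented toy corpus (assumption): two microblog posts, two workplace posts.
docs = [
    "twitter hashtag tweet",
    "tweet hashtag twitter follow",
    "promotion salary employee",
    "employee promotion manager",
]
vocab, vectors = vectorize(docs)
labels = kmeans(vectors, k=2, seeds=[0, 2])
print(labels)
```

The output is only a list of cluster numbers. Nothing in it says what each
cluster is about, which is the lack of semantic information described above.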
Dimensionality reduction methods such as Latent Semantic Analysis (LSA) were
introduced to text mining by Deerwester et al. (1990). Assuming that words
close in meaning will occur close together in text, LSA constructs a
term-document matrix using the popular tf-idf scheme (Salton & McGill, 1983)
and uses singular value decomposition (SVD) to capture its latent semantics
in the concept space. Although LSA can extract topics from corpora and find
relations between terms in a semantic way, its limitations are that the
result of SVD is hard to interpret and that LSA itself cannot capture
polysemy. LSA is also not suitable for large-scale text mining because of its
high computing cost.
Probabilistic topic models such as Latent Dirichlet Allocation (LDA) were
introduced to text modelling by Blei et al. (2003). LDA is also a method for
recovering latent topic structure; it extends PLSA (Hofmann, 1999) by
defining a complete probabilistic generative model. The intuition behind LDA
is that documents exhibit multiple topics, which are represented by
distributions over words. In the framework of LDA, the words of documents are
the observed variables while the topic structures are the hidden variables.
Through probabilistic inference for LDA, the hidden variables are inferred
and topics can be discovered from the corpus (Griffiths & Steyvers, 2006).
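
The generative story behind LDA can be sketched as follows. This is a hedged
illustration of the forward (generating) direction only, not of inference:
the vocabulary, the two topic-word distributions and the Dirichlet parameter
are invented for the example, and the function names are hypothetical. Real
LDA works in the opposite direction and infers the hidden topics from
observed words.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    """Return an index drawn according to the weights in `probs`."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_document(topics, vocab, alpha, length, rng):
    """Generate one document following the LDA generative story."""
    theta = sample_dirichlet(alpha, rng)        # per-document topic mixture
    assignments, words = [], []
    for _ in range(length):
        z = sample_categorical(theta, rng)      # hidden topic of this word
        w = sample_categorical(topics[z], rng)  # observed word from topic z
        assignments.append(z)
        words.append(vocab[w])
    return theta, assignments, words

# Invented vocabulary and topic-word distributions (assumptions).
vocab = ["tweet", "hashtag", "follow", "salary", "manager", "promotion"]
topics = [
    [0.4, 0.3, 0.3, 0.0, 0.0, 0.0],  # topic 0: microblogging words
    [0.0, 0.0, 0.0, 0.3, 0.3, 0.4],  # topic 1: workplace words
]

rng = random.Random(0)  # fixed seed so repeated runs give the same document
theta, assignments, words = generate_document(topics, vocab, [0.5, 0.5], 8, rng)
print(theta)
print(list(zip(assignments, words)))
```

Here `words` plays the role of the observed variables and `theta` and
`assignments` the hidden ones; a single document can mix words from both
topics, which is the intuition described above.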
Unlike plain text, microblogs have special symbols (namely @ and RT) that
characterize the relations between microblogs: @ indicates the contactor
relevance relation and RT indicates the document relevance relation.
The contactor relevance relation means that a conversation message and its
contactor (@) have a latent semantic relationship. In general, messages with
the same contactor share related topics. This phenomenon is very common in
conversation microblogs.
The document relevance relation means that the comment on a retweeted message
and its original content have a latent semantic relationship. In general, the
comment part and the original part share related topics. This phenomenon is
very common in retweet microblogs.
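
A rough sketch of how the @ and RT symbols could be pulled out of a post is
given below. The regular expressions assume the common "comment RT @user:
original" layout, which is an assumption about formatting rather than a rule
stated by Zhang & Sun (2012). The sample posts and the function name
`classify` are invented, and the "other" label is only a placeholder for
posts that carry neither symbol.

```python
import re

# "@name" marks a contactor; "RT @name: text" marks a retweeted original.
MENTION = re.compile(r"@(\w+)")
RETWEET = re.compile(
    r"^(?P<comment>.*?)\bRT\s+@(?P<author>\w+):?\s*(?P<original>.*)$"
)

def classify(post):
    """Split a post into the parts that the two relevance relations use."""
    m = RETWEET.match(post)
    if m:
        # Document relevance: the comment and the original share topics.
        return {
            "kind": "retweet",
            "comment": m.group("comment").strip(),
            "author": m.group("author"),
            "original": m.group("original").strip(),
        }
    contactors = MENTION.findall(post)
    if contactors:
        # Contactor relevance: messages to the same contactor share topics.
        return {"kind": "conversation", "contactors": contactors}
    return {"kind": "other"}

# Invented sample posts (assumption).
retweet = classify("Totally agree! RT @alan: Twitter reaches an unlimited audience")
conversation = classify("@dhevesh would employees really share what they know?")
plain = classify("Chatroom session today")
print(retweet)
print(conversation)
print(plain)
```

Grouping posts by `contactors`, or pairing each `comment` with its
`original`, gives the two kinds of linked text that the relations above refer
to.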
In lay terms, from my experience of mining information the traditional way
from sites such as Google for my research proposal, I observed the following
differences between traditional (plain-text) data mining and microblog mining:
| S/N | Microblog Mining | Traditional Mining |
|-----|------------------|--------------------|
| 1. | Rich in information from diverse viewpoints, opinions and perceptions, because of comments from different people on the same topic or tweet | Information is usually a monopoly of one person's viewpoint or opinion |
| 2. | Information is usually on trending issues and is up to date | Usually repositories of archived knowledge, which in most cases are not as up to date as microblogs |
| 3. | Information is short, concise and straight to the point because of the 140-character limit | The author can perambulate as much as he deems appropriate, so a lot of sieving needs to be done to get the actual points needed |
| 4. | Microblogs can contain shortened URL links to other sites with relevant information; the site could shed light on doubts and questions the researcher has, and even open up the possibility of contacting the author for further clarification | Information can be unclear and confusing, and even then there is hardly any provision to contact the author |

From the points raised above, I can safely conclude that it is easier and more
rewarding to make use of microblog mining rather than the traditional means
of mining for knowledge management.

1. Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation.
The Journal of Machine Learning Research, 3: 993-1022.
2. Deerwester, S., Dumais, S., Landauer, T., et al. (1990). Indexing by
latent semantic analysis. Journal of the American Society for Information
Science, 41(6): 391-407.
3. Griffiths, T., & Steyvers, M. (2006). Probabilistic topic models. Latent
Semantic Analysis: A Road to Meaning. Hillsdale, NJ: Lawrence Erlbaum.
4. Hofmann, T. (1999). Probabilistic latent semantic indexing. In
Proceedings of the 22nd Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval, 50-57.
5. Kang, J. H., Lerman, K., & Plangprasopchok, A. (2010). Analysing
Microblogs with Affinity Propagation. In Proceedings of the 1st KDD
Workshop on Social Media Analytics, 67-70.
6. Landauer, T. K., Foltz, P. W., & Laham, D. (1998). Introduction to Latent
Semantic Analysis. Discourse Processes, 25: 259-284.
7. Salton, G., & McGill, M. (1983). Introduction to Modern Information
Retrieval. New York: McGraw-Hill.
8. Xu, R., & Wunsch, D. (2005). Survey of clustering algorithms. IEEE
Transactions on Neural Networks, 16(3): 645-678.
9. Zhang, C., & Sun, J. (2012). Large scale microblog mining using
distributed MB-LDA. International World Wide Web Conference
Committee (IW3C2).