
Comparative Study of Data Visualization Tools

Written during the Data Scientist Enablement Program

As an assignment in the Fast track to Data Science (DSE 400) module
March 2014
They say the eye is the window to the soul. The eye is also the biggest communication channel
into our brain. They also say a picture is worth a thousand words [1]. This is the main reason why
we use visual aids when we need to communicate what matters most. Sometimes simplification is
needed: a symbol can say more than a complex drawing.
VISUALIZATION AS JOURNEY TO SEEING, EXPLORING AND UNDERSTANDING
Visualization
Creating a picture from data, even from a few arbitrarily ordered records, is a helpful way to see
important things that we cannot see directly. Even a static visualization (a picture or a chart) turns
data into new information, but its static nature can also be its limitation.
How can visualization be helpful? The following chart shows a subset of a dataset about prostate cancer [2].
The visualization quickly reveals a typo in the measured values for weight:

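The value in the red circle is 6.1 for weight, which corresponds to a prostate weight of 449 g (the
dataset records weight as a natural logarithm, and ln 449 ≈ 6.1). This is simply a misplaced decimal
point: the correct value is 3.8, which corresponds to 44.9 g.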
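The decimal-point slip can also be checked numerically. The following is a minimal sketch, assuming the weight values are natural logarithms of the weight in grams. Only the values 6.1, 3.8, 449 g and 44.9 g come from the text; the other sample values and the function name flag_outliers are hypothetical. The outlier rule used here (distance from the median, measured in median absolute deviations) is one common choice and not necessarily how the chart was read.

```python
import math
from statistics import median


def flag_outliers(values, k=5.0):
    """Return values lying more than k median absolute deviations from the median."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    return [v for v in values if abs(v - center) > k * mad]


# Hypothetical log-weights for illustration; only 6.1 is taken from the text.
log_weights = [3.5, 3.6, 3.9, 3.3, 4.0, 6.1, 3.7]

print(flag_outliers(log_weights))  # [6.1]

# Converting grams to log scale shows why 6.1 looks like a decimal-point slip.
print(round(math.log(449.0), 1))  # 6.1
print(round(math.log(44.9), 1))   # 3.8
```

A chart exposes such a value at a glance; a numeric rule like this one needs a threshold (k) that has to be chosen for each dataset.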
Interaction
A static data visualization can be misleading when more detail or additional explanation is
needed. In the magnificent presentation by Hans Rosling [3] you can see many examples of data
shown at different levels of detail and of how they change with interaction.
Both examples¹ use a chart of child survival in % against country GDP in USD.
The first example comes at minute 9:40, when he shows that Sub-Saharan Africa has the lowest child
survival in % compared to the other continents. But when the region is split into countries, we can see
that Sierra Leone has the worst child survival in %, while Mauritius has better child survival in % than
the other continents.
The second comes at minute 14:00, when he shows the populations of Uganda, South Africa and Niger in
2003 split into groups in steps of 20% (from the poorest to the richest). It is all still Africa, and on
average these countries are very close, so it looks as if one solution could be proposed for the whole
of Africa. The opposite is true! Comparing child survival in % for the poorest in Niger with the richest
in South Africa reveals a big gap between the countries.
In the interactive charts he also compares how measures behave over time and how countries
correlate. So interaction is important when you want and need to see context.
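The effect of splitting an aggregate into its parts can be sketched in a few lines. The example below is illustrative only: the region and country names and all survival percentages are made-up placeholders, not values from the talk [3], and the helper names aggregate and drill_down are hypothetical. It shows how a single regional average can hide a wide spread between its members.

```python
from statistics import mean

# Placeholder records: (region, country, child survival in %). Not real data.
records = [
    ("Region X", "Country A", 72.0),
    ("Region X", "Country B", 85.0),
    ("Region X", "Country C", 98.0),
    ("Region Y", "Country D", 96.0),
    ("Region Y", "Country E", 97.0),
]


def aggregate(rows):
    """Top-level view: one average value per region."""
    by_region = {}
    for region, _country, value in rows:
        by_region.setdefault(region, []).append(value)
    return {region: mean(values) for region, values in by_region.items()}


def drill_down(rows, region):
    """Detail view: the per-country values inside one region."""
    return {country: value for r, country, value in rows if r == region}


print(aggregate(records))               # Region X looks worse on average
print(drill_down(records, "Region X"))  # yet Country C beats Region Y's average
```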
Information
The most important outcome of visualization and interaction is the information and knowledge that we
take away from the process. Understanding information and gaining experience through knowledge is the
target we aim for.
According to various publications and research, e.g. from Cisco [4] or Hewlett-Packard [5], we learn
and understand what information means better when we use visual stimulation rather than just a text
description, and much better when we can interact with it (labels, annotations, zoom) or work with it
(filter specific data, drill down/up, slice and dice).
So it is not only simple visualization but also good description (axes, legend, units), additional
explanation (annotations, labels, sources) and interaction that turn data into information through
visualization.
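The working operations named above (filtering, drilling up, slicing and dicing) are easy to mimic on a small table. This is a minimal sketch, assuming made-up sales records with three dimensions (year, region, product); all names and numbers are hypothetical and serve only to show what each operation does to the data behind a chart.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, amount).
sales = [
    (2012, "North", "A", 10),
    (2012, "South", "A", 7),
    (2012, "North", "B", 4),
    (2013, "North", "A", 12),
    (2013, "South", "B", 9),
]


def slice_rows(rows, year):
    """Slice: fix a single value of one dimension (here the year)."""
    return [row for row in rows if row[0] == year]


def dice_rows(rows, years, regions):
    """Dice: keep a sub-cube by restricting several dimensions at once."""
    return [row for row in rows if row[0] in years and row[1] in regions]


def roll_up(rows, key_index):
    """Drill up: total the amount over one dimension."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[3]
    return dict(totals)


print(slice_rows(sales, 2013))
print(dice_rows(sales, {2012, 2013}, {"North"}))
print(roll_up(sales, 0))  # totals per year
```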


¹ I will go against the theory described in the following Information subchapter, namely that a visual
description of information is better than a text description. I have not been able to replicate the graphs
from the video [3], even though I found the software [9] with which they were created, including the data
and similar indices, and tried very hard.
TECHNICAL ASPECTS
From the technical point of view, visualization tools can be divided into a few categories²:
Programming languages: Every language has some visualization ability, but most of its power
is unlocked only when an additional visualization library is used.
Visualization libraries: There are many JavaScript visualization libraries [6], R libraries
such as ggplot2 [7], and many libraries for other languages (PHP, Java, etc. [8]). They come in
commercial and non-commercial versions, where you can pay for licenses or support. They may be
offered as pure chart libraries or as universal visualization tools for creating anything you
like.
Online tools: These are specific tools for different purposes. Some are narrowly focused, like
LinkedIn relationships [10] or Dipity [11]; others are more flexible, like Many Eyes [12] or
Wolfram Alpha [13]. In either case they provide an online service, which is always useful when
you need to solve something quickly without installing any software.
Applications: There are applications for specific uses, like Weka 3 [14] or Gapminder [15], and
more universal applications, like Gephi [16] or GraphViz [17], and many more. It is worth
searching the internet to see whether a specific one exists for your purpose.
Geospatial tools: Geographic information placed directly on a map makes more sense and
simplifies understanding of the visual data representation. This group is a subset of the
Applications group above, and these tools are mostly limited to visualizing geographic
information only.
Complex BI Reporting Tools [20][21]: When you need a real reporting tool with data visualization
and interaction, turn to this group. Many companies offer their complex BI/analytical tools
online as SaaS, letting you process data in the cloud and report on it over the web in
complex dashboards, whose charts interact with each other and give a comprehensive
overview of the data to support future decisions.
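As a small illustration of the first category, even a bare language without any charting library can produce a rough picture. The sketch below uses only the Python standard library to print a horizontal bar chart; the function name text_bar_chart and the sample counts are hypothetical. A dedicated library such as ggplot2 [7] gives far richer output, which is the point made in the list above.

```python
def text_bar_chart(data, width=40, mark="#"):
    """Render a {label: value} mapping as lines of a horizontal bar chart."""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        # Scale every bar relative to the largest value.
        bar = mark * round(width * value / largest) if largest else ""
        lines.append(f"{label:<{label_width}} | {bar} {value}")
    return lines


# Hypothetical counts, for illustration only.
sample = {"alpha": 12, "beta": 30, "gamma": 6}

for line in text_bar_chart(sample, width=20):
    print(line)
```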
BUSINESS ASPECTS
In the technical view of visualization tools we care about what kinds of tools exist and what
kind of visualization and additional functionality they offer.
From the business perspective we need to ask [18]:
When? The situation in which the visualization is used.
What? The purpose for which we want to use a particular visualization.
Why? The benefits of this visualization.
How? The format of the visualization.

We also need to ask what kind of data we want to load and at what level of complexity we want to
visualize it (one chart or a dashboard, interactivity), and what insight [19] the tool allows us to gain
from the visualized data (proper representation, references, evaluation and prediction). As the last step
we need to ask about sharing the final representation and/or collaborating on it with other people.

² As a side effect of this comparative study, a presentation was created from the visualization tools I
came across. It is organized into categories corresponding to the technical aspect categories and
contains more than 100 different visualization tools [22].
HOW TO VISUALIZE?
The answer to this question can be reached through the following questionnaire. Ask yourself these
questions; based on the answers you will be able to tell which tool is most suitable for your
current purpose:
Purpose
o What exactly do you want to study?
o What visualization do you want to see?
o What kind of chart will you use?
Data
o What kind of data do you have?
o What format of data do you have?
o Do you need to preprocess them?
Complexity
o Do you need a single (one-time) chart or a dashboard?
o Do you need interactivity?
o Do you need technical and/or implementation support?
Insight
o Is the tool able to represent data according to your needs?
o Does the tool let you see connections and references between different sources
(customer - purchase) and time periods (Year to Date, Year over Year)?
o Does the tool enumerate, calculate, simulate and/or predict trends?
Output
o Do you need collaboration and sharing of charts?
o Do you need exportable or printable output?
o Do you have specific requirements (privileges, offline usage, pre-calculation, etc.)?
This is just a simple list of questions that came to my mind while I was investigating
different visualization tools and evaluating them for different uses. Of course, you can extend this
questionnaire with any question you like and use it as criteria for your choice.
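One way to turn such a questionnaire into selection criteria is a simple weighted score. The sketch below is an assumption about how that could be done, not a method from this study: the criteria follow the five question groups above, while the weights, the tool names and the 0-5 ratings are invented placeholders.

```python
# Weights say how much each question group matters for one project (hypothetical).
weights = {"purpose": 3, "data": 2, "complexity": 2, "insight": 3, "output": 1}

# Ratings from 0 (poor fit) to 5 (excellent fit); tools and values are placeholders.
ratings = {
    "Tool A": {"purpose": 4, "data": 3, "complexity": 5, "insight": 2, "output": 4},
    "Tool B": {"purpose": 5, "data": 4, "complexity": 2, "insight": 4, "output": 3},
}


def weighted_score(tool_ratings, criteria_weights):
    """Weighted average of the ratings, on the same 0-5 scale."""
    total_weight = sum(criteria_weights.values())
    total = sum(criteria_weights[name] * tool_ratings[name] for name in criteria_weights)
    return total / total_weight


def rank_tools(all_ratings, criteria_weights):
    """Tool names sorted from the best to the worst weighted score."""
    return sorted(
        all_ratings,
        key=lambda tool: weighted_score(all_ratings[tool], criteria_weights),
        reverse=True,
    )


for tool in rank_tools(ratings, weights):
    print(tool, round(weighted_score(ratings[tool], weights), 2))
```

Changing the weights changes the ranking, which keeps the choice tied to the purpose at hand rather than to a fixed list of "best" tools.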

REFERENCES
[1] A picture is worth a thousand words, Wikipedia
[2] Prostate Cancer Data, 1989:
a. Prostate.info
b. Prostate.data
c. Prostate Cancer Data description
[3] The best stats you've ever seen, Hans Rosling, TED, Jun 2006
[4] Multimodal Learning Through Media: What the Research Says, Cisco, 2008
[5] The Power of Visual Communication, Hewlett-Packard, 2004
[6] Comparison of JavaScript charting frameworks, Wikipedia
[7] ggplot2, R language library official website
[8] List of charting software, Wikipedia
[9] Gapminder World, Google, 2008
[10] LinkedIn Relationships, web
[11] Dipity, web
[12] Many Eyes, web
[13] Wolfram Alpha, web
[14] Weka 3, web
[15] Gapminder, web
[16] Gephi, web
[17] GraphViz, web
[18] Towards A Periodic Table of Visualization Methods for Management, Ralph Lengler & Martin J.
Eppler, University of Lugano
[19] Need to Solve a Big Data Marketing Problem? Visualize It, John Balla, 2014
[20] Start2Cloud.com: Business Intelligence Applications review, web
[21] TrustRadius.com: Business Intelligence/Analytics Reviews, web
[22] Ultimate overview of Visualization Tools, Jaromir Salamon, 2013 (TBD)
