The Analyses of Big Data Effect on The Political Outcomes of Developing Countries:
By Emer Ferguson
PSYC 4P25
Big Data
The term Big Data is currently controversial because there is no universal definition of
what it contains. Most researchers have their own definition, built around what they believe
is the main factor of big data. For example, one meta-analysis examined current definitions
of big data. Mayer-Schonberger and Cukier (2013) describe big data in terms of shifts away
from the traditional way of analyzing data: using more data instead of a sample, accepting
incomplete or inaccurate data instead of being limited to complete data, and giving more
weight to correlation than to causality. There is also the 3V model, which defines big data
by the increase in volume, velocity and variety (Laney, 2001). Some believe that big data can
be described by its technological needs, where a computer needs increased capacity to process
the enormous and complex set of data (Microsoft Research, 2013). Other researchers wish to
emphasize the social impact of big data, defining it as a cultural, technological and scholarly
phenomenon based on the technology needed to accurately analyze the data, the ability to find
patterns in such large data sets, and the mythology surrounding the idea that the data obtained
is superior to previously found data. In the current study the definition used is that of
Kshetri (2014), who defines big data as data of significant volume, velocity and variety
which require cost-effective and innovative forms of
Introduction
Social media use has grown significantly, and with this growth in popularity it has
come to influence real-world events such as politics. Through the increased use of social
THE ANALYSES OF BIG DATA EFFECT ON POLTICAL OUTCOMES 3
media, it has become possible to exchange content and easier to communicate between
social groups and networks. This increase in use has produced an increase in big data, and
the growth in data created by social media has in turn increased the need for ways to
analyze it. One influential social media site is Twitter, which has become increasingly
important because of its abundance of unstructured data; only recently have researchers
been able to analyze such large quantities of data at the speed at which they are produced.
In this study the researchers used Social Network Analysis (SNA) to represent big data.
SNA is a method of measurement used to analyze social structure and to help predict the
structure of relationships among social entities, as well as the impact such structures have
on social phenomena. SNA also uses sociograms for visualization and data which shows the position of
Social media has a significant effect on politics: it can provide voters with
information that helps them make more informed voting decisions. Social media has also
helped bring politicians closer to potential voters. It has had a significant effect on
political outcomes in developing nations by providing youth with important information that
helps them make better and more informed voting decisions. With the use of social media,
some countries have been able to rid themselves of dictators and monarchs.
Method
The researchers chose to use Twitter since it is based on the idea of communication with
others, and because of its connection to current events and its ability to track online trends using a hashtag
For the management and analysis of big data they chose NodeXL because of its
flexibility. NodeXL can also be used to do basic SNA by building a network made of nodes,
which often represent people, and edges, which represent the relationships between nodes;
together these form the social network. Using NodeXL the researchers extracted over 5000
data points from Twitter accounts that tweeted, retweeted or commented on the hashtag
#Nigeriadecides. When many people use a hashtag they refer to the same topic, and doing so
creates a tie between them. With NodeXL the researchers were able to calculate common
metrics such as degree, which counts the number of connections for each node; for directed
networks degree is divided into in-degree and out-degree. Participants with high in-degree
were able to connect and exchange with many people. Actors with high out-degree centrality
were considered highly influential, such as the actor Sahara Reporters, which had the highest
out-degree, 13. They also examined eigenvector centrality, which weights a node’s
connections by how well connected its neighbors are. Betweenness centrality helps to
determine how significant each node is in connecting different parts of the network and
which nodes are essential to the network. They also looked at closeness centrality, which
examines the distance of each actor from other actors and emphasizes the nodes that reach
the others through a lower number of edges.
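The degree and closeness metrics described above can be illustrated with a minimal sketch. It is not the NodeXL computation used in the study: the account names and edges are invented, closeness is computed on an undirected view of the graph, and the helper names `degree_counts` and `closeness` are hypothetical.

```python
from collections import deque

# Hypothetical directed edges: (source, target) means "source mentioned or
# retweeted target". These accounts are invented for illustration only.
edges = [
    ("ada", "news_desk"),
    ("bola", "news_desk"),
    ("chidi", "news_desk"),
    ("news_desk", "candidate"),
    ("ada", "candidate"),
    ("chidi", "bola"),
]


def degree_counts(edges):
    """Return (in_degree, out_degree) dictionaries for a directed edge list."""
    nodes = {n for edge in edges for n in edge}
    in_deg = {n: 0 for n in nodes}
    out_deg = {n: 0 for n in nodes}
    for source, target in edges:
        out_deg[source] += 1
        in_deg[target] += 1
    return in_deg, out_deg


def closeness(edges, start):
    """Closeness of `start` on the undirected version of the graph:
    (number of reachable nodes) / (sum of shortest-path lengths to them)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    # Breadth-first search gives shortest path lengths in an unweighted graph.
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


in_deg, out_deg = degree_counts(edges)
```

In this toy network `news_desk` receives the most incoming ties and is one step from every other account, so it has both the highest in-degree and the highest closeness.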
The researchers decided to use a sociogram with a cluster algorithm to group together
people with high network density of both in-degree and out-degree, who were considered
influencers in the network, as opposed to people with low network density, who were considered
isolates and unimportant because of their limited communication with others. Through this
process they were able to select the top people with in-degree values greater than 40 because
they were considered most influential. Then the researchers used automated group
assignment, in which an algorithm grouped the graph’s vertices into clusters by looking at
how the vertices are connected and how they communicate with each other.
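The selection and grouping steps above can be sketched as follows. This is only an illustration under stated assumptions: the in-degree values and edges are invented (only the threshold of 40 comes from the text), and grouping by weakly connected components is a simple stand-in, not the clustering algorithm the study ran in NodeXL. The function names are hypothetical.

```python
def select_influencers(in_degree, threshold):
    """Keep actors whose in-degree is strictly greater than `threshold`."""
    return sorted(n for n, d in in_degree.items() if d > threshold)


def connected_groups(edges):
    """Group vertices into weakly connected components (a simple stand-in
    for a clustering algorithm, not the one used in the study)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen, groups = set(), []
    for node in sorted(neighbors):
        if node in seen:
            continue
        # Depth-first walk collects every vertex reachable from `node`.
        stack, group = [node], set()
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbors[current] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Invented in-degree values; only the threshold of 40 comes from the text.
in_degree = {"outlet": 120, "candidate": 85, "voter_a": 3, "voter_b": 41}
influencers = select_influencers(in_degree, threshold=40)

# Invented edge list containing two separate conversations.
edges = [("voter_a", "outlet"), ("voter_b", "outlet"), ("fan", "candidate")]
groups = connected_groups(edges)
```

A real community-detection algorithm would also split a single connected conversation into denser sub-groups, which this stand-in does not attempt.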
Discussion
The principal actors in this study were Sahara Reporters, with the highest in-degree
value of 421, and thisisbuhari, with a value of 261. These top two actors also had the
highest betweenness values, as they provide the bridge to other parts of the network.
This was the fifth quadrennial election held in Nigeria since the end of military rule in
1999, and there were two main political parties: the People’s Democratic Party, with the
candidate Goodluck Jonathan, and the All Progressives Congress, with the candidate Muhammadu
Buhari. The researchers believe it was Muhammadu Buhari’s use of Twitter which increased his
likelihood of winning; he was the second most influential actor in the network. Another
reason was the media: the most influential actor in the network was Sahara Reporters, a
website that encourages citizens to report ongoing corruption and government malfeasance in
When examining the influence of social media, there has been significant growth in
internet use through social media outlets, which were used during this election to discuss
human rights, politics and more. However, in Nigeria only 8% of the population had access to
the internet as of 2016, so its success is limited, since such a small proportion of the
population had access to this information.
Results
The purpose of this paper was to look at the effects of social media on political
outcomes in developing countries. The findings do support the hypothesis that social media
is a driver of political awareness and social change in developing countries. They also found
that as internet use increases in developing countries, so does the discussion of pressing
topics. This study also helped to show that big data can be successful in the collection of
data. The study also showed an increase in voting due to social media: its results indicated
that people who used social media were 39% more likely to vote than people who did not. The
results of opinion polls on Twitter about the election outcome were also considered
significant because of how close they were to the actual result: the online prediction was
79% for Buhari and 21% for Jonathan, and the actual result was 55% to 45%. These are
considered significantly similar, and the difference could be caused by external factors.
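The comparison between the online prediction and the election result can be made concrete with a short calculation. The sketch below uses only the four percentages quoted above; the helper names `share_gaps` and `same_winner` are hypothetical.

```python
# Vote shares quoted in the text, in percent.
predicted = {"Buhari": 79, "Jonathan": 21}
actual = {"Buhari": 55, "Jonathan": 45}


def share_gaps(predicted, actual):
    """Absolute gap, in percentage points, between predicted and actual share."""
    return {name: abs(predicted[name] - actual[name]) for name in predicted}


def same_winner(predicted, actual):
    """True when both sets of shares put the same candidate first."""
    return max(predicted, key=predicted.get) == max(actual, key=actual.get)


gaps = share_gaps(predicted, actual)
agree = same_winner(predicted, actual)
```

Both sets of figures name the same winner, while each candidate's predicted share differs from the actual share by 24 percentage points.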
Validity
Looking at validity within this study, first there is face validity, which asks whether
the test appears, at face value, to measure what it claims to measure. The
hypothesis was about the effect of social media on political outcomes in developing countries.
However, the researchers only examined the effect of one hashtag on the number of connections
specific users developed. This limits the results because they could have been looking at only
one side of the election and ignoring another network which developed around a
different hashtag. Another factor is that only 8% of Nigeria has access to social media, which
does not represent enough of the population to make a significant report on the effect. Also, by
using Twitter for the collection of data many external factors come into play: the topic the
researchers were examining is accessible to everyone, not just the people of Nigeria, and because
of this the results of the study may not represent the effects of social media on
political outcomes in developing countries: people in another country may have responded,
and the researchers would have been unable to know if the people responding were actually
There is also content validity, which is the hope that the characteristics of interest are
the only ones present and that the study is not measuring anything irrelevant. In this
study, as previously stated, the researchers were looking at the effect of social media on politics.
However, they used only one form of social media, Twitter, and limited it further to only one
aspect of Twitter, the hashtag. Twitter has multiple ways to communicate with people, yet they
focused on only one, and they did not analyze the meaning of the comments people made,
which limits the significance of the results. The researchers used only one hashtag, which was
part of one candidate’s campaign, and by using only one they could have missed a whole
other network which formed around the other party. Another problem with this study’s content
validity is the small share of the Nigerian population that has access to social media, which
does not provide an accurate representation of the population that the researchers were studying.
Along with this, a significant proportion of the population was not included, such as people who do not speak
When examining big data, it is not possible to examine concurrent validity or
predictive validity. Concurrent validity compares the study in question to
other tests measuring the same variables; this is not possible with big data because the
data is constantly changing. Another test would either measure the same data using the
same algorithm, making it exactly the same, or be completely incomparable if it used
a different algorithm. Predictive validity measures the degree to which a test will predict these
variables in the future; because the data is constantly changing, it is impossible to say that
the variables will predict future results, since they are unlikely to be the same.
Reliability
When examining big data, it is very difficult to state reliability because currently it is
not possible to test the reliability of this data. Take test-retest reliability, which
involves having the participants take the test on two separate occasions in the hope of
obtaining similar results: that is not possible with big data because the participants and the
data being sent are constantly changing, so there is no test-retest reliability. There is also
inter-rater reliability, which involves different researchers giving estimates on the same
research. With big data, however, it is not a researcher collecting the data but an algorithm,
and, as with predictive and concurrent validity, if the researchers were to use the same
algorithm they would get exactly the same results regardless of how reliable the results are,
and if they used different algorithms the results would be so different that they would
be incomparable.
The only way to test the reliability of big data is to compare it to past methods for testing
political outcomes in developing countries. A meta-analysis from Jost (2017) presents the
theory of ideology as a motivation for voting behavior and found an effect size in which
one’s ideology was able to account for .80 of one’s voting decision. This is much higher than the
effect found in the study, in which those who were able to access social media were .39 more
likely to vote than people who were not. This shows that one’s ideology has double the
Test Bias
This study had many forms of test bias which could significantly affect the reliability of
the results. One is social desirability, which is the tendency for people to answer in a
specific way to appear socially desirable. In this study there were many instances in which
people would respond in hopes of appearing more socially desirable. On social media people
want to be part of the in-group; this still occurs online because people’s online personas
still want to be accepted by the in-group. This could affect the results because people may
tweet things they do not agree with or have no intention of doing just to appear socially
desirable. For example, people may have no plan to vote but still want to appear socially
desirable in this group and so speak out about voting. Second is acquiescence, which is the
tendency to agree with things, especially if you are unsure of the correct answer. This could
affect the results of this study because people may tend to just agree with others by liking and
retweeting their messages, so the researchers may not be getting an accurate representation of
people’s true opinions but rather of people’s tendency to agree. There is also test sensitivity,
which asks whether a person who agrees with a claim truly agrees. This is a significant bias on
Twitter because anyone can respond to anything and it is impossible for the researchers to
control, so in this study they are unable to know whether the people using the hashtag agree or
disagree with it.
Psychometric issues
Within the study there are many psychometric issues, such as the use of only one social
media site. By using only Twitter the researchers significantly limited their data, because they
eliminated many other channels through which people communicate, and because of this it is not
possible to state that social media can predict political outcomes when only one source of social
media was examined. Along with this, Twitter has many forms of communication, such as
retweeting, commenting, liking and the use of a hashtag; however, this study chose to use only
the hashtag, using it to show the connection between users. This is limiting because it ignores
many aspects of Twitter, and by focusing only on the connection between users of the hashtag, the
researchers eliminated the meaning of the message that the hashtag was a part of. Having
eliminated the context of the message around the hashtag, they have no idea whether the message
was positive or negative. They also used only one hashtag, which was specific to one party, and in
doing so eliminated a possible network which may have focused more on the other party, so it is
hard to state that social media in this election was able to predict political outcomes when they
examined only one side of the election. They also did not include enough of the Nigerian population
to reach a justified result, because only 8% of the population has access to social media, which is
Future research
When looking at the direction for future research, there is no way to repeat or improve this
study because big data is constantly changing, and it is impossible to retest this information.
However, it is possible for future researchers to learn from the mistakes in this study and improve
future tests in this field. As a direction, I would choose to look at developing
countries where more of the population has access to social media, and to look at more aspects of
Twitter and other social media outlets, in the hope of getting a fuller understanding of social
media’s effect on political outcomes in developing countries. As for big data, I believe research
should be done specifically in the hope of finding methods to test its reliability, so that we can
have stronger, more trustworthy results and so that researchers can more easily understand where they must make
Bibliography
Jost, J. T. (2017). Ideological Asymmetries and the Essence of Political Psychology. Political
Mauro, A. D., Greco, M., & Grimaldi, M. (2016). A formal definition of Big Data based on its
Udanor, C., Aneke, S., & Ogbuokiri, B. O. (2016). Determining social media impact on the
politics of developing countries using social network analytics. Program, 50, 481-507.
doi:10.1108/PROG-02-2016-0011