
WEB MINING

SOCIAL NETWORK ANALYSIS


SESSION 5
Learning Objectives
A. To understand the role of web mining and social network analysis in
today’s business context

Key Content:
Web Mining and Social Network Analysis
a. Web Content and Structure Mining
b. Web Usage Mining
c. Social Network Analysis

Essential Readings:
a. Chapters 14 and 15, Data Analytics by Anil Maheshwari (McGraw Hill)
WEB MINING
Web mining can be broadly defined as the discovery and
analysis of useful information from the World Wide Web (WWW).

Types of web mining:
- Web Content: HTML content
- Web Structure: URL links
- Web Usage: site visits, clicks


WEB USAGE MINING
Web usage mining refers to the automatic discovery and analysis
of patterns in clickstream and associated data collected or
generated as a result of user interactions with Web resources on
one or more Web sites

STEP 1: Websites (visitors, users, customers)
STEP 2: Web logs and web clickstreams
STEP 3: Gather data and prepare data for analysis
STEP 4: Mine the data for patterns
- Usage patterns
- Usage profiles
- Web page profiles
- Web optimization
WEB USAGE MINING PROCESS
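
The four steps of the process can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the log records, their layout (user, timestamp in seconds, page) and the 30-minute inactivity threshold are hypothetical and do not come from the course material.

```python
from collections import Counter, defaultdict

# Hypothetical clickstream records (Step 2): (user_id, timestamp_seconds, page)
LOG = [
    ("u1", 0, "/home"),
    ("u1", 60, "/products"),
    ("u1", 4000, "/home"),
    ("u2", 10, "/home"),
    ("u2", 50, "/cart"),
]

SESSION_GAP = 30 * 60  # assumed inactivity threshold, in seconds


def sessionize(log, gap=SESSION_GAP):
    """Step 3: group each user's clicks into sessions split by inactivity."""
    by_user = defaultdict(list)
    for user, ts, page in sorted(log, key=lambda r: (r[0], r[1])):
        by_user[user].append((ts, page))
    sessions = []
    for user, clicks in by_user.items():
        current = [clicks[0][1]]
        for (prev_ts, _), (ts, page) in zip(clicks, clicks[1:]):
            if ts - prev_ts > gap:
                # Long pause: close the current session and start a new one.
                sessions.append((user, current))
                current = []
            current.append(page)
        sessions.append((user, current))
    return sessions


def page_counts(sessions):
    """Step 4: a very simple usage pattern, the number of sessions that visit each page."""
    counts = Counter()
    for _, pages in sessions:
        counts.update(set(pages))
    return counts


sessions = sessionize(LOG)
print(sessions)
print(page_counts(sessions))
```

Real web logs would need cleaning first (bots, failed requests, user identification), which this sketch leaves out.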
APPLICATIONS FOR WEB MINING
1. Personalization of web content

The web usage mining technique can be applied to personalize
websites, depending on user profile and behaviour.
Personalization is important in creating a deeper relationship,
building acceptable marketing strategies, and automating the
promotion of products for potential customers.
Web usage mining also aims to obtain information that supports
website design, allowing customers easier and faster access.
2. System improvement
The results produced by web usage mining can be used to
improve the performance of web servers and web-based
applications.
By understanding the behaviour of web traffic, policies and
strategies can be produced for web caching, network transmission,
load balancing and data distribution.

3. Site design support


Usability is one of the most important issues in the design and
implementation of websites.
The results of web usage mining give designers information
about user behaviours that help in decisions about any redesign
of the content and structure of the website.
4. Enhance e-learning environment

Usage mining tools can be used to track the activities
happening within the course’s website, and then extract patterns
and behaviours that need to be changed, improved or adapted
to the course contents.

For example, designers can identify:
- the links that are always visited,
- the links that are never visited, and
- the clusters of users that visit specific links.
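
The three questions in this example can be answered with basic set operations. The sketch below assumes a hypothetical course site and made-up visit data. It uses a simple grouping of users per link in place of a real clustering method.

```python
# Hypothetical data: the links on a course site and the links each student opened.
ALL_LINKS = {"syllabus", "week1", "week2", "forum", "glossary"}
VISITS = {
    "ana": {"syllabus", "week1", "forum"},
    "ben": {"syllabus", "week1", "week2"},
    "cho": {"syllabus", "forum"},
}


def always_visited(visits):
    """Links opened by every user."""
    return set.intersection(*visits.values())


def never_visited(all_links, visits):
    """Links that no user opened."""
    return all_links - set.union(*visits.values())


def users_by_link(visits):
    """For each link, the group of users who opened it."""
    groups = {}
    for user, links in visits.items():
        for link in links:
            groups.setdefault(link, set()).add(user)
    return groups


print(always_visited(VISITS))
print(never_visited(ALL_LINKS, VISITS))
print(users_by_link(VISITS)["forum"])
```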
PAGE RANK
PageRank is what Google uses to determine the importance of a web
page. It's one of many factors used to determine which pages appear in
search results.

The heart of Google’s searching software is PageRank™, a system for
ranking web pages developed by Larry Page and Sergey Brin at Stanford
University.

PageRank is a numeric value that represents the importance of a page
present on the web.

When one page links to another page, it is effectively casting a vote for
the other page.

More votes imply more importance.


Mathematical PageRanks for a simple network, expressed as
percentages. (Google uses a logarithmic scale.)

Page C has a higher PageRank than Page E, even though there
are fewer links to C; the one link to C comes from an important
page and hence is of high value.

https://en.wikipedia.org/wiki/PageRank
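
The voting idea can be sketched as a short iterative computation. This is a simplified illustration, not Google's production algorithm: the four-page network is hypothetical, and the damping factor of 0.85 and the iteration count are assumptions not stated in the slides.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iteratively compute PageRank for a dict {page: [pages it links to]}.

    Assumes every linked-to page also appears as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its vote equally among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page web: A, B and C all link to D; D links back to A.
LINKS = {"A": ["D"], "B": ["D"], "C": ["D"], "D": ["A"]}
ranks = pagerank(LINKS)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

In this toy network, page A has only one incoming link, but it comes from the highly ranked page D, so A ends up well ahead of B and C. This is the same effect the figure caption describes for Page C.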
SOCIAL NETWORK ANALYSIS
WHAT IS SOCIAL NETWORK ANALYSIS?
o Social networks are graphical representations of relationships among people
and/or entities.
o Social network analysis is the art and science of discovering patterns of
interactions and influence among the participants in a network.

o Achieved relationships:
- Established in the course of regular interaction
- In the processes of daily life and living, cultural activities, etc.
- One household requesting help, support, or advice from another
- Ties of friendship or choice of individuals to spend leisure time together

o Units of a "social network":
- Individuals, families, households, villages, communities, regions
BENEFITS OF SOCIAL NETWORK ANALYSIS
Results of the SNA can then be applied by individuals,
departments or organizations to:

- identify the persons playing central roles (thought leaders, knowledge
brokers, information managers, etc.);
- identify bottlenecks and those who are isolated;
- spot opportunities for improving knowledge flows;
- target those areas where better knowledge sharing will have the most impact; and
- raise awareness of the significance of informal networks.
WHY AND WHEN SHOULD YOU CONDUCT AN SNA?
SNA is valuable for better understanding:

- Which actors are involved in a network;
- How they are linked;
- How influential each actor is;
- What their motivations are;
- How the network is structured; and
- How well people know each other's knowledge.

Example questions:
- What type of people have the most influence on our eating habits?
- Which politicians have the most influence on policy?
TYPES OF SOCIAL NETWORK ANALYSIS

1. Socio-centric = Whole networks

- Everyone working in one company
- Creating one network

2. Egocentric = Personal networks

- Creating many stand-alone networks
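
The difference between the two types can be illustrated by cutting a personal (egocentric) network out of a whole network. The tie list below is hypothetical. As an assumption for this sketch, an ego network is taken to be the person, their direct contacts, and the ties among them.

```python
# Hypothetical whole (socio-centric) network as undirected ties.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]


def ego_network(ego, edges):
    """Return the ego plus its direct contacts, and the ties among them."""
    alters = {v for u, v in edges if u == ego} | {u for u, v in edges if v == ego}
    members = alters | {ego}
    ties = [(u, v) for u, v in edges if u in members and v in members]
    return members, ties


members, ties = ego_network("C", EDGES)
print(sorted(members))
print(ties)
```

Running `ego_network` once per person yields the many stand-alone networks of the egocentric approach, while `EDGES` as a whole is the single socio-centric network.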
APPLICATIONS OF SNA
1. Self-awareness: Visualizing their social network can help a person
organize their relationships and support network.
2. Building communities: SNA helps strengthen networks within communities
to build wellness and comfort.
3. Marketing: Organizations can use SNA to reach out with their message to a large
number of people, and to listen actively to opinion leaders as a way to
understand their customers' needs and behaviour.
4. Public health: Awareness of networks can help identify the paths that
certain diseases take to spread. Public health professionals can isolate and
contain diseases before they expand to other networks.
5. SNA is used extensively by disciplines such as sociology, public health, and business
management for describing various individual or organizational outcomes.
Case Study: Using social network analysis
as part of a knowledge audit.
SOCIAL NETWORK
In a relationship, the direction of flow is significant:
- In both directions or only in one direction
- In the latter case, which way the flow runs

Symmetric and asymmetric relations
- A prefers B, but B does not prefer A (asymmetric)

Unit: vertex or node
Existence of a connection between two nodes: presence (1) or absence (0) of a relationship
Arc: tie with a direction; Edge: tie without a direction
Weight of a tie: value or volume of flow
Representation of networks: diagrams
- Vertices are represented by points
- Arcs by lines with arrowheads
- Edges by lines without arrowheads
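
The presence (1) or absence (0) coding of ties can be sketched as an adjacency matrix. The three-node network and its arcs are hypothetical, chosen so that one tie is not reciprocated.

```python
# Hypothetical directed network: an arc (A, B) means "A prefers B".
NODES = ["A", "B", "C"]
ARCS = [("A", "B"), ("B", "C"), ("C", "B")]


def adjacency_matrix(nodes, arcs):
    """Return a 0/1 matrix where entry [i][j] is 1 if there is an arc i -> j."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in arcs:
        matrix[index[source]][index[target]] = 1
    return matrix


def is_symmetric(matrix):
    """True when every tie is reciprocated, so arcs could be drawn as edges."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


matrix = adjacency_matrix(NODES, ARCS)
for node, row in zip(NODES, matrix):
    print(node, row)
print("symmetric:", is_symmetric(matrix))
```

Replacing the 1 entries with numbers such as message counts would turn this into a weighted network, where each value records the volume of flow.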
SOCIAL NETWORK
Example network structures:
- Everybody goes to everybody else
- Reciprocated but highly fragmented
- Connected, concentration of power
- Connected, large number of intermediaries
- Connected, strong hierarchy, ties flow only in one direction

INTERPRETATIONS OF METRICS IN SNA
Avg. Distance: how accessible people are overall.

Max Distance: a large value indicates that there are people within the
network who will be inaccessible to each other.

Cohesion Measure: intuitively, the concept of social cohesion
translates into relatively densely connected sections within the
network: group members relate more extensively, more frequently, or more
positively among themselves than to members of other subgroups (ranges from 0
to 1).

Reciprocity: refers to responding to a positive action
with another positive action.

Degree: the number of links to a vertex.

Density: describes the general level of linkage among the points
in a graph.
It is a measure of the proportion of potential ties that actually exist,
and of the average percentage of the network that an individual
perceives as being accessible to them.

Density = Actual Connections / Potential Connections
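
Several of these metrics can be computed by hand for a small network. The sketch below assumes a hypothetical, connected, undirected four-person chain. For an undirected network with n nodes, the number of potential connections is taken as n(n-1)/2.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected network given as a list of edges (a simple chain).
NODES = ["A", "B", "C", "D"]
EDGES = [("A", "B"), ("B", "C"), ("C", "D")]


def neighbours(nodes, edges):
    """Map each node to the set of nodes it is directly tied to."""
    adj = {node: set() for node in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def degree(adj):
    """Degree: number of links attached to each vertex."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def density(nodes, edges):
    """Density = actual connections / potential connections."""
    potential = len(nodes) * (len(nodes) - 1) / 2
    return len(edges) / potential


def distance(adj, start, goal):
    """Shortest path length between two nodes (breadth-first search), or None."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return seen[node]
        for nxt in adj[node]:
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return None


adj = neighbours(NODES, EDGES)
# Assumes the network is connected, so every pair has a finite distance.
distances = [distance(adj, u, v) for u, v in combinations(NODES, 2)]
print("degree:", degree(adj))
print("density:", density(NODES, EDGES))
print("avg distance:", sum(distances) / len(distances))
print("max distance:", max(distances))
```

Here three of the six possible ties exist, so the density is 0.5, and the two people at the ends of the chain are three steps apart.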
GROUP DENSITY TABLES:
A group density table is used to highlight overall patterns of relationships within
networks containing identified sub-groups.

The table shows the level of perceived support
both for people within the groups
and between people in the different previous groups.

The values on the diagonal indicate the level of perceived
accessibility within the former groupings.

This measure provides an indication of the level of cohesion within
the groups and the strength of the links between the groups.
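
A group density table can be sketched as the density of ties within each sub-group (the diagonal) and between each pair of sub-groups. The groups and ties below are hypothetical. The sketch counts plain undirected ties rather than the perceived-support ratings a survey would collect, and it assumes every group has at least two members.

```python
from itertools import combinations, product

# Hypothetical undirected network with two previously identified sub-groups.
GROUPS = {"sales": ["A", "B", "C"], "it": ["D", "E"]}
EDGES = {frozenset(e) for e in [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]}


def group_density(members_1, members_2, edges):
    """Share of possible ties that exist within one group or between two groups."""
    if members_1 == members_2:
        pairs = list(combinations(members_1, 2))
    else:
        pairs = list(product(members_1, members_2))
    actual = sum(1 for u, v in pairs if frozenset((u, v)) in edges)
    return actual / len(pairs)


def density_table(groups, edges):
    """Nested dict: table[g1][g2] is the density of ties between g1 and g2."""
    return {
        g1: {g2: group_density(m1, m2, edges) for g2, m2 in groups.items()}
        for g1, m1 in groups.items()
    }


table = density_table(GROUPS, EDGES)
for name, row in table.items():
    print(name, row)
```

High diagonal values next to low off-diagonal values, as in this toy data, would suggest cohesive groups that are only weakly linked to each other.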
SOCIAL NETWORK TOOLS
For scholarly research, tools like UCINet, Pajek, ORA, the statnet suite of packages
in R, and GUESS are popular.
Examples of business-oriented social network tools include iPoint, NetMiner, InFlow,
Keyhubs, Sentinel Visualizer, KXEN Social Network, and NodeXL.
For large networks with millions of nodes, try Sonamine or ORA. For mobile
telecoms, Idiro SNA Plus is recommended.
An open source package with a GUI for Linux, Windows and Mac is Social Networks
Visualizer, or SocNetV.
Another generic open source package for Windows, Linux and OS X, with interfaces
to Python and R, is "igraph".
Another generic open source package with a GUI for Windows, Linux and OS X is
"Tulip".
LINKS FOR SELF STUDY
https://cambridge-intelligence.com/keylines-faqs-social-network-analysis/

http://www.ifets.info/upcoming/5125.pdf