Presentation is loading. Please wait.

Presentation is loading. Please wait.

Mining Social Networks. Contents  What are Social Networks  Why Analyse Them?  Analysis Techniques.

Similar presentations


Presentation on theme: "Mining Social Networks. Contents  What are Social Networks  Why Analyse Them?  Analysis Techniques."— Presentation transcript:

1 www.the-data-mine.co.uk Mining Social Networks

2 www.the-data-mine.co.uk Contents  What are Social Networks  Why Analyse Them?  Analysis Techniques  Example Applications

3 www.the-data-mine.co.uk Social Network Analysis  Also called Organizational Network Analysis  Pre-dates data mining.  Developed by sociologists and anthropologists  Formalise their understanding of family and community relationships.

4 www.the-data-mine.co.uk What is a Network  Referred to technically as a "graph".  Each person (or organisation etc.) is represented as a node.  Visually this is normally a dot or square.  Connections are called “ links ” or “ edges ”  Represented as a line.  Indicates communications (e.g. emails), purchases, visits, or less tangible things such as emotional relationships.  Can be “directed” or “undirected” e.g. On Twitter, you follow Stephen Fry, but he doesn’t follow you!

5 www.the-data-mine.co.uk

6 Email communication Graph Nodes = People Links = Emails

7 www.the-data-mine.co.uk Applications of Social Network Data Mining Typical applications of social network analysis and data mining:  Detection of criminal activity, Counter terrorism, "homeland security" and intelligence  Analysis of relationships within companies  Sociological and anthropological studies  Reciprocal trust schemes such as e-bay ratings  Recommended friends on Facebook  Filter or recommend social media content  Etc….

8 www.the-data-mine.co.uk Complex Network Example

9 www.the-data-mine.co.uk Complex Network Example

10 www.the-data-mine.co.uk How do we Analyse Networks?

11 www.the-data-mine.co.uk Graph Statistics - Individual Nodes  Degree Centrality  Number of connections to other nodes.  High values means many connections.  Can measure links in and out separately  Applications  Who is most listened to on Twitter?  Who has most contacts within a company?  Which user’s reviews influence others the most?

12 www.the-data-mine.co.uk Graph Statistics - Individual Nodes  Closeness Centrality  The average number of steps required to reach any other node. Communications are easier if you don't have to go through too many people.  Applications  Is this person central to the group?  Is your message likely to reach the audience?

13 www.the-data-mine.co.uk Graph Statistics - Individual Nodes  Betweenness Centrality  How much of a link between other nodes is this node?  Applications  Someone who has a high betweenness centrality is often a broker between others.  What happens if this person leaves the network?

14 www.the-data-mine.co.uk Graph Statistics - Networks as a Whole  Structural holes  Gaps in linkage between groups.  Applications  Bridges across this access information from both, suggesting influence and understanding of an organisation.  Can we create a bridge?  Is there an opportunity to control or influence communications between groups?

15 www.the-data-mine.co.uk Graph Statistics - Networks as a Whole  Degree of centralisation  is the network held together by just a few nodes?  Or is it more cohesive?  Measures include average and variance of degree centrality  Applications  Is a crime network vulnerable to disruption?  What happens to a company if a few key people leave?

16 www.the-data-mine.co.uk Graph Statistics – More…  There are many other measures, for examples see:  http://faculty.ucr.edu/~hanneman/networkshop/index.html  http://en.wikipedia.org/wiki/Social_network

17 www.the-data-mine.co.uk Data Mining Approaches to Networks  Structural Equivalence  Find nodes with similar roles in the network  Cluster Analysis  Identify groups of nodes which are closely connected - and characterise them  Identifying the Most Influential People  Predicting Node Types (e.g. Fraudster)  Profiling Sub-networks (e.g. terrorist cell)

18 www.the-data-mine.co.uk Twitter - Clustered Network To reduce clutter, we can cluster people who reference each other, and only show links within clusters.

19

20

21

22

23

24 www.the-data-mine.co.uk Data Mining Social Networks - Challenges  Standard problems  Incompleteness – We don’t know everything  Incorrectness – What we think we know is wrong  Inconsistency – We have contradictions in our data  Data transformation - Getting data into a form acceptable by your tools  Fuzzy Boundaries - Networks do not normally have distinct boundaries  Network Dynamics - Relationships change over time

25 www.the-data-mine.co.uk Software for Social Network Analysis / DM StatNet – R Packages - http://statnet.org/ StatNetTutorial - http://www.jstatsoft.org/v24/i09/paper JUNG – Open Source Java toolkit for SNA - http://jung.sourceforge.net/ NetMiner - Commercial, Comprehensive SNA - http://www.netminer.com/ Pajek - Comprehensive Social Network Analysis, free for academic use - http://pajek.imfm.si/doku.php Subdue - Graph based data mining tool. Copyright but freely downloadable - http://ailab.wsu.edu/subdue/ More - http://en.wikipedia.org/wiki/Social_network_analysis_software

26 www.the-data-mine.co.uk Looking Forward  Lots and lots of network data out there  What about:  Applications for individuals  Social Applications  Applications within a University  Applications which makes money

27 www.the-data-mine.co.uk

28 Impact of Computers on SNA The rise in the power and use of computers has had two main impacts. 1.New data is available from logs of email conversations, phone calls, chat and website usage, facebook friends, tweets etc... 2.Computers can be employed for analysis and data mining.

29 www.the-data-mine.co.uk Role of computer analysis  Data collected about social networks can be complex and large.  Imagine a network documenting each purchase you've made using a credit/debit card, every phone call and SMS, each email etc.  When these kinds of data are collected over large populations, the resulting graphs are much too large to be understood by eye.


Download ppt "Mining Social Networks. Contents  What are Social Networks  Why Analyse Them?  Analysis Techniques."

Similar presentations


Ads by Google