Location Mining from Online Social Networks

Location Mining from Online Social Networks
Satyen Abrol
Advisors: Dr. Latifur Khan, Dr. Bhavani Thuraisingham

Location Mining in Online Social Networks: What is the city-level home location of a user?

Outline
- Introduction and Problem Statement
- Different Approaches
- Social Graph Based: Our Approaches
  - Tweethood: Fuzzy k-Closest Friends with Variable Depth
  - Tweecalization: Label Propagation
  - Tweeque: Graph Partitioning for Spatio-Temporal Analysis
- Experiments and Results
- Future Work

Why is Location Important?
- Privacy and Security
- Trustworthiness
- Location Driven Mining for Business
- Location-Based Social Networking to generate US $21.14 billion by 2015 [1]
- But only ~14.3% provide it explicitly [2]
[1] According to a new report by Global Industry Analysts, Inc. (GIA) (http://www.strategyR.com/)
[2] According to an experiment performed by us on 1 million users

Twitter - Basics: Location, # of Followers, # of Following, # of Tweets, Maximum 140 Characters

Why is location so important?

Privacy and Security
- Losing locational privacy forever
- Users leave the field blank because they don’t want strangers to know their locations
- http://pleaserobme.com/

Trustworthiness
- To be able to trust/verify the correctness of the location mentioned in a user profile
- Corporate companies use social media for better advertising and marketing
- Iran elections of 2009: the US State Department used Twitter as a source
- Trustworthiness is important in such cases

Marketing and Business
- Large corporations (Walmart, Starbucks, United Airlines) use social media
- Great tool for inexpensive advertising
- Getting feedback from users

The Problem
Users:
- Leave the location field blank in their Twitter profiles
- Do not provide valid geographic information: “Justin Biebers heart”, “NON YA BISNESS!!”, “looking down on u people”
- Provide incorrect locations which may actually exist in the real world: “Nothing” in Arizona, “Little Heaven” in Connecticut
- Provide several locations, making it difficult to identify the home location: “CALi b0Y $TuCC iN V3Ga$” (California boy stuck in Las Vegas, NV)
- (~35%) enter just country, state, county, etc. and no city-level location [1]
[1] B. Hecht, L. Hong, B. Suh, E. H. Chi, “Tweets from justin biebers heart: the dynamics of the location field in user profiles”, In SIGCHI ’11.

Location Prediction in Social Networks
Two approaches:
- Content Based [1, 2]
- Using Social Graph [3, 4, 5]
[1] Z. Cheng, J. Caverlee, and K. Lee, “You are where you tweet: A content-based approach to geo-locating twitter users”. In CIKM ’10.
[2] B. Hecht, L. Hong, B. Suh, E. H. Chi, “Tweets from justin biebers heart: the dynamics of the location field in user profiles”, In SIGCHI ’11.
[3] S. Abrol, L. Khan and B. Thuraisingham, “Tweeque: Spatio-Temporal Analysis of Social Networks for Location Mining Using Graph Partitioning,” The First ASE/IEEE International Conference on Social Informatics, December 14-16, 2012, Washington D.C., USA.
[4] S. Abrol, L. Khan and B. Thuraisingham, “Tweecalization: Efficient and intelligent location mining in Twitter using semi-supervised learning,” 8th IEEE International Conference on Collaborative Computing, October 14–17, 2012, Pittsburgh, Pennsylvania.
[5] S. Abrol, L. Khan, “Agglomerative clustering on fuzzy k-closest friends with variable depth for location mining,” The Second IEEE International Conference on Social Computing (SocialCom2010), Aug 20-22, 2010, Minneapolis, Minnesota.

Content Based Approach
- Inaccurate: the location mentioned in the text is not necessarily the location of the user
- Involves ambiguity: “Paris” can mean Paris Hilton; Paris, the capital of France; or Paris, a town in Texas
- Slow: uses NLP/machine learning techniques and searches gazetteers

Using Social Graphs
- Based on a Japanese proverb: “When the character of a man is not clear to you, look at his friends.”
- Relationship between geospatial proximity and friendship
- Uses classical data mining algorithms for more accurate results
- Faster, and can be used for real-world applications

Geospatial Proximity and Friendship: Form $10^{12}$ Twitter user pairs and identify the geographic distance for each pair. The curve follows a power law of the form $a(x+b)^{-c}$, with an exponent of -0.87.
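To make the pairwise-distance step concrete, the following is a minimal sketch, not the authors' code. It computes the great-circle (haversine) distance between two users' coordinates and evaluates a curve of the stated power-law form. The function names, the kilometre unit, the coordinates, and the values of a and b are illustrative assumptions; only the form of the curve and the exponent magnitude 0.87 come from the slide.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(h))

def power_law(distance, a=1.0, b=1.0, c=0.87):
    """Evaluate a * (distance + b) ** (-c).

    a and b are placeholder values (assumptions); c = 0.87 is read from the
    slide's "exponent of -0.87".
    """
    return a * (distance + b) ** (-c)

# Hypothetical user pair: one user in Dallas, one in Seattle.
dallas = (32.7767, -96.7970)
seattle = (47.6062, -122.3321)
distance = haversine_km(*dallas, *seattle)
print(round(distance), power_law(distance))
```

The curve value decreases as the distance grows, which is the relationship between proximity and friendship that the slide describes.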

Graph Construction
- Vertices (data points) represent users
- An edge represents the ‘similarity’ between two users
- Deal with special cases: spammers follow random people; celebrities are followed by random people
- Edge weight gets abbreviated in these cases

Defining Edge Weight Consists of two components: Trustworthiness (TW) Mutual Friends (MF)

Trustworthiness: Fraction of friends which have the same label as the user himself. Intuition: A person who has stayed at the same place all his life will have most friends from the same location and hence high trustworthiness.
[Figure: a user with Location: Seattle/WA/USA and friends A through J, several of them also labeled Location: Seattle/WA/USA; Trustworthiness: 0.6]
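The definition above can be sketched as a short function. This is a minimal illustration rather than the authors' implementation: the function name and the sample friend list are hypothetical, and counting unlabeled friends in the denominator is an assumption, since the slide does not say how they are treated.

```python
def trustworthiness(user_location, friend_locations):
    """Fraction of friends whose location label equals the user's own label.

    Assumption: friends without a label (None) still count in the denominator.
    """
    if not friend_locations:
        return 0.0
    same = sum(1 for loc in friend_locations if loc == user_location)
    return same / len(friend_locations)

# Hypothetical data: ten friends, six of whom share the user's label.
friends = ["Seattle/WA/USA"] * 6 + ["Dallas/TX/USA", "Boston/MA/USA", None, None]
print(trustworthiness("Seattle/WA/USA", friends))  # 0.6
```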

Mutual Friends: Choose the number of common friends as the similarity measure. Better accuracy, low time complexity.

Defining Edge Weight: Defined as $\mathrm{Weight}_{ij} = \alpha \times \max\{TW(U_i), TW(U_j)\} + (1 - \alpha) \times MF_{ij}$, where $0 < \alpha < 1$ and $\alpha$ is typically chosen to be around 0.7.
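The edge-weight formula can be worked through in a few lines. The sketch below is illustrative only: the slides do not say how the mutual-friends term is scaled, so normalising it to [0, 1] with a Jaccard overlap is an assumption, and the function names and sample values are hypothetical.

```python
def mutual_friend_score(friends_i, friends_j):
    """Jaccard overlap of two friend sets (an assumed normalisation of MF)."""
    union = friends_i | friends_j
    return len(friends_i & friends_j) / len(union) if union else 0.0

def edge_weight(tw_i, tw_j, mf_ij, alpha=0.7):
    """alpha * max(TW(U_i), TW(U_j)) + (1 - alpha) * MF_ij, with 0 < alpha < 1."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return alpha * max(tw_i, tw_j) + (1.0 - alpha) * mf_ij

# Hypothetical users: trustworthiness 0.6 and 0.3, two shared friends out of four.
mf = mutual_friend_score({"a", "b", "c"}, {"b", "c", "d"})
print(mf, round(edge_weight(0.6, 0.3, mf), 2))
```

With alpha at 0.7 the more trustworthy endpoint dominates the weight, and the mutual-friends term acts as a smaller correction.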

Tweethood: Fuzzy k-Closest Friends with Variable Depth
- Choose the k “closest” friends of the user
- If the location is not found, look further for the answer
- Each node is defined by a vector having locations with their respective probabilities
- Boost and aggregate at each step
Satyen Abrol, Latifur Khan, “TweetHood: Agglomerative Clustering on Fuzzy k-Closest Friends with Variable Depth for Location Mining”. In Proc. of the Second IEEE International Conference on Social Computing (SocialCom-2010), Minneapolis, USA, August 20-22, 2010

Agglomerative Clustering Don’t want to find just any location Want a location or group of locations with some confidence Tradeoff between number of locations, distance between concepts, and total confidence Construct matrix at each step with Objective Function of the above attributes. Choose concepts with maximum values Continue till we cross threshold

Find the location of John Doe

Social Network of John Doe
[Figure: John Doe connected to Friend 1, Friend 2, Friend 3, ..., Friend n, labeled CB1, CB2, CB3, ..., CBn]

Choose k closest friends of John Doe
[Figure: Friend 1, Friend 2, Friend 3, ..., Friend k, labeled CB1, CB2, CB3, ..., CBk]

Identify Locations
[Figure: the k closest friends with their profile locations; most show Location: NULL and one shows Location: Seattle, USA. Result: LOW ACCURACY]

What if we have depth=2?
[Figure: the k closest friends and their own friends (including G, H, I, J), with locations Seattle/WA/USA, Dallas/TX/USA, Sydney/AU, Richardson/TX/USA, or NULL]
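The idea of looking one level deeper when direct friends have no location can be sketched as a bounded breadth-first search. This is only an illustration of variable depth under assumed data structures (an adjacency dict and a label dict, both hypothetical); it does not implement the k-closest selection or the boosting used by Tweethood.

```python
def labeled_users_within_depth(graph, labels, user, max_depth):
    """Return {friend: (depth, location)} for labeled users within max_depth hops."""
    seen = {user}
    frontier = [user]
    found = {}
    for depth in range(1, max_depth + 1):
        next_frontier = []
        for current in frontier:
            for friend in graph.get(current, []):
                if friend in seen:
                    continue
                seen.add(friend)
                next_frontier.append(friend)
                if labels.get(friend) is not None:
                    found[friend] = (depth, labels[friend])
        frontier = next_frontier
    return found

# Hypothetical network: only one direct friend of "john" has a location.
graph = {"john": ["f1", "f2", "f3"], "f1": ["g", "h"], "f3": ["i"]}
labels = {"f2": "Seattle/WA/USA", "g": "Dallas/TX/USA", "h": "Sydney/AU", "i": "Dallas/TX/USA"}

print(labeled_users_within_depth(graph, labels, "john", 1))
print(labeled_users_within_depth(graph, labels, "john", 2))
```

At depth 1 a single labeled friend is available, while depth 2 adds three more, which is why the deeper search can give a more reliable estimate.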

Location Vector for John Doe’s friends (one vector each for Friend 1, Friend 2, Friend 3, and Friend k)
- Dallas/TX/USA 0.4, Seattle/WA/USA 0.2, Richardson/TX/USA 0.2, Sydney/AU 0.2
- Dallas/TX/USA 0.33, New Delhi/Delhi/India 0.33, Sunnyvale/CA/USA 0.33
- Austin/TX/USA 0.50, Minneapolis/MN/USA 0.50
- Plano/TX/USA 0.25, Boulder/CO/USA 0.25, Salt Lake City/UT/USA 0.25, London/London/GB 0.25

Location Vector for John Doe: Dallas/TX/USA 0.1825, Seattle/WA/USA 0.05, Richardson/TX/USA 0.05, Sydney/AU 0.05, New Delhi/Delhi/IN 0.0825, Sunnyvale/CA/USA 0.0825, Austin/TX/USA 0.125, Minneapolis/MN/USA 0.125, Plano/TX/USA 0.0625, Boulder/CO/USA 0.0625, Salt Lake City/UT/US 0.0625, London/GB 0.0625
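The values in John Doe's vector match an equal-weight average of the four friend vectors on the previous slide, for example (0.4 + 0.33) / 4 = 0.1825 for Dallas. The sketch below reproduces that arithmetic. It is a minimal illustration: the function name is hypothetical, and the boost step mentioned earlier in the slides is not modeled because the slides do not specify it.

```python
def aggregate_location_vectors(friend_vectors):
    """Average a list of {location: probability} vectors with equal weights."""
    combined = {}
    for vector in friend_vectors:
        for location, probability in vector.items():
            combined[location] = combined.get(location, 0.0) + probability / len(friend_vectors)
    return combined

# Friend vectors as listed on the slide for John Doe's friends.
friend_vectors = [
    {"Dallas/TX/USA": 0.4, "Seattle/WA/USA": 0.2, "Richardson/TX/USA": 0.2, "Sydney/AU": 0.2},
    {"Dallas/TX/USA": 0.33, "New Delhi/Delhi/India": 0.33, "Sunnyvale/CA/USA": 0.33},
    {"Austin/TX/USA": 0.50, "Minneapolis/MN/USA": 0.50},
    {"Plano/TX/USA": 0.25, "Boulder/CO/USA": 0.25, "Salt Lake City/UT/USA": 0.25, "London/London/GB": 0.25},
]

john_doe = aggregate_location_vectors(friend_vectors)
for location, probability in sorted(john_doe.items(), key=lambda item: -item[1]):
    print(location, round(probability, 4))
```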

Agglomerative Clustering, John Doe before merging: Dallas/TX/USA 0.1825, Seattle/WA/USA 0.05, Richardson/TX/USA 0.05, Sydney/AU 0.05, New Delhi/Delhi/IN 0.0825, Sunnyvale/CA/USA 0.0825, Austin/TX/USA 0.125, Minneapolis/MN/USA 0.125, Plano/TX/USA 0.0625, Boulder/CO/USA 0.0625, Salt Lake City/UT/US 0.0625, London/GB 0.0625

Agglomerative Clustering, John Doe after merging: {Dallas, Plano, Richardson}/TX/USA 0.295, Seattle/WA/USA 0.05, Sydney/AU 0.05, New Delhi/Delhi/IN 0.0825, Sunnyvale/CA/USA 0.0825, Austin/TX/USA 0.125, Minneapolis/MN/USA 0.125, Boulder/CO/USA 0.0625, Salt Lake City/UT/US 0.0625, London/GB 0.0625
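The merge shown here sums the probabilities of the grouped cities: 0.1825 + 0.0625 + 0.05 = 0.295. The sketch below performs that single merge step. It is an illustration under assumptions: the group of locations is passed in by hand, whereas the slides choose it with an objective function over number of locations, distance, and confidence that is not given here, and the function name is hypothetical.

```python
def merge_locations(vector, group, merged_name):
    """Replace the locations in `group` with one cluster whose probability is their sum."""
    merged = {loc: p for loc, p in vector.items() if loc not in group}
    merged[merged_name] = sum(vector[loc] for loc in group if loc in vector)
    return merged

# John Doe's location vector before merging, as listed on the slide.
john_doe = {
    "Dallas/TX/USA": 0.1825, "Seattle/WA/USA": 0.05, "Richardson/TX/USA": 0.05,
    "Sydney/AU": 0.05, "New Delhi/Delhi/IN": 0.0825, "Sunnyvale/CA/USA": 0.0825,
    "Austin/TX/USA": 0.125, "Minneapolis/MN/USA": 0.125, "Plano/TX/USA": 0.0625,
    "Boulder/CO/USA": 0.0625, "Salt Lake City/UT/US": 0.0625, "London/GB": 0.0625,
}

clustered = merge_locations(
    john_doe,
    {"Dallas/TX/USA", "Plano/TX/USA", "Richardson/TX/USA"},
    "{Dallas, Plano, Richardson}/TX/USA",
)
print(round(clustered["{Dallas, Plano, Richardson}/TX/USA"], 3))  # 0.295
```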

Tweethood: Algorithm

Tweecalization: Label Propagation
- But the availability of users with a location is limited
- Most users do not have a location
- Need a method that can learn from unlabeled data
Satyen Abrol, Latifur Khan and Bhavani Thuraisingham, “Tweecalization: Efficient and Intelligent location mining in Twitter using semi-supervised learning,” 8th IEEE International Conference on Collaborative Computing, October 14–17, 2012, Pittsburgh, Pennsylvania

Tweecalization: Label Propagation
Ideal scenario for semi-supervised learning:
- Only a few friends with locations (labeled data) [1]
- Use both labeled and unlabeled data for training
- Points which are close to each other are more likely to share a label
[1] Y. Bengio, O. Delalleau, and N. Le Roux, “Label propagation and quadratic criterion,” In O. Chapelle, B. Schölkopf and A. Zien (Eds.), Semi-supervised learning. MIT Press, 2006.

Label Propagation: An Illustration
[Figure: a central user with unknown location (“?”) connected to friends with location (“CLAMPED LOCATIONS”) and friends without location]
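The illustration can be turned into a small runnable sketch of iterative label propagation with clamped labels. This is a generic textbook-style version, not the Tweecalization algorithm itself: the graph, the edge weights, the iteration count, and the function name are all illustrative assumptions.

```python
def propagate_labels(weights, seed_labels, n_iter=50):
    """Spread location labels over a weighted graph while keeping seed labels fixed.

    weights: {node: {neighbor: weight}}, symmetric, every node present as a key.
    seed_labels: {node: location} for the labeled (clamped) nodes.
    """
    classes = sorted(set(seed_labels.values()))
    scores = {node: {c: 0.0 for c in classes} for node in weights}
    for node, location in seed_labels.items():
        scores[node][location] = 1.0
    for _ in range(n_iter):
        updated = {}
        for node, neighbors in weights.items():
            if node in seed_labels:
                updated[node] = scores[node]  # clamped: labeled nodes never change
                continue
            total = sum(neighbors.values())
            updated[node] = {
                c: (sum(w * scores[m][c] for m, w in neighbors.items()) / total if total else 0.0)
                for c in classes
            }
        scores = updated
    return {node: max(s, key=s.get) for node, s in scores.items()}

# Hypothetical ego network: three friends with locations, one friend without.
weights = {
    "user": {"f1": 0.9, "f2": 0.8, "f3": 0.2, "f4": 0.5},
    "f1": {"user": 0.9, "f4": 0.6},
    "f2": {"user": 0.8},
    "f3": {"user": 0.2},
    "f4": {"user": 0.5, "f1": 0.6},
}
seeds = {"f1": "Dallas/TX/USA", "f2": "Dallas/TX/USA", "f3": "Seattle/WA/USA"}
print(propagate_labels(weights, seeds))
```

Both the central user and the unlabeled friend end up with the label carried by their strongest connections, while the clamped friends keep their original locations.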

Tweecalization: Algorithm

What About Temporal Analysis?
- None of the existing works do temporal analysis
- What about migration/geographical mobility?

Migration/Geographical Mobility
- 4% to 6% of the population migrates every year, which amounts to 12 to 17 million people each year
United States Census Bureau - Geographical Mobility/Migration Data - http://www.census.gov/hhes/migration/

Migration/Geographical Mobility
- Migration as a function of age
- People aged 20-29 have a higher probability of moving
- High migration rate: college and jobs
- Low migration rate: old age, people settle down
United States Census Bureau - Geographical Mobility/Migration Data - http://www.census.gov/hhes/migration/

Facebook Users and Mobility
- Let us look at the cumulative effect
- Only 28% to 37% are currently living in their hometown
Based on our experiments on 300k public Facebook profiles

Twitter Users and Mobility
- Linking Twitter users to migration
- 33% of all Twitter users are aged 25-34 years [1]
[1] Based on findings by ABI Research. Online. Available: http://www.abiresearch.com

Tweeque: Graph Partitioning
- How do we know if “this” is the current location for a user?
- How do we perform temporal analysis of friendships?
- Propose a technique that indirectly infers the current location
Satyen Abrol, Latifur Khan and Bhavani Thuraisingham, “Tweeque: Spatio-Temporal Analysis of Social Networks for Location Mining Using Graph Partitioning,” The First ASE/IEEE International Conference on Social Informatics, December 14-16, 2012, Washington D.C., USA.

Observation 1: Social Cliques and Location
- Our definition: A social clique is an inclusive group of people that share friendship
- Apart from friendship, what is the attribute that links members of a clique? Individual locations
- All members of a clique were or are at a particular geographical location at a particular instant of time, like college, school, a company, etc.

Observation 2: Migration and Time
- As shown previously, over the course of time people have a tendency to migrate
- Based on these two observations we hypothesize: if we can divide the social graph of a particular user into cliques and check for location-based purity of the cliques, we can accurately separate out his current location from previous locations
- Migration is our latent time factor

Tweeque: An example
[Figure: the user's friends (Friend 1, Friend 2, ..., Friend n) grouped as friends from high school in Dallas, friends from college in Boston, relatives/cousins, and friends from job in Seattle]

Tweeque: An example
[Figure: All Friends of the User]

Tweeque: An example
[Figure: Social Clique #1 (High School), Social Clique #2 (College), Social Clique #3 (Current Work), Social Clique #4 (Relatives)]

Tweeque: An Example
High School | College | Relatives | Work
Dallas/TX/USA | Boston/MA/USA | Singapore | Seattle/WA/USA
Seattle/WA/USA | Portland/OR/USA | Sydney/Australia | Seattle/WA/USA
Dallas/TX/USA | Austin/TX/USA | Dallas/TX/USA | Dallas/TX/USA
San Diego/CA/USA | Boston/MA/USA | Dallas/TX/USA | Seattle/WA/USA
New York/NY/USA | Dallas/TX/USA | Ontario/Canada | Redmond/WA/USA
Purity (Dallas) = 0.32 | Purity (Boston) = 0.45 | Purity (Dallas) = 0.18 | Purity (Seattle) = 0.69
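A location-based purity check can be sketched as follows. This is a hedged illustration: it assumes purity means the fraction of clique members sharing the clique's most common location, and it assumes the sample locations on the slide read row by row across the four cliques. The computed fractions will not equal the slide's purity values, which were presumably obtained on the full cliques rather than on five sample members each. The function names and the final "pick the purest clique" step are hypothetical.

```python
from collections import Counter

def clique_purity(locations):
    """Return (most common location, fraction of labeled members at that location)."""
    known = [loc for loc in locations if loc]
    if not known:
        return None, 0.0
    location, count = Counter(known).most_common(1)[0]
    return location, count / len(known)

# Sample members per clique, read from the slide (assumed column assignment).
cliques = {
    "High School": ["Dallas/TX/USA", "Seattle/WA/USA", "Dallas/TX/USA", "San Diego/CA/USA", "New York/NY/USA"],
    "College": ["Boston/MA/USA", "Portland/OR/USA", "Austin/TX/USA", "Boston/MA/USA", "Dallas/TX/USA"],
    "Relatives": ["Singapore", "Sydney/Australia", "Dallas/TX/USA", "Dallas/TX/USA", "Ontario/Canada"],
    "Work": ["Seattle/WA/USA", "Seattle/WA/USA", "Dallas/TX/USA", "Seattle/WA/USA", "Redmond/WA/USA"],
}

purities = {name: clique_purity(members) for name, members in cliques.items()}
purest = max(purities, key=lambda name: purities[name][1])
print(purities)
print(purest, purities[purest])
```

On this sample, as on the slide, the work clique is the purest and points to Seattle as the candidate current location.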

Tweeque: Graph Partitioning   J. Shi and J. Malik, “Normalized Cuts and Image Segmentation,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888-905, Aug. 2000.
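The formulas on these slides did not survive in the transcript. As a point of reference, the cited Shi and Malik paper scores a split of a graph into parts A and B by Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V). The sketch below evaluates that score for a given split. It is a minimal illustration: the toy graph and function names are hypothetical, and whether Tweeque uses exactly this criterion is an assumption.

```python
def add_edge(weights, u, v, w):
    """Store an undirected weighted edge in a dict-of-dicts adjacency map."""
    weights.setdefault(u, {})[v] = w
    weights.setdefault(v, {})[u] = w

def normalized_cut(weights, part_a):
    """Normalized cut value of splitting the graph into part_a and the remaining nodes."""
    part_a = set(part_a)
    part_b = set(weights) - part_a
    cut = sum(w for u in part_a for v, w in weights[u].items() if v in part_b)
    assoc_a = sum(w for u in part_a for w in weights[u].values())
    assoc_b = sum(w for u in part_b for w in weights[u].values())
    if assoc_a == 0 or assoc_b == 0:
        return float("inf")  # degenerate split: one side has no edges at all
    return cut / assoc_a + cut / assoc_b

# Hypothetical friend graph: two tight groups joined by one weak tie.
graph = {}
for u, v in [("a", "b"), ("a", "c"), ("b", "c"), ("d", "e"), ("d", "f"), ("e", "f")]:
    add_edge(graph, u, v, 1.0)
add_edge(graph, "c", "d", 0.1)

along_weak_tie = normalized_cut(graph, {"a", "b", "c"})
through_group = normalized_cut(graph, {"a", "b"})
print(along_weak_tie < through_group)  # True
```

A lower score means a better partition, so splitting along the weak tie is preferred over cutting through a tightly connected group of friends.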

Tweeque: Algorithm

Tweeque: Purity Voting

Experiment Data Randomly choose 1000 Twitter users

Experiments and Results
- 75.5% for city level prediction
- 80.1% for country level prediction
- We observe that the accuracy saturates after depth 4
- Six degrees of separation is the idea that everyone is on average approximately six steps away, by way of introduction, from any other person in the world
- For Twitter this distance is found to be 4.67

Comparison of Different Approaches
Approach: Tweethood [1] | Tweecalization [2] | Tweeque [3] | Content Based [4]
Accuracy (City): 72.1% | 75.5% | 76.3% | 35.6% - 51%
Accuracy (Country): 80.1%, 84.9%, 52.3%
Complexity: O(n), O(n^3), N/A
Temporal Analysis: No, Yes
[1] Satyen Abrol, Latifur Khan, “TweetHood: Agglomerative Clustering on Fuzzy k-Closest Friends with Variable Depth for Location Mining”. In Proc. of the Second IEEE International Conference on Social Computing (SocialCom-2010), Minneapolis, USA, August 20-22, 2010 (Nominated for best paper award, Acceptance Rate: 13%)
[2] Satyen Abrol, Latifur Khan and Bhavani Thuraisingham, “Tweecalization: Efficient and Intelligent location mining in Twitter using semi-supervised learning,” 8th IEEE International Conference on Collaborative Computing, October 14–17, 2012, Pittsburgh, Pennsylvania
[3] Satyen Abrol, Latifur Khan and Bhavani Thuraisingham, “Tweeque: Spatio-Temporal Analysis of Social Networks for Location Mining Using Graph Partitioning,” The First ASE/IEEE International Conference on Social Informatics, December 14-16, 2012, Washington D.C., USA.
[4] Z. Cheng, J. Caverlee, and K. Lee, “You are where you tweet: A content-based approach to geo-locating twitter users”. In CIKM ’10.

Contributions
- Developed three graph based location mining algorithms for online social networks
- Maps the location mining problem to k-nearest neighbor, semi-supervised learning, and graph partitioning problems
- Outperform the content based approach in time and accuracy
- Relationship between geospatial proximity and friendship
- Effect of geographical mobility on current location of users

Future Work
- Combining content and graph based methods
- Score based geo-tagging technique [1]
- Associating keywords with locations to build a probabilistic model: “cowboys” → Dallas, “casino” → Las Vegas
- Since tweets have timestamps, it leads to more accurate prediction of current location
[1] Satyen Abrol, Latifur Khan, Tahseen Al-khateeb, “MapIt: Smarter Searches using Location Driven Knowledge Discovery and Mining”, In Proc. of 1st SIGSPATIAL ACM GIS 2009 International Workshop on Querying and Mining Uncertain Spatio-Temporal Data (QUeST), Nov 2009, Seattle.

Future Work
- Improve scalability of current algorithms using a cloud computing framework: each of the friends of a user is handled by a separate node in the distributed environment
- Micro-level location identification: identify specific points of interest (POIs) such as restaurants, place of work, etc. from tweets
- Identify comfort zone for a user
- Use Foursquare check-in dataset: over 30 million POIs all over the world

Publications
- Satyen Abrol, Latifur Khan and Bhavani Thuraisingham, “Tweeque: Spatio-Temporal Analysis of Social Networks for Location Mining Using Graph Partitioning,” The First ASE/IEEE International Conference on Social Informatics, December 14-16, 2012, Washington D.C., USA.
- Satyen Abrol, Latifur Khan and Bhavani Thuraisingham, “Tweecalization: Efficient and Intelligent location mining in Twitter using semi-supervised learning,” 8th IEEE International Conference on Collaborative Computing, October 14–17, 2012, Pittsburgh, Pennsylvania
- Satyen Abrol, Latifur Khan, “TweetHood: Agglomerative Clustering on Fuzzy k-Closest Friends with Variable Depth for Location Mining”. In Proc. of the Second IEEE International Conference on Social Computing (SocialCom-2010), Minneapolis, USA, August 20-22, 2010 (Nominated for best paper award, Acceptance Rate: 13%)
- Satyen Abrol and Latifur Khan, “TWinner: Understanding News Queries With Geo-Content Using Twitter”. In Proc. of 6th Workshop on Geographic Information Retrieval (GIR'10) at Zurich, Switzerland.
- Satyen Abrol, Latifur Khan, Tahseen Al-khateeb, “MapIt: Smarter Searches using Location Driven Knowledge Discovery and Mining”, In Proc. of 1st SIGSPATIAL ACM GIS 2009 International Workshop on Querying and Mining Uncertain Spatio-Temporal Data (QUeST), Nov 2009, Seattle.
- Satyen Abrol, Latifur Khan, Vaibhav Khadilkar, Bhavani M. Thuraisingham, Tyrone Cadenhead, “Design and implementation of SNODSOC: Novel class detection for social network analysis”, ISI 2012: 215-220

Thank You! Questions?