Multiple Location Profiling for Users and Relationships from Social Network and Content Rui Li, Shengjie Wang, Kevin Chen-Chuan Chang University of Illinois.

Presentation transcript:

Multiple Location Profiling for Users and Relationships from Social Network and Content Rui Li, Shengjie Wang, Kevin Chen-Chuan Chang University of Illinois at Urbana-Champaign

Users’ locations are important for many information services. Example user: Carol, lives in Los Angeles. Services that use her location: social network, content provider. Applications: local content recommendation, local friends recommendation.

The community has explored social networks and content to profile users’ locations. Profiling a user’s home location (Carol’s location: Los Angeles). Tweets: “Terrible LA traffic!”, “Want to go to Honolulu for Spring vacation!”, “See Gaga in Hollywood.”, “Good Morning!” Social network: Mike (LA), Carol (?), Lucy (Austin), Gaga (NY), Bob (San Diego), Jean (?).

Problem 1: They only profile a single home location. Existing methods rely on the locations of a user’s friends and on tweeted locational words. Tweeted locational words (word: frequency): Paramount: 1, Los Angeles: 1, Hollywood: 2, Austin: 2. Carol lives in Los Angeles and studied at the Univ. of Texas at Austin. The resulting single-location profiles are incomplete and inaccurate.

Problem 2: They miss relationship profiling entirely. Relationship profiling: Carol follows Bob (both Carol and Bob work in Los Angeles); Carol follows Lucy (both Carol and Lucy studied at Austin); Carol tweets Hollywood (Carol lives in Los Angeles). Such profiles are useful!

We focus on multiple location profiling for users and relationships. Carol in the real world: Location: Los Angeles; Education: Univ. of Texas at Austin. Tweets: “Terrible LA traffic!”, “Want to go to Honolulu for Spring vacation!”, “See Gaga in Hollywood.”, “Good Morning!” Social network: Mike (LA), Carol (?), Lucy (Austin), Gaga (NY), Bob (San Diego), Jean (?). Output: Carol’s location profile: Los Angeles, Austin. Carol follows Lucy: Austin, Austin.

Our approach is to build a model that connects known relationships with unknown locations. Known relationships: following relationships (Carol follows Lucy, Carol follows Mike, …) and tweeting relationships (Carol tweets Hollywood, Carol tweets Honolulu, …). Unknown locations: users’ locations (?). The MLP model consists of a generation model and an inference algorithm.

There are three challenges in building MLP. Challenge 1: How to connect users’ locations with relationships? (A) From users’ locations to following relationships. (B) From users’ locations to tweeting relationships. Challenge 2: How to model that the relationships are mixed? (A) Some relationships are not based on locations. (B) Each relationship is based on a different location. Challenge 3: How to utilize home locations from labeled users?

Challenge 1.A: We need to connect following relationships with two users’ locations. Even a user with only one location follows others from different locations. Example following probabilities: Carol in Los Angeles follows Bob in San Diego: 20%; Carol in Los Angeles follows Mike in Los Angeles: 30%; … We define the following probability as the probability of generating a following relationship from one user to another based on their locations.

Observation: We explore the following probability by investigating a corpus. It captures our intuition well, and it fits a power-law distribution.

Solution: We derive a location-based following model for the following probability. The location-based following model.
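The transcript does not preserve the formula of the location-based following model. As a minimal sketch only, assuming the power-law form suggested by the observation above, the following probability could be written as beta * distance ** alpha and its parameters fitted by least squares in log-log space. The function names and the parameter values used in the example are hypothetical and are not taken from the slides.

```python
import math


def following_probability(distance, alpha, beta, min_distance=1.0):
    """Assumed power-law form: P(follow | distance) = beta * distance ** alpha.

    alpha is expected to be negative (probability decays with distance).
    Distances below min_distance are clamped so that distance 0 is defined.
    """
    d = max(distance, min_distance)
    return min(1.0, beta * d ** alpha)


def fit_power_law(distances, probabilities):
    """Fit log(p) = log(beta) + alpha * log(d) by ordinary least squares."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(p) for p in probabilities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    beta = math.exp(mean_y - alpha * mean_x)
    return alpha, beta


# Hypothetical parameters, chosen only to illustrate the shape of the curve.
example_distances = [1.0, 10.0, 100.0, 1000.0]
example_probs = [following_probability(d, alpha=-0.5, beta=0.005)
                 for d in example_distances]
fitted_alpha, fitted_beta = fit_power_law(example_distances, example_probs)
```

On real data, the inputs to `fit_power_law` would be empirical following rates bucketed by distance between user pairs.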

Challenge 1.B: We need to connect tweeting relationships with a user’s location. A user at one location tweets about different locations. We define the tweeting probability as the probability of generating a tweeting relationship from a user to a venue based on the user’s location. Example tweeting probabilities: Carol in Los Angeles tweets about watching a show in Hollywood: 30%; Carol in Los Angeles tweets about traffic in Los Angeles: 40%; …

Observation: We explore tweeting probabilities by investigating a corpus. They capture our intuition well, and they can be modeled as a set of multinomial distributions.

Solution: We derive a location-based tweeting model for the tweeting probability. The location-based tweeting model.
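The slide's formula for the tweeting model is not preserved either. A rough sketch of "a set of multinomial distributions", one per user location, is shown here, estimated from counts with additive smoothing. The smoothing scheme, the function name, and the toy observations are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def estimate_tweeting_model(observations, vocabulary, smoothing=1.0):
    """Estimate one multinomial over venues for each user location.

    observations: iterable of (user_location, tweeted_venue) pairs.
    vocabulary:   list of all venue names.
    smoothing:    additive pseudo-count (an assumption, not from the slides).
    """
    counts = defaultdict(Counter)
    for location, venue in observations:
        counts[location][venue] += 1

    model = {}
    for location, venue_counts in counts.items():
        total = sum(venue_counts.values()) + smoothing * len(vocabulary)
        model[location] = {
            venue: (venue_counts[venue] + smoothing) / total
            for venue in vocabulary
        }
    return model


# Toy data, invented for illustration only.
toy_vocabulary = ["Los Angeles", "Hollywood", "Honolulu", "Austin"]
toy_observations = (
    [("Los Angeles", "Los Angeles")] * 4
    + [("Los Angeles", "Hollywood")] * 3
    + [("Los Angeles", "Honolulu")] * 1
    + [("Austin", "Austin")] * 5
)
toy_model = estimate_tweeting_model(toy_observations, toy_vocabulary)
```

Each entry `toy_model[location][venue]` plays the role of the tweeting probability of that venue for a user at that location.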

Challenge 2.A: There are both noisy and location-based relationships. Noisy relationships: Carol follows Lady Gaga; Carol tweets Honolulu. Location-based relationships: Carol follows Lucy; Carol tweets Los Angeles. Noisy relationships are not useful!

Solution: We propose a mixture component for the two types of relationships. 1. A relationship is generated by either a location-based model or a random model. 2. A binary model selector μ indicates which model is used. 3. The selector is generated from a binomial distribution.
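The three steps above can be sketched as a two-component mixture. In this sketch the selector μ is drawn as a single binary trial with success probability `rho`, and the probability of a relationship is the weighted sum of the two models. The name `rho` and the numbers in the example are hypothetical.

```python
import random


def sample_selector(rho, rng):
    """Draw the binary model selector mu.

    Returns 1 (location-based model) with probability rho, else 0 (random model).
    """
    return 1 if rng.random() < rho else 0


def mixture_probability(p_location_based, p_random, rho):
    """Probability of a relationship after marginalizing over the selector mu."""
    return rho * p_location_based + (1.0 - rho) * p_random


# Hypothetical numbers: 80% of relationships are assumed location-based.
rng = random.Random(0)
selectors = [sample_selector(0.8, rng) for _ in range(1000)]
p_follow = mixture_probability(p_location_based=0.3, p_random=0.01, rho=0.8)
```

With `rho` close to 1 almost every relationship is explained by locations; with `rho` close to 0 the model treats most relationships as noise.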

Challenge 2.B: Location-based relationships are related to multiple locations. Location-based relationships: Carol follows Lucy (both Carol and Lucy studied at Austin); Carol tweets Hollywood (Carol lives in Los Angeles). Modeling multiple locations makes profiles accurate and complete!

Solution: We fundamentally model users’ multiple locations in generating relationships. A location profile is a multinomial distribution over locations, e.g. Carol {Los Angeles 0.1, Austin 0.1, …}. Each relationship is based on one particular location from the user’s profile.
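To illustrate how a relationship can be tied to one location from a profile, the sketch below computes, for a tweeted venue, the posterior over which of the user's locations generated it: the profile weight times the tweeting probability, normalized. This is an illustrative calculation under assumed numbers, not the inference algorithm from the talk. All values and names are hypothetical.

```python
def location_posterior(profile, tweeting_model, venue):
    """Posterior over the profile location that explains a tweeted venue.

    profile:        dict location -> weight (multinomial over locations).
    tweeting_model: dict location -> dict venue -> tweeting probability.
    """
    weights = {
        location: weight * tweeting_model.get(location, {}).get(venue, 0.0)
        for location, weight in profile.items()
    }
    total = sum(weights.values())
    if total == 0.0:
        # No location explains the venue; fall back to the profile itself.
        return dict(profile)
    return {location: w / total for location, w in weights.items()}


# Hypothetical profile and tweeting probabilities for Carol.
carol_profile = {"Los Angeles": 0.6, "Austin": 0.4}
example_tweeting_model = {
    "Los Angeles": {"Hollywood": 0.30, "Austin": 0.01},
    "Austin": {"Hollywood": 0.01, "Austin": 0.40},
}
hollywood_posterior = location_posterior(
    carol_profile, example_tweeting_model, "Hollywood"
)
```

Under these numbers, "Carol tweets Hollywood" is attributed almost entirely to Los Angeles, which matches the relationship profile described on the earlier slides.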

Challenge 3: We should utilize observed locations from some users’ profiles. 20% of users provide their home locations in their profiles. Social network: Mike (LA), Carol (?), Lucy (Austin), Gaga (NY), Bob (San Diego), Jean (?). Observed locations are useful for profiling locations, but we cannot use them directly to generate relationships!

Solution: We use observed locations as priors to generate users’ profiles, e.g. Bob {San Diego 0.9, Los Angeles 0.05, …}. We assume users’ profiles are generated from prior distributions, so that users’ observed home locations are likely to be generated.
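The slide does not name the form of the prior. One common choice, assumed here purely for illustration, is a Dirichlet prior whose pseudo-count is boosted for the observed home location. The `base` and `boost` values and all function names are hypothetical.

```python
import random


def prior_pseudocounts(candidates, home=None, base=0.5, boost=9.0):
    """Pseudo-counts over candidate locations, boosted at the observed home."""
    return {
        location: base + (boost if location == home else 0.0)
        for location in candidates
    }


def expected_profile(pseudocounts):
    """Mean of a Dirichlet distribution with the given pseudo-counts."""
    total = sum(pseudocounts.values())
    return {location: c / total for location, c in pseudocounts.items()}


def sample_profile(pseudocounts, rng):
    """Draw one profile from the Dirichlet prior via normalized gamma draws."""
    draws = {
        location: rng.gammavariate(c, 1.0)
        for location, c in pseudocounts.items()
    }
    total = sum(draws.values())
    return {location: d / total for location, d in draws.items()}


candidate_locations = ["San Diego", "Los Angeles", "Austin"]
bob_prior = prior_pseudocounts(candidate_locations, home="San Diego")
bob_expected = expected_profile(bob_prior)
bob_sample = sample_profile(bob_prior, random.Random(0))
```

A user without an observed home location gets the flat `base` pseudo-count for every candidate, so the profile is then driven by relationships alone.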

Therefore, we arrive at a complete model.

We evaluate our model on a large Twitter corpus. We crawled a subset of Twitter. It contains 139K users, 50 million tweets, and 2 million following relationships.

Task 1: In profiling users’ home locations, MLP performs accurately and improves on the baselines.

Task 2: In profiling users’ multiple locations, MLP performs accurately and completely. Results: precision and recall at rank 2. Case studies: locations in a similar region (accuracy); locations in different areas (completeness).
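The slide reports precision and recall at rank 2 without giving the definitions. A minimal sketch under the usual exact-match definitions follows. The talk may instead count a prediction as correct when it is merely close to a true location, so treat the matching rule here as an assumption.

```python
def precision_recall_at_k(ranked_locations, true_locations, k=2):
    """Exact-match precision and recall over the top-k predicted locations."""
    top_k = ranked_locations[:k]
    hits = sum(1 for location in top_k if location in true_locations)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(true_locations) if true_locations else 0.0
    return precision, recall


# Hypothetical prediction for Carol, whose true locations are taken from the slides.
carol_truth = {"Los Angeles", "Austin"}
carol_ranking = ["Los Angeles", "Honolulu", "Austin"]
carol_p_at_2, carol_r_at_2 = precision_recall_at_k(carol_ranking, carol_truth, k=2)
```

Precision rewards accurate profiles and recall rewards complete ones, which is why the slide pairs the two measures.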

Task 3: In profiling following relationships, MLP achieves 57% accuracy.

Thanks and Questions!