The Database and Info. Systems Lab. University of Illinois at Urbana-Champaign User Profiling in Ego-network: Co-profiling Attributes and Relationships.

User Profiling in Ego-network: Co-profiling Attributes and Relationships. Rui Li, Chi Wang, Kevin Chen-Chuan Chang. The Database and Info. Systems Lab, University of Illinois at Urbana-Champaign.

User profiling, which infers users’ attributes, is important for personalized services and many other applications. Examples: personalized search (search engines) and targeted advertisement (advertisers). [Figure: a user, Richard, with the profile College: UIUC, Location: Champaign.]

User profiling is crucial for social analysis: it gives us the ability to survey the world. Surveying people for behavior: How do college students like iPad vs. Galaxy? How do California males aged 50+ like ObamaCare? Surveying behavior for people: What demographics of users like Samsung more than Apple? What communities of people support ObamaCare?

Can we profile users’ missing attributes in a social network? Some users provide attributes in their online profiles; other users’ attributes are missing. [Figure: a social network of user profiles. Known profiles include Employer: Yahoo!, College: Stanford; Employer: Yahoo!, College: Berkeley; Employer: Twitter, College: Berkeley; Employer: Twitter, College: UIUC; Employer: Google, College: UIUC; Employer: JP Morgan, College: UIUC. Several other users have Employer: ? and College: ?.]

Thus, we abstract our problem as profiling users’ attributes based on friends’ attributes. Input: a network G(V, E) and some users’ attributes. Output: users’ attributes. [Figure: a network with known profiles such as Employer: Yahoo!, College: Stanford and unknown profiles marked Employer: ?, College: ?.]
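The input and output of this problem statement can be written down as plain data structures. The sketch below is only an illustration: the user ids, the edge list, and the helper `unknown_slots` are hypothetical names that do not appear in the slides, and `None` stands in for the "?" entries.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy input: a network G(V, E) plus partially known attributes.
V: Set[str] = {"u1", "u2", "u3", "u4"}
E: Set[Tuple[str, str]] = {("u1", "u2"), ("u2", "u3"), ("u3", "u4")}
attributes: Dict[str, Dict[str, Optional[str]]] = {
    "u1": {"employer": "Yahoo!", "college": "Stanford"},
    "u2": {"employer": None, "college": None},      # "Employer: ? College: ?"
    "u3": {"employer": "Twitter", "college": "UIUC"},
    "u4": {"employer": None, "college": "UIUC"},
}

def unknown_slots(attrs: Dict[str, Dict[str, Optional[str]]]) -> List[Tuple[str, str]]:
    """Return the (user, attribute) pairs that a profiler would have to fill in."""
    return sorted(
        (user, name)
        for user, values in attrs.items()
        for name, value in values.items()
        if value is None
    )

targets = unknown_slots(attributes)
```

Here `targets` lists the slots a profiling method would be asked to predict from the rest of the network.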

While attributes may “propagate” across links, links are very noisy. Existing methods simply assume that two connected users share the same value for any attribute. However, users connect to friends with different values for an attribute: about 11% of friends share the employer and 18% of friends share the college. Only 20% may have attributes. [Figure: a network of known and unknown profiles in which a user with Employer: Google, College: UIUC is linked to friends with different employers and colleges.]
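A rough way to see how noisy links are is to count how often two connected users agree on an attribute. The sketch below does this on made-up toy data; the function name `sharing_rate` and the choice to ignore links where either side is unknown are assumptions, and the slides do not say exactly how the 11% and 18% figures were computed.

```python
def sharing_rate(edges, attrs, name):
    """Among links where both endpoints list `name`, return the fraction that agree."""
    known = [
        (u, v) for u, v in edges
        if attrs[u].get(name) is not None and attrs[v].get(name) is not None
    ]
    if not known:
        return 0.0
    same = sum(1 for u, v in known if attrs[u][name] == attrs[v][name])
    return same / len(known)

# Hypothetical toy network, not data from the study.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
attrs = {
    "a": {"employer": "Yahoo!", "college": "UIUC"},
    "b": {"employer": "Yahoo!", "college": "Berkeley"},
    "c": {"employer": "Twitter", "college": "UIUC"},
    "d": {"employer": None, "college": "UIUC"},
}

employer_rate = sharing_rate(edges, attrs, "employer")
college_rate = sharing_rate(edges, attrs, "college")
```

A low rate for an attribute means that copying a friend's value along an arbitrary link is usually wrong for that attribute.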

Why noisy? Every link is for a (different) relationship! Users have different types of relationships in real life. Richard and Bob (colleagues) share the same employer, but may have different values for other attributes. Richard and Cindy (college classmates) share the same college, but may have different values for other attributes. Richard and Peter (club friends) share the same interests, but may have different values for other attributes.

On the other hand, relationship profiling is necessary by itself, and similarly challenged! Link: Why does a link happen? Given a link, what friendship does it represent? Circle: Who forms what circles? Where are my circles? What does each circle represent? Challenge: while links and circles depend on attributes to detect and to explain, attributes are often unknown.

Proposal: co-profiling attributes and relationships. Attributes: properties of nodes. Relationships: properties of links. Together: understanding both nodes and links. Why together? 1. Necessity: each depends on the other to decide. 2. Benefit: it is useful to know both! [Figure: an ego with a classmates link and a colleagues link; the profiles shown include Employer: Google, College: UIUC and Employer: Yahoo!, College: Berkeley, with missing values filled in as College: UIUC and Employer: Yahoo!.]

But how? By observing how attributes and relationships relate.

Insight: correlation between attributes and connections through relationship. Discriminative correlation insight: attributes and connections are discriminatively correlated via a hidden factor, the relationship. To make this insight concrete, we explore two dependencies based on a real-world user study. Attribute-relationship dependency: how are users’ attributes related to hidden relationship types? Connection-relationship dependency: how are connections related to hidden relationship types?

Observation #1: attribute-relationship dependency. Friends do not share all attributes. Which attributes they share depends on the relationship. [Figure: the percentages of friends sharing the same value with the ego for different attributes, overall and for different relationship types.]

Observation #2: connection-relationship dependency. Friends do not connect to all friends. Which friends they connect to depends on the relationship. [Figure: the average connections per user within and across three different relationship types.]

Specifically, we focus on co-profiling within each user’s ego-network. Ego-network: the subnetwork around an individual user. Circle 1: friends likely to share employer. Circle 2: friends likely to share college. Circle 3: friends likely to share another attribute. Notation: each friend i has an attribute vector f_i (the figure shows f_1, f_3, f_4) and a circle assignment x_i (x_1 = 1, x_3 = 1, x_4 = 2); each circle t has an association vector w_t (the figure shows w_1, w_2). [Figure: an ego-network of known and unknown profiles grouped into circles; the vector values shown on the slide did not survive in the transcript.]
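The notation f_i, x_i, w_t can be made concrete with a small sketch. The slide's actual vector values are lost, so everything below is an assumption chosen for illustration: the vocabulary `VOCAB`, the 0/1 encoding, the example values, and the helper names `one_hot`, `circle_members`, and `visible_in_circle`.

```python
# Hypothetical attribute vocabulary; the slide does not show the real encoding.
VOCAB = ["employer=Yahoo!", "employer=Twitter", "college=Berkeley", "college=UIUC"]

def one_hot(values):
    """Encode a set of attribute values as a 0/1 vector over VOCAB."""
    return [1.0 if v in values else 0.0 for v in VOCAB]

# Attribute vectors f_i for three friends (made-up values).
f = {
    1: one_hot({"employer=Yahoo!", "college=Berkeley"}),
    3: one_hot({"employer=Yahoo!"}),
    4: one_hot({"employer=Twitter", "college=UIUC"}),
}
# Circle assignments x_i, matching the slide: x_1 = 1, x_3 = 1, x_4 = 2.
x = {1: 1, 3: 1, 4: 2}
# Association vectors w_t: which vocabulary entries each circle cares about (assumed).
w = {
    1: [1.0, 1.0, 0.0, 0.0],  # circle 1: employer entries
    2: [0.0, 0.0, 1.0, 1.0],  # circle 2: college entries
}

def circle_members(assignment):
    """Group friends by their assigned circle."""
    groups = {}
    for friend, circle in sorted(assignment.items()):
        groups.setdefault(circle, []).append(friend)
    return groups

def visible_in_circle(friend):
    """Keep only the entries of f[friend] that the friend's circle is associated with."""
    return [fi * wi for fi, wi in zip(f[friend], w[x[friend]])]

members = circle_members(x)
```

With these toy values, friend 4 sits in the college circle, so only the college entry of its attribute vector is treated as relevant to the ego.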

Solution overview: we realize co-profiling in an optimization framework. Variables: friends’ circles (unobserved), user connections (observed), and user attributes (partially observed). Cost function: captures the dependencies between the variables based on the insight. Algorithm: finds the unknown variables that best satisfy the dependencies.

Cost function: we design a cost function to model the dependencies between variables. It has two parts: the attribute-relationship (circle) dependency and the connection-relationship type (circle) dependency. Other formulas could also model the dependencies. However, the function cannot be optimized directly, as it contains both discrete and continuous variables.

Algorithm: we minimize the function by updating each group of variables. Update user attribute vectors F: only propagate values from friends in the same circles, and only propagate the attribute value associated with the circle. Update user circle assignments X: consider both the user’s attributes and connections. Update circle association vectors W: make the association vector sparse.
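The three update steps can be mimicked with a toy alternating loop. This is a simplified discrete heuristic written for illustration, not the authors' cost function or update rules: the function `co_profile`, the scoring weights, the one-pair-per-circle association, and the example data are all assumptions.

```python
from collections import Counter

def co_profile(observed, adj, init_circle, n_circles, rounds=5):
    """Toy alternating updates over circle associations (W), circles (X), attributes (F)."""
    circle = dict(init_circle)
    assoc = {}
    inferred = {u: dict(a) for u, a in observed.items()}
    for _ in range(rounds):
        # Update W: keep a single (attribute, value) pair per circle (a sparse association).
        for t in range(n_circles):
            votes = Counter(
                (name, value)
                for u in observed if circle[u] == t
                for name, value in observed[u].items() if value is not None
            )
            assoc[t] = votes.most_common(1)[0][0] if votes else None

        # Update X: score each circle by the friend's connections and attributes.
        def score(u, t):
            links = sum(1 for v in adj[u] if circle[v] == t)
            pair = assoc[t]
            match = 1 if pair is not None and observed[u].get(pair[0]) == pair[1] else 0
            return links + 2 * match  # the weight 2 is an arbitrary choice

        for u in sorted(observed):
            circle[u] = max(range(n_circles), key=lambda t: score(u, t))

        # Update F: fill a missing attribute only with the value tied to the friend's circle.
        inferred = {u: dict(a) for u, a in observed.items()}
        for u in observed:
            pair = assoc[circle[u]]
            if pair is not None and observed[u].get(pair[0]) is None:
                inferred[u][pair[0]] = pair[1]
    return inferred, circle, assoc

# Hypothetical ego-network: b, c, p are colleagues; d, e, q are classmates.
observed = {
    "b": {"employer": "Yahoo!", "college": "Berkeley"},
    "c": {"employer": "Yahoo!", "college": "Stanford"},
    "p": {"employer": None, "college": None},
    "d": {"employer": "Twitter", "college": "UIUC"},
    "e": {"employer": "Google", "college": "UIUC"},
    "q": {"employer": None, "college": None},
}
adj = {"b": ["c", "p"], "c": ["b", "p"], "p": ["b", "c"],
       "d": ["e", "q"], "e": ["d", "q"], "q": ["d", "e"]}
init = {"b": 0, "c": 0, "p": 1, "d": 1, "e": 1, "q": 0}  # p and q start in the wrong circle

inferred, circle, assoc = co_profile(observed, adj, init, n_circles=2)
```

In this toy run the circle update moves p and q next to their connected friends, and the attribute update then fills only the employer for p and only the college for q, leaving the other attribute unknown.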

Experiment: we first collect real-world ego-networks as our evaluation data set. We conduct user studies to collect users’ attributes and relationship types (circles) from LinkedIn. Data set size: 175 ego users, 19K users, 110K connections. Most users have three attributes, and 8K connections are labeled. We share the data online: LinkedinCrawl-Jan2014.

Experiment: we evaluate our algorithm on both attribute profiling and relationship type profiling.
Attribute profiling baselines:
AP_w: a classic collective classification approach, which profiles a node’s label using weighted votes from its neighbors.
AP_i: another collective classification (semi-supervised learning) approach, which iteratively profiles nodes’ labels with AP_w.
AP_c: a state-of-the-art method, which profiles users’ attributes based on clustering the network.
Relationship type (circle) profiling baselines:
RP_a: profiles friends’ circles based on their attributes.
RP_n: profiles friends’ circles based on network structure.
RP_an: profiles friends’ circles based on network and attributes, but assumes the attributes are known.
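The weighted-vote idea behind the first attribute profiling baseline can be sketched as follows. The slides do not specify the weighting, so the weight used here (one plus the number of common neighbors) is an assumption, as are the function names and the toy network.

```python
from collections import defaultdict

def common_neighbor_weight(neighbors, u, v):
    """Assumed link weight: 1 plus the number of neighbors shared by u and v."""
    return 1 + len(set(neighbors[u]) & set(neighbors[v]))

def weighted_vote(node, neighbors, labels):
    """Predict the label of `node` from weighted votes of its labeled neighbors."""
    scores = defaultdict(float)
    for nb in neighbors[node]:
        label = labels.get(nb)
        if label is not None:
            scores[label] += common_neighbor_weight(neighbors, node, nb)
    if not scores:
        return None  # no labeled neighbor to vote
    # Sort first so that ties are broken alphabetically rather than by insertion order.
    return max(sorted(scores), key=scores.get)

# Hypothetical toy network: the ego's employer is unknown.
neighbors = {
    "ego": ["a", "b", "c"],
    "a": ["ego", "b"],
    "b": ["ego", "a"],
    "c": ["ego"],
}
labels = {"a": "Yahoo!", "b": "Yahoo!", "c": "Twitter"}

prediction = weighted_vote("ego", neighbors, labels)
```

An iterative variant would repeat this vote over all unlabeled nodes, feeding each round's predictions into the next.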

Co-profiling (CP) not only handles both attribute profiling (AP) and relationship profiling (RP), but also outperforms the baselines on both.

Summary: we made the following contributions to this problem. We propose a co-profiling approach that jointly profiles users’ attributes and relationship types (circles) in ego networks. We present the discriminative correlation insight to capture the correlation between attributes and social connections. We conduct extensive experiments to evaluate our algorithms on two tasks based on real-world ego networks.

Thank You!