Identifying Significant Locations
Petteri Nurmi 1, Johan Koolwaaij 2
1) Helsinki Institute for Information Technology HIIT
2) Telematica Instituut (TELIN)



1. Introduction and motivation
2. Algorithms for location clustering
3. Experiments

What are significant locations?
1. Raw location data is usually meaningless to a user
→ Data is clustered with the goal of finding clusters that are meaningful to the user
– Meaningful clusters are called places or significant locations. Examples: HOME, OFFICE, LIBRARY, ...
– Usually a two-phase process:
  1. Apply spatial clustering
  2. Prune out meaningless clusters using temporal information
2. Existing work differs in its source of data:
– Continuous GPS streams
– GSM cell identifiers
– WiFi access point information
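The two-phase process named on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the greedy centroid-distance rule, the planar `(x, y, seconds)` input format, and the names `spatial_clusters`, `prune_by_time`, `radius`, and `min_seconds` are all hypothetical.

```python
from math import hypot


def spatial_clusters(points, radius):
    """Phase 1 (assumed rule): greedy spatial clustering.

    points: iterable of (x, y, seconds) in planar coordinates.
    A point joins the first cluster whose current centroid lies within
    `radius`; otherwise it starts a new cluster.
    """
    clusters = []
    for x, y, secs in points:
        for cluster in clusters:
            cx = sum(p[0] for p in cluster) / len(cluster)
            cy = sum(p[1] for p in cluster) / len(cluster)
            if hypot(x - cx, y - cy) <= radius:
                cluster.append((x, y, secs))
                break
        else:
            # No existing cluster was close enough.
            clusters.append([(x, y, secs)])
    return clusters


def prune_by_time(clusters, min_seconds):
    """Phase 2 (assumed rule): keep clusters with enough total dwell time."""
    return [c for c in clusters if sum(p[2] for p in c) >= min_seconds]
```

With two nearby long stays and one short distant stay, phase 1 would yield two clusters and phase 2 would keep only the one where the user actually spent time.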

More on data sources
1. Our focus: open, “multi-country” mobile environments
2. Practical issues with different sources:
– GPS coordinates: continuous gathering is infeasible, as are the communication and processing costs
– WiFi: locations of base stations must be known beforehand
– GSM cell identifiers: can be done on the device, but location accuracy is poor
3. In our work, we combine GPS and GSM data to achieve:
– Better accuracy than with GSM identifiers alone
– Resource costs that are higher than with GSM identifiers alone but remain feasible in practice

Why are some locations significant?
1. The motivation comes from (social) identity theory:
– A person acts in several societal categories (roles)
– With each role, the person associates behavioral expectations
→ When a person acts in a specific role, the observed behavior is influenced by those behavioral expectations
2. The link to significant locations:
– Many locations, such as work and home, are boundaries between different roles
– Thus we can expect the location to influence the behavior of the user
– However, location is not the only explanatory factor, but one potentially useful source of information

1. Introduction and motivation
2. Algorithms for location clustering
3. Experiments

Goals for algorithms
1. Accuracy
– Are our clusters correct? How much additional area do they cover?
2. Time to reasonable results
– How quickly can an application access clusters?
3. Scalability
– With months of data, how do we handle data overflow?
4. Adaptability
– How well can the clusters be adapted over time? How well does the clustering work in online or batch mode?
5. Meaningfulness of the clusters
– Do the users think the captured clusters are meaningful?
6. Cluttering
– How do we keep the diameter of the clusters from growing over time?

Graph-based Algorithms
1. Heuristic graph clustering
– Builds a weighted graph where edge weights are distances between centroids of GPS measurements within individual GSM cells
– Maps the distances into probabilities and uses (exponential) weight decay to merge GSM cells
2. Spectral clustering
– Finds a minimum cut in the weighted graph using a spectral representation and a suitable objective function
– In our case: form the spectral representation, calculate the eigenvalues, and use K-means to cluster the top eigenvectors
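The heuristic graph clustering idea can be illustrated with a short sketch. The slide does not give the exact mapping from distance to probability or the merge rule, so everything specific here is an assumption: the mapping `exp(-d / scale)`, the merge `threshold`, planar coordinates, and the function names `cell_centroids` and `merge_cells` are hypothetical.

```python
from collections import defaultdict
from math import exp, hypot


def cell_centroids(fixes):
    """fixes: iterable of (cell_id, x, y) GPS measurements tagged with
    the GSM cell seen at the same time. Returns {cell_id: (mean_x, mean_y)}."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for cell, x, y in fixes:
        s = sums[cell]
        s[0] += x
        s[1] += y
        s[2] += 1
    return {cell: (sx / n, sy / n) for cell, (sx, sy, n) in sums.items()}


def merge_cells(centroids, scale, threshold):
    """Merge cells whose centroid distance d gives exp(-d / scale) >= threshold.

    The exponential mapping and the threshold are assumptions made for
    this sketch. Merging is transitive (union-find over cell ids).
    """
    parent = {c: c for c in centroids}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    cells = sorted(centroids)
    for i, a in enumerate(cells):
        for b in cells[i + 1:]:
            d = hypot(centroids[a][0] - centroids[b][0],
                      centroids[a][1] - centroids[b][1])
            if exp(-d / scale) >= threshold:
                parent[find(b)] = find(a)

    groups = defaultdict(set)
    for c in cells:
        groups[find(c)].add(c)
    return sorted(sorted(g) for g in groups.values())
```

A larger `scale` makes the probability decay more slowly with distance, so more distant cells would be merged into the same place.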

Algorithms
1. Duration-based grid clustering
– The accuracy of a location measurement is used to distribute the duration the user spends in a single location (no transitions) over a grid
– Clusters are formed by merging grid points where the user spends “enough” time
2. Frequent transitions
– Builds a probability matrix from transitions between cells and clusters two cells when the transition probabilities between them are significant
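Both algorithms on this slide can be sketched briefly. The slide leaves the details open, so the following choices are assumptions: the duration is spread uniformly over a square of grid cells sized by the measurement accuracy, busy cells are merged when they share an edge, and a transition counts as "significant" when its probability reaches `threshold` in both directions. All function and parameter names are hypothetical.

```python
from collections import Counter


def spread_duration(grid, x, y, accuracy, seconds, cell_size):
    """Distribute `seconds` uniformly (an assumption) over the grid cells
    within `accuracy` of (x, y). `grid` maps (gx, gy) -> seconds."""
    r = int(accuracy // cell_size)
    gx, gy = int(x // cell_size), int(y // cell_size)
    cells = [(gx + dx, gy + dy)
             for dx in range(-r, r + 1) for dy in range(-r, r + 1)]
    share = seconds / len(cells)
    for cell in cells:
        grid[cell] = grid.get(cell, 0.0) + share


def merge_busy_cells(grid, min_seconds):
    """Merge edge-adjacent grid cells holding at least `min_seconds`."""
    busy = {c for c, s in grid.items() if s >= min_seconds}
    clusters = []
    while busy:
        stack = [busy.pop()]
        cluster = set(stack)
        while stack:
            cx, cy = stack.pop()
            for n in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if n in busy:
                    busy.remove(n)
                    cluster.add(n)
                    stack.append(n)
        clusters.append(sorted(cluster))
    return sorted(clusters)


def transition_probabilities(cell_sequence):
    """Row-normalised transition probabilities between distinct cells."""
    counts, out_totals = Counter(), Counter()
    for a, b in zip(cell_sequence, cell_sequence[1:]):
        if a != b:
            counts[(a, b)] += 1
            out_totals[a] += 1
    return {(a, b): n / out_totals[a] for (a, b), n in counts.items()}


def frequent_pairs(probs, threshold):
    """Pairs of cells whose transition probability reaches `threshold`
    in both directions (an assumed reading of "significant")."""
    return sorted({tuple(sorted((a, b)))
                   for (a, b), p in probs.items()
                   if p >= threshold and probs.get((b, a), 0.0) >= threshold})
```

Requiring both directions in `frequent_pairs` is meant to capture cells the phone oscillates between while the user stays in one place; a one-directional rule would also pair cells along a regularly travelled route.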

1. Introduction and motivation
2. Algorithms for location clustering
3. Experiments

The test data
1. Data gathered by multiple users throughout Europe 2) using the Context Watcher application: waaij/showcase/crf/cw.html (Google for context watcher, 1st hit)
2. In the picture you can see some example places where the author has gathered data
3. Thus we have a rich and heterogeneous collection of location data as our test set
2) Also small amounts of data from outside Europe: USA, Russia, Canada, Ukraine

Results: Heuristic graph clustering

Results: Spectral clustering

Example results: Grid clustering

Results: Transition-based clustering

Summary
– Why are some locations meaningful? Identity theory: a person acts in different roles, with which (s)he associates behavioral expectations. Thus, if a location is a boundary between two roles, we can expect to observe different kinds of behavior.
– We introduced a novel domain for (mobile) location data: GSM cell identifiers + GPS coordinates
– Presented four algorithms for the problem
– Compared the algorithms using real data
– In addition, we discussed which properties are desirable for a location clustering algorithm