INTERNATIONAL INSTITUTE FOR GEO-INFORMATION SCIENCE AND EARTH OBSERVATION Conceptualization of Place via Spatial Clustering and Co-occurrence Analysis

Presentation transcript:

INTERNATIONAL INSTITUTE FOR GEO-INFORMATION SCIENCE AND EARTH OBSERVATION
Conceptualization of Place via Spatial Clustering and Co-occurrence Analysis
2009 International Workshop on Location Based Social Networks (LBSN’09)
Dong-Po Deng; Tyng-Ruey Chuang; Rob Lemmens
Nov. 3, 2009, Seattle, WA, USA

Geo-information is increasing on the Web
• Searching for and sharing geo-referenced information and resources on the Web is a common activity

Folksonomy
• A tagging system allows users to classify objects of interest by keywords or terms
• Folksonomy = the practice of personally tagging information and objects in a social environment while people consume the information and use the objects
[Figure: social tools]

Tags and Geo-tags
• Tagging is a process that is established by keywords (k), users (u), and objects (o)
• Geotag
  • geo:lat=latitude, e.g. geo:lat=
  • geo:lon=longitude, e.g. geo:lon=4.269
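
The geotag convention above can be illustrated with a minimal Python sketch. It assumes that a photo's tags arrive as a plain list of strings and that coordinates are encoded as `geo:lat=` and `geo:lon=` tags; the names `TagAssignment` and `parse_geotags`, and the latitude value in the usage note, are hypothetical and not taken from the slides.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class TagAssignment(NamedTuple):
    """One tagging act: a user (u) attaches a keyword (k) to an object (o)."""
    keyword: str
    user: str
    obj: str


def parse_geotags(tags: Iterable[str]) -> Optional[Tuple[float, float]]:
    """Return (lat, lon) from geo:lat= / geo:lon= tags.

    Returns None if either coordinate is missing or not a number.
    """
    lat = lon = None
    for tag in tags:
        key, sep, value = tag.partition("=")
        if not sep:
            continue  # an ordinary keyword tag, not key=value
        key = key.strip().lower()
        try:
            number = float(value)
        except ValueError:
            continue  # empty or malformed value
        if key == "geo:lat":
            lat = number
        elif key == "geo:lon":
            lon = number
    if lat is None or lon is None:
        return None
    return (lat, lon)
```

With an illustrative latitude, `parse_geotags(["amsterdam", "canal", "geo:lat=52.0", "geo:lon=4.269"])` returns `(52.0, 4.269)`, while a photo without a complete coordinate pair yields `None` and would be skipped by any spatial step.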

Questions are …
• Is geospatial data created in a social network a valuable product for a geospatial society in general?
• How can geospatial information be extracted from user-generated content in a social network?

Places as artifacts
• Place is a center of meaning constructed by experiences
• Place may be significant to any individual or group, and may exist at any scale
• Locations become places only when activities occur that cause them to become imbued with meaning
• Place provides the conditions of possibility for creative social practice

Photos with tags = locations with tags
[Figure: tags]

Collective intelligence
• Tags should give rise to emergent semantics and shared conceptualization
• Accumulation of tags on shared objects often expresses a common consensus
• Patterns and trends that emerge from the collaboration and competition of many individuals can yield structured information from a tag-based system, despite the lack of an ontology and a priori defined semantics

Photos and Tags in Flickr
[Figure labels: Tags, Geo-Tag, Time-Tag]

Selected photos from Flickr

Where is the beef?
• 2008 amsterdam canal europe holland netherlands noordholland north travel
(the most frequently occurring 20% of tags)

Steps for extracting conceptualization of place
Tags crawling → geotagged & tagged photos database → Spatial clustering → Co-occurrence analysis → Place concepts

DBSCAN is a density-based algorithm
• Two global parameters:
  • Eps: maximum radius of the neighbourhood
  • MinPts: minimum number of points in the Eps-neighbourhood of a point
• Core object: an object with at least MinPts objects within its Eps-neighbourhood
• Border object: an object that lies on the border of a cluster, i.e. within the Eps-neighbourhood of a core object without being a core object itself
[Figure: points p and q, with MinPts = 5 and Eps = 1 cm]
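
The roles of Eps and MinPts can be made concrete with a small Python sketch that labels each point as core, border, or noise. It assumes planar coordinates with Euclidean distance and counts a point as a member of its own neighbourhood; the function names `region_query` and `classify_points` are illustrative, not from the slides. Real latitude/longitude data would need a geodesic distance instead.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def classify_points(points, eps, min_pts):
    """Label every point as 'core', 'border' or 'noise' for the given Eps and MinPts."""
    neighbours = [region_query(points, i, eps) for i in range(len(points))]
    # A core object has at least min_pts objects in its Eps-neighbourhood.
    core = {i for i, n in enumerate(neighbours) if len(n) >= min_pts}
    labels = []
    for i, n in enumerate(neighbours):
        if i in core:
            labels.append("core")
        elif any(j in core for j in n):
            # Not dense enough itself, but inside the neighbourhood of a core object.
            labels.append("border")
        else:
            labels.append("noise")
    return labels
```

For example, with `eps=1.5` and `min_pts=4`, the four corners of a unit square are all core objects, a point 1.4 units to the right of one corner is a border object, and a far-away isolated point is noise.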

Density-Based Clustering: Background
• Density-reachable
  • A point p is density-reachable from a point q wrt. Eps and MinPts if there is a chain of points p_1, …, p_n with p_1 = q and p_n = p such that p_{i+1} is directly density-reachable from p_i
• Density-connected
  • A point p is density-connected to a point q wrt. Eps and MinPts if there is a point o such that both p and q are density-reachable from o wrt. Eps and MinPts
[Figure: points p, q, p_1 and o]

DBSCAN: The Algorithm
• Arbitrarily select a point p
• Retrieve all points density-reachable from p wrt. Eps and MinPts
• If p is a core point, a cluster is formed
• If p is a border point, no points are density-reachable from p, and DBSCAN visits the next point of the database
• Continue the process until all of the points have been processed
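
The steps above can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used in the study: it assumes Euclidean distance on planar coordinates, uses a brute-force neighbourhood search instead of a spatial index, and the name `dbscan` and the `NOISE` marker are chosen here for illustration.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""

    def region_query(i):
        # All points within eps of points[i], including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)  # None means "not yet visited"
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(i)
        if len(neighbours) < min_pts:
            # Not a core point; may be relabelled later as a border point.
            labels[i] = NOISE
            continue
        # i is a core point: start a new cluster and expand it.
        labels[i] = cluster_id
        seeds = list(neighbours)
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbours = region_query(j)
            if len(j_neighbours) >= min_pts:
                seeds.extend(j_neighbours)  # j is also a core point, keep expanding
        cluster_id += 1
    return labels
```

The brute-force `region_query` makes the sketch quadratic in the number of points, which is fine for illustration but would likely be replaced by a spatial index for a large photo collection.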

Density-Based Clustering: Results

Co-occurrence analysis
• Co-occurrence can be interpreted as an indicator of semantic similarity or an idiomatic expression
• Co-occurrence assumes interdependency of the two terms
• Semantic similarity is a concept whereby a set of documents or terms within term lists are assigned a metric based on the likeness of their meaning / semantic content

Co-occurrence matrix
• The element at (i, j) is the count or frequency of the i-th tag in the j-th photo

Co-occurrence matrix
• A row in the matrix is a vector of the tag's occurrences in all photos
• A column is a vector of the occurrences of all tags in a photo
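
A minimal Python sketch of such a matrix is given here, assuming each photo is represented simply as a list of its tags. The function name `build_tag_photo_matrix` is hypothetical, and the tags in the usage note are illustrative rather than data from the study.

```python
def build_tag_photo_matrix(photos):
    """Build the tag-by-photo matrix.

    photos: list of tag lists, one list per photo.
    Returns (tags, matrix) where matrix[i][j] is the count of tags[i] in photo j.
    """
    tags = sorted({t for photo in photos for t in photo})
    index = {t: i for i, t in enumerate(tags)}
    matrix = [[0] * len(photos) for _ in tags]
    for j, photo in enumerate(photos):
        for t in photo:
            matrix[index[t]][j] += 1
    return tags, matrix


def column(matrix, j):
    """The column for photo j: occurrences of all tags in that photo."""
    return [row[j] for row in matrix]
```

For three photos tagged `["amsterdam", "canal"]`, `["amsterdam", "rijksmuseum"]` and `["canal"]`, the row for "amsterdam" is `[1, 1, 0]` and the column for the first photo is `[1, 1, 0]` over the tags amsterdam, canal, rijksmuseum.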

Co-occurrence correlations
Photo-tag matrix → tag-tag correlation matrix
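
One way to derive a tag-tag correlation matrix from the rows of a tag-by-photo matrix is sketched below. The slides do not state which correlation coefficient was used, so the Pearson coefficient here is an assumption; the function names are illustrative, and returning 0.0 for a constant row is a choice made for this sketch.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_x * var_y)


def tag_correlation_matrix(matrix):
    """matrix[i] is the occurrence vector of tag i over all photos.

    Returns a square matrix whose (i, k) entry correlates tag i with tag k.
    """
    return [[pearson(row_i, row_k) for row_k in matrix] for row_i in matrix]
```

Two tags that always appear on the same photos get a coefficient of 1, tags that never appear together on a shared set of photos tend towards negative values, and the resulting matrix is symmetric.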

The correlation between the tag "amsterdam" and the tags of several landmarks associated with Amsterdam
[Figure axes: Distance, Correlation coefficient]

Conceptualizing places in 2500 meters

Conceptualizing places in 150 meters

Conceptualizing places in 75 meters

Schiphol airport

Anne Frank House

Rijksmuseum

Conclusions and future work
• Without the use of suitable spatial clustering, detailed information about a place is veiled by high-frequency tags
• A conceptualization of place is unveiled by tag co-occurrences at a suitable spatial scale
• Location-based applications can be developed to suggest tags to users as they take photos
• In the future we will ground the semantics between pairs of tags via the use of gazetteers or dictionaries

INTERNATIONAL INSTITUTE FOR GEO-INFORMATION SCIENCE AND EARTH OBSERVATION Thank you for your attention! Dongpo Deng