Clustering Via Persistent Homology

Slides:



Advertisements
Similar presentations
Computing Persistent Homology
Advertisements

Lecture 6: Creating a simplicial complex from data. in a series of preparatory lectures for the Fall 2013 online course MATH:7450 (22M:305) Topics in Topology:
Homology Groups And Persistence Homology
Matthew L. WrightMatthew L. Wright Institute for Mathematics and its ApplicationsInstitute for Mathematics and its Applications University of MinnesotaUniversity.
R. Johnsonbaugh Discrete Mathematics 5 th edition, 2001 Chapter 8 Network models.
Lecture 5: Triangulations & simplicial complexes (and cell complexes). in a series of preparatory lectures for the Fall 2013 online course MATH:7450 (22M:305)
Topological Data Analysis
©2008 I.K. Darcy. All rights reserved This work was partially supported by the Joint DMS/NIGMS Initiative to Support Research in the Area of Mathematical.

MA5209 Algebraic Topology Wayne Lawton Department of Mathematics National University of Singapore S ,
All that remains is to connect the edges in the variable-setters to the appropriate clause-checkers in the way that we require. This is done by the convey.
Lecture 4: Addition (and free vector spaces) of a series of preparatory lectures for the Fall 2013 online course MATH:7450 (22M:305) Topics in Topology:
General (point-set) topology Jundong Liu Ohio Univ.
CS 376b Introduction to Computer Vision 02 / 22 / 2008 Instructor: Michael Eckmann.
Digital Image Processing CCS331 Relationships of Pixel 1.
Topological Data Analysis
Lecture 2: Addition (and free abelian groups) of a series of preparatory lectures for the Fall 2013 online course MATH:7450 (22M:305) Topics in Topology:
Persistent Homology in Topological Data Analysis Ben Fraser May 27, 2015.
Creating a simplicial complex Step 0.) Start by adding 0-dimensional vertices (0-simplices)
MATH:7450 (22M:305) Topics in Topology: Scientific and Engineering Applications of Algebraic Topology Sept 9, 2013: Create your own homology. Fall 2013.
A filtered complex is an increasing sequence of simplicial complexes: C 0 C 1 C 2 … UUU.
Welcome to MATH:7450 (22M:305) Topics in Topology: Scientific and Engineering Applications of Algebraic Topology Week 1: Introduction to Topological Data.
Sajid Ghuffar 24.June  Introduction  Simplicial Complex  Boundary Operator  Homology  Triangulation  Persistent Homology 6/24/ Persistence.
MATH:7450 (22M:305) Topics in Topology: Scientific and Engineering Applications of Algebraic Topology Sept 16, 2013: Persistent homology III Fall 2013.
Recombination:. Different recombinases have different topological mechanisms: Xer recombinase on psi. Unique product Uses topological filter to only perform.
Creating a cell complex = CW complex Building block: n-cells = { x in R n : || x || ≤ 1 } 2-cell = open disk = { x in R 2 : ||x || < 1 } Examples: 0-cell.
Copyright © Cengage Learning. All rights reserved.
Sept 25, 2013: Applicable Triangulations.
From Natural Images to MRIs: Using TDA to Analyze Image Data
Vectors and the Geometry
Copyright © Cengage Learning. All rights reserved.
Copyright © Cengage Learning. All rights reserved.
Virtual University of Pakistan
3.1 Clustering Finding a good clustering of the points is a fundamental issue in computing a representative simplicial complex. Mapper does not place any.
Zigzag Persistent Homology Survey
We propose a method which can be used to reduce high dimensional data sets into simplicial complexes with far fewer points which can capture topological.
Creating a cell complex = CW complex
Structural Properties of Networks: Introduction
Dec 2, 2013: Hippocampal spatial map formation
Sept 23, 2013: Image data Application.
Chapter 2 Straight Line Motion
Application to Natural Image Statistics
Chapter 2 Motion Along a Straight Line
MATH:7450 (22M:305) Topics in Topology: Scientific and Engineering Applications of Algebraic Topology Oct 21, 2013: Cohomology Fall 2013 course offered.
Dec 4, 2013: Hippocampal spatial map formation
Matrices 3 1.
Main Project total points: 500
مقدمة: الاطفال الذين يعانون من كثرة النشاط الحركى ليسوا باطفال مشاغبين، او عديمين التربية لكن هم اطفال عندهم مشكلة مرضية لها تاثير سيء على التطور النفسى.
Graph Analysis by Persistent Homology
Depth-First Search.
3.1 Clustering Finding a good clustering of the points is a fundamental issue in computing a representative simplicial complex. Mapper does not place any.
Discrete Structure II: Introduction

Computer Vision Lecture 9: Edge Detection II
A filtered complex is an increasing sequence of simplicial complexes: C0 C1 C2 …
Topological Data Analysis
6.2 Volume: The Disc Method
Depth Estimation via Sampling
Lecture 3 : Isosurface Extraction
Volume Graphics (lecture 4 : Isosurface Extraction)
Connected Components, Directed Graphs, Topological Sort
Graphs, Linear Equations, and Functions
Copyright © Cengage Learning. All rights reserved.
Introduction to Algebraic Topology and Persistent Homology
Maths for Signals and Systems Linear Algebra in Engineering Lectures 13 – 14, Tuesday 8th November 2016 DR TANIA STATHAKI READER (ASSOCIATE PROFFESOR)
Quantum Foundations Lecture 3
Betti numbers provide a signature of the underlying topology.
Lecture 5: Triangulations & simplicial complexes (and cell complexes).
INTRODUCTION A graph G=(V,E) consists of a finite non empty set of vertices V , and a finite set of edges E which connect pairs of vertices .
Presentation transcript:

Clustering Via Persistent Homology MATH:7450 (22M:305) Topics in Topology: Scientific and Engineering Applications of Algebraic Topology Sept 4, 2013 Clustering Via Persistent Homology

Creating a simplicial complex Recall that to create a simplicial complex, we start by adding 0-simplices (ie 0-dimensional vertices). So our step zero will be to add 0-simplices, but in this case our 0-dimensional points will be Step 0.) Start by adding 0-dimensional vertices (0-simplices)

Creating a simplicial complex So we can measure (or calculate) the distance between two points to determine if they are “close”. But we still need to define close. 1.) Adding 1-dimensional edges (1-simplices) Add an edge between data points that are “close”

Creating a simplicial complex Thus we will choose a threshold. We will connect a pair of vertices with an edge if and only if their distance is less then our chosen threshold. But it is not always obvious how to choose the threshold. This problem can be dealt with by using persistence which we will briefly introduce later in this lecture. For now let’s somewhat arbitrarily choose a distance, say 1.8 cm. 1.) Adding 1-dimensional edges (1-simplices) Let T = Threshold Connect vertices v and w with an edge iff the distance between v and w is less than T

Creating a simplicial complex So we will connect every pair of vertices if their distance is less than 1.8 cm. If the center of these points represent the 0-dimensional vertices, then this distance is less than our threshold, so we add an edge. Similarly this pair of points satisfy our definition of close, so we add an edge The distance between this pair of vertices is greater than 1.8, so we won’t connect them with an edge. Continuing to add edges between vertices whenever their distance is less than our threshold of 1.8cm, we now have 1.) Adding 1-dimensional edges (1-simplices) Let T = Threshold = Connect vertices v and w with an edge iff the distance between v and w is less than T

Creating a simplicial complex a one dimensional simplicial complex. Note that we have clustered our data into five disjoint connected sets. So this is one way to cluster our data – that is grouping our data points into disjoint sets based on some definition of similarity. In this case, we have 5 clusters. We can now add higher dimensional simplices. 1.) Adding 1-dimensional edges (1-simplices) Add an edge between data points that are “close” Note: we only need a definition of closeness between data points. The data points do not need to be actual points in Rn

Creating the Vietoris Rips simplicial complex Thus we now have the Vietoris Rips simplicial complex. Note we get the same simplex by adding one dimension at a time 2.) Add all possible simplices of dimensional > 1.

Creating the Vietoris Rips simplicial complex We start again by adding our 0-simplices, our data points. I can indicate the threshold using a ball centered at my data point. The diameter of the ball will be my threshold, so if two balls intersect then the distance between the vertices is less than the threshold, so we connect the pair of vertices with an edge if and only if the balls intersect, so we can get edges after which we can then add all possible higher dimensional simplices. Note how I grew my balls. As the diameter increases, my threshold which is equal to my diameter increases. 0.) Start by adding 0-dimensional data points Use balls to measure distance (Threshold = diameter). Two points are close if their corresponding balls intersect

Creating the Vietoris Rips simplicial complex We can compute the number of clusters for a variety of diameters. We start with 17 data points, so if the diameter is 0, we have 17 clusters. Increasing the diameter, these 2 balls intersect so I now have 16 clusters. If we continue to increase the diameter, we will eventually create the complex we saw before with 5 clusters, etc until we only have one cluster left. Eventually this entire page will be purple, but right now, we know have one component. To choose the threshold, one can determine how long a particular number of clusters lasts, for example for what set of radii do we have five clusters. If we have five clusters for the largest set of radii, then have gives us a good idea where to set the threshold and which simplicial complex best models our data. I have put links to better animations on my on my YouTube site which may better illustrate this persistence concept. Next month, we will also talk much more about persistence during the live lectures for this course. This is just a preliminary introduction. 0.) Start by adding 0-dimensional data points Use balls of varying radii to determine persistence of clusters. H. Edelsbrunner, D. Letscher, and A. Zomorodian, Topological persistence and simplication, Discrete and Computational Geometry 28, 2002, 511-533.

Discriminative persistent homology of brain networks, 2011 Hyekyoung Lee Chung, M.K.; Hyejin Kang; Bung-Nyun Kim;Dong Soo Lee Constructing functional brain networks with 97 regions of interest (ROIs) extracted from FDG-PET data for 24 attention-deficit hyperactivity disorder (ADHD), 26 autism spectrum disorder (ASD) and 11 pediatric control (PedCon). Data = measurement fj taken at region j Graph: 97 vertices representing 97 regions of interest edge exists between two vertices i,j if correlation between fj and fj ≥ threshold How to choose the threshold? Don’t, instead use persistent homology

Creating the Vietoris Rips simplicial complex Thus we now have the Vietoris Rips simplicial complex. Note we get the same simplex by adding one dimension at a time Note to determine clusters, only need vertices and edges (but we will use higher dimensional simplices Friday to look at holes in data).

Creating a simplicial complex a one dimensional simplicial complex. Note that we have clustered our data into five disjoint connected sets. So this is one way to cluster our data – that is grouping our data points into disjoint sets based on some definition of similarity. In this case, we have 5 clusters. We can now add higher dimensional simplices. Note to determine clusters, only need vertices and edges.

measurement at location i Vertices = Regions of Interest Create Rips complex by growing epsilon balls (i.e. decreasing threshold) where distance between two vertices is given by where fi = measurement at location i

How many connected components?

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 C0 = Z2[v1, v2, v3, v4, v5, v6] = set of 0-chains C1 = Z2[e1, e2, e3, e4, e5] = set of 1-chains

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 o (e1) = v1 + v2 Components are {v1, v2, v3} and {v4, v5, v6} o (e2) = v2 + v3 o (e3) = v4 + v5 o (e4) = v5 + v6 o (e5) = v4 + v6

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 C0 = Z2[v1, v2, v3, v4, v5, v6] = set of 0-chains C1 = Z2[e1, e2, e3, e4, e5] = set of 1-chains : C1  C0 o

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 : C1  C0 o o (e1) = v1 + v2 o (e2) = v2 + v3 o (e3) = v4 + v5 o (e4) = v5 + v6 o (e5) = v4 + v6

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 : C1  C0 o o (e1) = v1 + v2 Extend linearly: o (e2) = v2 + v3 o (e3) = v4 + v5 o (e4) = v5 + v6 o (e5) = v4 + v6

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 : C1  C0 o o (e1) = v1 + v2 Extend linearly: o (e2) = v2 + v3 o (e3) = v4 + v5 o (e4) = v5 + v6 o (e5) = v4 + v6

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 : C1  C0 o o (e1) = v1 + v2 Extend linearly: o (e2) = v2 + v3 o (e3) = v4 + v5 o (e4) = v5 + v6 o (e5) = v4 + v6

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 o1 o0 C1  C0  0 Z0 = kernal of = {x : (x) = 0} o0 o0

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 o1 o0 C1  C0  0 Z0 = kernal of = {x : (x) = 0} = C0 = Z2[v1, v2, v3, v4, v5, v6] o0 o0

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 o1 o0 o (e1) = v1 + v2 (e2) = v2 + v3 (e3) = v4 + v5 (e4) = v5 + v6 (e5) = v4 + v6 C1  C0  0 Z0 = Z2[v1, v2, v3, v4, v5, v6] B0 = image of o1

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 H0 = Z0/B0 = <v1, v2, v3, v4, v5, v6 : v1 + v2 = 0, v2 + v3 = 0, v4 + v5 = 0, v5 + v6 = 0, v4 + v6 = 0>

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 Z0/B0 = <v1, v2, v3, v4, v5, v6 : v1 + v2 = 0, v2 + v3 = 0, v4 + v5 = 0, v5 + v6 = 0, v4 + v6 = 0>

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 Z0/B0 = <v1, v2, v3, v4, v5, v6 : v1 + v2 = 0, v2 + v3 = 0, v4 + v5 = 0, v5 + v6 = 0, v4 + v6 = 0> = <v1, v2, v3, v4, v5, v6 : v1 + v2 , v2 + v3 , v4 + v5 , v5 + v6 , v4 + v6 >

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 Z0/B0 = <v1, v2, v3, v4, v5, v6 : v1 + v2 = 0, v2 + v3 = 0, v4 + v5 = 0, v5 + v6 = 0, v4 + v6 = 0>

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 Z0/B0 = <v1, v2, v3, v4, v5, v6 : v1 + v2 = 0, v2 + v3 = 0, v4 + v5 = 0, v5 + v6 = 0, v4 + v6 = 0> v1 + v2 = 0 implies v1 = v2 Z0/B0 = <v1, v3, v4, v5, v6 : v2 + v3 = 0, v4 + v5 = 0, v5 + v6 = 0, v4 + v6 = 0>

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 Z0/B0 = <v1, v3, v4, v5, v6 : v2 + v3 = 0, v4 + v5 = 0, v5 + v6 = 0, v4 + v6 = 0> Z0/B0 = <v1, v4>

Counting number of connected components using homology v1 v2 v3 e1 e2 v6 v4 v5 e3 e5 e4 Z0/B0 = <v1, v2, v3, v4, v5, v6 : v1 + v2 = 0, v2 + v3 = 0, v4 + v5 = 0, v5 + v6 = 0, v4 + v6 = 0> Z0/B0 = <[v1], [v4]> where [v1] = {v1, v2, v3} and [v4] = {v4, v5, v6}

Counting number of connected components using homology Z0/B0 = <v1, v2, v3, v4, v5, v6 : v1 + v2 = 0, v2 + v3 = 0, v4 + v5 = 0, v5 + v6 = 0, v4 + v6 = 0> Use matrices: See Computing Persistent Homology by Afra Zomorodian, Gunnar Carlsson