Histogram Analysis to Choose the Number of Clusters for K Means By: Matthew Fawcett Dept. of Computer Science and Engineering University of South Carolina.


Overview Importance and use; K-means clustering algorithm; the changes and adaptations I used; results; conclusions and future work.

Importance The main reason for using k-means in medical imaging is segmentation. It also has uses outside the realm of image processing (e.g. information retrieval). It is a widespread algorithm.

K Means Clustering The problem is that the user doesn't know the optimal number of clusters to pick. This is the problem I am trying to solve by using histogram analysis: I use the histogram of the pixel intensity to find the optimal number of clusters for a picture.

Overview Importance and use; K-means clustering algorithm; the changes and adaptations I used; results; conclusions and future work.

Algorithm K-means clustering is a very simple algorithm. First, the user picks the number of centers that he/she would like. Next, the centers are chosen randomly.

Algorithm I have read about different ways to choose the centers (e.g. pick the 2 points farthest away from each other). After the centers have been established, we check every other point against each of the centers and find the minimum distance.

Algorithm Each point is assigned to the 1 cluster to which it is closest. This makes sense, since points that are closer to each other normally belong together. After each point is assigned, the cluster centers are recalculated based on these assignments.

Algorithm Once the new centers have been computed, the routine starts over and continues until it converges and the centers do not move. cc/Clustering/tutorial_html/AppletKM.html
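The loop described on the Algorithm slides can be sketched in a few lines of Python. This is a minimal illustration and not the code from the presentation: the function name `kmeans`, the 2-D tuple representation of points, the `max_iter` safety cap, and the choice to leave an empty cluster's center in place are all assumptions.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Basic k-means on a list of (x, y) tuples; returns (centers, labels)."""
    rng = random.Random(seed)
    # Choose k of the input points at random as the initial centers.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center
        # (squared Euclidean distance gives the same ordering as distance).
        labels = [
            min(range(k),
                key=lambda j: (p[0] - centers[j][0]) ** 2 + (p[1] - centers[j][1]) ** 2)
            for p in points
        ]
        # Update step: each center moves to the mean of its assigned points.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append((
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                ))
            else:
                # Assumption: an empty cluster keeps its previous center.
                new_centers.append(centers[j])
        if new_centers == centers:
            break  # converged: the centers did not move
        centers = new_centers
    return centers, labels
```

Calling `kmeans(points, 2)` on two well-separated groups of points should place one center in each group, although in general the result depends on the random initial centers.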

Overview Importance and use; K-means clustering algorithm; the changes and adaptations I used; results; conclusions and future work.

The new algorithm Instead of guessing the number of clusters, I use some preprocessing information to choose it. The first thing to be done is to make a histogram of pixel intensity.
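As a rough sketch of this first step, the histogram can be built by counting how many pixels have each intensity. The function name `intensity_histogram` and the assumption that the image is a flat list of integer gray levels in 0..255 are illustrative choices, not details from the slides.

```python
def intensity_histogram(pixels, levels=256):
    """Count how many pixels have each intensity in 0..levels-1."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    return hist
```

For a 120 x 120 image the counts would sum to 14400, one per pixel.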

Histogram The histogram will probably have many peaks and valleys, so the idea is to pick the correct number. My idea was basically to count the peaks on the histogram. However, this can cause problems. Any guesses?

Histogram Which peaks do I take?

Histogram I added a term called threshold. The threshold determines the cutoff point for a peak. For example, if the threshold is 150, then I only take peaks with a count of 151 or more. The threshold I chose was the max color, which was 255, divided by the number of pixels, which equaled 64. How about any other problems with a histogram?

Histogram What about neighboring peaks?

Histogram I now introduce another term to my work called span. Span covers the number of intensity values (histogram bins) to the left and right of the current one. For example, if span was set to 3, then I would check 3 values to the left and 3 values to the right and take the maximum one over the threshold.

Histogram The span guarantees that I don't have 2 neighboring intensities as 2 different centers in the picture. This seems like a reasonable idea because pixels with the same or nearly the same intensity should share the same center and are probably close together.

Find Centers Based on this information, I determine the number of peaks that are above the threshold and have no neighboring peaks within the span. This is the magic number I am using for the clusters, obtained by analyzing the histogram of the pixel intensity.
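One way to read the threshold and span rules together is sketched below. It is an interpretation, not the author's code: the function name `find_peaks`, the list-of-counts histogram, and the tie-breaking rule for equal counts inside one span (keep the leftmost bin) are assumptions.

```python
def find_peaks(hist, threshold, span):
    """Return the bins whose count exceeds threshold and is the largest
    within span bins on either side."""
    peaks = []
    for i, count in enumerate(hist):
        if count <= threshold:
            continue  # not tall enough to count as a peak
        lo = max(0, i - span)
        hi = min(len(hist), i + span + 1)
        if count < max(hist[lo:hi]):
            continue  # a taller bin sits within the span
        # Assumption: on a tie inside the span, keep only the leftmost bin.
        if any(hist[j] == count for j in range(lo, i)):
            continue
        peaks.append(i)
    return peaks
```

The number of clusters would then be `k = len(find_peaks(hist, threshold, span))`. With this reading, two accepted peaks are always more than `span` bins apart.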

Metric Now I have the number of centers (k). Start the k-means algorithm: pick k center points at random. The metric I am using is the difference in intensity; we take the absolute value of this to make sure it is positive. Assign each pixel to one of the clusters.

Reassign the cluster centers Now that we have all the pixels in a cluster, we recalculate the centers. Add up the intensities of the pixels in each cluster and divide by the number of pixels in the cluster to get the new center. This is supposed to repeat until it converges, but here I just do it 25 times.
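The intensity-only variant from the Metric and center-update slides could look roughly like this. It is a sketch under assumptions: the name `kmeans_intensity`, drawing the initial centers from the distinct intensity values present in the image, and leaving an empty cluster's center unchanged are not stated in the slides. The fixed 25 iterations and the absolute intensity difference are.

```python
import random

def kmeans_intensity(pixels, k, iterations=25, seed=0):
    """1-D k-means on gray levels with |difference| as the metric.
    Requires k <= number of distinct intensities in pixels."""
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct intensities picked at random.
    centers = [float(c) for c in rng.sample(sorted(set(pixels)), k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):  # fixed count instead of a convergence test
        # Assign each pixel to the center with the smallest absolute difference.
        labels = [min(range(k), key=lambda j: abs(p - centers[j])) for p in pixels]
        # New center = sum of the cluster's intensities / number of pixels in it.
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return centers, labels
```

Running a fixed number of passes is simple, but it gives no guarantee that the centers have stopped moving; comparing the centers before and after each pass would make that explicit.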

Overview Importance and use; K-means clustering algorithm; the changes and adaptations I used; results; conclusions and future work.

Results I found some MRI images and used ImageMagick to resize the pictures to 120 x 120.

Results Number of centers = 6

Results Number of Centers = 19

Results Number of Centers = 17

Results

I want to compare the variance of each cluster. The variance in each cluster should be about the same.
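A comparison like this could be computed as below. The sketch assumes the population variance of the pixel intensities in each cluster is the quantity of interest; the function name `cluster_variances` and the use of `None` for an empty cluster are illustrative choices.

```python
def cluster_variances(pixels, labels, k):
    """Population variance of the intensities in each of the k clusters."""
    variances = []
    for j in range(k):
        members = [p for p, lab in zip(pixels, labels) if lab == j]
        if not members:
            variances.append(None)  # assumption: no variance for an empty cluster
            continue
        mean = sum(members) / len(members)
        variances.append(sum((p - mean) ** 2 for p in members) / len(members))
    return variances
```

If the clusters are well chosen in the sense described above, the values in the returned list should be of similar size.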

Overview Importance and use; K-means clustering algorithm; the changes and adaptations I used; results; conclusions and future work.

A method to find the centers of the clusters; the parameters for threshold and span; supersampling instead of using just one pixel.