Scientific Data Mining: Principles and applications with astronomical data. Amos Storkey, Institute for Adaptive and Neural Computation, Division of Informatics, and Institute for Astronomy, University of Edinburgh

Collaborators and Thanks Collaborative work with Nigel Hambly, Chris Williams and Bob Mann. Thanks also to many others at the Royal Observatory, Edinburgh for their help in clarifying many of the things that an astronomical outsider might misunderstand or falsely presume!

Astro-informatics Problems in astronomy increasingly require the use of machine learning, data mining and informatics techniques: detection of spurious objects; record linkage; object classification and clustering; source separation; compression; information about techniques.

Galaxy spectra James Riden, with Alan Heavens and Ben Panter. Chris Williams. Given spectra, what can be said about the generation history and metallicity of a galaxy? Data exploration techniques: ISOMAP and LLE find the data manifold and project it to a low dimension. Develop a probabilistic model for galaxy generation, then infer history and metallicity parameters from spectra.

Exploratory Data Analysis

Record Linkage Problem of linking records from different datasets. There is an ambiguity in matches. Room for new techniques.
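The ambiguity mentioned above can be illustrated with a minimal positional cross-match. This is only a sketch under stated assumptions: the function name `cross_match`, the flat (x, y) coordinates, the matching radius and the toy catalogues are all hypothetical and are not taken from the talk.

```python
import math

def cross_match(cat_a, cat_b, radius):
    """Link each (x, y) record in cat_a to all records of cat_b within radius.

    Returns a list of (index_a, [indices_b]) pairs. An empty candidate list
    is a non-match; more than one candidate is an ambiguous match.
    """
    links = []
    for i, (xa, ya) in enumerate(cat_a):
        candidates = [j for j, (xb, yb) in enumerate(cat_b)
                      if math.hypot(xa - xb, ya - yb) <= radius]
        links.append((i, candidates))
    return links

# Hypothetical toy catalogues from two epochs, in flat plate coordinates.
epoch1 = [(0.0, 0.0), (5.0, 5.0), (9.0, 1.0)]
epoch2 = [(0.1, 0.0), (5.0, 5.2), (5.1, 4.9)]
links = cross_match(epoch1, epoch2, radius=0.5)

for index_a, candidates in links:
    status = ("non-match" if not candidates
              else "unique" if len(candidates) == 1 else "ambiguous")
    print(index_a, candidates, status)
```

A brute-force scan like this is quadratic in catalogue size; at the scale of a real survey a spatial index would be needed, and choosing between ambiguous candidates is where the room for new techniques lies.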

Super-resolution Improving resolution of a single image, or combining images from different sources to provide an increased resolution. Image cleaning and characterisation. H alpha survey. Matches in short red. Examples.

Part II – Main Problem Locating junk objects in astronomical databases. Junk makes finding genuine non-matches across epochs or colours hard.

SuperCOSMOS Sky Survey Data UK, ESO and Palomar Schmidt sky survey plates. Optical: 3 colours and 2 epochs, 894 fields for each, covering the Southern sky. Digitised using SuperCOSMOS to 10 micron (0.7 arcsec). 5×10^5 to 10^7 objects on each plate. Objects and features are extracted from plates to form a catalogue of stars and galaxies and their characteristics (e.g. ellipses), but also spurious objects, e.g. from satellite tracks. Average of 2 satellite tracks per plate, with a few hundred to a few thousand objects per track. Aeroplanes, diffraction spikes, halos, scratches...

Satellite track problem Some satellite tracks tend to be recognised as a line of objects:

Optical Artefacts Can be halos about bright stars. High density of spurious points local to the star. (Almost) horizontal and (almost) vertical diffraction spikes are possible.

Spurious object characteristics Spurious objects cover all the ranges of magnitude measurements, they often (but not always) have characteristics resembling those of galaxies. In fact their characteristics are wide and various. They are not easy to detect from their characteristics alone.

Machine Learning Methods Hough Transform and Circular Hough Transform See
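As a rough illustration of the linear Hough transform named above, the sketch below lets every catalogue point vote for all (angle, distance) bins of lines passing through it, so that a line of objects shows up as a heavily populated bin. The function name, bin counts and toy points are assumptions made for this example, not the settings used on the survey data.

```python
import math

def hough_lines(points, n_theta, n_rho, rho_max):
    """Accumulate votes for lines x*cos(theta) + y*sin(theta) = rho.

    acc[t][r] counts the points consistent with angle bin t and distance bin r.
    """
    acc = [[0] * n_rho for _ in range(n_theta)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            # rho lies in [-rho_max, rho_max]; map it onto a bin index
            r = math.floor((rho + rho_max) / (2 * rho_max) * n_rho)
            if 0 <= r < n_rho:
                acc[t][r] += 1
    return acc

# Hypothetical data: ten points on the vertical line x = 3.5 plus background.
track = [(3.5, float(y)) for y in range(10)]
background = [(1.0, 8.0), (7.0, 2.0), (9.0, 9.0)]
acc = hough_lines(track + background, n_theta=180, n_rho=20, rho_max=10.0)

peak = max(max(row) for row in acc)
print("votes in bin (theta=0, rho in [3, 4)):", acc[0][13], "peak:", peak)
```

The bin width sets the narrowest line that can be picked out, which is one reason short or faint tracks are hard to find this way.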

Circular Hough Transform
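A circular Hough transform can be sketched in the same spirit: for a fixed radius, each point votes for every cell that could hold the centre of a circle passing through it. This is a minimal sketch assuming a single known radius and a unit grid; the function name and the toy halo are hypothetical.

```python
import math
from collections import Counter

def circular_hough(points, radius, n_angles=72, cell=1.0):
    """Vote for centre cells of circles of the given radius through each point.

    Each point votes at most once per cell, so a cell's count is the number
    of distinct points consistent with a centre in that cell.
    """
    votes = Counter()
    for x, y in points:
        cells = set()
        for k in range(n_angles):
            a = 2 * math.pi * k / n_angles
            cx = x - radius * math.cos(a)
            cy = y - radius * math.sin(a)
            cells.add((math.floor(cx / cell), math.floor(cy / cell)))
        votes.update(cells)
    return votes

# Hypothetical halo: twelve spurious detections on a circle of radius 4
# around a bright star at (10.5, 10.5).
halo = [(10.5 + 4 * math.cos(2 * math.pi * k / 72),
         10.5 + 4 * math.sin(2 * math.pi * k / 72)) for k in range(0, 72, 6)]
votes = circular_hough(halo, radius=4)
best_cell, best_count = votes.most_common(1)[0]
print(best_cell, best_count)
```

In practice the radius is not known in advance, so the accumulator would gain a third dimension over candidate radii.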

Hough Example: UKJ005 [Figure: Hough accumulator; axes: angle and distance from origin, from 0 to d max]

Data space corresponding to bin. However: can't find short lines; curves are problematic; background star/galaxy density changes can cause errors.

Renewal Strings Hidden-Markov renewal processes. Look at all possible line segments in terms of renewal processes. If local density is closer in signature to a satellite track than the background stars and galaxies, then flag as a satellite track.

Benefits Can use line widths thirty times narrower than with Hough. Copes with curves by using local linearity rather than restricted to global linearity. Deals with local star/galaxy density differences. Copes with partial lines, dashed lines etc. Flexible model. Can use other data (eg ellipticity) to strengthen classification. Bayesian.

Generative renewal string Can generate from model.

To use Don't use the generative model directly! Too hard. Look at all line segments. Transform the star/galaxy model to a Poisson process on the line. Run a Markov chain along each line. Simplest case: class 0 is the background process. Class 1 defines a renewal process corresponding to a scratch, satellite track etc. Processing is fully Markovian.
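The simplest case described above can be sketched as a two-state hidden Markov chain run over the gaps between successive points along one line. This is a deliberately simplified illustration, not the model from the talk: both classes are given exponential gap densities (a Poisson process is the renewal process with exponential intervals), and the rates, switching probability and prior are hypothetical values.

```python
import math

def gap_posteriors(gaps, rate_bg, rate_track, p_switch=0.05, prior_track=0.01):
    """Posterior probability that each inter-point gap belongs to class 1.

    Class 0 is the background process, class 1 a denser track process.
    Uses the normalised forward-backward recursions of a two-state HMM.
    """
    rates = (rate_bg, rate_track)

    def emit(s, g):
        # exponential density of a gap of length g under class s
        return rates[s] * math.exp(-rates[s] * g)

    trans = ((1 - p_switch, p_switch), (p_switch, 1 - p_switch))
    n = len(gaps)

    # forward pass, normalised at every step to avoid underflow
    fwd = []
    prev = (1 - prior_track, prior_track)
    for i, g in enumerate(gaps):
        if i == 0:
            a = [prev[s] * emit(s, g) for s in (0, 1)]
        else:
            a = [emit(s, g) * sum(prev[r] * trans[r][s] for r in (0, 1))
                 for s in (0, 1)]
        z = sum(a)
        prev = (a[0] / z, a[1] / z)
        fwd.append(prev)

    # backward pass, also normalised
    bwd = [(1.0, 1.0)] * n
    nxt = (1.0, 1.0)
    for i in range(n - 2, -1, -1):
        g = gaps[i + 1]
        b = [sum(trans[s][r] * emit(r, g) * nxt[r] for r in (0, 1))
             for s in (0, 1)]
        z = sum(b)
        nxt = (b[0] / z, b[1] / z)
        bwd[i] = nxt

    # combine into per-gap posteriors for class 1
    return [f[1] * b[1] / (f[0] * b[0] + f[1] * b[1])
            for f, b in zip(fwd, bwd)]

# Hypothetical gaps along one line: sparse background, a dense run, background.
gaps = [10, 12, 9, 0.5, 0.4, 0.6, 0.5, 0.3, 11, 10]
post = gap_posteriors(gaps, rate_bg=0.1, rate_track=2.0)
print([round(p, 3) for p in post])
```

Because the background rate enters as a parameter, it can be set from the local star/galaxy density on each line, which is how this kind of model copes with density differences across a plate.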

Diffraction spikes Modifications can be made for diffraction spikes: look only at certain orientations and positions.

Results Get probabilistic results. Two possibilities: Probability of a given point being a spurious point. Most probable classification of points.
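The second kind of result, the most probable joint classification, corresponds to a Viterbi pass over the same kind of chain. The sketch below assumes a simplified two-state model with exponential gap densities for both classes; the function name and all parameter values are hypothetical.

```python
import math

def most_probable_labels(gaps, rate_bg, rate_track,
                         p_switch=0.05, prior_track=0.01):
    """Viterbi decoding: the single most probable 0/1 labelling of the gaps."""
    rates = (rate_bg, rate_track)

    def log_emit(s, g):
        # log of the exponential gap density under class s
        return math.log(rates[s]) - rates[s] * g

    log_trans = ((math.log(1 - p_switch), math.log(p_switch)),
                 (math.log(p_switch), math.log(1 - p_switch)))
    score = [math.log(1 - prior_track) + log_emit(0, gaps[0]),
             math.log(prior_track) + log_emit(1, gaps[0])]
    back = []
    for g in gaps[1:]:
        step, new = [], []
        for s in (0, 1):
            # best predecessor state for arriving in state s
            r = max((0, 1), key=lambda q: score[q] + log_trans[q][s])
            step.append(r)
            new.append(score[r] + log_trans[r][s] + log_emit(s, g))
        back.append(step)
        score = new

    # trace the best path backwards from the best final state
    state = max((0, 1), key=lambda q: score[q])
    labels = [state]
    for step in reversed(back):
        state = step[state]
        labels.append(state)
    labels.reverse()
    return labels

# Hypothetical gaps along one line, with a dense run in the middle.
gaps_along_line = [10, 12, 9, 0.5, 0.4, 0.6, 0.5, 0.3, 11, 10]
labels = most_probable_labels(gaps_along_line, rate_bg=0.1, rate_track=2.0)
print(labels)
```

Per-point probabilities are useful for thresholding at a chosen confidence, while the most probable labelling gives one consistent set of flags for the whole line.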

Results Two examples. The left example is a small scratch or track in the corner of ukj005. Right is a track on a dense plate.

Further examples Further examples are available online. A flythrough movie of one plate is available as hnew3c0.avi (36MB).

Conclusions Machine learning and data mining methods are, and will continue to prove, useful with astronomical databases. Methods do not always work automatically. Some thought is needed. Circular Hough transforms and renewal strings have proven effective in locating a variety of spurious objects in astronomical databases. So far they have been run on a quarter of one colour of the SuperCOSMOS data.

Contact and URLs