The VONeural team Progetto S.Co.P.E. – WP4 The Virtual Observatory and the PON-SCOPE The VO-Neural Team G. Longo (Principal Investigator) M. Brescia (Project.


1 The VONeural team
Progetto S.Co.P.E. – WP4: The Virtual Observatory and the PON-SCOPE
The VO-Neural Team:
– G. Longo (Principal Investigator)
– M. Brescia (Project Manager)
– S. Cavuoti (applications)
– A. Corazza (models and algorithms)
– R. D’Abrusco (applications)
– G. d’Angelo (documentation, GRID)
– N. Deniskina (GRID – VO interface)
– M. Garofalo (applications)
– O. Laurino (system, applications)
– A. Nocella (UML software engineering)
– G. Riccio (applications)
– S. Pardi
External Members:
– C. Donalek (Caltech)
– G. Djorgovski (Caltech)

2 Summary
1. What is the Virtual Observatory, and its international background
2. Why the Virtual Observatory is so important for the future of cosmology
3. Applications already ported under SCOPE
Astronomy has become an immensely data-rich field:
– Detector evolution (plates to digital to mosaics)
– Telescope evolution
– Space instruments
– From 1 MB/night to 1 TB/night
– Heterogeneous data + metadata

3 The VLT Survey Telescope
– 100 GB/night
– 2.6 meter
– 0.021”/pxl
– 16k x 16k
[Diagram labels: Digital libraries; Follow-Up Telescopes and Missions; Data Services; Data Mining and Analysis, Target Selection; Secondary Data Providers; Results; V.O]

4 The Virtual Observatory
– Data Gathering (e.g., from sensor networks, telescopes…)
– Data Farming: storage/archiving; indexing, searchability; data fusion, interoperability
– Data Mining (or Knowledge Discovery in Databases, KDD): pattern or correlation search; clustering analysis, automated classification; outlier / anomaly searches; hyperdimensional visualization
– Data understanding: computer-aided understanding, KDD, etc.
– New Knowledge
[Side labels: Database technologies; Key mathematical issues; Ongoing research]
Users: >>1000. Total data: ca. 1 PByte.

5 Data Mining algorithms scale very badly:
– Clustering: ~ $N \log N \rightarrow N^2$, ~ $D^2$
– Correlations: ~ $N \log N \rightarrow N^2$, ~ $D^k$ ($k \geq 1$)
– Likelihood, Bayesian: ~ $N^m$ ($m \geq 3$), ~ $D^k$ ($k \geq 1$)
The scientific exploitation of a multiband, multi-epoch (K epochs) survey requires searching for patterns, trends, etc. among N points in a D x K dimensional parameter space, with $N > 10^9$, $D \gg 100$, $K > 10$.
Cf. isophotal, Petrosian, and aperture magnitudes, concentration indexes, shape parameters, etc.
[Figure labels: Band 1, Band 2, Band 3; V.S.T.]
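To get a feel for these scaling laws, the rough sketch below plugs in the survey sizes quoted on the slide (N = 10^9, D = 100). It assumes unit constant factors, a base-2 logarithm, and the smallest allowed exponents (k = 1, m = 3); none of these are given in the slides, and the function name `op_counts` is purely illustrative.

```python
import math


def op_counts(n: int, d: int, k: int = 1, m: int = 3) -> dict:
    """Order-of-magnitude operation counts for the scaling laws on the slide.

    Assumptions (not from the slides): constant factors of 1 and log base 2.
    """
    n_log_n = n * math.log2(n)
    return {
        "clustering_best": n_log_n * d**2,      # N log N, D^2
        "clustering_worst": n**2 * d**2,        # N^2, D^2
        "correlations_best": n_log_n * d**k,    # N log N, D^k
        "correlations_worst": n**2 * d**k,      # N^2, D^k
        "likelihood": n**m * d**k,              # N^m, D^k
    }


counts = op_counts(n=10**9, d=100)
for name, value in counts.items():
    # Print only the power of ten; the constants are unknown anyway.
    print(f"{name:20s} ~ 10^{math.log10(value):.1f} operations")
```

Even with these optimistic exponents, the worst-case clustering cost and the likelihood cost differ by several orders of magnitude, which is the point the slide makes about needing distributed (GRID) computing.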

6 Tools in the VONeural
Middleware:
– Astrogrid Model (Nocella)
– Interface between Virtual Observatory and GRID computing (GRID-launcher; Deniskina, D’Angelo)
Models:
– Multi Layer Perceptron (VONeural_MLP; Donalek, Cavuoti, Skordovski)
– Support Vector Machines (VONeural_SVM; Cavuoti, Russo)
– Probabilistic Principal Surfaces (VONeural_PPS; Garofalo)
Tools:
– Segmentation of astronomical images (VONeural_Ext; Laurino)

7 Scientific Applications
– Data mining in multiparametric spaces (supervised and unsupervised): photometric redshifts (MLP, SVM); search for candidate quasars and AGN (PPS, NEC); galaxy groups and clusters
– CMB simulations of cosmic string signatures (in collaboration with Moscow University)
– Extraction of catalogues from astronomical images (INAF + Caltech)
– VST pipeline for distant clusters (INAF + Caltech)

8 Application 1 – VONeural_MLP photometric redshifts
Photometric redshifts (phot-z) are an alternative way to derive redshifts (i.e. distances) of extragalactic objects: less accurate than spectroscopic redshifts, but much more convenient in terms of computing power and observing time.

9 Phot Z for SDSS General Galaxy sample
SDSS-DR4/5 – GG
Training / validation / test set: 60%, 20%, 20%
– MLP, 1(5), 1(18): 0.01 < z < 0.25 | 0.25 < z < 0.50; 99.6% accuracy
– MLP, 1(5), 1(23): σ_rob = 0.206
– MLP, 1(5), 1(24): σ_rob = 0.234
– Interpolation of systematic errors
– At least 30 experiments (10-12 h each)
– Training on 350,000 objects, 12 features
– Results for 32,000,000 objects
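The 60% / 20% / 20% training / validation / test partition quoted on this slide can be sketched as a shuffled index split. This is only an illustration: the slides do not say how the VONeural_MLP tool draws its subsets, and the function name, seed, and catalogue size of 1000 below are hypothetical.

```python
import random


def split_indices(n, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle the indices 0..n-1 and cut them into train/validation/test.

    The shuffle and the fixed seed are assumptions made for reproducibility;
    the last subset takes whatever remains after the first two cuts.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(fractions[0] * n)
    n_valid = round(fractions[1] * n)
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]
    return train, valid, test


# Hypothetical catalogue of 1000 objects.
train, valid, test = split_indices(1000)
print(len(train), len(valid), len(test))
```

Keeping the three subsets disjoint is what makes the quoted accuracy meaningful: the network is tuned on the validation set and scored on objects it has never seen.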

10 Photometric redshifts for 30 million SDSS galaxies
σ_z = 0.02

11 Two types of compact groups
Spatial clustering in phot_z space reveals two types of groups:
– Compact and isolated
– Loose and not embedded in larger structures
95% of SKG have a large fraction of E-type galaxies, f_150(E) ≥ 0.5.

12 Looking for AGN candidates
– Different orientations
– Different parameters become significant
– Different clusters in parameter space
But still the same object!

13 PPS: dimensionality reduction (classification of correlated, nonlinear data)
[Figure labels: 3-D PCA; PPS]

14 Negative entropy clustering

15 NEC: a matter of Gaussians
Negative entropy clustering is a clustering method based on the “neg-entropy” NegE, a measure of the non-Gaussianity of a variable.
– If A is Gaussian, then NegE(A) = 0.
– Given a threshold d: if NegE(A U B) < d, then clusters A and B are replaced by cluster A U B.
[Figure labels: Not replaced! / Replaced!]
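The merge rule on this slide can be illustrated with a one-dimensional toy version. The slides do not say how NegE is estimated, so the sketch below swaps in a standard moment-based approximation (skewness squared over 12 plus excess kurtosis squared over 48, computed on the standardized sample); the function names, the threshold d = 0.02, and the synthetic data are all hypothetical.

```python
import random
import statistics


def negentropy(sample):
    """Moment-based negentropy approximation for a 1-D sample.

    Assumption: J ~ skew^2 / 12 + excess_kurtosis^2 / 48, which is close to
    zero for Gaussian data. This is not necessarily the estimator used by NEC.
    """
    mean = statistics.fmean(sample)
    std = statistics.pstdev(sample)
    z = [(x - mean) / std for x in sample]
    skew = statistics.fmean(v**3 for v in z)
    excess_kurtosis = statistics.fmean(v**4 for v in z) - 3.0
    return skew**2 / 12 + excess_kurtosis**2 / 48


def try_merge(a, b, d):
    """Return the union A U B if NegE(A U B) < d, otherwise None."""
    union = a + b
    return union if negentropy(union) < d else None


rng = random.Random(42)
a = [rng.gauss(0.0, 1.0) for _ in range(2000)]
b = [rng.gauss(0.0, 1.0) for _ in range(2000)]   # same Gaussian as a
c = [rng.gauss(10.0, 1.0) for _ in range(2000)]  # well-separated Gaussian

print("a, b merged:", try_merge(a, b, d=0.02) is not None)
print("a, c merged:", try_merge(a, c, d=0.02) is not None)
```

Two clusters drawn from the same Gaussian give a union that is still nearly Gaussian, so they are replaced by their union; two well-separated clusters give a bimodal union with large negentropy and stay apart.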

16 [Workflow diagram labels: SDSS, UKIDSS; preprocessing; clustering; labeling; BoK; results; PPS; NEC; dendrogram; cluster optimization]
1 experiment: ca. 11 days

17 SpecClass: 0 | 1 | 2 | 3 | 4 | 5 | 6
– PPS: we select clusters by associating latent variables on the sphere with sources.
– NEC: the number of clusters after the aggregation is determined by “cluster optimization”. This leads to a proper binning of the parameter space.

18 Application 2 with SVM
Best result: 81.5%
PON-SCOPE GRID infrastructure (110 nodes, PON NA-CA-CT)
[Plot axes: lg 2 (gamma), lg 2 (C)]

19 SDSS spectroscopic subsample of confirmed QSO (specclass = 4 & 6); UKIDSS HO-QSO’s
Colours used for all these experiments were calculated using adjacent bands: u−g, g−r, r−i, i−z for the optical bands, and Y−J, J−H, H−K for the near-infrared ones.
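As a small illustration of the adjacent-band colours listed on this slide, the sketch below takes a dictionary of magnitudes and returns the differences between neighbouring bands. The function name and the magnitude values are invented for the example and are not taken from SDSS or UKIDSS.

```python
OPTICAL_BANDS = ["u", "g", "r", "i", "z"]
NEAR_IR_BANDS = ["Y", "J", "H", "K"]


def adjacent_colours(mags, bands):
    """Return colours (magnitude differences) between neighbouring bands."""
    return {f"{a}-{b}": mags[a] - mags[b] for a, b in zip(bands, bands[1:])}


# Made-up magnitudes for a single source, for illustration only.
mags = {"u": 19.5, "g": 19.0, "r": 18.75, "i": 18.5, "z": 18.25,
        "Y": 17.75, "J": 17.5, "H": 17.0, "K": 16.5}

features = {**adjacent_colours(mags, OPTICAL_BANDS),
            **adjacent_colours(mags, NEAR_IR_BANDS)}
for name, value in features.items():
    print(f"{name}: {value:+.2f}")
```

Nine magnitudes yield seven colours, four optical and three near-infrared, matching the list on the slide.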

20 Application 2 with MLP
The experiments were carried out by selecting only the objects in the catalogue of G. Sorrentino et al. (2006) (z between 0.05 and 0.095) that were labelled as Type 1 or Type 2. Only objects that are certainly AGN were selected. The dataset consisted of 1570 objects: Type 1 objects were labelled 1 and Type 2 objects 0.
The best result obtained was:
– Total efficiency: e = 99.4%
– Type 1 efficiency: e_type1 = 98.4%
– Type 2 efficiency: e_type2 = 100%
– Type 1 completeness: c_type1 = 100%
– Type 2 completeness: c_type2 = 98.9%
Confusion matrix:
            1 (net)   0 (net)
1 (known)   126       0
0 (known)   2         186
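The percentages on this slide can be re-derived from the confusion matrix. The sketch below reads the flattened table as 126 and 0 in the "known 1" row and 2 and 186 in the "known 0" row. That reading is an interpretation, but it reproduces every quoted figure. Efficiency is taken as correct assignments over all objects the network assigned to a class, and completeness as correct assignments over all objects known to belong to it; the function name is hypothetical.

```python
def efficiency_and_completeness(matrix):
    """Compute per-class efficiency and completeness from matrix[known][net]."""
    labels = list(matrix)
    total = sum(matrix[k][n] for k in labels for n in labels)
    correct = sum(matrix[c][c] for c in labels)
    result = {"total_efficiency": correct / total}
    for c in labels:
        assigned_to_c = sum(matrix[k][c] for k in labels)  # column sum
        known_as_c = sum(matrix[c][n] for n in labels)     # row sum
        result[f"efficiency_{c}"] = matrix[c][c] / assigned_to_c
        result[f"completeness_{c}"] = matrix[c][c] / known_as_c
    return result


# Rows: known class; columns: class assigned by the network (1 = Type 1, 0 = Type 2).
confusion = {1: {1: 126, 0: 0},
             0: {1: 2, 0: 186}}

metrics = efficiency_and_completeness(confusion)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

The matrix sums to 314 objects, which is 20% of the 1570-object dataset, so it is presumably the test subset.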

21 SCoPE Workshop - Status of the project and of the Work Packages
Sala Azzurra, Monte Sant’Angelo university campus, 21-2-2008
THE END

