Presentation is loading. Please wait.

Presentation is loading. Please wait.

MonetDB/SQL Meets SkyServer: the Challenges of a Scientific Database Milena Ivanova, Niels Nes, Romulo Goncalves, Martin Kersten CWI, Amsterdam Presented.

Similar presentations


Presentation on theme: "MonetDB/SQL Meets SkyServer: the Challenges of a Scientific Database Milena Ivanova, Niels Nes, Romulo Goncalves, Martin Kersten CWI, Amsterdam Presented."— Presentation transcript:

1 MonetDB/SQL Meets SkyServer: the Challenges of a Scientific Database Milena Ivanova, Niels Nes, Romulo Goncalves, Martin Kersten CWI, Amsterdam Presented at SSDBM, July 2007, Banff, Canada

2 SkyServer provides public access to SDSS for astronomers, students, and wide public A project to make a map of a large part of the Universe 230 million object images 1 million spectra 4TB catalog data 9TB images

3 DBDBD’07, Eindhoven M. Ivanova et al., CWI SkyServer Schema 446 columns >370 million rows Vertical fragment of 100+ popular columns Materialized join of Photo and Spectra

4 DBDBD’07, Eindhoven M. Ivanova et al., CWI Outline MonetDB/SQL SkyServer porting lessons Query log lessons Evaluation Outlook

5 DBDBD’07, Eindhoven M. Ivanova et al., CWI MonetDB Background H 0@0 T Ra 0@0 1@0 2@0 0.0645 0.1433 0.2811 … H 0@0 T Dec 0@0 1@0 2@0 1.2079 1.0662 1.2495 … H 0@0 TUTU 1@0 2@0 14.70872 11.71277 12.02889 … RaDecU... 0.0645 0.1433 0.2811 … 1.2079 1.0662 1.2495 … 14.70872 11.71277 12.02889 … ……………… PhotoObjAll Ra BATDec BATU BAT

6 DBDBD’07, Eindhoven M. Ivanova et al., CWI MonetDB Architecture SQL MonetDB Server Tactical Optimizer MonetDB Kernel XQuery MAL function user.s3_1():void; X1:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",0); X6:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",1); X9:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",2); X13:bat[:oid,:oid] := sql.bind_dbat("sys","photoobjall",1); X8 := algebra.kunion(X1,X6); X11 := algebra.kdifference(X8,X9); X12 := algebra.kunion(X11,X9); X14 := bat.reverse(X13); X15 := algebra.kdifference(X12,X14); X16 := calc.oid(0@0); X18 := algebra.markT(X15,X16); X19 := bat.reverse(X18); X20 := aggr.count(X19); sql.exportValue(1,"sys.","count_","int",32,0,6,X20,""); end s3_1; select count(*) from photoobjall;

7 DBDBD’07, Eindhoven M. Ivanova et al., CWI SkyServer with MonetDB Goal: To provide SkyServer mirror with similar functionality using MonetDB Three phases: 1%, 10%, entire SDSS data set Can we Do better in terms of performance and functionality? Improve query processing by novel parallelism and query cracking techniques? Extend functionality to support, e.g. LOFAR?

8 DBDBD’07, Eindhoven M. Ivanova et al., CWI Portability Lessons Need for rich SQL environment (PSM) Cast to SQL:2003 standard –Replacement of data types and operations –Specific extensions ignored or replaced Avoid data redundancy –Auxiliary tables replaced by views:10% size reduction

9 DBDBD’07, Eindhoven M. Ivanova et al., CWI Spatial Search Lesson HTM (Hierarchical Triangular Mesh) –Implemented in C++, C# –Good for point-near-point and point-in- region queries Zones –Implemented in SQL –Good for point-near-point (x3) –Efficient for batch-oriented spatial join(x32) –Enables SQL optimizer usage

10 DBDBD’07, Eindhoven M. Ivanova et al., CWI Query Log Lessons Query logs important for both application and science Analysed 1.2M queries, August 2006 Spatial access prevails (83%) Small core of photo and spectro tables accessed –64% photo, 44% spectro, 27% both

11 DBDBD’07, Eindhoven M. Ivanova et al., CWI Common Patterns Limited number of query patterns –Correlation to web site interface Most popular query (25%) SELECT top 10 p.objID, p.run, p.rerun, p.camcol, p.field, p.obj, p.type, p.ra, p.dec, p.u, p.g, p.r, p.i, p.z, p.Err_u, p.Err_g, p.Err_r, p.Err_i, p.Err_z FROM fGetNearbyObjEq(195,2.5,3) n, PhotoPrimary p WHERE n.objID = p.objID;

12 DBDBD’07, Eindhoven M. Ivanova et al., CWI Query Coverage Query coverage on the sky

13 DBDBD’07, Eindhoven M. Ivanova et al., CWI Spatial Overlap 24% queries overlap Mean sequence length of 9.4, max of 6200 Overlap and equality patterns for script- based interaction Zoom in/zoom out patterns for manual interaction

14 DBDBD’07, Eindhoven M. Ivanova et al., CWI Evaluation on 1GB

15 DBDBD’07, Eindhoven M. Ivanova et al., CWI Evaluation on 100GB ‘Color-cut’ for low-z quasars SELECT g, run, rerun, camcol, field, objID, FROM Galaxy WHERE ( ( g <= 22) and (u - g >= -0.27) and (u - g < 0.71) and (g - r >= -0.24) and (g - r < 0.35) and (r - i >= -0.27) and (r - i < 0.57) and (i - z >= -0.35) and (i - z < 0.7) ); Moving asteroids SELECT objID, sqrt(power(rowv,2) + power(colv,2)) as velocity FROM PhotoObj WHERE power(rowv,2) + power(colv,2) > 50 and rowv >= 0 and colv >= 0;

16 DBDBD’07, Eindhoven M. Ivanova et al., CWI Status Staircase to the sky –1GB: done –100GB: towards completion –Entire 4TB DR6: in progress Web site

17 DBDBD’07, Eindhoven M. Ivanova et al., CWI Inspirations Self-organization vs. hard-coded zoning –Adaptive segmentation (ICDE’08) –Adaptive replication (EDBT’08) Results caching and reuse Workload-driven optimization


Download ppt "MonetDB/SQL Meets SkyServer: the Challenges of a Scientific Database Milena Ivanova, Niels Nes, Romulo Goncalves, Martin Kersten CWI, Amsterdam Presented."

Similar presentations


Ads by Google