GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop Status of the D1 (Event) and D2 (Spacecraft Data) Database Prototypes for DC1 Robert.

Slides:



Advertisements
Similar presentations
MICHAEL MARINO CSC 101 Whats New in Office Office Live Workspace 3 new things about Office Live Workspace are: Anywhere Access Store Microsoft.
Advertisements

Case Tools Trisha Cummings. Our Definition of CASE  CASE is the use of computer-based support in the software development process.  A CASE tool is a.
What is MySQL? MySQL is a relational database management system (A relational database stores data in separate tables rather than putting all the data.
David Adams ATLAS DIAL Distributed Interactive Analysis of Large datasets David Adams BNL March 25, 2003 CHEP 2003 Data Analysis Environment and Visualization.
LAT Data Server Workshop - 1 Jan 13-14, 2005 Tom Stephens GSSC Database Lead GSSC LAT Data Server Overview.
GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop Science Analysis Tools Design Robert Schaefer – Software Lead, GSSC.
Report Distribution Report Distribution in PeopleTools 8.4 Doug Ostler & Eric Knapp 7264.
1 Tom Stephens GSSC/GSFC Database Access and the dataSubselector Tool December 8, 2003 DC1 Kickoff Workshop.
07/14/08. 2 Points Introduction. Cluster and Supercomputers. Cluster Types and Advantages. Our Cluster. Cluster Performance. Cluster Computer for Basic.
NovaBACKUP 10 xSP Technical Training By: Nathan Fouarge
Automatic Software Testing Tool for Computer Networks ARD Presentation Adi Shachar Yaniv Cohen Dudi Patimer
6/1/2001 Supplementing Aleph Reports Using The Crystal Reports Web Component Server Presented by Bob Gerrity Head.
1 DAN FARRAR SQL ANYWHERE ENGINEERING JUNE 7, 2010 SCHEMA-DRIVEN EXPERIMENT MANAGEMENT DECLARATIVE TESTING WITH “DEXTERITY”
Christopher Jeffers August 2012
CSCI 6962: Server-side Design and Programming JDBC Database Programming.
DIRAC Web User Interface A.Casajus (Universitat de Barcelona) M.Sapunov (CPPM Marseille) On behalf of the LHCb DIRAC Team.
1 PHP and MySQL. 2 Topics  Querying Data with PHP  User-Driven Querying  Writing Data with PHP and MySQL PHP and MySQL.
Home Media Network Hard Drive Training for Update to 2.0 By Erik Collett Revised for Firmware Update.
Codeigniter is an open source web application. It occupies a very small amount of space in the memory and is most useful for developers who aim to develop.
The Pipeline Processing Framework LSST Applications Meeting IPAC Feb. 19, 2008 Raymond Plante National Center for Supercomputing Applications.
Data File Access API : Under the Hood Simon Horwith CTO Etrilogy Ltd.
Copyright © 2007, Oracle. All rights reserved. Managing Concurrent Requests.
March 3rd, 2006 Chen Peng, Lilly System Biology1 Cluster and SGE.
M1G Introduction to Database Development 6. Building Applications.
03/27/2003CHEP20031 Remote Operation of a Monte Carlo Production Farm Using Globus Dirk Hufnagel, Teela Pulliam, Thomas Allmendinger, Klaus Honscheid (Ohio.
D0 Farms 1 D0 Run II Farms M. Diesburg, B.Alcorn, J.Bakken, T.Dawson, D.Fagan, J.Fromm, K.Genser, L.Giacchetti, D.Holmgren, T.Jones, T.Levshina, L.Lueking,
Customizing your own SENSORS (site) Ethan Danahy Tufts University June 7 th, 2001.
3rd June 2004 CDF Grid SAM:Metadata and Middleware Components Mòrag Burgon-Lyon University of Glasgow.
Computer Emergency Notification System (CENS)
CMAQ Runtime Performance as Affected by Number of Processors and NFS Writes Patricia A. Bresnahan, a * Ahmed Ibrahim b, Jesse Bash a and David Miller a.
May PEM status report. O.Bärring 1 PEM status report Large-Scale Cluster Computing Workshop FNAL, May Olof Bärring, CERN.
ILDG Middleware Status Chip Watson ILDG-6 Workshop May 12, 2005.
The huge amount of resources available in the Grids, and the necessity to have the most up-to-date experimental software deployed in all the sites within.
Capabilities of Software. Object Linking & Embedding (OLE) OLE allows information to be shared between different programs For example, a spreadsheet created.
Framework for MDO Studies Amitay Isaacs Center for Aerospace System Design and Engineering IIT Bombay.
6/1/2001 Supplementing Aleph Reports Using The Crystal Reports Web Component Server Presented by Bob Gerrity Head.
Overview of DAQ at CERN experiments E.Radicioni, INFN MICE Daq and Controls Workshop.
The Software Development Process
Peter Chochula ALICE Offline Week, October 04,2005 External access to the ALICE DCS archives.
 Registry itself is easy and straightforward in implementation  The objects of registry are actually complicated to store and manage  Objects of Registry.
Content Management System/ Web Quality Initiative Administrative Departments.
1 MSRBot Web Crawler Dennis Fetterly Microsoft Research Silicon Valley Lab © Microsoft Corporation.
Module: Software Engineering of Web Applications Chapter 2: Technologies 1.
Marcelo R.N. Mendes. What is FINCoS? A set of tools for data generation, load submission, and performance measurement of CEP systems; Main Characteristics:
GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop Science Analysis Tools Design Robert Schaefer – Software Lead, GSSC.
NSF DUE ; Wen M. Andrews J. Sargeant Reynolds Community College Richmond, Virginia.
SPI NIGHTLIES Alex Hodgkins. SPI nightlies  Build and test various software projects each night  Provide a nightlies summary page that displays all.
1 The Software Development Process ► Systems analysis ► Systems design ► Implementation ► Testing ► Documentation ► Evaluation ► Maintenance.
Oct HPS Collaboration Meeting Jeremy McCormick (SLAC) HPS Web 2.0 OR Web Apps and Databases (Oh My!) Jeremy McCormick (SLAC)
+ Publishing Your First Post USING WORDPRESS. + A CMS (content management system) is an application that allows you to publish, edit, modify, organize,
This material is based upon work supported by the U.S. Department of Energy Office of Science under Cooperative Agreement DE-SC Michigan State.
CIP HPC CIP - HPC HPC = High Performance Computer It’s not a regular computer, it’s bigger, faster, more powerful, and more.
T3g software services Outline of the T3g Components R. Yoshida (ANL)
Accounting in DataGrid HLR software demo Andrea Guarise Milano, September 11, 2001.
June 27-29, DC2 Software Workshop - 1 Tom Stephens GSSC Database Programmer GSSC Data Servers for DC2.
Active-HDL Server Farm Course 11. All materials updated on: September 30, 2004 Outline 1.Introduction 2.Advantages 3.Requirements 4.Installation 5.Architecture.
1 BCS 4 th Semester. Step 1: Download SQL Server 2005 Express Edition Version Feature SQL Server 2005 Express Edition SP1 SQL Server 2005 Express Edition.
 Project Team: Suzana Vaserman David Fleish Moran Zafir Tzvika Stein  Academic adviser: Dr. Mayer Goldberg  Technical adviser: Mr. Guy Wiener.
Scientific Linux Inventory Project (SLIP) Troy Dawson Connie Sieh.
A Web Based Job Submission System for a Physics Computing Cluster David Jones IOP Particle Physics 2004 Birmingham 1.
GLAST Science Support Center November 8, 2005 GUC Action Item #15 AI#15: Pre-Launch GI Proposal Tools David Band (GSSC/JCA-UMBC)
AIRS Meeting GSFC, February 1, 2002 ECS Data Pool Gregory Leptoukh.
Understanding and Improving Server Performance
Simulink Interface Layer (SIL)
LOCO Extract – Transform - Load
Database application MySQL Database and PhpMyAdmin
Cloud based Open Source Backup/Restore Tool
Introduction to Ms-Access Submitted By- Navjot Kaur Mahi
David Band – Science Lead, GSSC
Exploring the Power of EPDM Tasks - Working with and Developing Tasks in EPDM By: Marc Young XLM Solutions
Presentation transcript:

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop Status of the D1 (Event) and D2 (Spacecraft Data) Database Prototypes for DC1 Robert Schaefer – Software Lead, GSSC

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop -- 2 Databases Outline Database Definitions review EGRET Databases DC1 plans D1/D2 Design Decisions D1 and access Prototype Architecture D2 D1E & D2E Future

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop -- 3 Database definition D1 - Event Database: LAT Event Data queryable by region of the sky, energy, time, Earth azimuth, etc. A query results in an FT1 ) file which contains all data matching the selection. –Actually two databases - a separate photon database (called photon database) and one containing all events (called event database). D2 - Spacecraft Database: LAT pointing, livetime, and mode history data queryable by (most commonly) time range, spacecraft mode, and other spacecraft pointing parameters and returns the data in an FT2 file ( )

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop -- 4 Database Extractor Utilities Need to create “extractor” utilities, i.e., tools to send the query to the database and retrieve the data from it. There is also a “User level data extraction utility”. This is for refining data queries locally. This tool also has added benefits: –load is moved off of the database server. –Home version of D1/U1! Nomenclature: –U1 Event Data extractor –U2 User level data extraction utility –U3 Pointing/livetime history extractor

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop -- 5 EGRET Databases EGRET databases for analysis tools: –D1E EGRET photon database –D2E EGRET pointing history database Not necessary to force EGRET data strictly into FT1 and FT2 formats to add the analysis capability - Easier for implementing EGRET analysis.

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop -- 6 DC1 Plans Working prototypes of Databases (D1, D1E, D2, and D2E) for DC1 User friendly extraction capabilities (U1, U1E, and U3, U3E) Possibly a stripped down U2 sub-selector tool which can also handle EGRET data. For DC1, a user will be able to log into the database web page, make a query, and ftp the data as FT1 and FT2 files to a local disk for analysis. All of these tasks will be done with the prototype architecture that is being discussed here.

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop -- 7 D1/D2 Design Decisions D1: “Database and Related Utilities working group” settled on using a beowulf cluster to do parallel searches of time- ordered data in FITS files (using the CFITSIO library). D2: Since we are already writing all the software to search FITS files for D1, it was sensible to make D2 do the same. However, since we expect the vast majority of queries to be by time range only, we do not need a beowulf for this database.

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop -- 8 GLAST photon database access Head Node Slave nodes Staging Beowulf/Cluster Web/ftp server ftp stager FITS data ingest LAT DPF SSL Queue Manager Database Server DTS SSL hosts processes disks local client search X-mounted disks coordinate 1...5

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop -- 9 Beowulf Cluster Head 1 N 2 FITS data FITS data FITS data … Staging 1000BaseT data, 100BaseT messages FITS Data list RAID 0 SCSI Database Server hosts processes disks coordinate search X-mounted disks SSL Note: only single disks (no RAID) for DC1 prototype

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop Beowulf Status Development beowulf for prototype purchased (now in boxes at SSC) –5 slave + 1 head nodes (dual processor AMD, 2 ethernet cards, 73 Gb 15k rpm SCSI disk) –All nodes rack mounted, with kvm switch. Beowulf perl-based search code prototype presented at June 2002 ST workshop converted to C, and improved to use better communication and process handling. Tested on beowulf at UNM with PIII / 1 GHz processors. Next, get that code running on GSFC beowulf and then on SLAC beowulf.

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop Queue Manager In a RDBMS system, QM would be part of the DBMS server. Queue Manager is a large event loop which: –Tracks state of all submitted queries –Validates queries –Assigns priority to query (based on size and submission type a batch queue will be enabled) –Sends queries to database by priority and time in. –Checks for timeouts –Handles communication with query clients, database, and stager. –Logs all requests Beowulf + Queue Manager = “D1”

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop Stager Reformats output of Database into proper FITS files. Moves data to ftp accessible area. Works under direction of queue manager. We would need a stager even if we were using an RDBMS

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop D2 Design is just like D1, only there is no beowulf. Communication and searching is done on a single node which mounts the data disk.

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop D1E, D2E D1E can run within D1 on beowulf. Choosing EGRET data was already available with a command line option on the proto-prototype beowulf presented at last June’s ST workshop. D2E can also easily run within framework of D2. Note: parameters for EGRET are not all the same as for GLAST, so the stagers must be different. Fun with MySQL. Since the amount of data in EGRET’s pointing data is so small (40 Mb - see ) we loaded it into a MySQL database. D2E has been implemented this way (web page front end, D2E stager, MySQL table), but there are still some bugs in it.

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop Many Worlds of U1 and U3 For DC1, –U1 and U3 are the web pages that allow database querying. –Also will have the local client to send queries directly to the Queue Manager. (This tool could be an exclusive LAT team tool for use with the SLAC Bewoulf.) Later, U1 and U3 will expand to include local tools that will –allow a user to construct the query fill in the Web page form –read the resulting web pages and –ftp the resulting data back to the person’s local host.

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop Status Beowulf prototype search code written - needs testing Queue Manager nearly out of design phase - coding about to begin Stager, Ingest programs not yet written but should be easy to do. D2 search engine (without beowulf) needs to be done - but should also be straightforward. Test D1 with 389 days of photon event data expanded to FT1 size. Implementation of “E” versions easy D2E as MySQL almost ready for querying Use Cases being updated.

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop To Do List and Issues Queue Manager is complicated will take a couple of months to have it working. Beowulf needs to be setup and have software installed and debugged. Stager, Ingest programs need to be written. Issues for future –Should single QM handle D1 and D2? –Just what parameters should be query-able via web page? –Get performance benchmarks with newer hardware –How big is D2 (how often do we need position information updates?) –Use of pixelization of data for D1 (to improve search speed).

GLAST Science Support CenterJuly, 2003 LAT Ground Software Workshop GLAST D1 Photon Database The main science product database will be a Beowulf cluster of machines which will do parallel spatial searching of data stored in FITS files. Why not use a standard DBMS? –Data access will be simple read-only queries; we do not need many of the features of a DBMS (database management system). –DBMS power comes with an overhead. We benchmarked search speeds of FITS files using CFITSIO and three DBMS systems - searching FITS files was fastest. (same results found by ASTROGRID project in the UK - –Database will be stored in FITS format - desired by HEASARC, and flexible - can easily accommodate modifications of data content. –Reprocessing will replacing FITS files rather than finding and deleting old photons and inserting new photons. –Downside is that we have to write search and interface software, but we are well on the way.