SDSS Quasars Spectra Fitting N. Kuropatkin, C. Stoughton.


Introduction (Chris Stoughton)

Quasars are complex objects. A swirling cloud of gas and plasma falling into a black hole glows at many different wavelengths. Astronomers measure this spectrum of light to determine the properties of each quasar. The model we fit to the spectrum includes the following components:

- a power-law continuum, decreasing as exp(-lambda)
- a Balmer continuum due to ionized hydrogen, with a characteristic bump from 2000 to 4000 Angstroms
- strong emission lines from ionized gas, such as hydrogen, nitrogen, oxygen, and magnesium
- many faint emission lines from iron
- starlight from the galaxy that surrounds the quasar
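A minimal sketch of how such a composite model can be evaluated, covering only two of the components listed above (the continuum and the strong emission lines). Everything here is an assumption for illustration: the function names, the parameter layout, the pivot wavelength, and the Gaussian line profile are not taken from the authors' code. The continuum uses the textbook power-law form lambda^(-alpha), which is not the same as the exp(-lambda) wording on the slide. The Balmer continuum, iron lines, and host-galaxy starlight are omitted.

```python
import math


def power_law(wavelength, amplitude, alpha, pivot=3000.0):
    """Continuum flux amplitude * (wavelength / pivot) ** (-alpha).

    The pivot wavelength (in Angstroms) is a hypothetical normalization point.
    """
    return amplitude * (wavelength / pivot) ** (-alpha)


def gaussian_line(wavelength, center, width, strength):
    """Single emission line, modeled here as a Gaussian (an assumption)."""
    return strength * math.exp(-0.5 * ((wavelength - center) / width) ** 2)


def model_flux(wavelength, params):
    """Sum the continuum and every emission line at one wavelength.

    params is a hypothetical dict with keys "amplitude", "alpha", and
    "lines", where "lines" is a list of (center, width, strength) tuples.
    """
    flux = power_law(wavelength, params["amplitude"], params["alpha"])
    for center, width, strength in params["lines"]:
        flux += gaussian_line(wavelength, center, width, strength)
    return flux
```

Each additional component would add its own parameters to `params`, which is how a full model reaches hundreds of free parameters.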

We vary the values of the parameters in this model to search for the parameter set that minimizes chi-squared. Since the model includes hundreds of parameters, we used a "genetic" algorithm to find a good estimate of the parameter set with the best chi-squared.

The genetic algorithm keeps track of 100 sets of parameters. Borrowing terms from biology, we call one set of parameters a chromosome, and each parameter is a gene. We start by generating 100 random chromosomes, using reasonable ranges for the value of each gene. We calculate chi-squared for each chromosome and sort the results in order of increasing chi-squared. We then do 100 iterations of the following steps:

- save the first chromosome (the "fittest" survives)
- for the next 20 chromosomes, perturb the gene values by 1 sigma
- for the next 20 chromosomes, perturb the gene values by 5 sigma
- for the next 20 chromosomes, "breed" them by taking some genes from one parent and the rest of the genes from another parent
- remove the remaining chromosomes and replace them with randomly generated ones
- sort these "new" chromosomes in order of increasing chi-squared
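The iteration described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: "perturb by N sigma" is read as adding a Gaussian random shift whose standard deviation is N times a per-gene step size, breeding partners are drawn at random from the same group of 20, and all names (`fit`, `perturb`, `breed`, `sigmas`, `ranges`) are hypothetical. The `chi_squared` argument is any function that maps a chromosome to a number.

```python
import random

POPULATION = 100   # chromosomes tracked, as in the text
ITERATIONS = 100   # iterations, as in the text


def random_chromosome(ranges, rng):
    """Draw each gene uniformly from its (low, high) range."""
    return [rng.uniform(low, high) for low, high in ranges]


def perturb(chromosome, sigmas, scale, rng):
    """Shift each gene by a Gaussian step of width scale * sigma (assumed meaning)."""
    return [gene + rng.gauss(0.0, scale * sigma)
            for gene, sigma in zip(chromosome, sigmas)]


def breed(parent_a, parent_b, rng):
    """Take each gene from one parent or the other with equal probability."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def fit(chi_squared, ranges, sigmas, iterations=ITERATIONS, seed=0):
    rng = random.Random(seed)
    population = sorted(
        (random_chromosome(ranges, rng) for _ in range(POPULATION)),
        key=chi_squared,
    )
    for _ in range(iterations):
        new = [population[0]]                                        # fittest survives
        new += [perturb(c, sigmas, 1.0, rng) for c in population[1:21]]
        new += [perturb(c, sigmas, 5.0, rng) for c in population[21:41]]
        parents = population[41:61]
        new += [breed(c, rng.choice(parents), rng) for c in parents]
        # Replace the remaining chromosomes with fresh random ones.
        new += [random_chromosome(ranges, rng) for _ in range(POPULATION - len(new))]
        population = sorted(new, key=chi_squared)
    return population[0]
```

Because the first chromosome is always carried over unchanged, the best chi-squared in the population can never get worse from one iteration to the next.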

At the end of these iterations, we declare the first chromosome to be the estimate of the best chi-squared fit.

The Sloan Digital Sky Survey has measured the spectra of tens of thousands of quasars. Each spectral fit consumes approximately 1 hour of CPU time. We are using the Open Science Grid (OSG) to process these spectra with various implementations of this model.

Generic Grid Gofer (N. Kuropatkin)

The task of fitting QSO spectra is an ideal job for the grid:

- It is CPU bound. Execution time is about 1 hour.
- Staged-in data and parameters are only about 1 MB.
- Staged-out results are only about 2 MB.

SDSS QSO spectra fitting dataflow

The dataflow shown is very generic. About 90% of all jobs on the grid can fit this dataflow. What mainly distinguishes different grid tools is the software used on the submission host. We are using Generic Grid Gofer (GGG), a blend of an SQL database and grid middleware in the form of a Java package.

Objectives: simplicity, reliability, comprehensive bookkeeping, automatic production.

Generic dataflow in GGG

GGG production steps

- All jobs are stored in the “jobs” table.
- Available grid sites are stored in the “pool” table.
- The Job Manager takes jobs from the database, creates Condor DAG files, and submits them to sites from the pool automatically.
- There are two main parts: the Job Manager and the DAG Creator.
- All completed stages of a job are recorded in the database, together with submission time and execution time.
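The bookkeeping described above can be sketched with two tables and a small dispatch loop. GGG itself is a Java package and its real schema is not given in the slides, so the column names, the stage labels, the `free_slots` field, and the `submit` callback below are all hypothetical. Python's standard `sqlite3` module stands in for the SQL database, and the callback stands in for creating and submitting a Condor DAG.

```python
import sqlite3
import time


def init_db(conn):
    """Create hypothetical "jobs" and "pool" tables."""
    conn.executescript("""
        CREATE TABLE jobs (
            id INTEGER PRIMARY KEY,
            spectrum TEXT,
            site TEXT,
            stage TEXT DEFAULT 'new',
            submitted_at REAL,
            execution_seconds REAL
        );
        CREATE TABLE pool (name TEXT PRIMARY KEY, free_slots INTEGER);
    """)


def submit_pending(conn, submit):
    """Send each new job to a site with a free slot and record the submission."""
    submitted = []
    jobs = conn.execute(
        "SELECT id, spectrum FROM jobs WHERE stage = 'new' ORDER BY id"
    ).fetchall()
    for job_id, spectrum in jobs:
        row = conn.execute(
            "SELECT name FROM pool WHERE free_slots > 0 "
            "ORDER BY free_slots DESC LIMIT 1"
        ).fetchone()
        if row is None:          # no site has capacity; leave the rest as 'new'
            break
        site = row[0]
        submit(job_id, spectrum, site)   # stand-in for DAG creation and submission
        conn.execute(
            "UPDATE pool SET free_slots = free_slots - 1 WHERE name = ?", (site,)
        )
        conn.execute(
            "UPDATE jobs SET site = ?, stage = 'submitted', submitted_at = ? "
            "WHERE id = ?",
            (site, time.time(), job_id),
        )
        submitted.append((job_id, site))
    conn.commit()
    return submitted


def mark_done(conn, job_id, execution_seconds):
    """Record a completed job together with its execution time."""
    conn.execute(
        "UPDATE jobs SET stage = 'done', execution_seconds = ? WHERE id = ?",
        (execution_seconds, job_id),
    )
    conn.commit()
```

Keeping every stage transition in the database is what makes the production restartable: after an interruption, the manager only needs to query for jobs that are not yet done.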

The DAG creator block diagram

The DAG Creator class

- Implements the interface between the Job Manager and the grid middleware.
- Uses XML templates describing the job DAG and Condor submit files to create an abstract DAG and then a concrete DAG.
- Performs several stages of substitution of dummy parameters in the templates, using values from the environment, job description, and site description files.
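The staged substitution described above can be illustrated with a small sketch. The XML layout, the placeholder names (`WORKDIR`, `JOB_ID`, `SITE_NAME`), and both functions are hypothetical, since the actual GGG templates are not shown in the slides. The sketch also assumes that the "abstract" DAG is the template with job values filled in but no site chosen, and the "concrete" DAG is the result after site values are applied. The only real syntax relied on is the `JOB` and `PARENT ... CHILD` lines of a Condor DAGMan input file.

```python
from string import Template
import xml.etree.ElementTree as ET

# Hypothetical XML template with dummy parameters written as ${NAME}.
DAG_TEMPLATE = """
<dag>
  <job name="stagein" submit="${WORKDIR}/${SITE_NAME}/stagein_${JOB_ID}.sub"/>
  <job name="fit" submit="${WORKDIR}/${SITE_NAME}/fit_${JOB_ID}.sub"/>
  <job name="stageout" submit="${WORKDIR}/${SITE_NAME}/stageout_${JOB_ID}.sub"/>
  <edge parent="stagein" child="fit"/>
  <edge parent="fit" child="stageout"/>
</dag>
"""


def substitute_in_stages(template, *stages):
    """Apply each dict of values in turn; unknown placeholders are left intact."""
    text = template
    for values in stages:
        text = Template(text).safe_substitute(values)
    return text


def to_condor_dag(xml_text):
    """Turn the filled-in XML into the text of a Condor DAGMan input file."""
    root = ET.fromstring(xml_text.strip())
    lines = [f"JOB {job.get('name')} {job.get('submit')}"
             for job in root.findall("job")]
    lines += [f"PARENT {edge.get('parent')} CHILD {edge.get('child')}"
              for edge in root.findall("edge")]
    return "\n".join(lines)
```

A call such as `substitute_in_stages(DAG_TEMPLATE, env_values, job_values)` would give the site-independent form, and a further pass with the site's values would give the text to hand to `to_condor_dag`. Leaving unresolved placeholders untouched between passes is what allows the same template to be reused for every site in the pool.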

How can any user use the package to start their own production?

1. Install the OSG software.
2. Install the GGG package.
3. Use the Demo Application as a template to create your own production. You will need to modify 5 simple shell scripts and 5 simple XML files.
4. Create site description XML files for the sites where you want to run your jobs. There is a tool to help with this.
5. Distribute your software to those sites. See the demo application for how to do this.
6. Initialize the database. There are example programs.
7. Launch JobManager.
8. Watch how it works.

Conclusion

We have created a simple and generic tool to organize data processing on the grid. This tool was used to process 10% of the SDSS QSO spectra in about two weeks. The tool can be used for many different grid productions. We are working on the software distribution and web page. More details can be found at