NGAS – The Next Generation Archive System Jens Knudstrup NGAS The Next Generation Archive System.

Presentation transcript:


Motivation
Motivation for NGAS:
- Handle a huge volume of data streams in real time.
- Reduce operational costs (manpower).
- Decrease expenses in general.
- Provide online and offline processing capabilities.
- Ease integration of the archive facility with external clients/applications.
- Provide a common concept for the online archive and the long-term storage facilities (NGAS OLAS + ASTO + Jukebox SW + more). Note: no plan to replace OLAS for now.
- Simplify and unify the overall infrastructure of the archive system.
- Increase data security.

Main Objectives
Main Objectives of NGAS:
Provide an archive facility with services for handling all stages in the lifetime of data files:
- Archiving files (+ on-the-fly checking and processing).
- Retrieving & on-the-fly processing of files.
- Ensuring data consistency.
- Providing services for managing data.
- (Executing complex, parallel data processing - TBD)
In addition, provide a system:
- Which is adaptable to specific contexts.
- With high performance and scalability.

NGAS: History
History of NGAS:
- April 2001: Project started.
- Mid June 2001: First operational prototype.
- June 2001: Review + approval of design/concept.
- Beginning July 2001: Installation/commissioning at La Silla (2.2m/WFI).
- Mid July 2001: Entered operation at La Silla.
- August 2001: Started operation of Garching NGAS Cluster.
- February 2001: Upgrade from Suse to RedHat Linux.
- August 2003: Installation/commissioning at Paranal (VLTI).
- January 2004: Installation of second archive system for 3.6m/LS.
- March 2004: First integration of NGAS on new HW (SATA).
- September 2004: First tests using NGAS together with RAID5 Arrays.
- September 2004: Archiving of HARPS pipeline products.
- December 2004: Archiving of WFCAM frames from Cambridge/UK.

NGAS: Components
Main Components of the NGAS Project:
1. NGAS SW – NG/AMS (Next Generation Archive Management System).
2. NGAS WEB Interfaces.
3. HW – (low cost) PCs with removable ATA disks.
4. NGAS OS (Linux).
5. NGAS Utilities.
6. NGAS Installation and Configuration Tools.

NG/AMS: Basic Concepts
Basic Concepts of the NGAS SW (NG/AMS):
- NG/AMS is a platform/framework providing basic services.
- No information is hard-coded to support specific types of data: NG/AMS does not know what, e.g., a FITS file is.
- No information is hard-coded to support specific HW configurations.
- The specific behavior and the specific knowledge have to be added to the NGAS system, which makes it customizable.
- Based on standard protocols and formats wherever possible, so it can be used as a building block.
- Simple: advanced features can be added in front-end applications that give clients a different view of the data and provide specific services.

NG/AMS: Main Features/1
Main Features of NG/AMS (1):
- Multi-threaded server.
- Standard communication protocol (HTTP) + HTTP Authentication.
- Data file archiving via Push and Pull Techniques.
- Subscription Service including filter mechanism.
- DB synchronization (DB Snapshot Feature).
- Easy adaptation to different kinds of DBMS (ANSI SQL Engine/DB Driver).
- Flexible/adaptable due to usage of 10 different kinds of plug-ins.
- Many configurable parameters.
- XML information exchange.
- Notification Service.

NG/AMS: Main Features/2
Main Features of NG/AMS (2):
- Advanced logging service (Verbose, Local Log File, Syslog).
- Background Data Consistency Checking.
- Operation in Cluster Mode.
- Transparent data retrieval & on-the-fly processing.
- APIs in ANSI-C and Python + two client applications based on these.
- Archive Client for secure and simple remote data file archiving.
- Many commands to interact with and control the system.
- Portable.
- Unit/Functional Tests.

NG/AMS: Server
[Figure: NG/AMS Server diagram. Labels: Data Provider (Data Provider Host); Data Requestor (Data Requestor Host); Info Requestor (Info Requestor Host); NGAS DB (DBMS Host); Operations; UNIX Sys Logs; Log; Stdout; NG/AMS Server (NGAS Host) with Main Disks Array, Replication Disk Array and NG/AMS Configuration; Archive Pull Request; Archive Push Request (HTTP POST Request); Data Subscriber Client (Data Subscriber Host); NG/AMS Server (NGAS Subscriber Host), reached via HTTP POST Request.]

NG/AMS: Storage Media Infrastructure
Basic Infrastructure of Storage Media: [figure]

NG/AMS: XML Information Exchange
Interprocess Data Exchange:
- Most information exchanged between NG/AMS Servers, and between the NG/AMS Server and clients, is based on XML.
- Example, NgasDiskInfo Document (NG/AMS Status XML Document):
<Status Date=" T08:40:23.350" HostId="acngast1" Message="Disk status file" Version="v2.0-Beta2/ T09:22:53"/>
<DiskStatus Archive="ESO-ARCHIVE" AvailableMb="32300" BytesStored=" Checksum="" Completed="0" CompletionDate=" DiskId="IC35L040AVER07-0-SXPTX093675" InstallationDate=" T09:48: LogicalName="FITS-M Manufacturer="IBM" NumberOfFiles="163 TotalDiskWriteTime=" " Type="MAGNETIC DISK/ATA"/>
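As a rough sketch of how a client might read such a status document, the snippet below parses a well-formed example with Python's standard `xml.etree.ElementTree`. The wrapper element `NgamsStatus` and the attribute values are placeholders chosen for illustration; they are assumptions, not the actual NG/AMS schema.

```python
import xml.etree.ElementTree as ET

# Illustrative document only: the root element name and the values are assumed.
SAMPLE = """
<NgamsStatus>
  <Status HostId="acngast1" Message="Disk status file"/>
  <DiskStatus Archive="ESO-ARCHIVE" AvailableMb="32300" Completed="0"
              Manufacturer="IBM" Type="MAGNETIC DISK/ATA"/>
</NgamsStatus>
"""

def parse_disk_status(xml_text):
    """Return (host id, disk attribute dict) from a status document."""
    root = ET.fromstring(xml_text.strip())
    status = root.find("Status")
    disk = root.find("DiskStatus")
    info = dict(disk.attrib)
    info["AvailableMb"] = int(info["AvailableMb"])  # numeric fields arrive as strings
    info["Completed"] = info["Completed"] == "1"    # "0"/"1" flag to bool
    return status.get("HostId"), info

host, disk = parse_disk_status(SAMPLE)
print(host, disk["AvailableMb"], disk["Completed"])
```

Because every value is carried as an XML attribute, a client only needs a generic XML parser plus a few type conversions.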

NG/AMS: HTTP Command Interface
telnet acngast
Trying
Connected to acngast1.
Escape character is '^]'.
GET STATUS HTTP/ OK
<Status Date=" T14:59:42.724" HostId="acngast1" Message="Successfully handled command STATUS" State="ONLINE" Status="SUCCESS" SubState="IDLE" Version="v2.0-Beta2/ T09:22:53"/>
Connection closed by foreign host.

NG/AMS: DB Synchronization
DB Synchronization:
- NGAS DBs replicated from Paranal/La Silla to Garching (unidirectional).
- Synchronization between the DBs of the various NGAS sites is also carried out by NGAS.
- NG/AMS maintains a snapshot (DBM) on each disk with info about the files stored on it.
- The local DB is synchronized with this info when the disk reappears on a site.
- The DB Snapshot can be used as a table of contents for the disk.
[Figure labels: LS NGAS DB (La Silla); PAR NGAS DB (Paranal); PAR NGAS DB (Garching); DB Snapshot; NG/AMS DB Synchronization; DBMS Synchronization (Sybase).]
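The snapshot idea can be sketched in a few lines of Python. This is a minimal illustration, assuming a DBM file on the disk that holds one JSON record per file and a local DB modeled as a plain dictionary. The record layout and the function names are hypothetical and do not reflect the NG/AMS implementation.

```python
import dbm.dumb
import json
import os
import tempfile

def write_snapshot(path, files):
    """Store one record per file in a DBM snapshot kept on the disk itself."""
    with dbm.dumb.open(path, "c") as snap:
        for file_id, meta in files.items():
            snap[file_id] = json.dumps(meta)

def sync_from_snapshot(path, local_db):
    """Copy entries the local DB is missing; return the IDs that were added."""
    added = []
    with dbm.dumb.open(path, "r") as snap:
        for key in snap.keys():
            file_id = key.decode()
            if file_id not in local_db:
                local_db[file_id] = json.loads(snap[key])
                added.append(file_id)
    return sorted(added)

with tempfile.TemporaryDirectory() as disk:
    snapshot = os.path.join(disk, "snapshot")
    write_snapshot(snapshot, {"FILE-1": {"size": 10}, "FILE-2": {"size": 20}})
    site_db = {"FILE-1": {"size": 10}}  # the disk arrives at a site that knows one file
    added = sync_from_snapshot(snapshot, site_db)
    print(added)
```

Keeping the table of contents on the disk itself means a site can register a newly arrived disk without contacting the site that wrote it.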

NG/AMS: Plug-Ins
NG/AMS Plug-Ins:
- Ten different kinds of plug-ins provided. These make it possible to adapt the system to different kinds of hardware and different types of data – nothing is hard-coded:
  1. Online Plug-In.
  2. Offline Plug-In.
  3. Data Archiving Plug-In.
  4. Checksum Plug-In.
  5. Data Processing Plug-In.
  6. Registration Plug-In.
  7. Label Printer Plug-In.
  8. Filter Plug-In.
  9. Suspension Plug-In.
  10. Wake-Up Plug-In.
- Standard plug-ins delivered with the system. Possible to replace these or add new plug-ins when needed.
- The plug-ins delivered with a distribution of NGAS should be viewed as belonging to the core of the system when it comes to testing.
- A normal user does not need to know about the plug-ins used.

NG/AMS: Plug-Ins
Data Archiving Plug-In (DAPI) – Basic Functioning:
[Figure labels: NG/AMS Server; DAPI; Data File; NGAS DB; Target Storage Set; Main Disk (Staging Area, Storage Area, Bad Files Area, NgasDiskInfo); Replication Disk (Storage Area).]
1. Archive Request.
2. Reception in Staging Area.
3. DAPI Invocation.
4. Data Checking/Processing, Parameter Extraction.
5. DAPI Return Status.
6. Storage of Main File in Final Location.
7. DB Update, Main File.
8. Replication of File.
9. DB Update, Replication File.
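The nine steps can be followed in a short Python sketch. It is an illustration only: the plug-in signature, the directory layout, the dictionary standing in for the NGAS DB and the use of CRC32 as checksum are all assumptions, not the actual NG/AMS plug-in interface.

```python
import os
import shutil
import tempfile
import zlib

def example_dapi(staged_path):
    """Hypothetical plug-in: checks the file and extracts parameters."""
    with open(staged_path, "rb") as fh:
        data = fh.read()
    if not data:
        return {"ok": False}
    return {"ok": True, "size": len(data), "checksum": zlib.crc32(data)}

def archive(data, file_id, areas, db, dapi=example_dapi):
    """Follow steps 1-9 for one Archive Request."""
    staged = os.path.join(areas["staging"], file_id)      # 2. reception in Staging Area
    with open(staged, "wb") as fh:
        fh.write(data)
    result = dapi(staged)                                 # 3-5. invoke DAPI, get status
    if not result["ok"]:
        shutil.move(staged, os.path.join(areas["bad"], file_id))
        return False
    main = os.path.join(areas["main"], file_id)
    shutil.move(staged, main)                             # 6. final location on Main Disk
    db[("main", file_id)] = result                        # 7. DB update, main file
    shutil.copy(main, os.path.join(areas["replication"], file_id))  # 8. replication
    db[("replication", file_id)] = result                 # 9. DB update, replication file
    return True

root = tempfile.mkdtemp()
areas = {}
for name in ("staging", "bad", "main", "replication"):
    areas[name] = os.path.join(root, name)
    os.mkdir(areas[name])
db = {}
ok = archive(b"payload", "f1", areas, db)
rejected = archive(b"", "f2", areas, db)
print(ok, rejected, sorted(db))
```

Receiving into a staging area first means a file rejected by the plug-in never reaches the storage area; in this sketch it ends up in the bad files area instead.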

NG/AMS: XML Configuration
NG/AMS Configuration (1):
- About 110 different configurable parameters.
- Configuration can be loaded from an XML document, from the DB, or from a combination of these.
- Possible to re-use DB based parameters to compose specific configurations (easier to handle many, slightly different installations).
Main groups of configurable parameters (1):
- Basic Parameters: Port number, simulation mode, proxy mode, root mount point, …
- Plug-Ins: The various plug-ins the system should use, e.g. to handle data of a specific type.
- DB Connection: The DB connection parameters.
- Permissions: Archive, Retrieve, Processing, Remove Requests allowed.
- Archive Handling Parameters: Parameters for handling Archive Requests.
- Accepted Data Types: Types of data (mime-types) the system can handle.

NG/AMS: XML Configuration
NG/AMS Configuration (2):
Main groups of configurable parameters (2):
- Storage Sets: The disk configuration.
- Streams: Defines how the different kinds of data should be streamed onto the Storage Sets.
- Available Processing Capabilities: Defines the types of data that can be processed and which Data Processing Plug-Ins to use.
- Data Check/Janitor Thread Configuration: Parameters to tune the Data Checking and Janitor Threads.
- Logging Parameters: E.g. name of log files + intensity to apply when logging.
- Notification Parameters: Recipients of the various types of Notification Messages.
- Host Suspension Parameters: Parameters for suspending a host + for waking up suspended hosts.
- Subscription Parameters: Parameters to define if a server should subscribe for data.
- Authorization Parameters: Defines the known users and their access code.
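Combining DB-held parameters with an XML document could look roughly like the sketch below. The `Config`/`Param` element names, the parameter names and the rule that XML values override DB values are assumptions made for illustration; the real NG/AMS configuration format is not shown in the slides.

```python
import xml.etree.ElementTree as ET

def load_config(xml_text, db_params):
    """Merge DB-held parameters with an XML document; XML values win (assumed rule)."""
    config = dict(db_params)
    for elem in ET.fromstring(xml_text).iter("Param"):
        config[elem.get("Name")] = elem.get("Value")
    return config

# Parameters shared by many installations, as they might be kept in the DB.
shared = {"PortNo": "7777", "Simulation": "0"}
# Site-specific overrides and additions in a hypothetical XML layout.
site_xml = ('<Config><Param Name="Simulation" Value="1"/>'
            '<Param Name="RootDir" Value="/NGAS"/></Config>')
config = load_config(site_xml, shared)
print(config)
```

Layering a small site-specific document over shared parameters is one way to handle many, slightly different installations.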

NG/AMS: Data Consistency Checking
Data Consistency Checking:
- The condition of the data in the archive must be monitored constantly.
- Data Consistency Checking is a thread running in the background.
- Possible to tune the amount of resources occupied by the service.
- A check run can be scheduled to run periodically via the configuration.
- Checks performed: checksum check, file availability, unregistered files on storage media.
- A check sub-thread is started per disk (max. number configurable).
- Info about the files on the system is dumped once into a DBM and retrieved file by file during checking.
- Possible to resume a check from where the previous one was interrupted.
- Notification sent to subscribers in case problems are found, e.g.:
Subject: NGAS-arcus2-7778: DATA INCONSISTENCY(IES) FOUND
Date: Fri, 25 Jan :06: (MET)
From:
Error Message: DATA INCONSISTENY(IES) FOUND IN DATA HOLDING:
Date: T15:32:
NGAS Host: arcus2
Inconsistencies: 1
Problem Description File ID Version
ERROR: Inconsistent checksum found TEST T15:25:
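A minimal sketch of the three checks, with one worker per disk and a configurable cap on parallelism, might look as follows. CRC32 is used only as a stand-in for whatever the Checksum Plug-In computes, and the data structures and function names are hypothetical.

```python
import os
import tempfile
import zlib
from concurrent.futures import ThreadPoolExecutor

def check_disk(mount_point, registered):
    """Compare files on one disk against {file name: expected CRC32}."""
    problems = []
    on_disk = set(os.listdir(mount_point))
    for name, expected in registered.items():
        if name not in on_disk:
            problems.append((name, "file not available"))
            continue
        with open(os.path.join(mount_point, name), "rb") as fh:
            if zlib.crc32(fh.read()) != expected:
                problems.append((name, "inconsistent checksum"))
    for name in sorted(on_disk - set(registered)):
        problems.append((name, "unregistered file"))
    return problems

def check_all(disks, max_threads=2):
    """Run one check per disk, with a configurable cap on parallel checks."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        results = pool.map(lambda item: check_disk(*item), disks.items())
        return dict(zip(disks, results))

disk = tempfile.mkdtemp()
for name, data in (("good.fits", b"abc"), ("corrupt.fits", b"xyz"), ("stray.fits", b"?")):
    with open(os.path.join(disk, name), "wb") as fh:
        fh.write(data)
registered = {"good.fits": zlib.crc32(b"abc"),
              "corrupt.fits": zlib.crc32(b"original"),
              "missing.fits": 0}
report = check_all({disk: registered})
print(report[disk])
```

A real service would also throttle itself and persist its position so that an interrupted run can resume; both are omitted here for brevity.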

NG/AMS: Operation in Cluster Mode/1
Example:
[Figure labels: Retrieve Request; NGAS Super Node (Proxy Mode); Cluster Back-Bone Network; Network Switch; NGAS Main Node 1, NGAS Main Node 2 and NGAS Main Node 3, each with a Network Switch and a group of NGAS Sub-Nodes (10.X.X.X); Private Network.]

NG/AMS: Operation in Cluster Mode/2
Example:
[Figure labels: Retrieve Request; NGAS Main Node; Network Switch; a group of NGAS Nodes.]

Garching NGAS Cluster
[Figure: NGAS Cluster]

NG/AMS: Data Processing
Data Processing at Retrieval:
- Simple processing is supported when retrieving files.
- Possible to request the system to apply a Processing Plug-In to the data and to send back the result of the plug-in rather than the data itself.
- Processing is performed on the sub-node hosting the data.
- Possible for clients to use the NGAS Cluster as a number cruncher to carry out parallel data processing in a simple manner.
- Reduces the amount of data to be transferred to the client. E.g., a floating point number may be returned rather than the entire data file.
- Can be extended by providing new Data Processing Plug-Ins for specific contexts.
- Could be used to integrate NGAS with the AVO or other archive services.
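To illustrate returning a result instead of the file, the sketch below applies a hypothetical plug-in that reduces a block of float32 samples to their mean. The plug-in signature, the in-memory archive and the data layout are assumptions for illustration only.

```python
import struct

def mean_plugin(raw_bytes):
    """Hypothetical Data Processing Plug-In: mean of little-endian float32 samples."""
    count = len(raw_bytes) // 4
    values = struct.unpack("<%df" % count, raw_bytes[:count * 4])
    return sum(values) / count

def retrieve(archive, file_id, plugin=None):
    """Return the stored bytes, or the plug-in result when a plug-in is requested."""
    data = archive[file_id]
    return plugin(data) if plugin else data

# A tiny in-memory stand-in for the archive: one file with four samples.
archive = {"frame-1": struct.pack("<4f", 1.0, 2.0, 3.0, 6.0)}
print(len(retrieve(archive, "frame-1")), retrieve(archive, "frame-1", mean_plugin))
```

Here the client receives a single number instead of the 16 stored bytes; with real frames the saving in transferred data would be correspondingly larger.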

NG/AMS: APIs
NG/AMS APIs + Clients:
- Two APIs are provided, implemented in C (C library) and Python (class).
- They facilitate the implementation of client applications communicating with NGAS, e.g. to retrieve data files.
- Two command line utilities are provided, based on the C and Python APIs, which can be used to interact with an NG/AMS Server.
- A standalone Archive Client is provided, based on the C-API:
  - Independent of any DBMS.
  - Can be used to archive files from any remote host which can access the NGAS Archive via HTTP.
  - Attempts to archive a file are retried until success is returned or the file is classified as bad by the remote NGAS system.
  - Files are not cleaned up before cross-checking that they are really in the remote NGAS Archive (CHECKFILE Command).
  - First applications: Archiving of HARPS pipeline products and WFCAM files from Cambridge/UK.
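The retry-and-verify behaviour of the Archive Client can be sketched as below. The `send` and `check` callables stand in for the ARCHIVE and CHECKFILE requests, the status strings are invented for the example, and the `max_attempts` bound is an addition of this sketch; the slides only say that attempts are retried.

```python
import time

def archive_with_retry(path, send, check, max_attempts=5, delay=0.0):
    """Retry until the server reports success or classifies the file as bad.

    send(path) -> "SUCCESS", "BAD" or "FAILURE"; check(path) -> bool.
    Both callables stand in for the real HTTP requests.
    """
    for _ in range(max_attempts):
        status = send(path)
        if status == "BAD":
            return "bad"          # would be moved to the Bad Files Area
        if status == "SUCCESS" and check(path):
            return "archived"     # only now is local clean-up safe
        time.sleep(delay)
    return "queued"               # still pending in the Archive Queue

# Simulated server: two failures, then success.
replies = iter(["FAILURE", "FAILURE", "SUCCESS"])
outcome = archive_with_retry("night1.fits", lambda p: next(replies), lambda p: True)
print(outcome)
```

Verifying with a separate check before any local clean-up is what protects against losing a file whose transfer was reported as successful but did not actually land in the archive.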

NG/AMS Client Applications
NG/AMS Archive Client:
[Figure labels: NG/AMS Archive Client (Data Provider Host) with Archive Queue, Archived Files Area, Bad Files Area and Log Files Area; BAD; Log Info; Log Rotation; Control; Archive Requests + Commands; NG/AMS Server (Remote NGAS System); NGAS DB.]

NG/AMS: Server Commands
NG/AMS Server Commands (HTTP Protocol):
- Commands issued as URLs: : / [? [& ]]
- Commands:
  - ARCHIVE: Archive data with the Archive Push or Archive Pull Technique.
  - CHECKFILE: Execute an explicit file check of the given file.
  - CLONE: Clone an entire disk or individual files.
  - CONFIG: Configure an online system.
  - DISCARD: Force removal of a file from disk and/or DB independent of the number of copies.
  - EXIT: Make the NG/AMS Server exit.
  - INIT: Re-initialize the NG/AMS Server.
  - LABEL: Print out disk labels.
  - OFFLINE: Bring server to Offline State.
  - ONLINE: Bring server Online.
  - REGISTER: Register a file or a set of files already stored on an NGAS Disk.
  - REMDISK: Remove a disk from the archive (only allowed if at least 3 copies of each file are available).
  - REMFILE: Remove a file from the archive.
  - RETRIEVE: Retrieve a file, transparently, from the archive.
  - STATUS: Query status about the server or another component in the NGAS system/cluster.
  - SUBSCRIBE: Subscribe to new data or a set of data.
  - UNSUBSCRIBE: Unsubscribe a previously created subscription.
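Since commands are plain URLs, a client can assemble them with standard URL tooling. The sketch below assumes the common `http://host:port/COMMAND?name=value` layout; the port number 7777 and the `file_id` parameter name are placeholders, not values documented in the slides.

```python
from urllib.parse import urlencode

COMMANDS = {"ARCHIVE", "CHECKFILE", "CLONE", "CONFIG", "DISCARD", "EXIT", "INIT",
            "LABEL", "OFFLINE", "ONLINE", "REGISTER", "REMDISK", "REMFILE",
            "RETRIEVE", "STATUS", "SUBSCRIBE", "UNSUBSCRIBE"}

def command_url(host, port, command, **params):
    """Build a command URL; host, port and parameter names are caller-supplied."""
    if command not in COMMANDS:
        raise ValueError("unknown command: %s" % command)
    url = "http://%s:%d/%s" % (host, port, command)
    if params:
        url += "?" + urlencode(params)  # handles escaping of parameter values
    return url

print(command_url("acngast1", 7777, "STATUS"))
print(command_url("acngast1", 7777, "RETRIEVE", file_id="TEST.2001"))
```

The resulting string can be passed to any HTTP client, which is what makes the interface usable even from `telnet` or a browser.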

Unit/Functional Tests - Features
Unit/Functional Tests:
- Extensive set of automatic tests provided, consisting of:
  - 30 Test Suites.
  - ~130 Test Cases.
- Tests are portable (platform/HW independent).
- They test the business logic of the system and its correct functioning (simulation mode).
- More Test Cases need to be added for testing correct and consistent behavior under abnormal conditions, and for stress tests.
- Needs to be enhanced with ~200 Test Cases before the next release.
- Possible to generate a Test Plan from the test code (next slide - overhaul ongoing).

Unit/Functional Tests - Test Plan
Example: [figure]

NGAS WEB Interfaces
NGAS WEB Interfaces:
- WEB Interfaces are provided to assist operators in querying the status of the system and in searching for various components (data files, disks, machines).
- Used at all sites by the operators (Garching, Paranal, La Silla).
- Based on Zope. WEB management system providing editing via WEB browser (
- Local Zope WEB Servers available on each site.
- Tools provided to list disks, find specific files, and get an overview of the nodes and their status.
- The so-called Operators Log Book is also provided. The operators use this to log all actions carried out.
- Used by the operators at Paranal/La Silla to monitor the online archiving activities.
- Services for interacting with the system are missing. For now it is only possible to control the disk label printing. An enhancement is planned in the near future.

NGAS System/OS
NGAS OS Distribution:
- Started on a Suse Linux distribution and migrated to RedHat Linux (ESO standardization).
- OS distribution prepared/managed by OTS-SOS.
- Support for single-processor and multi-processor configurations.
- Support for old HW (PATA) and new HW (SATA).
- Limited installation, many packages removed to reduce the size of the system.
- Special packages needed by NGAS (Python, Sybase interface, Zope, …) are installed by the NGAS Installation Tool.
- Special driver SW needed for the 3ware controller.
- Zope WEB server running on some nodes (optional).
- 3ware disk controller WEB server running on every host.
- Possibility to back up/restore the complete system by means of the Mondo/Mindi tool kit (from a single CDROM) in 10 minutes.
- From July 2004 the NGAS OS platform is installed with a kickstart installation script.

NGAS HW
NGAS HW (1):
- Started with 8 slot parallel ATA systems.
- 8 x 80 GB storage capacity per node (640 GB/node, ~1.2 TB compressed).
- Since March 2004 a 24 slot serial ATA system in operation (up to 24 x 400 GB = 9.6 TB/node, 19.2 TB compressed).
- Reduces price per GB.
- More robust HW, among other things due to serial ATA (cleaner cabling).
- Disk handling easier, more robust disk frames.
- Overall HW stability (hopefully) better and less intervention needed (TBC).
- Amount of data/CPU should be balanced to be able to process the data in a limited time.
- TBD when to use the new HW in operation at the observatory sites.
- Investigating usage of RAID5 rather than JBOD disks.

NGAS HW
NGAS HW (2): [figure]

NGAS Utilities
NGAS Operators Utilities/Installation Utilities:
- Small module provided (NGAS Utilities) with utilities for the daily work of the operators:
  - Limited time invested in this so far; however, essential tools for the operation are provided (e.g. Clone Verification Tool, Check File List Tool, Clone File List Tool, …).
  - The function of many of these tools should be taken over by the NGAS WEB Interfaces when these have been enhanced.
- The module NGAS Installation Tools provides some utilities to install and check the system:
  - Tool provided to build the NGAS layer on top of the basic NGAS Linux distribution.
  - Functionality still to be implemented.

NGAS Infrastructure
Present ESO NGAS Infrastructure:
[Figure labels: NGAS DB (three instances); Replication; Archive Disk Sets; Archive Unit; Buffering Unit; Archive Handling Unit; Cluster Unit; Ext. Archive Client (two instances); LS; PAR; GAR; IN S.]

NGAS: Future Plans
(Near) Future Plans for NGAS:
- Received detailed requirements from archive operations.
- Enhance NGAS WEB Management Interfaces.
- Enhancement of services for operation in cluster (extended proxy mode).
- Enhancement of installation utilities.
- Enhancement of unit tests (simulation of archive cluster operation).
- Implement load balancing/archive cluster operation for high availability/high data rates (VST/Cam: up to 300 GB/night, VISTA/VistaCAM: up to 1 TB/night - TBC).
- Support for advanced data processing, utilizing an NGAS Cluster as a parallel processing engine (specify complex recipes which execute parallel data processing) – will be analyzed in the near future.
- Support for the Astrophysical Virtual Observatory/GRID?

Status - December 2004
Status of NGAS Project December 2004:
- In operation since July
- Used heavily on a daily basis by archive operators in Garching.
- Data archived daily at La Silla, Paranal and at ESO HQ.
- Data archived directly into the NGAS Archive in Garching from Paranal and Cambridge/WFCAM.
- Some statistics:
  - Total number of nodes: ~25.
  - Total number of disks in use: ~260.
  - Total number of files in NGAS Archive: ~1,500,000.
  - Amount of compressed data in NGAS Archive: ~27 TB.
  - Amount of uncompressed data in NGAS Archive: ~45 TB.
  - Maximum throughput per node (archiving): ~400 GB/24 hours (including compression).
- Major Issues to Address:
  - Need to invest more resources in implementing automatic tests, in particular for testing robustness and handling of abnormal conditions.
  - Need to invest resources in implementing an enhanced user interface - not very user-friendly at the moment.
  - Need to update the design document to reflect the present status of the system (not updated since it was written in spring 2001).
  - Should investigate improved ways of ensuring data consistency and means for recovering lost data.