CASPUR Site Report
Andrei Maslennikov, Lead - Systems
Rome, April 2006

Contents
- Update on central computers
- Batch news
- Storage news
- Projects 2006
- HEPiX services

Central computers

IBM SMP:
- Purchased a POWER-5 cluster (21 nodes, 168 p575 CPUs at 1.9 GHz, 400 GB of RAM)
- Communication subsystem: High Performance Switch (Federation)
- Plenty of problems while putting it into production:
  - 2 nodes required parts replacement
  - HPS was failing stress tests (hardware issues)
  - buggy software (both AIX and cluster software issues)
- Everything is now solved; the system is running pre-production tests and is being tuned
- The old system (80 POWER-4 CPUs at 1.1 GHz, 144 GB of RAM) will soon be decommissioned

HP SMP:
- One EV7 system with 32 CPUs at 1.15 GHz, 64 GB of RAM, Tru64 5.1B+
- Pretty stable

Opteron SMP:
- Getting very popular: 2 more clusters with 50 CPUs each, one with Infiniband, one with QsNet
- This area will most probably keep growing; the platform is very competitive

NEC SX-6:
- 8 CPUs, 64 GB of RAM

Batch news
- We have been using SGEEE on all platforms for many years, but this is now going to change
- We were impressed by PBS on the Opteron clusters:
  - Better support for MPI jobs
  - Configuration is resource-based and allows for more flexibility
  - Fits very well with our new accounting scheme
  - A commercial variant (and hence support) is available: PBSpro
  - May run on all our platforms
- Now evaluating PBSpro on our new POWER-5 cluster
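For illustration only, a minimal PBSpro-style job script for an MPI run might look like the sketch below; the queue name, resource values and application name are hypothetical and do not reflect CASPUR's actual configuration:

    #!/bin/sh
    # Hypothetical PBSpro job script (illustrative sketch only,
    # not CASPUR's actual queue or resource setup)
    #PBS -N mpi_test
    #PBS -q parallel
    #PBS -l select=4:ncpus=2:mpiprocs=2
    #PBS -l walltime=02:00:00

    # Run from the submission directory; 4 chunks x 2 MPI ranks = 8 processes
    cd $PBS_O_WORKDIR
    mpirun -np 8 ./my_mpi_application

The chunk-style "select" request is one example of the resource-based approach mentioned above: jobs ask for amounts of resources rather than for named hosts or fixed slot counts.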

Storage news

Decommissioned IBM SANFS (StorTank) and are now moving to GPFS:
- Simultaneous problems on both MDS units, metadata lost (of course, we had a backup and immediately brought the data back online on NFS)
- This coincided with the arrival of the POWER-5 cluster with a DS4800 disk system (20 TB)
- First benchmarks of the DS4800: 750 MB/sec aggregate
- GPFS has a small overhead and may operate in the range of … MB/sec on our hardware
- Already recycled the whole StorTank hardware base (disks and machines)

Purchased 2 new powerful NFS servers (CERN disk servers with R6, i.e. RAID-6)
- Currently under stress test, will shortly be put into production

Purchased 2 new IFT disk systems (G2422, R6)
- Currently being evaluated, will replace the AFS RAID-5 arrays

Tapes: some upgrades
- Replaced the remaining LTO-1 units with LTO-3
- Data migration in progress

SAN: migrated from Brocade to QLogic, 2 new 5600 switches at 4 Gbit

CASPUR: principal resources in 2006

[Diagram of principal resources: HP – 32 CPUs (1.15 GHz); NEC SX-6 – 8 CPUs; Opteron – 152 CPUs (2-2.4 GHz) with QsNet/IB; IBM – p575 CPUs (1900 MHz) with HPS; FC tape systems – 120/240 TB; FC RAID systems – 70 TB; AFS – 6 TB; NFS – 16 TB; GPFS – 20 TB; PolyServe – 24 TB; data movers; digital library; TSM backup; AFS backup; IP SAN]

Some projects, 2006

Technology tracking (in collaboration with CERN and other centers) – 0.5 FTE
- Just renewed the lab, tests in progress, plan to report at JLAB
- New R6 devices (disk arrays and PCI boards)
- Fast interconnects (10 Gbit and IB)
- Distributed file systems (new and updated solutions):
  GFS, Terragrid, PVFS2, GPFS, Lustre, StorNext, … etc
- New appliances like Open-E

AFS/OSD (in collaboration with CERN, ENEA and RZ Garching) – … FTE
- Implementation of an Object Storage Device (OSD) in accordance with the T10 specs
- OSD integration with AFS
- Progressing reasonably, v 1.0 in August, will be reported during the Storage Day

HEPiX services

As was agreed shortly after the Karlsruhe meeting:
- Put in place a new K5 domain (HEPIX.ORG) and a new AFS cell (/afs/hepix.org)
- Partial archive of past meetings (not all yet collected)
- Photo archive
- Web access to the AFS areas
- This service will be integrated with the SLAB in …
- Access granted to all HEP institutes
- Complementary to the site http://
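As a rough illustration of how the new realm and cell would be used from a client at a HEP institute (the username and the directory listed are hypothetical; the realm and cell names are the ones announced above):

    # Obtain a Kerberos 5 ticket in the HEPIX.ORG realm, then an AFS token
    # for the hepix.org cell, and browse the archive areas
    kinit username@HEPIX.ORG
    aklog -c hepix.org -k HEPIX.ORG
    ls /afs/hepix.org/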