Presentation on theme: "LONI LONI/LSU RP Update Honggao Liu, Ph.D Director, LSU NSF HPCOPS PI, LONI November 5, 2009."— Presentation transcript:
LONI LONI/LSU RP Update Honggao Liu, Ph.D Director, HPC @ LSU NSF HPCOPS PI, LONI November 5, 2009
Queen Bee Update
Queen Bee was fairly reliable: it was down four times, for a total of 116 unavailable hours, in the past four months.
The network connection between QB and the rest of the TeraGrid uses a dedicated 1 Gbps connection to LONI and a 10 Gbps connection from LONI to Chicago.
–The planned 10 Gbps connection to QB has been delayed by physical construction at a carrier hotel building in downtown Baton Rouge, which is preventing the installation of new fiber.
–LONI ordered a 10GE Metro-E circuit between LSU and QB in September as a local wave service from AT&T. We expect the local wave service to be operational at the beginning of January 2010.
Four security incidents occurred in the past four months. Each involved a user gaining root privilege on QB head nodes.
–In each case, the impacted users were notified and forced to change their passwords.
–The head nodes were reinstalled and kernel patches were installed.
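As a rough cross-check of the reliability figure, the 116 unavailable hours reported above can be converted into an availability percentage. This is only a sketch: it assumes the four-month window runs from July 1 through October 31, 2009 (the slide does not give exact dates), and the function and variable names are illustrative.

```python
from datetime import date


def availability_percent(down_hours: float, start: date, end: date) -> float:
    """Return the percentage of wall-clock hours a system was available.

    `start` is inclusive and `end` is exclusive.
    """
    total_hours = (end - start).days * 24
    return 100.0 * (total_hours - down_hours) / total_hours


# Assumed reporting window (not stated on the slide): July 1 to October 31, 2009.
WINDOW_START = date(2009, 7, 1)
WINDOW_END = date(2009, 11, 1)  # exclusive upper bound

# 116 unavailable hours is the figure reported on the slide.
queen_bee_availability = availability_percent(116, WINDOW_START, WINDOW_END)
print(f"Queen Bee availability: {queen_bee_availability:.1f}%")
```

Under these assumptions the four outages correspond to roughly 96% availability; a different choice of window would shift the figure slightly.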
Queen Bee Usage
Queen Bee had over 85% total usage in the past four months.
Queen Bee Usage
Users, jobs, and SUs for Queen Bee, each shown relative to the peak of its data type. Normalizing this way lets all three data types be plotted on the same graph to show how they relate.
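The peak-relative scaling described on this slide can be sketched as follows. The monthly values are made-up placeholders, not actual Queen Bee data, and the function name is illustrative; the point is only that dividing each series by its own maximum puts all three on a common 0 to 1 axis.

```python
def relative_to_peak(series):
    """Scale non-negative values so that the largest one becomes 1.0."""
    peak = max(series)
    if peak == 0:
        # Avoid dividing by zero when the series is all zeros.
        return [0.0 for _ in series]
    return [value / peak for value in series]


# Placeholder values for illustration only; NOT real Queen Bee statistics.
usage = {
    "users": [40, 50, 45, 48],
    "jobs": [9000, 12000, 10000, 11000],
    "SUs": [3.0e6, 3.5e6, 3.2e6, 3.4e6],
}

# Each series is normalized independently against its own peak.
normalized = {name: relative_to_peak(values) for name, values in usage.items()}

for name, values in normalized.items():
    print(name, [round(v, 2) for v in values])
```

Because each series is divided by its own peak, the plot shows how the shapes of the curves compare, not their absolute magnitudes.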
LONI’s New TeraGrid Projects
Project: SAGA (http://saga.cct.lsu.edu/) Deployment on TeraGrid
Project Aim I: Deploy SAGA on major TeraGrid resources (Kraken, Ranger, Abe/QB)
–Get stable release (Ole Weidner)
  Scheduled Date: 31st Oct, 2009
  Estimated Date: 15th Nov, 2009
–Make available via CTSS (Lukasz)
  Work with RP and GIG (progressing well)
  Test deployment available on QB
Project Aim II & III: SAGA-based Shell & Developing/Deploying FAUST (Framework for Adaptive Ubiquitous Scalable Tasks)
–Planned for second half (Jan’10-May’10) of project
–Depends upon stable and reliable deployment on TG
SAGA Deployment on TeraGrid
Project Aim IV: Documentation (Andre)
–Programming Manual and Exercise (Andre, Bety) in progress: http://faust.cct.lsu.edu/trac/saga/wiki/Tutorials/NeSC2009
–Tutorial and Training: held several training events in Fall 2009
  –International Summer School on Grid Computing
  –Advanced Distributed Summer School
  –NeSC-Edinburgh Training
–Planned LONI training (January 2010)?
–Is there interest in a TG-wide tutorial/training?
We currently provide source releases only; they are available at http://saga.cct.lsu.edu/download/
We are following a 6- to 8-week release cycle.
–1.4 release due date: 15 Nov (TeraGrid version)
–1.5 release due date: 15 Jan 2010
File a bug or feature request at http://faust.cct.lsu.edu/trac/saga/
LONI’s New TeraGrid Projects
Project: TeraGrid-LONI-DEISA Interoperability
Background: Demonstrate the advantages of Scale-Out and Interoperability (across TG and DEISA) for appropriate scientific problems
Aim: To enhance the understanding of HIV-1 enzymes using replica-based methods across federated TG-DEISA-LONI
–Do so using a general-purpose, extensible, scalable approach
–Test limits of Distributed Scale-Out, both algorithmic and infrastructure limits
–As part of the VPH project, to ultimately help build the CI for quick, efficient (patient-specific) decision tools using predictive MD of drugs and enzymatic targets (HIV-1 protease)
Application
–Models of HIV-1 and drugs created
–Integration of LAMMPS with SAGA
–Initial Replica-Exchange performed
–Integration of LAMMPS with SAGA-based BigJob
–Initial isolated runs on TeraGrid: Ranger and Abe
–Working on launching on DEISA
–SAGA-UNICORE (via GridSAM) testing in place
TeraGrid-LONI-DEISA Interoperability
Next Steps:
–Integration of SAGA into Binding Affinity Calculator (BAC) tools to facilitate distributed Scale-Out
–Protonation study of Ritonavir bound to HIV-1 Protease wild type (on QB/Ranger)
–Study of binding affinity between 6 HIV-1 Protease mutants and the drug Ritonavir using SAGA-BAC Tools
–Develop tools for Post-Processing on UK NGS and DEISA
–Investigation of Reverse Transcriptase with Replica-Exchange (if time permits)
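The slides do not say which Replica-Exchange variant the project uses. As a minimal sketch, assuming standard temperature replica exchange, a swap between replicas i and j is accepted with the Metropolis probability min(1, exp[(β_i − β_j)(E_i − E_j)]), where β = 1/(k_B T). The function names and the sample energies and temperatures below are illustrative and are not taken from the project.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def swap_probability(energy_i, energy_j, temp_i, temp_j):
    """Metropolis acceptance probability for exchanging two replicas.

    Energies are in kcal/mol and temperatures in kelvin.
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # A non-negative delta means the swap is always accepted.
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(energy_i, energy_j, temp_i, temp_j, rng=random):
    """Return True if the proposed exchange is accepted."""
    return rng.random() < swap_probability(energy_i, energy_j, temp_i, temp_j)


# Illustrative values only (assumed, not project data).
# The colder replica (300 K) holds the higher energy, so the swap is certain.
p_certain = swap_probability(-1000.0, -1005.0, 300.0, 310.0)
# The colder replica already holds the lower energy, so the swap is only probable.
p_partial = swap_probability(-1005.0, -1000.0, 300.0, 310.0)
print(p_certain, round(p_partial, 3))
```

Each replica runs independently between exchange attempts, which is what makes the method a natural fit for scale-out across several machines.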
LONI’s New TeraGrid Projects
Project: Extension of PetaShare to TeraGrid
PetaShare is an NSF-funded project that is deploying additional disk and tape storage at LONI sites and developing user-friendly data-aware storage systems, data-aware schedulers, and cross-domain metadata schemes.
PetaShare currently provides distributed data storage and management capabilities to nine LONI institutions connected via the high-speed LONI network.
This project will extend PetaShare to TeraGrid so that:
–TeraGrid users will be able to access their datasets more conveniently through the transparent PetaShare interfaces.
–TeraGrid and LONI users will be able to easily share and exchange data with each other.
PetaShare data access and retrieval services are currently optimized for the LONI network and will need to be enhanced and optimized for the wide-area TeraGrid networks.
PetaShare services currently run only on Linux-based systems and will need to be ported to the different architectures and operating systems on TeraGrid.
Ahmet Topcu was hired from IU for the TG PetaShare project and started here on June 15.
LSU HPC/CCT Update
New Linux Cluster: Philip
–38 nodes total, each with 8 Intel “Nehalem” Xeon cores @ 2.93GHz, a 160GB HD, and 1 Gb Ethernet
–32 nodes with 24GB 1333MHz RAM, 3 nodes with 96GB 1066MHz RAM, and 3 nodes with 48GB 1066MHz RAM
–Opened to users in September. Not a TeraGrid resource, but has potential for OSG jobs
New Educational Cluster dedicated to students: Arete
–72 nodes total: 56 nodes with 8 AMD Opteron cores @ 2.3GHz and 16 nodes with cores @ 2.7GHz; 8GB RAM, 4x146GB HDD, Infiniband, and 1 Gb Ethernet
–Available for campus-wide use beginning in Spring 2010
New Lustre storage
–240TB of DDN storage through Dell was received and deployed as long-term storage and will be allocated to LSU HPC users
–The current 55TB Panasas storage will be upgraded to 80TB in December
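The slide gives only per-node figures for Philip. As a small worked example, the aggregate core count and memory can be derived from the three node groups listed above. The totals are computed here, not quoted from the slide, and the class and variable names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    count: int            # number of nodes in the group
    cores_per_node: int   # CPU cores in each node
    ram_gb_per_node: int  # memory per node, in GB


# Philip node groups, as listed on the slide (8 cores per node throughout).
PHILIP = [
    NodeGroup(count=32, cores_per_node=8, ram_gb_per_node=24),
    NodeGroup(count=3, cores_per_node=8, ram_gb_per_node=96),
    NodeGroup(count=3, cores_per_node=8, ram_gb_per_node=48),
]


def totals(groups):
    """Return (nodes, cores, RAM in GB) summed over all node groups."""
    nodes = sum(g.count for g in groups)
    cores = sum(g.count * g.cores_per_node for g in groups)
    ram_gb = sum(g.count * g.ram_gb_per_node for g in groups)
    return nodes, cores, ram_gb


philip_nodes, philip_cores, philip_ram_gb = totals(PHILIP)
print(f"Philip: {philip_nodes} nodes, {philip_cores} cores, {philip_ram_gb} GB RAM")
```

The three groups add up to the 38 nodes stated on the slide, giving 304 cores and 1200 GB of memory in aggregate.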
LONI/LSU Training
5 workshops were held at LONI/LSU since June
13 tutorials were provided since September at LSU and on Access Grid

Title                      Location   Date       # of Participants   Method
Beowulf Boot camp          LSU        6/15-18    22                  In classroom
SC09 Parallel Computing    LSU        7/05-11    32                  In classroom
Scaling to Petascale       LSU        8/03-07    18                  In classroom
LONI HPC workshop          LaTech     10/6-7     12                  In classroom
LONI HPC workshop          ULL        10/26-27   30                  In classroom