GPCF* Update
Present status as a series of questions / answers related to decisions made / yet to be made.
* General Physics Computing Facility (GPCF) is not a memorable name. Suggestions for a better name and TLA are welcome!

What needs are we addressing?
Common solution for a varied community
  – Intensity and Cosmic Frontier experiments
  – Some of the old fnalu functions
Shared resources
  – To optimize utilization
Focus on long-term management and operation
  – Reduce the burden on the experiments / users
Reduction of “one off” solutions and orphans
  – Reduce the burden on the CD

What are we not addressing (yet)?
Data management schemes
  – And the implications for processing and data access patterns
Performance
  – Learn from experience
  – Build in flexibility
Thinking has started, but a “plan” is still needed.

Guiding principles
Use virtualization
Training ground and gateway to the Grid
No undue complexity – user and admin friendly
Model after the CMS LPC where sensible
Expect to support / partition the GPCF for multiple user groups

Basic architecture
Interactive facility
  – VMs dedicated to user groups
  – Access to common, group, and private storage
Local batch facility
  – VMs dedicated to user groups
  – Logins possible
  – Otherwise close to or the same as the grid environment
Server / service nodes
  – VM homes for group-specific or system services
Storage
  – BlueArc, dCache, or otherwise (Lustre, HDFS?)
Network infrastructure
  – Work with the LAN group to ensure adequate resources

VMs
Q: Which VMs are allowed?
A: Supported (baselined) SLF versions, customized for user groups. Patches will be applied to the VM store and to active VMs.
Q: Resources per VM?
A: 2 GB memory per core
   x GB local disk storage
   n guaranteed / n shared processors
   x guaranteed / x shared network bandwidth
   where oversubscription is allowed.
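To make the guaranteed / shared distinction concrete, here is a minimal sketch of the host-sizing arithmetic, assuming memory is reserved at 2 GB per guest core while CPU may be shared; the host size and oversubscription factor are hypothetical, not GPCF specifications.

```python
# Hypothetical host-sizing arithmetic for the "2 GB memory per core" rule
# with CPU oversubscription; all numbers are illustrative, not GPCF specs.

def vm_slots(host_cores, host_mem_gb, mem_per_core_gb=2.0,
             cores_per_vm=1, cpu_oversubscription=2.0):
    """Return how many identically sized VMs fit on one host.

    Memory is treated as a hard reservation (2 GB per guest core);
    CPU may be shared up to the given oversubscription factor.
    """
    mem_limited = int(host_mem_gb // (cores_per_vm * mem_per_core_gb))
    cpu_limited = int((host_cores * cpu_oversubscription) // cores_per_vm)
    return min(mem_limited, cpu_limited)

if __name__ == "__main__":
    # e.g. a 16-core, 48 GB host with 2x CPU sharing allowed
    print(vm_slots(host_cores=16, host_mem_gb=48))  # -> 24 (memory limited)
```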

VMs (#2)
Q: Which hypervisor?
A: Xen (for now).
Q: How are VMs provisioned and deployed?
A: This will be guided by FermiCloud work, but currently we use manual provisioning of static VMs.
Q: How are the VMs stored?
A: This will be guided by FermiCloud work, but currently we envision BlueArc.
→ These choices do not impact the user environment.
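Purely as an illustration of what manual provisioning of a static Xen guest could look like, the sketch below writes a minimal Xen 3.x-style domU configuration and boots it with the classic xm toolstack. The guest name, image path, and bridge name are invented and are not taken from the GPCF plan.

```python
# Illustrative only: write a minimal Xen 3.x-style domU config for a static
# SLF guest and boot it with the classic "xm" toolstack.
# The guest name, image path, and bridge name below are invented.
import subprocess
import textwrap

def make_domu_config(name, memory_mb=2048, vcpus=1,
                     image_dir="/srv/xen/images", bridge="xenbr0"):
    cfg = textwrap.dedent(f"""\
        name       = "{name}"
        memory     = {memory_mb}
        vcpus      = {vcpus}
        bootloader = "pygrub"
        disk       = ['tap:aio:{image_dir}/{name}.img,xvda,w']
        vif        = ['bridge={bridge}']
        on_reboot  = 'restart'
        on_crash   = 'restart'
        """)
    path = f"/etc/xen/{name}.cfg"
    with open(path, "w") as f:
        f.write(cfg)
    return path

def start_domu(cfg_path):
    # "xm create <cfgfile>" defines and starts the guest under xend.
    subprocess.check_call(["xm", "create", cfg_path])

if __name__ == "__main__":
    start_domu(make_domu_config("gpcf-int-01", memory_mb=4096, vcpus=2))
```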

Storage Systems
Q: Which storage / file systems will be used?
A: This is the principal remaining question for the hardware architecture. We expect to start with BlueArc and public dCache, operated largely as they are today. Storage system capacity is reasonably well specified, but performance as a function of usage is not.

Storage systems (cont’d)
Q: What about Hadoop or Lustre or …?
A: It is too early to consider these for production systems in a “new” facility. We want to study them within the FermiCloud facility, and perhaps introduce limited capacity within GPCF.
Q: What are the implications of delaying a decision on storage?
A: This affects the specifics of the hardware purchase. Distributed storage systems might want many nodes with associated disks, possibly with a dedicated (FC or InfiniBand) network. For now we will assume separate storage systems.

Security
Q: Are there special security needs?
A: All of GPCF will be within the General Computing Enclave (GCE), meaning the nodes are treated like any other local cluster.
  – Only Fermilab Kerberos credentials
  – No grid certificate access (except, possibly, Fermi KCA certificates?)
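For illustration only, a node could check that a user holds a valid Kerberos ticket before starting a session; this minimal sketch assumes the MIT Kerberos client tools are installed (klist -s exits non-zero when there is no valid credential cache).

```python
# Minimal sketch: require a valid Kerberos ticket before starting a session.
# Assumes the MIT Kerberos client tools (kinit/klist) are installed.
import subprocess
import sys

def has_valid_ticket():
    # "klist -s" is silent and exits non-zero when there is no valid
    # (non-expired) credential cache.
    return subprocess.call(["klist", "-s"]) == 0

if __name__ == "__main__":
    if not has_valid_ticket():
        sys.exit("No valid Kerberos ticket found; run kinit first.")
    print("Kerberos ticket present; proceeding.")
```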

Network Topology
Q: How are VMs named / addressed?
A: The current plan is:
  – Fixed IPs for interactive VMs
  – Dynamic IPs for batch VMs
  – Fixed IPs for server VMs
  – Fixed IPs for network storage
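As a trivial illustration of how this addressing policy could be encoded for provisioning scripts (the VM class names are ours, not official GPCF terminology):

```python
# Illustrative encoding of the planned addressing policy; the VM class names
# are ours, not official GPCF terminology.
ADDRESSING_POLICY = {
    "interactive":     "fixed",    # interactive VMs: fixed IPs
    "batch":           "dynamic",  # local batch VMs: dynamic IPs
    "server":          "fixed",    # server / service VMs: fixed IPs
    "network_storage": "fixed",    # network storage endpoints: fixed IPs
}

def ip_policy(vm_class):
    """Return 'fixed' or 'dynamic' for a given VM class."""
    return ADDRESSING_POLICY[vm_class]

assert ip_policy("batch") == "dynamic"
```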

Resource Provisioning
Q: How many VMs / nodes / servers / …?
A: Using NuComp / Lee’s numbers for the IF needs. The budget request is for 2x these numbers, though we may not receive that much.
Q: How are resources to be distributed among groups?
A: TBD. To some extent, based on contributions to purchases.

User Accounts
Q: How are groups “segregated”?
A: One NIS domain per group; any given VM is associated with a single NIS domain. Privileged access is restricted to admins.
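As a side illustration, a VM can report which NIS domain (and hence which group) it is bound to; this minimal sketch assumes the standard domainname utility, and the domain names listed are invented examples.

```python
# Minimal sketch: report which NIS domain (and hence which user group) this
# VM is bound to.  The expected domain names are invented examples.
import subprocess

EXPECTED_DOMAINS = {"gpcf-minos", "gpcf-nova", "gpcf-minerva"}  # hypothetical

def nis_domain():
    # "domainname" prints the NIS/YP domain the host is bound to.
    return subprocess.check_output(["domainname"], text=True).strip()

if __name__ == "__main__":
    dom = nis_domain()
    if dom not in EXPECTED_DOMAINS:
        raise SystemExit(f"Unexpected NIS domain: {dom!r}")
    print(f"VM is bound to NIS domain {dom}")
```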

VMs (#3)
Q: What “fancy features” are envisioned?
A: None for now. Possibilities for the future are:
  – High availability (HA) for services
  – VM failover / relocation
  – VM suspension / restart

Physical Location
Q: Where are the physical nodes?
A: There are building power constraints. FCC is the “high availability” center, but there is “no room at the inn”. We may consider placing only storage in FCC, with nodes in GCC.

FY10 Budget request
Overlap with the BlueArc and dCache requests is to be resolved.

Qty  Description             Unit Cost  Extended Cost  Fund Type
 16  Interactive Nodes        $3,300     $52,800       EQ
 32  Local Batch Nodes        $3,100     $99,200       EQ
  4  Application Servers      $3,900     $15,600       EQ
  3  Disk Storage            $22,000     $66,000       EQ
  1  Storage Network         $10,000     $10,000       EQ
  1  Network Infrastructure  $40,000     $40,000       EQ
  1  Racks, PDUs, etc.        $3,000      $3,000       EQ
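As a quick sanity check on the arithmetic (extended cost = quantity × unit cost), a short sketch that reproduces the table and the implied total:

```python
# Reproduce the extended costs in the FY10 budget table and the implied total.
items = [
    # (qty, description, unit cost in dollars)
    (16, "Interactive Nodes",      3300),
    (32, "Local Batch Nodes",      3100),
    (4,  "Application Servers",    3900),
    (3,  "Disk Storage",           22000),
    (1,  "Storage Network",        10000),
    (1,  "Network Infrastructure", 40000),
    (1,  "Racks, PDUs, etc.",      3000),
]

total = 0
for qty, desc, unit in items:
    extended = qty * unit
    total += extended
    print(f"{qty:>3}  {desc:<24} ${unit:>7,}  ${extended:>8,}")

print(f"Total (EQ): ${total:,}")  # -> Total (EQ): $286,600
```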

Schedule
Two phases:
  – ASAP: put out requisitions for
      BlueArc disk
      additional dCache disk
      ~1/4 of the total number of nodes
  – Spring, or as needed: the remaining nodes