1
Transition in Campus CyberInfrastructure: Community Clusters, Storage and Co-Loc
Presented by Dwight McKay, Director of Systems Engineering, ITaP Rosen Center for Advanced Computing, Purdue University
Notes: Introduce yourself.
2
Outline
- Introduction
- Community Clusters
- Summarize model and issues
Notes: The perspective is that of an infrastructure builder / operator. The focus is on computational and storage allocation, growth strategies, and funding. Start with our structure and the sea change we see as we move from a centrally funded world to being a resource provider on campus and beyond.
3
Notes: This is the overall structure of ITaP.
4
Notes: The Rosen Center is one of five business units. We focus on the cyberinfrastructure needs of the campus and beyond.
5
Rosen Center for Advanced Computing
Notes: This is the structure of RCAC. Talk about each sub-unit.
7
Transition in Research Computing Support
Notes: Move from the center to a service, a collaborator. Our history has been to share systems, first with job scheduling. Then, as we transitioned out of the centrally funded realm, we built clusters from old lab systems and later moved into the condo model that we call community clusters.
8
Transition in Research Computing Support
Change in Direction
- From central purchase to researcher / project purchase
- From central shared facility to resource or service provider
- From service desk to partner
9
Transition Implications
- Paying Customers
- Higher Expectations
- Service & Support
- Formal Agreements
- Cultural Change
Notes: This transition is the driver for moving into new models of systems acquisition, resource allocation, and collaboration. The biggest change is that we now have customers explicitly paying for services.
10
Rosen Center for Advanced Computing
Diagram areas: Research, User Support, Infrastructure
Notes: While we are structured into specific areas, the boundaries between these areas are more fluid than this diagram suggests. The structure is more of a matrix, with projects and people spanning the reporting boundaries as needed to support our customers. Also note that we incorporate a research group as well as user support.
11
Rosen Center for Advanced Computing
Diagram label: Project
Notes: A typical project has connections into both computational infrastructure (accounts, queues, etc.) and high-level support (consultation, code optimization, application support, etc.). Larger, more complex projects often pull in larger sets of resources, such as project-specific WAN links, project management, software development, and custom infrastructure design and deployment.
12
Rosen Center for Advanced Computing
Diagram label: Project
Notes: Our teams have people who span groups. A person reporting to Seb to manage a system in a grid project would also participate in the system team, come to our meetings, and work and act like a member of our team. This provides a close connection and better meets the grid project's support needs. We also embed people in research teams to provide the IT expertise needed to move a project along.
13
Transition Implications
- New Business Models Needed
  - HW/SW, infrastructure, support?
- Unpredictable Demand & Funding
  - Planning for power/space/cooling?
- Non-paying “Users”
Notes: How do we pay for hardware, software, services, and people in this new environment? If we are not centrally funded, how do we predict the demand we will see from our customers? What about our “general” users, those who did not buy into our services?
14
Community Clusters: Condominium Model
- Purchase computation “by the node”
  - Nodes come with a bundle of services
- Purchase storage “by the terabyte”
  - Storage comes with a bundle of services
- Custom arrangements for specific projects
15
Services
- HW Installation
- HW Maintenance
- Facilities Support
- Network Connection
- InfiniBand Connection
- OS installation and management
- Disk Storage
- Archival Storage
- On-Call Support
- Disaster Recovery
- Security
16
Services: Node Bundle
The services listed on the previous slide, from HW installation through security, are grouped together as the node bundle.
17
Tiered Cycle Allocation
- Owners are guaranteed a specific share
- Owners are given first pick of idle cycles
- Owners agree to harvesting of the remaining idle cycles
“Use the whole buffalo.” -- Brad Bird
Notes: Brad Bird is the director of “The Incredibles”. This is important for space/power/cooling, non-paying customers, and cycles for Grid users.
20
Tiered Cycle Allocation
Diagram tiers: Owner, Pre-empt, Cycle Harvesting
Notes: Mention Condor here.
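The three tiers on the Tiered Cycle Allocation slides (guaranteed owner share, owners' first pick of idle cycles, harvesting of whatever remains) can be illustrated with a small sketch. This is not the scheduler configuration used on the community clusters. The names `Owner` and `allocate`, the node-count granularity, and the rule that owners pick idle nodes in list order are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Owner:
    name: str
    share: int   # nodes the owner purchased (the guaranteed share)
    demand: int  # nodes the owner's queued work could use right now


def allocate(total_nodes, owners, harvest_demand):
    """Hypothetical three-tier split of a cluster's nodes.

    Assumes total_nodes >= sum of all owner shares.
    Returns (nodes per owner, nodes left for cycle harvesting).
    """
    alloc = {}

    # Tier 1: every owner gets up to their guaranteed share.
    for o in owners:
        alloc[o.name] = min(o.demand, o.share)
    idle = total_nodes - sum(alloc.values())

    # Tier 2: owners with unmet demand get first pick of idle nodes.
    # List order decides who picks first (an assumption of this sketch).
    for o in owners:
        extra = min(o.demand - alloc[o.name], idle)
        alloc[o.name] += extra
        idle -= extra

    # Tier 3: remaining idle nodes go to harvested (pre-emptible) work.
    harvested = min(harvest_demand, idle)
    return alloc, harvested


if __name__ == "__main__":
    owners = [Owner("A", share=4, demand=6), Owner("B", share=4, demand=1)]
    print(allocate(8, owners, harvest_demand=5))
```

In this example owner A borrows two of B's idle nodes and the single node still idle goes to harvested work. In a real deployment the harvested and borrowed work would have to be pre-empted when B's demand returns, which this static snapshot does not model.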
21
Community Clusters Challenges
- Business model that spans system generations & recovers shared infrastructure costs
- Cluster Heterogeneity
- Multiple communities sharing one cluster (TeraGrid, OSG, NWICG)
- Other architectures & special needs?
Notes: We are heading towards the end of a three-year cycle on cluster building. How do we do retirement? How do we pay for interconnection infrastructure and upgrade it over time? What do we do when the particular node we use is no longer available? What about those folks who need a shared memory system?
22
Storage
Three Primary Tiers
- Fast for scratch files
- Commodity for home directories
- Archival for “longer term” storage
Custom Storage for Specific Projects
23
Storage Challenges
- Business model that spans lifetime of data / researcher?
- Media Purchase vs. Space Rental?
- Data Retention Policy
Notes: Initially we had researchers buy disk trays. But storage technology is progressing, and there’s a potential danger in being stuck maintaining old storage. How long do we keep something, and how do we help a researcher take his data with him when he might have 10s to 100s of TB?
24
Connectivity Challenges
- Multiple classes of network are needed
- Direct Routes
  - Data Center to Key Research WANs
  - Data Center to Key Research Labs
26
Four Fold Network
Commodity Secure Research
High Performance / Large Data Network Research Low Latency
27
Summary
- Transition to resource provider / partner / service
- Architecture
  - Computation -> Community Clusters
  - Storage -> 3 Tiers
  - Connectivity -> Direct Data Center connections to Research WAN and Lab Access
- Challenges
  - Customer expectations and implications
  - Serving up storage and other architectures “by the slice”
  - Recovering ALL the costs, especially support