
1 Data Area Report
Chris Jordan, Data Working Group Lead, TACC
Kelly Gaither, Data and Visualization Area Director, TACC
April 2009

2 PY4 Data Area Characteristics
Relatively stable software and user tools
Relatively dynamic site/machine configuration
–New sites and systems
–Older systems being retired
TeraGrid emphasis on broadening participation
–Campus Champions
–Science Gateways
–Underrepresented disciplines

3 PY4 Areas of Emphasis
Improve campus-level access mechanisms
Provide support for gateways and other “mobile” computing models
Improve clarity of documentation
Enhance user ability to manage complex datasets across multiple resources
Develop a comprehensive plan for future developments in the Data area
Production deployments of Lustre-WAN, path to global file systems

4 Data Working Group Coordination
Led by Chris Jordan
Meets bi-weekly to discuss current issues
Has membership from each RP
Attendees are a blend of system administrators, software developers, and users

5 Wide-Area and Global File Systems
Providing a TeraGrid global file system is a highly requested service
A global file system implies that a single file system is mounted on most TeraGrid resources
–No solution currently exists for production global file systems
Wide-area file systems give the look and feel of a single file system and are possible with technologies such as GPFS-WAN or Lustre-WAN
–GPFS-WAN has licensing issues and isn’t available for all platforms
–Lustre-WAN is preferable for both licensing and compatibility reasons
pNFS is a possible path to global file systems, but is far from being viable
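As a concrete illustration of the wide-area approach, here is a minimal sketch of how a client node might mount a wide-area Lustre file system. The MGS address, file system name, and mount point below are placeholders, not actual TeraGrid endpoints.

```python
import subprocess

# Placeholder endpoints -- the real Data Capacitor MGS address and file system
# name would be supplied by the hosting resource provider.
MGS_NID = "mgs.example.org@tcp0"   # Lustre management server network ID (hypothetical)
FSNAME = "dcwan"                   # Lustre file system name (hypothetical)
MOUNT_POINT = "/mnt/dc-wan"        # local mount point on the compute resource

def mount_lustre_wan():
    """Mount a wide-area Lustre file system on a client node (requires root)."""
    subprocess.run(
        ["mount", "-t", "lustre", f"{MGS_NID}:/{FSNAME}", MOUNT_POINT],
        check=True,
    )

if __name__ == "__main__":
    mount_lustre_wan()
```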

6 Lustre-WAN Progress
There is an initial production deployment of Indiana’s Data Capacitor Lustre-WAN on IU’s BigRed and PSC’s Pople
–Declared production in PY4 (involved testing and implementation of security enhancements)
In PY4, successful testing and a commitment to production on LONI’s QueenBee, TACC’s Ranger/Lonestar, NCSA’s Mercury/Abe, and SDSC’s IA64
–Additional sites (NICS, Purdue) will begin testing this year
Also in PY4, ongoing work to improve performance and the authentication infrastructure
–This work proceeds in parallel with production deployment

7 CTSS Efforts in the Data Area
In PY4, created data kits:
–data movement kit – 20 TG resources
–data management kit (SRB) – 4 TG resources
–wide-area file systems kits – GPFS-WAN (5), Lustre-WAN (2)
Currently reworking the data kits to include:
–new client-level kits to express functionality and accessibility more clearly
–new server-level kits to report more accurate information on server configurations
–broadened use cases
–requirements for more complex functionality (managing, not just moving, data)
–improved information services to support science gateways and automated resource selection
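To illustrate the resource-selection idea behind the client-level kits and improved information services, here is a toy sketch. The resource names and kit labels are invented for illustration and do not reflect the actual CTSS kit schema or information services.

```python
# Illustrative only: a toy model of kit-based resource selection.
# Real kit names, schemas, and the query interface are defined by the CTSS team.
RESOURCES = [
    {"name": "ranger.tacc", "kits": {"data-movement-client", "lustre-wan-server"}},
    {"name": "bigred.iu",   "kits": {"data-movement-client", "data-management-server"}},
    {"name": "pople.psc",   "kits": {"data-movement-client"}},
]

def resources_with_kit(kit_name):
    """Return resources advertising a given kit, e.g. for gateway resource selection."""
    return [r["name"] for r in RESOURCES if kit_name in r["kits"]]

print(resources_with_kit("data-movement-client"))
```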

8 Data/Collections Management PY4
Tested new infrastructure for data replication and management across TeraGrid resources (iRODS)
Assessed archive replication and transition challenges
Gathered requirements for data management clients in CTSS
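As an example of the replication workflow that iRODS supports, here is a minimal sketch driving the standard icommands from Python. The resource names and zone path are placeholders, not the actual TeraGrid iRODS configuration.

```python
import subprocess

# Hypothetical iRODS storage resource names; real ones are defined by the zone administrators.
SOURCE_RESOURCE = "tacc-disk"
REPLICA_RESOURCE = "sdsc-archive"

def register_and_replicate(local_path, logical_path):
    """Ingest a file into iRODS, then replicate it to a second storage resource."""
    # iput: copy the local file into the iRODS collection on the source resource
    subprocess.run(["iput", "-R", SOURCE_RESOURCE, local_path, logical_path], check=True)
    # irepl: create a replica of the data object on the second resource
    subprocess.run(["irepl", "-R", REPLICA_RESOURCE, logical_path], check=True)

# Placeholder zone and collection path
register_and_replicate("results.h5", "/teraGridZone/home/user/results.h5")
```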

9 Data Architecture
Two primary categories of use for data movement tools in the TeraGrid:
–Users moving data to or from a location outside the TeraGrid
–Users moving data between TeraGrid resources
–(Frequently, users will need to do both within the span of a given workflow)
Moving data to/from a location outside the TeraGrid:
–Tends to involve smaller numbers of files and less overall data
–Problems are primarily with usability, due to tool availability or ease of use

10 Data Architecture (2)
Moving data between TeraGrid resources:
–Datasets tend to be larger
–Users are more concerned with performance, high reliability, and ease of use
General trend that we have seen: as the need for data movement has increased, both the complexity of the deployments and the frustrations of users have increased.

11 Data Architecture (3)
This is an area in which we think we can have a significant impact
–Users want reliability, ease of use, and in some cases high performance
–How the technology is implemented should be transparent to the user
–User-initiated data movement, particularly on large systems, has proven to create problems with contention for disk resources

12 Data Architecture (4)
Data movement requirements:
–R1: Users need reliable, easy-to-use file transfer tools for moving data from outside the TeraGrid to resources inside the TeraGrid.
–R2: Users need reliable, high-performance, easy-to-use file transfer tools for moving data from one TeraGrid resource to another.
–R3: Tools for providing transparent data movement are needed on large systems with a low storage-to-flops ratio.
Candidate tools: SSH/SCP with the high-performance networking patches (HPN-SCP); SCP-based transfers to GridFTP nodes (RSSH); TGUP Data Mover.
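To illustrate the kind of high-performance inter-resource transfer that R2 calls for, here is a minimal sketch using GridFTP's globus-url-copy with parallel streams. The endpoint hostnames and paths are placeholders, not real TeraGrid GridFTP servers.

```python
import subprocess

# Hypothetical GridFTP endpoints; real hostnames come from the resource documentation.
SRC = "gsiftp://gridftp.src.example.org/scratch/user/input.dat"
DST = "gsiftp://gridftp.dst.example.org/home/user/input.dat"

def third_party_transfer(src, dst, streams=4):
    """Run a GridFTP transfer between two resources with parallel TCP streams.

    Parallel streams (-p) and a larger TCP buffer (-tcp-bs) are the usual levers
    for wide-area throughput; -vb prints transfer performance as it runs.
    """
    subprocess.run(
        ["globus-url-copy", "-vb", "-p", str(streams), "-tcp-bs", "4194304", src, dst],
        check=True,
    )

third_party_transfer(SRC, DST)
```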

13 Data Architecture (5)
Network architecture on the petascale systems is proving to be a challenge – only a few router nodes are connected directly to wide-area networks, and the rest of the compute nodes are routed through them. Wide-area file systems often need direct network access.
It has become clear that no single solution will provide a production global wide-area file system.
–R4: Provide the “look and feel” of a global wide-area file system with high availability and high reliability (Lustre-WAN, pNFS).

14 Data Architecture (6)
Until recently, visualization and, in many cases, data analysis have been considered post-processing tasks requiring some form of data movement.
With the introduction of petascale systems, we are seeing dataset sizes that prohibit data movement or make it necessary to minimize it.
It is anticipated that scheduled data movement is one way to guarantee that the data is present at the time it is needed.

15 Data Architecture (7)
Visualization and data analysis tools have not been designed to be data-aware; they assume the data can be read into memory and that applications and tools need not be concerned with exotic file access mechanisms.
–R5: Ability to schedule data availability for post-processing tasks. (DMOVER)
–R6: Availability of data mining/data analysis tools that are more data-aware. (Currently working with the VisIt developers to modify the open-source software, leveraging work done on parallel Mesa.)
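A toy sketch of the scheduled-data-availability pattern behind R5: stage the dataset first, then launch the post-processing step only after the transfer has completed. This is not the DMOVER implementation; the URLs, local paths, and VisIt invocation are illustrative assumptions.

```python
import subprocess

def stage_then_analyze(remote_url, local_path, analysis_cmd):
    """Block on a data-staging transfer, then run the analysis step (the R5 pattern)."""
    # Stage the dataset onto the local (fast) file system before the job needs it.
    subprocess.run(["globus-url-copy", remote_url, f"file://{local_path}"], check=True)
    # Only now is the post-processing tool launched; it can assume its input is local.
    subprocess.run(analysis_cmd, check=True)

stage_then_analyze(
    "gsiftp://archive.example.org/data/run42.h5",     # hypothetical source endpoint
    "/scratch/user/run42.h5",                         # hypothetical local destination
    ["visit", "-cli", "-nowin", "-s", "render.py"],   # illustrative VisIt batch run
)
```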

16 Data Architecture (8)
Many TeraGrid sites provide effectively unlimited archival storage to compute-allocated users.
The volume of data flowing into and out of particular archives is already increasing drastically, in some cases exponentially, beyond the capacity of the disk caches and tape drives currently allocated.
–R7: The TeraGrid must provide better organized, more capable, and more logically unified access to archival storage for the user community. (Proposal to NSF for a unified approach to archival storage and data replication)

17 Plans for PY5
Implement Data Architecture recommendations
–User portal integration
–Data Collections infrastructure
–Archival replication services
–Continued investigation of new location-independent access mechanisms (PetaShare, ReDDnet)
Complete production deployments of Lustre-WAN
Develop plans for next-generation Lustre-WAN and pNFS technologies
Work with the CTSS team on continued improvements to Data kit implementations

