TeraGrid: A Powerful, Parallel, Fast, and Free Computational Resource A Bytes ‘n Bites Presentation Michael J. Reale and Dr. James Wolf Information Technology Center 10/14/2010
What is TeraGrid? TeraGrid is an open scientific discovery infrastructure combining leadership class resources at eleven partner sites to create an integrated, persistent computational resource. It is the world's largest, most comprehensive distributed cyberinfrastructure for open scientific research. U.S. researchers and educators may request access to any or all of TeraGrid’s resources at no cost to their research project.
What resources does it provide? Massively parallel computing power ▫More than a petaflop (10^15 floating-point operations per second) of total computing capability Storage space ▫More than 30 petabytes of online and archival data storage, with rapid access and retrieval over high-performance networks A large selection of software installed and ready for use Advanced Support for TeraGrid Applications (ASTA)
How do I get started? Contact us and we’ll help you get a Startup Allocation! ▫http://internet2.binghamton.edu/teragrid/contact.cgi Also, you may want to check: ▫http://internet2.binghamton.edu/teragrid/faq.cgi#howstart
Outline Access Computation Queues and Wait Times Data Visualization Educational Allocations Training and Help
“Speak, friend, and enter…” -- J.R.R. Tolkien, The Fellowship of the Ring
Logging into TeraGrid There are three ways to log in that work with all TeraGrid sites: ▫The TeraGrid Portal (portal.teragrid.org) ▫Globus Toolkit Software ▫GSI-SSHTerm All of the above use Single Sign-On (SSO)
The TeraGrid Portal ▫Log into the TeraGrid Portal (portal.teragrid.org) ▫Go to “My TeraGrid”, then “Accounts” ▫Click the “Login” link for the site you wish to connect to ▫The Java SSH terminal will open in the browser
Globus Toolkit Software: Compiling from Source
Get the source tarball (4.0.8 worked for me; the latest version did not) from
▫wget source-installer.tar.bz2
Unpack it (this will take a bit)
▫tar -xvf gt4.0.8-all-source-installer.tar.bz2
▫cd gt*-installer
Compile and install to $HOME/globus (this will also take a bit)
▫./configure --prefix=$HOME/globus
▫make gsi-myproxy gsi-openssh gridftp
▫make install
Add to your path
▫export PATH=$HOME/globus/bin:$PATH
You can get rid of the source directory and tarball when you’re done. The installed software takes about 91MB of space.
Globus Toolkit Software: TeraGrid Client Toolkit Installing the entire Globus Toolkit can be a bit of an ordeal. So, if you are running Linux or Mac, you can install the TeraGrid Client Toolkit, which contains the subset of the Globus Toolkit software needed to log in and work with TeraGrid: ▫https://www.teragrid.org/web/user-support/sso_tg_client_toolkit Untar the file, and run the following: ▫cd teragrid-client ▫./install-teragrid-client WARNING: It is not supported on all platforms, but you may be able to trick it into installing (see Appendix A).
Globus Toolkit Software: Logging In Let us assume that the Globus Toolkit is installed in $HOME/globus. The script on the next slide will set up your environment for logging into TeraGrid sites. If you named this script “setupTG.sh”, you would call the following with your TeraGrid portal username: ▫source setupTG.sh After entering your TeraGrid portal password, you should be able to log into any site using gsissh: ▫gsissh tg-login.ranger.tacc.teragrid.org Note: depending on how you installed the Globus Toolkit, you may need to add the library path for Glite to LD_LIBRARY_PATH as well.
GSI-SSHTerm Go to https://security.ncsa.illinois.edu/gsi-sshterm/ Download the “Java Web Start Version” (make sure to pick the one using TeraGrid credentials) The program provides both an SSH terminal and an SFTP tool. Further instructions for use can be found here: mediawiki/images/a/a4/HowToUseGSISSHApplet.pdf
GSI-SSHTerm: Login Problems Make sure there are no trailing spaces in the address box when connecting. Otherwise, the program silently fails.
“On two occasions I have been asked, ‘Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?’ I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.” -- Charles Babbage
Computational Resource Types Regular ▫SMP ▫MPP ▫Cluster Visualization Special New
SMP (Symmetric Multiprocessing) Characteristics: ▫Multiple CPUs ▫Same cabinet ▫Share the same memory Good for large memory jobs (even serial ones) ▫Caveats: Need to request enough nodes to get the desired memory Memory access outside the node is slower Examples: ▫NCSA – Cobalt SGI Altix system Primarily for large shared memory applications Will be replaced by a new system called Ember (SGI UV System) ▫PSC – Pople SGI Altix system Primarily for shared memory and hybrid architectures ▫IU – Quarry Used for web services hosting and Science Gateways
MPP (Massively Parallel Processing) Characteristics: ▫Uses up to thousands of processors ▫Same cabinet ▫Distributed shared memory Good for jobs that need high-performance but also need lots and lots of cores Examples: ▫IU – Big Red IBM e1350 PowerPC; distributed, shared-memory cluster intended to run parallel as well as serial applications ▫NCAR – Frost IBM BlueGene/L – 8,192 processors; highly scalable platform for developing, testing, and running parallel MPI applications ▫NICS – Kraken Cray XT5 – 66,048 processors; intended for highly scalable applications (minimum startup allocation request is 100,000 SUs) ▫TACC – Ranger Sun Constellation – 62,976 processors; intended for codes scalable to thousands of cores
Cluster Characteristics: ▫No real cap on the number of processors/nodes ▫Different physical machines (may also have heterogeneous composition of nodes) ▫Memory not shared Good for massively parallel jobs with little inter-process communication Examples: ▫NCSA – Abe (Part of ASQL) Dell PowerEdge 1955 – 9,600 processors; intended for highly parallel, scalable applications ▫LONI – QueenBee (Part of ASQL) Dell PowerEdge 1950 – 5,344 processors; intended for parallel applications scalable up to 5,344 cores ▫Purdue – Steele (Part of ASQL) Dell PowerEdge 1950 – 7,144 processors (1600 are available in the longest running production queue) Suited for a wide range of serial and small/medium parallel jobs Longest Wall Time (720 hours) ▫TACC – Lonestar (Part of ASQL) Dell PowerEdge 1955 – 5,840 processors; intended primarily for applications scalable up to 4,096 cores
Visualization Intended for visualization and graphical applications Examples: ▫TACC – Longhorn Dell/NVIDIA Visualization and Data Analysis Cluster A hybrid CPU/GPU system designed for remote, interactive visualization and data analysis, but it also supports production, compute-intensive calculations on both the CPUs and GPUs via off-hour queues ▫TACC – Spur Sun Visualization Cluster 128 compute cores / 32 NVIDIA FX5600 GPUs Intended for serial and parallel visualization applications that take advantage of large per-node memory, multiple computing cores, and multiple graphics processors ▫NICS – Nautilus Visualization and Analysis SGI UltraViolet 1024 cores (Intel Nehalem) 4 TB global shared memory 1 PB file system Pre-production, but accepting allocations
Special NCSA – Lincoln ▫Dell PowerEdge 1950 / NVIDIA Tesla S1070 ▫Intended for applications that can make use of heterogeneous processors (CPU and GPU) Purdue – Condor (high throughput) ▫Pool of more than 27,000 processors ▫Various architectures and operating systems ▫Designed for high-throughput computing; excellent for parameter sweeps, Monte Carlo simulations, and other serial applications SDSC – DASH ▫Intel Nehalem – 544 processors ▫vSMP (virtual shared memory) software from ScaleMP that aggregates memory across 16 nodes. This allows applications to address 768GB of memory ▫4 TB of Flash memory configurable as a fast file I/O subsystem or as extended fast virtual memory swap space.
New FutureGrid: A grid testbed (Indiana University and many other partners) ▫A collection of different systems ▫Focus on virtual machines ▫Not currently allocated via standard TeraGrid POPS allocation system ▫More info: MATLAB on the TeraGrid ▫New system from Cornell ▫Not currently allocated via standard TeraGrid POPS allocation system ▫Need a local copy of MATLAB and the MATLAB Parallel Computing Toolbox ▫More info: ▫How to request access:
Resource Selection Spreadsheet Lists resource sites, queues, job statistics, and links to manuals ▫http://www.teragridforum.org/mediawiki/images/7/70/ResourceInfo_SelectionAid.xls
More information… The preceding slides are largely from Kim Dillman’s presentation given at TG’10: ▫http://www.teragridforum.org/mediawiki/images/1/12/ComputeSession-ChampionsTG10.pdf
“PATIENCE, n. A minor form of despair, disguised as a virtue.” -- Ambrose Bierce
Queue Policies Many different factors can limit and/or prioritize jobs. Possible limits: ▫# of jobs queued or running per user ▫# of jobs queued or running per project allocation ▫# of total nodes per user Priorities can be affected by: ▫Wall time request ▫# of CPUs (sometimes want more, sometimes less) ▫Time already spent waiting in the queue ▫Number of jobs previously run by the user
Some useful PBS Commands
List queues defined:
▫qstat -Q
Get details about the queues:
▫qstat -Qf
List jobs in queue:
▫qstat -a
List only your jobs:
▫qstat -u <username>
List details of a job:
▫qstat -f <job_id>
Show estimated job start (on some systems):
▫showstart <job_id>
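If you check the queues often, the commands on this slide can be wrapped in a small script. The following is a minimal Python sketch, not part of the original presentation: the function names and the action labels are made up for illustration, and it assumes that `qstat -u`, `qstat -f`, and `showstart` take a username or job ID as their argument. It only runs a command when the tool is actually on your PATH.

```python
import shutil
import subprocess

# Command templates taken from the slide; the dictionary keys are hypothetical labels.
PBS_COMMANDS = {
    "queues": ["qstat", "-Q"],
    "queue_details": ["qstat", "-Qf"],
    "all_jobs": ["qstat", "-a"],
    "user_jobs": ["qstat", "-u"],      # assumed to need a username
    "job_details": ["qstat", "-f"],    # assumed to need a job ID
    "start_estimate": ["showstart"],   # assumed to need a job ID; only on some systems
}
NEEDS_TARGET = {"user_jobs", "job_details", "start_estimate"}


def pbs_command(action, target=None):
    """Return the argument list for one of the queries above."""
    argv = list(PBS_COMMANDS[action])
    if action in NEEDS_TARGET:
        if not target:
            raise ValueError(f"{action} needs a username or job ID")
        argv.append(str(target))
    return argv


def run_pbs(action, target=None):
    """Run the query and return its output, or None if the tool is not installed."""
    argv = pbs_command(action, target)
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    print(pbs_command("user_jobs", "alice"))
```

On a login node, `run_pbs("all_jobs")` would return whatever `qstat -a` prints there; on a machine without PBS it simply returns None.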
Karnak Prediction Service Gives the following: ▫System Information # of running jobs, waiting jobs, and used processors Information about status of queues ▫Wait Time Predictions ▫Start Time Predictions Warnings: ▫Not real-time information ▫Pages don’t display in IE for some reason (Firefox works)
“To know that we know what we know, and to know that we do not know what we do not know, that is true knowledge.” -- Copernicus
Transferring Data For large files, it’s best to use something that supports the GridFTP protocol: ▫Uberftp ▫Globus Toolkit functions (globus-url-copy) GridFTP supports: ▫Multithreaded transfers ▫Striping over several hosts ▫Third-party transfers ▫Transfer rates as high as 750MB/s (network permitting)
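To get a feel for what that peak rate means in practice, here is a small Python sketch (not from the original slides). It assumes an idealized, constant transfer rate and takes 1 MB as 10^6 bytes; the 10 MB/s comparison figure is a made-up example of a slower link, not a measured value.

```python
def transfer_seconds(size_bytes, rate_mb_per_s):
    """Idealized transfer time at a constant rate, taking 1 MB as 10**6 bytes."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / (rate_mb_per_s * 10**6)


if __name__ == "__main__":
    one_terabyte = 10**12
    # 750 MB/s is the peak quoted on the slide; 10 MB/s is a hypothetical slow link.
    for label, rate in [("GridFTP peak (750 MB/s)", 750), ("slow link (10 MB/s)", 10)]:
        minutes = transfer_seconds(one_terabyte, rate) / 60
        print(f"{label}: {minutes:.1f} minutes per TB")
```

At the quoted peak, a terabyte moves in roughly 22 minutes; real transfers will be slower whenever the network or the disks are the bottleneck.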
FTP on TeraGrid It is now possible to use an FTP client to connect directly to TeraGrid resources using the following address: ▫ftp://vfs.teragrid.org WARNING: This is regular FTP, NOT SFTP.
Storage Options: General HOME directory ▫Permanent (non-purged) ▫Not very big ▫Visible to all nodes in a cluster SCRATCH space ▫Temporary ▫Shared among other users ▫Fairly large ▫Visible to all nodes in a cluster Parallel file systems ▫Temporary ▫Fast ▫Large ▫Visible to all nodes in a cluster Archival (mass) storage ▫Permanent (can be replicated) ▫Slow ▫Quite large ▫Visible to all TeraGrid sites A complete list of the filesystems available at each site can be found here: https://www.teragrid.org/web/user-support/storage
Storage Options: GPFS-WAN Project Space If you have a lot of data that needs to be analyzed, you can request project space on GPFS-WAN (Global Parallel File System-Wide Area Network) ▫Total size: 475 TB (quotas based on request) ▫Not purged, but also not backed up ▫Data and directories will be removed one month after the assigned TeraGrid project allocation expires ▫Mounted on IU’s Big Red site More info: ▫https://www.teragrid.org/web/user-support/gpfswan
Storage Options: Indiana University HPSS Archive Default quota: 5TB (but you can ask for more) Fast access: hundreds of MB/s ▫Recommend GridFTP clients (gridftp.mdss.iu.edu) Two copies stored on two separate sites Use if you have: ▫Files of at least 1MB (a single file can be up to 10TB) that are rarely updated and need to be kept a long time ▫Files that are read often (frequently accessed files tend to stay in the disk cache) Not good for: ▫Small files ▫Files that will frequently change Should ask for allocation on Big Red More info: ▫http://www.teragridforum.org/mediawiki/images/e/ea/Champions-tg10-archival-storage.pdf
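Since the archive is a poor fit for small files, one common workaround is to bundle them into a single tar file before uploading. The sketch below is a suggestion of ours rather than something the slides prescribe: the function name and the 1 MB threshold constant (decimal megabytes assumed) are illustrative, and it uses only Python's standard `tarfile` module.

```python
import tarfile
from pathlib import Path

MIN_ARCHIVE_BYTES = 10**6  # "at least 1MB" from the slide, assuming decimal megabytes


def bundle_small_files(directory, bundle_path):
    """Pack every file under `directory` into one uncompressed tar; return its size."""
    directory = Path(directory)
    bundle_path = Path(bundle_path)
    # Collect the file list first so the bundle itself is never added to the archive.
    files = sorted(
        p for p in directory.rglob("*")
        if p.is_file() and p.resolve() != bundle_path.resolve()
    )
    with tarfile.open(str(bundle_path), "w") as tar:
        for path in files:
            tar.add(str(path), arcname=str(path.relative_to(directory)))
    return bundle_path.stat().st_size


def is_archive_sized(size_bytes):
    """True if a file meets the minimum size recommended on the slide."""
    return size_bytes >= MIN_ARCHIVE_BYTES
```

A typical use would be to call `bundle_small_files("results", "results.tar")`, check the returned size with `is_archive_sized`, and then transfer the single tar file with a GridFTP client.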
Storage Options: NCSA Tape Archive mss.ncsa.uiuc.edu OR mss.ncsa.teragrid.org Can login using ssh/gsissh Two copies of each file are made without requiring user interaction Directories: ▫Home directory Quota: 1 TB (cannot request more) ▫Project directories (named with three letter NCSA PSN) Quotas: Startup: 1TB TRAC: 5 TB Can ask for supplement through POPS system PI is owner of project folder; user subdirectories created under project space Soft links in the user's home directory to their subdirectory in projects area
“I see nobody on the road,” said Alice. “I only wish I had such eyes,” the King remarked in a fretful tone. “To be able to see Nobody! And at such a distance too!” -- Lewis Carroll
TACC Longhorn Visualization Portal https://portal.longhorn.tacc.utexas.edu/ TACC's Dell XD Visualization Cluster ▫2048 compute cores ▫14.5 TB aggregate memory ▫512 GPUs ▫QDR InfiniBand interconnect ▫Connected by 10GigE to Ranger's Lustre parallel file system Select session type: ▫VNC (need to create password) ▫EnVision Select number of nodes (1 to 16, which translates to processors, respectively)
ParaView Free to download Can be run on your desktop/laptop, but also can take advantage of HPC resources Already installed on TACC Longhorn Among others, loads the following file formats: ▫*.cube files (Gaussian) ▫*.vtk files (LAMMPS, with a little work) Lots of rendering and visualization options Tutorial (for version 3.8): ▫http://www.vtk.org/Wiki/The_ParaView_Tutorial
ParaView for Gaussian
ParaView for LAMMPS
Your LAMMPS job must create a dump file
▫dump 1 peptide atom 10 dump.peptide
Download the Pizza.py script:
▫http://www.sandia.gov/~sjplimp/pizza.html
Use the Pizza.py script to convert the dump to a series of VTK files (creates peptide0000.vtk, peptide0001.vtk, etc.):
▫python -i pizza.py
▫> d = dump("dump.peptide")
▫> v = vtk(d)
▫> v.many("peptide")
Load all of the files into ParaView
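To see roughly what ends up in each converted file, here is a minimal Python sketch that writes one snapshot of atom positions as a legacy ASCII VTK file. This is an independent illustration, not Pizza.py's actual implementation: the function names, the tuple layout, and the `atom_type` scalar name are assumptions chosen for the example. Only the numbered file naming (peptide0000.vtk, peptide0001.vtk, ...) comes from the slide.

```python
def snapshot_filename(prefix, index):
    """Build names such as peptide0000.vtk, matching the pattern on the slide."""
    return f"{prefix}{index:04d}.vtk"


def write_vtk_snapshot(path, atoms, title="LAMMPS snapshot"):
    """Write atoms given as (atom_type, x, y, z) tuples as legacy ASCII VTK points."""
    n = len(atoms)
    lines = [
        "# vtk DataFile Version 3.0",
        title,
        "ASCII",
        "DATASET POLYDATA",
        f"POINTS {n} float",
    ]
    lines += [f"{x} {y} {z}" for _, x, y, z in atoms]
    # One vertex cell per atom: each cell lists its point count (1) and point index.
    lines.append(f"VERTICES {n} {2 * n}")
    lines += [f"1 {i}" for i in range(n)]
    # Attach the atom type as a per-point scalar so ParaView can color by it.
    lines += [f"POINT_DATA {n}", "SCALARS atom_type int 1", "LOOKUP_TABLE default"]
    lines += [str(atom_type) for atom_type, _, _, _ in atoms]
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return lines


if __name__ == "__main__":
    demo_atoms = [(1, 0.0, 0.0, 0.0), (2, 1.5, 0.0, 0.0)]  # made-up coordinates
    write_vtk_snapshot(snapshot_filename("demo", 0), demo_atoms)
```

A file written this way opens in ParaView as a point cloud; for real trajectories, the Pizza.py route above is the one the slide recommends.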
ParaView for LAMMPS
ParaView on TACC Longhorn
Assuming you are using the Visualization Portal with a VNC session…
▫In one xterm:
  module load paraview
  vglrun paraview
▫In the other:
  module load paraview
  env NO_HOSTSORT=1 ibrun tacc_xrun pvserver
▫WARNING: You can minimize the windows that open, but do NOT close them!
▫In the ParaView GUI window:
  File > Connect > Add Server
  Enter a name
  Configure
  Under “Startup Type,” select “Manual”
  Select the name of your server configuration, and click "Connect"
  In the xterm where you launched the ParaView server, you should see "Client connected."
▫To increase the image quality, go to Options on the Portal page, and select a higher number for the JPEG image quality.
More info:
Longhorn Visualization Portal: Using a VNC client Although you can connect to a Longhorn VNC session through a browser, the remote screen usually doesn’t display properly (it is cut off, and you can’t scroll). However, when you start a job, you are given an address to connect to Longhorn through a VNC client you can run locally. ▫See the “Jobs” tab in the Longhorn portal Some example VNC clients: ▫Windows/Linux: TightVNC, TurboVNC, UltraVNC ▫Mac: Chicken of the VNC WARNING! You MUST kill the job in the portal. Exiting the VNC client will NOT end the job!
TACC Portal Demo Log into the Longhorn Visualization Portal. Start a VNC session job. Get the VNC address. Connect using TightVNC. Start and run ParaView.
“A mind once stretched by a new idea never regains its original dimensions.” -- Anonymous
Educational Allocations If you are teaching a class and would like to use TeraGrid resources in the classroom, you can! Apart from the regular Startup and Research allocations, there is an “Educational” allocation option as well. ▫It is recommended that you request it fairly early, before the semester starts. ▫Once you have your class roster, you can add your students to the account (they should have their login information within one to two weeks). For more information: ▫https://www.teragrid.org/web/user-support/startup
“Live as if you were to die tomorrow. Learn as if you were to live forever.” -- Gandhi “I cannot teach anybody anything, I can only make them think.” -- Socrates
Training and Help CI-Train Project ▫http://www.ci-train.org/ Binghamton webpage (FAQ and assorted documentation): ▫http://internet2.binghamton.edu/teragrid/index.cgi The Campus Champions (us!) GRID-L ▫Binghamton listserv for those interested in grid and parallel computing ▫Send to with body text: SUBSCRIBE GRID-L Firstname Lastname
“The important thing is not to stop questioning. Curiosity has its own reason for existing.” -- Albert Einstein
Appendix A: TeraGrid Client Toolkit Installation Problems If you run into problems, you may have to trick the thing into installing: ▫export TERAGRID_CLIENT_PLATFORM=linux-rhel-4 ▫export VDT_ALLOW_UNSUPPORTED=1 Replace linux-rhel-4 with the operating system nearest yours. ▫See http://vdt.cs.wisc.edu/releases/2.0.0/requirements.html for the list. Then, rerun the installation script. If you still have problems, check the install.log and contact us.
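If you would rather not change your login shell's environment, the same two variables can be set for the installer alone. This is a rough Python sketch under our own assumptions: the function names are hypothetical, and it expects the install script (named `install-teragrid-client` on the Client Toolkit slide) to be in the current directory. It does nothing if the script is not there.

```python
import os
import subprocess


def installer_environment(platform="linux-rhel-4", base_env=None):
    """Copy the environment and add the two override variables from this appendix."""
    env = dict(os.environ if base_env is None else base_env)
    env["TERAGRID_CLIENT_PLATFORM"] = platform  # nearest supported platform name
    env["VDT_ALLOW_UNSUPPORTED"] = "1"
    return env


def rerun_installer(script="./install-teragrid-client", platform="linux-rhel-4"):
    """Rerun the installer with the overrides; return its exit code, or None if absent."""
    if not os.path.exists(script):
        return None
    completed = subprocess.run([script], env=installer_environment(platform), check=False)
    return completed.returncode
```

Because the overrides live only in the child process, your normal shell environment is left untouched if the trick does not work.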
Appendix B: globus-url-copy
Parameters:
▫-vb: display bytes transferred and average performance
▫-fast: recommended when using GridFTP servers; uses MODE E for all data transfers, including reusing data channels between list and transfer operations
▫-stripe: enable striped transfers on supported servers
▫-tcp-bs: specifies the size (in bytes) of the TCP buffer to be used by the underlying FTP data channels
  8M is a good value, although technically the best value is determined by: bandwidth in megabits per second (Mb/s) * RTT in milliseconds (ms) * 1000 / 8
  RTT = round-trip time, i.e. the time it takes a packet to travel from the source to the destination and back (use ping to measure it)
▫-p: specifies the number of parallel data connections that should be used
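The buffer-size rule of thumb is easy to get wrong by a factor of 1000, so here is a short Python sketch that works it out and assembles the matching command line. It is an illustration only: the function names, the default of 4 parallel connections, and the gsiftp:// example URLs are hypothetical, and the sketch just builds the argument list without running anything.

```python
def tcp_buffer_bytes(bandwidth_mbps, rtt_ms):
    """Buffer size in bytes: bandwidth (Mb/s) * RTT (ms) * 1000 / 8."""
    return int(bandwidth_mbps * rtt_ms * 1000 / 8)


def globus_url_copy_argv(source_url, dest_url, bandwidth_mbps, rtt_ms,
                         parallel=4, stripe=False):
    """Assemble a globus-url-copy command using the options listed above."""
    argv = ["globus-url-copy", "-vb", "-fast"]
    if stripe:
        argv.append("-stripe")
    argv += ["-tcp-bs", str(tcp_buffer_bytes(bandwidth_mbps, rtt_ms))]
    argv += ["-p", str(parallel)]
    argv += [source_url, dest_url]
    return argv


if __name__ == "__main__":
    # Hypothetical endpoints; substitute the GridFTP servers you actually use.
    command = globus_url_copy_argv(
        "gsiftp://source.example.org/scratch/data.tar",
        "gsiftp://dest.example.org/archive/data.tar",
        bandwidth_mbps=1000,
        rtt_ms=64,
    )
    print(" ".join(command))
```

For a 1000 Mb/s path with a 64 ms round trip, the formula gives 8,000,000 bytes, which lines up with the "8M is a good value" advice; a shorter round trip or a slower link calls for a smaller buffer.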