Quality Assurance (QA) Working Group Update July 1, 2010 Kate Ericson (SDSC) Shava Smallen (SDSC)

1 Quality Assurance (QA) Working Group Update July 1, 2010 Kate Ericson (SDSC) Shava Smallen (SDSC)

2 Prioritize testing/debugging
Reviewed sources of data to determine how heavily CTSS components are used (i.e., how important they are to users) and how reliable they are:
1) process accounting data being collected by TACC, NICS, and Purdue
2) our group's own system administrators' expertise
3) the TG ticketing system
4) survey results we collected from the User Services WG
5) Inca monitoring results
6) GRAM and GridFTP usage data collected by the Operations WG
7) GRAM and GridFTP monitoring data being collected by the NanoHUB group

3 Prioritize testing/debugging
Sustainable usage data collection
–Because user requirements and priorities change over time, we are working to create a sustainable process for collecting and analyzing usage data
Identify existing tests to be used and/or develop new tests
–Evaluated and revised existing Inca tests (e.g. GRAM usage reporter)
–Wrote new tests (e.g. condor-g matchmaking, karnak)

4 Testing/debugging
Examined which services fail most often and debugged with admins and developers
–Expired CRLs on the Grid nodes were often the cause of GRAM and GridFTP failures. Worked with the Security WG to update tests to display a warning when a CRL is within three days of expiration and to notify the appropriate contact when a CRL has expired.
–WS-GRAM was the most unreliable service, but since the Globus group is working to migrate users to GRAM5, the group felt it would not be a productive use of our time to debug it further. RFT was similarly unreliable, but since its primary use is as part of a WS-GRAM job, the group felt debugging it was of similarly low priority. Added KB articles about known problems.
–Condor is a well-used service, but its test was periodically failing. The problem was the test itself; worked with the admin and Condor developers to improve the testing.
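The CRL rule described on this slide (warn within three days of expiration, notify on expiry) can be sketched in a few lines of Python. This is an illustration only, not the actual Inca test: the function name `classify_crl`, the status strings, and the choice to treat a CRL as expired once the current time reaches its nextUpdate timestamp are all assumptions. Reading the nextUpdate value out of a real CRL file is left out.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the three-day warning window mentioned on the slide.
WARNING_WINDOW = timedelta(days=3)


def classify_crl(next_update, now, warning_window=WARNING_WINDOW):
    """Classify a CRL by how close it is to its nextUpdate time.

    Hypothetical helper: returns "expired" if nextUpdate has been
    reached, "warning" if it falls within the warning window, and
    "ok" otherwise.
    """
    if now >= next_update:
        return "expired"   # a real test would notify the contact here
    if next_update - now <= warning_window:
        return "warning"   # a real test would display a warning here
    return "ok"


if __name__ == "__main__":
    now = datetime(2010, 7, 1, tzinfo=timezone.utc)
    for days in (-1, 2, 10):
        status = classify_crl(now + timedelta(days=days), now)
        print(f"CRL nextUpdate in {days:+d} days: {status}")
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the check easy to test against fixed dates.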

5 GRAM5 testing/debugging
Sources of testing
–Inca: tests look good overall but are generally lightweight; they show a zombie process problem
–nanoHUB: tests look good on lonestar; some issues are being worked out on qb
–Gateway scalability testing: deployed a temporary test node on ranger. Began QA collaboration with Gateway developers to conduct testing and transfer knowledge. Gateway developers were able to use the test node for their own testing after problems on the ranger production node.

6 GRAM5 testing/debugging
Debugging
–Suggestions to Globus developers for improved logging of error messages
–Ongoing discussion of GRAM5 experiences and sys admin updates on TeraGrid's user portal forum
–QA, Gateway, Software working groups
–Investigating possibility of using FutureGrid

7 Work with Related Groups
Met with the XD TAS group from U. Buffalo during the TG quarterly meeting in February to talk about interactions between the two groups, which will be complementary.
–Providing Inca and some TG support (e.g. accounts).
–Will continue to discuss collaboration in more detail when the U. Buffalo team visits SDSC as part of their start-up tour.
–TAS is also leveraging FutureGrid Inca performance benchmark work.
Participated in Science Gateway meeting to discuss GRAM5 early deployments and debugging
Monitoring CUE progress and need for testing

8 More Information
Working Group Page: http://www.teragridforum.org/mediawiki/index.php?title=QA_WG
Deliverables: http://www.teragridforum.org/mediawiki/index.php?title=QA_Deliverables
Action Items: http://www.teragridforum.org/mediawiki/index.php?title=QA_Action_Items

