Presentation is loading. Please wait.

Presentation is loading. Please wait.

KISTI’s Activities on the NA4 Biomed Cluster Soonwook Hwang, Sunil Ahn, Jincheol Kim, Namgyu Kim and Sehoon Lee KISTI e-Science Division.

Similar presentations


Presentation on theme: "KISTI’s Activities on the NA4 Biomed Cluster Soonwook Hwang, Sunil Ahn, Jincheol Kim, Namgyu Kim and Sehoon Lee KISTI e-Science Division."— Presentation transcript:

1 KISTI’s Activities on the NA4 Biomed Cluster Soonwook Hwang, Sunil Ahn, Jincheol Kim, Namgyu Kim and Sehoon Lee KISTI e-Science Division

2 Outline  DrugScreener-G:  Toward Integrated Environment for Grid- enabled Large-scale Virtual Screening  Data challenge on Diabetes using the newly developed performance- enhanced WISDOM Production environment  New Developments in AMGA

3 DrugScreener-G: Toward Integrated environment for Grid-enabled large-scale Virtual Screening

4 DrugScreener-G  Objective  Providing a user-friendly integrated environment for Grid-based large-scale virtual screening for users without much knowledge of Grid computing to exploit full power of Grid computing infrastructure for drug discovery  Target users: Bioinformaticians, Biologists, Drug Chemists

5 Virtual Screening in DrugScreener-G

6 Plug-in Architecture in DrugScreener-G Plug-ins on client’s side Plug-ins on server’s side

7 Support for multiple platforms

8 Data Challenge on Diabetes

9 Data Challenge against diabetes type 2  Data challenge to Human pancreatic amylase inhibitor (1u2y), a target protein of diabetes type 2, with 308310 chemical compounds  Achieved significant improvement in efficiency and throughput WISDOM-IIDIANEKISTI Total number of dockings156,407,400308,585308,310 Estimated duration on 1 CPU413 years16.7 years39.0 years Duration of the experiment76 days30 days2.4 days Cumulative number of Grid jobs77,5042,580103,583 Maximum number of concurrent CPUs50002407,370 Number of used Computing Elements9836127 Crunching Factor19832035937 Distribution Efficiency39%84%81%

10 Performance Enhancement in the WISDOM Production Environment

11 Evolution of WISDOM Production Environment  Data/Control Flows in one slide WN SE CERB UI Production Environment Retrieve Target Submit Jobs Monitor Job Status Update Job Status Get Task Retrieve Software, Compound Send Heartbeat signal Update Job Status Store Result AMGA

12 The exploitation of AMGA for storing ligands and docking results  Why is AMGA used for the input/output data storing purpose?  AMGA performs way much better than SE in terms of throughput and latency 36.4 0.25 1107.0 6.1 4343.3 36.4 Docking Throughput Time taken to retrieve a ligandTime taken to store a docking result

13 The use of AMGA for task distribution purpose in the pull model of WISDOM PE  A new AMGA operation is designed to allow jobs to fetch a task from the AMGA server in a single operation, rather than calling  the AMGA source code modified and a new operation “updateAttr_single” added  The throughput of task retrieval using the new operation is 70 times higher than that of using the existing AMGA operations

14 Enhanced Monitoring & Bookkeeping Workload status and Job throughput monitoring Status summery and simulation throughput monitoring Job distribution and execution status with CE info

15 Additional Features relevant to Performance Improvement of WISDOM PE  Flexible fault tolerance  Can recover a previous instance using data on AMGA  Can increase/decrease the number of agents of a running previous instance  Improved Job distribution efficiency  Dynamic management of the list of available RB and CE  While the initial job submission keeps going, we can check the status of submitted jobs and resubmit failed jobs  Direct collecting the statistical data of sites from WN  Its own ranking based on previous statistical data  Timeout feature (submission timeout, heartbeat timeout)

16 New Developments in AMGA

17 AMGA 2.0  Currently preparing AMGA 2.0 release  To be released later this year  Stability of WS-DAIR server needs to be improved before release  Feature-complete 1.9 technology preview available  Support for the import of existing relational tables  WS-DAIR frontend  Native SQL support  Multi-threaded DB backend with connection pooling

18 WS-DAIR Support  WS-DAIR  OGF standard for access to relational DB's on the Grid  Background  Enable interoperability to facilitates sharing data  AMGA uses its own protocol and message patterns  Integration of WS-DAIR in AMGA will make AMGA a relational data source in a WS-based environment!  AMGA-DAIR Integration  Implementation of direct access and indirect access done.  Data returned in Sun's WebRowSet specification.  All queries use SQL.  Clients written in C++ (gSOAP)

19 SQL Queries  AMGA allows queries now in native SQL  Support for entry level SQL 92 done  Some 92 intermediate level supported  SELECT, UPDATE, INSERT, DELETE  All queries are subject to AMGA access restrictions  All queries are parsed by AMGA, AMGA “understands” security implications  Posix ACLs for tables/entries  SQL 92 query is translated into backend DB dialect

20 Multithreaded & DB Connection Pool Backend Support  AMGA 1.3 used one process per client connection  Processes allow control of misbehaving clients  But processes cannot share DB connections  Limits # concurrent clients 32 1 1000 0 2000 Max # Requests / sec # AMGA Servers AMGA 1.9: Multiple threads sharing a DB connection encapsulated in multiple processes Threads allow better resource usage on DB WISDOM measured 1500 Queries/s, more than the required 5million Queries / h peak needed for data challanges

21 Thank you


Download ppt "KISTI’s Activities on the NA4 Biomed Cluster Soonwook Hwang, Sunil Ahn, Jincheol Kim, Namgyu Kim and Sehoon Lee KISTI e-Science Division."

Similar presentations


Ads by Google