e-Science Data, Information and Knowledge Transformation
BinX: An edikt Project Testbed
Ted Wen, Robert Carroll, Denise Ecklund, Bob Gibbins, Davy Virdee, Rob Baxter
www.edikt.org
Presentation outline
edikt project
A data problem
BinX – today
–language
–library
–applications
BinX – future
What is edikt?
e-Science Data, Information and Knowledge Transformation
–a research and development activity designed to bridge the gap between application science and computer science in the realm of Grid-scale data
take prototypes from CS and Grid research…
…engineer them into robust tools…
…for real application science problems…
…test them under extreme science conditions…
…and keep an eye on the commercial possibilities
Team of 8 professional engineers, management & staff
Funded by SHEFC; project started in May 2002
Current activities
edikt::Eldas
–proving GGF's GDSS for virtual organisations
–developing scalable data access technologies
edikt::BinX
–data interchange for astronomy & PP
edikt::Giggle and RLS
–evaluation of data replication technology for PP
Bioinformatics
–data mediation to integrate multiple data sources
–data versioning to manage changing schemas
eScience Data: Real-World and In Silico Experiments
Research and discovery
Workflow support tools
–Format converter
–Model builder
Existing tools: XML processors
New tools: Perl script generators, model description generators
[Diagram labels: Real-world Experiments, In silico Experiments, Data, Analysis, Result Data, Results, Generic Tools, App areas 1–4, Workflow, Abstract Model, nodes marked C]
Data integration & mediation
[Diagram labels: Distributed Geo-sensors, Real-world Experiments, Data Integrator/Mediator, Integrated Data, Public Biochemical Signalling DBs, S1–S6, D1–D3, Reaction 1, Reaction 2, … Reaction n]
–One sensor type with overlapping observation regions
–Resolve conflicting values in the overlap
–Compute total space: min or max? If max, define missing values
–Match the input records
–Build integrated records
–Detect data value conflicts
–Resolve data value conflicts
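The sensor-integration steps on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: each sensor is modelled as a dict from integer grid cells `(x, y)` to a value, "min" total space is read as the cells every sensor observed, "max" is read as the bounding box of all observed cells, conflicts in the overlap are resolved by averaging, and missing values are `NaN`. The function name `integrate` and its parameters are hypothetical.

```python
import math
from statistics import mean


def integrate(sensors, space="max", resolve=mean, missing=math.nan):
    """Merge per-sensor grids into one integrated grid.

    sensors: list of dicts mapping (x, y) cells to observed values
             (an assumed data model, not taken from the slides).
    space:   "min" keeps only cells seen by every sensor;
             "max" covers the bounding box of all observed cells.
    resolve: function that reduces conflicting values to one value.
    missing: value used for cells in the total space that no sensor saw.
    """
    covered = [set(s) for s in sensors]
    if space == "min":
        cells = set.intersection(*covered)
    else:
        union = set.union(*covered)
        xs = [x for x, _ in union]
        ys = [y for _, y in union]
        cells = {
            (x, y)
            for x in range(min(xs), max(xs) + 1)
            for y in range(min(ys), max(ys) + 1)
        }

    integrated = {}
    for cell in sorted(cells):
        values = [s[cell] for s in sensors if cell in s]
        # One value passes through unchanged; several are resolved;
        # none means the cell must be filled with the missing value.
        integrated[cell] = resolve(values) if values else missing
    return integrated


# Two hypothetical sensors that overlap in cell (1, 0).
s1 = {(0, 0): 1.0, (1, 0): 2.0}
s2 = {(1, 0): 4.0, (1, 1): 6.0}
grid_max = integrate([s1, s2], space="max")
grid_min = integrate([s1, s2], space="min")
```

Choosing "max" forces a decision about missing values, which is the point the slide raises; choosing "min" avoids it at the cost of discarding data outside the overlap.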
Data subsets
Legacy data was not organized for the new analysis
–Extract a data subset
–Define the subset by queries
[Diagram labels: Real-world Experiments 1953, Legacy Data, Real-world Experiments today, New Data, Analysis, Results, New Analysis, New Analysis Data, New Results, S, C]
Structural metadata query: What is the minimum geo-space data coverage?
Simple semantic query: What reactions require 2 or more inhibitor agents to prevent the reaction?
Complex semantic query: What objects are contained in a 3-dimensional image?
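As a rough illustration of defining a subset by a query, the sketch below treats a query as a predicate over records and uses the slide's simple semantic query as the example. The record layout, the sample reactions, and the names `extract_subset` and `needs_two_or_more_inhibitors` are all hypothetical; the slides do not specify a query language or data model.

```python
def extract_subset(records, query):
    """Return the records for which the query predicate holds."""
    return [record for record in records if query(record)]


# Hypothetical reaction records; not data from the presentation.
reactions = [
    {"name": "R1", "inhibitors": ["A"]},
    {"name": "R2", "inhibitors": ["A", "B"]},
    {"name": "R3", "inhibitors": []},
    {"name": "R4", "inhibitors": ["B", "C", "D"]},
]


def needs_two_or_more_inhibitors(reaction):
    """Simple semantic query: 2 or more inhibitor agents are required."""
    return len(reaction["inhibitors"]) >= 2


subset = extract_subset(reactions, needs_two_or_more_inhibitors)
subset_names = [reaction["name"] for reaction in subset]
```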
BinX for binary data
BinX is a foundation tool for these problems when the data is a structured binary file.
Workflow – format conversion
[Diagram labels: BinX XML1, Binary data1, BinX-based format conversion, BinX XML2, Binary data2]
Data Subsets
Data Integration
[Diagram labels: BinX XML description, R-W Exper, Binary data, Exp1, Exp2, Exp3, Integrated binary data, D1, D2, D3, I-D, S1, S2, S3]
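The idea of keeping a description of a binary file separate from the data, and driving format conversion from that description, can be sketched with Python's standard `struct` module. This is not BinX syntax or the BinX library: the description here is a plain Python list of `(name, type)` pairs standing in for an XML description, and the type names, `CODES` table, and function names are assumptions made for illustration.

```python
import struct

# Assumed mapping from descriptive type names to struct format codes.
CODES = {"int32": "i", "float32": "f", "float64": "d"}


def make_struct(fields, byte_order):
    """Build a packed (unpadded) record layout from a field description."""
    prefix = {"big": ">", "little": "<"}[byte_order]
    return struct.Struct(prefix + "".join(CODES[t] for _, t in fields))


def read_records(data, fields, byte_order):
    """Decode raw bytes into a list of dicts, one per record."""
    layout = make_struct(fields, byte_order)
    names = [name for name, _ in fields]
    return [dict(zip(names, values)) for values in layout.iter_unpack(data)]


def convert(data, fields, src_order, dst_order):
    """Rewrite the same records with a different byte order."""
    src = make_struct(fields, src_order)
    dst = make_struct(fields, dst_order)
    return b"".join(dst.pack(*values) for values in src.iter_unpack(data))


# Hypothetical description: each record is an int32 id and a float64 reading.
fields = [("id", "int32"), ("temp", "float64")]
big_endian = struct.pack(">id", 7, 21.5) + struct.pack(">id", 8, -3.25)

records = read_records(big_endian, fields, "big")
little_endian = convert(big_endian, fields, "big", "little")
```

Because the layout lives in `fields` rather than in the reading code, the same functions work unchanged for any file whose description can be written in this form, which is the property the slide attributes to a description-driven approach.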