May 8, 2006 MAGE v1 and MAGE v2 Michael Miller Lead Software Developer Rosetta Biosoftware NCI MAGE Jamboree.


Overview
• Effective MAGE: using MAGE v1
  – Perception vs. Reality
  – Parsing
  – Import
• MAGE v2
  – FuGE
  – MAGE v2
• Links
• Acknowledgements

Effective MAGE: using MAGE v1
Perception vs. Reality
• Perception
  – MAGE is too complicated to be used
• Reality
  – Gene expression experiments are complex
  – Any attempt to fully exchange the data and annotation for gene expression will be complex
  – Any attempt will need to gather the information together and export it, and the receiving application will have to import it
  – These are not MAGE problems
  – There are many ways MAGE can be used effectively, some of which follow

Effective MAGE: using MAGE v1
Parsing
• Design Principles
  – UML class → Java class
  – XML Instances → Row in Java Instance
      • Attributes are typed lists
      • Associations are lists of lists
      • Lists are backed by primitive arrays (not Object arrays)
  – Parsing dirt simple
  – Data Handling
      • Abstract: the cube
      • Concrete: the mapping
  – Parsing application neutral

Effective MAGE: using MAGE v1
Parsing Design Principles
• UML class → Java class
  – All classes derive from a common abstract class, MAGEcache
  – All associations derive from a common interface, MAGEobject

Effective MAGE: using MAGE v1
Parsing Design Principles
• XML Instances → Rows in Java Instance
  – Each element representing an instance of a UML class becomes a row in the Java class
  – The attributes for that class are each set in the appropriate list on the attributes List
  – Each nested UML class element, when it has finished being parsed, passes its current row up through the association class, where it is added to the appropriate associations List
  – The associations List is a List of Lists

[Figure: UML Classes mapped to Java Classes]
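The row-per-instance layout can be sketched as follows. This is a minimal illustration, not the code from the talk: `IntColumn` and `ClassCache` are hypothetical names, only one String attribute is modelled, and child rows are stored as plain row indices. It shows attributes held in typed lists, a list backed by a primitive `int[]` rather than an `Object[]`, and associations held as a list of lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical growable int list backed by a primitive array (not Object[]). */
final class IntColumn {
    private int[] data = new int[8];
    private int size = 0;

    void add(int value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, size * 2); // grow by doubling
        }
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[index];
    }

    int size() {
        return size;
    }
}

/** Hypothetical cache for one UML class: every parsed element becomes a row. */
final class ClassCache {
    // One typed list per attribute; only "identifier" is modelled here.
    private final List<String> identifiers = new ArrayList<String>();
    // Outer list: one entry per association. Inner list: one IntColumn per row,
    // holding the row indices of the nested instances.
    private final List<List<IntColumn>> associations = new ArrayList<List<IntColumn>>();

    ClassCache(int associationCount) {
        for (int i = 0; i < associationCount; i++) {
            associations.add(new ArrayList<IntColumn>());
        }
    }

    /** Adds a row for a newly parsed element and returns its row index. */
    int addRow(String identifier) {
        identifiers.add(identifier);
        for (List<IntColumn> perRow : associations) {
            perRow.add(new IntColumn());
        }
        return identifiers.size() - 1;
    }

    /** Called when a nested element finishes parsing and passes its row up. */
    void link(int association, int row, int childRow) {
        associations.get(association).get(row).add(childRow);
    }

    String identifier(int row) {
        return identifiers.get(row);
    }

    IntColumn children(int association, int row) {
        return associations.get(association).get(row);
    }

    int rowCount() {
        return identifiers.size();
    }
}
```

With this layout, parsing many instances of one UML class allocates a few growing arrays rather than one Java object per XML element.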

Effective MAGE: using MAGE v1
Parsing Design Principles
• Parsing dirt simple
  – Implement SAX startElement() and endElement()
• startElement
  – Resolve the Java class name and retrieve the instance from the cache
  – Call MAGEcache.startElement() with the attributes, the containing class, and whether to treat the record as a reference
• endElement
  – Special-case the DataInternal class to treat #PCData
  – Call MAGEcache.endElement() to connect the containing class to the nested class

    // Uses org.xml.sax.Attributes, org.xml.sax.SAXException and
    // org.xml.sax.helpers.AttributesImpl; caches, stack and chars are
    // fields of the handler that are not shown on the slide.

    // standard SAX interfaces
    public void startElement(String namespaceURI, String localName,
                             String qName, Attributes atts) {
        // check for ref
        boolean isRef = false;
        int index = -1;
        if (-1 != (index = localName.lastIndexOf("_ref"))) {
            isRef = true;
            // look up the actual class
            localName = localName.substring(0, index);
        }
        MAGEobject curObject = (MAGEobject) caches.get(localName);
        curObject.startElement(atts,
                (MAGEobject) stack.get(stack.size() - 1), isRef);
        stack.add(curObject);
    }

    public void endElement(String namespaceURI, String localName,
                           String qName) throws SAXException {
        String pcData = chars.toString();
        if (0 < pcData.trim().length()) {
            // Have #PCData so simulate startElement() and endElement() for the
            // container and class PCData
            AttributesImpl atts = new AttributesImpl();
            startElement("", "PCData_assn", "PCData_assn", atts);
            atts.addAttribute("", "pcData", "pcData", "CDATA", pcData);
            startElement("", "PCData", "PCData", atts);
            // Do this now or the endElement() call will fall in here again!
            chars.setLength(0);
            endElement("", "PCData", "PCData");
            endElement("", "PCData_assn", "PCData_assn");
        }
        MAGEobject curObject = (MAGEobject) stack.remove(stack.size() - 1);
        curObject.endElement();
    }
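To see the same pattern run end to end, here is a self-contained sketch built on the standard JAXP SAX parser. `RefAwareHandler` is a hypothetical class, not part of any MAGE toolkit: it records events as strings instead of filling caches, it uses `endsWith("_ref")` rather than `lastIndexOf`, and it discards text that precedes a child element. Note that the factory must be made namespace-aware, otherwise `localName` can arrive empty.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: logs what a cache-filling handler would be told. */
final class RefAwareHandler extends DefaultHandler {
    final List<String> events = new ArrayList<String>();
    private final List<String> stack = new ArrayList<String>();
    private final StringBuilder chars = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        chars.setLength(0); // drop text seen before a child element
        boolean isRef = localName.endsWith("_ref");
        String className = isRef
                ? localName.substring(0, localName.length() - "_ref".length())
                : localName;
        String parent = stack.isEmpty() ? "(root)" : stack.get(stack.size() - 1);
        events.add((isRef ? "ref " : "new ") + className + " in " + parent);
        stack.add(className);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        chars.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String text = chars.toString().trim();
        if (!text.isEmpty()) {
            events.add("text '" + text + "' in " + stack.get(stack.size() - 1));
        }
        chars.setLength(0);
        stack.remove(stack.size() - 1); // pop the element that just closed
    }

    /** Parses an XML string and returns the recorded events. */
    static List<String> parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed for a non-empty localName
            SAXParser parser = factory.newSAXParser();
            RefAwareHandler handler = new RefAwareHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Hybridization identifier=\"H1\">"
                + "<Protocol_ref identifier=\"P1\"/>"
                + "<DataInternal>1 2 3</DataInternal>"
                + "</Hybridization>";
        for (String event : parse(xml)) {
            System.out.println(event);
        }
    }
}
```

The stack gives every element its containing element without any per-class parsing code, which is what keeps the handler small.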

Effective MAGE: using MAGE v1
Parsing Design Principles
• Data Handling
  – Abstract: the DataCube as a linear list
  – Concrete: a set of Lists grouped by BioAssay, with one List per QuantitationType and each List's size allocated to countDE
  – Depending on the BioDataCube.order attribute, set up two arrays, for example for order 'DBQ':
        dimSize[] = {countDE, countBA, countQT}
        dimIndices[] = {1, 0, 2} (where 0=BA, 1=DE, 2=QT)
  – Then loop for each value: values[] is the set of Lists per BioAssay per QuantitationType, and nextValue is the next parsed value from the linearized DataCube (details left out)

        int counters[] = { 0, 0, 0 };
        for (counters[0] = 0; counters[0] < dimSize[0]; counters[0]++) {
            for (counters[1] = 0; counters[1] < dimSize[1]; counters[1]++) {
                for (counters[2] = 0; counters[2] < dimSize[2]; counters[2]++) {
                    // dimSize[dimIndices[2]] is countQT for any order
                    values[counters[dimIndices[0]] * dimSize[dimIndices[2]]
                            + counters[dimIndices[2]]]
                        .set(nextValue, counters[dimIndices[1]]);
                }
            }
        }

  – Mathemagically, the values end up in the correct place no matter the order

[Figure: data cube with axes QT (countQT), DE (countDE), and BA (countBA)]
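The order-independent loop can be exercised with a small runnable sketch. `CubeUnpacker` and its signature are hypothetical, plain `double` arrays stand in for the primitive-backed Lists, and the order string is assumed to list dimensions from the outermost loop to the innermost, as in the 'DBQ' example. The helper derives `dimSize` and `dimIndices` from the order string and stores each value at `values[ba * countQT + qt][de]`.

```java
import java.util.Arrays;

/** Hypothetical helper; the name and signature are illustrative only. */
final class CubeUnpacker {
    /**
     * Unpacks a linearized cube into values[ba * countQT + qt][de].
     * order names the dimensions outermost loop first, e.g. "DBQ".
     */
    static double[][] unpack(String order, int countBA, int countDE,
                             int countQT, double[] linear) {
        if (order.length() != 3
                || linear.length != countBA * countDE * countQT) {
            throw new IllegalArgumentException("bad order or cube size");
        }
        int[] counts = { countBA, countDE, countQT }; // 0=BA, 1=DE, 2=QT
        int[] dimSize = new int[3];          // size of each loop level
        int[] dimIndices = { -1, -1, -1 };   // loop level of BA, DE, QT
        for (int level = 0; level < 3; level++) {
            int dim = "BDQ".indexOf(order.charAt(level));
            if (dim < 0 || dimIndices[dim] >= 0) {
                throw new IllegalArgumentException(
                        "order must use B, D and Q once each");
            }
            dimSize[level] = counts[dim];
            dimIndices[dim] = level;
        }
        double[][] values = new double[countBA * countQT][countDE];
        int[] c = new int[3];
        int next = 0;
        for (c[0] = 0; c[0] < dimSize[0]; c[0]++) {
            for (c[1] = 0; c[1] < dimSize[1]; c[1]++) {
                for (c[2] = 0; c[2] < dimSize[2]; c[2]++) {
                    int ba = c[dimIndices[0]];
                    int de = c[dimIndices[1]];
                    int qt = c[dimIndices[2]];
                    values[ba * countQT + qt][de] = linear[next++];
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // 1 BioAssay, 2 DesignElements, 2 QuantitationTypes, order DBQ
        double[] linear = { 1, 2, 3, 4 };
        // prints [[1.0, 3.0], [2.0, 4.0]]
        System.out.println(Arrays.deepToString(unpack("DBQ", 1, 2, 2, linear)));
    }
}
```

For 'DBQ' the derived arrays are `dimSize = {countDE, countBA, countQT}` and `dimIndices = {1, 0, 2}`, matching the example on the slide; any other permutation of B, D and Q fills the same target layout.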

Effective MAGE: using MAGE v1
Parsing Design Principles
• Parsing application neutral
  – Well understood point
  – Can implement efficiencies such as sliding windows or delayed parse, as long as application logic remains separate from parsing

Effective MAGE: using MAGE v1
Import
• Design Principles
  – Mapping
      • Between applications
      • From MAGE to Application
  – Import
      • Parsing produces in-memory MAGE
      • Used in a similar way as a DOM interface
      • From MAGE to MAGE
  – Pipelines
      • MAGE tailored to source of data
      • From small pieces comes completeness
  – Export
      • Use startElement() and endElement() methods
      • Methods per association to add an association to either a reference or an owned instance

Effective MAGE: using MAGE v1
Import Design Principles
• Mapping
  – Between applications
      • For collaborative efforts or intradepartmental integration
      • Known source and target
      • First things first: determine where the source data and annotation will end up at the target WITHOUT considering how

    Source table    Source column      Target table    Target column
    HYBRIDIZATION   SAMPLE_ID          HYB             PREP_ID
                    PREFORMED_BY_ID                    OPERATOR_ID
                    MACHINE_ID                         HARDWARE_ID
    SAMPLE          SAMPLE_NAME        PREP            IDENTIFIER
    MACHINE         SERIAL_NUMBER      HARDWARE        UUID
                    MAKE                               MODEL

• Mapping
  – From MAGE to Application
      • Map between table/column and location in the MAGE file
      • Problematic in those areas where choice is possible; mitigated by determining the producer of the MAGE file

    Table   Column            XPath
            PREFORMED_BY_ID   //Hybridization//ProtocolApplication//Perso… = and …alue = "Performed Hyb"? //Hybridization//ProtocolApplication//Per…
            MACHINE_ID        //Hybridization//ProtocolApplication//Ha…tifier
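A table-to-table mapping of this kind can be kept as plain data, separate from the code that applies it. The sketch below is an assumption-laden illustration: `ColumnMapping` is a hypothetical class, and the pairs in `slideExample()` follow one reading of the mapping table (left-hand columns as source, right-hand columns as target).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for "where does each source column end up". */
final class ColumnMapping {
    private final Map<String, String> sourceToTarget =
            new LinkedHashMap<String, String>();

    /** Registers one "TABLE.COLUMN" to "TABLE.COLUMN" rule. */
    ColumnMapping map(String source, String target) {
        sourceToTarget.put(source, target);
        return this;
    }

    /** Renames the columns of one source row; fails on unmapped columns. */
    Map<String, String> translate(String sourceTable, Map<String, String> row) {
        Map<String, String> out = new LinkedHashMap<String, String>();
        for (Map.Entry<String, String> cell : row.entrySet()) {
            String key = sourceTable + "." + cell.getKey();
            String target = sourceToTarget.get(key);
            if (target == null) {
                throw new IllegalArgumentException("unmapped column " + key);
            }
            out.put(target, cell.getValue());
        }
        return out;
    }

    /** The pairs from the example table, as read from the slide. */
    static ColumnMapping slideExample() {
        return new ColumnMapping()
                .map("HYBRIDIZATION.SAMPLE_ID", "HYB.PREP_ID")
                .map("HYBRIDIZATION.PREFORMED_BY_ID", "HYB.OPERATOR_ID")
                .map("HYBRIDIZATION.MACHINE_ID", "HYB.HARDWARE_ID")
                .map("SAMPLE.SAMPLE_NAME", "PREP.IDENTIFIER")
                .map("MACHINE.SERIAL_NUMBER", "HARDWARE.UUID")
                .map("MACHINE.MAKE", "HARDWARE.MODEL");
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<String, String>();
        row.put("SAMPLE_ID", "S7");
        row.put("MACHINE_ID", "M2");
        System.out.println(slideExample().translate("HYBRIDIZATION", row));
    }
}
```

Failing loudly on an unmapped column reflects the "first things first" advice: the destination of every source field is decided before any transfer code is written.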

Effective MAGE: using MAGE v1
Import Design Principles
• Import
  – Parsing produces in-memory MAGE
      • Application neutral
      • Can have various handlers defined:
          – Translate into a different representation in memory (see MAGE to MAGE)
          – Adjust contents to application-specific requirements
          – Save to database by applying mapping rules
  – Used in a similar way as a DOM interface
      • Not DOM, but handlers can traverse the structure and obtain contents for any instance of a class that was in the XML document
  – From MAGE to MAGE
      • Between a model where a single Java class holds all instances and a model where there is one Java object per instance (the current STK)
          – Generate a method that takes the representation of the other
      • Between a tab-delimited representation and a Java representation (MAGE-TAB to MAGEstk)

Effective MAGE: using MAGE v1
Import Design Principles
• Pipelines
  – MAGE tailored to source of data
      • The mapping then becomes one between the source and the target
      • Export is defined by the target MAGE mapping
      • The developer on the export side doesn't need to know MAGE; it is an exercise in string formatting

    <Hybridization identifier="{PREFIX}:Hyb:{HYB.IDENTIFIER}" name="{HYB_NAME}">
      <Protocol_ref identifier="{PROTOCOL.IDENTIFIER}" name="{PROTOCOL.COMMON_NAME}"/>
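The "exercise in string formatting" can be sketched in a few lines. `MageTemplate` is a hypothetical helper, not part of any MAGE toolkit: it replaces each `{NAME}` placeholder with a value from a map and escapes the characters that are unsafe inside a double-quoted XML attribute, a step the template itself does not show but which an exporter would need.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical template filler for {PLACEHOLDER} style export templates. */
final class MageTemplate {
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\{([A-Za-z0-9_.]+)\\}");

    /** Escapes a value for use inside a double-quoted XML attribute. */
    static String escapeAttribute(String value) {
        return value.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Replaces every {NAME} with its escaped value; fails if one is missing. */
    static String fill(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("no value for " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the data literal
            m.appendReplacement(out,
                    Matcher.quoteReplacement(escapeAttribute(value)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<String, String>();
        values.put("PREFIX", "ACME");
        values.put("HYB.IDENTIFIER", "42");
        values.put("HYB_NAME", "liver & kidney");
        // prints <Hybridization identifier="ACME:Hyb:42" name="liver &amp; kidney">
        System.out.println(fill(
                "<Hybridization identifier=\"{PREFIX}:Hyb:{HYB.IDENTIFIER}\""
                        + " name=\"{HYB_NAME}\">",
                values));
    }
}
```

The values would typically come straight from a database row, so the export developer only needs the template and the column names.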

Effective MAGE: using MAGE v1
Import Design Principles
• Pipelines
  – From small pieces comes completeness
      • Business rules for an application define the order of import and what placeholders can be created for records that haven't been received yet
      • Pipelines for the different packages/elements that make up an experiment, for example:
          – One or more ArrayDesign XML documents
          – An Experiment/ExperimentDesign XML document
          – Multiple Array XML documents
          – Multiple MeasuredBioAssay/MeasuredBioAssayData XML documents (from feature extraction)
          – Multiple PhysicalBioAssay/Image XML documents
          – Multiple BioMaterial XML documents
      • Once the pipelines for a particular experiment have all run, the experiment is complete
      • It is not even necessary to use only MAGE format or only XML; identifier attributes provide the glue
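One way to picture identifiers acting as the glue between pipelines is a registry that creates a placeholder the first time an identifier is referenced and fills it in when the defining document arrives. This is a minimal sketch under that assumption; `IdentifierRegistry`, `Slot`, and the rule that a second definition is an error are all hypothetical stand-ins for an application's real business rules.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical registry that lets pipelines run in any order. */
final class IdentifierRegistry {
    /** One record, possibly still a placeholder. */
    static final class Slot {
        final String identifier;
        String sourceDocument; // null while only a placeholder

        Slot(String identifier) {
            this.identifier = identifier;
        }

        boolean isPlaceholder() {
            return sourceDocument == null;
        }
    }

    private final Map<String, Slot> slots = new LinkedHashMap<String, Slot>();

    /** For *_ref elements: returns the record, creating a placeholder if needed. */
    Slot reference(String identifier) {
        Slot slot = slots.get(identifier);
        if (slot == null) {
            slot = new Slot(identifier);
            slots.put(identifier, slot);
        }
        return slot;
    }

    /** For full definitions: fills in the placeholder if one already exists. */
    Slot define(String identifier, String sourceDocument) {
        Slot slot = reference(identifier);
        if (!slot.isPlaceholder()) {
            throw new IllegalStateException("already defined: " + identifier);
        }
        slot.sourceDocument = sourceDocument;
        return slot;
    }

    /** Identifiers that were referenced but never defined. */
    List<String> unresolved() {
        List<String> missing = new ArrayList<String>();
        for (Slot slot : slots.values()) {
            if (slot.isPlaceholder()) {
                missing.add(slot.identifier);
            }
        }
        return missing;
    }

    /** True once every referenced identifier has been defined. */
    boolean isComplete() {
        return unresolved().isEmpty();
    }

    public static void main(String[] args) {
        IdentifierRegistry registry = new IdentifierRegistry();
        registry.reference("ACME:Array:1");              // seen in a BioAssay document
        System.out.println(registry.unresolved());       // prints [ACME:Array:1]
        registry.define("ACME:Array:1", "arrays.xml");   // Array pipeline runs later
        System.out.println(registry.isComplete());       // prints true
    }
}
```

Because the placeholder and the later definition are the same object, anything that linked to the placeholder sees the full record once its pipeline has run.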