Presentation is loading. Please wait.

Presentation is loading. Please wait.

Web Technologies for Bioinformatics Ken Baclawski.

Similar presentations


Presentation on theme: "Web Technologies for Bioinformatics Ken Baclawski."— Presentation transcript:

1 Web Technologies for Bioinformatics Ken Baclawski

2 Data Formats n Flat files n Spreadsheets n Relational databases n Web sites

3 XML Documents n Flexible very popular text format n Self-describing records

4 XML Documents (continued) n Hierarchical structure

5 Purpose of Data n Data is collected and stored for a purpose. n The format serves that purpose. n Using data for another purpose is common. n Data presentation (such as on a Web site) is one example of such a use. n It is important to anticipate that data will be used for many purposes. n Data is reused by transforming it.

6 Statistical Analysis as a Transformation Process n Transformation consists of a series of steps. n Specialized equipment and software is used for each step. n Separation into steps reduces the overall effort.

7 Web Site Construction n Web sites can be constructed using a Web site authoring tool (e.g., Front Page). n Alternatively, one could use a transformation process to separate concerns.

8 Advantages of Transformation n Reduces the overall effort. n Presentation style is independent of the source content. n Presentation style can be changed with immediate effect. n Uniform enforcement of presentation style. n Updates to content are immediate. n Content can be used for many other purposes: – Many reports in many formats – Proposals – Data sharing with other institutions – Data mining

9 Transformation Languages n Traditional programming languages such as Perl, Java, etc. n Rule-based (declarative) languages such as the XML Transformation language (XSLT). – Rule-based rather than procedural – Transform each kind of element with a template – Matching and processing of elements is analogous to the digestion of polymers with enzymes.

10 Transformation as Digestion n The blue enzyme attacks the polymer at two locations. n The resulting three polymers are then attacked by the green enzyme.

11 XSLT “Digestion” n An XSLT program consists of templates n Each template processes a set of matching elements n A template can break up the element to be processed by other templates

12 <xsl:transform version="1.0” xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

13 5.67 <Concentration unit="nm">43 <interaction substrate="Sub89033"> 4.37 75 <P id="Mtr245"> 0.65 <Concentration unit="um">0.53 <interaction substrate="Sub80933"> 8.87 8.4 <S id="Sub89032"/>

14 5.67 43 4.37 75 0.65 0.53 8.87 8.4

15 Ontologies n The structure of data is its ontology. – Database schema – XML Document Type Definition (DTD) n An ontology defines the concepts and relationships between them in a domain. n Transformations are fundamental: – Queries – Organizing data (views) – Transformation for new purposes

16 Research Areas n Ontologies for bioinformatics n Ontology development in general – Constructing ontologies – Validation and testing of ontologies n New ontology languages to capture more meaning n Transformation languages

17 Research Areas n Inference and deduction – Logical inference – Probabilistic inference – Scientific inference – Other forms of inference n Integrating inference with – Data mining – Experimental processes


Download ppt "Web Technologies for Bioinformatics Ken Baclawski."

Similar presentations


Ads by Google