Getting data out of XML These exercises provide an overview of how to use the native Taverna XPath services to get data out of XML.


1 Getting data out of XML These exercises provide an overview of how to use the native Taverna XPath services to get data out of XML

2 The Basics of XML
- XML stands for eXtensible Markup Language
- Designed for the storage and transport of data
- This includes passing data between services or retrieving data from a Web page
- Provides a machine-readable dataset
- Many service providers export data in XML

3 Example
<note>
  <to>Katy</to>
  <from>Paul</from>
  <heading>Reminder</heading>
  <body>Don't forget about Bonn Trip!</body>
</note>
The following website has lots of information about XML, including tutorials: http://www.w3schools.com/xml

4 Exploring XML
- Identify the root/top element in the example XML
- Find all the child elements
- What does each line end with?
If you get stuck, try exploring the W3Schools website for answers. The syntax page is especially good!
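
The questions above can also be checked programmatically. The following is a minimal sketch in Python using the standard-library `xml.etree.ElementTree` module; it is not part of the Taverna exercises. The sample document is partly an assumption: the slides confirm only the `note` and `to` element names, while `from`, `heading` and `body` are taken from the W3Schools note example.

```python
import xml.etree.ElementTree as ET

# Sample note document. Element names other than <note> and <to> are assumed.
NOTE_XML = """<note>
  <to>Katy</to>
  <from>Paul</from>
  <heading>Reminder</heading>
  <body>Don't forget about Bonn Trip!</body>
</note>"""

root = ET.fromstring(NOTE_XML)

# The root (top) element is the one that encloses everything else.
root_name = root.tag

# Iterating over an element yields its direct children in document order.
child_names = [child.tag for child in root]

print("root:", root_name)
print("children:", child_names)
```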

5 Workflows to retrieve XML

6 Exploring XML
- Load the ‘Search Pubmed’ workflow into Taverna from the Bonn myExperiment group: http://www.myexperiment.org/workflows/1975
- Run the workflow and see what output you get from Pubmed
  - Try “Blood Clotting” as a search term if you can’t think of anything
- Find the root and child elements in the XML
- See if you can find the list of Pubmed ids
- How many ids did you get for your search term?
  - There should be a count of them somewhere

7
- You should get something like this (with other elements too)
- Familiarise yourself with this data
- We’ll be extracting some of it next
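
As a rough illustration of the shape of this data, the sketch below parses a small hand-written result in Python and reads the count and the id list. It assumes the output follows the usual NCBI eSearch layout (`Count` and `IdList/Id` under `eSearchResult`) and that the elements carry no namespace; the ids and the count are made up for illustration and are not real search results.

```python
import xml.etree.ElementTree as ET

# Illustrative data only: the ids and the count below are invented.
ESEARCH_XML = """<eSearchResult>
  <Count>3</Count>
  <IdList>
    <Id>11111111</Id>
    <Id>22222222</Id>
    <Id>33333333</Id>
  </IdList>
</eSearchResult>"""

root = ET.fromstring(ESEARCH_XML)

# The count is a direct child of the root element.
total = int(root.findtext("Count"))

# Each Pubmed id is an <Id> element inside <IdList>.
ids = [el.text for el in root.findall("IdList/Id")]

print("count:", total)
print("ids:", ids)
```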

8 XPath and Getting the Data out
- XPath is used to navigate through the elements of an XML document
- Used to find nodes, and the data at those nodes
- ‘Expressions’ are used to navigate through the document
- Further details on what to use can be found at: http://www.w3schools.com/xpath/xpath_syntax.asp
- More information at: http://www.w3schools.com/xpath/
[Sample Expressions]

9 Let's have an example
<note>
  <to>Katy</to>
  <from>Paul</from>
  <heading>Reminder</heading>
  <body>Don't forget about Bonn Trip!</body>
</note>
To get ‘Katy’ from the XML:
- ‘Katy’ is under the <to> element
- Navigate through the XML, starting at the <note> element and ending at the <to> element
- So the XPath expression would be: /note/to
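
The same lookup can be sketched outside Taverna in Python with the standard-library `xml.etree.ElementTree` module. This is only an illustration, and the `from` element in the trimmed sample is assumed. One difference to be aware of: ElementTree evaluates paths relative to the element they are called on, so the absolute expression /note/to is written here as a check on the root followed by the relative path `to`.

```python
import xml.etree.ElementTree as ET

# Trimmed sample document; the <from> element name is an assumption.
NOTE_XML = "<note><to>Katy</to><from>Paul</from></note>"

root = ET.fromstring(NOTE_XML)

# /note/to means: start at the root element <note>, then step to its child <to>.
# ElementTree paths are relative to the element they are called on,
# so confirm the root first and then look up "to" beneath it.
if root.tag != "note":
    raise ValueError("unexpected root element: " + root.tag)

recipient = root.findtext("to")
print(recipient)
```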

10 XPath in Taverna
- Taverna has 2 modes of XPath functionality
  - ‘XML from Text’ local java service
  - ‘XPath Service’ Template
- The local java service is designed for people who know the XPath query they want to use and are confident in writing XPath
- The XPath Service Template is designed for dynamic/exploratory retrieval of data, and for those who are not confident writing XPath straight away
- To start with, we will use only the XPath Service Template

11 Xpath using the Service Template

12 Install the XPath Plugin
- To install the XPath service template, you will have to update the Taverna Workbench
- Click on 'Advanced', then select 'Updates and Plugins'
- In the pop-up menu, click on the 'find new plugins' button
- Find the XPath update, and click 'Install'
- You will need to restart Taverna for this to work correctly
- Don't forget to save any workflows you have open!

13 Getting the Data out using the template
- In Taverna, find the service template for XML data processing
- Drag the service template onto an empty workflow
- The configuration window should automatically open
- Copy and paste the example XML (the Katy XML from the previous slides) into the relevant section of the popup box
- If you haven’t got the data, you can get it from here: http://www.myexperiment.org/files/471
- Press the green arrow to generate the XML tree structure (on the right hand side)

14 Getting the Data out (screenshot callouts: ‘Paste here’, ‘Press this’)

15 Getting the Data out
- You should be able to see the XML tree structure
- Explore it by clicking on the “+” arrows to open and close nodes
- Find the <to> node and select it
- Note that this also selects the root node, making a path through the XML to the <to> node
- Click the ‘Generate XPath Expression’ button
- You should see the XPath, or path to the XML element, given as: /note/to
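
To make the idea of "a path through the XML" concrete, the sketch below walks a parsed tree in Python and builds the path from the root to every element with a given name. It only illustrates the concept and makes no claim about how Taverna generates its expression. The helper name `paths_to` is hypothetical, and the `from` element in the sample is assumed.

```python
import xml.etree.ElementTree as ET

# Trimmed sample document; the <from> element name is an assumption.
NOTE_XML = "<note><to>Katy</to><from>Paul</from></note>"


def paths_to(element, target_tag, prefix=""):
    """Yield the path from the root to every element named target_tag.

    Hypothetical helper for illustration only.
    """
    here = prefix + "/" + element.tag
    if element.tag == target_tag:
        yield here
    # Recurse into the children, carrying the path built so far.
    for child in element:
        yield from paths_to(child, target_tag, here)


root = ET.fromstring(NOTE_XML)
to_paths = list(paths_to(root, "to"))
print(to_paths)
```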

16 Getting the Data out (screenshot callouts: ‘XPath Expression’, ‘Data from XML’)

17 Getting the Data out using the template
- In Taverna, find the service template for XML data processing
- Drag the service template onto the ‘Search Pubmed’ workflow
- The configuration window should automatically open
- Paste the XML from your results pane into the relevant section
- If you haven’t got the data, you can get it from here: http://www.myexperiment.org/files/469
- Press the green arrow to generate the XML tree structure (on the right hand side)

18 Getting the Data out
- What does /default:eSearchResult/default:IdList mean?
- It describes how to navigate through the XML, from the root element ‘eSearchResult’ to the IdList element
- ‘default’ is the prefix for the namespace of the elements; the namespace is identified by a URI, which usually indicates who defined the data format
- Click on the ‘Show XML Tree’ button, and select ‘Show namespaces of XML elements’
- This should show you the namespace URI for the data
- When you have your XPath query set up, click the apply button, close the popup window, and run the workflow
- Try getting something else back from the XML by manually editing the generated XPath query
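
The effect of the ‘default’ prefix can be sketched in Python with the standard-library `xml.etree.ElementTree` module. The namespace URI and the ids below are placeholders invented for this example; the real URI is whatever the ‘Show namespaces of XML elements’ option displays. The point is that a namespaced element only matches when the query maps a prefix to the same URI.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used for illustration only.
NS_URI = "http://example.org/esearch"

# Illustrative data only: the ids are invented.
ESEARCH_XML = f"""<eSearchResult xmlns="{NS_URI}">
  <IdList>
    <Id>11111111</Id>
    <Id>22222222</Id>
  </IdList>
</eSearchResult>"""

root = ET.fromstring(ESEARCH_XML)

# Map the prefix used in the query to the namespace URI of the document.
namespaces = {"default": NS_URI}

# Equivalent of /default:eSearchResult/default:IdList, relative to the root.
id_list = root.find("default:IdList", namespaces)
ids = [el.text for el in id_list.findall("default:Id", namespaces)]

# Without the prefix the element is not found, because its full name
# includes the namespace URI.
unprefixed = root.find("IdList")

print("ids:", ids)
print("unprefixed lookup:", unprefixed)
```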

19 XML advanced: using the native Java XPath service

20 Advanced XPath Service
- Copy the XML from the results
- Remove the XPath Service template from the workflow
- Locate the XPath service in the list of available services
- Drag it onto the ‘Search PubMed’ workflow

21 Advanced XPath Service
- Create an input for the service, called ‘xml_text’, and connect it to the port ‘xml-text’
- Add another input port called ‘xpath_query’, and connect it to the ‘xpath’ port
- Connect up the nodelist port to an output, called ‘element_text’
- Run the workflow, using “Blood Clotting” as your search term
- Enter an XPath query that will retrieve:
  - The TermSet counts for all terms in the TranslationStack
  - Re-write the XPath to get the count only for the TermSet whose term is: “Blood coagulation”[MeSH Terms]
  - Choose a data element of your own to get back from the XML
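
One possible shape for the first two queries is sketched below in Python with the standard-library `xml.etree.ElementTree` module, which supports a subset of XPath including the `[child='text']` predicate. The sample data is invented and assumes that each TermSet holds a `Term` and a `Count` element and that there is no namespace; with the real workflow output the element names would probably need the namespace prefix, and the paths would start from the root, for example /eSearchResult/TranslationStack/TermSet/Count.

```python
import xml.etree.ElementTree as ET

# Illustrative data only: the terms and counts are invented.
ESEARCH_XML = """<eSearchResult>
  <TranslationStack>
    <TermSet>
      <Term>"Blood coagulation"[MeSH Terms]</Term>
      <Count>100</Count>
    </TermSet>
    <TermSet>
      <Term>"Blood Clotting"[All Fields]</Term>
      <Count>25</Count>
    </TermSet>
    <OP>OR</OP>
  </TranslationStack>
</eSearchResult>"""

root = ET.fromstring(ESEARCH_XML)

# Counts for every TermSet in the TranslationStack.
all_counts = [el.text for el in root.findall("TranslationStack/TermSet/Count")]

# A predicate in square brackets keeps only the TermSet whose <Term> child
# has exactly the given text. The term itself contains double quotes, so the
# XPath string literal is wrapped in single quotes.
mesh_query = "TranslationStack/TermSet[Term='\"Blood coagulation\"[MeSH Terms]']/Count"
mesh_count = root.findtext(mesh_query)

print("all counts:", all_counts)
print("MeSH count:", mesh_count)
```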

