An Introduction to Designing and Executing Workflows with Taverna, Part 2: Importing and exporting data. Norman Morrison, University of Manchester


1 An Introduction to Designing and Executing Workflows with Taverna Part 2 – Importing and exporting data Norman Morrison University of Manchester Credits: Aleksandra Pawlik and Katy Wolstencroft

2
• We can add input data to the workflow not only manually but also from a file.
• Go to the myExperiment group and download the file 03B_species_1.txt.
• Click Run workflow again, but instead of selecting “Set value”, select “Set file location” and navigate to where you saved 03B_species_1.txt.

3 Instead of downloading the file, we can point the workflow to the file’s URL (if we know it). Let’s run the workflow again, but this time select “Set URL” and paste in: http://www.myexperiment.org/files/1107/versions/1/download/03B_species_1.txt
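
The three input options mentioned so far (typed value, file location, URL) differ only in where the text comes from. Outside Taverna, the file and URL cases can be sketched in Python roughly as follows. The function name `read_input` and the rule used to tell a URL from a path are assumptions made for this illustration, not part of Taverna.

```python
from pathlib import Path
from urllib.request import urlopen


def read_input(source: str) -> str:
    """Return the text of an input given as a URL or as a local file path.

    Hypothetical helper for illustration only; Taverna handles this choice
    through its "Set file location" and "Set URL" options.
    """
    # Anything that starts with a known URL scheme is fetched over the network
    # (or via file://); everything else is treated as a local path.
    if source.startswith(("http://", "https://", "file://")):
        with urlopen(source) as response:
            return response.read().decode("utf-8")
    return Path(source).read_text(encoding="utf-8")
```

Calling `read_input` with the myExperiment URL above, or with the path where 03B_species_1.txt was saved, should return the same text, assuming the file is plain UTF-8.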

4
• So far we have used simple text files, but spreadsheets can also be used as sources of input data. To do that, we need to add a spreadsheet tool to our workflow.
• From the myExperiment group, download the file 03C_species_list_1.xls.
• Open it on your machine and see what it contains (the list of species names is in cells B3 to B6).
• From the Service Templates, select the Spreadsheet Import tool, right-click on it and add it to the workflow.

5

6
• In the pop-up window, set the correct range for columns and rows (untick the “all rows” box).

7
• We need to delete the input port for the workflow (right-click on it and select Delete).
• The Spreadsheet tool expects the URL (or path) of the file as its input. The best way to feed in that URL or path is to add a service called “Text constant”.

8
• Where it says “Add your own value here”, enter: http://www.myexperiment.org/files/1108/versions/1/download/03C_species_list_1.xls
• If you prefer, you can insert the full path to your local file instead.
• Then click Apply and Close.

9
• Connect the Text constant to the Spreadsheet Import tool.
• Connect the Spreadsheet Import tool to the input of the GBIF service.

10
• When we run the workflow, we can see that there are four result values (because we read four species names from the spreadsheet). Taverna implicitly iterated over these four input values and processed each one.
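
Implicit iteration means that a service written for a single value is applied to every item when it receives a list instead. A minimal Python sketch of that idea is shown below. The name `run_service` is hypothetical, and the sketch illustrates the concept only, not Taverna's actual implementation.

```python
from typing import Any, Callable


def run_service(service: Callable[[Any], Any], value: Any) -> Any:
    """Apply a single-value service to a value or to every item of a list.

    Illustrative only: lists (including nested lists) are walked item by
    item, and the result keeps the same list structure as the input.
    """
    if isinstance(value, list):
        return [run_service(service, item) for item in value]
    return service(value)
```

With a list of four species names as input and a stand-in for the GBIF lookup as `service`, the call returns a list of four results, one per name, which mirrors what the results view shows.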

11
• Taverna allows you to save results in a variety of formats, and also to save intermediate workflow results (which is very useful when you run a large workflow).
• You can save all result values at once.

12
• You can also save each value separately.
• To save intermediate values, go to the results tab and select the part of the workflow whose values you want to save. Those values then appear in the results window, where you can save them.

13
• A shim is a service that doesn’t perform an experimental function but acts as a connector, or glue, when two experimental services have incompatible outputs and inputs.
• A shim can be any type of service (WSDL, Soaplab, etc.). Many are simple Beanshell scripts.
• Shims can also be used to preprocess data that are input into the workflow, and we will use one of these shims for this exercise.
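
To make the idea of a shim concrete, here is a small sketch in Python (not Beanshell). It assumes a hypothetical situation in which one service outputs a single block of text with one name per line, while the next service expects a list of names. The function name `split_lines_shim` is invented for this example.

```python
from typing import List


def split_lines_shim(text: str) -> List[str]:
    """Turn a block of text into a list of non-empty, trimmed lines.

    The shim does no analysis of its own; it only reshapes the output of
    one service into the input format the next service expects.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]
```

The shim sits between the two services: the upstream output goes in, and the reshaped list goes on to the downstream input.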

14
• Create a directory called “data”.
• Copy the files we used in the previous exercise into this directory:
  • 03B_species_1.txt
  • 03C_species_list_1.xls
• From the myExperiment group, download the following files to the same directory:
  • 03D_species_2.txt
  • 03E_species_list_2.xls

15
• Let’s assume you regularly have to deal with data in different formats, one of which is a spreadsheet (csv or xls).
• You know that the spreadsheet files always have the species names in column B, from row 3 up to row 100 (some rows may be empty).
• Using a shim service, you can automate your workflow to pull the species names from all of the spreadsheets in a specified directory at once.

16
• Delete the Text constant service in your workflow.
• From the Available Services, select Local Services, then io, then List Files by Extension.

17
• Connect the shim service to the Spreadsheet tool.
• Right-click on “file extension” and enter xls.
• Right-click on “directory”, click “constant value” and enter the path to the “data” directory you just created.
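
What the List Files by Extension shim contributes can be approximated in a few lines of Python. This is a sketch under assumptions: the function name is made up, and details such as case-insensitive matching, sorting, and not descending into subdirectories are choices made here, not documented behaviour of the Taverna service.

```python
from pathlib import Path
from typing import List


def list_files_by_extension(directory: str, extension: str) -> List[str]:
    """Return the paths of files in `directory` whose extension matches.

    Assumptions for this sketch: matching ignores case, only the top level
    of the directory is searched, and results are sorted by path.
    """
    suffix = "." + extension.lstrip(".").lower()
    return sorted(
        str(path)
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() == suffix
    )
```

Pointed at the “data” directory with the extension xls, this would return the two spreadsheet paths and skip the two text files.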

18
• We need to reconfigure the Spreadsheet service.
• We’ll set the rows from 3 to 100.
• And make the service ignore the blank rows.
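
The configuration above (one column, a fixed row range, blank rows ignored) can be sketched in Python for the csv case using only the standard library. Reading xls files would need a third-party library and is not shown. The function name `read_column` and its parameters are assumptions for this illustration.

```python
import csv
from typing import List


def read_column(path: str, column: str = "B",
                first_row: int = 3, last_row: int = 100) -> List[str]:
    """Read one column of a CSV file over a 1-based row range, skipping blanks.

    Sketch only: handles single-letter column names (A to Z) and assumes
    the file is UTF-8 encoded.
    """
    col_index = ord(column.upper()) - ord("A")
    values: List[str] = []
    with open(path, newline="", encoding="utf-8") as handle:
        for row_number, row in enumerate(csv.reader(handle), start=1):
            if row_number < first_row:
                continue
            if row_number > last_row:
                break
            # Rows that are too short or hold only whitespace count as blank.
            cell = row[col_index].strip() if col_index < len(row) else ""
            if cell:
                values.append(cell)
    return values
```

Setting the upper bound to 100 is harmless for shorter files: the loop simply ends when the file does, which is why a generous fixed range plus "ignore blank rows" works for spreadsheets of different lengths.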

19
• Run the workflow.
• When we look at the results, we can see that Taverna:
  • read the species names from both spreadsheets
  • ignored the text files
  • found the values for them using the GBIF service

