Meandre Workbench National Center for Supercomputing Applications University of Illinois at Urbana-Champaign.


1 Meandre Workbench National Center for Supercomputing Applications University of Illinois at Urbana-Champaign

2 Outline Overview of Workbench Overview of Repositories Designing and Constructing Flows

3 Meandre: Data Driven Execution Execution Paradigms –Conventional programs perform computational tasks by executing a sequence of instructions. –Data driven execution revolves around the idea of applying transformation operations to a flow or stream of data when it is available. Dataflow Approach –May have zero to many inputs –May have zero to many outputs –Performs a logical operation when data is available

4 Meandre: Dataflow Example Value1 Value2 Sum

5 Meandre: Dataflow Example Dataflow Addition Example –Logical Operation ‘+’ –Requires two inputs –Produces one output When two inputs are available –Logical operation can be performed –Sum is output When output is produced –Reset internal values –Wait for two new input values to become available [Figure labels: Value1, Value2, Sum]
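
The addition example on this slide can be sketched in a few lines of Python. This is only an illustration of the firing rule described above, not Meandre's component API: the class name `AdderComponent`, the `push` method, and the port names `value1` and `value2` are all assumptions made for the sketch.

```python
class AdderComponent:
    """Hypothetical dataflow component: fires only when both inputs are present."""

    def __init__(self):
        self._inputs = {"value1": None, "value2": None}

    def push(self, port, value):
        """Deliver a value to an input port.

        Returns the sum if the component fires, otherwise None.
        """
        if port not in self._inputs:
            raise KeyError(f"unknown input port: {port}")
        self._inputs[port] = value
        if any(v is None for v in self._inputs.values()):
            return None  # still waiting for the other input
        result = self._inputs["value1"] + self._inputs["value2"]
        # Reset internal values and wait for two new inputs.
        self._inputs = {"value1": None, "value2": None}
        return result


adder = AdderComponent()
first = adder.push("value1", 2)   # only one input so far, nothing is output
second = adder.push("value2", 3)  # both inputs present, the component fires
third = adder.push("value1", 10)  # state was reset, so it is waiting again
```

Here `first` and `third` are `None` because only one input is available, while `second` holds the sum.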

6 Meandre: The Dataflow Component Data dictates component execution semantics Component P Inputs Outputs Descriptor in RDF of its behavior The component implementation

7 Meandre: Component Metadata Describes a component Separates: –Components semantics (black box) –Components implementation Provides a unified framework: –Basic building blocks or units (components) –Complex tasks (flows) –Standardized metadata

8 Meandre: Component Types Components are the basic building block of any computational task. There are two kinds of Meandre components: –Executable components Perform computational tasks that require no human interaction during runtime Processes are initialized during flow startup and are fired in accordance with the policies defined for them. –Control components Used to pause dataflow during user interaction cycles WebUI may be an HTML form, an applet, or another user interface

9 Meandre: Flow Connectivity Defined by connecting outputs from one component to the inputs of another. –Cyclical connections are supported –Components may have Zero to many inputs Zero to many outputs Properties that control runtime behavior Described using RDF –Enables storage, reuse, and sharing like components –Allows discovery and dynamic execution

10 Meandre: Flow (Complex Tasks) A flow is a collection of connected components Read P Merge P Do P Show P Get P Dataflow execution

11 A Little more on the Nuts & Bolts! Programming Paradigm What does Meandre Execution Engine Do? What are the possible Component Scenarios Data Driven Flow Creation (Workbench/Zigzag)

12 Meandre Server Prepares a Flow The Meandre Server prepares a data-intensive flow by reading the RDF component descriptors –Executable components and the connections between them are prepared by using a queue mechanism to store data as it becomes available on the ports. Meandre provides each component with an executing thread for processing. Meandre manages the logical queues for component connections in a flow. Meandre activates components for initialization, data events, and termination. Meandre provides components with access to runtime resources. [Figure labels: Context A, Queue, Context B]
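
The idea of one thread per component with a queue on each connection can be sketched with Python's standard library. This is a generic producer/consumer illustration and does not reproduce the Meandre server's internals; the function names and the `SENTINEL` end-of-stream marker are assumptions made for the sketch.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker, not a Meandre concept


def producer(out_q):
    """Component with no inputs: pushes data onto its output connection."""
    for item in ["a", "b", "c"]:
        out_q.put(item)
    out_q.put(SENTINEL)


def consumer(in_q, results):
    """Component with one input: processes each item as it becomes available."""
    while True:
        item = in_q.get()  # blocks until data is available on the port
        if item is SENTINEL:
            break
        results.append(item.upper())


connection = queue.Queue()  # the queue that backs one connector
results = []
threads = [
    threading.Thread(target=producer, args=(connection,)),
    threading.Thread(target=consumer, args=(connection, results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the queue is first-in, first-out and there is a single consumer, `results` preserves the order in which the producer emitted the items.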

13 Meandre Server Relationship to Component Component RDF Descriptor defines: –Firing.Policy ALL or ANY –Input & Output data ports that require a logical queue to be managed by the server [Figure labels: Meandre Server Infrastructure; Component; Pull Inputs; Push Outputs]
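
A firing policy check might look like the sketch below. It assumes that ALL means "fire once every input port has data queued" and ANY means "fire as soon as at least one input port has data queued"; the slide names the two policies but does not spell out these semantics. The names `FiringPolicy` and `ready_to_fire` are hypothetical.

```python
from collections import deque
from enum import Enum


class FiringPolicy(Enum):
    ALL = "all"
    ANY = "any"


def ready_to_fire(policy, input_queues):
    """Decide whether a component with input ports may fire.

    input_queues maps an input port name to the queue of pending data.
    Components without input ports are outside the scope of this sketch.
    """
    has_data = [len(q) > 0 for q in input_queues.values()]
    if not has_data:
        return False
    if policy is FiringPolicy.ALL:
        return all(has_data)
    return any(has_data)


ports = {"a": deque([1]), "b": deque()}
any_before = ready_to_fire(FiringPolicy.ANY, ports)  # port "a" has data
all_before = ready_to_fire(FiringPolicy.ALL, ports)  # port "b" is still empty
ports["b"].append(2)
all_after = ready_to_fire(FiringPolicy.ALL, ports)   # both ports have data
```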

14 Meandre Server Flows & Connectors Flows can have any number of components with –“None to Many” input data ports –“None to Many” output data ports Flow components may have multiple connectors assigned to any input data port. Flows may contain connectors that are cyclical over one or more components. Flows are made up of “One or More” components with “None to Many” connectors that are described to the Meandre Server for management. Flows must contain at minimum one component with NO inputs to cause an Execute call to be made. *Outputs are always optional.
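
The rules on this slide can be expressed as a small validation routine. The sketch below is an assumption-laden illustration: the dictionary layout and the function name `validate_flow` are invented for the example, and the real server works from RDF descriptors rather than Python dictionaries. The component and port names are taken from the learning exercises later in this deck.

```python
def validate_flow(components, connectors):
    """Return a list of problems; an empty list means these checks pass.

    components: {name: {"inputs": [...], "outputs": [...]}}
    connectors: list of (source, output_port, target, input_port) tuples
    """
    problems = []
    if not components:
        problems.append("a flow needs at least one component")
    elif not any(len(c["inputs"]) == 0 for c in components.values()):
        problems.append("no component without inputs, so nothing starts execution")
    for src, out_port, dst, in_port in connectors:
        if src not in components or out_port not in components[src]["outputs"]:
            problems.append(f"unknown output {src}.{out_port}")
        if dst not in components or in_port not in components[dst]["inputs"]:
            problems.append(f"unknown input {dst}.{in_port}")
    return problems


flow = {
    "Push Text": {"inputs": [], "outputs": ["text"]},
    "Universal Text Extractor": {"inputs": ["location"], "outputs": ["text"]},
}
links = [("Push Text", "text", "Universal Text Extractor", "location")]
ok_problems = validate_flow(flow, links)

# A purely cyclical flow: every component has an input, so nothing can start it.
cycle = {"A": {"inputs": ["in"], "outputs": ["out"]}}
cycle_problems = validate_flow(cycle, [("A", "out", "A", "in")])
```

Cycles themselves are allowed; the second example fails only because no component is free of inputs.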

15 Meandre: Programming Paradigm The programming paradigm creates complex tasks by linking together specialized components. Meandre's publishing mechanism allows components developed by third parties to be assembled in a new flow. There are two ways to develop flows: –Meandre’s Workbench visual programming tool –Meandre’s ZigZag scripting language

16 Workbench Web-based UI Components and flows are retrieved from server Additional locations of components and flows can be added to server Create flow using a graphical drag and drop interface Change property values Execute the flow

17 What is it? Visual programming environment Thankfully, no code writing skills are required. Provides a mechanism to create and execute flows Built on top of GWT (Google Web Toolkit) – accessible from all major browsers

18 Getting Started Fire up your favorite browser and connect If you installed the Workbench on your local machine, use http://localhost:1712 to access it; otherwise replace “localhost” with the address of the computer where the Workbench is running. Log in

19 The Workbench

20 The Workspace used as a main staging area for building / editing flows The Output Panel The Details Panel The Repository Panel

21 The Workspace Components can be dragged into this region from the “Components” panel and interconnected to create flows.

22 The Flow Toolbar Provides access to frequently used functions –save flows –remove components –export as ZigZag or MAU –flow execution

23 Saving a Flow Required metadata: - Name - Base URL Separate tags with commas

24 Removing Components Two ways: 1.Select the component and click “Remove” on the toolbar 2.Right-click the component you want to remove and select “Remove”

25 Controlling Flow Execution Run Flow –Executes the current flow loaded in the Workspace. Any output from the flow will be displayed in the Output panel. –If the flow contains interactive components, they will be displayed automatically. –Important: Please be sure to set your browser to allow pop-ups from the Workbench, otherwise the web interactive components will not display! Stop Flow –Sends a request to the Meandre server to abort the currently executing flow. –May take a while – the server waits for components to finish their current operation.

26 The Repository Panel Three sections: Components Flows Locations Searching is supported Display is Customizable: Column selection Sorting Grouping

27 Components Software units are designed to accomplish a particular task May have inputs, outputs, and properties Components with properties can be identified by a symbol appearing in the lower left-hand side of a component icon

28 Flows A Flow is essentially an application — a group of components connected together to perform a set of tasks Click on the Flows tab in the Repository panel to view the flows in your Workbench. Double click on a flow to load that flow into the Workspace.

29 Locations Adding a repository location causes all the components and flows hosted at that location to be imported in the user’s private repository on the server Removing a location also removes the associated components and flows from the server. You can find a list of available repository locations at http://www.seasr.org/documentation

30 The Details Panel Shows the properties and description of a selected component or flow Properties Description For components, the Description displays information about the component function. For flows, the Description displays information about the flow and the components it contains and their property values.

31 The Output Panel Displays output and error messages generated by the Workbench

32 Using the Workspace Placing Components The first step in building a flow is to choose components from the Repository panel and place them into the Workspace. To place a component, click on the Components section in the Repository panel and drag the desired component over into the Workspace area. Note: A flow must have at least one component with no inputs to be able to be executed by the Meandre server. Selecting Components Components can be selected by single clicking on them in the Workspace. When a component is selected, other selected items are deselected. While selected, a component can be moved about the Workspace or deleted. A selected component (or flow) can be unselected by using CTRL+click on that component (or flow).

33 Using the Workspace Labeling Components Editing the component label only changes the name of the component in the given flow. The label must remain unique among the other component labels in the flow. The label can be edited by single-clicking on it and entering the desired text. Pressing ESC while editing a label cancels the labeling operation and restores the original label. Connecting and Disconnecting Components To make a connection, click on the output port of the desired source component (the port you clicked will be colored red), and then click on the input port to which you wish to connect. You should now have a line connecting the output and input port. If, after selecting a port, you wish to cancel the operation, simply clicking the same port again will unselect it. The ports of two components should only be connected if their data types are compatible with one another. Any errors resulting from data incompatibilities will occur at runtime. To remove a connection, simply right-click one of the ports and select “Disconnect” from the context menu. Alternatively, you can remove groups of ports by right-clicking the component and selecting the appropriate menu option.

34 Using the Workspace Connecting and Disconnecting Components A component’s output port may only be connected to one input port. However, a component’s input port may be connected to several different output ports. This could be useful when you are retrieving the same data format from multiple components. The connection line is highlighted if the user hovers over an input or output port. This is useful for verifying connections in a complex flow. When hovering over a component port, the description of that port is also briefly displayed.
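
The connection rule on this slide (one target per output port, any number of sources per input port) can be modelled with a dictionary keyed by output port. The class `ConnectionTable` is a hypothetical sketch, not Workbench code. The output port name of "To Lowercase" and the component "Hypothetical Source" are assumptions added for the example.

```python
class ConnectionTable:
    """Hypothetical model of Workspace connections."""

    def __init__(self):
        # (component, output port) -> (component, input port)
        self._target_of = {}

    def connect(self, source, target):
        """Connect an output port to an input port.

        An output port keeps a single target, so reconnecting it replaces the
        previous connection. The replaced target (or None) is returned.
        """
        replaced = self._target_of.get(source)
        self._target_of[source] = target
        return replaced

    def sources_of(self, target):
        """All output ports feeding an input port (there may be several)."""
        return sorted(s for s, t in self._target_of.items() if t == target)


table = ConnectionTable()
extractor_out = ("Universal Text Extractor", "text")
tokenizer_in = ("OpenNLP Tokenizer", "text")

table.connect(extractor_out, tokenizer_in)
# Re-routing the same output port drops its earlier connection.
replaced = table.connect(extractor_out, ("To Lowercase", "text"))
table.connect(("To Lowercase", "text"), tokenizer_in)
# A second source into the same input port is allowed.
table.connect(("Hypothetical Source", "text"), tokenizer_in)
fan_in = table.sources_of(tokenizer_in)
```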

35 Demonstration We will be demonstrating the use of the Workbench for creating flows –Use TagCloudViewer as an example and explain how it was created

36 Learning Exercises Explore the functionality of the Meandre Workbench –Open the Meandre Workbench (WB) by navigating to http://localhost:1712 –Use existing components to create a data-driven flow for a basic Tag Cloud Viewer, to become familiar with the mechanics of drag and drop, creating connections, setting properties, saving, and executing –Create a new tab in the WB by clicking on the first tab (with the yellow star)

37 Learning Exercise 1 Retrieve text from a url –Expand the Components section of the WB (click on the + sign) –Find the component named "Push Text" (scroll down or use the search box) and drag it onto the workspace –Find the component named "Universal Text Extractor" and add it to the flow, as before –Connect the output port "text" of "Push Text" to the input port "location" of "Universal Text Extractor" (click on each port to make a connection)

38 Learning Exercise 2 Count the words –Find the components "OpenNLP Tokenizer" and "Token Counter" and add them to the flow, as before –Connect the output port "text" of "Universal Text Extractor" to the input port "text" of "OpenNLP Tokenizer" –Connect the output port "tokens" of "OpenNLP Tokenizer" to the input port "tokens" of "Token Counter"

39 Learning Exercise 3 Visualize with the Tag Cloud Viewer Find the components "Tag Cloud Image Maker", "HTML Fragment Maker" and "HTML Viewer" and add them to the flow Change the property named "encoding" of "HTML Fragment Maker" to read "image" (no quotes) –Select the "HTML Fragment Maker" component by clicking on it –Double click on "encoding" in the Details -> Properties panel on the right side of the WB and change the value by typing "image" (no quotes) –After changing the text, press ENTER to accept the new value Connect the output port "token_counts" of "Token Counter" to the input port of "Tag Cloud Image Maker" Connect the output port "raw_data" of "Tag Cloud Image Maker" to the input port of "HTML Fragment Maker" Connect the output port "html" of "HTML Fragment Maker" to the input port of "HTML Viewer"

40 Learning Exercise 4 Improve the Tag Cloud flow that you created to "clean" it up a bit Convert all words to lower case –Find the component "To Lowercase" and add it to the flow, connecting it between "Universal Text Extractor" and "OpenNLP Tokenizer" Click the output port "text" of "Universal Text Extractor" and then click the input port "text" of "To Lowercase" (this will remove the existing connection between "Universal Text Extractor" and "OpenNLP Tokenizer") Connect the output port of "To Lowercase" to the appropriate port of "OpenNLP Tokenizer"

41 Learning Exercise 5 Remove stop words –Add another "Push Text", "Universal Text Extractor" and "OpenNLP Tokenizer" to the flow, and connect them –Set the "message" property of this second "Push Text" to read "http://repository.seasr.org/Datasets/Text/common_words.txt" (no quotes) –Find and add the component "Token Filter" between "Token Counter" and "Tag Cloud Image Maker" –Connect the output port "token_counts" of "Token Counter" to the input port "token_counts" of "Token Filter" –Connect the output port "token_counts" of "Token Filter" to the input port of "Tag Cloud Image Maker" –Connect the output port "tokens" of the second "OpenNLP Tokenizer" to the input port "tokens_blacklist" of "Token Filter"

42 Learning Exercise 6 Filter to specific number of words –Find and add the component "Top N Filter" between "Token Filter" and "Tag Cloud Image Maker" –Connect the output port "token_counts" of "Token Filter" to the input port of "Top N Filter" –Connect the output port of "Top N Filter" to the input port of "Tag Cloud Image Maker" –Set the property "n_top_tokens" of "Top N Filter" to a number representing the number of top tokens to be displayed (ranked by token count)
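
For readers who want to see what the finished flow computes, the text-processing steps of Exercises 2 and 4 to 6 can be approximated in plain Python. This is a rough stand-in under stated assumptions, not the components' actual code: a simple regular expression replaces the "OpenNLP Tokenizer", the function name `top_token_counts` is hypothetical, ties are broken alphabetically by choice, and fetching text from a URL and rendering the tag cloud image are left out.

```python
import re
from collections import Counter


def top_token_counts(text, stop_words, n_top_tokens):
    """Approximate the flow: lowercase, tokenize, count, filter, keep top N."""
    lowered = text.lower()                    # "To Lowercase"
    tokens = re.findall(r"[a-z']+", lowered)  # stand-in for "OpenNLP Tokenizer"
    counts = Counter(tokens)                  # "Token Counter"
    filtered = {t: c for t, c in counts.items()
                if t not in stop_words}       # "Token Filter"
    # Rank by count (highest first); break ties alphabetically.
    ranked = sorted(filtered.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n_top_tokens]              # "Top N Filter"


sample = "The cat and the hat. The cat sat."
top = top_token_counts(sample, stop_words={"the", "and"}, n_top_tokens=2)
```

In the Workbench each of these steps is a separate component, so any one of them can be swapped or re-ordered by changing connections rather than editing code.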

43 Discussion Questions What are three advantages of using a component driven environment for text analytics? What are the possible obstacles for humanities scholars in using an environment like the Meandre Workbench to assemble and create flows for accomplishing their research needs? Are there parts of the workbench that are unclear or that need extra explanation? Do you have any feature requests? Are there any tools that you would like to see componentized such that you can work with these tools in the Meandre Workbench?

