University of Illinois Role of Mashups, Cloud Computing, and Parallelism for Visual Analytics Loretta Auvil.


1 University of Illinois Role of Mashups, Cloud Computing, and Parallelism for Visual Analytics Loretta Auvil

2 University of Illinois Outline

3 University of Illinois SW Silos We continue to build silos… Why?  I’m only creating a prototype for my paper…  I want to have control…  I want to write my own code…  I can do it faster…  I’m not funded to integrate with…  … Images from Google Search

4 University of Illinois From Silos to Mashups  Definition: A mashup is a web page or application that uses and combines data, presentation, or functionality from two or more sources to create new services  Why do we want this?  Enable our services in many applications and on a variety of devices (laptop, high-res display wall, iPad, iPhone, or others)  Sharing and reuse are good  Reach communities with our tools and their data!  What can we do to change this?  We can think about and create data-driven solutions so that they can be mashed up with other tools.  We can build web services that can be deployed or accessed.  We can create APIs to be used.  How can we do this?
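The mashup definition above can be illustrated with a minimal Python sketch. It assumes two hypothetical sources that are not part of the presentation: `extract_locations` stands in for an entity-extraction service, and `GEOCODER` stands in for a geocoding service with approximate, illustrative coordinates. The combined output is the kind of marker list a map widget could consume.

```python
import json

# Hypothetical stand-in for an entity-extraction service (assumption, not a real API).
def extract_locations(text):
    known_places = ["Urbana", "Chicago"]
    return [name for name in known_places if name in text]

# Hypothetical stand-in for a geocoding service; coordinates are approximate.
GEOCODER = {"Urbana": (40.11, -88.21), "Chicago": (41.88, -87.63)}

def map_markers(text):
    """Combine both sources into marker records for a map display."""
    markers = []
    for name in extract_locations(text):
        if name in GEOCODER:
            lat, lon = GEOCODER[name]
            markers.append({"label": name, "lat": lat, "lon": lon})
    return markers

if __name__ == "__main__":
    print(json.dumps(map_markers("From Urbana to Chicago by train."), indent=2))
```

Because each source sits behind a small function or lookup, either one could be swapped for a real web service without changing the code that combines them.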

5 University of Illinois Mashup Framework Components Virtualization Infrastructure Meandre Infrastructure Visualization Component Repository Component Discovery Meandre Data-Intensive Flows Apps Services Plugins Web Apps Analytics Data Developer Tools Repositories Data Analysis Components Flows User Interfaces Computational Resources Visualizations Meandre Workbench

6 University of Illinois Scientific Workflows: Kepler, Triana, BPEL, Ptolemy II, Taverna, Trident, Meandre, VisTrails (David De Roure slide, slightly modified)

7 University of Illinois Meandre for Mashups  Major Capabilities  Dataflow execution  Semantic technology (using RDF for storing meta info)  Web-Oriented  Supports publishing services for data, analytics and visualization  Modular components  Encapsulation and execution mechanism  Promotes reuse, sharing, and collaboration  Cloud-friendly infrastructure  Note: (for Tom) Trading off some performance for reuse, flexibility and modular components… with option to parallelize components to improve performance
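As a rough illustration of dataflow execution with modular components, the Python sketch below wires two components together and runs each one only when data reaches it. This is not Meandre's implementation or API; `Component`, `connect`, and `run_flow` are hypothetical names chosen for this example.

```python
from collections import deque

class Component:
    """A modular unit that processes one item and passes the result downstream."""
    def __init__(self, name, func):
        self.name = name
        self.func = func
        self.downstream = []

    def connect(self, other):
        """Link this component's output to another component's input."""
        self.downstream.append(other)
        return other

def run_flow(source, items):
    """Data-driven execution: a component fires only when an item arrives."""
    outputs = []
    queue = deque((source, item) for item in items)
    while queue:
        component, item = queue.popleft()
        result = component.func(item)
        if not component.downstream:
            outputs.append(result)  # terminal component: collect the result
        for nxt in component.downstream:
            queue.append((nxt, result))
    return outputs

if __name__ == "__main__":
    tokenize = Component("tokenize", str.split)
    count = Component("count", len)
    tokenize.connect(count)
    print(run_flow(tokenize, ["a b c", "d e"]))
```

Each component knows nothing about its neighbors beyond the connection, which is what makes this style of component easy to reuse in other flows.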

8 University of Illinois Components Analytics Unsupervised Learning Clustering Frequent Pattern Analysis (Rule Association) Supervised Learning Naïve Bayesian Support Vector Machines (Weka) Decision Trees (C4.5) Optimization Approaches Genetic Algorithm Text Analysis (POS, Entity Ext) OpenNLP Stanford NER Visualization Geographic (Google Maps) Temporal (Simile) Network Graphs – Link Nodes and Arcs (Protovis) Parallel Coordinates (Protovis) Stacked Area Chart (Flare) Tag Cloud Maker Decision Tree (Applet D2K) Naïve Bayes (Applet D2K) Rule Association (Applet) Dendrogram (GWT)

9 University of Illinois Readability Analysis Meandre Services from Firefox Plugin Tag Cloud Analysis Date Entity to Simile Timeline Network Analysis Automatic Summarization Location Entity to Google Map Example: Zotero and SEASR

10 University of Illinois An Ideological Metaphor & Definition  Cloud Metaphor  The term cloud is used as a metaphor for the Internet, based on how it is depicted in computer network diagrams, and is an abstraction for the complex infrastructure it conceals  Cloud Computing – Definition  The first academic use of this term appears to define it as a computing paradigm where the boundaries of computing will be determined by economic rationale rather than technical limits.  Cloud computing is a paradigm of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet. Users need not have knowledge of, expertise in, or control over the technology infrastructure in the "cloud" that supports them. Source: http://en.wikipedia.org/wiki/Cloud_computing

11 University of Illinois Cloud Computing  How can we leverage these computation environments?  Known issues  Cloud mechanics have a steep learning curve…  Data movement to the cloud  Security  Next generation data-intensive applications will:  Use cloud computing technologies and conduits  Require adaptation of programming paradigms  Leverage a flexible and modular architecture  Promote processing and resources at scale  Distributed data flow designs to allow processing to be co-located with data sources and enable transparent scalability

12 University of Illinois Meandre in the Clouds  Meandre  Data-intensive execution engine  Component-based programming architecture  Orchestrate cloud deployments  Leverage cloud conduits  NCSA Virtual Machines & Enterprise Cloud  VMware, Xen, & Eucalyptus  ElasticFox & AMS Web Application

13 University of Illinois Components for Amazon & Eucalyptus Components can be created to:  List images  Launch/terminate instances  Transfer data or programs to running instances  Trigger process computation  Monitor processes and/or persistent services

14 University of Illinois Cloud Orchestration Data Flow

15 University of Illinois Parallelism  Writing parallel code can be hard and debugging even harder…  But we need it because our data sets are growing…  And software tools can help  And hardware is also available  MapReduce model  A powerful abstraction (software framework) developed by Google to support distributed computing on large data sets on clusters of computers  Hadoop is an open-source implementation  GPUs
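The MapReduce model mentioned above can be sketched in a few lines of single-process Python. This is a teaching sketch of the map, shuffle, and reduce phases using word counting, not Hadoop's or Google's API; all function names are chosen for this example.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

In a real cluster the map calls run on different machines and the shuffle moves data over the network; the programmer still writes only the map and reduce functions, which is why the abstraction is attractive.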

16 University of Illinois Meandre for Parallelism  Implemented a Script Language (ZigZag)  Implemented MapReduce in Meandre  Automatic Parallelization for stateless components  Adding the operator [+4] or [+4!] would result in a directed graph

    #
    # Describes the data-intensive flow
    #
    @pu = push()
    @pt = pass( string:pu.string ) [+4!]
    print( object:pt.string )
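The idea of automatically parallelizing a stateless component can be approximated in Python, shown here as an analogy rather than as ZigZag semantics. The sketch runs four copies of a component over a stream of items and returns results in input order. The slide does not say what the `!` in `[+4!]` means; the sketch assumes it requests ordered output. `pass_through` and `run_parallel` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

def pass_through(text):
    """Hypothetical stateless component: output depends only on the input item."""
    return text.upper()

def run_parallel(component, items, copies=4):
    """Run `copies` workers over the items; Executor.map keeps input order."""
    with ThreadPoolExecutor(max_workers=copies) as pool:
        return list(pool.map(component, items))

if __name__ == "__main__":
    print(run_parallel(pass_through, ["a", "b", "c", "d", "e"]))
```

Statelessness is what makes this safe: because no copy keeps information between items, any copy can process any item and the combined result matches a sequential run.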

17 University of Illinois Scaling Genetic Algorithms in Meandre Intel 2.8 GHz quad-core, 4 GB RAM. Average of 20 runs.

18 University of Illinois And With Hadoop 60 dual quad-core Xeons with 8 GB RAM. Gigabit Ethernet. Resource exhaustion

19 University of Illinois Summary

