Making Mashups with Marmite Jeff Wong Jason I. Hong Carnegie Mellon University.


1 Making Mashups with Marmite Jeff Wong Jason I. Hong Carnegie Mellon University

2 The Big Picture Problem
Lots of content out there on the web
  – But not always in a form amenable to your needs
  – Ex. Easy to get a list of hotels in San Jose, not so easy to sort by distance to convention center
Two observations:
  – In many cases, all of the data and services people need already exist, but not connected together
  – Unlikely that a web site can predict all possible needs

3 A Solution: Mashups
Rapidly growing community of users creating “mashups” combining content from multiple web sites
  – Ex. Housingmaps.com

4

5

6

7 A Solution: Mashups
Rapidly growing community of users creating “mashups” combining content from multiple web sites
  – Ex. Housingmaps.com
  – Ex. MySpace child predators
  – Ex. Friendster locations
  – Ex. Most popular videos on YouTube, Yahoo Video, …

8 A Solution: Mashups
Rapidly growing community of users creating “mashups” combining content from multiple web sites
  – Ex. Housingmaps.com
  – Ex. MySpace child predators
  – Ex. Friendster locations
  – Ex. Most popular videos on YouTube, Yahoo Video, …
ProgrammableWeb.com statistics
  – ~1500 mashups created since April 2005
  – 356 open web-based APIs available

9 But Creating Mashups is Hard
Requires lots of skill to create a mashup
  – Ex. Housingmaps creator has PhD in computer science
  – Ex. MySpace child predator list took months
Requires programming expertise in many areas
  – Web crawling
  – Text parsing
  – Pattern matching
  – Databases
  – HTML

10 Marmite End-User Programming for Mashups
Main idea: make it easy to create web mashups
Use a dataflow approach connecting small operators
  – Inspired by Unix pipes and Apple’s Automator
Example:
  – Get all events from Upcoming.org
  – Filter out events that are too old
  – Put them all onto a map
Runs inside of a standard web browser
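
The three-step example on this slide can be read as a chain of small functions, each consuming the previous one's rows. The following is only an illustrative sketch in Python, not Marmite's code: the sample rows, the function names (`source_events`, `filter_recent`, `sink_map`, `run_flow`), and the string "map" output are all assumptions made for the example.

```python
from datetime import date

# Hypothetical sample rows standing in for events fetched from a web service.
EVENTS = [
    {"name": "Jazz night", "date": date(2007, 4, 30), "address": "1 Main St"},
    {"name": "Old meetup", "date": date(2007, 1, 15), "address": "2 Oak Ave"},
]

def source_events():
    """Source operator: emit rows (a fixed list here, not a real web query)."""
    return list(EVENTS)

def filter_recent(rows, cutoff):
    """Processor operator: keep only rows dated on or after the cutoff."""
    return [row for row in rows if row["date"] >= cutoff]

def sink_map(rows):
    """Sink operator: render rows as map-pin labels (plain strings here)."""
    return [f"pin: {row['name']} @ {row['address']}" for row in rows]

def run_flow(cutoff):
    """Chain the operators in order, the way a pipe chains commands."""
    return sink_map(filter_recent(source_events(), cutoff))

pins = run_flow(date(2007, 4, 1))
```

Each operator only needs to agree on the row format, which is what lets small pieces be recombined into different flows.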

11 Set of Operators

12 Data Flow View

13 Data View

14 Using Marmite (Envisioned)
Extract content from one or more web pages
  – names, addresses, dates, phone #, URLs
Process it in a data flow manner
  – filtering out values or adding metadata
  – integrating with other data sources (similar to a database join operation)
Direct the output to a variety of sinks
  – databases, map services, text files, visualizations, web pages, or source code that can be further edited
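
The "similar to a database join" step can be pictured as matching rows from two sources on a shared column. This is a minimal sketch under stated assumptions: the function name `join_rows`, the `zip` key, and the hotel and area rows are invented for illustration and do not come from the talk.

```python
def join_rows(left, right, key):
    """Attach columns from `right` to each `left` row that shares `key`."""
    index = {row[key]: row for row in right}   # look up right-hand rows by key
    joined = []
    for row in left:
        match = index.get(row[key])
        if match is not None:                  # inner join: drop unmatched rows
            joined.append({**row, **match})
    return joined

# Hypothetical data: extracted rows plus a second data source.
hotels = [{"name": "Hotel A", "zip": "95113"}, {"name": "Hotel B", "zip": "94301"}]
areas = [{"zip": "95113", "city": "San Jose"}]
result = join_rows(hotels, areas, "zip")
```

Here unmatched rows are dropped; a flow that should keep them would need an outer-join variant instead.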

15 Marmite Motivation and Examples Features and Design Rationale User Evaluation

16 Features and Design Rationale
Conducted a series of quick evaluations to understand design space and potential problems
  – Automator
  – Lo-fi prototypes

17 Automator

18 Informal Automator Evaluation
Had three novices try three simple web-based tasks
  – Warm-up task
  – Traverse a set of web pages
  – Download a set of images
Some findings:
  – Some difficulties knowing how to start and what to do next
  – Little feedback about state of system between operations
  – Difficult to iterate due to network speed issues

19 Lo-Fi Prototypes 6 paper prototypes with 20 participants

20 Design Solutions
Problem: how to start and what to do next
Solution: Suggest next actions
  – Weak data typing to find types (addresses, numbers, etc)
  – Filter operators to only show relevant ones
  – Suggest operators that might be applicable
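
One way to picture weak data typing driving suggestions: guess a loose type per column from its values, then offer only the operators that accept that type. The sketch below is an assumption-laden illustration, not the heuristics Marmite uses: the regular expressions, the operator catalog, and the names `guess_type` and `suggest_operators` are all hypothetical.

```python
import re

# Hypothetical patterns; a real system would need far more robust detection.
TYPE_PATTERNS = {
    "number": re.compile(r"^-?\d+(\.\d+)?$"),
    "url": re.compile(r"^https?://\S+$"),
    "address": re.compile(r"^\d+\s+\w+.*\b(St|Ave|Rd|Blvd)\b", re.IGNORECASE),
}

# Hypothetical operator catalog: operator name -> column type it accepts.
OPERATORS = {"geocode": "address", "fetch page": "url", "sum": "number"}

def guess_type(values):
    """Return the first type whose pattern matches every value, else 'text'."""
    for type_name, pattern in TYPE_PATTERNS.items():
        if values and all(pattern.match(v) for v in values):
            return type_name
    return "text"

def suggest_operators(table):
    """Map each column to the operators whose input type matches its guess."""
    suggestions = {}
    for column, values in table.items():
        column_type = guess_type(values)
        suggestions[column] = sorted(
            name for name, accepts in OPERATORS.items() if accepts == column_type
        )
    return suggestions

table = {
    "where": ["5000 Forbes Ave", "417 S Craig St"],
    "price": ["1200", "950.50"],
    "title": ["Sunny studio", "2BR near campus"],
}
suggested = suggest_operators(table)
```

The typing is deliberately weak: a column that matches nothing falls back to plain text and simply gets no suggestions rather than an error.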

21

22 Design Solutions
Problem: little feedback about state of system between operations
Solution: link data flow and data view together
  – Many systems take program-centric view (ex. Automator) or data-centric view (ex. spreadsheets)
  – Use hybrid data flow / data view, showing an operation and its effects together
  – Data view usually “spreadsheet”, other views possible too (for example, maps)

23

24

25 Design Solutions
Problem: difficult to iterate due to network speeds
Solution: cache data, let people “replay” data
  – Reload, pause, play
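
The caching idea can be sketched as a wrapper that stores each operator result so a flow can be replayed without touching the network again. This is a hypothetical illustration only: the class `CachedOperator`, its `run` and `reload` methods, and the stand-in `fake_fetch` function are assumptions, and the slide's pause/play controls are not modeled.

```python
class CachedOperator:
    """Wrap a slow fetch function so repeated runs reuse stored results."""

    def __init__(self, fetch):
        self.fetch = fetch        # the slow call, e.g. a network request
        self.cache = {}           # input -> previously fetched output
        self.fetch_count = 0      # how many times the slow call really ran

    def run(self, key):
        """Replay from the cache when possible, otherwise fetch and store."""
        if key not in self.cache:
            self.fetch_count += 1
            self.cache[key] = self.fetch(key)
        return self.cache[key]

    def reload(self, key):
        """Discard the cached result and fetch again."""
        self.cache.pop(key, None)
        return self.run(key)

def fake_fetch(url):
    """Stand-in for a network call; returns a deterministic string."""
    return f"contents of {url}"

op = CachedOperator(fake_fetch)
first = op.run("http://example.com/events")
second = op.run("http://example.com/events")   # served from the cache
count_after_replay = op.fetch_count
op.reload("http://example.com/events")         # forces a second real fetch
```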

26 Other Design Findings
Screen real estate issues
  – Collapsible operators, leaving a readable label

27 Extracting Generic Content
Can’t have pre-defined extractor operators for every possible web site
  – Need a more general way of extracting data from pages
Developed a generic wizard UI for selecting links
  – Content from that set could be extracted via other operators
  – Uses Solvent (MIT), an XPath-based algorithm for finding patterns in web pages
Finds “groups” of related web content based on how HTML is structured
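
To give a feel for grouping content by HTML structure, the sketch below buckets links by the chain of tags that encloses them, a rough analogue of an XPath without positional indices. It is explicitly not Solvent's algorithm, which the slides do not describe in detail; the class `LinkGrouper`, the `VOID_TAGS` set, and the sample page are assumptions made for this example.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not go on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

class LinkGrouper(HTMLParser):
    """Group link URLs by the chain of enclosing tags (a crude structural path)."""

    def __init__(self):
        super().__init__()
        self.stack = []                   # currently open tags
        self.groups = defaultdict(list)   # tag path -> list of hrefs

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self.stack.append(tag)
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.groups["/".join(self.stack)].append(href)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

# Hypothetical page: one navigation link and two structurally similar listings.
PAGE = """
<html><body>
<div><a href="/about">About</a></div>
<ul>
  <li><a href="/listing/1">Studio</a></li>
  <li><a href="/listing/2">Loft</a></li>
</ul>
</body></html>
"""

grouper = LinkGrouper()
grouper.feed(PAGE)
groups = dict(grouper.groups)
```

Links that share a path, such as the two listings, land in one group, which is the kind of set a wizard could then offer the user for selection.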

28 Marmite

29 Operators
Operators have input types
  – Operator uses this to guess which columns it wants
Operators have output types
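
A small sketch of how declared input and output types might let an operator choose its own column: the operator scans the table's column types for one matching its input type, applies itself to that column, and records the type of the column it adds. Everything here is hypothetical (the `Operator` dataclass, `pick_column`, `run_operator`, and the example `domain` operator); the slides do not show Marmite's actual operator interface.

```python
from dataclasses import dataclass
from typing import Callable
from urllib.parse import urlparse

@dataclass
class Operator:
    name: str                      # also used as the name of the output column
    input_type: str                # column type the operator consumes
    output_type: str               # column type the operator produces
    apply: Callable[[str], str]    # per-value transformation

def pick_column(op, column_types):
    """Return the first column whose type matches the operator's input type."""
    for column, type_name in column_types.items():
        if type_name == op.input_type:
            return column
    return None

def run_operator(op, rows, column_types):
    """Apply the operator to its guessed column and add a typed output column."""
    column = pick_column(op, column_types)
    if column is None:
        raise ValueError(f"no column of type {op.input_type!r} for {op.name!r}")
    new_rows = [{**row, op.name: op.apply(row[column])} for row in rows]
    new_types = {**column_types, op.name: op.output_type}
    return new_rows, new_types

# Hypothetical operator: takes a URL column, produces a text column.
domain = Operator("domain", "url", "text", lambda url: urlparse(url).netloc)

rows = [{"title": "Jazz night", "link": "http://upcoming.org/event/1"}]
column_types = {"title": "text", "link": "url"}
out_rows, out_types = run_operator(domain, rows, column_types)
```

Because the output column carries a type as well, the next operator in the flow can make the same kind of guess about it.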

30 Implementation
JavaScript (for underlying code) and Extensible Binding Language (XBL for UI)
Operators currently in JavaScript
  – Ideally could be scriptable in any programming language
  – Currently ~15 operators

31 Marmite Motivation and Examples Features and Design Rationale User Evaluation

32 Evaluation
Informal user study with 6 people
  – 2 novices
  – 2 people with spreadsheet experience (formulas)
  – 2 people with programming experience
Tasks (in increasing difficulty)
  – Warmup task showing how to retrieve a set of addresses and how to geocode an address
  – Search for and filter out events further than a week away
  – Compile a list of events from two event services and plot them on a map
  – Recreate the housingmaps site

33 Results
Three people able to complete all tasks in ~1 hour
  – First two users confused about suggested actions (automatically popped up, made manual for other 4 users)
  – Novice made some progress, not able to finish all tasks
Able to re-create housingmaps in ~15 minutes

34 Marmite

35 More Results
Biggest barrier was understanding the data flow
  – Did not understand input and output concept
  – Applied operators as one-off, did not realize that it was a static representation of flow
  – Did not understand data flow and data view were linked

36 Future Directions
Short-term
  – Better screen-scraping operators
  – More operators
  – Better connection with web services (WSDL and REST)
  – Better help for starting a data flow
Long-term
  – Intelligence analysis
  – Better visualizations
  – Location-based services

37 Conclusions
Marmite, a tool for creating web-based mashups
  – Extract content from one or more web pages
  – Process it in a data flow manner
  – Direct the output to a variety of sinks
Hybrid data flow / data view
User evaluation shows some promising results
Jeff Wong, Jason Hong, Making Mashups with Marmite: Re-purposing Web Content through End-User Programming, CHI 2007

38

39

40

41 Marmite

42 Types of Operators
Sources
  – Add data into Marmite by querying databases, extracting information from web pages, and so on.
Processors
  – Modify, combine, or delete existing rows. Example operators include geocoding (converting street addresses to latitude and longitude) and filtering. Processor operators might add or remove columns as well.
Sinks
  – Redirect the flow of data out of Marmite. Examples include showing data on a map, saving it to a file, or publishing it to a web page.

