CoScripter and Topes: Putting Data into Usable Formats Christopher Scaffidi Carnegie Mellon University With Allen Cypher and Jimmy Lin IBM Almaden.


1 CoScripter and Topes: Putting Data into Usable Formats
Christopher Scaffidi, Carnegie Mellon University
With Allen Cypher and Jimmy Lin, IBM Almaden

2 Data may be Incorrectly Formatted
Example: In a contextual inquiry, an end user needed to copy the job title, phone number, and email address of each staff member into a spreadsheet. Notice the mis-formatted phone number and email address. A web macro for this task would need to help the user fix the data.
formats ● coscripter ● topes

3 Data may be Inconveniently Formatted
Consider all the ways that we write dates. Reformatting may be necessary when reusing a date from one web site to fill out a form in another web site.
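The date problem above can be sketched in a few lines of Python. This is only an illustration, not how CoScripter or topes handle dates: the list `KNOWN_DATE_FORMATS` and the function `reformat_date` are hypothetical names chosen for this example, and the four input formats are an arbitrary sample.

```python
from datetime import datetime

# Hypothetical sample of input formats; real web pages use many more.
KNOWN_DATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%d %b %Y"]

def reformat_date(text, target_format="%Y-%m-%d"):
    """Try each known format in turn; return the date in target_format,
    or None if the text matches none of them."""
    for fmt in KNOWN_DATE_FORMATS:
        try:
            parsed = datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # not in this format, try the next one
        return parsed.strftime(target_format)
    return None

# Reuse a date copied from one site in the format another site expects.
print(reformat_date("March 5, 2008"))             # ISO style
print(reformat_date("03/05/2008", "%d %b %Y"))    # day, month abbreviation, year
```

Note that an ambiguous string such as 03/05/2008 is read here as month/day/year only because that format is listed; a tool that treats dates as plain strings has no way to make that choice at all.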

4 Limitations of Web Macro Tools
Right now, CoScripter cannot
– Clean up incorrectly formatted data
– Reformat inconveniently formatted data
Like most web macro tools, CoScripter treats data as strings (so it cannot clean up or reformat data). It does not recognize
– Phone numbers
– Email addresses
– Country currency codes
– Dates
etc.

5 Topes
A tope is a kind of data that has a natural place in the problem domain. (“tope” = Greek for “place”)
E.g.: Phone number, state name, person name
Many topes have several common formats.
– Phone number: 1-408-927-2513; 408-927-2513; (408) 927-2513; 408.927.2513; 408/927-2513
– State name: California; CALIFORNIA; CA; Calif.
– Person name: John von Neumann; JOHN VON NEUMANN; von Neumann, John; VON NEUMANN, JOHN
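As a rough illustration of "one kind of data, several formats", the five phone number formats listed above can each be described by a regular expression. This is a minimal sketch, not the topes implementation: the names `PHONE_FORMATS`, `recognize_phone`, and `to_parenthesized` are made up for this example.

```python
import re

# One pattern per phone number format shown on the slide.
# The format labels are hypothetical names for this sketch.
PHONE_FORMATS = {
    "1-AAA-EEE-NNNN": re.compile(r"1-(\d{3})-(\d{3})-(\d{4})"),
    "AAA-EEE-NNNN": re.compile(r"(\d{3})-(\d{3})-(\d{4})"),
    "(AAA) EEE-NNNN": re.compile(r"\((\d{3})\) (\d{3})-(\d{4})"),
    "AAA.EEE.NNNN": re.compile(r"(\d{3})\.(\d{3})\.(\d{4})"),
    "AAA/EEE-NNNN": re.compile(r"(\d{3})/(\d{3})-(\d{4})"),
}

def recognize_phone(text):
    """Return (format label, (area, exchange, number)) or None."""
    for label, pattern in PHONE_FORMATS.items():
        match = pattern.fullmatch(text.strip())
        if match:
            return label, match.groups()
    return None

def to_parenthesized(text):
    """Rewrite any recognized phone number as (AAA) EEE-NNNN."""
    recognized = recognize_phone(text)
    if recognized is None:
        return None
    _, (area, exchange, number) = recognized
    return f"({area}) {exchange}-{number}"

print(to_parenthesized("408.927.2513"))
print(recognize_phone("1-408-927-2513"))
```

Because every format captures the same three parts, moving between formats reduces to recognizing the input and re-emitting the parts with different punctuation.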

6 A tope is modeled as a graph of formats
An example tope for CMU room numbers
– 3 formats (called “isa” functions, which recognize data)
– 4 transformations (called “trf” functions, which reformat data)
– Most topes have enough trfs to form a connected graph
The 3 formats, each with an example:
– Formal building name & room number: Elliot Dunlap Smith Hall 225
– Building abbreviation & room number: EDSH 225
– Colloquial building name & room number: Smith 225
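The graph idea can be sketched as follows, under several assumptions that are not in the slides: `isa` returns a plain boolean, the four `trf` edges connect the formal format with each of the other two in both directions (the slide's diagram is not preserved, so the actual edges are unknown), and the building table holds only the one building named above. All function and variable names are hypothetical.

```python
from collections import deque

# Only the building named on the slide; a real tope would list every building.
BUILDINGS = [
    {"formal": "Elliot Dunlap Smith Hall", "abbrev": "EDSH", "colloquial": "Smith"},
]
FORMATS = ("formal", "abbrev", "colloquial")

def split_room(text, fmt):
    """Return (building record, room) if text is in format fmt, else None."""
    name, _, room = text.strip().rpartition(" ")
    if not room.isdigit():
        return None
    for building in BUILDINGS:
        if building[fmt] == name:
            return building, room
    return None

def isa(fmt, text):
    """Recognizer for one format (boolean here, by assumption)."""
    return split_room(text, fmt) is not None

def make_trf(src, dst):
    """Build a transformation from format src to format dst."""
    def trf(text):
        parts = split_room(text, src)
        if parts is None:
            raise ValueError(f"{text!r} is not in format {src}")
        building, room = parts
        return f"{building[dst]} {room}"
    return trf

# Assumed edges: 4 transformations, enough to connect all 3 formats.
EDGES = [("formal", "abbrev"), ("abbrev", "formal"),
         ("formal", "colloquial"), ("colloquial", "formal")]
TRF = {(src, dst): make_trf(src, dst) for src, dst in EDGES}

def reformat(text, target):
    """Recognize text, then follow trf edges (breadth-first) to target."""
    start = next((f for f in FORMATS if isa(f, text)), None)
    if start is None:
        return None
    queue, seen = deque([(start, text)]), {start}
    while queue:
        fmt, value = queue.popleft()
        if fmt == target:
            return value
        for (src, dst), trf in TRF.items():
            if src == fmt and dst not in seen:
                seen.add(dst)
                queue.append((dst, trf(value)))
    return None

print(reformat("EDSH 225", "colloquial"))  # goes abbrev -> formal -> colloquial
```

With a connected graph, there is no need for a direct transformation between every pair of formats: here no abbrev-to-colloquial edge exists, yet the conversion still succeeds by chaining two transformations.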

7 Topes + CoScripter / Vegemite
We have started integrating topes with CoScripter
– To recognize a string as a certain kind of data
– To clean up strings when incorrectly formatted
– To reformat strings to a more convenient format
Quick Demos

8 Other Topes Research
Prior work:
– Inference of formats from example strings
– UI so that end users can define new topes
– Using topes to validate and reformat data in spreadsheets, databases, and web application UIs
Future work:
– Repository so that users can share topes with one another
– Statistical techniques for automatically identifying and correcting incorrectly implemented topes

