How I was right, even when I was wrong
Chris Welty, IBM Research

Outline
  Opening joke
  Some personal history
  My failed predictions
  Lessons learned?
  A glimpse into the future
  Closing joke

History
  Remembering all these paths
  CSNet 1983
  Database too big
  Chat (write/talk)
  What a waste of time
  Collaborative Authoring
  NYSERNet, PSI
  InterNet DNS
  Overloaded .edu
  HTML/HTTP
  Too much porn
  Semantic Web
  Just tags on old stuff
  Importance of quality ontologies
  Like hypertext
  Social Tagging
  Put together collab author, web 2.0, quality ontologies,

The birth of a know-it-all
  Born in NY early 60s
  Early expert on everything
  Disappointment with Buck Rogers
  2nd grade
    Summed up numbers from 1 to 100 in five minutes
    Got 5100
    The answer is 5050
  First prediction (1975)
    I will marry Farrah Fawcett-Majors

How I did it … … 50 OOPS
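The correct total quoted on the previous slide can be checked with the classic pairing trick. This is a minimal sketch, assuming the numbers being summed were 1 through 100 (the stated answer of 5050 matches that range); the function name `pair_sum` is illustrative and not from the talk, and the slide does not say how the 5100 came about.

```python
def pair_sum(n: int) -> int:
    """Sum 1..n by pairing first and last terms (this sketch assumes an even n)."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("this pairing sketch assumes a positive, even n")
    pairs = n // 2       # 50 pairs when n = 100
    per_pair = n + 1     # each pair (1+100, 2+99, ...) adds up to 101
    return pairs * per_pair


total = pair_sum(100)
assert total == sum(range(1, 101))  # cross-check against a direct sum
print(total)
```

Miscounting either the number of pairs or the value of each pair is enough to land on a nearby but wrong total.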

Insult no group unless it includes me

Outline
  Opening joke
  Some personal history
  My failed predictions
  Lessons learned?
  A glimpse into the future
  Closing joke

First exposure to email
  1981 uucp mail
    allegro!batcave!cornell!rpics!weltyc
  Prediction:
    No one will ever use email
  Why I was right:
    Usenet paths were ridiculous
  What I missed:
    Paths and email were not tightly bound
    People really wanted email
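To see why such paths felt ridiculous, it helps to read the address on this slide as an explicit route: in a UUCP bang path the sender names every relay host in order, and the final element is the user. A minimal sketch follows; the helper name `parse_bang_path` is hypothetical and only illustrates the address format, not any real mail software.

```python
from __future__ import annotations


def parse_bang_path(path: str) -> tuple[list[str], str]:
    """Split a UUCP bang path into its relay hosts (in order) and the final user."""
    parts = path.split("!")
    if len(parts) < 2 or any(p == "" for p in parts):
        raise ValueError(f"not a valid bang path: {path!r}")
    *hosts, user = parts
    return hosts, user


hosts, user = parse_bang_path("allegro!batcave!cornell!rpics!weltyc")
print(hosts)  # the machines the message must pass through, in order
print(user)   # the mailbox on the last machine
```

Because the route is part of the address, the same person has a different address from every starting point, which is the coupling between paths and mail that later systems removed.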

Next generation
  CSNet 1983
  Prediction:
    As I said…
  Why I was right:
    Someone still has to maintain the list
    Won’t scale
  What I missed
    People really needed email
    It was better, not perfect

Domain Naming Service
  Proposed to IETF in 1985
  Distributed hierarchical database
  Distributed not only the data, but the maintenance
  might make email work
  Prediction:
    The .edu top-level will become overloaded
  Why I was right:
    The hierarchy was unbalanced
  What I missed:
    People were willing to invest in scale
    Money to be made in supplying domain names!
    It was better, not perfect
  (Diagram labels: root ".", with .edu, .com, .org beneath it)
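The idea of distributing both the data and its maintenance can be sketched as a lookup that walks the name from right to left, where each zone only knows its own children. This is a toy model under stated assumptions: the `ZONES` table, the server labels, and the name `www.example.edu` are all hypothetical, and real DNS resolution involves far more than this.

```python
# Hypothetical toy data: each zone is maintained separately and only lists its children.
ZONES = {
    ".": {"edu": "edu-servers", "com": "com-servers", "org": "org-servers"},
    "edu": {"example": "example-edu-servers"},
    "example.edu": {"www": "192.0.2.10"},  # documentation-only address
}


def resolve(name: str) -> list:
    """Walk the hierarchy right to left; return the zones consulted, then the answer."""
    labels = name.rstrip(".").split(".")
    zone = "."
    trail = []
    for label in reversed(labels):
        children = ZONES.get(zone)
        if children is None or label not in children:
            raise KeyError(f"{label!r} not found in zone {zone!r}")
        trail.append(zone)
        answer = children[label]
        # The next zone to ask is the one this label was delegated to.
        zone = label if zone == "." else f"{label}.{zone}"
    trail.append(answer)
    return trail


print(resolve("www.example.edu"))
```

In this picture, an unbalanced hierarchy simply means one child table (here the one under `edu`) grows much larger than its siblings.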

HTTP/HTML
  Proposed to IETF in 1990
  Hypertext is decades old, this just adds tags
  Prediction:
    No big deal, unimportant
    Porn will make the InterNet unusable
  Why I was right:
    Porn really was king of the early web (~70%)
  What I missed:
    People were willing to invest in scale
    It was more than just tags for hypercard

Web 2.0 (i.e. Social Web)
  Started roughly 2002
  Web of people instead of machines
  Wikis, social tagging, social networks
  Prediction
    TEENAGE NONSENSE
    Will be poisoned by stupidity, negativity, misdirection, spam
    Will not scale
  Why I was right
    Most of it is teenage nonsense
    Most people really are idiots
  What I missed
    People want to share their knowledge
    People scale on the web
    Quality seems to be self-governing in certain areas

Blind Men and Elephants
  I was right about the trunk.
  Which one are you?

Semantic Web
  The idea has been around for about a decade
  You may have heard of it
  I got the pitch from TimBL…
  Prediction:
    KR is decades old, this just adds tags
    Will not scale (KA, Reasoning)
    Proliferation of bad ontologies will lead to bad systems
  Why I was right:
    Reasoning doesn’t really scale (exptime is incomplete)
    Bad ontologies do lead to bad systems
  What I missed
    It’s not just tags
    KA does scale: people want to share their knowledge
    A lot of people don’t care about reasoning
    Better not perfect
    KA not needed: the actual vision

The Semantic Web Vision
  ~80% of web pages are generated from back end databases
  Publish the semantics (schema?) as well as the data
  URIs provide a web-based form of identity
  It’s the semantic WEB, not the SEMANTIC web
  NOT: humans will mark up their web pages
  NOT: NLP will populate the SW from web pages

Outline
  Opening joke
  Some personal history
  My failed predictions
  Lessons learned?
  A glimpse into the future
  Closing joke

Lessons learned
  People who make bad predictions still get to be invited speakers!
  The unimpressed scientist syndrome
  Applications that are needed will just happen
  Better not perfect
  People really want to share their knowledge
  Scalability of people on the web
  Scale happens

The Unimpressed Scientist
  Be more open minded
  Tend to “accept” rather than “reject”
  Don’t confuse the trunk for the elephant
  The evaluation criterion is not whether it will work, but whether it is needed

Better not Perfect
  Improvements are important
  So ask yourself, “Is this better?”
  Nit-picking usually is not important
  The boundary conditions matter, but aren’t everything
  Measurement, experimental conditions, become critical
  What is “better”?
  NLP perhaps takes this too far

Scalability
  Faster, bigger computers
  Better distribution
  People on the web
  The Captchas story
  Heuristics, statistics

People want to share their knowledge
  Shouldn’t be a surprise, this is what motivates us
  Still, most people are idiots
  …so…
  Pure openness doesn’t work, but
  Reviews, feedback, “how valuable”, etc. seem to work

Outline
  Opening joke
  Some personal history
  My failed predictions
  Lessons learned?
  A glimpse into the future
  Closing joke

Promising trends
  Almost back to the 80s
  KA with semantic wikis
    E.g. ontoworld.org, Halo
  NLP and KR are coming back together
    Powerset, etc.
  Collaborative, large, KBs
    Dbpedia, freebase
    Imdb, wordnet
    Cyc
  Scalable reasoning
    SHER
  Rules
    RIF BLD released (
    RDF compatibility (

Important Problems
  API incompatibility
  Connotation vs. Denotation
    URIs provide identity, but what do they mean?
    Coreference, disambiguation, word sense
  Experimental methodology, measurement
    E.g. precision & recall
    Dependencies of results
  The very long tail
  Wherefore reasoning?
  Ontology Quality, Evaluation

History of Hypertext
  1945: Vannevar Bush’s Memex
    Associative Indexing and links
  1965: Ted Nelson coins hypertext
    “Nonsequential writing”
  1967: Andries van Dam’s Hypertext Editing System (sponsored by IBM).
  1985: Janet Walker’s Symbolics Document Examiner
  1987: Bill Atkinson’s Hypercard on the Mac
  1991: Tim Berners-Lee proposes HTTP, HTML, & URL
    Genesis c
  1993: Marc Andreessen releases Mosaic for Mac, Unix, Windows…

Hypertext Research
  Dating back at least to the late 60s
  Many foci
    Technology (mouse, software, protocols)
    User interaction
    Aesthetic
    Post-modern
    Engineering
  Largely ignored by web developers
    Especially in the early days of the web (93-96)

Grassroots to the Web
  Early web dominated by “what it looks like” in Mosaic
  Unimpressed UI and Hypertext researchers
  Focus on spreading the word, not doing it right
  Many early web pages didn’t have links in text at all
    “Catalog” pages with lists of links
    “Text” pages with few or no links
  Embedded images more interesting than links
  Just do it rather than do it right
  But…
    When the web became serious, the research started to matter

Ontology Research
  Dating back…
  Multiple foci
    Technology (logics, reasoners…)
    Meta-physics (what there is)
    Knowledge Acquisition
    NLP
    Engineering
  Largely ignored by SW developers
    Web 2.0, groundswell
  Specifically criticized by some SW pundits

A little semantics…
  The SW catchphrase
    “A little semantics goes a long way”
  Sometimes strengthened
    A lot of semantics is too much
    80/20 rule
  Double-edged sword
    FOAF doesn’t look like even 1%
    The simplicity of FOAF hides any serious value proposition for SW
    SW not for people, for data
  Important to get it right?

Some evidence
  Does quality matter?
  Good quality ontologies cost more
  Required for some applications
  Improvements in quality can improve performance [Welty, et al, 2004]
    18% f-improvement in search
    Cleanup cost ~1mw/3000 classes
    BUT … low quality ontology still improved base
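Assuming the "f-improvement" on this slide refers to the F-measure that combines precision and recall, the comparison can be sketched as follows. The document sets below are invented for illustration and are not data from the cited study; `precision_recall_f1` is an illustrative helper name.

```python
from __future__ import annotations


def precision_recall_f1(retrieved: set, relevant: set) -> tuple[float, float, float]:
    """Return (precision, recall, F1) for one query's retrieved and relevant sets."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of P and R
    return precision, recall, f1


# Hypothetical results for a single query, before and after an ontology cleanup.
relevant = {"d1", "d2", "d3", "d4"}
baseline = {"d1", "d2", "d7", "d8"}
improved = {"d1", "d2", "d3", "d7"}

print(precision_recall_f1(baseline, relevant))
print(precision_recall_f1(improved, relevant))
```

Reporting a single F value makes "better" comparable across systems, but it hides whether the gain came from precision, recall, or both, which is why the talk keeps asking for both to be measured.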

Wherefore Reasoning?
  Very hard to “sell” OWL reasoning
  Many users want very simple reasoning
    Simple subclass
    Simple range/domain constraints
    Simple rules
  Some users want more than OWL
    But just to express their semantics
  Improving precision?
  Improving recall?
  Must be measured.
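The "simple subclass" reasoning mentioned on this slide amounts to following subclass links transitively. A minimal sketch, assuming a toy hierarchy stored as a plain dictionary; the class names and the helpers `superclasses` and `is_a` are hypothetical and do not correspond to any particular OWL toolkit.

```python
from __future__ import annotations

# Hypothetical toy ontology: each class maps to its direct superclasses.
SUBCLASS_OF = {
    "Axe": {"Weapon"},
    "Weapon": {"Artifact"},
    "Artifact": {"Thing"},
}


def superclasses(cls: str) -> set[str]:
    """Collect every direct and inherited superclass of cls."""
    seen: set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in SUBCLASS_OF.get(current, ()):
            if parent not in seen:  # also guards against cycles in the data
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a(cls: str, ancestor: str) -> bool:
    """True if cls equals ancestor or inherits from it."""
    return cls == ancestor or ancestor in superclasses(cls)


print(sorted(superclasses("Axe")))
print(is_a("Axe", "Thing"))
```

This kind of inference is cheap compared with full OWL reasoning, which may be part of why many users ask for nothing more.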

The very long tail
  (Figure: frequency plot with the labels “Ontologies, explicit semantics” and “Something else?”)

Question Answering
  Q: What weapon was featured in the ballet “Fall River Legend?”
  A: American Ballet Theatre
  OK, add “weapon” to ontology…

Question Answering
  Q: What gum’s motto was “Double your pleasure, double your fun”?
  A: personal lubricant

Humans vs. Machines
  Humans:
    Vision
    Speech
    Natural Language
    Context awareness
    Tacit knowledge
    Learning
    Socialization
  Machines:
    Organization
    Perfect memory
    Calculation speed
    Planning & scheduling
    Games & simulation
    Search
    Networks

Outline
  Opening joke
  Some personal history
  My failed predictions
  Lessons learned?
  A glimpse into the future
  Closing joke

Question Answering
  Q: What president gave the longest inaugural speech?
  A: Dieter Fensel
  “Improvements” need to be measured
  P ∝ 1/R