1 How I was right, even when I was wrong
  Chris Welty, IBM Research

2 Outline
  Opening Joke
  Some personal history
  My failed predictions
  Lessons learned?
  A glimpse into the future
  Closing joke

3 History
  Email 1981
    Remembering all these paths
  CSNet 1983
    Database too big
  Chat (write/talk)
    What a waste of time
  Collaborative Authoring
  NYSERNet, PSI
  InterNet DNS
    Overloaded .edu
  HTML/HTTP
    Too much porn
  Semantic Web
    Just tags on old stuff
    Importance of quality ontologies
    Like hypertext
  Social Tagging
    Put together collab author, web 2.0, quality ontologies,

4 The birth of a know-it-all
  Born in NY early 60s
  Early expert on everything
    Disappointment with Buck Rogers
  2nd grade
    Summed up numbers from 1-100 in five minutes
    Got 5100
    The answer is 5050
  First prediction (1975)
    I will marry Farrah Fawcett-Majors

5 How I did it
  1 2 3 4 5 … 50
  100 99 98 97 96 95 … 50
  OOPS
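
One possible reading of the two rows on this slide, offered as an assumption rather than something the transcript states: 100 was set aside, then 1 was paired with 99, 2 with 98, and so on down to 50 with 50. Every pair sums to 100, but 50 gets counted twice, which would explain 5100 instead of 5050. A minimal Python sketch of that reading (function names are illustrative):

```python
def faulty_pairing_sum(n: int = 100) -> int:
    """Sum 1..n the way the slide appears to: set n aside, pair k with n - k.

    Assumed reconstruction of the mistake; n is expected to be even.
    """
    half = n // 2
    # Pairs (1, n-1), (2, n-2), ..., (half, n-half).
    # For n = 100 the last pair is (50, 50), so 50 is counted twice.
    pairs = [(k, n - k) for k in range(1, half + 1)]
    return n + sum(a + b for a, b in pairs)


def correct_pairing_sum(n: int = 100) -> int:
    """Pair k with n + 1 - k so every number appears exactly once (n even)."""
    pairs = [(k, n + 1 - k) for k in range(1, n // 2 + 1)]
    return sum(a + b for a, b in pairs)


if __name__ == "__main__":
    print(faulty_pairing_sum())   # the slide's wrong total
    print(correct_pairing_sum())  # the intended total
```

Under this reading the two totals differ by exactly the double-counted 50.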

6 Insult no group unless it includes me

7 Outline
  Opening Joke
  Some personal history
  My failed predictions
  Lessons learned?
  A glimpse into the future
  Closing joke

8 First exposure to email
  1981 uucp mail
    allegro!batcave!cornell!rpics!weltyc
  Prediction:
    No one will ever use email
  Why I was right:
    Usenet paths were ridiculous
  What I missed:
    Paths and email were not tightly bound
    People really wanted email
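
To make the "paths" complaint concrete: a bang-path address spells out every relay host the message must pass through, so the sender has to know the route. The sketch below splits the address from this slide into its hops and contrasts it with a plain user@host form. The function names are illustrative and this is not a full mail-address parser.

```python
def parse_bang_path(address: str) -> tuple[list[str], str]:
    """Split a UUCP-style bang path into relay hosts and the final user name."""
    *hops, user = address.split("!")
    if not hops or not user or any(not hop for hop in hops):
        raise ValueError(f"not a bang path: {address!r}")
    return hops, user


def parse_user_at_host(address: str) -> tuple[str, str]:
    """Split a user@host address; the route is no longer part of the address."""
    user, sep, host = address.partition("@")
    if not sep or not user or not host:
        raise ValueError(f"not a user@host address: {address!r}")
    return user, host


if __name__ == "__main__":
    hops, user = parse_bang_path("allegro!batcave!cornell!rpics!weltyc")
    print("relay hosts:", hops)
    print("user:", user)
    print(parse_user_at_host("weltyc@rpics"))
```

The second form names only the destination, which fits the "What I missed" point that paths and email did not have to stay bound together.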

9 Next generation email
  1983 CSNet
    weltyc@rpics
  Prediction:
    As I said…
  Why I was right:
    Someone still has to maintain the list
    Won’t scale
  What I missed
    People really needed email
    It was better, not perfect

10 Domain Naming Service
  Proposed to IETF in 1985
  Distributed hierarchical database
  Distributed not only the data, but the maintenance
  might make email work
  Prediction:
    The .edu top-level will become overloaded
  Why I was right:
    The hierarchy was unbalanced
  What I missed:
    People were willing to invest in scale
    Money to be made in supplying domain names!
    It was better, not perfect
  [Diagram labels: .edu, .com, .org]
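
The idea of a distributed hierarchical database, where each level maintains only its own children, can be sketched with a toy delegation table. Everything in the table (zone names, the example.edu host, the address) is hypothetical, and real DNS resolution involves far more than this lookup loop.

```python
# Hypothetical delegation table: each zone knows only its direct children.
ZONES = {
    ".": {"edu": "edu-servers", "com": "com-servers", "org": "org-servers"},
    "edu-servers": {"example": "example-edu-servers"},
    "example-edu-servers": {"cs": "192.0.2.10"},  # documentation-range address
}


def resolve(name: str, zones: dict = ZONES) -> list[str]:
    """Walk the labels right to left and return the chain of delegations."""
    current = "."
    trail = []
    for label in reversed(name.rstrip(".").split(".")):
        children = zones.get(current)
        if children is None or label not in children:
            raise KeyError(f"no delegation for {label!r} under {current!r}")
        current = children[label]
        trail.append(current)
    return trail


if __name__ == "__main__":
    print(resolve("cs.example.edu"))
```

Each table entry could be maintained by a different party, which is the "distributed maintenance" point; the root only needs to know who answers for .edu, .com, and .org.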

11 HTTP/HTML
  Proposed to IETF in 1990
  Hypertext is decades old, this just adds tags
  Prediction:
    No big deal, unimportant
    Porn will make the InterNet unusable
  Why I was right:
    Porn really was king of the early web (~70%)
  What I missed:
    People were willing to invest in scale
    It was more than just tags for hypercard

12 Web 2.0 (i.e. Social Web)
  Started roughly 2002
  Web of people instead of machines
  Wikis, social tagging, social networks
  Prediction
    TEENAGE NONSENSE
    Will be poisoned by stupidity, negativity, misdirection, spam
    Will not scale
  Why I was right
    Most of it is teenage nonsense
    Most people really are idiots
  What I missed
    People want to share their knowledge
    People scale on the web
    Quality seems to be self-governing in certain areas

13 Blind Men and Elephants
  I was right about the trunk.
  Which one are you?

14 Semantic Web
  The idea has been around for about a decade
  You may have heard of it
  I got the pitch from TimBL…
  Prediction:
    KR is decades old, this just adds tags
    Will not scale (KA, Reasoning)
    Proliferation of bad ontologies will lead to bad systems
  Why I was right:
    Reasoning doesn’t really scale (exptime is incomplete)
    Bad ontologies do lead to bad systems
  What I missed
    It’s not just tags
    KA does scale – people want to share their knowledge
    A lot of people don’t care about reasoning
    Better not perfect
    KA not needed – the actual vision

15 The Semantic Web Vision
  ~80% of web pages are generated from back end databases
  Publish the semantics (schema?) as well as the data
  URIs provide a web-based form of identity
  It’s the semantic WEB, not the SEMANTIC web
  NOT: humans will mark up their web pages
  NOT: NLP will populate the SW from web pages

16 Outline
  Opening Joke
  Some personal history
  My failed predictions
  Lessons learned?
  A glimpse into the future
  Closing joke

17 Lessons learned
  People who make bad predictions still get to be invited speakers!
  The unimpressed scientist syndrome
  Applications that are needed will just happen
  Better not perfect
  People really want to share their knowledge
  Scalability of people on the web
  Scale happens

18 The Unimpressed Scientist
  Be more open minded
  Tend to “accept” rather than “reject”
  Don’t confuse the trunk for the elephant
  The evaluation criterion is not whether it will work, but whether it is needed

19 Better not Perfect
  Improvements are important
  So ask yourself, “Is this better?”
  Nit-picking usually is not important
  The boundary conditions matter, but aren’t everything
  Measurement, experimental conditions, become critical
  What is “better”?
  NLP perhaps takes this too far

20 Scalability
  Faster, bigger computers
  Better distribution
  People on the web
  The Captchas story
  Heuristics, statistics

21 People want to share their knowledge
  Shouldn’t be a surprise, this is what motivates us
  Still, most people are idiots
  …so…
  Pure openness doesn’t work, but
  Reviews, feedback, “how valuable”, etc. seem to work

22 Outline
  Opening Joke
  Some personal history
  My failed predictions
  Lessons learned?
  A glimpse into the future
  Closing joke

23 Promising trends
  Almost back to the 80s
  KA with semantic wikis
    E.g. ontoworld.org, Halo
  NLP and KR are coming back together
    Powerset, etc.
  Collaborative, large, KBs
    DBpedia, Freebase
    IMDb, WordNet
    Cyc
  Scalable reasoning
    SHER
  Rules
    RIF BLD released (http://www.w3.org/TR/rif-bld)
    RDF compatibility (http://www.w3.org/TR/rif-rdf-owl)

24 Important Problems
  API incompatibility
  Connotation vs. Denotation
    URIs provide identity, but what do they mean
    Coreference, disambiguation, word sense
  Experimental methodology, measurement
    E.g. precision & recall
    Dependencies of results
  The very long tail
  Wherefore reasoning?
  Ontology Quality, Evaluation

25 History of Hypertext
  1945: Vannevar Bush’s Memex
    Associative Indexing and links
  1965: Ted Nelson coins hypertext
    “Nonsequential writing”
  1967: Andries van Dam’s Hypertext Editing System (sponsored by IBM).
  1985: Janet Walker’s Symbolics Document Examiner
  1987: Bill Atkinson’s HyperCard on the Mac
  1991: Tim Berners-Lee proposes HTTP, HTML, & URL
    Genesis c. 1989
  1993: Marc Andreessen releases Mosaic for Mac, Unix, Windows…

26 Hypertext Research
  Dating back at least to the late 60s
  Many foci
    Technology (mouse, software, protocols)
    User interaction
    Aesthetic
    Post-modern
    Engineering
  Largely ignored by web developers
    Especially in the early days of the web (93-96)

27 Grassroots to the Web
  Early web dominated by “what it looks like” in Mosaic
  Unimpressed UI and Hypertext researchers
  Focus on spreading the word, not doing it right
  Many early web pages didn’t have links in text at all
    “Catalog” pages with lists of links
    “Text” pages with few or no links
    Embedded images more interesting than links
  Just do it rather than do it right
  But…
    When the web became serious, the research started to matter

28 Ontology Research
  Dating back…
  Multiple foci
    Technology (logics, reasoners…)
    Meta-physics (what there is)
    Knowledge Acquisition
    NLP
    Engineering
  Largely ignored by SW developers
    Web 2.0, groundswell
  Specifically criticized by some SW pundits

29 A little semantics…
  The SW catchphrase
    “A little semantics goes a long way”
  Sometimes strengthened
    A lot of semantics is too much
    80/20 rule
  Double-edged sword
    FOAF doesn’t look like even 1%
    The simplicity of FOAF hides any serious value proposition for SW
    SW not for people, for data
  Important to get it right?

30 Some evidence
  Does quality matter?
  Good quality ontologies cost more
  Required for some applications
  Improvements in quality can improve performance [Welty, et al, 2004]
    18% f-improvement in search
    Cleanup cost ~1mw/3000 classes
  BUT … low quality ontology still improved base
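
For readers unfamiliar with the measures behind an "f-improvement", here is a minimal sketch of precision, recall, and the balanced F-measure (F1) for a single search result set. The slide does not say which F variant was used, so F1 is an assumption, and the document IDs in the example are made up.

```python
def precision_recall_f1(retrieved: set, relevant: set) -> tuple[float, float, float]:
    """Return (precision, recall, F1) for one query's result set."""
    hits = len(retrieved & relevant)  # relevant items that were actually returned
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up IDs: four documents returned, three truly relevant, two in common.
    p, r, f = precision_recall_f1({1, 2, 3, 4}, {2, 3, 5})
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Because F1 combines both measures, a gain in one that is paid for by a loss in the other shows up as little or no improvement, which is why a single reported F figure is a useful summary.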

31 Wherefore Reasoning?
  Very hard to “sell” OWL reasoning
  Many users want very simple reasoning
    Simple subclass
    Simple range/domain constraints
    Simple rules
  Some users want more than OWL
    But just to express their semantics
  Improving precision?
  Improving recall?
  Must be measured.
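
As an illustration of how little machinery "simple subclass" and "simple range/domain" reasoning needs, here is a toy sketch. The class names, the property, and the single-parent hierarchy are all hypothetical simplifications; this is not OWL semantics and not any particular reasoner's API.

```python
# Hypothetical single-parent class hierarchy: child -> parent.
SUBCLASS_OF = {"Ballet": "Performance", "Performance": "Event", "Axe": "Weapon"}

# Hypothetical property constraints: property -> (domain class, range class).
PROPERTIES = {"featuresWeapon": ("Performance", "Weapon")}


def superclasses(cls: str, subclass_of: dict = SUBCLASS_OF) -> list[str]:
    """Follow subclass links upward, returning ancestors nearest first."""
    chain, seen = [], {cls}
    while cls in subclass_of:
        cls = subclass_of[cls]
        if cls in seen:  # guard against accidental cycles in the table
            break
        seen.add(cls)
        chain.append(cls)
    return chain


def is_a(cls: str, ancestor: str) -> bool:
    """True if cls is the ancestor itself or one of its subclasses."""
    return cls == ancestor or ancestor in superclasses(cls)


def satisfies(subject_cls: str, prop: str, object_cls: str) -> bool:
    """Check a statement's subject and object against domain and range."""
    domain, range_ = PROPERTIES[prop]
    return is_a(subject_cls, domain) and is_a(object_cls, range_)


if __name__ == "__main__":
    print(superclasses("Ballet"))
    print(satisfies("Ballet", "featuresWeapon", "Axe"))
```

Whether even this much inference improves precision or recall in an application is the slide's open question, and it would have to be measured.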

32 The very long tail
  [Chart labels: frequency; Ontologies, explicit semantics; Something else?]

33 Question Answering
  Q: What weapon was featured in the ballet “Fall River Legend?”
  A: American Ballet Theatre
  OK, add “weapon” to ontology…

34 Question Answering
  Q: What gum’s motto was “Double your pleasure, double your fun”?
  A: personal lubricant

35 Humans vs. Machines
  Humans:
    Vision
    Speech
    Natural Language
    Context awareness
    Tacit knowledge
    Learning
    Socialization
  Machines:
    Organization
    Perfect memory
    Calculation speed
    Planning & scheduling
    Games & simulation
    Search
    Networks

36 Outline
  Opening Joke
  Some personal history
  My failed predictions
  Lessons learned?
  A glimpse into the future
  Closing joke

37 Question Answering
  Q: What president gave the longest inaugural speech?
  A: Dieter Fensel
  “Improvements” need to be measured
  P ∝ 1/R
  [Figure labels: Leader; Talk, presentation]

