How AI Programs Collect Concepts Ernie Davis Concept and Categories Seminar Sept 25, 2015.


1 How AI Programs Collect Concepts Ernie Davis Concept and Categories Seminar Sept 25, 2015

2 Outline Part I: Case studies. Systems: Probase, NELL, ConceptNet, WordNet, WikiNet, CYC. Issues: What is a concept? How are concepts collected? How are taxonomic and other relations found? Polysemy and synonymy. System features: size, evaluation, uses. Part II: Relation to cog. psych. concepts.

3 What do AI people want? Cognitive plausibility? Not much. Happy to note similarities if they come up. Philosophical coherence? No! You can quote Plato or Quine if you want to pretend to be erudite. Otherwise, reading those guys will just leave you hopelessly confused. Human or superhuman level AI in 20-50 years? Less than you would suppose. Mostly entertainment for journalists.

4 What do AI people want? To build, today, a system that does something.

5 Probase Wu et al. 2012 (Microsoft Research). 2.6 million concepts (categories). 20.7 million isA relations. 92.8% accuracy. Concept: An English noun or a two- or three-word noun phrase. E.g. “country”, “city”, “renewable energy technology”, “common sleep disorder”, “meteorological phenomenon”.

6 Probase Taxonomy: From Hearst patterns (Hearst 1992). “Countries such as England, France and Germany”, “Bears, lions, and other animals”. Pitfalls: “animals other than dogs such as cats”, “companies such as IBM, Nokia, Procter and Gamble”, “Europe, Brazil, China, and other countries”, “dead animals such as dogs and cats”.
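The two Hearst patterns quoted on this slide can be sketched with regular expressions. This is a minimal illustration, not Probase's actual extractor: the names `SUCH_AS`, `AND_OTHER`, `split_items` and `hearst_pairs` are hypothetical, and the patterns are deliberately naive so that the pitfalls listed above show up.

```python
import re

# Hypothetical, deliberately naive patterns (assumption: no parsing, no NP chunking).
SUCH_AS = re.compile(r"(\w+) such as ([\w ,]+)")
AND_OTHER = re.compile(r"([\w ,]+?),? and other (\w+)")
SEPARATORS = re.compile(r",\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")


def split_items(text):
    """Split a coordinated list such as 'England, France and Germany'."""
    return [item.strip().lower() for item in SEPARATORS.split(text) if item.strip()]


def hearst_pairs(sentence):
    """Return (hyponym, hypernym) candidates found by the two patterns."""
    pairs = []
    match = SUCH_AS.search(sentence)
    if match:
        hypernym = match.group(1).lower()
        pairs += [(hyponym, hypernym) for hyponym in split_items(match.group(2))]
    match = AND_OTHER.search(sentence)
    if match:
        hypernym = match.group(2).lower()
        pairs += [(hyponym, hypernym) for hyponym in split_items(match.group(1))]
    return pairs


print(hearst_pairs("Countries such as England, France and Germany"))
print(hearst_pairs("Bears, lions, and other animals"))
# Pitfall: the nearest noun before "such as" is taken as the hypernym.
print(hearst_pairs("animals other than dogs such as cats"))
```

On the pitfall sentence the sketch returns the wrong pair ("cats", "dogs"), and on “companies such as IBM, Nokia, Procter and Gamble” it splits one company name into two items. Errors of this kind are why the raw extractions need further filtering.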

7 Probase Use statistical information to rule out false concepts and taxonomic relations. E.g. there are many instances of “dog” and “cat” and only a few texts that suggest that “cat” is a subcategory of “dog”.
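A rough sketch of how counting can rule out a spurious relation such as “cat is a dog”. The scoring rule here (a minimum count plus a minimum share of the hyponym's extractions) is an assumption made for illustration; the slide does not specify Probase's actual statistical model, and `filter_pairs`, `min_count` and `min_share` are hypothetical names.

```python
from collections import Counter


def filter_pairs(pairs, min_count=2, min_share=0.1):
    """Keep (hyponym, hypernym) pairs that are supported by enough extractions.

    pairs: list of (hyponym, hypernym) tuples, one per extraction.
    A pair is kept if it was seen at least min_count times and accounts for
    at least min_share of all extractions involving that hyponym.
    """
    pair_counts = Counter(pairs)
    hyponym_totals = Counter(hyponym for hyponym, _ in pairs)
    kept = {}
    for (hyponym, hypernym), count in pair_counts.items():
        share = count / hyponym_totals[hyponym]
        if count >= min_count and share >= min_share:
            kept[(hyponym, hypernym)] = count
    return kept


# Made-up counts: many texts say a cat is an animal or a pet, one suggests a dog.
extractions = [("cat", "animal")] * 40 + [("cat", "pet")] * 25 + [("cat", "dog")]
print(filter_pairs(extractions))
```

With these invented counts the single “cat is a dog” extraction falls below both thresholds and is dropped, while the two well-attested relations survive.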

8 Probase: Polysemy Example: “plants such as refineries and nuclear reactors” vs. “plants such as trees, grass, and cacti”. Rule 1: Within a single text the hypernym has only one sense; you don’t have “plants such as refineries and trees”. Rule 2: If two lists have substantial overlap, then the hypernyms have the same sense. E.g. “plants such as trees, grass, and cacti” and “plants such as grass, bushes, and trees.” Use these and similar rules to group word uses into meanings.
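Rule 2 can be sketched as a greedy grouping of hyponym lists by overlap. This is only an illustration under assumptions: the function name `group_senses`, the `min_overlap` threshold, and the single-pass greedy strategy are all hypothetical, and the result depends on the order of the lists.

```python
def group_senses(hyponym_lists, min_overlap=2):
    """Group the hyponym lists seen with one hypernym word into senses.

    Each list comes from one text (Rule 1: one sense per text). A list joins
    the first existing sense it shares at least min_overlap items with
    (Rule 2); otherwise it starts a new sense.
    """
    senses = []  # each sense is a set of hyponyms
    for items in hyponym_lists:
        items = set(items)
        for sense in senses:
            if len(sense & items) >= min_overlap:
                sense |= items
                break
        else:
            senses.append(items)
    return senses


plant_lists = [
    ["refineries", "nuclear reactors"],
    ["trees", "grass", "cacti"],
    ["grass", "bushes", "trees"],
]
for sense in group_senses(plant_lists):
    print(sorted(sense))
```

On the slide's example this yields two senses of “plants”: one containing refineries and nuclear reactors, the other containing trees, grass, cacti and bushes.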

9 NELL (Never-Ending Language Learner) Mitchell et al. 2015 (Carnegie Mellon). Using web mining, constructs a taxonomy of concepts, and collects facts corresponding to a fixed set of relations. 80 million beliefs (facts).

10 NELL: Concepts A concept is a noun (I think). Instances are proper nouns. No attempt to address polysemy. Features used for categorization, learned in a snowballing way. Name form: E.g. “…burgh” is a city. Context: E.g. “mayor of X” → X is a city. Lists and tables: If you see a list “London, Paris, Prague” and you know that London and Paris are cities, infer that Prague is a city.
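The list heuristic and the snowballing idea can be sketched as follows. This is a toy illustration, not NELL's implementation: `infer_from_list`, `bootstrap` and the `min_support` threshold are hypothetical, and the other features (name form, context) are left out.

```python
from collections import Counter


def infer_from_list(items, known, min_support=2):
    """If enough list items share a known category, extend it to the rest."""
    votes = Counter(known[item] for item in items if item in known)
    if not votes:
        return {}
    category, support = votes.most_common(1)[0]
    if support < min_support:
        return {}
    return {item: category for item in items if item not in known}


def bootstrap(lists, seeds, min_support=2):
    """Repeat the list heuristic until no new instances are labeled."""
    known = dict(seeds)
    changed = True
    while changed:
        changed = False
        for items in lists:
            new = infer_from_list(items, known, min_support)
            if new:
                known.update(new)
                changed = True
    return known


seeds = {"London": "city", "Paris": "city"}
lists = [["Paris", "Prague", "Vienna"], ["London", "Paris", "Prague"]]
print(bootstrap(lists, seeds))
```

In this run Prague is labeled from the second list on the first pass, and that new label then supplies the second vote needed to label Vienna from the first list on the next pass, which is the snowballing effect in miniature.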

11 NELL: Taxonomy Seems to be largely accurate though lopsided (e.g. 9047 instances of “amphibian” vs. 0 instances of “poem”).

12 NELL: Facts Hit and miss. Mostly unexciting. I did an experiment of collecting 110 facts from NELL. 64 were true. 16 were nearly true, e.g. “Dublin Dublin is the capital city of the country Ireland”. 14 were false, e.g. “dipodidae is an arthropod” (actually a rodent). 5 were hopelessly vague, e.g. “David is a person who died at age 10.” 11 were meaningless, e.g. “states is a state or province located in the geopolitical location field”.

13 ConceptNet Starts with the collection of facts amassed by Open Mind Common Sense, a crowd-sourcing platform. “Concepts … could be noun phrases, verb phrases, adjective phrases, or clauses. ConceptNet defines concepts as the equivalence class of phrases after normalization removing function words, pronouns, and inflections.”
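The quoted definition (a concept as an equivalence class of phrases after normalization) can be sketched as below. The word lists and the suffix stripping are crude stand-ins chosen for illustration; they are assumptions, not ConceptNet's actual normalizer, and names such as `normalize` and `equivalence_classes` are hypothetical.

```python
import re
from collections import defaultdict

# Tiny illustrative word lists (assumption: a real system uses fuller resources).
FUNCTION_WORDS = {"a", "an", "the", "of", "to", "in", "on", "for", "is", "are", "be"}
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they",
            "my", "your", "his", "her", "its", "our", "their"}


def strip_inflection(word):
    """Very crude suffix stripping; a stand-in for real lemmatization."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(phrase):
    """Drop function words and pronouns, then strip inflections."""
    tokens = re.findall(r"[a-z]+", phrase.lower())
    content = [t for t in tokens if t not in FUNCTION_WORDS and t not in PRONOUNS]
    return tuple(strip_inflection(t) for t in content)


def equivalence_classes(phrases):
    """Group phrases whose normalized forms are identical."""
    classes = defaultdict(list)
    for phrase in phrases:
        classes[normalize(phrase)].append(phrase)
    return dict(classes)


print(equivalence_classes(["walking the dogs", "walks his dog", "to walk a dog"]))
```

All three phrases normalize to the same key, so under this sketch they count as one concept. Naive suffix stripping will mishandle many real words, which is one reason it should be read only as a placeholder.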

14 Relations and Patterns
IsA: NP is a kind of NP
UsedFor: NP is used for VP
CapableOf: NP can VP
Desires: NP wants to VP
CreatedBy: You make NP by VP
PartOf: NP is part of NP
HasProperty: NP is AP
Causes: The effect of NP|VP is NP|VP

15 ConceptNet

16 WordNet Hand-constructed lexicon of English (Miller 1995). Word senses disambiguated. Word senses related by synonymy, antonymy, hypernymy, hyponymy, meronymy (part/whole). A concept is a synset (a collection of synonymous word senses). 117,000 synsets. WordNets exist (to some extent) for almost 50 languages.

17 WikiNet (Nastase et al. 2010) A concept “roughly corresponds to a Wikipedia article”. Defer to the wisdom of crowds (as mediated by the abstruse, bureaucratic, contentious process that is Wikipedia editing). Relations are extracted from text and info-boxes. Multi-lingual Wikipedia. Entities and categories are aligned across languages. 90+% accuracy.

18 Sample categories: Members of Queen (band); Movies directed by Woody Allen; Villages in Brandenburg; Mixed Martial Arts Television Programs. 490,215 categories. 3.3M concepts. 36M relations: instance 10M, subcategory 400K, spatial 4M, nationality 570K, topic 340K, genre 330K.

19 CYC Long-term project (1985-present) to encode commonsense knowledge. Knowledge hand-coded by knowledge engineers. Concepts chosen to optimize the knowledge encoding; only secondarily related to NL words (at least in principle). Very precise conceptual distinctions.

20 CYC Concepts “Just about anything can be reified: a particular proposition, a type of predicate, a problem solving context, an inference mechanism, etc.” Some concepts (from the 1990 book): PaperboyDeliveringNewspapersAsABusinessEvent, PaperboyDeliveringNewspapersAsATravelEvent, EgyptIn1986, PhysicalEgyptIn1986, PoliticalEgyptIn1986, YoungerThanParentsConstraint.

21 CYC Partly open, partly proprietary. Poorly described. Open CYC (public): 200K concepts, 2M facts. Research CYC (can be licensed): 500K concepts, 5M facts.

22 Relation to Cog. Psych. Where are: Concept learning? Definitional vs. prototype vs. exemplar theories? Different kind of AI: classification learning (supervised vs. unsupervised). Synthetic categories? Only of theoretical interest. Simple algorithms have superhuman abilities. Base level vs. sub/superordinate: ??

23 Hard Questions How many concepts? AI: 100,000s to millions. Cog psych: ?? How to evaluate recall (coverage)? What is a concept? Relation of concepts to Mentalese primitives. English vs. other languages: Does it make any difference?

