Automatic Acquisition of Paradigmatic Relations using Iterated Co-occurrences Chris Biemann, Stefan Bordag, Uwe Quasthoff University of Leipzig, NLP Department.

Presentation transcript:

Automatic Acquisition of Paradigmatic Relations using Iterated Co-occurrences
Chris Biemann, Stefan Bordag, Uwe Quasthoff
University of Leipzig, NLP Department
LREC 2004, Learning & Acquisition (II), 27th of May 2004

Slide 2: Sets of Words
Our goal is the automatic extension of homogeneous word sets, e.g. WordNet synsets or small subtrees of some hierarchy.
We collect methods and apply them, possibly in combination.
Thought experiment: the computer as associator:
- Input: some example concepts
- Detection of the relation
- Output of additional instances
This can be done in a semi-supervised way.
Necessary:
- very large text corpus
- features
- methods

Slide 3: Statistical Co-occurrences
- Co-occurrence: occurrence of two or more words within a well-defined unit of information (sentence, nearest neighbors)
- Significant co-occurrences reflect relations between words
Significance measure (log-likelihood), with:
- k: the number of sentences containing a and b together
- ab: (number of sentences with a) * (number of sentences with b)
- n: the total number of sentences in the corpus
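The formula itself is not preserved in this transcript. As a hedged sketch only: one measure that uses exactly the quantities listed on the slide (k, ab, n) is the negative natural logarithm of the Poisson probability of seeing k joint sentences when ab/n are expected. Treating this as the slide's measure is an assumption, and the function and parameter names below are illustrative.

```python
import math


def poisson_significance(k: int, count_a: int, count_b: int, n: int) -> float:
    """Negative natural log of the Poisson probability of k joint occurrences.

    Assumption (not confirmed by the transcript): the expected number of
    joint sentences is lam = (count_a * count_b) / n, i.e. the slide's
    'ab' divided by n.
    """
    if k < 0 or count_a <= 0 or count_b <= 0 or n <= 0:
        raise ValueError("k must be >= 0 and all other counts must be positive")
    lam = count_a * count_b / n
    # -ln( exp(-lam) * lam**k / k! ) = lam - k*ln(lam) + ln(k!)
    return lam - k * math.log(lam) + math.lgamma(k + 1)


# Hypothetical counts: two words with 100 sentences each in a corpus of
# 100,000 sentences; more joint sentences give a higher significance.
weak = poisson_significance(1, 100, 100, 100_000)
strong = poisson_significance(10, 100, 100, 100_000)
```

Larger values mean the observed joint frequency is less likely under independence, so a threshold on this value can serve as the cut-off for "significant".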

Slide 4: Iterating Co-occurrences
- (sentence-based) co-occurrences of first order: words that co-occur significantly often together in sentences
- co-occurrences of second order: words that co-occur significantly often in collocation sets of first order
- co-occurrences of n-th order: words that co-occur significantly often in collocation sets of (n-1)th order
When calculating a higher order, the significance values of the preceding order are not relevant. A co-occurrence set consists of the N highest-ranked co-occurrences of a word.

Slide 5: Constructed Example I

Ord 1    dog      terrier  cat      mouse    barking  bite     yelp
dog               -        -        -        x        x        x
terrier  -                 -        -        x        x        x
cat      -        -                 x        -        x        -
mouse    -        -        x                 -        x        -
barking  x        x        -        -                 -        -
bite     x        x        x        x        -                 -
yelp     x        x        -        -        -        -

Ord 2    dog      terrier  cat      mouse    barking  bite     yelp
dog
terrier
cat
mouse
barking
bite
yelp     -        -        -        -        2        2

Slide 6: Constructed Example II

Ord 3    dog      terrier  cat      mouse    barking  bite     yelp
dog
terrier  -----
cat
mouse
barking
bite
yelp

Ord 2    dog      terrier  cat      mouse    barking  bite     yelp
dog               x        -        -        -        -        -
terrier  x                 -        -        -        -        -
cat
mouse
barking  -        -        -        -                 x        x
bite     -        -        -        -        x                 x
yelp     -        -        -        -        x        x
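The step from first to second order in the constructed example can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the Ord 1 marks are read as a symmetric relation, significance is replaced by a plain count of shared first-order co-occurrences, and the threshold of 2 (`min_shared`, a hypothetical parameter) is chosen because it is consistent with the cells that survive in the transcript.

```python
from itertools import combinations

# First-order co-occurrence sets, read off the Ord 1 table of the
# constructed example (interpreted as a symmetric relation).
FIRST_ORDER = {
    "dog": {"barking", "bite", "yelp"},
    "terrier": {"barking", "bite", "yelp"},
    "cat": {"mouse", "bite"},
    "mouse": {"cat", "bite"},
    "barking": {"dog", "terrier"},
    "bite": {"dog", "terrier", "cat", "mouse"},
    "yelp": {"dog", "terrier"},
}


def shared_count(sets, a, b):
    """Number of words whose co-occurrence set contains both a and b.

    For a symmetric relation this equals the size of the intersection
    of the co-occurrence sets of a and b.
    """
    return len(sets[a] & sets[b])


def next_order(sets, min_shared=2):
    """Derive the co-occurrence sets of the next order.

    Assumption: two words count as co-occurring in the next order when
    they share at least `min_shared` co-occurrences in the current one.
    """
    result = {word: set() for word in sets}
    for a, b in combinations(sets, 2):
        if shared_count(sets, a, b) >= min_shared:
            result[a].add(b)
            result[b].add(a)
    return result


second_order = next_order(FIRST_ORDER)
```

With these assumptions, `second_order` links dog with terrier and connects barking, bite and yelp to one another, while cat and mouse end up with empty sets.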

Slide 7: Properties of Iterated Co-occurrences
- after some iterations the sets remain more or less stable
- the sets are somewhat semantically homogeneous
- sometimes they have nothing to do with the reference word
- calculations were performed up to the 10th order
Example: TOP 20 NB-collocations of 10th order for erklärte [explained]:
sagte, schwärmte, lobt, schimpfte, meinte, jubelte, lobte, resümierte, schwärmt, Reinhard Heß, ärgerte, kommentierte, urteilte, analysierte, bilanzierte, freute, freute sich, Bundestrainer, freut, gefreut
[said, enthused, praises, grumbled, opined, was jubilant, praised, summarized, enthuses, Reinhard Hess, annoyed, commented, judged, analyzed, took stock, pleased, was pleased, coach of the national team, is pleased, been pleased]

Slide 8: Mapping co-occurrences to graphs
- For all words having co-occurrences, form nodes in a graph.
- Connect them all by edges and initialize each edge weight with 0.
- For every co-occurrence of two words in a sentence, increase the edge weight by the significance.
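A minimal sketch of this mapping, assuming the co-occurrences are already available as (word, word, significance) triples. The names `build_graph` and `top_neighbors` and the sample values are hypothetical; a sparse dictionary stands in for the fully connected graph, with absent edges treated as weight 0.

```python
from collections import defaultdict


def build_graph(cooccurrences):
    """Accumulate undirected edge weights from (a, b, significance) triples.

    Edges that never receive weight are simply absent, which is
    equivalent to an edge initialized with weight 0.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for a, b, significance in cooccurrences:
        if a == b:
            continue  # ignore self-loops
        graph[a][b] += significance
        graph[b][a] += significance
    return graph


def top_neighbors(graph, word, n):
    """Return the n neighbors of `word` with the highest edge weight."""
    neighbors = graph.get(word, {})
    ranked = sorted(neighbors.items(), key=lambda item: (-item[1], item[0]))
    return [neighbor for neighbor, _ in ranked[:n]]


# Hypothetical significance values, for illustration only.
observations = [
    ("dog", "barking", 3.0),
    ("dog", "bite", 1.5),
    ("dog", "barking", 2.0),
    ("terrier", "barking", 4.0),
]
graph = build_graph(observations)
```

`top_neighbors` corresponds to taking the N highest-ranked co-occurrences of a word as its co-occurrence set.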

Slide 9: First Iteration Step
- The two black nodes A and B get connected in this step if there are many nodes C which are connected to both A and B.
- The more Cs, the higher the weight of the new edge.
Figure legend: new connection, existing connection

Slide 10: Second Iteration Step
- The two black nodes A and B get connected in this step if there are many (dark grey) nodes Ds which are connected to both A and B.
- The connections between the nodes Ds and the nodes A and B were constructed because of the (light grey) nodes Es and Fs, respectively.
Figure legend: new connection, former connection, existing connection; node labels A, B, Ds, Es, Fs

Slide 11: Collapsing bridging nodes
- The upper bound for the path length in iteration n is 2^n, since each step can join two paths of the previous step.
- However, some of the bridging nodes collapse, giving rise to self-sustaining clusters of arbitrary path length, which are invariant under iteration.
Figure: the upper 5 nodes form an invariant cluster; A and B are being absorbed by this cluster.

Slide 12: Examples of Iterated Co-occurrences

Order  Reference word  TOP-10 collocations
N2     wine            wines, champagne, beer, water, tea, coffee, Wine, alcoholic, beers, cider
S10    wine            wines, grape, sauvignon, chardonnay, noir, pinot, cabernet, spicy, bottle, grapes
S1     ringing         phone, bells, phones, hook, bell, endorsement, distinctive, ears, alarm, telephone
S2     ringing         rung, Centrex, rang, phone, sounded, bell, ring, FaxxMaster, sound, tolled
S4     ringing         sounded, rung, rang, tolled, tolling, sound, tone, toll, ring, doorbell
S10    pressing        Ctrl, Shift, press, keypad, keys, key, keyboard, you, cursor, menu, PgDn, keyboards, numeric, Alt, Caps, CapsLock, NUMLOCK, NumLock, Scroll

Slide 13: Intersection of Co-occurrence Sets: resolving ambiguity
Reference words: Herz-Bube, Stich, Becker
[Stich is ambiguous: a trick in the card game Skat, or the tennis player Michael Stich; Herz-Bube = jack of hearts; Becker = the tennis player Boris Becker]
Word groups from the figure (the assignment of the groups to the regions of the diagram is not preserved):
- Achtelfinale, Aufschlag, Boris Becker, Daviscup, Doppel, DTB, Edberg, Finale, Graf, Haas, Halbfinale, Match, Pilic, Runde, Sampras, Satz, Tennis, Turnier, Viertelfinale, Weltrangliste, Wimbledon
- Alleinspieler, Herz, Herz-Dame, Herz-König, Hinterhand, Karo, Karo-As, Karo-Bube, Kreuz-As, Kreuz-Bube, Pik-As, Pik-Bube, Pik-König, Vorhand
- Becker, Courier, Einzel, Elmshorn, French Open, Herz-As, ins, Kafelnikow, Karbacher, Krajicek, Kreuz-As, Kreuz-Bube, Michael Stich, Mittelhand, Pik-As, Pik-Bube, Pik-König
- bedient, folgenden, gereizt, Karo-Buben, Karo-Dame, Karo-König, Karte, Karten, Kreuz-Ass, Kreuz-Dame, Kreuz-Hand, Kreuz-König, legt, Mittelhand, Null ouvert, Pik, Pik-Ass, Pik-Dame, schmiert, Skat, spielt, Spielverlauf, sticht, übernimmt, zieht
- Agassi, Australian Open, Bindewald, Boris, Break, Chang, Dickhaut, gewann, Ivanisevic, Kafelnikow, Kiefer, Komljenovic, Leimen, Matchball, Michael Stich, Monte Carlo, Prinosil, Sieg, Spiel, spielen, Steeb, Teamchef

Slide 14: Example: NB-collocations of 2nd order for warm, kühl, kalt
Intersecting the collocation sets for warm, kühl, kalt [warm, cool, cold] and filtering for adjectives results in:
abgekühlt, aufgeheizt, eingefroren, erhitzt, erwärmt, gebrannt, gelagert, heiß, heruntergekühlt, verbrannt, wärmer
[cooled down, heated, frozen, heated up, warmed up, burned, stored, hot, cooled down, burned, warmer]
The emotional reading abweisend [repellent] of kühl, kalt is eliminated.

Slide 15: Detection of X-onyms
X-onyms: synonyms, antonyms, (co-)hyponyms...
Idea:
- The intersection of the co-occurrence sets of two X-onyms as reference words should contain X-onyms.
- Lexical ambiguity of one reference word does not degrade the result set.
Method:
- detect the word class of the reference words
- calculate co-occurrences for the reference words
- filter the co-occurrences w.r.t. the word class of the reference words (by means of POS tags)
- intersect the co-occurrence sets
- output the result
Ranking can be realized over the significance values of the co-occurrences.
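The method steps can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: co-occurrence sets and POS tags are passed in as plain dictionaries, all significance values are invented, and ranking by the smaller of the two significance values is one possible choice among several.

```python
def xonym_candidates(word_a, word_b, cooc, pos_tags, top_n=10):
    """Intersect the POS-filtered co-occurrence sets of two reference words.

    cooc:     dict mapping word -> {co-occurring word: significance}
    pos_tags: dict mapping word -> word class (e.g. "ADJ", "NOUN")
    Ranking by the minimum of the two significance values is an assumption.
    """
    word_class = pos_tags[word_a]
    if pos_tags[word_b] != word_class:
        raise ValueError("reference words must share a word class")

    def filtered(word):
        # Keep only co-occurrences of the same word class as the reference words.
        return {w: s for w, s in cooc[word].items() if pos_tags.get(w) == word_class}

    set_a, set_b = filtered(word_a), filtered(word_b)
    shared = (set(set_a) & set(set_b)) - {word_a, word_b}
    ranked = sorted(shared, key=lambda w: (-min(set_a[w], set_b[w]), w))
    return ranked[:top_n]


# Toy data with invented significance values.
cooc = {
    "warm": {"heiß": 9.0, "wärmer": 7.0, "Wetter": 8.0, "kalt": 12.0},
    "kalt": {"heiß": 6.0, "wärmer": 8.0, "Wetter": 5.0, "abweisend": 4.0, "warm": 12.0},
}
pos_tags = {
    "warm": "ADJ", "kalt": "ADJ", "heiß": "ADJ", "wärmer": "ADJ",
    "abweisend": "ADJ", "Wetter": "NOUN",
}
candidates = xonym_candidates("warm", "kalt", cooc, pos_tags)
```

In the toy data, Wetter is removed by the POS filter and abweisend drops out because it co-occurs with only one of the two reference words, which mirrors how the intersection suppresses readings that belong to a single reference word.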

Slide 16: Mini-Evaluation
- Experiments for different data sources, NB-collocations of 2nd and 3rd order
- The fraction of X-onyms in the TOP 5 is higher than in the TOP 10 => the ranking method makes sense
- The intersection of 2nd-order and 3rd-order collocations is almost always empty => different orders exhibit different relations
- Quantity: satisfactory, more through larger corpora
- Quality: not precise enough for unsupervised extension

Slide 17: Word Sets for Thesaurus Expansion
Application: thesaurus expansion

start set: [warm, kalt] [warm, cold]
result set: [heiß, wärmer, kälter, erwärmt, gut, heißer, hoch, höher, niedriger, schlecht, frei] [hot, warmer, colder, warmed, good, hotter, high, higher, lower, bad, free]

start set: [gelb, rot] [yellow, red]
result set: [blau, grün, schwarz, grau, bunt, leuchtend, rötlich, braun, dunkel, rotbraun, weiß] [blue, green, black, grey, colorful, bright, reddish, brown, dark, red-brown, white]

start set: [Mörder, Killer] [murderer, killer]
result set: [Täter, Straftäter, Verbrecher, Kriegsverbrecher, Räuber, Terroristen, Mann, Mitglieder, Männer, Attentäter] [offender, delinquent, criminal, war criminal, robber, terrorists, man, members, men, assassin]

Slide 18: More Examples in English
Intersection of N2-order collocation sets

Slide 19: Questions?
Thank you!