Overview of the merger prototype
Overview
- Background: the MUMIS project
- Cross-document annotation merging
- Alignment of parallel fragments
- Unification of aligned fragments
- Clean-up of unified fragments
- Reasoning
- Evaluation
- Future work & conclusions
The MUMIS project
Semantic access to a multimedia database.
- Subject: soccer
- Corpus: video recordings of matches, formal texts, 'ticker' texts
- Approach:
  - Extract knowledge from textual sources
  - Align this (time-based) knowledge with video
  - Do retrieval on annotation, returning corresponding video fragments to the user
- Main subject of this presentation: merging the annotations resulting from separate texts into one cross-document annotation
Merging
Intention of merging:
- start with various texts
- annotate each text individually
- combine annotations
Example match: Netherlands – Yugoslavia (European Championship 2000)
Two types of text in merger:
- Formal texts
- Ticker texts
Example formal text
Netherlands-Yugoslavia
Final score: 6-1
Referee: Garcia Aranda
Goals:
  24' Patrick Kluivert
  90' Marc Overmars
  91' Savo Milosevic
Substitutions:
  53' out: Nisa Saveljic, in: Jovan Stankovic
  58' out: Patrick Kluivert, in: Roy Makaay
Yellow cards: Paul Bosvelt
Example ticker text (BBC)
19 mins: Bergkamp scuffs his left-foot shot but still forces Kralj into a diving save low down to his left.
20 mins: Edgar Davids wastes the best chance of the game so far when he blazes over with just the goalkeeper to beat after being put through by Bergkamp.
24 mins: Kluivert puts Holland in front after latching onto a wonderful chip from Bergkamp and then planting a right-foot shot past Kralj from eight yards.
25 mins: Boudewijn Zenden comes close to doubling Holland's lead when he fires in low, right-foot shot which Kralj just about hangs onto.
Example of parallel fragments
BBC - 15: Van der Sar pulls of great save to block Mijatovic's shot after Savo Milosevic has cut through the Dutch defence like a knife.
Guardian - 17: Mijatovic, played in with a quick square ball from Milosevic, finds himself one-on-one with van der Sar 10 yards out. He picks his spot, but unfortunately for Mijatovic, it's the spot occupied by van der Sar. A great save and Yugoslavia should be one-nil up.
Kickers 15: Milosevic auf Mijatovic, doch der Stuermer vom AC Florenz scheitert aus 12 Metern freistehend an van der Sar. [German: Milosevic to Mijatovic, but the AC Fiorentina striker, unmarked, is denied by van der Sar from 12 metres.]
WEBTEC 15: Milosevic filtreert door de Nederlandse defensie door één beweging en legt af voor Mijatovic. Deze laatste trapt op van der Sar. [Dutch: Milosevic slips through the Dutch defence with a single move and lays the ball off for Mijatovic, who shoots at van der Sar.]
Merging process: overview
- 2-document alignment
- N-document alignment
- Unification of events from separate sources
- Special situations
Merging process: 2-document alignment Step 1 of the merging process: merge annotations of 2 texts
Merging process: 2-document alignment
[Figure: Source A, Source B]
Merging process: 2-document alignment The strongest binding is selected, ruling out certain other bindings.
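The selection step can be sketched as a greedy loop. This is a minimal sketch, not the prototype's actual code: the `Binding` type, the numeric `strength` score, and the two exclusion criteria (a fragment may be bound only once, and bindings may not cross, since scenes appear in the same order in both sources) are assumptions made for illustration. The slides only state that the strongest binding rules out "certain other bindings".

```python
from typing import NamedTuple


class Binding(NamedTuple):
    a: int            # index of a fragment in source A (hypothetical representation)
    b: int            # index of a fragment in source B
    strength: float   # assumed numeric score; higher means stronger


def select_bindings(candidates):
    """Greedily keep the strongest binding, then the next strongest that is
    still compatible with everything kept so far."""
    chosen = []
    for cand in sorted(candidates, key=lambda x: x.strength, reverse=True):
        conflict = any(
            cand.a == c.a or cand.b == c.b             # fragment already bound
            or (cand.a - c.a) * (cand.b - c.b) < 0     # would cross a kept binding
            for c in chosen
        )
        if not conflict:
            chosen.append(cand)
    return sorted(chosen, key=lambda x: x.a)


candidates = [
    Binding(0, 0, 0.9), Binding(0, 1, 0.5), Binding(1, 0, 0.6),
    Binding(1, 1, 0.8), Binding(2, 1, 0.7),
]
print(select_bindings(candidates))
```

In this toy input the bindings (0, 0) and (1, 1) are kept; the remaining three each reuse a fragment that is already bound.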
Merging process: N-document
Given the 2-document alignments for each pair of sources, find the N-document alignment in which all fragments that describe the same scene, across all separate sources, are aligned.
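One way to picture this step is as grouping fragments that are connected through pairwise bindings. The sketch below uses a union-find structure for that; it is an assumption about how such grouping could be done, not a description of the merger's algorithm. Fragments are represented as hypothetical `(source, minute)` pairs, and the second group in the example is invented for illustration.

```python
from collections import defaultdict


def n_document_alignment(pairwise):
    """pairwise: iterable of (fragment, fragment) bindings, where a fragment
    is a (source, minute) pair. Returns the groups of fragments connected
    through any chain of bindings."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for left, right in pairwise:
        parent[find(left)] = find(right)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(sorted(g) for g in groups.values())


pairwise = [
    (("BBC", 15), ("Guardian", 17)),
    (("Guardian", 17), ("Kickers", 15)),
    (("BBC", 19), ("Guardian", 20)),     # invented pair, for illustration only
]
for scene in n_document_alignment(pairwise):
    print(scene)
```

A plain transitive closure like this can place two fragments from the same source in one group when the pairwise alignments disagree; a real merger would need a policy for such incomplete or inconsistent graphs.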
Merging process: N-document
Merging process: Unification of scenes
Merging and reasoning: types of rules
- Within events or scenes: Player1 and Player2 will not be the same person, a player performing a save will not score a goal in the same scene, etc.
- Role of teams and events: offensive vs. defensive
- Combinations of events that probably have the same player: ShotOnGoal+Goal, Penalty+HitThePost
- Terminology of authors may vary: Cross vs. Pass, Save vs. Clearance
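Two of these rule types can be illustrated with a small sketch. The `(event_type, player)` representation, the choice of `Pass` and `Save` as the canonical terms, and the example scene are assumptions made here; the slides do not specify how rules are encoded in the merger.

```python
# Assumed canonical terms for the author-dependent terminology.
SYNONYMS = {"Cross": "Pass", "Clearance": "Save"}


def normalise(event_type):
    """Map a variant term onto its assumed canonical event type."""
    return SYNONYMS.get(event_type, event_type)


def suspicious_goals(scene):
    """scene: list of (event_type, player) pairs.
    Returns goals credited to a player who also makes a save in the same
    scene, which the within-scene rule treats as implausible."""
    events = [(normalise(t), p) for t, p in scene]
    savers = {p for t, p in events if t == "Save"}
    return [(t, p) for t, p in events if t == "Goal" and p in savers]


# Hypothetical scene containing an information-extraction mistake.
scene = [("Clearance", "Kralj"), ("Goal", "Kralj"), ("Goal", "Kluivert")]
print(suspicious_goals(scene))
```

Here the goal credited to Kralj is flagged, because after normalisation he also performs a save in the same scene.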
Merging and reasoning: example rules
Reasoning: mistakes in IE Sometimes the information extraction component makes mistakes. Example rules have been applied to solve some of these.
Reasoning: mistakes in IE
Fix: The goal attributed to Kralj (the Yugoslavian keeper) is removed.
Evaluation: What do we want to know?
- Quality of the merger in itself
- The advantages and disadvantages of merging
Evaluation: Quality of the merger
- Quality of alignments
- Quality of unification
- The effect of the quality of the original information extraction on both
Evaluation: Approach
- Create gold standard annotations for single sources
- Create gold standard merged annotation of all sources
- Run merger in different conditions
- Compare everything with everything
Evaluation: Results
Alignments based on machine IE
Columns: Version 1, Version 2, Version 3
Manual      210
Automatic
Overlap     82172
Precision
Recall
Evaluation: Results
Alignments based on manual IE

               Version 1
Manual         210
Automatic      188
Overlap        174
Precision (%)  92.6
Recall (%)     82.9
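The reported figures are consistent with the usual definitions, assuming precision is the overlap divided by the number of automatic alignments and recall is the overlap divided by the number of manual (gold standard) alignments. The function name below is illustrative.

```python
def precision_recall(manual, automatic, overlap):
    """Return (precision, recall) as percentages rounded to one decimal.
    Assumes precision = overlap / automatic and recall = overlap / manual."""
    precision = 100.0 * overlap / automatic
    recall = 100.0 * overlap / manual
    return round(precision, 1), round(recall, 1)


print(precision_recall(manual=210, automatic=188, overlap=174))  # (92.6, 82.9)
```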
Evaluation: Conclusions
- Alignment quality is good: with manual IE, precision is 92.6 and recall is 82.9.
- Better IE improves alignments.
- Low-quality IE does not degrade alignments too much.
MORE TO COME….
----- Extra Sheets -----
Extra example – 15th min.
Extra example – graph
Extra example – unification
Extra example – the source
BBC - 15: Van der Sar pulls of … Milosevic has cut …
Guardian - 17: Mijatovic, played in with a quick square ball from Milosevic, finds himself one-on-one with van der Sar 10 yards out. He picks his spot, but unfortunately for Mijatovic, it's the spot occupied by van der Sar. A great save and Yugoslavia should be one-nil up.
Kickers 15: Milosevic auf Mijatovic, doch der Stuermer vom AC Florenz scheitert aus 12 Metern freistehend an van der Sar. [German: Milosevic to Mijatovic, but the AC Fiorentina striker, unmarked, is denied by van der Sar from 12 metres.]
WEBTEC 15: Milosevic filtreert door de Nederlandse defensie door één beweging en legt af voor Mijatovic. Deze laatste trapt op van der Sar. [Dutch: Milosevic slips through the Dutch defence with a single move and lays the ball off for Mijatovic, who shoots at van der Sar.]
Events:
- Pass: Milosevic
- ShotOnGoal: Mijatovic
- Save: Van der Sar
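To make the step from aligned fragments to one set of events concrete, here is a minimal sketch of unification by majority vote per event type. Both the voting strategy and the per-source extractions in the example are hypothetical; the slides show only the unified result, not what each source contributed or how conflicts are resolved.

```python
from collections import Counter, defaultdict


def unify(fragments):
    """fragments: {source: [(event_type, player), ...]} for one aligned scene.
    Returns {event_type: player}, choosing the most frequently named player."""
    votes = defaultdict(Counter)
    for events in fragments.values():
        for event_type, player in events:
            votes[event_type][player] += 1
    return {t: c.most_common(1)[0][0] for t, c in votes.items()}


# Hypothetical per-source extractions for the 15th-minute scene.
fragments = {
    "BBC": [("Save", "Van der Sar"), ("ShotOnGoal", "Mijatovic")],
    "Guardian": [("Pass", "Milosevic"), ("Save", "Van der Sar")],
    "Kickers": [("Pass", "Milosevic"), ("ShotOnGoal", "Mijatovic")],
}
print(unify(fragments))
```

No single source in this toy input contains all three events, yet the unified scene does, which is the benefit merging is meant to provide.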
Reasoning: incomplete graphs
Reasoning: own goal
Reordering
Observation from corpus:
- Scenes are in the correct order
- Events within scenes are often in the wrong order
Reordering
Manual annotation of several matches gives typical event sequences:
- Pass, Shot-on-goal, Goal
- Pass, Shot-on-goal, Save
- Shot-on-goal, Hitting-the-post
- Foul, Free-kick, Shot-on-goal, Corner
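The sequences above could be used as ordering templates. The following is a sketch under that assumption only: the matching strategy (use the first template that covers every event type in the scene) and the identifier spellings such as `FreeKick` are illustrative choices, not taken from the merger.

```python
# Templates taken from the manually annotated sequences; spellings are assumed.
TEMPLATES = [
    ["Pass", "ShotOnGoal", "Goal"],
    ["Pass", "ShotOnGoal", "Save"],
    ["ShotOnGoal", "HitThePost"],
    ["Foul", "FreeKick", "ShotOnGoal", "Corner"],
]


def reorder(events):
    """events: list of (event_type, player) pairs from one scene.
    If a template covers every event type in the scene, sort the events by
    their position in that template; otherwise return them unchanged."""
    types = {t for t, _ in events}
    for template in TEMPLATES:
        if types <= set(template):
            return sorted(events, key=lambda e: template.index(e[0]))
    return list(events)


scene = [("Save", "Van der Sar"), ("Pass", "Milosevic"), ("ShotOnGoal", "Mijatovic")]
print(reorder(scene))
```

For the 15th-minute scene this yields Pass, ShotOnGoal, Save, matching the second template.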
Reordering
Not fully implemented yet in the merger.