Presentation is loading. Please wait.

Presentation is loading. Please wait.

First assignment: due Today

Similar presentations


Presentation on theme: "First assignment: due Today"— Presentation transcript:

1 First assignment: due Today
Go to Create an account for yourself use andrew id Go to your user page Your real name & a link to your home page Preferably a picture Who you are and what you hope to get out of the class (Let me know if you’re just auditing) Any special skills you have, research interests that you have, IE related projects you have been or might be working on, etc.

2 Outline Super-quick review of previous talk
More on NER by token-tagging Limitations of HMMs MEMMs for sequential classification Review of relation extraction techniques Decomposition one: NER + segmentation + classifying segments and entities Decomposition two: NER + segmentation + classifying pairs of entities Some case studies ACE Webmaster

3 Quick review of previous talk

4 What is “Information Extraction”
As a family of techniques: Information Extraction = segmentation + classification + association + clustering October 14, 2002, 4:00 a.m. PT For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation. Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers. "We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access.“ Richard Stallman, founder of the Free Software Foundation, countered saying… * Microsoft Corporation CEO Bill Gates Microsoft Gates Bill Veghte VP Richard Stallman founder Free Software Foundation * NAME TITLE ORGANIZATION Bill Gates CEO Microsoft Bill Veghte VP Richard Stallman founder Free Soft.. * * 4

5 What is “Information Extraction”
As a family of techniques: Information Extraction = segmentation + classification + association + clustering via token tagging October 14, 2002, 4:00 a.m. PT For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation. Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers. "We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access.“ Richard Stallman, founder of the Free Software Foundation, countered saying… * Microsoft Corporation CEO Bill Gates Microsoft Gates Bill Veghte VP Richard Stallman founder Free Software Foundation * NAME TITLE ORGANIZATION Bill Gates CEO Microsoft Bill Veghte VP Richard Stallman founder Free Soft.. * * 5

6 Token tagging and NER

7 NER by tagging tokens Given a sentence:
Yesterday Pedro Domingos flew to New York. 1) Break the sentence into tokens, and classify each token with a label indicating what sort of entity it’s part of: person name location name background Yesterday Pedro Domingos flew to New York 3) To learn an NER system, use YFCL. 2) Identify names based on the entity labels Person name: Pedro Domingos Location name: New York 7

8 HMM for Segmentation of Addresses
CA 0.15 NY 0.11 PA 0.08 Hall 0.15 Wean 0.03 N-S 0.02 Simplest HMM Architecture: One state per entity type [Pilfered from Sunita Sarawagi, IIT/Bombay] 8

9 HMMs for Information Extraction
00 : pm Place : Wean Hall Rm Speaker : Sebastian Thrun The HMM consists of two probability tables Pr(currentState=s|previousState=t) for s=background, location, speaker, Pr(currentWord=w|currentState=s) for s=background, location, … Estimate these tables with a (smoothed) CPT Prob(location|location) = #(loc->loc)/#(loc->*) transitions Given a new sentence, find the most likely sequence of hidden states using Viterbi method: MaxProb(curr=s|position k)= Maxstate t MaxProb(curr=t|position=k-1) * Prob(word=wk-1|t)*Prob(curr=s|prev=t) 9

10 “Naïve Bayes” Sliding Window vs HMMs
Domain: CMU UseNet Seminar Announcements GRAND CHALLENGES FOR MACHINE LEARNING Jaime Carbonell School of Computer Science Carnegie Mellon University 3:30 pm 7500 Wean Hall Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s. As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on. Field F1 Speaker: 30% Location: 61% Start Time: 98% Field F1 Speaker: 77% Location: 79% Start Time: 98% 10

11 Design decisions: What are the output symbols (states)
Design decisions: What are the output symbols (states) ? What are the input symbols ? Cohen => “Cohen”, “cohen”, “Xxxxx”, “Xx”, … ? 8217 => “8217”, “9999”, “9+”, “number”, … ? Sarawagi et al: choose best abstraction level using holdout set 11

12 What is a symbol? Ideally we would like to use many, arbitrary, overlapping features of words. identity of word ends in “-ski” is capitalized is part of a noun phrase is in a list of city names is under node X in WordNet is in bold font is indented is in hyperlink anchor S S S t - 1 t t+1 is “Wisniewski” part of noun phrase ends in “-ski” O O O t - 1 t t +1 Lots of learning systems are not confounded by multiple, non-independent features: decision trees, neural nets, SVMs, … 12

13 What is a symbol? identity of word ends in “-ski” is capitalized is part of a noun phrase is in a list of city names is under node X in WordNet is in bold font is indented is in hyperlink anchor S S S t - 1 t t+1 is “Wisniewski” part of noun phrase ends in “-ski” O O O t - 1 t t +1 Idea: replace generative model in HMM with a maxent model, where state depends on observations 13

14 What is a symbol? identity of word ends in “-ski” is capitalized is part of a noun phrase is in a list of city names is under node X in WordNet is in bold font is indented is in hyperlink anchor S S S t - 1 t t+1 is “Wisniewski” part of noun phrase ends in “-ski” O O O t - 1 t t +1 Idea: replace generative model in HMM with a maxent model, where state depends on observations and previous state 14

15 What is a symbol? identity of word ends in “-ski” is capitalized is part of a noun phrase is in a list of city names is under node X in WordNet is in bold font is indented is in hyperlink anchor S S t - 1 S t+1 t is “Wisniewski” part of noun phrase ends in “-ski” O O O t - 1 t t +1 Idea: replace generative model in HMM with a maxent model, where state depends on observations and previous state history 15

16 Ratnaparkhi’s MXPOST Sequential learning problem: predict POS tags of words. Uses MaxEnt model described above. Rich feature set. To smooth, discard features occurring < 10 times. 16

17 Conditional Markov Models (CMMs) aka MEMMs aka Maxent Taggers vs HMMS
St-1 St St+1 ... Ot-1 Ot Ot+1 St-1 St St+1 ... Ot-1 Ot Ot+1 17

18 Extracting Relationships

19 What is “Information Extraction”
As a family of techniques: Information Extraction = segmentation + classification + association + clustering October 14, 2002, 4:00 a.m. PT For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation. Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers. "We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access.“ Richard Stallman, founder of the Free Software Foundation, countered saying… * Microsoft Corporation CEO Bill Gates Microsoft Gates Bill Veghte VP Richard Stallman founder Free Software Foundation * NAME TITLE ORGANIZATION Bill Gates CEO Microsoft Bill Veghte VP Richard Stallman founder Free Soft.. * * 19

20 What is “Information Extraction”
As a task: Filling slots in a database from sub-segments of text. 23rd July :51 GMT Microsoft was in violation of the GPL (General Public License) on the Hyper-V code it released to open source this week. After Redmond covered itself in glory by opening up the code, it now looks like it may have acted simply to head off any potentially embarrassing legal dispute over violation of the GPL. The rest was theater. As revealed by Stephen Hemminger - a principal engineer with open-source network vendor Vyatta - a network driver in Microsoft's Hyper-V used open-source components licensed under the GPL and statically linked to binary parts. The GPL does not permit the mixing of closed and open-source elements. … Hemminger said he uncovered the apparent violation and contacted Linux Driver Project lead Greg Kroah-Hartman, a Novell programmer, to resolve the problem quietly with Microsoft. Hemminger apparently hoped to leverage Novell's interoperability relationship with Microsoft. NAME TITLE ORGANIZATION Stephen Hemminger Greg Kroah-Hartman principal engineer programmer lead Vyatta Novell Linux Driver Proj. What is IE. As a task it is… Starting with some text… and a empty data base with a defined ontology of fields and records, Use the information in the text to fill the database.

21 What is “Information Extraction”
Technique 1: NER + Segment + Classify Segments and Entities 23rd July :51 GMT Microsoft was in violation of the GPL (General Public License) on the Hyper-V code it released to open source this week. After Redmond covered itself in glory by opening up the code, it now looks like it may have acted simply to head off any potentially embarrassing legal dispute over violation of the GPL. The rest was theater. As revealed by Stephen Hemminger - a principal engineer with open-source network vendor Vyatta - a network driver in Microsoft's Hyper-V used open-source components licensed under the GPL and statically linked to binary parts. The GPL does not permit the mixing of closed and open-source elements. … Hemminger said he uncovered the apparent violation and contacted Linux Driver Project lead Greg Kroah-Hartman, a Novell programmer, to resolve the problem quietly with Microsoft. Hemminger apparently hoped to leverage Novell's interoperability relationship with Microsoft. What is IE. As a task it is… Starting with some text… and a empty data base with a defined ontology of fields and records, Use the information in the text to fill the database.

22 What is “Information Extraction”
Technique 1: NER + Segment + Classify Segments and Entities 23rd July :51 GMT Microsoft was in violation of the GPL (General Public License) on the Hyper-V code it released to open source this week. After Redmond covered itself in glory by opening up the code, it now looks like it may have acted simply to head off any potentially embarrassing legal dispute over violation of the GPL. The rest was theater. As revealed by Stephen Hemminger - a principal engineer with open-source network vendor Vyatta - a network driver in Microsoft's Hyper-V used open-source components licensed under the GPL and statically linked to binary parts. The GPL does not permit the mixing of closed and open-source elements. … Hemminger said he uncovered the apparent violation and contacted Linux Driver Project lead Greg Kroah-Hartman, a Novell programmer, to resolve the problem quietly with Microsoft. Hemminger apparently hoped to leverage Novell's interoperability relationship with Microsoft. What is IE. As a task it is… Starting with some text… and a empty data base with a defined ontology of fields and records, Use the information in the text to fill the database.

23 What is “Information Extraction”
Technique 1: NER + Segment + Classify Segments and Entities 23rd July :51 GMT Microsoft was in violation of the GPL (General Public License) on the Hyper-V code it released to open source this week. After Redmond covered itself in glory by opening up the code, it now looks like it may have acted simply to head off any potentially embarrassing legal dispute over violation of the GPL. The rest was theater. As revealed by Stephen Hemminger - a principal engineer with open-source network vendor Vyatta - a network driver in Microsoft's Hyper-V used open-source components licensed under the GPL and statically linked to binary parts. The GPL does not permit the mixing of closed and open-source elements. … Hemminger said he uncovered the apparent violation and contacted Linux Driver Project lead Greg Kroah-Hartman, a Novell programmer, to resolve the problem quietly with Microsoft. Hemminger apparently hoped to leverage Novell's interoperability relationship with Microsoft. Does not contain worksAt fact Does not contain worksAt fact What is IE. As a task it is… Starting with some text… and a empty data base with a defined ontology of fields and records, Use the information in the text to fill the database. Does contain worksAt fact Does contain worksAt fact

24 What is “Information Extraction”
Technique 1: NER + Segment + Classify Segments and Entities As revealed by Stephen Hemminger - a principal engineer with open-source network vendor Vyatta - a network driver in Microsoft's Hyper-V used open-source components licensed under the GPL and statically linked to binary parts. The GPL does not permit the mixing of closed and open-source elements. … Does contain worksAt fact Stephen Hemminger Is in the worksAt fact NAME TITLE ORGANIZATION Stephen Hemminger principal engineer Vyatta principal engineer What is IE. As a task it is… Starting with some text… and a empty data base with a defined ontology of fields and records, Use the information in the text to fill the database. Is in the worksAt fact Microsoft Is not in the worksAt fact Is in the worksAt fact Vyatta

25 What is “Information Extraction”
Technique 1: NER + Segment + Classify Segments and Entities Because of Stephen Hemminger’s discovery, Vyatta was soon purchased by Microsoft for $1.5 billion… Does contain an acquired fact Microsoft Is in a acquired fact: role=acquirer Is in the acquired fact: role=acquiree Vyatta Stephen Hemminger Is not in the acquired fact What is IE. As a task it is… Starting with some text… and a empty data base with a defined ontology of fields and records, Use the information in the text to fill the database. Is in the acquired fact: role=price $1.5 billion

26 What is “Information Extraction”
Technique 1: NER + Segment + Classify Segments and Entities 23rd July :51 GMT Hemminger said he uncovered the apparent violation and contacted Linux Driver Project lead Greg Kroah-Hartman, a Novell programmer, to resolve the problem quietly with Microsoft. Hemminger apparently hoped to leverage Novell's interoperability relationship with Microsoft. Does contain worksAt fact (actually two of them) - and that’s a problem What is IE. As a task it is… Starting with some text… and a empty data base with a defined ontology of fields and records, Use the information in the text to fill the database.

27 What is “Information Extraction”
Technique 2: NER + Segment + Classify EntityPairs from same segment 23rd July :51 GMT Hemminger said he uncovered the apparent violation and contacted Linux Driver Project lead Greg Kroah-Hartman, a Novell programmer, to resolve the problem quietly with Microsoft. Hemminger apparently hoped to leverage Novell's interoperability relationship with Microsoft. Hemminger Microsoft Linux Driver Project What is IE. As a task it is… Starting with some text… and a empty data base with a defined ontology of fields and records, Use the information in the text to fill the database. programmer Novell lead Greg Kroah-Hartman

28 ACE: Automatic Content Extraction
A case study, or: yet another NIST bake-off

29 About ACE and The five year mission: “develop technology to extract and characterize meaning in human language”…in newswire text, speech, and images EDT: Develop NER for: people, organizations, geo-political entities (GPE), location, facility, vehicle, weapon, time, value … plus subtypes (e.g., educational organizations) RDC: identify relation between entities: located, near, part-whole, membership, citizenship, … EDC: identify events like interaction, movement, transfer, creation, destruction and their arguments

30

31 … and their arguments (entities)

32

33 Event = sentence plus a trigger word

34

35

36

37 Events, entities and mentions
In ACE there is a distinction between an entity—a thing that exists in the Real World—and an entity mention—which is something that exists in the text (a substring). Likewise, and event is something that (will, might, or did) happen in the Real World, and an event mention is some text that refers to that event. An event mention lives inside a sentence (the “extent”) with a “trigger” (or anchor) An event mention is defined by its type and subtype (e.g, Life:Marry, Transaction:TransferMoney) and its arguments Every argument is an entity mention that has been assigned a role. Arguments belong to the same event if they are associated with the same trigger. The entity-mention, trigger, extent, argument are markup and also define a possible decomposition of the event-extraction task into subtask.

38 The Webmaster Project: A Case Study
with Einat Minkov (LTI, now Haifa U), Anthony Tomasic (ISRI) See IJCAI-2005 paper

39 Overview and Motivations
Details: Assume DB-backed website, where schema changes over time No other changes allowed (yet) Interaction: User requests (via NL ) changes in factual content of website (assume update of one tuple) System analyzes request System presents preview page and editable form version of request Key points: partial correctness is useful user can verify correctness (vs case for DB queries, q/a,...) => source of training data What’s new: Adaptive NLP components Learn to adapt to changes in domain of discourse Deep analysis in limited but evolving domain Compared to past NLP systems: Deep analysis in narrow domain (Chat-80, SHRDLU,...) Shallow analysis in broad domain (POS taggers, NE recognizers, NP-chunkers, ...) Learning used as tool to develop non-adaptive NLP components ...something in between...

40 User C email msg C C features C LEARNER Classification requestType
offline training data LEARNER Shallow NLP Feature Building Classification C requestType POS tags msg NP chunks C targetRelation words, ... C targetAttrib features NER newEntity1,... oldEntity1,... entity1, entity2, .... C keyEntity1,... otherEntity1,... preview page user-editable form version of request Update Request Construction User confirm? web page templates database

41 Outline Training data/corpus Analysis steps: Conclusions/summary
look at feasibility of learning the components that need to be adaptive, using a static corpus Analysis steps: request type entity recognition role-based entity classification target relation finding target attribute finding [request building] Conclusions/summary

42 Training data User1 User2 User3 ....
Mike Roborts should be Micheal Roberts in the staff listing, pls fix it. Thanks - W User1 User2 On the staff page, change Mike to Michael in the listing for “Mike Roberts”. User3 ....

43 Training data User1 User2 User3 ....
Add this as Greg Johnson’s phone number: User1 Please add “ ” to greg johnson’s listing on the staff page. User2 User3 ....

44 Training data – entity names are made distinct
Add this as Greg Johnson’s phone number: User1 Please add “ ” to fred flintstone’s listing on the staff page. User2 User3 Modification: to make entity-extraction reasonable, remove duplicate entities by replacing them with alternatives (preserving case, typos, etc) ....

45 Training data message(user 1,req 1) User1 Request1
.... message(user 1,req 2) User2 Request2 message(user 2,req 2) .... User3 Request3 message(user 1,req 3) message(user 2,req 3) .... .... ....

46 Training data – always test on a novel user?
message(user 1,req 1) test User1 Request1 message(user 2,req 1) .... message(user 1,req 2) train User2 Request2 message(user 2,req 2) .... User3 Request3 message(user 1,req 3) train message(user 2,req 3) .... Simulate a distribution of many users (harder to learn) .... ....

47 Training data – always test on a novel request?
message(user 1,req 1) train User1 Request1 message(user 2,req 1) .... message(user 1,req 2) test User2 Request2 message(user 2,req 2) .... User3 Request3 message(user 1,req 3) train message(user 2,req 3) Simulate a distribution of many requests (much harder to learn) .... 617 s total + 96 similar ones

48 Training data – limitations
One DB schema, one off-line dataset May differ from data collected on-line So, no claims made for tasks where data will be substantially different (i.e., entity recognition) No claims made about incremental learning/transfer All learning problems considered separate One step of request-building is trivial for the schema considered: Given entity E and relation R, to which attribute of R does E correspond? So, we assume this mapping is trivial (general case requires another entity classifier)

49 C email msg C C features Information Extraction C requestType POS tags
Shallow NLP Feature Building C requestType POS tags msg NP chunks C targetRelation words, ... C targetAttrib features Information Extraction newEntity1,... oldEntity1,... entity1, entity2, .... C keyEntity1,... otherEntity1,...

50 Entity Extraction Results
We assume a fixed set of entity types no adaptivity needed (unclear if data can be collected) Evaluated: hand-coded rules (approx cascaded FST in “Mixup” language) learned classifiers with standard feature set and also a “tuned” feature set, which Einat tweaked results are in F1 (harmonic avg of recall and precision) two learning methods, both based on “token tagging” Conditional Random Fields (CRF) Voted-perception discriminative training for an HMM (VP-HMM)

51 Entity Extraction Results – v2 (CV on users)

52 C email msg C C features C requestType POS tags NP chunks
Shallow NLP Feature Building C requestType POS tags msg NP chunks C targetRelation words, ... C targetAttrib features Information Extraction newEntity1,... oldEntity1,... entity1, entity2, .... C keyEntity1,... otherEntity1,...

53 Entity Classification Results
Entity “roles”: keyEntity: value used to retrieve a tuple that will be updated (“delete greg’s phone number”) newEntity: value to be added to database (“William’s new office # is 5307 WH”). oldEntity: value to be overwritten or deleted (“change mike to Michael in the listing for ...”) irrelevantEntity: not needed to build the request (“please add .... – thanks, William”) Features: closest preceding preposition closest preceding “action verb” (add, change, delete, remove, ...) closest preceding word which is a preposition, action verb, or determiner (in “determined” NP) is entity followed by ‘s

54 C email msg C C features C requestType POS tags NP chunks
Shallow NLP Feature Building C requestType POS tags msg NP chunks C targetRelation words, ... C targetAttrib features Information Extraction newEntity1,... oldEntity1,... entity1, entity2, .... C keyEntity1,... otherEntity1,...

55 Reasonable results with “bag of words” features.
Shallow NLP Feature Building C requestType POS tags msg NP chunks C targetRelation words, ... C targetAttrib features Information Extraction newEntity1,... oldEntity1,... entity1, entity2, .... C keyEntity1,... otherEntity1,... Reasonable results with “bag of words” features.

56 C email msg C C features C requestType POS tags NP chunks
Shallow NLP Feature Building C requestType POS tags msg NP chunks C targetRelation words, ... C targetAttrib features Information Extraction newEntity1,... oldEntity1,... entity1, entity2, .... C keyEntity1,... otherEntity1,...

57 Request type classification: addTuple, alterValue, deleteTuple, or deleteValue?
Can be determined from entity roles, except for deleteTuple and deleteValue. “Delete the phone # for Scott” vs “Delete the row for Scott” Features: counts of each entity role action verbs nouns in NPs which are (probably) objects of action verb (optionally) same nouns, tagged with a dictionary Target attributes are similar Comments: Very little data is available Twelve words of schema-specific knowledge: dictionary of terms like phone, extension, room, office, ...

58 C email msg C C features C requestType POS tags NP chunks
Shallow NLP Feature Building C requestType POS tags msg NP chunks C targetRelation words, ... C targetAttrib features Information Extraction newEntity1,... oldEntity1,... entity1, entity2, .... C keyEntity1,... otherEntity1,...

59 C email msg C C features C requestType POS tags NP chunks
Shallow NLP Feature Building C requestType POS tags msg NP chunks C targetRelation words, ... C targetAttrib features Information Extraction newEntity1,... oldEntity1,... entity1, entity2, .... C keyEntity1,... otherEntity1,...

60 Training data message(user 1,req 1) User1 Request1
.... message(user 1,req 2) User2 Request2 message(user 2,req 2) .... User3 Request3 message(user 1,req 3) message(user 2,req 3) .... .... ....

61 Training data – always test on a novel user?
message(user 1,req 1) test User1 Request1 message(user 2,req 1) .... message(user 1,req 2) train User2 Request2 message(user 2,req 2) .... User3 Request3 message(user 1,req 3) train message(user 2,req 3) .... Simulate a distribution of many users (harder to learn) .... ....

62 Training data – always test on a novel request?
message(user 1,req 1) train User1 Request1 message(user 2,req 1) .... message(user 1,req 2) test User2 Request2 message(user 2,req 2) .... User3 Request3 message(user 1,req 3) train message(user 2,req 3) Simulate a distribution of many requests (much harder to learn) .... 617 s total + 96 similar ones

63 Other issues: a large pool of users and/or requests
usr

64 Webmaster: the punchline

65 Conclusions? System architecture allows all schema-dependent knowledge to be learned Potential to adapt to changes in schema Data needed for learning can be collected from user Learning appears to be possible on reasonable time-scales 10s or 100s of relevant examples, not thousands Schema-independent linguistic knowledge is useful F1 is eighties is possible on almost all subtasks. Counter-examples are rarely changed relations (budget) and distinctions for which little data is available There is substantial redundancy in different subtasks Opportunity for learning suites of probabilistic classifiers, etc Even an imperfect IE system can be useful…. With the right interface…

66 C email msg C C features Information Extraction C requestType POS tags
Shallow NLP Feature Building C requestType POS tags msg NP chunks targetRelation micro- form C words, ... C targetAttrib features Information Extraction newEntity1,... oldEntity1,... query DB entity1, entity2, .... C keyEntity1,... otherEntity1,...

67

68

69 Webmaster: the Epilog (VIO)
Tomasic et al, IUI 2006 Faster for request-submitter Zero time for webmaster Zero latency More reliable (!) Entity F1 ~= 84, Micro-form selection accuracy =~ 80 Used UI for experiments on real people (human-human, human-VIO)

70 Conclusions and comments
Two case studies of non-trivial IE pipelines illustrate: In any pipeline, errors propogate What’s the right way of training components in a pipeline? Independently? How can (and when should) we make decisions using some flavor of joint inference? Some practical questions for pipeline components: What’s downstream? What do errors cost? Often we can’t see the end of the pipeline… How robust is the method ? new users, new newswire sources, new upsteam components… Do different learning methods/feature sets differ in robustness? Some concrete questions for learning relations between entities: (When) is classifying pairs of things the right approach? How do you represent pairs of objects? How to you represent structure, like dependency parses? Kernels? Special features?


Download ppt "First assignment: due Today"

Similar presentations


Ads by Google