
Hans Uszkoreit, German Research Center for Artificial Intelligence and Saarland University at Saarbruecken


Presentation transcript:

1 The Rôle of Linguistics for the Future of Language Processing
Hans Uszkoreit, German Research Center for Artificial Intelligence and Saarland University at Saarbruecken

2 Outline
LEITLINIEN FÜR DIE HEIDELBERGER COMPUTERLINGUISTIK © 2003 H. Uszkoreit
- The development of linguistics
- Linguistics and the computer
- The relevance of CL for theoretical linguistics
- The role of linguistics for language technology
- Current trends and outlook

3 IT in Science
- Data-Gathering and Maintenance: automatic handling of large volumes of data
- Scientific Computing: data and model visualization; data exploitation, simulation modelling
- Electronic scientific information: data on research (centers, people, resources, projects, literature)
- Electronic scientific content: reports, articles, books, e-journals, e-print archives

4 Development of Linguistics
- first half of 20th century: linguistics becomes concrete; structuralist linguistics; ontological concepts (entities and structures)
- second half of 20th century: linguistics becomes formal; generative linguistics; formalisms for syntax and semantics
- first half of 21st century: linguistics becomes empirical; empirical linguistics; quantitative models; graded grammaticality

5 The Rôle of Computation
- formalization led to highly complex systems of formal rules, principles or constraints that cannot be tested, validated and modified without sophisticated information processing
- language data of sufficient size cannot be gathered, searched, and maintained anymore without powerful computing

6 Empirical Linguistics
- discrete findings
- statistical findings
- replicability
- shared interpretations of data
- connection with data and results

7 Empirical Linguistics
[Diagram labels: corpus data; experimental psycholinguistic data; introspective data; DB of relevant data; research]

8 Driving Forces of CL
- Cognition: models of human language processing
- Engineering: language technology applications
- Linguistics: linguistic theory

9 Role of Computing in Linguistics
[Diagram labels: theoretical linguistics; applied linguistics; linguistics without the computer; linguistics with the computer]

10 Until 1980 [Diagram labels: Linguistics; Computational Linguistics]

11 1980-1990 [Diagram labels: Linguistics; Computational Linguistics]

12 1990-2000 [Diagram labels: Linguistics; Computational Linguistics]

13 LT Methods
[Diagram labels: discrete, non-discrete, hybrid; shallow, deep. Example shown: HMM-based POS Tagger]
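To make the "HMM-based POS Tagger" example concrete, here is a minimal sketch of Viterbi decoding over a toy hidden Markov model. It is not taken from the slides: the tag set, the probability tables `START`, `TRANS` and `EMIT`, and the function name `viterbi` are all assumptions invented for illustration, and the input is the example sentence that appears later in the deck.

```python
import math

# Toy HMM: every number below is invented for illustration only.
TAGS = ["N", "V", "Det", "A"]
START = {"N": 0.5, "V": 0.1, "Det": 0.3, "A": 0.1}
TRANS = {
    "N":   {"N": 0.2, "V": 0.5,  "Det": 0.2,  "A": 0.1},
    "V":   {"N": 0.4, "V": 0.05, "Det": 0.4,  "A": 0.15},
    "Det": {"N": 0.5, "V": 0.01, "Det": 0.01, "A": 0.48},
    "A":   {"N": 0.8, "V": 0.05, "Det": 0.05, "A": 0.1},
}
EMIT = {
    "N":   {"sue": 0.3, "paul": 0.3, "penny": 0.3, "old": 0.1},
    "V":   {"gave": 1.0},
    "Det": {"an": 1.0},
    "A":   {"old": 1.0},
}


def log(p):
    """Log-probability; a zero probability maps to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(words):
    """Return the most probable tag sequence for `words` under the toy HMM."""
    # Each column maps tag -> (best log-probability of a path ending in
    # that tag at this position, tag at the previous position).
    columns = [
        {t: (log(START[t]) + log(EMIT[t].get(words[0], 0.0)), None) for t in TAGS}
    ]
    for word in words[1:]:
        last = columns[-1]
        column = {}
        for tag in TAGS:
            prev = max(TAGS, key=lambda p: last[p][0] + log(TRANS[p][tag]))
            score = (last[prev][0] + log(TRANS[prev][tag])
                     + log(EMIT[tag].get(word, 0.0)))
            column[tag] = (score, prev)
        columns.append(column)
    # Follow the back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


words = "sue gave paul an old penny".split()
print(list(zip(words, viterbi(words))))
```

The tagger assigns one label per word from local statistics only and builds no syntactic structure, which is one way to read "shallow" and "non-discrete" in the diagram.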

14 LT Methods
[Diagram labels: discrete, non-discrete, hybrid; shallow, deep. Example shown: HPSG-Parser with MRS]

15 LT Methods
[Diagram labels: discrete, non-discrete, hybrid; shallow, deep. Example shown: PCF Parser]
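As a rough illustration of what a probabilistic context-free (PCF) parser computes, the sketch below scores one parse tree as the product of the probabilities of the rules used in it. It covers only the scoring step, not the search for the best tree. The rule probabilities in `RULE_PROB`, the function name `tree_prob`, and the bracketing of the tree are assumptions made for this example; the sentence is the one used later in the deck.

```python
from math import isclose

# Toy PCFG: rule probabilities are invented for illustration only.
# Keys are (left-hand side, right-hand side) pairs.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP", "NP")): 0.3,
    ("VP", ("V", "NP")): 0.7,
    ("NP", ("N",)): 0.6,
    ("NP", ("Det", "A", "N")): 0.4,
    ("N", ("Sue",)): 0.4,
    ("N", ("Paul",)): 0.4,
    ("N", ("penny",)): 0.2,
    ("V", ("gave",)): 1.0,
    ("Det", ("an",)): 1.0,
    ("A", ("old",)): 1.0,
}

# Assumed bracketing of "Sue gave Paul an old penny."
TREE = (
    "S",
    ("NP", ("N", "Sue")),
    ("VP",
     ("V", "gave"),
     ("NP", ("N", "Paul")),
     ("NP", ("Det", "an"), ("A", "old"), ("N", "penny"))),
)


def tree_prob(tree):
    """Probability of a tree: product of the probabilities of its rules."""
    label, *children = tree
    if isinstance(children[0], str):
        # Preterminal node directly above a word.
        return RULE_PROB[(label, tuple(children))]
    rhs = tuple(child[0] for child in children)
    p = RULE_PROB[(label, rhs)]
    for child in children:
        p *= tree_prob(child)
    return p


print(tree_prob(TREE))
```

A full parser would compare such scores across all trees licensed by the grammar and return the highest-scoring one.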

16 LT Methods
[Diagram labels: discrete, non-discrete, hybrid; shallow, deep. Example shown: syntactic LFG parser with ME selection]

17 LT Methods (Trends)
[Diagram labels: discrete, non-discrete, hybrid; shallow, deep]

18 Simulation and Modelling
[Figure: phrase-structure tree for "Sue gave Paul an old penny." with node labels S, NP, VP, V, Det, A, N]

19 [Figure: phrase-structure tree for German "Sue gab Paul einen alten Pfennig." (node labels S, S/NP, NP, V, Det, A, N) and phrase-structure tree for English "Sue gave Paul an old penny." (node labels S, VP, NP, V, Det, A, N), together with the semantic representation ∃x[(old'(penny'))(x) ∧ Past(give'(sue, paul, x))]]
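The semantic representation on this slide can be read as an existential statement: some x is an old penny, and Sue gave x to Paul in the past. The sketch below checks such a statement against a tiny hand-built model. Everything in it is an assumption for illustration: the entities, the fact sets, the intersective treatment of old', and the function names `old`, `penny`, `past_give` and `holds`.

```python
# Toy model: entities and facts are invented for illustration only.
ENTITIES = {"sue", "paul", "coin1", "coin2"}
PENNIES = {"coin1", "coin2"}
OLD_THINGS = {"coin1"}
PAST_GIVINGS = {("sue", "paul", "coin1")}  # (giver, recipient, thing given)


def penny(x):
    """penny': a one-place predicate over entities."""
    return x in PENNIES


def old(pred):
    """old': maps a predicate to a new predicate (assumed intersective)."""
    return lambda x: pred(x) and x in OLD_THINGS


def past_give(giver, recipient, thing):
    """Past(give'(giver, recipient, thing)) looked up in the toy model."""
    return (giver, recipient, thing) in PAST_GIVINGS


def holds(giver, recipient):
    """Is there an x with (old'(penny'))(x) and Past(give'(giver, recipient, x))?"""
    return any(
        old(penny)(x) and past_give(giver, recipient, x) for x in ENTITIES
    )


print(holds("sue", "paul"))
```

Because the representation abstracts away from word order, the same check would serve for the German and the English sentence alike.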

20 Applications
- Machine Translation, e.g. Systran, Logos, METAL-Comprendium, IBM PT
- Access to Databases, e.g. Core Language Engine
- New: Information Extraction and Text Enrichment, e.g. WHITEBOARD, DEEP THOUGHT

21 Once Upon a Time
- Broad industrial research in deep parsing
  - Xerox: LFG
  - Siemens: LFG
  - IBM Germany: HPSG
  - Hewlett Packard: GPSG and HPSG
  - IBM USA: PLNLP and Slot Grammar
- Very large projects
  - EUROTRA
  - LILOG
  - LS-GRAM

22 Grammar Frameworks
- Head-Driven Phrase Structure Grammar (HPSG)
- Lexical Functional Grammar (LFG)
- Tree-Adjoining Grammar (TAG)
- Categorial Grammar (CG)
- Dependency Grammar (DG)
- GB-Minimalist Program

23 Problems with Deep Analysis
- Coverage (Development Time)
- Robustness (Coping with Out-of-Grammar Input)
- Efficiency (Runtime and Space Efficiency)
- Specificity (Selection among Readings)

24 Real Grammars
- LinGO - English Resource Grammar
  - 8,000 types
  - 100,000 lines of code
  - 6,000 lexemes
  - average feature structure > 300 nodes
- German Grammar of equal size
- Japanese grammar is still smaller

25 Future of Linguistics
- Combination of discrete and nondiscrete methods
- All in one integrated system (such as a blackboard or manager architecture)
- Separate systems annotating the same input with different control schemes (whiteboard or pool architecture)

26 Outlook
- Linguistics will develop hybrid discrete and nondiscrete models of language
- More subareas of linguistics will employ computational modelling
- Computational linguistics will play a central role in the empirical branch of linguistic research
- Computational linguistics methods and results do have a future in language technology
- Language technology will have to get more deeply into semantics
- The field provides some grand challenges

27 Grand Challenges
- hybrid models of language processing and learning, models of language change
- empirical methodology of language science: large multilevel linguistically interpreted data collections
- ambient computing -- ubiquitous natural access to information and assistance
- turning the WWW as well as personal and collective digital information repositories into digital memories and knowledge bases

