
References

Kempen, Gerard & Harbusch, Karin (2002). Performance Grammar: A declarative definition. In: Nijholt, Anton, Theune, Mariët & Hondorp, Hendri (Eds.), Computational Linguistics in the Netherlands Amsterdam: Rodopi.

Harbusch, Karin & Kempen, Gerard (2002). A quantitative model of word order and movement in English, Dutch and German complement constructions. Proceedings of the 19th International Conference on Computational Linguistics (COLING-2002), Taipei (Taiwan). [pp ]

Project Objective

Problem

For the user of a Data Base it is best if the textual part of the results of a query is given as a well-organized set of natural language sentences. The usual ways of dealing with this problem combine sentences from the database, but cannot produce new ones or combine parts of those already available. The goal of the project is to develop a generating system which will be fed with semantically rich information. In this way a closer and more content-sensitive relation can be established between the query and the information in the Data Base. At the same time, this kind of generator would be fed more of the syntactically relevant information, such as the logical form, the temporal structure of the event, the Argument Structure and others, which helps not only the process of generation, but possibly also the discourse organization of the text and the information interchange with different ontologies and semantic webs. The model of the semantic representation and the procedures developed should be as compatible as possible with other generators. One way to test this property relies on the two different generators available to the project, the Performance Grammar Generator and Delilah: the semantic model should be able to work successfully with both.
An important question that will be tackled is whether, and to what extent, it is possible to recover the complex entailment calculations by procedural means in a system operating over the Logical Form and the semantic representation.

Objective

Build a generating system which operates with both the conceptual level and the Logical Form and which is able to generate natural language sentences from the selected semantic material. The important tasks within this objective are to keep the entailment relations, i.e. to make sure that the output preserves the information from the Data Base without adding any new meaning to it, and to stay compatible with the other I2RP projects working on related problems. The Data Base used for testing and applying the system is the Rijksmuseum’s Data Base on Rembrandt, the same one that is used by the Cuypers project of CWI.

Principles

Base the model on the semantics and logical form of the information from the Data Base instead of on statistical and other shallow methods.
Keep all the modules universally applicable to any natural language, and not only adjusted to Dutch, English or another language of application.
Use the semantic base of the generator to make it able to establish the relations between different lexical contents denoting the same concepts.
Follow the findings of experimental and theoretical sciences like psycholinguistics, semantics and syntax in the architecture of the system, instead of simple result-oriented procedures, which usually lead to oversimplifications and therefore to less robust applications.

Current Work

Parsing-Generating System with the conceptual interface

Two software applications are being developed and upgraded: the Performance Grammar Generator, a generator based on the Performance Grammar model of Gerard Kempen, and Delilah, the work of Cremers and Hijzelendoorn, which does both parsing and generating.
The current work is based on the latter and presents an interface system that allows Delilah to generate on the basis of a parsed natural language input, i.e. its conceptual material. Delilah’s Parsing-Generating system takes a sentence of Dutch as its input and outputs new grammatical sentences of Dutch containing all the concepts from the input. At the current point of its development, the system cannot control the process of generation so that only the sentences entailed by the input are generated. This means that a large number of the generated sentences will have a meaning contradictory or orthogonal to that of the input: based on the same concepts, but in very different relations and realizations. There are three plausible ways to solve this problem. The first one is Generate and Test, which means introducing another module that tests the generated sentences for entailment and ends the recursive generation when a satisfactory sentence is made. The other two are the Controlled procedure and a hybrid solution.

Future Plans

A richer semantic representation

Research in the areas of Logical Form, thematic roles, Aspectual and Argument Structure, as well as Event Structure and nominalization, is expected to lead to a richer semantic representation which would include event structure and in this way improve both the parser and the generator. It will also contribute to enriching the lexical information and allow entailment to be calculated on a more fine-grained level.

Improving entailment calculations

The general question of to what extent information can be preserved in the processes of parsing and generation will be explored more thoroughly, in order to find the optimal way to make the generation process controlled and strictly dependent on the entailment relations with the Data Base.
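The Generate and Test option mentioned under Current Work can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: meanings are reduced to hypothetical (predicate, agent, patient) triples, and entailment is approximated by checking that a candidate is already asserted by the input. It does not reflect how Delilah or the Performance Grammar Generator represent or test meaning.

```python
from itertools import permutations
from typing import FrozenSet, Iterator, List, Optional, Tuple

# Hypothetical toy stand-in for a Logical Form: (predicate, agent, patient).
Fact = Tuple[str, str, str]


def candidates(predicates: List[str], entities: List[str]) -> Iterator[Fact]:
    """Enumerate every reading that can be built from the same concepts."""
    for predicate in predicates:
        for agent, patient in permutations(entities, 2):
            yield (predicate, agent, patient)


def is_entailed(candidate: Fact, source: FrozenSet[Fact]) -> bool:
    """Toy entailment test (assumption): the candidate must be asserted by the source."""
    return candidate in source


def generate_and_test(source: FrozenSet[Fact]) -> Optional[Fact]:
    """Generate candidates blindly and stop at the first one that passes the test."""
    predicates = sorted({p for p, _, _ in source})
    entities = sorted({e for _, a, b in source for e in (a, b)})
    for candidate in candidates(predicates, entities):
        if is_entailed(candidate, source):
            return candidate
    return None


source = frozenset({("paint", "Rembrandt", "Night Watch")})
result = generate_and_test(source)
print(result)
```

In this toy run the generator first proposes the reversed reading, in which the painting is the agent, rejects it, and only then reaches the entailed one. The cost of the approach comes from exactly this kind of wasted candidate.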
Synchronizing the work with the other projects in I2RP

There already is a significant overlap in the areas of interest and the problems to be solved with other projects inside I2RP. We are trying to establish common ground in the approach to discourse structure, conceptual networks and ontologies with the other groups, especially with the Cuypers project at CWI, with which our project overlaps the most. We are also going to organize the cooperation so that we work on the same Data Bases and try to make our applications compatible with each other.

Covering the Rembrandt Data Base

The lexical and conceptual world of the Rembrandt Data Base will be further incorporated in the developed systems. One of the next steps is to impose a certain hierarchical or ontological organization of the set of concepts on the generating system, especially on its semantic component.

Semantically Based Parsing-Generating System

Boban Arsenijević, dr. Crit Cremers, prof. Gerard Kempen, with help of Hilke Reckman (a member of ToKeN’s Narator project)

The hybrid solution would still use the conceptual level, but the selection of the material for the base of generation would take place on the level of the semantic form, which should allow it to keep the advantages of both other solutions.

Delilah’s Parser-Generator is a parser and a generator at the same time; its parsing segment introduces the Data Base, which can be any text in Dutch. It parses the input into a semantic representation, and then extracts the concepts related to the meaning of the sentence. No matter how many concepts are extracted, it is able to make a new sentence, possibly with new relations between the concepts and new words realizing them, which will contain all the concepts extracted from the input. The current version prefers an unstructured Data Base to a structured one.
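Why generation from extracted concepts alone does not guarantee entailment can be shown with a small sketch. The triple format and the example readings are hypothetical and serve only as an illustration; they are not Delilah's actual semantic representation.

```python
from typing import FrozenSet, Tuple

# Hypothetical toy format: (predicate, agent, patient).
Relation = Tuple[str, str, str]


def concepts(relation: Relation) -> FrozenSet[str]:
    """Drop the structure and keep only the unordered set of concepts."""
    return frozenset(relation)


parsed_input: Relation = ("paint", "Rembrandt", "Saskia")
generated: Relation = ("paint", "Saskia", "Rembrandt")

# Both readings contain exactly the same concepts ...
same_concepts = concepts(generated) == concepts(parsed_input)
# ... but the relations between them differ, so the meaning is not preserved.
same_meaning = generated == parsed_input

print(same_concepts, same_meaning)
```

A check at the conceptual level accepts the generated reading, while a check at the level of relations rejects it. This is the gap that a test for entailment, or a selection step on the semantic representation, has to close.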
The Controlled procedure is supposed to work only on the level of the semantic representation, without the concepts, and to generate the sentence only once the proper material is selected. It is much faster than the Generate and Test model, but also has certain disadvantages.

De Spreekbuis
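As a rough contrast with Generate and Test, the Controlled procedure can be sketched as select first, generate once. The fact format, the `select_material` and `realize` helpers and the example entries are all hypothetical; the entries are toy data and not records from the Rijksmuseum Data Base.

```python
from typing import Iterable, List, Tuple

# Hypothetical toy format for the semantic representation: (predicate, agent, patient).
Fact = Tuple[str, str, str]


def select_material(facts: Iterable[Fact], topic: str) -> List[Fact]:
    """Select the proper material on the level of the semantic representation."""
    return [fact for fact in facts if topic in (fact[1], fact[2])]


def realize(fact: Fact) -> str:
    """Trivial template realizer (assumption): one sentence per selected fact."""
    predicate, agent, patient = fact
    return f"{agent} {predicate} {patient}."


database: List[Fact] = [
    ("painted", "Rembrandt", "The Night Watch"),
    ("married", "Rembrandt", "Saskia"),
    ("exhibits", "Rijksmuseum", "The Night Watch"),
]

selected = select_material(database, "Saskia")
# Generation happens exactly once per selected fact; nothing is generated and discarded.
sentences = [realize(fact) for fact in selected]
print(sentences)
```

Because only pre-selected material reaches the realizer, every output is entailed by the input by construction. The price is that the output can only be as varied as the selection and realization steps allow.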