DELi COLING 2002 - W8: NLP & XML - Sept. 1st, 2002. Cascading XSL filters for content selection in multilingual document generation. G. Barrutieta, J. Abaitua & J. Díaz.


1 Cascading XSL filters for content selection in multilingual document generation
G. Barrutieta, J. Abaitua & J. Díaz (DELi)
COLING 2002 - W8: NLP & XML - Sept. 1st, 2002

2 Introduction – System overview

3 Introduction – Corpus
Multilingual parallel corpus or master document
– Gross-grained RST to represent the gross-grained discourse structure.
– XML-DTD to represent the gross-grained RST digitally.
  Text > Data > Between tags
  Discourse structure > Metadata > XML tags
– Gross-grained RST provides the framework for an isomorphic multilingual corpus.

4 Introduction: Multilingual parallel corpus with gross-grained RST in XML
EN | ES | EU
EN:
What is knowledge management?
Knowledge, in a business context, is the organizational memory, which people know collectively and individually
Management is the judicious use of means to accomplish an end
Knowledge management is the combination of those concepts, KM = knowledge + management

5 Introduction: Multilingual parallel corpus with gross-grained RST in XML
EN | ES | EU
EN:
What is knowledge management?
Knowledge, in a business context, is the organizational memory, which people know collectively and individually
Management is the judicious use of means to accomplish an end
Knowledge management is the combination of those concepts, KM = knowledge + management
ES:
¿Qué es gestión del conocimiento?
Conocimiento, en el contexto de los negocios, es la memoria de la organización, lo que la gente sabe colectiva e individualmente
Gestión es el uso juicioso de recursos para alcanzar un fin
Gestión del conocimiento es la combinación de esos dos conceptos, GC = gestión + conocimiento

6 Introduction: Multilingual parallel corpus with gross-grained RST in XML
EN | ES | EU
EN:
What is knowledge management?
Knowledge, in a business context, is the organizational memory, which people know collectively and individually
Management is the judicious use of means to accomplish an end
Knowledge management is the combination of those concepts, KM = knowledge + management
ES:
¿Qué es gestión del conocimiento?
Conocimiento, en el contexto de los negocios, es la memoria de la organización, lo que la gente sabe colectiva e individualmente
Gestión es el uso juicioso de recursos para alcanzar un fin
Gestión del conocimiento es la combinación de esos dos conceptos, GC = gestión + conocimiento
EU:
Zer da ezagutzaren kudeaketa?
Kudeaketa, negozioetan, erakundearen memoria da, jendeak bakarka eta taldeka dakiena
Kudeaketak erabideen erabilera zuzena du helburu
Ezagutzaren kudeaketa bi kontzeptu hauen nahasketa da, EK = ezagutza + kudeaketa

7 Introduction – User Aspects
Specific user aspects (discrete values)
  Subject: Language processors
  Moment in time: Before the course / Period 1 / Period 2 / … / After the course (review)
  Languages: EN / ES / EU
General user aspects (discrete values)
  Level of expertise: Null / Basic / Medium / High
  Reason to read: To get an idea / To get deep into it
  Background: Not related to the subject / Related to the subject
  Opinion or motivation: Against / Without an opinion or motivation / In favour
  Time available: A little bit of time / Quite some time / Enough time

8 CSA – Parallel selection

9 CSA – Horizontal filtering

10 CSA – Vertical filtering

11 CSA – Vertical filtering
Level of expertise
If level_expertise = "null" or level_expertise = "basic"
Then no relation-satellite is discarded;
If level_expertise = "medium" or level_expertise = "high"
Then discard example, exercise, background and preparation relation-satellites;
Rationale for the rule: Any user with a null or basic level of expertise on the selected subject will need all the information available to understand the text. Alternatively, a user with a medium or high level of expertise will not require examples, exercises, background, preparation and similar relation-satellites.
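The level-of-expertise rule can be sketched in a few lines of JavaScript. This is only an illustration, not the authors' XSL filter: the segment model (objects with `role`, `relation` and `text`) and the function name `filterByExpertise` are assumptions made for the example.

```javascript
// Hypothetical segment model: each unit is a nucleus or a satellite
// tagged with the RST relation it holds to its nucleus.
const segments = [
  { role: "nucleus", text: "What is knowledge management?" },
  { role: "satellite", relation: "background", text: "Some background." },
  { role: "satellite", relation: "example", text: "An example." },
  { role: "satellite", relation: "elaboration", text: "More detail." },
];

// Relations discarded for readers with medium or high expertise (from the rule).
const EXPERT_DISCARDS = new Set([
  "example",
  "exercise",
  "background",
  "preparation",
]);

function filterByExpertise(segs, levelExpertise) {
  const isExpert = levelExpertise === "medium" || levelExpertise === "high";
  if (!isExpert) return segs.slice(); // null or basic: nothing is discarded
  // Nuclei always survive; satellites survive unless their relation is listed.
  return segs.filter(
    (s) => s.role === "nucleus" || !EXPERT_DISCARDS.has(s.relation)
  );
}

console.log(filterByExpertise(segments, "high").map((s) => s.text));
```

With the sample data, a "high" reader keeps the nucleus and the elaboration, while a "basic" reader keeps all four segments.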

12 CSA – Vertical filtering
Reason to read
If reason_to_read = "to get an idea"
Then discard exercise and elaboration (all the types of elaboration: textual elaboration, link elaboration and image elaboration) relation-satellites;
If reason_to_read = "to get deep into it"
Then no relation-satellite is discarded;
Rationale: Any user wishing to broaden his knowledge in the selected subject will need additional information. Conversely, a user with the intention of just getting an idea does not need any exercise, elaboration, or similar relation-satellites, which often require a more active role on the part of the user.
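Because this rule discards a whole family of relations (textual, link and image elaboration), a sketch needs a way to match the family rather than a single name. The relation labels used here ("textual-elaboration", "link-elaboration", "image-elaboration") are hypothetical spellings chosen for the example; the slides do not give the actual tag names.

```javascript
// Assumed naming: a plain "elaboration" or any "<type>-elaboration" label.
function isElaboration(relation) {
  return relation === "elaboration" || relation.endsWith("-elaboration");
}

function filterByReason(segs, reasonToRead) {
  // "to get deep into it" (or any other value): nothing is discarded.
  if (reasonToRead !== "to get an idea") return segs.slice();
  return segs.filter(
    (s) =>
      s.role === "nucleus" ||
      !(s.relation === "exercise" || isElaboration(s.relation))
  );
}

// Hypothetical sample document.
const sample = [
  { role: "nucleus", text: "Main point." },
  { role: "satellite", relation: "exercise", text: "Try it yourself." },
  { role: "satellite", relation: "textual-elaboration", text: "In more detail." },
  { role: "satellite", relation: "link-elaboration", text: "See this link." },
  { role: "satellite", relation: "background", text: "Some context." },
];

console.log(filterByReason(sample, "to get an idea").map((s) => s.text));
```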

13 CSA – Vertical filtering
Professional background
If job_studies = "not related subject"
Then no relation-satellite is discarded;
If job_studies = "related subject"
Then discard background and preparation relation-satellites;
Rationale: Any user whose professional background is not related to the subject will need all the additional supporting text to understand its meaning. Conversely, if the user is related to the selected subject, we may assume that background, preparation and similar relation-satellites will be unnecessary.

14 CSA – Vertical filtering
Opinion or motivation
If opinion_motivation = "against" or opinion_motivation = "without an opinion or motivation"
Then no relation-satellite is discarded;
If opinion_motivation = "in favour"
Then discard motivate, antithesis, concession and justify relation-satellites;
Rationale: A motivated or favourable user will not require additional motivation. Therefore the motivate, antithesis, concession, justify and similar relation-satellites are discarded, since their role is to change the user's opinion in favour of the course material.

15 CSA – Vertical filtering
Time available
If time_available = "a little bit of time"
Then discard all the relation-satellites;
If time_available = "quite some time"
Then discard exercise relation-satellite;
If time_available = "enough time"
Then no relation-satellite is discarded;
Rationale: Time availability is a crucial user aspect. If the user is in a rush or has little time, the system has to provide only the most elementary information. In that case only nuclei will be generated. If the user has a bit more time, but not much, exercises are not offered, since they are usually quite time-consuming and require the active participation of the user. Finally, if the user has plenty of time, all the additional information is delivered.
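The time-available rule is the only one with three outcomes, so a sketch of it is a three-way branch. As before, the segment objects and the function name `filterByTime` are assumptions for illustration; rejecting unknown values with an error is a design choice of this sketch, not something stated on the slide.

```javascript
function filterByTime(segs, timeAvailable) {
  switch (timeAvailable) {
    case "a little bit of time":
      // Only nuclei are generated.
      return segs.filter((s) => s.role === "nucleus");
    case "quite some time":
      // Everything except exercise satellites.
      return segs.filter(
        (s) => s.role === "nucleus" || s.relation !== "exercise"
      );
    case "enough time":
      // Nothing is discarded.
      return segs.slice();
    default:
      throw new Error("Unknown time_available value: " + timeAvailable);
  }
}

// Hypothetical sample document.
const lesson = [
  { role: "nucleus", text: "Main point." },
  { role: "satellite", relation: "exercise", text: "Try it yourself." },
  { role: "satellite", relation: "example", text: "For instance..." },
];

console.log(filterByTime(lesson, "quite some time").map((s) => s.text));
```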

16 CSA – Vertical filtering
Comments
The order of application of the filters is irrelevant: each filter acts upon certain parts of the text independently.
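The order-independence claim can be checked on a small example. The sketch below encodes the five rules as a lookup table and applies the resulting filters in two opposite orders. The table layout, the "*" shorthand for "every satellite", the single "elaboration" label and the sample document are all assumptions of this sketch, not the authors' XSL code.

```javascript
// Relations each user-aspect value discards, taken from the five rules.
// "*" is a shorthand used in this sketch for "every satellite".
const DISCARDS = {
  level_expertise: {
    medium: ["example", "exercise", "background", "preparation"],
    high: ["example", "exercise", "background", "preparation"],
  },
  reason_to_read: { "to get an idea": ["exercise", "elaboration"] },
  job_studies: { "related subject": ["background", "preparation"] },
  opinion_motivation: {
    "in favour": ["motivate", "antithesis", "concession", "justify"],
  },
  time_available: {
    "a little bit of time": ["*"],
    "quite some time": ["exercise"],
  },
};

// Build one filter for one aspect/value pair. Nuclei always survive.
function makeFilter(aspect, value) {
  const discarded = (DISCARDS[aspect] || {})[value] || [];
  return (segs) =>
    segs.filter(
      (s) =>
        s.role === "nucleus" ||
        !(discarded.includes("*") || discarded.includes(s.relation))
    );
}

// Apply the filters one after another, in the given order.
function cascade(segs, filters) {
  return filters.reduce((current, f) => f(current), segs);
}

// Hypothetical user profile and document.
const profile = {
  level_expertise: "high",
  reason_to_read: "to get an idea",
  job_studies: "related subject",
  opinion_motivation: "in favour",
  time_available: "quite some time",
};
const doc = [
  { role: "nucleus", text: "Main point." },
  { role: "satellite", relation: "example", text: "For instance..." },
  { role: "satellite", relation: "elaboration", text: "In more detail." },
  { role: "satellite", relation: "background", text: "Some context." },
  { role: "satellite", relation: "motivate", text: "Why it matters." },
  { role: "satellite", relation: "exercise", text: "Try it yourself." },
  { role: "satellite", relation: "evidence", text: "Supporting data." },
];

const filters = Object.entries(profile).map(([a, v]) => makeFilter(a, v));
const forward = cascade(doc, filters);
const backward = cascade(doc, [...filters].reverse());
console.log(JSON.stringify(forward) === JSON.stringify(backward));
```

Each filter only removes satellites whose relation it names and never inspects what other filters removed, so the surviving set is the same in any order. In this table-driven form, a new user aspect is one more entry in `DISCARDS`.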

17 Implementation
Javascript implementation
XSL implementation
    objData.loadXML(sResult);                  // load the current result as the XML input
    objStyle.load(sXSL1);                      // load the first XSL filter
    sResult = objData.transformNode(objStyle); // apply the filter; the output becomes the new result
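The three calls above form one step of the cascade: the previous result is parsed, one filter is applied, and the serialized output is stored back in `sResult` for the next filter. The loop below sketches that chaining pattern so that it runs outside a browser. It is a stand-in, not the authors' code: JSON parsing and serialization replace the XML loading and XSL transformation, and `makeStringFilter` and `stylesheets` are hypothetical names.

```javascript
// Stand-in for one XSL filter: takes a serialized document and returns a
// serialized document, so the output of one step can feed the next.
function makeStringFilter(discardedRelations) {
  return (serialized) => {
    const segs = JSON.parse(serialized); // stand-in for loading the XML
    const kept = segs.filter(
      (s) => s.role === "nucleus" || !discardedRelations.includes(s.relation)
    );
    return JSON.stringify(kept); // stand-in for the transformed output string
  };
}

// Two filters, in the spirit of sXSL1, sXSL2, ...
const stylesheets = [
  makeStringFilter(["background", "preparation"]),
  makeStringFilter(["exercise"]),
];

// Hypothetical input document, already serialized.
let sResult = JSON.stringify([
  { role: "nucleus", text: "Main point." },
  { role: "satellite", relation: "background", text: "Some context." },
  { role: "satellite", relation: "exercise", text: "Try it yourself." },
  { role: "satellite", relation: "example", text: "For instance..." },
]);

// The cascade: each filter consumes the previous result.
for (const applyFilter of stylesheets) {
  sResult = applyFilter(sResult);
}

console.log(sResult);
```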

18 Experimentation
The main objective of the experimentation is to validate the hypotheses expressed in the filtering rules, and also the actual filtering mechanism of the CSA, by letting people judge the generated documents.

19 Demo

20 Demo

21 Conclusions
Increase the size of the corpus:
– As long as this is done following the same DTD and RST model, the algorithm will not have to change at all.
Augment the user model:
– A new user aspect requires only a new filter
– New values for an existing user aspect require a change in the corresponding filter
Therefore none of these modifications increases the complexity of the system, and none is difficult to implement.

22 Questions
Comments
Further information
Suggestions
Thank you for your attention.
This research work has been partly supported by the Basque Government

