Broadening Horizons: Using Language Technologies in Publishing Systems ICT PSP call identifier: CIP-ICT-PSP-2009-3 Theme 5: Multilingual.


1 Broadening Horizons: Using Language Technologies in Publishing Systems  ICT PSP call identifier: CIP-ICT-PSP-2009-3  Theme 5: Multilingual Web  5.3 Multilingual Web content management - methods, tools and processes

2 The information today  Flood of multilingual and heterogeneous information  The challenge: The information has to be processed and analyzed in order to be used more efficiently

3 The information today  Increasing amount of multilingual and heterogeneous information

4 The information today

5 The Language Technologies (LT)  Computers process information; humans understand it.  Computers have limited capacity to understand information; humans have limited capacity to process it.  NLP technologies improve how well computers understand information and thus increase human productivity.

6 Overview  The NLP technologies by examples  NLP in practice – the ATLAS project  Conclusions  Questions

7 NLP by examples (1)  Divide and Conquer  Grouping the information:  By importance

8 NLP by examples (1)  Divide and Conquer  Grouping the information  By importance  Automatic text categorization  Politics (24)  Sports (5)  Entertainment (5)  Technologies (12)  Science (20)  Rumors (6)  Other (10)
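The category counts on this slide can be illustrated with a minimal sketch of keyword-based categorization. This is not the method used in the presentation or in ATLAS; the keyword lists, the function names and the sample documents are all hypothetical, and a real categorizer would normally be trained on labeled documents.

```python
import re
from collections import Counter

# Hypothetical keyword lists, for illustration only.
# A production system would learn category models from labeled data.
CATEGORY_KEYWORDS = {
    "Politics": {"election", "parliament", "minister"},
    "Sports": {"match", "league", "goal"},
    "Science": {"physics", "experiment", "research"},
}


def categorize(text):
    """Return the category whose keywords overlap the text most, else 'Other'."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {cat: len(words & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Other"


def count_by_category(documents):
    """Count how many documents fall into each category."""
    return Counter(categorize(doc) for doc in documents)


# Made-up sample documents.
docs = [
    "The minister addressed parliament before the election.",
    "A late goal decided the league match.",
    "The physics experiment confirmed earlier research.",
    "A recipe for apple pie.",
]
counts = count_by_category(docs)
for category, n in counts.items():
    print(f"{category} ({n})")
```

Documents that match no keyword fall into "Other", which mirrors the catch-all category on the slide.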

9 NLP by examples (1)  Divide and Conquer  Grouping the information:  By importance  Automatic categorization  Text clustering  Politics (24)  International affairs (12), Conflicts (3), Terrorism (5), Nature and Environment (8),...  Science (20)  Math (2), Physics (5), Nature and Environment (3), NLP technologies (4),...  Other (10)  Money and Banks (3), Richard Branson (4), Learning materials (3),...

10 NLP by examples (1)  Temporal dynamics  Before, Now, Tomorrow?  Politics (24 + 3)  International affairs (10 -2), Conflicts (3), Terrorism (6 +1), Nature and Environment (10 +2),...
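The "(count change)" notation on this slide amounts to comparing two snapshots of cluster sizes. A minimal sketch, assuming the earlier snapshot is the set of Politics sub-clusters from the previous slide and the later one is this slide; the function name `count_changes` is hypothetical.

```python
# Cluster sizes taken from the slides: "before" is the previous slide,
# "now" is this slide. Only the four listed sub-clusters are included.
before = {
    "International affairs": 12,
    "Conflicts": 3,
    "Terrorism": 5,
    "Nature and Environment": 8,
}
now = {
    "International affairs": 10,
    "Conflicts": 3,
    "Terrorism": 6,
    "Nature and Environment": 10,
}


def count_changes(before, now):
    """Return the per-topic difference (now minus before), sorted by topic name."""
    topics = sorted(set(before) | set(now))
    return {t: now.get(t, 0) - before.get(t, 0) for t in topics}


changes = count_changes(before, now)
for topic, delta in changes.items():
    # The "+d" format always prints the sign, e.g. -2, +0, +1.
    print(f"{topic}: {now.get(topic, 0)} ({delta:+d})")
```

Topics that appear in only one snapshot are treated as having a count of zero in the other, so newly emerging or vanished clusters show up as pure gains or losses.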

11 NLP by examples (2)  We do value your opinion!  Positive, negative or objective?
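A toy illustration of the positive / negative / objective distinction, assuming a tiny hand-made word list. The lexicon and the function name are hypothetical, and this is a simplification rather than the approach behind the slide: real opinion mining has to handle negation, context and domain vocabulary.

```python
import re

# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}


def classify_opinion(text):
    """Label text as positive, negative or objective by counting lexicon hits."""
    words = re.findall(r"[a-z]+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    # No opinion words, or an even mix of both, ends up here.
    return "objective"


print(classify_opinion("I love this excellent book"))
print(classify_opinion("The service was terrible"))
print(classify_opinion("The book has 300 pages"))
```

One known weakness of this sketch is that a text with equally many positive and negative words is labeled "objective", although it is really mixed.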

12 NLP by examples (3)  Salient excerpts  Persons  politicians, actors, scientists, fictional characters  Organizations and Institutions  NATO, EU, BAS, Bank of England, Google, Apple, …  Geographical locations  Bulgaria, Sofia, EU, Western Europe, Tibet  Dates  Example: Steven Paul Jobs [person] was born in San Francisco [city] on February 24, 1955 [date]
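The example sentence on this slide can be used for a minimal sketch of entity extraction. It assumes a regular expression for English "Month D, YYYY" dates and a small hand-made list of known locations; both are hypothetical simplifications. Recognizing persons and organizations reliably requires trained models or much larger gazetteers, which this sketch does not attempt.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
# Matches dates written like "February 24, 1955".
DATE_PATTERN = re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b")

# Hypothetical mini-gazetteer; the names are taken from the slide.
KNOWN_LOCATIONS = ("San Francisco", "Sofia", "Bulgaria", "Tibet")


def find_dates(text):
    """Return all 'Month D, YYYY' dates found in the text."""
    return DATE_PATTERN.findall(text)


def find_locations(text):
    """Return the known locations that occur in the text."""
    return [loc for loc in KNOWN_LOCATIONS if loc in text]


sentence = "Steven Paul Jobs was born in San Francisco on February 24, 1955"
print(find_dates(sentence))
print(find_locations(sentence))
```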

13 NLP by examples (3)  Salient excerpts  Jobs was a demanding perfectionist who always aspired to position his businesses and their products at the forefront of the information technology industry by foreseeing and setting trends, at least in innovation and style...  As of October 9, 2011, Jobs is listed as primary inventor related to a range of technologies from actual computer and portable devices to user interfaces...

14 NLP by examples (3)  Salient excerpts  Jobs was a demanding perfectionist who always aspired to position his businesses and their products at the forefront of the information technology industry by foreseeing and setting trends, at least in innovation and style...  As of October 9, 2011, Jobs is listed as primary inventor related to a range of technologies from actual computer and portable devices to user interfaces...

15 NLP by examples (4)  You might also be interested in this and that …  Suggestions for similar content  According to the textual information  According to the persons, locations and dates  According to the key concepts and ideas  According to the genre and fictional characters  Cross-lingual Information Retrieval
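Suggesting similar content "according to the textual information" can be sketched with bag-of-words cosine similarity. This is one common technique chosen for illustration, not necessarily what the presenters used; the function names and sample texts are hypothetical, and the sketch is monolingual, so it does not cover cross-lingual retrieval.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word occurrences in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, documents):
    """Return the document whose word counts are closest to the query's."""
    q = bag_of_words(query)
    return max(documents, key=lambda d: cosine_similarity(q, bag_of_words(d)))


# Made-up sample documents.
library = [
    "Apple releases a new phone",
    "The bank raised interest rates",
]
print(most_similar("new phone from Apple", library))
```

The same scoring could be applied to extracted persons, locations or key concepts instead of raw words, which corresponds to the other suggestion criteria listed on the slide.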

16 NLP by examples (5)  Machine translation  Text summarization  Of a single document  Of a collection of documents
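Summarization of a single document can be sketched as extractive summarization: score each sentence by the average frequency of its words in the whole text and keep the top-scoring ones. This is a deliberately simple stand-in, not the summarizer described in the presentation; the stopword list, the function name and the sample text are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, very short English stopword list.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "for", "on"}


def content_words(text):
    """Lowercase words of the text with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=1):
    """Keep the sentences whose words are, on average, most frequent in the text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Emit the selected sentences in their original order.
    return " ".join(s for s in sentences if s in top)


text = (
    "Language technology helps publishers. "
    "Language technology helps readers find content. "
    "The weather was nice."
)
print(summarize(text))
```

Summarizing a collection of documents could reuse the same scoring over the concatenated texts, though avoiding redundant sentences across documents would need extra logic that this sketch omits.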

17 NLP in practice – ATLAS project  ATLAS – a multilingual content management system that harnesses NLP technologies  Supported languages: Bulgarian, English, German, Polish, Romanian and Greek.  www.atlasproject.eu  ATLAS extracts and provides:  Key phrases and named entities  A list of similar documents  Automatic categorization and text summary  Machine translation  Using ATLAS:  Software-as-a-service: http://i-publisher.atlasproject.eu  API for integration with 3rd-party systems

18 The ATLAS project  ICT PSP project  ATLAS consortium:  Coordinator: Tetracom Interactive Solutions – Bulgaria  DFKI - Deutsches Forschungszentrum Fuer Kuenstliche Intelligenz GmbH – Germany  Atlantis Consulting SA – Greece  Institute for Bulgarian Language “Professor Lyubomir Andreychin” at the Bulgarian Academy of Sciences – Bulgaria  Instytut Podstaw Informatyki Polskiej Akademii Nauk – Poland  Universität Hamburg – Germany  Universitatea Alexandru Ioan Cuza – Romania  Sveučilište u Zadru – Croatia  ITD - Institute of Technologies and Development – Bulgaria  Project duration  3 years, starting 1st March 2010

19 Conclusion?  What are the NLP technologies?  They provide a way to harness the computational resources of computers for better understanding of information  What can they be used for?  Handling the increasing amount of multilingual information more effectively  Who can use these technologies?  Libraries  Publishing houses  Media  Online bookstores  Lawyers  Banks, companies and organizations

20 Questions... ?

