Published by Domenic Robinson. Modified over 8 years ago.
1
Broadening Horizons: Using Language Technologies in Publishing Systems. ICT PSP call identifier: CIP-ICT-PSP-2009-3. Theme 5: Multilingual Web. 5.3 Multilingual Web content management: methods, tools and processes.
2
The information today: a flood of multilingual and heterogeneous information. The challenge: the information has to be processed and analyzed before it can be used efficiently.
3
The information today: an increasing amount of multilingual and heterogeneous information.
4
The information today
5
Language Technologies (LT). Computers process information; humans understand it. Computers have limited capacity to understand information; humans have limited capacity to process it. NLP technologies improve how well computers understand information and thereby increase human productivity.
6
Overview: NLP technologies by example; NLP in practice: the ATLAS project; Conclusions; Questions.
7
NLP by examples (1): Divide and Conquer. Grouping the information: by importance.
8
NLP by examples (1): Divide and Conquer. Grouping the information: by importance; by automatic text categorization. Example categories with document counts: Politics (24), Sports (5), Entertainment (5), Technologies (12), Science (20), Rumors (6), Other (10).
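The category counts on this slide can be illustrated with a minimal sketch. It assumes a hand-written keyword list per category (the `CATEGORY_KEYWORDS` table and both function names are hypothetical, not taken from the presentation); a production categorizer would normally be trained on labelled documents instead.

```python
import re
from collections import Counter

# Hypothetical keyword lists, for illustration only.
CATEGORY_KEYWORDS = {
    "Politics": {"election", "parliament", "minister", "treaty"},
    "Sports": {"match", "goal", "league", "tournament"},
    "Science": {"experiment", "physics", "research", "theorem"},
}

def categorize(text):
    """Return the category whose keywords occur most often, or 'Other'."""
    words = re.findall(r"[a-z]+", text.lower())
    scores = {
        category: sum(1 for word in words if word in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # No keyword matched at all: fall back to the catch-all category.
    return best if scores[best] > 0 else "Other"

def count_by_category(texts):
    """Count how many texts fall into each category."""
    return Counter(categorize(text) for text in texts)
```

Calling `count_by_category` on a list of article texts yields a mapping from category to document count, which is the kind of overview the slide displays.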
9
NLP by examples (1): Divide and Conquer. Grouping the information: by importance; by automatic categorization; by text clustering. Example clusters within categories: Politics (24): International affairs (12), Conflicts (3), Terrorism (5), Nature and Environment (8), ... Science (20): Math (2), Physics (5), Nature and Environment (3), NLP technologies (4), ... Other (10): Money and Banks (3), Richard Branson (4), Learning materials (3), ...
10
NLP by examples (1): Temporal dynamics. Before, now, tomorrow? Politics (24 +3): International affairs (10 -2), Conflicts (3), Terrorism (6 +1), Nature and Environment (10 +2), ...
11
NLP by examples (2) We do value your opinion! Positive, negative or objective?
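As a rough illustration of the positive/negative/objective distinction, here is a minimal lexicon-based sketch. The word lists and the `polarity` function are assumptions made for this example; the presentation does not say which method is used, and real systems typically rely on trained classifiers.

```python
import re

# Tiny hypothetical opinion lexicons, for illustration only.
POSITIVE = {"good", "great", "excellent", "love", "useful"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "useless"}

def polarity(text):
    """Label a text as 'positive', 'negative' or 'objective'."""
    words = re.findall(r"[a-z]+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    # No opinion words, or positives and negatives cancel out.
    return "objective"
```

Note that this sketch also labels evenly mixed opinions as "objective", which is a simplification.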
12
NLP by examples (3): Salient excerpts. Persons: politicians, actors, scientists, fictional characters. Organizations and institutions: NATO, EU, BAS, Bank of England, Google, Apple, … Geographical locations: Bulgaria, Sofia, EU, Western Europe, Tibet. Dates. Example: "Steven Paul Jobs was born in San Francisco on February 24, 1955", annotated as person (Steven Paul Jobs), city (San Francisco) and date (February 24, 1955).
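The annotation of the example sentence can be reproduced with a minimal sketch that assumes small hand-made name lists (gazetteers) and one regular expression for English dates. The lists `PERSONS` and `CITIES` and the function `extract_entities` are hypothetical; practical named-entity recognizers use statistical models and much larger resources.

```python
import re

# Hypothetical gazetteers, for illustration only.
PERSONS = {"Steven Paul Jobs"}
CITIES = {"San Francisco", "Sofia"}

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
# Matches dates written like "February 24, 1955".
DATE_RE = re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b")

def extract_entities(text):
    """Return (entity text, label) pairs in order of appearance."""
    found = []
    for label, names in (("person", PERSONS), ("city", CITIES)):
        for name in names:
            for match in re.finditer(re.escape(name), text):
                found.append((match.start(), match.group(), label))
    for match in DATE_RE.finditer(text):
        found.append((match.start(), match.group(), "date"))
    # Sort by position in the text, then drop the position.
    return [(entity, label) for _, entity, label in sorted(found)]

sentence = "Steven Paul Jobs was born in San Francisco on February 24, 1955"
print(extract_entities(sentence))
```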
13
NLP by examples (3): Salient excerpts. "Jobs was a demanding perfectionist who always aspired to position his businesses and their products at the forefront of the information technology industry by foreseeing and setting trends, at least in innovation and style... As of October 9, 2011, Jobs is listed as primary inventor related to a range of technologies from actual computer and portable devices to user interfaces..."
15
NLP by examples (4): You might also be interested in this and that … Suggestions for similar content: according to the textual information; according to the persons, locations and dates; according to the key concepts and ideas; according to the genre and fictional characters. Cross-lingual information retrieval.
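Suggesting similar content "according to the textual information" can be sketched with bag-of-words vectors and cosine similarity. This is one common technique, chosen here as an assumption; the presentation does not state which similarity measure is used, and all function names below are hypothetical.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Map each lowercase word in the text to its frequency."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-frequency vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def most_similar(query, documents, top_n=3):
    """Return the top_n documents ranked by similarity to the query."""
    query_vector = bag_of_words(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vector, bag_of_words(doc)),
        reverse=True,
    )
    return ranked[:top_n]
```

The same ranking could be applied to extracted persons, locations and dates instead of raw words by swapping `bag_of_words` for an entity extractor.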
16
NLP by examples (5): Machine translation. Text summarization: of a single document; of a collection of documents.
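Single-document summarization can be illustrated with a minimal extractive sketch: score each sentence by the average frequency of its words in the whole text and keep the best-scoring sentences. This frequency heuristic is an assumption chosen for brevity, not the method described in the presentation, and the function `summarize` is hypothetical.

```python
import re
from collections import Counter

def summarize(text, max_sentences=1):
    """Return the highest-scoring sentences, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    frequency = Counter(re.findall(r"[a-z]+", text.lower()))

    def score(sentence):
        # Average document-wide frequency of the words in this sentence.
        words = re.findall(r"[a-z]+", sentence.lower())
        return sum(frequency[w] for w in words) / len(words) if words else 0.0

    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    return [s for s in sentences if s in top]
```

The sketch ignores stop words and sentence position, both of which practical summarizers usually take into account.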
17
NLP in practice: the ATLAS project. ATLAS is a multilingual content management system that harnesses NLP technologies. Supported languages: Bulgarian, English, German, Polish, Romanian and Greek. www.atlasproject.eu ATLAS extracts and provides: key phrases and named entities; a list of similar documents; automatic categorization and a text summary; machine translation. Using ATLAS: as software-as-a-service at http://i-publisher.atlasproject.eu; through an API for integration with 3rd-party systems.
18
The ATLAS project: an ICT PSP project. ATLAS consortium: Coordinator: Tetracom Interactive Solutions (Bulgaria); DFKI, Deutsches Forschungszentrum für Künstliche Intelligenz GmbH (Germany); Atlantis Consulting SA (Greece); Institute for Bulgarian Language "Professor Lyubomir Andreychin" at the Bulgarian Academy of Sciences (Bulgaria); Instytut Podstaw Informatyki Polskiej Akademii Nauk (Poland); Universität Hamburg (Germany); Universitatea Alexandru Ioan Cuza (Romania); Sveučilište u Zadru (Croatia); ITD, Institute of Technologies and Development (Bulgaria). Project duration: 3 years, starting 1 March 2010.
19
Conclusion? What are NLP technologies? They provide a way to harness the computational resources of computers for better understanding of information. What can they be used for? Handling the increasing amount of multilingual information more effectively. Who can use these technologies? Libraries; publishing houses; media; online bookstores; lawyers; banks, companies and organizations.
20
Questions... ?