
1 INTEGRATION OF THE TEXTUAL DATA FOR INFORMATION RETRIEVAL : RE-USE THE LINGUISTIC INFORMATION OF VICINITY Omar LAROUK ELICO -ENS SIB University of Lyon-France.


1 INTEGRATION OF THE TEXTUAL DATA FOR INFORMATION RETRIEVAL: RE-USE THE LINGUISTIC INFORMATION OF VICINITY. Omar LAROUK, ELICO - ENSSIB, University of Lyon, France. Computational linguistics & Information Science. École Nationale Supérieure des Sciences de l’Information et des Bibliothèques (ENSSIB-Lyon), Université de Lyon, France

2 FRAMEWORK PRESENTATION - LARGE-SCALE PRODUCTION OF TEXTS (web) - LARGE-SCALE RETRIEVAL OF TEXTS BY USERS. AIM OF PROJECT: LOOK FOR A SYSTEM THAT: - AUTOMATES THE MANUAL PROCESS OF INDEXING - STRUCTURES THE INFORMATION - MAKES IT POSSIBLE TO OBTAIN ‘pertinent’ INFORMATION

3 FRAMEWORK PRESENTATION METHOD: Elaboration of filtering DATABASES from information extracted from the textual data – Use of linguistic techniques for processing documents (textual data). – Search strategies that allow the user to exploit documents.

4 Why a linguistic reference in information retrieval? We consider that searching for information rests on a presupposition made by the speaker, in a context that is first of all that of the indexing of the document (at creation time), as if the designer had uttered a contextual sentence. The absence of the presupposed words from the answer given by the server (search engine) can be interpreted as this sentence failing to be true in that indexing context.

5 Information Retrieval on the WEB. Co-operative process for reformulation: the user(s) send queries Q_0, Q_1, ..., Q_k, ..., Q_n to the WEB (ex: Google, Yahoo) and receive answers A_0, A_1, ..., A_k, ..., A_n (Title, Abstract, ...).

6 Linguistic problem of Search Engines: A search engine is used to retrieve electronic documents on the Web, but it does not use linguistic analysis. When a user types a query: – the engine returns a ranked list of documents that contain exactly all the words of the request (query processing). – to find documents, the user has to write the query as a few words, and the search engine returns the ranked URLs of the documents containing exactly those words.
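The exact-match behaviour described on this slide can be illustrated with a minimal sketch. The toy corpus, the URLs and the function names below are assumptions made for illustration only; they do not come from the presentation, and ranking is omitted (results are simply sorted by URL).

```python
from collections import defaultdict

# Hypothetical toy corpus (URL -> text), invented for this illustration.
DOCS = {
    "http://example.org/a": "the white flag was raised",
    "http://example.org/b": "a black and white flag",
    "http://example.org/c": "flags of many colours",
}

def build_index(docs):
    """Map each surface word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in docs.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def exact_search(index, query):
    """Return the URLs containing exactly all the words of the query."""
    words = query.lower().split()
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)

INDEX = build_index(DOCS)
print(exact_search(INDEX, "white flag"))
print(exact_search(INDEX, "flag"))
print(exact_search(INDEX, "flags"))
```

Because the index stores surface forms only, the query "flag" does not reach the document that contains "flags". This is the kind of gap that a linguistic analysis (here, even simple lemmatization) would close.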

7 User-queries / system-answers steps: Q_0 → A_0 (step s_0), Q_1 → A_1 (step s_1), ..., Q_k → A_k (step s_k), ..., Q_n-1 → A_n-1 (step s_n-1), Q_n → A_n (step s_n). A_n is the set of filtered final answers, oriented by the user's needs.

8 Documentary information system. Documentary automation is blocked by the problem of indexing and querying. Use natural language as textual data for automatic indexing, because language integrates contextual and temporal factors through connectors, verbs, adverbs, etc. We use the LINGUISTIC APPROACH.

9 Predicates. Queries: simple or complex predicates. Questions: simple predicates: – /project/, /law/, /station/, /flag/, /black/, /white/, ... Combining predicates: – /white flag/, /white and black/, ... Propagation of the predicates to the left and/or right: – /... project of law .../ ; – /... flag white and black .../ ; ...
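One possible reading of the left/right propagation of a predicate can be sketched as follows. The function name `propagate`, the `reach` parameter and the whitespace tokenization are assumptions for illustration; the presentation does not specify an algorithm.

```python
def propagate(tokens, core, reach=2):
    """Extend the core predicate by up to `reach` tokens on each side.

    Returns every contiguous phrase that contains the core predicate and
    stays inside the token sequence. Raises ValueError if `core` is absent.
    """
    i = tokens.index(core)
    phrases = []
    for left in range(reach + 1):
        for right in range(reach + 1):
            start = i - left
            end = i + right + 1
            if start < 0 or end > len(tokens):
                continue  # the extension would leave the sentence
            phrases.append(" ".join(tokens[start:end]))
    return phrases

phrases = propagate(["the", "project", "of", "law"], "project")
for phrase in phrases:
    print(phrase)
```

Starting from the simple predicate /project/, the right-hand extensions reach the combined predicate /project of law/ shown on the slide.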

10 LINGUISTIC INFORMATION OF VICINITY: Hierarchic informational. In French: "Les produits informatiques de la société". In English: "computer products company".

16 The problem of linearity in multilingual information searching. The scope can extend to the left or to the right of the center of the NP: predicate -n, …, predicate -1, predicate 0, predicate +1, …, predicate +n, that is: predicate -n, …, predicate -1, Center_NP, predicate +1, …, predicate +n. Linearity in French: / une jeune -1 fille 0 blonde +1 dynamique +2 / Linearity in English: / A young -1 blonde +1 girl 0 dynamic +2 /
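The position labels around the center of the NP can be computed with a short sketch. The function name `label_offsets` is an assumption, the determiner is left out as on the slide, and tokens are assumed to be unique within the phrase. Note that the slide's English line carries the French offsets over to the English words, whereas this sketch computes the surface offsets in each language separately.

```python
def label_offsets(tokens, center):
    """Label each token with its signed distance to the center of the NP."""
    c = tokens.index(center)
    # Assumes each token occurs once, so it can serve as a dictionary key.
    return {token: i - c for i, token in enumerate(tokens)}

french = label_offsets(["jeune", "fille", "blonde", "dynamique"], "fille")
english = label_offsets(["young", "blonde", "girl", "dynamic"], "girl")
print(french)
print(english)
```

The predicate "blonde" sits at +1 in the French order but at -1 in the English surface order, which is the linearity problem a multilingual search has to handle.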

17 Reformulation of questions. We propose to process natural language by reformulating queries in a human-computer dialogue. Reformulating natural-language questions is a classical technique for information retrieval in databases, but the new question/answer systems are very open with respect to web content. We present a study based on the formulation of ‘key questions’ built on ‘core predicates’, enriched to the left or to the right of the center of terms.

18 How does the search engine give responses? For example, if you search for 'informati*', the results do not include 'computerization' (the English equivalent of the French 'informatisation', ….).
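The truncation example can be reproduced with a minimal sketch using Python's standard `fnmatch` module. The vocabulary list and the function name `truncation_search` are assumptions for illustration; real search engines implement truncation in their own ways.

```python
from fnmatch import fnmatchcase

# Hypothetical mixed French/English vocabulary, invented for this illustration.
VOCAB = ["information", "informatique", "informatisation",
         "computerization", "computer"]

def truncation_search(pattern, vocabulary):
    """Return the words matching a wildcard pattern such as 'informati*'."""
    return [word for word in vocabulary if fnmatchcase(word, pattern)]

matches = truncation_search("informati*", VOCAB)
print(matches)
```

The wildcard only compares character strings: it finds the French "informatisation" but cannot reach its English equivalent "computerization", since the two words share no prefix.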

19 Lack of language technology in search engines. The following search tools and web databases are not equipped with linguistic tools: - Information retrieval in structured documents, - Multilingual information retrieval, - Personalized information retrieval, - Automatic processing of natural language, - Ontology-based information search, - Multimedia information retrieval, - Question answering, - etc.

