LRI Université Paris-Sud ORSAY Nicolas Spyratos Philippe Rigaux
Université Paris-Sud One of the largest scientific universities in France Five campuses Scientific campus located at Orsay (about 25 km south of Paris) Students Over ten departments (physics, mathematics, computer science…)
Department of Computer Science 250 members (researchers, teachers) Currently offering 16 programs Two laboratories: LRI (11 research groups) LIMSI (8 research groups) Funding: Government, CNRS, European projects
SeLeNe related activities Nicolas Databases Conceptual modeling Information integration Philippe Databases (including spatial DB) A strong practical experience in Web environments based on XML Nicolas + Philippe : document integration and restructuring
Motivation In a nutshell: collaborative production of [e-Learning] documents Some preliminary ideas … Authors produce documents A system manages the set of documents Users create new documents by assembling/restructuring existing ones A scenario based on a cooperative, distributed, e-learning system. … and many questions
Preliminary ideas: authors Author = content producer Uses his own structure and vocabulary Stores his documents in his own repository Author = a conscious part of a collaborative system Provides a description of his documents to the system Commits to maintaining an up-to-date and available version of each document
Preliminary ideas: the system The system enables cooperation between authors It knows the description provided by each author It can access (and possibly store locally) the documents The system acts as a mediator for users It defines a uniform view for all the documents It provides querying and restructuring services to create new documents
Preliminary ideas: the user The user publishes documents In a specific form (a book, a portal, a set of slides) Using specific choices for the content and the structure The user creates new (derived) documents by Extracting fragments from the documents managed by the system Authoring his own fragments, then integrating them with the extracted ones Materializing at will the result
Keywords Content management How to structure (e-Learning) content and how to describe this structure Content integration How to provide a uniform “view” to query documents and extract fragments Deriving and restructuring documents How to create new documents by assembling fragments of existing ones
A simple scenario Three authors, A, B and C, cooperate to produce a course on database systems Author A produces content on data modeling An introduction to the topic Chapters on database design, the relational model and SQL Author B produces content on system aspects Database indexing Query processing and optimization
Contents description Each author uses his own terminology to describe his documents A fragment is any identifiable subset of a document Any fragment must be indexed under some term.
The system We assume a commonly agreed structure for the area of databases Each author must provide a mapping between his terminology and the system’s terminology The system provides query facilities
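The indexing and mapping idea described above can be sketched as follows. This is only a minimal illustration, not the system's actual design: the `Fragment` class, the `MAPPINGS` table, the functions `system_term` and `query`, and all terms and texts are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    author: str   # who owns the source document
    doc_id: str   # identifier of the source document
    term: str     # index term in the author's OWN terminology
    text: str     # content of the fragment


# Hypothetical mappings from each author's terms to the system's terms.
MAPPINGS = {
    "A": {"intro": "introduction", "ER design": "database design"},
    "B": {"B-trees": "indexing", "optimizer": "query processing"},
}


def system_term(fragment: Fragment) -> str:
    """Translate the author's index term into the system's terminology."""
    return MAPPINGS[fragment.author][fragment.term]


def query(fragments: list[Fragment], term: str) -> list[Fragment]:
    """Return every fragment that maps to the given system term."""
    return [f for f in fragments if system_term(f) == term]


# Illustrative fragments, invented for this sketch.
FRAGMENTS = [
    Fragment("A", "a-course", "intro", "What is a database?"),
    Fragment("A", "a-course", "ER design", "Entities and relationships."),
    Fragment("B", "b-course", "B-trees", "Balanced search trees."),
]

hits = query(FRAGMENTS, "indexing")
```

Users query with the system's terms only; each author keeps his own vocabulary, and the mapping is the single place where the two meet.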
Deriving new documents Structure: The user is free to choose the structure of the document he composes Composition: Each fragment is Either directly provided by the user Or chosen from the answer to a query addressed to the system
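One possible way to represent such a composed document is as an ordered list of parts, where each part is either text written by the user or a query evaluated when the document is materialized. The sketch below assumes this representation; `materialize`, `ask` and the `INDEX` table are hypothetical stand-ins, not part of any existing system.

```python
from typing import Callable, Union

# A part is either text authored by the user, or a query: a callable
# that returns the texts of the fragments it selects.
Part = Union[str, Callable[[], list[str]]]

# Hypothetical stand-in for the system's query service.
INDEX = {
    "database design": ["Entities and relationships.", "Normal forms."],
}


def ask(term: str) -> Callable[[], list[str]]:
    """Build a query part that looks up a system term when evaluated."""
    return lambda: list(INDEX.get(term, []))


def materialize(parts: list[Part]) -> list[str]:
    """Evaluate query parts and keep user-authored parts as they are."""
    result: list[str] = []
    for part in parts:
        if callable(part):
            result.extend(part())   # fragments chosen from a query answer
        else:
            result.append(part)     # fragment provided directly by the user
    return result


document = materialize(["My own introduction.", ask("database design")])
```

Keeping queries unevaluated until materialization means the derived document can follow updates to the source documents.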
Query refinement A multi-step process: Initial query shows all the relevant fragments known to the system Subsequent steps restrict the fragments to those considered as relevant to the user Ideally: the refined query delivers exactly the relevant fragments and in the right order
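The multi-step process above can be pictured as a chain of filters followed by an ordering. The sketch below is one assumed reading of it: the function `refine`, the dictionary-based fragments and their `kind` and `position` fields are all hypothetical.

```python
def refine(fragments, *steps, order=None):
    """Apply each refinement step in turn, then order the survivors.

    Each step is a predicate that keeps the fragments the user
    considers relevant; `order` is an optional sort key.
    """
    result = list(fragments)
    for keep in steps:
        result = [f for f in result if keep(f)]
    if order is not None:
        result = sorted(result, key=order)
    return result


# Initial query: all fragments on query processing known to the system
# (illustrative data only).
initial = [
    {"author": "B", "kind": "figure", "position": 2},
    {"author": "B", "kind": "text", "position": 1},
    {"author": "B", "kind": "figure", "position": 0},
]

# Refinement: keep only the figures, in document order.
figures = refine(
    initial,
    lambda f: f["kind"] == "figure",
    order=lambda f: f["position"],
)
```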
Example (user/teacher) Author C is now a teacher, creating an introduction to DB. It contains A general introduction (written by C) A query retrieving the introduction written by A A query selecting fragments on database design (retrieved from A’s documents) An introduction to query processing, with queries retrieving figures from B’s documents. Questions: assuming a query returns a set of fragments, how can we make a sub-selection?
Example (user/learner) Author C is now a learner. He will create a document summarizing the courses he is interested in, namely A query retrieving the general introduction to DB (written by C) His own annotations Several queries, whose results will be mixed with the annotations Questions: how can we make queries “user-friendly”? E.g., as a “path” to the relevant fragment? Relying on “metadata”?
Example: personalized documents Author C is now a learner. The system knows the courses followed by C, possibly with other information (frequency, success, etc.) Does this relate to “knowledge trajectories”? => The system automatically maintains and updates the document summarizing the course’s material An instance of the “learning trail” concept?
Questions Primitive versus derived documents (problem of cycles)? How can we select a subpart of a result set? Should we allow users to browse the sources directly? What is the granularity of documents? Is there a need for user views? Should we introduce replication of content, and how?
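To make the problem of cycles concrete: if derived documents may themselves be queried, a document could end up depending on itself. A minimal sketch of a check, assuming the system records for each document the documents it is derived from (the `derives_from` table and the function `find_cycle` are hypothetical), is a depth-first search:

```python
def find_cycle(derives_from):
    """Return a list of document ids forming a cycle, or None.

    `derives_from` maps each document id to the ids of the documents
    it extracts fragments from; primitive documents map to an empty list.
    """
    in_progress, done = set(), set()
    path = []

    def visit(doc):
        in_progress.add(doc)
        path.append(doc)
        for source in derives_from.get(doc, ()):
            if source in in_progress:
                # Found a dependency back to a document on the current path.
                return path[path.index(source):] + [source]
            if source not in done:
                cycle = visit(source)
                if cycle:
                    return cycle
        path.pop()
        in_progress.discard(doc)
        done.add(doc)
        return None

    for doc in list(derives_from):
        if doc not in done:
            cycle = visit(doc)
            if cycle:
                return cycle
    return None


acyclic = {"c-intro": ["a-course", "b-course"], "a-course": [], "b-course": []}
cyclic = {"c-intro": ["a-course"], "a-course": ["c-intro"]}
```

Running such a check before accepting a new derived document is one option; forbidding queries over derived documents altogether is another, and the slide leaves the choice open.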