JTS 2010, 3 May 2010
Context Sensitive Archiving of Videos on the Web
Paper authors: Thomas Drugeon, Valentine Frey, Jérôme Thièvre, Matteo Treleani


1 JTS 2010, 3 May 2010
Context Sensitive Archiving of Videos on the Web
Paper authors: Thomas Drugeon, Valentine Frey, Jérôme Thièvre, Matteo Treleani

2 Ina collections
Current collections
- 60 years of TV programmes and 70 years of radio programmes
- Legal deposit since 1992
- 4,500,000 hours of TV and radio, plus 1,000,000 hours captured live from 102 TV and radio channels each year
Extension to the Web
- Web legal deposit law (2006), shared between BnF and Ina, as an extension to their current collections
- Ina is developing specialized tools and methods to collect, archive, preserve, and give access to this archived web collection
→ Preserve, promote, transmit
Context sensitive archiving on the web | 2 May 2010

3 Web Legal Deposit
Archiving French audiovisual information on the web
→ Focus on audiovisual content
Why not archive only video and audio content from the web? The web is not just a way to access content, it is a medium in its own right
→ Archiving websites related to French audiovisual media
Operational since February 2009. As of April 2010:
- 6,000 websites (3,000 at start)
- 2,500,000,000 “objects”, 260 TB
- 10,000,000 video objects, 100 TB
- 19,000,000 audio objects, 100 TB
→ 260 TB compressed to only 21 TB of storage (DAFF)
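The collection figures on this slide can be cross-checked with a little arithmetic. The sketch below is only an illustration: it assumes decimal terabytes (1 TB = 10^12 bytes), which the slide does not specify, and all variable names are made up for the example.

```python
# Back-of-the-envelope check of the figures quoted on the slide.
# Assumption: 1 TB = 10**12 bytes (the slide does not say which unit is meant).
TB = 10**12

total_objects = 2_500_000_000
total_bytes = 260 * TB
stored_bytes = 21 * TB          # size after compression into DAFF storage
video_objects, video_bytes = 10_000_000, 100 * TB
audio_objects, audio_bytes = 19_000_000, 100 * TB

# How much smaller the stored archive is than the collected volume.
reduction_ratio = total_bytes / stored_bytes
space_saved = 1 - stored_bytes / total_bytes

# Average object sizes.
avg_object_kb = total_bytes / total_objects / 1_000
avg_video_mb = video_bytes / video_objects / 1_000_000
avg_audio_mb = audio_bytes / audio_objects / 1_000_000

# Share of audio and video in object count versus volume.
av_object_share = (video_objects + audio_objects) / total_objects
av_volume_share = (video_bytes + audio_bytes) / total_bytes

print(f"reduction ratio: {reduction_ratio:.1f}x ({space_saved:.1%} saved)")
print(f"average object: {avg_object_kb:.0f} KB")
print(f"average video: {avg_video_mb:.1f} MB, average audio: {avg_audio_mb:.1f} MB")
print(f"audio and video: {av_object_share:.1%} of objects, {av_volume_share:.1%} of volume")
```

Under that assumption, audio and video make up a little over 1% of the objects but about three quarters of the collected volume, which fits the stated focus on audiovisual content.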

4 Methods
The web is not a broadcast medium: no stream to capture, no explicit path to follow
- The web responds to interactions. We have to discover and recreate these interactions to archive it → crawling
- Websites grow and change in heterogeneous ways. We have to visit a page to know whether it was updated → sampling
- Accessing the archive means browsing it. We have to recreate the interactions to make the archive browsable → simulating
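The sampling point, that a page must be revisited to learn whether it changed, can be sketched as a revisit loop that compares content digests. This is a minimal illustration and not Ina's crawler: the `PageState` class, the `revisit` function, the injected `fetch` callable, and the halve-or-double interval policy are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PageState:
    """Hypothetical per-page sampling state."""
    url: str
    interval_hours: float = 24.0
    last_digest: Optional[str] = None


def revisit(state: PageState, fetch: Callable[[str], bytes],
            min_hours: float = 1.0, max_hours: float = 720.0) -> bool:
    """Fetch the page once and report whether it changed since the last visit.

    Assumed policy: halve the revisit interval when a change is seen,
    double it when the page is unchanged. The first visit never counts
    as a change because there is nothing to compare against.
    """
    digest = hashlib.sha256(fetch(state.url)).hexdigest()
    first_visit = state.last_digest is None
    changed = not first_visit and digest != state.last_digest
    if changed:
        state.interval_hours = max(min_hours, state.interval_hours / 2)
    elif not first_visit:
        state.interval_hours = min(max_hours, state.interval_hours * 2)
    state.last_digest = digest
    return changed


# Usage with an in-memory stand-in for the network.
fake_web = {"http://example.org/": b"version 1"}
state = PageState("http://example.org/")
first = revisit(state, fake_web.__getitem__)      # first visit, nothing to compare
second = revisit(state, fake_web.__getitem__)     # unchanged, interval doubles
fake_web["http://example.org/"] = b"version 2"
third = revisit(state, fake_web.__getitem__)      # changed, interval halves
```

Any update that appears and disappears between two visits is never observed, which is the sampling limit that such a scheme cannot avoid.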

5 Limits
Crawling
- Some interactions cannot be crawled, and thus some content will be missing or altered in the archive (pages or parts of pages)
- Dead web (train reservation, Google search, etc.)
Sampling
- Some updates will be missing
- Linked pages are crawled at a different date from the original page
Simulating
- Some interactions are lost (crawling issues)
- Temporal inconsistencies between pages (sampling issues)
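The temporal inconsistency described here can be made concrete by comparing the capture date of a page with the capture dates of the resources it links to. The following is a hypothetical sketch: the function names, the example URLs, and the six-hour tolerance are illustrative assumptions, not part of any tool mentioned in the slides.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def capture_skew(page_captured_at: datetime,
                 linked_captures: Dict[str, datetime]) -> Dict[str, timedelta]:
    """Absolute time gap between a page capture and each linked capture."""
    return {url: abs(ts - page_captured_at)
            for url, ts in linked_captures.items()}


def inconsistent_links(page_captured_at: datetime,
                       linked_captures: Dict[str, datetime],
                       tolerance: timedelta) -> List[str]:
    """URLs whose capture is further from the page capture than the tolerance."""
    skew = capture_skew(page_captured_at, linked_captures)
    return sorted(url for url, gap in skew.items() if gap > tolerance)


# Made-up capture dates for one page and two linked resources.
page_time = datetime(2010, 4, 1, 12, 0)
links = {
    "http://example.org/video.flv": datetime(2010, 4, 1, 12, 5),
    "http://example.org/comments": datetime(2010, 4, 3, 9, 0),
}
flagged = inconsistent_links(page_time, links, timedelta(hours=6))
print(flagged)
```

In this made-up case only the comments page is flagged, since it was captured almost two days after the page that links to it.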

6 Web Archaeology
The consequence of technical problems: non-integrity of web documents
- Integrity: the document hasn’t been altered (Lynch, 1994)
- Authenticity: the document is what it claims to be (Duranti, 2001)
- Reliability: we can trust the document and its content (Bachimont, 2009)
How to preserve authenticity and reliability without depending on material integrity?
Reconstructing the meaning of the document through traces (a sort of archaeological practice)
DlWeb archives traces

7 Web Archiving: pre-eminence of the meaning
Meaning precedes the material form.
Context influences the meaning of a video posted on the web, but not all items of the context have the same impact on interpretation.
Preserving the meaning of a video posted on the web means preserving the significant elements of its context.
We thus have to find the elements influencing the meaning.

8 Example: The relocation of the Eiffel Tower
Ina.fr posted a news programme from 1964 announcing that the Eiffel Tower was to be relocated. The video caused a buzz on the Web.

9 Example: The relocation of the Eiffel Tower
How do we find which elements of the context to preserve in order to safeguard the archival value of the video (its correct interpretation)?
A methodological approach: the commutation test (from linguistics). Substituting one item of the expression may modify the meaning, e.g. changing a phoneme of a word (peer – beer).
Applied to the context of a video, an element whose substitution changes the interpretation is a significant one.

10 How to reconstruct the meaning in complex documents?
Where is the document and where is the context? Web documents are often complex and refer to a wide spectrum of cultural elements.
Hypothesis: we can reconstruct the meaning through narrativization.
Narrativization can be based on the search for clues. This is the critical historical approach that Ginzburg calls the “evidential paradigm” (clues are in this case the significant elements found through the commutation test).
A Sherlock Holmes approach…

11 Example: narrativization based on clues
The Dailymotion channel of Gameblog.fr posts a France 2 news report from 21 November 2004 and explains that its content was an amalgam of fake news.
The report announces a collective suicide in Japan: 147 people are said to have killed themselves because of a delay in the release of a video game (Dead or Alive). They swallowed some sachets of silicon…

12 Example: narrativization based on clues
A link in a comment allows us to better understand what happened. France 2 cited an article published in the newspaper Libération, reporting a collective suicide in March 2004. The source of the article was a blog post.

13 Example: narrativization based on clues
The post was satirical: it appeared in the webzine Xbox Mag to mock gamers' excessive interest in the release of this product.

14 Example: narrativization based on clues
The editors of Xbox Mag notified France 2 and Libération of the error. On 25 November, Libération published a correction. On 26 November, France 2 acknowledged the error, blaming the “Anglo-Japanese press” (its only source was Libération).

15 The example reveals:
- The complexity of a web document
- The problem of the completeness of traces
- The intrinsic value of a web document
To understand the facts we need no fewer than 3 web pages, often not interrelated:
- The video posted on Dailymotion
- The original post on Xbox Mag
- The post on Xbox Mag explaining the errors
The Web always refers to (and remediates) other media:
- The archival video of France 2 (conserved at Inathèque)
- The press: Libération
Web archiving is the most complete way to reconstruct these events (TV and press are not sufficient)

16 How to help reconstruct the narration?
DlWeb archives traces
- Give the researcher access to all available technical and methodological information (i.e. the archiving context) → clues
- Develop tools to help the researcher organise and exploit these clues → methodological DlWeb workshops with audiovisual researchers, archivists and documentalists
- Improve completeness

