
Managing Change on the Web Luis Francisco-Revilla Frank M. Shipman Richard Furuta Unmil Karadkar Avital Arora Center for the Study of Digital Libraries.


1 Managing Change on the Web Luis Francisco-Revilla Frank M. Shipman Richard Furuta Unmil Karadkar Avital Arora Center for the Study of Digital Libraries Texas A&M University

2 What is this talk about? A system approach to help in managing digital libraries with collections of fluid resources with distributed location and ownership

3 Modern paradigms of digital libraries. Pointers rather than the resources. Web-based collections. NSDL (http://www.ehr.nsf.gov/due/programs/nsdl/). Meta-documents. High fluidity. Changes vary in relevance. Little system aid for assessing relevance of changes.

4 This is a problem everybody has: bookmark lists, Yahoo! catalogues, search engine indices.

5 Related work. David Johnson, PhD dissertation, University of Washington: document distance; weighted, asymmetric. Change monitoring systems: AIDE, URL Minder, WatzNew; fine-grained yes/no detection; WebWatcher (evolving). “Interesting” identification: Syskill & Webert, Do-I-Care-Agent, Letizia; personal, reader-specific, profile-based.

6 Motivation. Managing Walden’s Paths collection. Paths are meta-documents: sequential arrangement of Web pages; rhetorically coherent; contextualized. Distributed ownership. Distributed authorship. Continuous revision of the collection.

7 Mechanisms for addressing the issue. Caching the pages. Caching strategies. Some changes are desirable. Fluid paths. Ephemeral paths. Rhetorical coherence.

8 The real issue. Mechanisms only allowed limited reaction to changes. Detecting changes is easy, but determining their relevance is difficult. Humans are still required to determine the significance of changes. In order to react to changes, the assessment of their relevance is required.

9 The perception of change (overview). Observe how humans perceive changes to Web pages. Inform and evaluate the approach and design. Questions: 1. Do people view the same changes in a different way when given different amounts of time? 2. What kinds of changes are easily perceived? 3. What kinds of changes do users want to be notified of?

10 Kinds of change. Content changes (what). Presentation changes (how). Structural changes (linking). Behavioral changes.

11 Results and implications. Presentation changes were usually perceived as irrelevant. The desire for notification and the perception of overall change increased with the degree of content change. Time played a larger role in the perception of structural changes than in the perception of content changes. As the degree of structural change increased, so did the desire for notification. Links are useful metrics.

12 Path Manager: the system. Java-based. Paths or bookmark lists. HTML pages. Functional state of the document: original, valid, last-time.
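The three functional states can be illustrated with a small record per page. This is a minimal sketch in Python (Path Manager itself is Java based), and the semantics are an assumption, since the slide only names the states: "original" is taken to be the page as first added to the path, "valid" the last version a maintainer accepted, and "last-time" the version seen at the previous check. The class name PageStates and its methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PageStates:
    """Hypothetical record of the three states kept for one page.

    Each field holds any comparable summary of the page, for example
    a checksum string. The meaning of each state is an assumption.
    """
    original: str   # page as it was when added to the path
    valid: str      # last version accepted by a maintainer
    last_time: str  # version seen at the previous check

    @classmethod
    def first_seen(cls, summary):
        # When a page is first added, all three states coincide.
        return cls(summary, summary, summary)

    def record_check(self, current):
        """Compare the current summary with each stored state,
        then remember it as the new last-time state."""
        report = {
            "since_original": current != self.original,
            "since_valid": current != self.valid,
            "since_last_time": current != self.last_time,
        }
        self.last_time = current
        return report

    def accept(self):
        # A maintainer approves the most recently checked version.
        self.valid = self.last_time
```

Keeping all three lets a maintainer distinguish a page that has drifted far from what was first selected from one that merely changed since yesterday.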

13 Algorithms. Variation of Johnson: weighted sum of additions, deletions and modifications for each metric; added metric for structure changes; flexible; asymmetric; lacks normalization. Proportional: determines the proportion of modification for each metric; simple; symmetrical; normalized.
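The contrast between the two algorithms can be sketched as follows. This is an illustration under stated assumptions, not the formulas used in Path Manager: the slide gives no equations, so the MetricDelta record, the per-metric weight triples, and the choice of denominator in the proportional measure are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MetricDelta:
    """Hypothetical counts for one metric (e.g. links) between two versions."""
    added: int
    deleted: int
    modified: int
    unchanged: int


def weighted_change(deltas, weights):
    """Weighted sum of additions, deletions and modifications per metric.

    weights maps a metric name to (w_add, w_del, w_mod). The result is
    unnormalized, and it is asymmetric whenever w_add differs from w_del.
    """
    score = 0.0
    for name, d in deltas.items():
        w_add, w_del, w_mod = weights[name]
        score += w_add * d.added + w_del * d.deleted + w_mod * d.modified
    return score


def proportional_change(deltas):
    """Fraction of items touched per metric, always between 0 and 1.

    Assumed denominator: every item present in either version. Swapping
    'added' and 'deleted' leaves the result unchanged, so it is symmetric.
    """
    result = {}
    for name, d in deltas.items():
        changed = d.added + d.deleted + d.modified
        total = changed + d.unchanged
        result[name] = changed / total if total else 0.0
    return result


forward = {"links": MetricDelta(added=2, deleted=1, modified=1, unchanged=4)}
backward = {"links": MetricDelta(added=1, deleted=2, modified=1, unchanged=4)}
link_weights = {"links": (1.0, 2.0, 0.5)}
```

With these sample values the weighted score differs depending on which version is treated as the older one, while the proportional score is the same in both directions.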

14 Initial interface

15 Overall change relevance assessment

16 Document signatures. Paragraph checksums. Headlines. Links. Keywords. Global checksum.
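A document signature with these five parts could be computed roughly as below. This is a sketch using only the Python standard library, not the Path Manager implementation: the use of SHA-256, the treatment of p and h1 to h6 tags as paragraphs and headlines, and the frequency-based keyword choice are assumptions, and the function names are hypothetical.

```python
import hashlib
import re
from collections import Counter
from html.parser import HTMLParser


class _SignatureParser(HTMLParser):
    """Collects paragraph text, headline text and link targets."""

    def __init__(self):
        super().__init__()
        self.paragraphs, self.headlines, self.links = [], [], []
        self._tag = None   # block tag currently being collected, if any
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        if tag == "p" or re.fullmatch(r"h[1-6]", tag):
            self._tag, self._buf = tag, []

    def handle_data(self, data):
        if self._tag:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            text = " ".join("".join(self._buf).split())
            target = self.paragraphs if tag == "p" else self.headlines
            target.append(text)
            self._tag = None


def checksum(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def signature(html, top_keywords=5):
    parser = _SignatureParser()
    parser.feed(html)
    parser.close()
    # Assumed keyword heuristic: most frequent words of four or more letters.
    words = re.findall(r"[a-z]{4,}", " ".join(parser.paragraphs).lower())
    return {
        "paragraph_checksums": [checksum(p) for p in parser.paragraphs],
        "headlines": parser.headlines,
        "links": parser.links,
        "keywords": [w for w, _ in Counter(words).most_common(top_keywords)],
        "global_checksum": checksum(html),
    }
```

Comparing two signatures shows which paragraphs changed without storing the full page text, while the global checksum gives a quick yes/no answer.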

17 View of change metrics

18 Detailed view of page metrics

19 Path information

20 Web page retrieval and connectivity. Potentially slow and unpredictable. Parallel retrieval: multi-threaded; multiple attempts and retries. Different states: connection state, retrieval state, analysis state.
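Multi-threaded retrieval with retries can be sketched with a thread pool. This is a minimal Python illustration, not the Java code of Path Manager: the function names, the default of three attempts, and the two result labels "retrieved" and "failed" are assumptions, and the slide's separate connection, retrieval and analysis states are not modeled here.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_url(url, timeout=10):
    """Download one page; raises OSError (e.g. URLError) on failure."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_retries(url, fetch=fetch_url, attempts=3):
    """Try a page several times before giving up on it."""
    last_error = None
    for _ in range(attempts):
        try:
            return ("retrieved", fetch(url))
        except OSError as error:
            last_error = error
    return ("failed", last_error)


def retrieve_all(urls, fetch=fetch_url, attempts=3, workers=4):
    """Fetch all pages in parallel so one slow server does not block the rest."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda url: fetch_with_retries(url, fetch, attempts), urls
        )
        return dict(zip(urls, results))
```

Passing the fetch function as a parameter keeps the retry logic testable without network access; a caller could substitute a stub that fails a fixed number of times.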

21 Challenges and limitations. Heuristic identification of document structure (i.e., headings). Indirection. Behavior. Dynamic pages.

22 Conclusions. Managing distributed collections of documents remains challenging and time-consuming, and still requires the assistance of humans. The Path Manager supports the maintenance of collections of Web pages by recognizing, evaluating and informing the user of relevant changes, and it keeps track of the original, valid and last-time states of Web pages. The study conducted indicated the desire for structural changes to be included in the determination of overall change.

23 Contact information. Luis Francisco-Revilla, l0f0954@csdl.tamu.edu; Frank M. Shipman, III, shipman@csdl.tamu.edu; Richard Furuta, furuta@csdl.tamu.edu; Unmil Karadkar, unmil@csdl.tamu.edu; Avital Arora, avital@csdl.tamu.edu.

