The Data Ring: Community Content Sharing Serge Abiteboul (INRIA) Alkis Polyzotis (UC Santa Cruz)


1 The Data Ring: Community Content Sharing Serge Abiteboul (INRIA) Alkis Polyzotis (UC Santa Cruz)

2 Data Sharing Communities
Examples: UCSC genome browser, SwissProt, Flickr
Interesting data management problem
–Shared information is heterogeneous
–Data is distributed and dynamic
–Lack of central administration
–Users are not database savvy
Data sharing community: a group of users that share and query information within some domain

3 The Data Ring
P2P middleware system that provides:
–Monitoring
–Querying
–…and other database-like services over the distributed information
Main goal: simplicity of use

4 Data abstraction in the data ring Topological layer Physical layer External layer

5 Data abstraction in the data ring
Declarative query services
Data and query model based on XML
Topological Layer

6 Data abstraction in the data ring
Basic service is distributed query evaluation
Comprises the overlay network (DHT), physical access structures (indices, replicas, views), and the catalog
Physical Layer
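The slide names a DHT overlay and a catalog but gives no details of either. As a minimal sketch of how a DHT could decide which peer holds a given catalog entry, the following uses consistent hashing on a ring. The class `CatalogRing`, the function `ring_position`, and the peer and key names are all hypothetical and are not taken from the Data Ring system.

```python
import hashlib
from bisect import bisect_right


def ring_position(key: str) -> int:
    """Map a string to a point on the hash ring (first 8 bytes of SHA-1)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:8], "big")


class CatalogRing:
    """Hypothetical catalog placement: each key belongs to the next peer clockwise."""

    def __init__(self, peers):
        # Sort peers by their position on the ring.
        self.points = sorted((ring_position(p), p) for p in peers)
        self.positions = [pos for pos, _ in self.points]

    def responsible_peer(self, key: str) -> str:
        # First peer whose position is strictly greater than the key's,
        # wrapping around to the start of the ring if there is none.
        i = bisect_right(self.positions, ring_position(key)) % len(self.points)
        return self.points[i][1]


ring = CatalogRing(["peer-a", "peer-b", "peer-c"])
print(ring.responsible_peer("genes.xml"))
```

Every peer that knows the same peer list computes the same owner for a key, so catalog lookups need no central directory. That fits the deck's premise that there is no central administration.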

7 Data abstraction in the data ring
Provides semantically richer data models
External Layer

8 Data abstraction in the data ring
Our focus is on the topological and physical layers
External layer is equally important and an active research area
Physical Layer
Topological Layer

9 Thesis #1: formalism for distributed XML data and queries

10 Distributed XML data and queries
What made the relational model successful:
–A logic for describing tables
–An algebra for query optimization
We need the equivalent for trees in a distributed context:
–A logic for describing distributed XML data
–An algebra for optimizing distributed XML queries

11 Desiderata for description logic
Seamless transition between data and services
–Important for loose data integration
Support for XML streams
–Streams are essential for subscription services
–They are also necessary to support recursion

12 Starting point: AXML
AXML: XML tree with embedded web service calls
–Seamless transition between intensional and extensional data
–Provides a simple mechanism for loose data integration
Core concept: XML streams
–A web service call returns a stream of elements
–Support for both push and pull semantics
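To make the idea of an XML tree with embedded service calls concrete, here is a small sketch in Python. It is not the AXML implementation. The `sc` element name, the `service` attribute, the `SERVICES` registry, and the `more_genes` function are assumptions made for illustration, and the "web service" is a local generator standing in for a remote call with pull semantics.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator

# Hypothetical document: <gene> is extensional data, <sc> marks an
# embedded service call whose result is not yet materialized.
DOC = """
<genes>
  <gene id="g1">BRCA1</gene>
  <sc service="more_genes"/>
</genes>
"""


def more_genes() -> Iterator[ET.Element]:
    """Stand-in for a remote service; yields a stream of elements on demand."""
    for gene_id, name in [("g2", "TP53"), ("g3", "EGFR")]:
        elem = ET.Element("gene", {"id": gene_id})
        elem.text = name
        yield elem


# Hypothetical registry mapping service names to callables.
SERVICES: Dict[str, Callable[[], Iterator[ET.Element]]] = {"more_genes": more_genes}


def materialize(node: ET.Element) -> None:
    """Replace every <sc> child by the elements its service streams back."""
    children = []
    for child in list(node):
        if child.tag == "sc":
            children.extend(SERVICES[child.get("service")]())
        else:
            materialize(child)
            children.append(child)
    for child in list(node):
        node.remove(child)
    node.extend(children)


root = ET.fromstring(DOC)
materialize(root)
print([gene.text for gene in root])
```

Before `materialize` runs, the second and third genes exist only as a call. Afterwards they are ordinary elements, which is the transition between intensional and extensional data that the slide describes. The sketch does not expand calls nested inside returned elements, and it does not model push-style subscriptions.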

13 Desiderata for algebra
Be amenable to rewrites
Capture the topology of distributed computation
Allow seamless transition between logical and physical state
–Plans may need to be re-optimized in mid-flight
–It may be necessary to perform partial optimization
–Error recovery

14 A proposal based on AXML
A distributed plan is a workflow of web services
… which is exactly an AXML tree
Components:
–An encoding of distributed plans in AXML
–Rewrite rules
A nice bonus: plans can be readily exchanged between nodes
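The slides do not give the plan encoding or the rewrite rules, so the following is only an illustration of the general idea: a plan stored as an XML tree, with one rewrite rule that moves a selection to the peer that holds the data, so that filtering happens before shipping. The operator names `select`, `send`, and `scan`, their attributes, and the function `push_selection_below_send` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical plan: peer p2 scans a document and ships everything to p1,
# which then filters it.
PLAN = """
<select peer="p1" pred="organism='human'">
  <send from="p2" to="p1">
    <scan peer="p2" doc="genes.xml"/>
  </send>
</select>
"""


def push_selection_below_send(plan: ET.Element) -> ET.Element:
    """Rewrite select(send(x)) into send(select(x)); otherwise return plan unchanged."""
    if plan.tag == "select" and len(plan) == 1 and plan[0].tag == "send":
        send = plan[0]
        if len(send) != 1:
            return plan
        source = send[0]
        # The selection now runs at the sending peer.
        new_select = ET.Element(
            "select", {"peer": send.get("from"), "pred": plan.get("pred")}
        )
        new_select.append(source)
        new_send = ET.Element("send", {"from": send.get("from"), "to": send.get("to")})
        new_send.append(new_select)
        return new_send
    return plan


rewritten = push_selection_below_send(ET.fromstring(PLAN))
# The plan is still plain XML, so it can be serialized and handed to another peer.
print(ET.tostring(rewritten, encoding="unicode"))
```

Because both the original and the rewritten plan are trees, a rewrite is a tree-to-tree function and the result can be shipped as a document. This corresponds to the bonus noted on the slide, that plans can be exchanged between nodes.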

15 Disclaimer
AXML is a starting point, not a panacea
Bottom line: we need formalisms for distributed XML queries

16 Thesis #2: autonomic administration

17 Autonomic administration
Users are not database experts
–Typically, scientists with computer experience
Users are averse to too many “knobs”
No central authority that is responsible for administration
Autonomic administration is a necessity -- not a gadget

18 Facets of autonomy
Self-monitoring
Self-tuning
Self-healing

19 Some issues
System integration
Distribution
On-line tuning
Pro-active tuning

20 Distributed vs. local tuning
Distributed tuning
–Based on the global workload
–Catalog organization, replication
Local tuning
–Based on local workload
–Physical design tuning

21 Data activation for files
A large portion of the data is expected to be in files
We need to develop query processors for data residing in files
File activation: optimize access to the file based on the local workload
–E.g., instantiate an index on file contents or materialize a relational view
Local tuning is essential in this context
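One way to picture file activation is a wrapper that answers lookups by scanning the file and then builds an index on a column once the local workload has queried that column often enough. The sketch below assumes a CSV file and equality lookups. The class `ActivatedFile` and its `threshold` parameter are hypothetical, chosen for illustration and not taken from the talk.

```python
import csv
from collections import Counter, defaultdict


class ActivatedFile:
    """Equality lookups over a CSV file, with workload-driven indexing."""

    def __init__(self, path, threshold=3):
        self.path = path
        self.threshold = threshold  # hypothetical: lookups before indexing a column
        self.lookups = Counter()    # observed local workload, per column
        self.indexes = {}           # column -> {value: [rows]}

    def _scan(self):
        # Full scan of the file, one dict per data row.
        with open(self.path, newline="") as f:
            yield from csv.DictReader(f)

    def lookup(self, column, value):
        self.lookups[column] += 1
        if column not in self.indexes and self.lookups[column] >= self.threshold:
            # The column is queried often enough: instantiate an index on it.
            index = defaultdict(list)
            for row in self._scan():
                index[row[column]].append(row)
            self.indexes[column] = dict(index)
        if column in self.indexes:
            return self.indexes[column].get(value, [])
        return [row for row in self._scan() if row[column] == value]
```

With the default threshold, the first two lookups on a column scan the file. The third builds the index and answers from it, as do all later lookups. The sketch never invalidates an index when the file changes, which a real system would have to handle, and a fixed threshold is exactly the kind of knob that an autonomic system would set for itself.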

