U-P2P A Peer-to-peer System for Description and Discovery of Resource-sharing Communities Aloke Mukherjee, Carleton University August 28, 2003.


1 U-P2P A Peer-to-peer System for Description and Discovery of Resource-sharing Communities Aloke Mukherjee, Carleton University August 28, 2003

2 Peer-to-peer File-sharing
- Exploit storage capability of the edge
- Balance load
- Robustness to failure
- Weaknesses: Search and Communities

3 Search Problem
- Lack of structured metadata
  - Filenames, keyword matching
  - Opaque identifiers
  - Support for popular formats
- Ignoring structured metadata
  - Implicit indicators
  - Collaborative filtering

4 State of the Art: Search
- Metadata: Napster, Kazaa, Limewire, JxtaSearch
- Query Routing: Gnutella, Routing Indices, Limewire, Neurogrid
- Communities: JxtaSearch, Alpine, Associative P2P
- Search in DHTs: PIER, FASD, Inverted Indices

5 Community Problem
- Not simple to create a community for sharing a new file format
- Current state
  - Different protocols/apps (Gnutella, FastTrack, JxtaSearch)
  - Inadequate metadata (filename matching, limited schemas)
  - Ad-hoc attempts aimed at specific domains
- Scattered and isolated: there is no easy way to discover communities

6 State of the Art: Communities
- Opaque: No existing rich metadata search, no way to add it
- Limited: Rich metadata search for some formats but no way to support new formats
- Implicit: Implicit indicators are used to identify communities, no way to specify explicitly
- Partial: Users can explicitly form groups but each grouping is in the eye of the beholder
- Unshared: Users can explicitly direct rich metadata queries to a community, but response format is not specified

7 Improving Search
- Standard metadata layer
  - Explicit structured metadata
  - All resources are XML files
  - XML Schema used to describe format (e.g. MP3, design pattern)
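As a rough illustration of why explicit structured metadata helps search, the sketch below stores two resources as XML and matches on a named field instead of a filename. The element names (`designpattern`, `name`, `category`) and the sample values are hypothetical and not taken from the U-P2P schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical design-pattern resources; the element names are illustrative only.
RESOURCES = [
    "<designpattern><name>singleton</name><category>creational</category></designpattern>",
    "<designpattern><name>observer</name><category>behavioral</category></designpattern>",
]


def search(resources, field, value):
    """Return parsed resources whose child element `field` has exactly the text `value`."""
    matches = []
    for text in resources:
        root = ET.fromstring(text)
        if root.findtext(field) == value:
            matches.append(root)
    return matches


hits = search(RESOURCES, "name", "singleton")
names = [hit.findtext("name") for hit in hits]
print(names)
```

Because every resource is an XML file, the same `search` function works for any field the schema defines, which a filename match cannot offer.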

8 Schema instantiates resource
[Figure: example resource with the values: singleton; gang of four; when creating a new class…; ensure a class only has…; make the class itself responsible…]

9-11 Automated interface generation
[Diagram labels: resource XML schema, resource create form, resource search form, resource view, resource; edges: instantiates, XSLT]

12 Community Creation and Discovery: What is a Community?
- Concrete object with defined tuple of attributes
- Simplest form: (format, protocol, …)
  - Known examples: (mp3, napster) (video, kazaa)
  - Examples that don't exist: (design patterns, gnutella) (p2p papers, jxtasearch)
- Tuple is specified as an XML file

13 Simplifying Community Creation
[Figure: example community with the values designpatterns, designpattern.xsd, gnutella, designpattern.stylesheet]
- User-designed communities
  - Compose schema to describe format
  - Compose community XML file
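A minimal sketch of what composing a community XML file might look like, using the four values shown on this slide. The element names (`community`, `name`, `schema`, `protocol`, `stylesheet`) are assumptions made for illustration; the actual U-P2P community schema may differ.

```python
import xml.etree.ElementTree as ET


def build_community(name, schema, protocol, stylesheet):
    """Build a community descriptor element from its tuple of attributes."""
    community = ET.Element("community")  # hypothetical root element name
    fields = (
        ("name", name),
        ("schema", schema),
        ("protocol", protocol),
        ("stylesheet", stylesheet),
    )
    for tag, value in fields:
        ET.SubElement(community, tag).text = value
    return community


community = build_community(
    "designpatterns", "designpattern.xsd", "gnutella", "designpattern.stylesheet"
)
xml_text = ET.tostring(community, encoding="unicode")
print(xml_text)
```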

14 Community as class
[Diagram labels: mp3, mp3 community, mp3, mp3 class]

15 Metaclass analogy
[Diagram labels: mp3, mp3 community, mp3, mp3 class, community, class]
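The metaclass analogy can be made concrete in a language that has metaclasses. The sketch below is only an illustration of the analogy, not of how U-P2P is implemented: an mp3 file is an instance of the mp3 community, and the mp3 community is in turn an instance of the community "class". The names `Community`, `Mp3`, and the `protocol` attribute are hypothetical.

```python
class Community(type):
    """Metaclass: every instance of Community is a class whose instances are resources."""


# The mp3 community is an instance of Community (a class created by the metaclass).
Mp3 = Community("Mp3", (), {"protocol": "napster"})

# An mp3 file is an instance of the mp3 community.
song = Mp3()

checks = [
    isinstance(song, Mp3),        # a file belongs to its community
    isinstance(Mp3, Community),   # a community belongs to the community "class"
    type(song) is Mp3,
]
print(checks)
```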

16 Community discovery is File discovery
- MP3 community shares MP3 files
- Community community shares communities
[Diagram labels: mp3, mp3 community, community, community]

17 Simplifying Community Discovery
- A Community for Communities: The Root Community
- Communities are files shared in a real community
- Root Community includes schema for communities
- (format, protocol) = (community, centralized db)
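The idea that community discovery is just file discovery in a root community can be sketched with a toy central index. Everything here is a simplified assumption: the `CentralIndex` class, the dictionary-based metadata, and the sample entries are hypothetical stand-ins for the XML resources and centralized database named on the slide.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    community: str   # name of the community this resource is shared in
    metadata: dict   # structured metadata (stand-in for an XML file)


class CentralIndex:
    """Toy centralized index: one publish and one search path for every community."""

    def __init__(self):
        self.resources = []

    def publish(self, community, metadata):
        self.resources.append(Resource(community, dict(metadata)))

    def search(self, community, **criteria):
        return [
            r.metadata
            for r in self.resources
            if r.community == community
            and all(r.metadata.get(key) == value for key, value in criteria.items())
        ]


ROOT = "community"  # the root community shares community descriptions
index = CentralIndex()
index.publish(ROOT, {"name": "mp3", "format": "mp3", "protocol": "napster"})
index.publish(ROOT, {"name": "designpatterns", "format": "designpattern", "protocol": "gnutella"})
index.publish("mp3", {"title": "example track", "artist": "example artist"})

# Discovering a community and discovering a file use the same search call.
found_communities = index.search(ROOT, protocol="gnutella")
found_files = index.search("mp3", artist="example artist")
print(found_communities, found_files)
```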

18 Schema for Communities
[Figure: The Root Community, with the values root community, community.xsd, central-db, community.stylesheet]

19 What is U-P2P?
- A framework that breathes life into these ideas
- Explicit metadata search and creation for every Community
- Creation of Community tuples
  - (format, protocol, etc.)
- Discovery of Community tuples

20 Design

21 Technologies
- Java
- Tomcat Servlet Container
- Java Server Pages (JSP) + Servlets
- XSLT (transforms), XPath (queries)
- Java components for XSLT, XPath (Xerces, Xalan)
- eXist XML Database
- Log4j (logging infrastructure), JUnit (unit testing)

22 Evaluation and Validation: Areas of Interest
- Publish and Search times as Community size increases
- Breaking down Publish and Search operations
- Community effect
- Multiple central servers

23 Publish

24 Search

25 Community Effect
Average Publish Time
- Multiple communities: 356 ms
- Single community: 485 ms
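For a sense of scale, the two average publish times on this slide can be compared directly. The snippet below only does the arithmetic on the reported figures; it adds no new measurements.

```python
single_ms = 485     # average publish time, single community (from the slide)
multiple_ms = 356   # average publish time, multiple communities (from the slide)

saving_ms = single_ms - multiple_ms
reduction = saving_ms / single_ms

summary = f"{saving_ms} ms saved per publish, {reduction:.1%} lower"
print(summary)
```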

26 Multiple Central Servers

27 Publish with Multiple Servers
Server | Processor  | Speed   | OS
1      | Pentium 4  | 1.8 GHz | Windows
2      | Pentium II | 250 MHz | Linux (RH7)
3      | Celeron    | 1 GHz   | Windows XP

28 Vs. Without Multiple Central Servers
Server                     | Avg. time to publish a file (750 files published)
S1                         | 455 ms
S2                         | 1355 ms
S3                         | 645 ms
S1, S2, S3 (load-balanced) | 517 ms
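One way to read these figures is to compare the load-balanced average against each server on its own and against the plain mean of the three single-server averages. The snippet below is arithmetic on the reported numbers only; how the load balancing was done is not stated on the slide.

```python
individual_ms = {"S1": 455, "S2": 1355, "S3": 645}  # single-server averages (from the slide)
balanced_ms = 517                                    # S1, S2, S3 load-balanced (from the slide)

mean_ms = sum(individual_ms.values()) / len(individual_ms)

# Servers that are slower on their own than the load-balanced configuration.
slower_alone = [name for name, ms in individual_ms.items() if ms > balanced_ms]

# Load-balanced time relative to the fastest single server.
overhead_vs_fastest = (balanced_ms - min(individual_ms.values())) / min(individual_ms.values())

print(round(mean_ms, 1), slower_alone, f"{overhead_vs_fastest:.1%}")
```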

29 Contributions
- Standard Metadata Layer
  - All communities include support for explicit metadata search and creation
- User-designed Communities
  - Users can easily share new formats with full support for metadata
- Community for Communities
  - Prevents fragmented, isolated communities by providing metadata about communities and a standard method for discovering them
- Performance and Scalability Gains
  - Communities can improve performance and scalability vs. systems where resources are undifferentiated

30 Future Work
- Performance improvements
- Protocol independence (adapters for Gnutella, Freenet, etc.)
- Community-aware Gnutella routing
- More Community parameters (security, authentication, etc.)

31 Future Work continued
- Trust metrics (to differentiate between communities, metadata quality)
- Community evolution
- Inheritance and multiple inheritance for Communities

32 U-P2P Publications
- A. Mukherjee, B. Esfandiari, N. Arthorne, "U-P2P: A Peer-to-peer System for Description and Discovery of Resource-sharing Communities", ICDCS Workshops 2002: , July
- N. Arthorne, B. Esfandiari, A. Mukherjee, "U-P2P: A Peer-to-peer Framework for Universal Resource Sharing and Discovery", Proceedings of Freenix track of Usenix 2003, 29-38, June

33 Backup slides

34 WebAdapter: User Interaction Model

35 Repository Design

36 Repository Design: Resource IDs

37 Repository Design: XML Database
- Requirements
  - Flexibility to store wide variety of formats
  - Handle powerful queries over all metadata
- XML Database better suited than RDBMS
  - Difficult to map fields to rows and columns
- Chose eXist XML database
  - Open source
  - Written in Java
  - Support for XML:DB API

38 Network Adapter Design
- Abstract interface to Peer-to-peer Network
  - Routing search requests, handling results, handling incoming search requests, etc.
- Only implemented Hybrid model (Napster model)
- All peers can act as client and/or server
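A sketch of what an abstract network adapter with one hybrid (Napster-style) implementation could look like. The class and method names are hypothetical, and U-P2P itself is written in Java, so this only illustrates the separation between the interface and the protocol behind it.

```python
from abc import ABC, abstractmethod


class NetworkAdapter(ABC):
    """Hypothetical abstract interface that hides the peer-to-peer protocol in use."""

    @abstractmethod
    def publish(self, peer, community, metadata):
        """Announce that `peer` shares a resource with this metadata."""

    @abstractmethod
    def search(self, community, **criteria):
        """Return (peer, metadata) pairs matching the criteria."""


class CentralIndexAdapter(NetworkAdapter):
    """Hybrid model: a central server indexes metadata, peers keep the files."""

    def __init__(self):
        self._index = []  # list of (peer, community, metadata) entries

    def publish(self, peer, community, metadata):
        self._index.append((peer, community, dict(metadata)))

    def search(self, community, **criteria):
        return [
            (peer, metadata)
            for peer, entry_community, metadata in self._index
            if entry_community == community
            and all(metadata.get(key) == value for key, value in criteria.items())
        ]


adapter = CentralIndexAdapter()
adapter.publish("peer-a", "designpatterns", {"name": "singleton"})
adapter.publish("peer-b", "designpatterns", {"name": "observer"})

results = adapter.search("designpatterns", name="observer")
print(results)
```

An adapter for another protocol would implement the same two methods, leaving the calling code unchanged.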

39 Network Adapter: Protocol

40 Evaluation and Validation: Challenges
- Finding large XML collections
  - Berkeley Drosophila Genome Project: genome annotations
  - Other sources: DBLP (CS papers), EDGAR (SEC filings), GeneOntology (gene-related concepts)
  - Transforming DTDs to XML Schema (DTDXS package)
- Automation
  - XML-RPC interface for publish and search
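To give an idea of what driving publish operations over XML-RPC involves, the sketch below encodes and decodes one call using Python's standard `xmlrpc.client` module, without contacting any server. The method name `publish` and its two parameters are assumptions; the actual U-P2P XML-RPC interface is not shown in the slides.

```python
import xmlrpc.client

# Hypothetical call: publish one XML resource into a named community.
resource_xml = "<designpattern><name>singleton</name></designpattern>"
request = xmlrpc.client.dumps(("designpatterns", resource_xml), methodname="publish")

# A receiving server would decode the request back into a method name and parameters.
params, method = xmlrpc.client.loads(request)
print(method, params)
```

An automated test harness could issue many such calls in a loop and time each one.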

41 Publish: Breakdown of Operations

42 Publish: Client Timings

43 Publish: Server Timings

44 Network Adapter: Protocol

45 Search: Breakdown of Operations

46 Search: Total vs. Server Timings

