FeedTree: Sharing Web micronews with peer-to-peer event notification Dan Sandler, Alan Mislove, Ansley Post, Peter Druschel Presented by: Anupama Atmakur.


1 FeedTree: Sharing Web micronews with peer-to-peer event notification. Dan Sandler, Alan Mislove, Ansley Post, Peter Druschel. Presented by: Anupama Atmakur, Pooja Adudodla

2 Overview: Introduction; Background (Issues with RSS, RSS Bandwidth); Enhancement to RSS; Design of FeedTree; FeedTree Implementation; Conclusions and Future Work. CS-791/891 Web Syndication Formats, ODU, Spring 2008

3 Early Stage of the Web: HTML pages were static. News websites updated their content once or twice a day. Users visited each individual site for updated content.

4 Current Trend of the Web: An explosion of micronews: highly focused chunks of content, updated frequently and irregularly, and scattered over multiple sites.

5 How RSS Saved the Web: RSS feeds are a popular way to handle this information flow. A feed carries the publisher's latest stories in an XML-based format. By subscribing to the URL of an RSS feed, the user instructs the reader application to fetch it at regular intervals.

6 Special reader software periodically collects the latest information. It's like email for web news. Img: http://apadiv20.phhp.ufl.edu/newgif/rss2007.gif http://www.cs.rice.edu/~dsandler/talks/rss-iptps05.pdf

7 Issue with RSS Usage: RSS readers are now very widely used, which has a serious impact on bandwidth usage for RSS providers. Some popular providers of RSS feeds have begun to eliminate their feeds to reduce the bandwidth stress of polling clients.

8 RSS Bandwidth: RSS uses a polling-based retrieval architecture. The publisher's bandwidth cost scales linearly with the number of subscribers.

9 Polling Architecture: Each RSS reader polls the feed's web server independently. Img: http://feedtree.net/images/diagrams/syndication-simple.png

10 Unable to Handle: Web servers providing RSS feeds tend to suffer under heavy traffic load. The load on the web servers arises for the following reasons:

11 Polling: An RSS application issues repeated HTTP requests for each subscribed feed according to some set schedule. Superfluity: Each feed is limited to the N most recent entries. Every request returns those N entries, regardless of whether any of them are new to the client. The same content is retransmitted over and over, wasting bandwidth and loading the servers.

12 Stickiness: A user subscribes to the feed of a popular website and stops reading it after a period of time, yet the RSS client keeps polling, producing unending load. Twenty-Four-Hour Traffic: RSS clients keep running on desktop computers even when the user is not present, so RSS readers generate traffic around the clock from all over the world.

13 Real Scenario: The New York Times front-page feed alone claims 7,800 subscribers. The sum of subscribers to all its feeds comes to 24,000. Feeds from the Times tend to be around 3 KB, which amounts to about 3.5 GB of data per day with 30-minute polling.
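
The 3.5 GB figure can be checked with a few lines of arithmetic. The sketch below assumes 1 KB = 1,000 bytes and that every subscriber polls exactly once per interval; the function name is made up for illustration.

```python
def daily_polling_bytes(subscribers: int, feed_bytes: int, poll_interval_minutes: int) -> int:
    """Bytes served per day if every subscriber fetches the whole feed on every poll."""
    polls_per_day = (24 * 60) // poll_interval_minutes  # 48 polls at a 30-minute interval
    return subscribers * feed_bytes * polls_per_day


# Numbers from the slide: 24,000 subscribers, ~3 KB feeds, 30-minute polling.
total = daily_polling_bytes(24_000, 3_000, 30)
print(f"{total / 1e9:.2f} GB per day")  # prints "3.46 GB per day"
```

The result grows in direct proportion to the subscriber count, which is the linear scaling described on the RSS Bandwidth slide.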

14 Improving the Polling Process. Enhancing polling: Avoid transmitting the feed content if it is unchanged since the client's last request. Gzip: the server compresses the feed before returning it to the newsreader, decreasing bandwidth usage. Scheduled polling: clients poll only at particular times.
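
The first two enhancements can be sketched as a server-side decision for a single poll. The slide does not name a mechanism, so this sketch assumes the standard HTTP entity-tag approach (`ETag` / `If-None-Match`); `make_etag` and `respond` are hypothetical helper names, not part of any real server.

```python
import gzip
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a validator from the feed content (assumption: a truncated SHA-256 digest)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str], accepts_gzip: bool) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for one poll of the feed."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The client already holds this exact content: send no body at all.
        return 304, {"ETag": etag}, b""
    headers = {"ETag": etag}
    if accepts_gzip:
        headers["Content-Encoding"] = "gzip"
        return 200, headers, gzip.compress(body)
    return 200, headers, body


feed = b"<rss>" + b"<item>news</item>" * 100 + b"</rss>"
status, headers, payload = respond(feed, None, accepts_gzip=True)      # first poll: full, compressed
status2, _, payload2 = respond(feed, headers["ETag"], accepts_gzip=True)  # unchanged feed: 304, empty
```

Both measures shrink each response, but every client still issues a request on every poll, so the request count is unchanged.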

15 Outsourcing Aggregation: The end-user application is built upon a central server, which provides a remote procedure interface. The application polls this server for updated data, and the server takes charge of polling the authoritative RSS feeds on the wider Internet.

16 Outsourcing Aggregator (diagram labels): RSS Providers, Central Server, Readers. Img: http://apadiv20.phhp.ufl.edu/newgif/rss2007.gif http://www.cs.rice.edu/~dsandler/talks/rss-iptps05.pdf

17 Bandwidth Issue: End users now poll the central server instead of the website's main server, so the central server bears the heavy traffic load.

18 Danger Lies Here: A central RSS aggregator may experience unavailability or outright failure; may change its service at any time; and may modify, omit, or augment RSS data without the user's knowledge or consent.

19 Solution: Distributed Approach. The publisher's website distributes new feed content to its list of subscribers: good for small subscription lists but not for large groups. This avoids unnecessary fetches but does not offer a solution for necessary fetches. FeedTree replaces the polling component of news feeds with peer-to-peer multicast.

20 FeedTree: A Bird's-Eye View. Based on a p2p multicast network. Bandwidth cost is distributed among peers. Less load on network links close to the source. Subscribers receive content as soon as it is available. Img: http://feedtree.net/images/diagrams/syndication-simple.png

21 FeedTree Technical Details: FeedTree is a p2p overlay network based on Scribe. Scribe is a group communication and event notification protocol built on the Pastry overlay network. Pastry provides a self-organizing p2p network of nodes.

22 Why Pastry? Unstructured networks: there is no underlying structure or organization in the network, so locating a node is problematic and requires either an exhaustive search of the network or maintenance of a central index of all nodes. Structured overlay networks: nodes are decentralized and self-organizing, and data can be reached in a logarithmic number of steps. Choosing a structured overlay network like Pastry is thus attractive.

23 Pastry: Structured Overlay p2p Network. Efficient request routing: each Pastry node has a unique 128-bit nodeId, and a Pastry node can route a message with a numeric 128-bit key to the node whose nodeId is closest to the key in O(log N) forwarding steps (N is the number of live Pastry nodes in the overlay network). Load balancing: the size of the routing table maintained in each Pastry node is only O(log N). Allows application-specific computations: for example, it allows Scribe to multicast data.
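
The "closest nodeId" rule can be illustrated with a short sketch. It only shows which node a key ends up at, using a global list of node ids; it does not implement Pastry's hop-by-hop routing tables. It assumes the id space is treated as a ring (distances wrap around at 2^128) and derives ids by truncating a SHA-1 hash, which is an illustrative choice. All names here are hypothetical.

```python
import hashlib
from typing import Iterable

ID_BITS = 128
ID_SPACE = 1 << ID_BITS  # size of the 128-bit identifier space


def make_id(name: str) -> int:
    """Map a name (node address or feed URL) to a 128-bit id by truncating SHA-1."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest()[:16], "big")


def ring_distance(a: int, b: int) -> int:
    """Numeric distance between two ids, allowing wrap-around."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)


def responsible_node(key: int, node_ids: Iterable[int]) -> int:
    """Return the nodeId numerically closest to the key."""
    return min(node_ids, key=lambda node_id: ring_distance(node_id, key))


nodes = [make_id(f"node-{i}.example.org") for i in range(8)]
feed_key = make_id("http://example.org/feed.rss")
root = responsible_node(feed_key, nodes)  # the node a message addressed to feed_key would reach
```

In a real overlay no node holds the full list; each node forwards the message to a neighbor whose id is closer to the key, which is where the O(log N) step count comes from.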

24 Work Flow of FeedTree: (1) Create an RSS document with a timestamp and a sequence number. (2) Sign the RSS document with the publisher's private key. (3) Multicast the RSS document in the Pastry overlay network, using Scribe, to the members of the Scribe group. (4) On receiving the document, each subscriber checks its authenticity by verifying the signature and then adds it to the RSS client application.
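
The publish-and-verify steps can be sketched end to end. Python's standard library has no public-key signatures, so this sketch substitutes an HMAC over a shared secret purely to stay runnable; a real deployment would sign with the publisher's private key and verify with its public key, as the slide describes. The envelope fields and function names are assumptions, and the multicast step is omitted.

```python
import hashlib
import hmac
import json
from typing import Optional


def make_event(secret: bytes, seq: int, timestamp: int, rss_item: str) -> dict:
    """Publisher side: wrap an item with a sequence number and timestamp, then authenticate it."""
    payload = json.dumps(
        {"seq": seq, "timestamp": timestamp, "item": rss_item}, sort_keys=True
    ).encode("utf-8")
    # HMAC stands in for the publisher's digital signature (assumption for this sketch).
    tag = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def accept_event(secret: bytes, event: dict) -> Optional[dict]:
    """Subscriber side: return the decoded event only if its authenticity check passes."""
    expected = hmac.new(secret, event["payload"], hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, event["tag"]):
        return None  # reject forged or corrupted events
    return json.loads(event["payload"])


secret = b"demo-secret"
event = make_event(secret, seq=1, timestamp=1_200_000_000, rss_item="<item>Headline</item>")
accepted = accept_event(secret, event)  # a dict with "seq", "timestamp", and "item"
```

The check matters because events arrive from other peers rather than directly from the publisher, so a subscriber cannot rely on the transport to vouch for the content.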

25 Implementation of FeedTree. Full FeedTree: implements the full FeedTree architecture. Incremental FeedTree: publishers or readers working with the conventional polling-based RSS retrieval architecture can also be retrofitted to take advantage of the FeedTree architecture. Img: http://feedtree.net

26 FeedTree Design and Implementation Img: http://www.cs.rice.edu/~dsandler/pub/FeedTree-MSThesis-2007.pdf

27 Bootstrapping the Subscription Process: How will the client application know whether a feed is published through FeedTree? FeedTree metadata is added to the RSS document; the metadata can be the IP address or DNS name of a host that is already a member of a FeedTree network. The client application starts the subscription by making a conventional HTTP request to the publisher. Using the metadata, it then takes further updates from the FeedTree.

28 Heartbeat: Time-to-live is the maximum interval between consecutive FeedTree events. When there is no new data within the time-to-live, the publisher sends a heartbeat through the FeedTree. The purpose of the heartbeat is to let the peers know that an authoritative feed publisher still exists.
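
The publisher-side rule can be sketched as a single decision function. Times are plain seconds, and the function name and return values are assumptions made for illustration.

```python
def next_action(now: float, last_event_time: float, ttl_seconds: float, has_new_item: bool) -> str:
    """Decide what the publisher should send at time `now`."""
    if has_new_item:
        return "publish"    # new content always goes out at once
    if now - last_event_time >= ttl_seconds:
        return "heartbeat"  # nothing new for a full time-to-live: signal liveness
    return "wait"           # still within the time-to-live window


# With a 60-second time-to-live and the last event sent at t=0:
print(next_action(30, 0, 60, has_new_item=False))  # prints "wait"
print(next_action(60, 0, 60, has_new_item=False))  # prints "heartbeat"
print(next_action(45, 0, 60, has_new_item=True))   # prints "publish"
```

Whichever event is sent resets `last_event_time`, so subscribers never go longer than one time-to-live without hearing from a live publisher.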

29 Benefits. Providers: low cost, as the bandwidth is shared by all the participants; opportunity to provide differentiated RSS services. Users: receive timely updates and thus better news service.

30 Recovery of Lost Data. Reasons for loss of data: failures of nodes; departures of nodes. Detection of lost data: a missing sequence number or a missing heartbeat. Recovery of lost data: the client application polls the RSS publisher to retrieve it.
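
The two detection rules can be sketched on the subscriber side. The class below is hypothetical; it assumes sequence numbers increase by one per published item and that heartbeats carry no sequence number. When either check fires, the client would fall back to a conventional HTTP poll of the publisher.

```python
from typing import Optional


class FeedMonitor:
    """Tracks one feed and reports when a fallback poll is needed."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self.last_seq: Optional[int] = None
        self.last_heard: Optional[float] = None

    def on_event(self, seq: int, now: float) -> bool:
        """Record a content event; return True if a sequence number was skipped."""
        gap = self.last_seq is not None and seq > self.last_seq + 1
        if self.last_seq is None or seq > self.last_seq:
            self.last_seq = seq
        self.last_heard = now
        return gap

    def on_heartbeat(self, now: float) -> None:
        """Record a heartbeat: the publisher is alive but has nothing new."""
        self.last_heard = now

    def is_silent(self, now: float) -> bool:
        """True if neither an event nor a heartbeat arrived within the time-to-live."""
        return self.last_heard is not None and now - self.last_heard > self.ttl_seconds


monitor = FeedMonitor(ttl_seconds=60)
monitor.on_event(seq=1, now=0)
missed = monitor.on_event(seq=3, now=10)  # True: item 2 never arrived, so poll the publisher
silent = monitor.is_silent(now=100)       # True: 90 seconds without any event or heartbeat
```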

31 Development Status: The conventional RSS reader application is augmented with an intermediary HTTP proxy, which serves its HTTP requests. How it works: The HTTP proxy joins the FeedTree network. The proxy listens for any request the reader application makes for a conventional feed via HTTP. It serves each request immediately with the latest feed data received from the FeedTree network.

32 Conclusions: FeedTree is a very good alternative to the conventional polling mechanism for the following reasons: It uses the available bandwidth efficiently without requiring additional hardware or networking capacity. It reduces load on network links near the server by distributing it evenly among participating nodes. It can scale to a large number of users while still maintaining low latency for the arrival of new messages.

33 References: http://www.mpi-sws.mpg.de/~abpost/papers/RSS-IPTPS-draft.pdf http://www.feedtree.net/ http://freepastry.rice.edu/SCRIBE/default.htm http://research.microsoft.com/~antr/pastry/default.htm

34 Questions? How is the adoption of FeedTree going to affect conventional readers that don't use FeedTree?

