Scaling Microblogging Services with Divergent Traffic Demands Presented by Tianyin Xu Tianyin Xu, Yang Chen, Lei Jiao, Ben Zhao, Pan Hui, Xiaoming Fu University.

Presentation transcript:

Scaling Microblogging Services with Divergent Traffic Demands Presented by Tianyin Xu Tianyin Xu, Yang Chen, Lei Jiao, Ben Zhao, Pan Hui, Xiaoming Fu University of Goettingen, UC San Diego UC Santa Barbara, Deutsche Telekom

Microblogging services are growing at exponential rates! [Figure: user population by year; legible values include 5,000,000 and 75,000,000.]

Not only Twitter!

Current Architecture: polling. Nothing new!!! All of it is wasted traffic!

Current Architecture: polling. Something new!!! Still wasted traffic!

What about millions of users polling?

How are the availability and performance of these microblogging services? (Measurement Study on Twitter) Measurement period: Jun. 4 – Jul. 18, including a flash crowd event: FIFA World Cup 2010

Measurement Study on Twitter: Twitter's performance and availability are not satisfactory (even at normal times). The flash crowd event has an obvious impact on both performance and availability. [Figures: two time series over the measurement period, x-axis dates 6/4, 6/11, 7/11, 7/18, with the World Cup period marked.]

Twitter's Short-Term Solutions
1. Per-user request and connection limits
  - Rate limit: only allows clients to make a limited number of calls in a given period. Twitter: 150 requests per hour, 2,000 requests for whitelist
  - Upper limit on the number of followees. Orkut: 1000, Flickr: 3000, Facebook: 5000, Twitter: 2000 before 2009, now using a more sophisticated strategy
2. Network usage monitoring
3. Doubling the capacity of internal network
The problems are still there!!!
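The per-client rate limit on this slide can be illustrated with a small sketch. The slide only gives the number (150 requests per hour); the fixed-window scheme, the class name `FixedWindowRateLimiter`, and its parameters are assumptions made for illustration, not a description of how Twitter implements the limit.

```python
from collections import defaultdict


class FixedWindowRateLimiter:
    """Allow at most `limit` calls per client in each fixed time window.

    Hypothetical sketch: the fixed-window policy is an assumption; the slide
    only states the limit of 150 requests per hour.
    """

    def __init__(self, limit=150, window_seconds=3600):
        self.limit = limit
        self.window_seconds = window_seconds
        # client id -> (window index, calls already counted in that window)
        self._state = defaultdict(lambda: (-1, 0))

    def allow(self, client_id, now):
        """Return True if a call at time `now` (in seconds) is within the limit."""
        window = int(now // self.window_seconds)
        current_window, count = self._state[client_id]
        if window != current_window:
            # A new window has started, so the counter resets.
            current_window, count = window, 0
        if count >= self.limit:
            self._state[client_id] = (current_window, count)
            return False
        self._state[client_id] = (current_window, count + 1)
        return True


limiter = FixedWindowRateLimiter(limit=150, window_seconds=3600)
# One client polling once per second for 151 seconds within the same hour.
results = [limiter.allow("alice", t) for t in range(151)]
```

A limiter like this caps the load each client can generate, but a polling client still spends its whole quota on requests that mostly return nothing new, which is the waste the earlier slides point at.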

How about push? I have 5 followers! Social Network Usage

How about these guys? [Figure: accounts with follower counts in the millions.] Either celebrities or news media outlets.

What will happen when ladygaga has something to say? I have 16,000,000 followers! News Media Usage

How differently do the two usage models contribute to the traffic?
  - Analysis of a large-scale Twitter user trace: 3,117,750 users' profiles, social links, tweets
  - Consider two built-in Twitter interaction models: POST and REQUEST
  - Differentiate social network usage and news media usage by a follower threshold: only users with followers <1000 show assortativity*
*H. Kwak et al., What is Twitter, a Social Network or a News Media? WWW 2010.
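The threshold-based split can be sketched as follows. The value 1000 comes from the slide; treating exactly 1000 followers as news media usage, and the names `classify_usage` and `split_users`, are assumptions for illustration.

```python
FOLLOWER_THRESHOLD = 1000  # from the slide: users with followers < 1000 show assortativity


def classify_usage(follower_count, threshold=FOLLOWER_THRESHOLD):
    """Label a user by follower count (boundary handling is an assumption)."""
    return "social_network" if follower_count < threshold else "news_media"


def split_users(followers_by_user, threshold=FOLLOWER_THRESHOLD):
    """Group user names by usage model given a {user: follower_count} mapping."""
    groups = {"social_network": [], "news_media": []}
    for user, count in followers_by_user.items():
        groups[classify_usage(count, threshold)].append(user)
    return groups


# Hypothetical users; the follower counts echo the examples on earlier slides.
groups = split_users({"me": 5, "ladygaga": 16_000_000})
```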

The results of the divergent traffic: Social network usage accounts for the majority of incoming server load (~95%). News media usage accounts for a large share of outgoing server load (~63%). [Figures: incoming traffic load (10^3) and outgoing traffic load (10^3).]

The difference between the two components.
Social Network Component            | News Media Component
a few followers                     | large numbers of followers
mostly symmetric links              | mostly asymmetric links
not active in updating statuses     | very active in reporting news
great incoming traffic              | great outgoing traffic

What makes microblogging systems like Twitter hard to scale? They are being used as both the social network and the news media infrastructure at the same time! There is NO single dissemination mechanism that can really address both at the same time!!

Decouple the two components - Complementary delivery mechanisms: direct unicast push and gossip dissemination

System Architecture (Cuckoo)
Cloud servers (a small server base)
  - Ensure high data availability
  - Maintain asynchronous consistency
  - Host all the user contents
Cuckoo peers (peers at network edge)
  - Data delivery: abandon naïve polling
  - Decentralized user lookup

Unicast Delivery for SocialNet: serial unicast delivery to each follower. 1. simple 2. reliable

Gossip Dissemination for MediaNet: gossip dissemination among followers through their partners. Pros: 1. scalable 2. resilient to network dynamics 3. load balancing Cons: 1. each node has to maintain partners
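A toy simulation can make the gossip trade-off concrete: a message spreads hop by hop through partner lists, and some nodes receive it more than once. Everything here is a sketch under assumptions: the random partner selection, the `fanout` parameter, and the function names `build_partners` and `gossip` are illustrative and are not taken from the Cuckoo implementation.

```python
import random


def build_partners(nodes, partners_per_node, rng):
    """Give every node a random partner list (hypothetical overlay construction)."""
    return {
        node: rng.sample([n for n in nodes if n != node], partners_per_node)
        for node in nodes
    }


def gossip(source, partners, fanout, rng):
    """Push one message hop by hop.

    Returns (first_hop, redundant): the hop count at which each reached node
    first got the message, and how many extra copies each node received.
    """
    first_hop = {source: 0}
    redundant = {node: 0 for node in partners}
    frontier = [source]
    while frontier:
        next_frontier = []
        for sender in frontier:
            # Each node forwards once, to at most `fanout` of its partners.
            targets = rng.sample(partners[sender], min(fanout, len(partners[sender])))
            for target in targets:
                if target in first_hop:
                    redundant[target] += 1  # already has the message
                else:
                    first_hop[target] = first_hop[sender] + 1
                    next_frontier.append(target)
        frontier = next_frontier
    return first_hop, redundant


rng = random.Random(42)
nodes = list(range(200))
partners = build_partners(nodes, partners_per_node=5, rng=rng)
first_hop, redundant = gossip(source=0, partners=partners, fanout=3, rng=rng)
coverage = len(first_hop) / len(nodes)  # fraction of nodes reached
max_hops = max(first_hop.values())      # longest path to a first receipt
```

Coverage, hop count, and redundant copies per node are the kinds of quantities the micronews dissemination results later in the deck report; the numbers this toy model produces are not comparable to those measurements.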

Message Loss
1. Due to asynchronous access
  -- regain lost tweets in offline period
  -- efficient inconsistency checking based on the timeline
2. Due to uncertainty of gossip and unreliable channels
  -- exploit the unique statusId to check
  -- statusId = UserId + Sequence Number
  -- a gap between sequence numbers means message loss
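The sequence-number check can be sketched as below, assuming each statusId is available as a `(user_id, sequence_number)` pair. The function name `find_missing` and the tuple representation are assumptions; the slide only says that a gap between sequence numbers means message loss.

```python
def find_missing(received_status_ids):
    """Return {user_id: [missing sequence numbers]} from gaps in received statusIds.

    `received_status_ids` is an iterable of (user_id, sequence_number) pairs.
    Only gaps between the lowest and highest received number are detected.
    """
    seqs_by_user = {}
    for user_id, seq in received_status_ids:
        seqs_by_user.setdefault(user_id, set()).add(seq)

    missing = {}
    for user_id, seqs in seqs_by_user.items():
        gaps = [s for s in range(min(seqs), max(seqs) + 1) if s not in seqs]
        if gaps:
            missing[user_id] = gaps
    return missing


lost = find_missing([("alice", 1), ("alice", 2), ("alice", 5), ("bob", 7), ("bob", 8)])
```

A gap check like this cannot see tweets lost after the last one received, which is presumably where the timeline-based inconsistency check against the servers comes in.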

Support for Client Heterogeneity: differentiate user clients into three categories
Cuckoo-Comp
  - Stable nodes
  - Construct DHT and provide DHT-based user lookup
  - Participate in message dissemination
Cuckoo-Lite
  - Lightweight clients (e.g., laptops)
  - Do not join DHT
  - Only participate in message dissemination
Cuckoo-Mobile
  - Mobile nodes
  - Neither join DHT nor participate in message dissemination
Over 40% of all tweets were from mobile devices, up from only 25% a year ago.
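The three client categories amount to a small capability table, which might be encoded like this. The dictionary layout, the key names `joins_dht` and `disseminates`, and the helper functions are assumptions for illustration; only the category names and their roles come from the slide.

```python
# Capabilities per client category, as listed on the slide.
CAPABILITIES = {
    "Cuckoo-Comp": {"joins_dht": True, "disseminates": True},
    "Cuckoo-Lite": {"joins_dht": False, "disseminates": True},
    "Cuckoo-Mobile": {"joins_dht": False, "disseminates": False},
}


def dht_members(clients):
    """Names of clients that construct the DHT and serve user lookups."""
    return [name for name, kind in clients.items() if CAPABILITIES[kind]["joins_dht"]]


def dissemination_peers(clients):
    """Names of clients that take part in message dissemination."""
    return [name for name, kind in clients.items() if CAPABILITIES[kind]["disseminates"]]


# Hypothetical clients mapped to their category.
clients = {"desktop-1": "Cuckoo-Comp", "laptop-1": "Cuckoo-Lite", "phone-1": "Cuckoo-Mobile"}
dht = dht_members(clients)
peers = dissemination_peers(clients)
```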

Dataset
  1. Twitter dataset containing information on 30,000 users
  2. MySpace dataset to model session durations
  3. Classify the three categories of Cuckoo users according to their daily online time
~50% of Cuckoo peers are Cuckoo-Mobile clients.

Implementation and Deployment
A Java prototype of Cuckoo comprises both the Cuckoo peer and the server cloud.
  - Cuckoo client: 5000 lines of code
  - Server cloud: 1500 lines of code
30,000 Cuckoo clients on 12 machines
4 machines to build the server cloud

Server Cloud Performance - Resource Usage
[Figures: CPU, memory, and incoming/outgoing traffic.]
Results
  1. ~50% CPU usage reduction
  2. ~50%/~16% memory usage reduction at peak/leisure time
  3. ~50% bandwidth savings for incoming/outgoing traffic

Cuckoo Peer Performance - Message Sharing
Results
  - 30% of users get more than 5% of tweets from other peers
  - 20% of users get more than 10% of tweets from other peers
  -> The performance is mainly impacted by user online durations
  -> The MySpace duration dataset leads to a pessimistic deviation
Expecting better performance in Cuckoo

Cuckoo Peer Performance - Micronews Dissemination
Results
  1. 95+% coverage rate of content dissemination
  2. 90% of valid micronews received are within 8 hops
  3. 89% of users receive less than 6 redundant tweets per dissemination round

Related Work
Microblogging and Pub-Sub Systems
  - [Rama_NSDI2006], [Sandler_IPTPS2005]
Measurement Study on Microblogging
  - [Ghosh_WOSN2010], [Krish_WOSN2007], [Kwak_WWW2009], [Cha_ICWSM2010]
Decentralized Microblogging
  - [Sandler_IPTPS2009], [Buchegger_SNS2009], [Shakimov_WOSN2009]

Conclusion
A detailed measurement of Twitter
A novel system architecture tailored for microblogging to address scalability issues
  - Relieve main server burden
  - Achieve scalable content delivery
  - Decoupling the dual functionality components
A prototype implementation and trace-driven emulation over 30,000 Twitter users
  - Notable bandwidth savings
  - Notable CPU and memory reduction
  - Good performance of content delivery/dissemination

Acknowledgement Opera Group U.C. San Diego, USA Middleware Conference

Thank you very much!!