Scalable Data Scale #2 site on the Internet (time on site) >200 billion monthly page views Over 1 million developers in 180 countries.


1 Scalable Data Management@facebook

2 Scale

3 ▪ #2 site on the Internet (time on site) ▪ >200 billion monthly page views ▪ Over 1 million developers in 180 countries ▪ Over 300 million active users ▪ More than 2^32 photos … ▪ 100 million search queries per day ▪ >3.9 trillion feed actions processed per day ▪ 2 billion pieces of content per week ▪ 6 billion minutes per day

4 Growth Rate ▪ 2009: 300M active users

5 Social Networks

6 OSNs are popular! (nikos | METIS | 2012) ▪ Online social networks (OSNs) have become wildly popular over the last few years: FB > 800M, Twitter > 230M, etc. ▪ Distributed across the planet ▪ Changed how content is created and consumed: inherently long-tailed, as only ‘friends’ are interested ▪ Explosion of smartphones: photos and HD videos are easy to shoot and share

7 Scaling Social Networks ▪ Much harder than typical websites where... ▪ Typically 1-2% online: easy to cache the data ▪ Partitioning & scaling relatively easy ▪ What do you do when everything is interconnected?

8 [Figure: grid of items, each labeled either "name, status, privacy, profile photo" or "name, status, privacy, video thumbnail"]

9 System Architecture

10 Overall architecture: Facebook ▪ Facebook has 2 datacenters, 1 per coast ▪ Reads are spread across both ▪ Writes go only to the West Coast and are periodically (~10 minutes) replicated to the East Coast ▪ >2000 MySQL servers, >25TB RAM for memcached ▪ Challenge: inconsistency due to stale data ▪ I change my status message => friends on the East Coast datacenter don’t see the change for 10 min ▪ What if an East Coast person changes their own status?
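
The stale-read problem on this slide can be reproduced with a toy model. The sketch below is only an illustration under stated assumptions: the class names (`Datacenter`, `TwoCoastStore`), the `tick` method, and the batch-copy replication are hypothetical stand-ins, not Facebook's actual replication mechanism. The only figures taken from the slide are the single write master on the West Coast and the roughly 10-minute replication interval.

```python
from dataclasses import dataclass, field

REPLICATION_INTERVAL_S = 600  # ~10 minutes, the figure given on the slide


@dataclass
class Datacenter:
    name: str
    rows: dict = field(default_factory=dict)


class TwoCoastStore:
    """Toy model: every write goes to the West master; East is a lagging copy."""

    def __init__(self):
        self.west = Datacenter("west")
        self.east = Datacenter("east")
        self.last_sync = 0.0

    def write(self, key, value):
        # Writes are accepted only by the West Coast datacenter.
        self.west.rows[key] = value

    def read(self, key, coast):
        # Reads are served by whichever coast the reader is routed to.
        dc = self.west if coast == "west" else self.east
        return dc.rows.get(key)

    def tick(self, now):
        # Hypothetical periodic batch replication from West to East.
        if now - self.last_sync >= REPLICATION_INTERVAL_S:
            self.east.rows = dict(self.west.rows)
            self.last_sync = now


store = TwoCoastStore()
store.write("alice:status", "hello")
store.tick(now=60)                              # too early, nothing replicated
west_view = store.read("alice:status", "west")  # sees the new status
east_view = store.read("alice:status", "east")  # still stale
store.tick(now=600)                             # replication interval reached
east_view_later = store.read("alice:status", "east")
```

In this model an East Coast user who updates their own status and then reads it back locally would not see the change until the next replication, which is the question the slide ends on.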

11 Web at 100 feet: georeplication & CDNs ▪ Source: “How Facebook Works”, Technology Review, Jul/Aug 2008

12 Architecture ▪ Load Balancer (assigns a web server) ▪ Web Server (PHP assembles data) ▪ Memcache (fast, simple) ▪ Database (slow, persistent)

13 Memcache ▪ Simple in-memory hash table ▪ Supports get/set, delete, multiget, multiset ▪ Not a write-through cache ▪ Pros and Cons ▪ The Database Shield! ▪ Low latency, very high request rates ▪ Can be easy to corrupt, inefficient for very small items
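
Because the cache is not write-through, the application has to keep it in step with the database itself. The sketch below shows one common way to do that, a look-aside pattern where reads fill the cache on a miss and writes invalidate the cached key. This is an assumption about usage, not something the slide states. `Memcache` and `Database` here are hypothetical in-process stand-ins, not a real client library, and `load` and `store` are illustrative helper names.

```python
class Memcache:
    """Stand-in for a memcache client: a plain in-memory hash table."""

    def __init__(self):
        self._table = {}

    def get(self, key):
        return self._table.get(key)

    def set(self, key, value):
        self._table[key] = value

    def delete(self, key):
        self._table.pop(key, None)

    def multiget(self, keys):
        # Returns only the keys that are present in the cache.
        return {k: self._table[k] for k in keys if k in self._table}


class Database:
    """Stand-in for the slow, persistent store; counts reads."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def select(self, key):
        self.reads += 1
        return self.rows.get(key)

    def update(self, key, value):
        self.rows[key] = value


def load(cache, db, key):
    # Look-aside read: try the cache, fall back to the database on a miss.
    value = cache.get(key)
    if value is None:
        value = db.select(key)
        if value is not None:
            cache.set(key, value)
    return value


def store(cache, db, key, value):
    # The cache is not write-through: write the database, then invalidate.
    db.update(key, value)
    cache.delete(key)
```

With this arrangement repeated reads of a hot key reach the database once, which is the "database shield" effect the slide refers to.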

14 Memcache Optimization ▪ Multithreading and efficient protocol code - 50k req/s ▪ Polling network drivers - 150k req/s ▪ Breaking up stats lock - 200k req/s ▪ Batching packet handling - 250k req/s ▪ Breaking up cache lock - future
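
The throughput figures above are easier to compare as speedups. The short calculation below assumes each figure is the request rate reached after applying that step on top of the previous ones, which the slide suggests but does not state outright.

```python
# Requests per second, as listed on the slide.
stages = [
    ("Multithreading and efficient protocol code", 50_000),
    ("Polling network drivers", 150_000),
    ("Breaking up stats lock", 200_000),
    ("Batching packet handling", 250_000),
]


def cumulative_speedups(stages):
    """Speedup of each stage relative to the first listed figure."""
    base = stages[0][1]
    return [(name, rate / base) for name, rate in stages]


for name, speedup in cumulative_speedups(stages):
    print(f"{name}: {speedup:.1f}x")
```

Under that reading, the four completed steps together give a 5x gain over the first figure, with polling network drivers contributing the largest single jump.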

15 Network Incast [Diagram: PHP Client → Switch → Memcache; many small get requests]

16 Network Incast [Diagram: Memcache → Switch → PHP Client; many big data packets]

17 Network Incast [Diagram: PHP Client, Switch, Memcache]

18 Network Incast [Diagram: PHP Client, Switch, Memcache]

19 Clustering [Diagram: PHP Client fetching from three Memcache servers holding 3, 4, and 3 objects] ▪ 3 round trips total ▪ 1 round trip per server
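
The round-trip count on this slide follows from batching: keys are grouped by the server that holds them, and each server receives a single multiget. A minimal sketch, assuming a hypothetical key-to-server `placement` table and made-up server names (`mc1` to `mc3`) chosen to mirror the 3/4/3 split in the diagram:

```python
from collections import defaultdict


def plan_multiget(keys, placement):
    """Group keys by the server that holds them: one multiget per server."""
    batches = defaultdict(list)
    for key in keys:
        batches[placement[key]].append(key)
    return dict(batches)


# Hypothetical placement mirroring the slide: 3, 4 and 3 objects on three servers.
placement = {f"obj{i}": "mc1" for i in range(0, 3)}
placement.update({f"obj{i}": "mc2" for i in range(3, 7)})
placement.update({f"obj{i}": "mc3" for i in range(7, 10)})

batches = plan_multiget(list(placement), placement)
round_trips = len(batches)  # one request per server, not one per object
```

The number of round trips therefore grows with the number of servers the keys are spread over, not with the number of objects requested.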

20 [Diagram: multiple Scribe nodes] ▪ Thousands of MySQL servers in two datacenters ▪ MySQL has played a role from the beginning

21 Photos

22 Photos + Social Graph = Awesome!

23 Photos: Scale ▪ 20 billion photos × 4 sizes each = 80 billion images ▪ Would wrap around the world more than 10 times! ▪ Over 40M new photos per day ▪ 600K photos / second
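
A quick back-of-the-envelope check of these numbers, as a sketch. Two readings here are assumptions and are not spelled out on the slide: that "x4" means four stored sizes per photo, and that "600K photos / second" is a serving rate, not an upload rate.

```python
photos = 20_000_000_000
sizes_per_photo = 4                       # assumed meaning of "x4"
stored_images = photos * sizes_per_photo  # 80 billion

new_photos_per_day = 40_000_000
seconds_per_day = 24 * 60 * 60
uploads_per_second = new_photos_per_day / seconds_per_day

served_per_second = 600_000               # assumed to be the serving rate
serves_per_upload = served_per_second / uploads_per_second

print(f"stored images:     {stored_images:,}")
print(f"uploads per second: {uploads_per_second:.0f}")
print(f"serves per upload:  {serves_per_upload:.0f}")
```

If those readings hold, uploads average a few hundred per second while serving is roughly three orders of magnitude higher, so the read path dominates the design.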

24 Photos Scaling - The easy wins ▪ Upload tier: handles uploads, scales images, stores them on NFS ▪ Serving tier: images served from NFS via HTTP ▪ However... ▪ File systems are not good at supporting a large number of files ▪ Metadata is too large to fit in memory, causing too many I/Os for each file read ▪ Limited by I/O, not storage density ▪ Easy wins ▪ CDN ▪ Cachr (HTTP server + caching) ▪ NFS file handle cache
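
The slide names an NFS file handle cache but does not describe how it was built. As an illustration only, the sketch below shows the general idea with a small LRU cache that remembers the result of an expensive per-file lookup so that repeated reads of the same photo skip the metadata I/O. The class name, the `lookup` callback, and the LRU eviction policy are all assumptions.

```python
from collections import OrderedDict


class FileHandleCache:
    """LRU cache from file path to an already-resolved handle."""

    def __init__(self, capacity, lookup):
        self.capacity = capacity
        self.lookup = lookup           # expensive call, e.g. a metadata lookup
        self.handles = OrderedDict()

    def get(self, path):
        if path in self.handles:
            self.handles.move_to_end(path)    # mark as most recently used
            return self.handles[path]
        handle = self.lookup(path)            # cache miss: pay the lookup cost
        self.handles[path] = handle
        if len(self.handles) > self.capacity:
            self.handles.popitem(last=False)  # evict the least recently used
        return handle


lookups = []


def fake_lookup(path):
    # Hypothetical stand-in for the metadata I/O needed to resolve a file.
    lookups.append(path)
    return f"handle:{path}"


cache = FileHandleCache(capacity=2, lookup=fake_lookup)
cache.get("a.jpg")
cache.get("a.jpg")  # served from the cache, no second lookup
```

A cache like this helps only while the hot set of files fits in it, which is consistent with the slide listing it among the easy wins and not as a fix for the underlying metadata problem.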

