Fast Data at Massive Scale: Lessons Learned at Facebook, by Bobby Johnson

1

2 Fast Data at Massive Scale: Lessons Learned at Facebook, by Bobby Johnson

3 Me
Director of Engineering
- Scaling and Performance
- Site Security
- Site Reliability
- Distributed Systems
- Development tools
- Customer Service Tools
Took Facebook from 7M users to 120M.

4

5 Architecture
- Load Balancer (assigns a web server)
- Web Server (PHP assembles data)
- Memcache (fast)
- Database (slow, persistent)
- Other services: Search, Feed, etc. (ignore for now)

6
- 1/2 the time is in PHP
- 1/4 is in memcache
- 1/8 is in the database
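
A breakdown like the one on this slide presupposes that each request records where its time went. The following is only a minimal sketch of such a recorder, not the tooling used at Facebook: the class name `RequestTimer` and the component labels are assumptions made for illustration. Timed sections must not nest, otherwise the same time is charged twice.

```php
<?php
// Hypothetical per-request timer; names and labels are illustrative only.
final class RequestTimer
{
    /** @var array<string, float> seconds accumulated per component */
    private array $totals = [];

    /** Run $fn, charge its wall-clock time to $component, return its result. */
    public function time(string $component, callable $fn)
    {
        $start = microtime(true);
        try {
            return $fn();
        } finally {
            // Charged even if $fn throws, so failed calls still show up.
            $this->totals[$component] = ($this->totals[$component] ?? 0.0)
                + (microtime(true) - $start);
        }
    }

    /** Fraction of the measured time spent in each component. */
    public function shares(): array
    {
        $sum = array_sum($this->totals);
        if ($sum <= 0.0) {
            return [];
        }
        return array_map(fn($seconds) => $seconds / $sum, $this->totals);
    }
}
```

Wrapping each cache and database call as `$timer->time('memcache', fn() => ...)` and logging `shares()` at the end of the request would give the kind of per-component fractions listed above.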

7 One year ago, almost half the time was in memcache.

8 Network Incast
[Diagram: PHP client, switch, memcache]
Many small get requests

9 Network Incast
[Diagram: PHP client, switch, memcache]
Many big data packets

10 Clustering
[Diagram: PHP client, one memcache server holding 10 objects]
- 1 round trip for 10 objects

11 Clustering
[Diagram: PHP client, memcache servers holding 5 objects each]
- 2 round trips total
- 1 round trip per server
- longest request is 5

12 Clustering
[Diagram: PHP client, memcache servers holding 3, 4, and 3 objects]
- 3 round trips total
- 1 round trip per server
- longest request is 4

13 Clustering
- If objects are small, round trips dominate, so you want objects clustered.
- If objects are large, transfer time dominates, so you want objects distributed.
- In a web application you will almost always be dealing with small objects.
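
The tradeoff in the clustering slides can be made concrete with a toy cost model. This is a sketch under stated assumptions, not a model from the talk: it assumes every contacted server is queried in parallel, that each request carries a fixed overhead, and that each object adds a fixed transfer cost. The function names and the microsecond figures are made up for illustration.

```php
<?php
// Hypothetical cost model; all names and constants are assumptions.

/** Round trips and longest request when server i holds $objectsPerServer[i] objects. */
function fetch_shape(array $objectsPerServer): array
{
    // Servers holding none of the wanted objects are not contacted.
    $contacted = array_filter($objectsPerServer, fn($n) => $n > 0);
    return [
        'round_trips'     => count($contacted),
        'longest_request' => $contacted ? max($contacted) : 0,
    ];
}

/** Latency seen by the client and total work done, in microseconds. */
function estimate_cost(array $objectsPerServer, int $overheadUs, int $perObjectUs): array
{
    $shape = fetch_shape($objectsPerServer);
    if ($shape['round_trips'] === 0) {
        return ['latency_us' => 0, 'work_us' => 0];
    }
    return [
        // Requests run in parallel, so the slowest (longest) one sets the latency.
        'latency_us' => $overheadUs + $shape['longest_request'] * $perObjectUs,
        // Every round trip pays the overhead; every object pays the transfer cost.
        'work_us'    => $shape['round_trips'] * $overheadUs
            + array_sum($objectsPerServer) * $perObjectUs,
    ];
}

// Small objects (1 us each, 200 us overhead): spreading out barely helps latency.
$smallClustered   = estimate_cost([10], 200, 1);       // latency 210, work 210
$smallDistributed = estimate_cost([3, 4, 3], 200, 1);  // latency 204, work 610

// Large objects (500 us each): spreading out cuts latency by more than half.
$largeClustered   = estimate_cost([10], 200, 500);      // latency 5200, work 5200
$largeDistributed = estimate_cost([3, 4, 3], 200, 500); // latency 2200, work 5600
```

With small objects the distributed layout nearly triples the work for a 6 us latency gain, which is the sense in which round trips dominate; with large objects the extra round trips are cheap relative to the transfer time saved.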

14 Caching
- Basic tools are parallelism and clustering
- Clustering is a latency/throughput tradeoff
- Application code must be aware
- Networking is a burst problem
- Dropped packets kill you
- TCP quick ack
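
One way application code can be "aware" of clustering is to choose the server from a part of the key that related objects share, then issue one batched get per server. The sketch below assumes keys of the form `prefix:field` and a fixed server count; both helper functions are hypothetical and belong to no memcache client library, and the talk does not say which scheme Facebook used.

```php
<?php
// Hypothetical helpers for illustration only.

/** Choose a server from the part of the key before the first ':'. */
function server_for_key(string $key, int $serverCount): int
{
    $clusterPart = explode(':', $key, 2)[0];   // "user42" from "user42:name"
    // Mask keeps the value non-negative on 32-bit builds of PHP.
    return (crc32($clusterPart) & 0x7fffffff) % $serverCount;
}

/** Group keys by server so that each server receives a single multi-get. */
function group_keys_by_server(array $keys, int $serverCount): array
{
    $batches = [];
    foreach ($keys as $key) {
        $batches[server_for_key($key, $serverCount)][] = $key;
    }
    return $batches;
}

// All of one user's objects share a prefix, so they land in one batch.
$batches = group_keys_by_server(['user42:name', 'user42:photo', 'user42:status'], 8);
```

Hashing the full key instead of the shared prefix would scatter the same three objects across up to three servers, trading extra round trips for shorter individual requests.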

15 PHP CPU

16 Application Improvements

17 know what your libraries do
    $results = get_search_results( $needle );
    foreach ( $results as &$result ) {
        if ( is_pending_friend( $result['id'] ) ) {
            // we'll change the links based on this
            $result['pending'] = true;
        }
    }
    unset( $result );

18 know what your libraries do
    function is_pending_friend( $id ) {
        // this is short-lived, so don't cache
        return expensive_db_query( $id … );
    }
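
The two slides above show a loop that calls an uncached, expensive query once per search result. The talk does not say how this was fixed; one common remedy, sketched here, is to ask for all ids in a single batched query. `FakeDb`, its made-up data, and `mark_pending` are hypothetical stand-ins that only count how many queries would be issued.

```php
<?php
// Hypothetical stand-in for the database layer; the data is invented.
final class FakeDb
{
    public static int $queries = 0;

    /** Return the subset of $ids that are pending friends, in one query. */
    public static function pendingFriendIds(array $ids): array
    {
        self::$queries++;                  // one query for the whole batch
        $pending = [3, 5];                 // pretend these rows came from the db
        return array_values(array_intersect($ids, $pending));
    }
}

/** Flag pending friends in the result list using a single lookup. */
function mark_pending(array $results): array
{
    $ids = array_column($results, 'id');
    // Flip to id => index so membership checks are isset() lookups.
    $pending = array_flip(FakeDb::pendingFriendIds($ids));
    foreach ($results as &$result) {
        if (isset($pending[$result['id']])) {
            $result['pending'] = true;
        }
    }
    unset($result);                        // drop the dangling reference
    return $results;
}

$marked = mark_pending([['id' => 1], ['id' => 3], ['id' => 5]]);
```

For three results this issues one query instead of three, and the saving grows with the length of the result list.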

19 Databases
- Tend to be slower than lighter-weight alternatives, so avoid using them
- If you do use them, partition them right from the start
- If a query is _really_ slow, like a few seconds or a few minutes, you probably have a bug where you're scanning a table
- The db should have a command to tell you what index it's using for a query, and how many rows it's examining

20 General Lessons
- Your best tool is parallelism
- Look at your data
- Build tools to look at your data
- Don't make assumptions about what components are doing
- Algorithmic and system improvements are almost always better than micro-optimization

