
1 Scalable Web Architectures Common Patterns & Approaches Cal Henderson

2 Hello

3 Flickr Large scale kitten-sharing Almost 5 years old –4.5 since public launch Open source stack –An open attitude towards architecture

4 Flickr Large scale In a single second: –Serve 40,000 photos –Handle 100,000 cache operations –Process 130,000 database queries

5 Scalable Web Architectures? What does scalable mean? What's an architecture? An answer in 12 parts

6 1. Scaling

7 Scalability – myths and lies What is scalability?

8 Scalability – myths and lies What is scalability not?

9 Scalability – myths and lies What is scalability not? –Raw Speed / Performance –HA / BCP –Technology X –Protocol Y

10 Scalability – myths and lies So what is scalability?

11 Scalability – myths and lies So what is scalability? –Traffic growth –Dataset growth –Maintainability

12 Today Two goals of application architecture: Scale HA

13 Today Three goals of application architecture: Scale HA Performance

14 Scalability Two kinds: –Vertical (get bigger) –Horizontal (get more)

15 Big Irons Sunfire E20k $450,000 - $2,500,000 36x 1.8GHz processors PowerEdge SC1435 Dual-core 1.8GHz processor Around $1,500

16 Cost vs Cost

17 That's OK Sometimes vertical scaling is right Buying a bigger box is quick(ish) Redesigning software is not Running out of MySQL performance? –Spend months on data federation –Or just buy a ton more RAM

18 The H & the V But we'll mostly talk horizontal –Else this is going to be boring

19 2. Architecture

20 Architectures then? The way the bits fit together What grows where The trade-offs between good/fast/cheap

21 LAMP We're mostly talking about LAMP –Linux –Apache (or Lighttpd) –MySQL (or Postgres) –PHP (or Perl, Python, Ruby) All open source All well supported All used in large operations (Same rules apply elsewhere)

22 Simple web apps A Web Application –Or "Web Site" in Web 1.0 terminology (Diagram: Interwebnet → App server → Database)

23 Simple web apps A Web Application –Or "Web Site" in Web 1.0 terminology (Diagram: Interwobnet → App server → Database, now with a Cache and a Storage array. AJAX!!!1)

24 App servers App servers scale in two ways:

25 App servers App servers scale in two ways: –Really well

26 App servers App servers scale in two ways: –Really well –Quite badly

27 App servers Sessions! –(State) –Local sessions == bad When they move == quite bad –Centralized sessions == good –No sessions at all == awesome!

28 Local sessions Stored on disk –PHP sessions Stored in memory –Shared memory block (APC) Bad! –Can't move users –Can't avoid hotspots –Not fault tolerant

29 Mobile local sessions Custom built –Store last session location in cookie –If we hit a different server, pull our session information across If your load balancer has sticky sessions, you can still get hotspots –Depends on volume – fewer, heavier users hurt more

30 Remote centralized sessions Store in a central database –Or an in-memory cache No porting around of session data No need for sticky sessions No hot spots Need to be able to scale the data store –But we've pushed the issue down the stack

31 No sessions Stash it all in a cookie! Sign it for safety –$data = $user_id . '-' . $user_name; –$time = time(); –$sig = sha1($secret . $time . $data); –$cookie = base64_encode("$sig-$time-$data"); –Timestamp means it's simple to expire it
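
A fuller sketch of the same scheme (assumptions: $secret is a server-side constant, a 24-hour expiry window, and the field order from the slide; base64_encode() and sha1() are standard PHP):

    // Sign: pack data, timestamp and signature into one cookie value
    function sign_cookie($user_id, $user_name, $secret) {
        $data = $user_id . '-' . $user_name;
        $time = time();
        $sig  = sha1($secret . $time . $data);
        return base64_encode("$sig-$time-$data");
    }

    // Verify: recompute the signature, check the timestamp
    function verify_cookie($cookie, $secret, $max_age = 86400) {
        list($sig, $time, $data) = explode('-', base64_decode($cookie), 3);
        if (time() - (int)$time > $max_age) return false; // expired
        return sha1($secret . $time . $data) === $sig ? $data : false;
    }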

32 Super slim sessions If you need more than the cookie (login status, user id, username), then pull their account row from the DB –Or from the account cache None of the drawbacks of sessions Avoids the overhead of a query per page –Great for high-volume pages which need little personalization –Turns out you can stick quite a lot in a cookie, too –Pack with base64 and it's easy to delimit fields

33 App servers The Rasmus way –App server has "shared nothing" –Responsibility pushed down the stack Ooh, the stack

34 Trifle

35 Trifle Sponge / Database Jelly / Business Logic Custard / Page Logic Cream / Markup Fruit / Presentation

36 Trifle Sponge / Database Jelly / Business Logic Custard / Page Logic Cream / Markup Fruit / Presentation

37 App servers

38 App servers

39 App servers

40 Well, that was easy Scaling the web app server part is easy The rest is the trickier part –Database –Serving static content –Storing static content

41 The others Other services scale similarly to web apps –That is, horizontally The canonical examples: –Image conversion –Audio transcoding –Video transcoding –Web crawling –Compute!

42 Amazon Let's talk about Amazon –S3 – Storage –EC2 – Compute! (Xen based) –SQS – Queueing All horizontal Cheap when small –Not cheap at scale

43 3. Load Balancing

44 Load balancing If we have multiple nodes in a class, we need to balance between them Hardware or software Layer 4 or 7

45 Hardware LB A hardware appliance –Often a pair with heartbeats for HA Expensive! –But offers high performance –Easy to do > 1Gbps Many brands –Alteon, Cisco, Netscaler, Foundry, etc –L7 – web switches, content switches, etc

46 Software LB Just some software –Still needs hardware to run on –But can run on existing servers Harder to have HA –Often people stick hardware LBs in front –But Wackamole helps here

47 Software LB Lots of options –Pound –Perlbal –Apache with mod_proxy Wackamole with mod_backhand –http://backhand.org/wackamole/ –http://backhand.org/mod_backhand/

48 Wackamole

49 Wackamole

50 The layers Layer 4 –A "dumb" balance Layer 7 –A "smart" balance OSI stack, routers, etc

51 4. Queuing

52 Parallelizable == easy! If we can transcode/crawl in parallel, it's easy –But think about queuing –And asynchronous systems –The web ain't built for slow things –But still, a simple problem
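
A minimal sketch of the idea: the web request only enqueues the slow work and returns; a separate worker loop drains the queue. The jobs table, the PDO-style $db handle and the transcode() helper are assumptions for illustration (a real queue also needs an atomic claim step):

    // Web tier: accept the request, queue the slow part, return immediately
    $stmt = $db->prepare("INSERT INTO jobs (type, payload) VALUES (?, ?)");
    $stmt->execute(array('transcode', $video_id));

    // Worker process, running separately:
    while (true) {
        $job = $db->query("SELECT id, payload FROM jobs WHERE taken = 0 LIMIT 1")->fetch();
        if (!$job) { sleep(1); continue; }
        $db->exec("UPDATE jobs SET taken = 1 WHERE id = " . (int)$job['id']);
        transcode($job['payload']); // hypothetical slow operation
    }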

53 Synchronous systems

54 Asynchronous systems

55 Helps with peak periods

56 Synchronous systems

57 Asynchronous systems

58 Asynchronous systems

59 5. Relational Data

60 Databases Unless we're doing a lot of file serving, the database is the toughest part to scale If we can, best to avoid the issue altogether and just buy bigger hardware Dual quad-core Opteron/Intel64 systems with 16+GB of RAM can get you a long way

61 More read power Web apps typically have a read/write ratio of somewhere between 80/20 and 90/10 If we can scale read capacity, we can solve a lot of situations MySQL replication!
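
The classic application-side pattern, sketched with mysqli (hostnames and credentials are placeholders): writes always go to the master, reads are spread across the slave pool.

    // One handle for writes, one picked at random from the slaves
    $master = mysqli_connect('db-master', 'user', 'pass', 'app');
    $slaves = array('db-slave1', 'db-slave2', 'db-slave3');
    $slave  = mysqli_connect($slaves[array_rand($slaves)], 'user', 'pass', 'app');

    mysqli_query($master, "UPDATE users SET name = 'cal' WHERE id = 1"); // write
    $res = mysqli_query($slave, "SELECT * FROM users WHERE id = 1");     // read
    // Beware replication lag: read-your-own-writes may need the master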

62 Master-Slave Replication

63 Master-Slave Replication (Diagram: the master takes reads and writes; slaves take reads only)

64 Master-Slave Replication

65 Master-Slave Replication

66 Master-Slave Replication

67 Master-Slave Replication

68 Master-Slave Replication

69 Master-Slave Replication

70 Master-Slave Replication

71 Master-Slave Replication

72 6. Caching

73 Caching Caching avoids needing to scale! –Or makes it cheaper Simple stuff –mod_perl / shared memory Invalidation is hard –MySQL query cache Bad performance (in most cases)

74 Caching Getting more complicated… –Write-through cache –Write-back cache –Sideline cache

75 Write-through cache

76 Write-back cache

77 Sideline cache

78 Sideline cache Easy to implement –Just add app logic Need to manually invalidate cache –Well-designed code makes it easy Memcached –From Danga (LiveJournal) –http://www.danga.com/memcached/
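
The read-through-plus-manual-invalidation pattern described above, sketched with the PECL Memcache client (the key names and the db_* helpers are assumptions):

    $mc = new Memcache();
    $mc->addServer('10.0.0.1', 11211);

    // Read path: try the cache, fall back to the DB, repopulate
    $user = $mc->get("user:$user_id");
    if ($user === false) {
        $user = db_fetch_user($user_id);           // hypothetical DB helper
        $mc->set("user:$user_id", $user, 0, 3600); // cache for an hour
    }

    // Write path: update the DB, then invalidate by hand
    db_update_user($user_id, $fields);             // hypothetical DB helper
    $mc->delete("user:$user_id");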

79 Memcache schemes Layer 4 –Good: Cache can be local on a machine –Bad: Invalidation gets more expensive with node count –Bad: Cache space wasted by duplicate objects

80 Memcache schemes Layer 7 –Good: No wasted space –Good: Linearly scaling invalidation –Bad: Multiple, remote connections Can be avoided with a proxy layer –Gets more complicated »Last indentation level!
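
The simplest layer-7 scheme hashes the key itself to pick a server, so every app server agrees on where an object lives (a sketch; real clients also handle dead servers, and consistent hashing reduces reshuffling when the pool changes):

    $servers = array('10.0.0.1:11211', '10.0.0.2:11211', '10.0.0.3:11211');
    // Every app server computes the same mapping for a given key
    $host = $servers[abs(crc32($key)) % count($servers)];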

81 7. HA Data

82 But what about HA?

83 But what about HA?

84 SPOF! The key to HA is avoiding SPOFs –Identify –Eliminate Some stuff is hard to solve –Fix it further up the tree Dual DCs solve the router/switch SPOF

85 Master-Master

86 Master-Master Either hot/warm or hot/hot Writes can go to either –But avoid collisions –No auto-inc columns for hot/hot Bad for hot/warm too Unless you have MySQL 5 –But you can't rely on the ordering! –Design schema/access to avoid collisions Hashing users to servers
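
MySQL 5's answer to the auto-increment collision problem is to interleave the sequences so the two masters can never hand out the same id (a my.cnf sketch; auto_increment_increment and auto_increment_offset are real MySQL 5 variables). This is also why you can't rely on the ordering: ids no longer arrive sequentially across the pair.

    # master A: generates ids 1, 3, 5, ...
    auto_increment_increment = 2
    auto_increment_offset    = 1

    # master B: generates ids 2, 4, 6, ...
    auto_increment_increment = 2
    auto_increment_offset    = 2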

87 Rings Master-master is just a small ring –With 2 nodes Bigger rings are possible –But not a mesh! –Each slave may only have a single master –Unless you build some kind of manual replication

88 Rings

89 Rings

90 Dual trees Master-master is good for HA –But we can't scale out the reads (or writes!) We often need to combine the read scaling with HA We can simply combine the two models

91 Dual trees

92 Cost models There's a problem here –We need to always have 200% capacity to avoid a SPOF 400% for dual sites! –This costs too much Solution is straightforward –Make sure clusters are bigger than 2

93 N+M –N = nodes needed to run the system –M = nodes we can afford to lose Having M as big as N starts to suck –If we can make each node smaller, we can increase N while M stays constant –(We assume smaller nodes are cheaper)

94 1+1 = 200% hardware

95 3+1 = 133% hardware

96 Meshed masters Not possible with regular MySQL out-of-the-box today But there is hope! –NDB (MySQL Cluster) allows a mesh –Support for replication out to slaves in a coming version RSN!

97 8. Federation

98 Data federation At some point, you need more writes –This is tough –Each cluster of servers has limited write capacity Just add more clusters!

99 Simple things first Vertical partitioning –Divide tables into sets that never get joined –Split these sets onto different server clusters –Voila! Logical limits –When you run out of non-joining groups –When a single table grows too large

100 Data federation Split up large tables, organized by some primary object –Usually users Put all of a user's data on one 'cluster' –Or shard, or cell Have one central cluster for lookups
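
In code, the central cluster is just one level of indirection before the real query. A sketch (the user_shards table, connect_to_shard() helper and PDO-style handles are assumptions; in practice the lookup result is itself heavily cached):

    // 1. Ask the central cluster which shard holds this user
    $row   = $central->query("SELECT shard_id FROM user_shards WHERE user_id = $user_id")->fetch();
    $shard = connect_to_shard($row['shard_id']); // hypothetical pool lookup

    // 2. All of this user's data lives on that one shard
    $photos = $shard->query("SELECT * FROM photos WHERE user_id = $user_id")->fetchAll();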

101 Data federation

102 Data federation Need more capacity? –Just add shards! –Don't assign to shards based on user_id! For resource leveling as time goes on, we want to be able to move objects between shards –Maybe – not everyone does this –'Lockable' objects

103 The wordpress.com approach Hash users into one of n buckets –Where n is a power of 2 Put all the buckets on one server When you run out of capacity, split the buckets across two servers Then when you run out of capacity, split the buckets across four servers Etc
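
A sketch of that scheme (the names are assumptions): the bucket count is fixed forever, and only the bucket-to-server map changes at each split.

    $n_buckets = 256; // a power of 2, chosen once and never changed
    $bucket = abs(crc32($user_login)) % $n_buckets;

    // Start: all buckets on one server. After the first split:
    $server = ($bucket < $n_buckets / 2) ? 'db1' : 'db2';
    // Next split: four servers take a quarter of the buckets each, etc.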

104 Data federation Heterogeneous hardware is fine –Just give a larger/smaller proportion of objects depending on hardware Bigger/faster hardware for paying users –A common approach –Can also allocate faster app servers via magic cookies at the LB

105 Downsides Need to keep stuff in the right place App logic gets more complicated More clusters to manage –Backups, etc More database connections needed per page –A proxy can solve this, but it's complicated The dual table issue –Avoid walking the shards!

106 Bottom line Data federation is how large applications are scaled

107 Bottom line It's hard, but not impossible Good software design makes it easier –Abstraction! Master-master pairs for shards give us HA Master-master trees work for the central cluster (many reads, few writes)

108 9. Multi-site HA

109 Multiple Datacenters Having multiple datacenters is hard –Not just with MySQL Hot/warm with MySQL slaved setup –But manual (reconfig on failure) Hot/hot with master-master –But dangerous (each site has a SPOF) Hot/hot with sync/async manual replication –But tough (big engineering task)

110 Multiple Datacenters

111 GSLB Multiple sites need to be balanced –Global Server Load Balancing Easiest are AkaDNS-like services –Performance rotations –Balance rotations

112 10. Serving Files

113 Serving lots of files Serving lots of files is not too tough –Just buy lots of machines and load balance! We're IO bound – need more spindles! –But keeping many copies of data in sync is hard –And sometimes we have other per-request overhead (like auth)

114 Reverse proxy

115 Reverse proxy Serving out of memory is fast! –And our caching proxies can have disks too –Fast or otherwise, more spindles is better We can parallelize it! –50 cache servers give us 50 times the serving rate of the origin server –Assuming the working set is small enough to fit in memory in the cache cluster

116 Invalidation Dealing with invalidation is tricky We can prod the cache servers directly to clear stuff out –Scales badly – need to clear the asset from every server – doesn't work well for 100 caches

117 Invalidation We can change the URLs of modified resources –And let the old ones drop out of cache naturally –Or poke them out, for sensitive data Good approach! –Avoids browser cache staleness –Hello Akamai (and other CDNs) –Read more: http://www.thinkvitamin.com/features/webapps/serving-javascript-fast

118 CDN – Content Delivery Network Akamai, Savvis, Mirror Image Internet, etc Caches operated by other people –Already in place –In lots of places GSLB/DNS balancing

119 Edge networks (Diagram: origin server)

120 Edge networks (Diagram: origin server with edge caches)

121 CDN Models Simple model –You push content to them, they serve it Reverse proxy model –You publish content on an origin, they proxy and cache it

122 CDN Invalidation You don't control the caches –Just like those awful ISP ones Once something is cached by a CDN, assume it can never change –Nothing can be deleted –Nothing can be modified

123 Versioning When you start to cache things, you need to care about versioning –Invalidation & Expiry –Naming & Sync

124 Cache Invalidation If you control the caches, invalidation is possible But remember ISP and client caches Remove deleted content explicitly –Avoid users finding old content –Save cache space

125 Cache versioning Simple rule of thumb: –If an item is modified, change its name (URL) This can be independent of the file system!

126 Virtual versioning Database indicates version 3 of file Web app writes version number into URL Request comes through cache and is cached with the versioned URL mod_rewrite converts versioned URL to path (Example: example.com/foo_3.jpg is cached as foo_3.jpg, then rewritten to foo.jpg on disk)
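
The last step as a hypothetical mod_rewrite rule (server-config context assumed): strip the version number off the virtual URL before touching the filesystem.

    RewriteEngine on
    # example.com/foo_3.jpg -> /foo.jpg on disk; caches key on the full
    # versioned URL, so bumping the version is an instant invalidation
    RewriteRule ^/(.*)_[0-9]+\.(jpg|gif|png)$ /$1.$2 [L]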

127 Reverse proxy Choices –L7 load balancer & Squid http://www.squid-cache.org/ –mod_proxy & mod_cache http://www.apache.org/ –Perlbal and Memcache? http://www.danga.com/

128 More reverse proxy HAProxy Squid & CARP Varnish

129 High overhead serving What if you need to authenticate your asset serving? –Private photos –Private data –Subscriber-only files Two main approaches –Proxies w/ tokens –Path translation

130 Perlbal Re-proxying Perlbal can do redirection magic –Client sends request to Perlbal –Perlbal plugin verifies user credentials (tokens, cookies, whatever; tokens avoid data-store access) –Perlbal goes to pick up the file from elsewhere –Transparent to user

131 Perlbal Re-proxying

132 Perlbal Re-proxying Doesn't keep the database around while serving Doesn't keep the app server around while serving User doesn't find out how to access the asset directly

133 Permission URLs But why bother!? If we bake the auth into the URL, it saves the auth step We can do the auth on the web app servers when creating the HTML Just need some magic to translate to paths We don't want paths to be guessable
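
One way to make paths unguessable, as a sketch (not necessarily Flickr's actual scheme): sign the path with a server-side secret when generating the page, and have the serving tier recompute and compare.

    // When building the HTML (user already authenticated):
    $path = "/photos/$photo_id.jpg";
    $sig  = sha1($secret . $path); // $secret never leaves the servers
    $url  = "http://img.example.com/$sig$path";

    // Serving tier: recompute sha1($secret . $path) from the requested
    // URL and compare before sending the file; no DB or session needed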

134 Permission URLs

135 Permission URLs (or mod_perl)

136 Permission URLs Downsides –URL gives permission for life –Unless you bake in tokens Tokens tend to be non-expirable –We don't want to track every token »Too much overhead But can still expire –But you choose at time of issue, not later Upsides –It works –Scales very nicely

137 11. Storing Files

138 Storing lots of files Storing files is easy! –Get a big disk –Get a bigger disk –Uh oh! Horizontal scaling is the key –Again

139 Connecting to storage NFS –Stateful == Sucks –Hard mounts vs soft mounts, INTR SMB / CIFS / Samba –Turn off MSRPC & WINS (NetBIOS NS) –Stateful but degrades gracefully HTTP –Stateless == Yay! –Just use Apache

140 Multiple volumes Volumes are limited in total size –Except (in theory) under ZFS & others Sometimes we need multiple volumes for performance reasons –When using RAID with single/dual parity At some point, we need multiple volumes

141 Multiple volumes

142 Multiple hosts Further down the road, a single host will be too small Total throughput of machine becomes an issue Even physical space can start to matter So we need to be able to use multiple hosts

143 Multiple hosts

144 HA Storage HA is important for file assets too –We can back stuff up –But we tend to want hot redundancy RAID is good –RAID 5 is cheap, RAID 10 is fast

145 HA Storage But whole machines can fail So we stick assets on multiple machines In this case, we can ignore RAID –In failure case, we serve from alternative source –But need to weigh up the rebuild time and effort against the risk –Store more than 2 copies?

146 HA Storage

147 HA Storage But whole colos can fail So we stick assets in multiple colos Again, we can ignore RAID –Or even multi-machine per colo –But do we want to lose a whole colo because of a single disk failure? –Redundancy is always a balance

148 Self-repairing systems When something fails, repairing can be a pain –RAID rebuilds by itself, but machine replication doesn't The big appliances self-heal –NetApp, StorEdge, etc So does MogileFS (reaper)

149 Real Life Case Studies

150 GFS – Google File System Developed by … Google Proprietary Everything we know about it is based on talks they've given Designed to store huge files for fast access

151 GFS – Google File System Single 'Master' node holds metadata –SPOF – Shadow master allows warm swap Grid of 'chunkservers' –64-bit filenames –64MB file chunks

152 GFS – Google File System (Diagram: master node in front of chunkservers holding replicated chunks 1(a), 1(b), 2(a))

153 GFS – Google File System Client reads metadata from the master, then file parts from multiple chunkservers Designed for big files (>100MB) Master server allocates access leases Replication is automatic and self-repairing –Synchronously, for atomicity

154 GFS – Google File System Reading is fast (parallelizable) –But requires a lease Master server is required for all reads and writes

155 MogileFS – OMG Files Developed by Danga / SixApart Open source Designed for scalable web app storage

156 MogileFS – OMG Files Single metadata store (MySQL) –MySQL Cluster avoids SPOF Multiple 'tracker' nodes locate files Multiple 'storage' nodes store files

157 MogileFS – OMG Files (Diagram: trackers between the application, a MySQL metadata store and the storage nodes)

158 MogileFS – OMG Files Replication of file 'classes' happens transparently Storage nodes are not mirrored – replication is piecemeal Reading and writing go through trackers, but are performed directly upon storage nodes

159 Flickr File System Developed by Flickr Proprietary Designed for very large scalable web app storage

160 Flickr File System No metadata store –Deal with it yourself Multiple 'StorageMaster' nodes Multiple storage nodes with virtual volumes

161 Flickr File System (Diagram: StorageMaster (SM) nodes in front of mirrored storage nodes)

162 Flickr File System Metadata stored by the app –Just a virtual volume number –App chooses a path Virtual nodes are mirrored –Locally and remotely Reading is done directly from nodes

163 Flickr File System StorageMaster nodes are only used for write operations Reading and writing can scale separately

164 Amazon S3 A big disk in the sky Multiple 'buckets' Files have user-defined keys Data + metadata

165 Amazon S3 (Diagram: your servers pushing files to Amazon)

166 Amazon S3 (Diagram: your servers pushing files to Amazon; users fetching them directly from Amazon)

167 The cost Fixed price, by the GB Store: $0.15 per GB per month Serve: $0.20 per GB

168 The cost (Graph: S3 cost against usage)

169 The cost (Graph: S3 cost against regular bandwidth cost)

170 End costs ~$2k to store 1TB for a year ~$63 a month for 1Mb/s ~$65k a month for 1Gb/s
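
Those figures follow straight from the unit prices two slides back (rounded, 30-day months): storing 1TB is 1,024GB × $0.15 × 12 ≈ $1,850 a year, call it $2k; a sustained 1Mb/s is about 0.125MB/s × 2.6M seconds ≈ 316GB a month, × $0.20 ≈ $63; multiply by a thousand for 1Gb/s and you get roughly $65k a month.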

171 12. Field Work

172 Real world examples Flickr –Because I know it LiveJournal –Because everyone copies it

173 Flickr Architecture

174 Flickr Architecture

175 LiveJournal Architecture

176 LiveJournal Architecture

177 Buy my book!

178 Or buy Theo's

179 The end!

180 Awesome! These slides are available online: iamcal.com/talks/

