RAMCloud Overview and Status John Ousterhout Stanford University.


1 RAMCloud Overview and Status John Ousterhout Stanford University

2 DRAM in Storage Systems
June 3, 2011, RAMCloud Overview & Status
[Timeline figure, 1970 / 1980 / 1990 / 2000 / 2010]
● UNIX buffer cache
● Main-memory databases
● Large file caches
● Web indexes entirely in DRAM
● memcached
● Facebook: 200 TB total data, 150 TB cache!
● Main-memory DBs, again

3 DRAM in Storage Systems
● DRAM usage limited/specialized
● Clumsy (must manage consistency with backing store)
● Lost performance (cache misses, slow writes)
[Timeline figure repeated from slide 2]

4 RAMCloud
Harness full performance potential of large-scale DRAM storage:
● General-purpose storage system
● All data always in DRAM (no cache misses)
● Durable and available (no backing store)
● Scale: 1000+ servers, 100+ TB
● Low latency: 5-10µs remote access
Potential impact: enable new class of applications

5 RAMCloud Architecture
May 27, 2011, RAMCloud: GSRC Mid-Year Review
[Architecture figure: application servers (each labeled "Appl." and "Library") connect through the Datacenter Network to storage servers (each labeled "Master" and "Backup") and to a Coordinator]
● 1000 – 100,000 Application Servers
● 1000 – 10,000 Storage Servers
● 32-64GB/server
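The server counts and per-server DRAM on the architecture slide imply a range of total capacities. The short calculation below is only a back-of-the-envelope sketch: the function names are hypothetical, it assumes decimal units (1 TB = 1000 GB), and it ignores any memory lost to overheads.

```python
import math


def cluster_capacity_tb(servers, gb_per_server):
    """Aggregate DRAM in TB, assuming 1 TB = 1000 GB and no overhead."""
    return servers * gb_per_server / 1000


def servers_needed(target_tb, gb_per_server):
    """Smallest server count whose combined DRAM reaches target_tb."""
    return math.ceil(target_tb * 1000 / gb_per_server)


# Extremes of the ranges given on the slide.
smallest = cluster_capacity_tb(1000, 32)     # 1000 servers x 32 GB
largest = cluster_capacity_tb(10_000, 64)    # 10,000 servers x 64 GB

# Servers required for the "100+ TB" scale target from slide 4.
at_64gb = servers_needed(100, 64)
at_32gb = servers_needed(100, 32)

print(smallest, largest, at_64gb, at_32gb)
```

Under these assumptions the stated ranges span 32 TB to 640 TB, and the 100 TB target needs between roughly 1,600 and 3,100 servers depending on per-server memory.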

6 Data Model
● Tables contain objects
● Object: Identifier (64b), Version (64b), Blob (≤1MB)
Operations:
● create(tableId, blob) => objectId, version
● read(tableId, objectId) => blob, version
● write(tableId, objectId, blob) => version
● cwrite(tableId, objectId, blob, version) => version (Only overwrite if version matches)
● delete(tableId, objectId)
Richer model in the future: Indexes? Transactions? Graphs?
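The five operations on the data-model slide can be illustrated with a toy, single-process model. This is a sketch for intuition only, not RAMCloud's client library: the class `ToyStore`, the exception `VersionMismatch`, the choice to start versions at 1 and increment on every write, and the sequential object IDs are all assumptions made for the example.

```python
class VersionMismatch(Exception):
    """Raised by cwrite when the stored version is not the expected one."""


class ToyStore:
    MAX_BLOB = 1 << 20  # the slide limits blobs to 1 MB

    def __init__(self):
        self._tables = {}   # tableId -> {objectId: (blob, version)}
        self._next_id = {}  # tableId -> next objectId to hand out

    def _check(self, blob):
        if len(blob) > self.MAX_BLOB:
            raise ValueError("blob larger than 1 MB")

    def create(self, table_id, blob):
        """Store a new object; returns (objectId, version)."""
        self._check(blob)
        object_id = self._next_id.get(table_id, 0)
        self._next_id[table_id] = object_id + 1
        self._tables.setdefault(table_id, {})[object_id] = (blob, 1)
        return object_id, 1

    def read(self, table_id, object_id):
        """Returns (blob, version); raises KeyError if the object is absent."""
        return self._tables[table_id][object_id]

    def write(self, table_id, object_id, blob):
        """Unconditional overwrite; returns the new version."""
        self._check(blob)
        table = self._tables.setdefault(table_id, {})
        _, old_version = table.get(object_id, (None, 0))
        table[object_id] = (blob, old_version + 1)
        return old_version + 1

    def cwrite(self, table_id, object_id, blob, version):
        """Overwrite only if the stored version equals `version`."""
        _, current = self._tables[table_id][object_id]
        if current != version:
            raise VersionMismatch(f"expected {version}, found {current}")
        return self.write(table_id, object_id, blob)

    def delete(self, table_id, object_id):
        del self._tables[table_id][object_id]
```

A client can use `read` followed by `cwrite` as an optimistic update: if another writer changed the object in between, the version no longer matches and the conditional write is rejected instead of silently overwriting the newer data.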

7 RPC Transport Architecture
Transport API: Reliable request/response
● Clients: getSession(serviceLocator), clientSend(reqBuf, respBuf), wait()
● Servers: handleRpc(reqBuf, respBuf)
Transports:
● TcpTransport: Kernel TCP/IP
● InfRcTransport: Infiniband verbs, reliable queue pairs, kernel bypass, Mellanox NICs
● FastTransport: layered on the Driver API
Driver API: Unreliable datagrams
● UdpDriver: Kernel UDP
● InfUdDriver: Infiniband unreliable datagrams
● InfEthDriver: 10GigE packets via Mellanox NIC
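The split between a reliable request/response transport and an unreliable datagram driver can be sketched as follows. This is not FastTransport's protocol and not RAMCloud code: `LossyLoopbackDriver`, `RetryTransport`, the synchronous `client_send`, the retry limit, and the reply cache are all hypothetical, chosen only to show how retransmission and duplicate suppression can turn lossy datagrams into exactly-once RPC handling.

```python
import random


class LossyLoopbackDriver:
    """Hypothetical driver: carries one datagram each way and may drop either."""

    def __init__(self, drop_rate, seed=0):
        self._rng = random.Random(seed)
        self._drop_rate = drop_rate
        self._receiver = None

    def set_receiver(self, receiver):
        self._receiver = receiver

    def _lost(self):
        return self._rng.random() < self._drop_rate

    def send_packet(self, packet):
        """Return the reply packet, or None if the request or reply was lost."""
        if self._lost():
            return None               # request never reached the server
        reply = self._receiver(packet)
        if self._lost():
            return None               # server ran, but the reply was lost
        return reply


class RetryTransport:
    """Reliable request/response built on an unreliable driver (a sketch)."""

    def __init__(self, driver, handle_rpc, max_attempts=50):
        self._driver = driver
        self._handle_rpc = handle_rpc
        self._max_attempts = max_attempts
        self._replies = {}            # rpc_id -> cached response
        self._next_rpc_id = 0
        driver.set_receiver(self._on_packet)

    def _on_packet(self, packet):
        # Server side: run the handler once per RPC id; answer retries
        # from the cache so a retransmitted request is not re-executed.
        rpc_id, request = packet
        if rpc_id not in self._replies:
            self._replies[rpc_id] = self._handle_rpc(request)
        return rpc_id, self._replies[rpc_id]

    def client_send(self, request):
        # Client side: retransmit until a reply arrives or attempts run out.
        rpc_id = self._next_rpc_id
        self._next_rpc_id += 1
        for _ in range(self._max_attempts):
            reply = self._driver.send_packet((rpc_id, request))
            if reply is not None:
                return reply[1]
        raise TimeoutError(f"no reply for rpc {rpc_id}")
```

Keeping loss handling in the transport means a driver only has to move packets, which is one plausible reason a single transport can sit on top of several drivers (kernel UDP, Infiniband datagrams, raw Ethernet).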

8 Progress over the Last Year
● Implemented skeletal system
  ○ Fast RPC
  ○ Log-structured data management
  ○ Simple servers
  ○ But, not yet complete enough for production use
● Installed 40-node cluster
  ○ Mellanox Infiniband (32 Gb/sec, NICs bypass kernel)
  ○ 10G Ethernet (Arista switch)
● Demonstrated fast recovery
  ○ Why? Only one copy of data in DRAM
  ○ Goal: recover 64GB from a failed server in 1-2 seconds
  ○ Basic recovery mechanism works, seems to scale
  ○ Submitted paper to SOSP

9 Implementation Status
[Figure: components placed along a maturity scale; the positions were lost in extraction]
Scale labels: Throw-away first version; A few ideas; Mature; First real implementation
Components:
● RPC Architecture
● Recovery: Masters
● Master Server
● Threading
● Cluster Coordinator
● Log Cleaning
● Backup Server
● Higher-level Data Model
● Recovery: Backups
● Performance Tools
● RPC Transports
● Failure Detection
● Multi-object Transactions
● Multi-Tenancy
● Access Control/Security
● Split/move Tablets
● Tablet Placement
● Administration Tools
● Recovery: Coordinator
● Cold Start
● Tub
Dissertation-ready (ask Diego)

10 RAMCloud Code Size
Code          36,900 lines
Unit tests    16,500 lines
Total         53,400 lines

11 Selected Performance Metrics
● Latency for 100-byte reads (1 switch):
  InfRc                        4.9 µs
  TCP (1GigE)                  92 µs
  TCP (Infiniband)             47 µs
  Fast + UDP (1GigE)           91 µs
  Fast + UDP (Infiniband)      44 µs
  Fast + InfUd                 4.9 µs
● Server throughput (InfRc, 100-byte reads, one core): 1.05 × 10^6 requests/sec
● Recovery time (6.6GB data, 11 recovery masters, 66 backups): 1.15 sec
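The recovery and throughput figures above can be turned into per-node rates with simple division. The sketch below uses only the numbers on the slide; the even split of data across recovery masters and backups, and the linear extrapolation to the 64GB goal from slide 8, are assumptions of the example rather than measured results.

```python
import math

DATA_GB = 6.6            # data recovered in the measured run
RECOVERY_SEC = 1.15      # measured recovery time
RECOVERY_MASTERS = 11
BACKUPS = 66
REQUESTS_PER_SEC = 1.05e6

# Aggregate and per-node recovery rates, assuming data is spread evenly.
aggregate_gb_per_sec = DATA_GB / RECOVERY_SEC
gb_per_master = DATA_GB / RECOVERY_MASTERS
master_gb_per_sec = gb_per_master / RECOVERY_SEC
backup_mb_per_sec = DATA_GB / BACKUPS / RECOVERY_SEC * 1000

# Average service time per request on one core.
core_usec_per_request = 1e6 / REQUESTS_PER_SEC

# Assumption: if each recovery master keeps handling the same share
# in the same time, recovering 64 GB needs proportionally more masters.
masters_for_64gb = math.ceil(64 / gb_per_master)

print(f"aggregate: {aggregate_gb_per_sec:.2f} GB/s")
print(f"per recovery master: {master_gb_per_sec:.2f} GB/s")
print(f"per backup: {backup_mb_per_sec:.0f} MB/s")
print(f"per request: {core_usec_per_request:.2f} µs of one core")
print(f"recovery masters for 64 GB: {masters_for_64gb}")
```

Read this way, each recovery master handled about 0.6 GB in the measured run, so meeting the 64GB goal in a similar time would take on the order of a hundred recovery masters if scaling stays linear.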

12 Lessons/Conclusions (so far)
● Fast RPC is within reach
● NIC is biggest long-term bottleneck: must integrate with CPU
● Can recover fast enough that replication isn’t needed for availability
● Randomized approaches are key to scalable distributed decision-making

13 Plans for the Next Year
● Get experience with applications
  ○ Joint project at Facebook over summer
  ○ Finish “least usable system”
● Pick next research question(s) to address
  ○ What is the right transport protocol for the datacenter?
  ○ Cluster management?
  ○ Higher-level operations?

14 Upcoming RAMCloud Talks
● Performance measurements: Nandu Jayakumar
● Fast recovery: Ryan Stutsman, Diego Ongaro
● Simulating larger RAMCloud clusters: Asaf Cidon
● RAMCloud’s transports: Diego Ongaro
● Multi-read operations: Ankita Kejriwal
● Tablet profiling: Steve Rumble
● Low-level latency measurements: Mario Flajslik

