FAWN: A Fast Array of Wimpy Nodes Presented by: Aditi Bose & Hyma Chilukuri.


1 FAWN: A Fast Array of Wimpy Nodes Presented by: Aditi Bose & Hyma Chilukuri

2 Motivation. Large-scale data-intensive applications, such as high-performance key-value storage systems, are used with increasing regularity by Facebook, LinkedIn, and Amazon. These workloads share several features: they are I/O intensive, require random access over large databases, perform parallel, concurrent, and mostly independent operations, require large clusters, and store small objects. System performance is measured in queries/sec, energy efficiency in queries/joule. CPU performance and I/O bandwidth gap: for data-intensive computing workloads, storage, network, and memory bandwidth bottlenecks lead to low CPU utilization. Solution: wimpy processors to reduce I/O-induced idle cycles. CPU power consumption: operating processors at higher frequencies requires more energy, and techniques that mask the CPU bottleneck (branch prediction, speculative execution) take more processor die area and are energy inefficient. Solution: slower CPUs execute more instructions per joule (1 billion vs. 100 million instructions per joule).

3 FAWN. A cluster of embedded CPUs using flash storage. Flash is efficient (1 W at heavy load vs. 10 W at load for a magnetic disk) and offers fast random reads (up to 175 times faster), but random writes are slow: updating a single page means erasing an entire block before writing the modified block in its place. FAWN-KeyValue (FAWN-KV): nodes are organized into a ring using consistent hashing, and each physical node is a collection of virtual nodes, each identified by a virtual ID (VID). FAWN-DS: a log-structured key-value store that contains the values for the key range associated with a VID.

4 FAWN-DS. Uses an in-memory hash index to map a 160-bit key to a value stored in the data log. The index stores only a fragment of the actual key. Hash index bucket = the i low-order index bits of the key; key fragment = the next 15 low-order bits. Each bucket is 6 bytes and stores the key fragment, a valid bit, and a 4-byte pointer into the data log.
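The index layout above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FAWN-DS implementation: the class name `FawnDSSketch`, the helper `key_of`, the default of 16 index bits, and the use of linear probing to resolve collisions are all choices made for the example, and the data log is modeled as an in-memory list rather than a file on flash.

```python
import hashlib

FRAG_BITS = 15  # key fragment width, as on the slide


def key_of(name: str) -> int:
    """Hypothetical helper: derive a 160-bit key with SHA-1."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class FawnDSSketch:
    def __init__(self, index_bits: int = 16):  # "i" on the slide; 16 is assumed
        self.index_bits = index_bits
        # Each bucket is None or (key_fragment, valid_bit, log_offset).
        self.buckets = [None] * (1 << index_bits)
        self.log = []  # append-only data log of (full_key, value)

    def _split(self, key: int):
        index = key & ((1 << self.index_bits) - 1)
        frag = (key >> self.index_bits) & ((1 << FRAG_BITS) - 1)
        return index, frag

    def _probe(self, index: int):
        # Linear probing is an assumption of this sketch.
        n = len(self.buckets)
        for step in range(n):
            yield (index + step) % n

    def store(self, key: int, value) -> None:
        index, frag = self._split(key)
        self.log.append((key, value))
        offset = len(self.log) - 1
        for slot in self._probe(index):
            b = self.buckets[slot]
            # Take an empty bucket, or repoint the bucket already used by this key.
            if b is None or (b[0] == frag and self.log[b[2]][0] == key):
                self.buckets[slot] = (frag, True, offset)
                return
        raise RuntimeError("hash index full")

    def lookup(self, key: int):
        index, frag = self._split(key)
        for slot in self._probe(index):
            b = self.buckets[slot]
            if b is None:
                return None
            b_frag, valid, offset = b
            if valid and b_frag == frag:
                # The fragment can match a different key, so the full key
                # stored in the log entry is compared before returning.
                full_key, value = self.log[offset]
                if full_key == key:
                    return value
        return None
```

Because the bucket holds only a 15-bit fragment, a fragment match is a hint rather than proof, which is why the lookup reads the log entry and compares the full key.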

5 FAWN-DS. Basic functions: Store, Lookup, Delete, and concurrent operations. Virtual node maintenance: Split, Merge, Compact.
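A rough sketch of how Store, Lookup, Delete, and Compact can work on an append-only log follows. It assumes a plain dictionary in place of the hash index and an in-memory list in place of the on-flash log. The class name `LogStore` and its methods are hypothetical, and Split, Merge, and concurrency are left out.

```python
class LogStore:
    def __init__(self):
        self.log = []    # append-only entries: (key, value, is_delete)
        self.index = {}  # key -> offset of the latest live entry (assumed dict index)

    def store(self, key, value):
        # Writes are always appended; the index is repointed to the new entry.
        self.log.append((key, value, False))
        self.index[key] = len(self.log) - 1

    def lookup(self, key):
        offset = self.index.get(key)
        return None if offset is None else self.log[offset][1]

    def delete(self, key):
        # Drop the index entry and append a delete marker to the log.
        if self.index.pop(key, None) is not None:
            self.log.append((key, None, True))

    def compact(self):
        # Rewrite the log keeping only entries the index still points to.
        live = sorted(self.index.items(), key=lambda kv: kv[1])
        new_log = []
        for key, offset in live:
            new_log.append(self.log[offset])
            self.index[key] = len(new_log) - 1
        self.log = new_log
```

Appending every change turns random writes into sequential ones, at the price of stale entries that Compact later reclaims.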

6 FAWN-KV. Organizes the back-end VIDs into a storage ring structure using consistent hashing. Management node: assigns each front-end a part of the circular key space. Front-end node: manages a fraction of the key space, manages the VID membership list, and forwards out-of-range requests. Back-end nodes (VIDs): each owns a key range and contacts a front-end when joining.
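The ring structure can be illustrated with a small consistent-hashing sketch. It assumes SHA-1 for placing VIDs and keys on the ring, two VIDs per physical node, and ownership by the first VID at or after the key's position. The names `Ring`, `ring_hash`, `join`, `leave`, and `lookup` are hypothetical, and the front-end and management roles are not modeled.

```python
import bisect
import hashlib


def ring_hash(data: str) -> int:
    """Map a string to a position on the 160-bit ring (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(data.encode()).digest(), "big")


class Ring:
    def __init__(self):
        self.vids = []   # sorted VID positions on the ring
        self.owner = {}  # VID position -> physical node name

    def join(self, node: str, num_vids: int = 2) -> None:
        # A physical node contributes several virtual nodes (VIDs).
        for v in range(num_vids):
            vid = ring_hash(f"{node}#{v}")
            bisect.insort(self.vids, vid)
            self.owner[vid] = node

    def leave(self, node: str) -> None:
        self.vids = [v for v in self.vids if self.owner[v] != node]
        self.owner = {v: n for v, n in self.owner.items() if n != node}

    def lookup(self, key: str) -> str:
        # The key belongs to the first VID at or after its hash, wrapping around.
        h = ring_hash(key)
        i = bisect.bisect_left(self.vids, h) % len(self.vids)
        return self.owner[self.vids[i]]
```

With this scheme a joining node only takes over key ranges that end at its new VIDs, so every other key keeps its previous owner.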

7 FAWN-KV: chain replication
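A minimal sketch of chain replication as the technique is commonly described: a write enters at the head of the chain, is passed through each replica in order, and is acknowledged by the tail, which also serves reads. The `Replica` and `Chain` classes are hypothetical, and failures, ring membership, and networking are not modeled.

```python
class Replica:
    def __init__(self, name: str):
        self.name = name
        self.data = {}
        self.next = None  # successor in the chain, None for the tail

    def put(self, key, value):
        # Apply locally, then forward down the chain.
        self.data[key] = value
        if self.next is not None:
            return self.next.put(key, value)
        return self.name  # the tail acknowledges the write


class Chain:
    def __init__(self, names):
        self.replicas = [Replica(n) for n in names]
        for a, b in zip(self.replicas, self.replicas[1:]):
            a.next = b

    def put(self, key, value):
        # Writes always start at the head.
        return self.replicas[0].put(key, value)

    def get(self, key):
        # Reads go to the tail, which only holds fully replicated writes.
        return self.replicas[-1].data.get(key)
```

Serving reads from the tail means a value is only visible once every replica in the chain has stored it.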

8 FAWN-KV. Join: split key range, pre-copy, chain insertion, log flush. Leave: merge key range, then join into each chain.

9 Individual Node Performance: lookup speed. Bulk store speed: 23.2 MB/s, or 96% of raw speed.

10 Individual Node Performance: put speed. Compared to BerkeleyDB, which reached 0.07 MB/s; this shows the necessity of log-based filesystems.

11 Individual Node Performance: read- and write-intensive workloads

12 System Benchmarks: system throughput and power consumption

13 Impact of Ring Membership Changes: query throughput during node join and maintenance operations

14 Alternative Architectures. Large dataset, low query rate → FAWN+Disk: the number of nodes is dominated by the storage capacity per node; lowest total cost per GB. Small dataset, high query rate → FAWN+DRAM: the number of nodes is dominated by the per-node query capacity; lowest cost per queries/sec. Middle range → FAWN+SSD: best balance of storage capacity, query rate, and total cost.

15 Conclusion. Fast and energy-efficient processing of random read-intensive workloads. Over an order of magnitude more queries per joule than traditional disk-based systems.

