Efficient Data Dissemination and Survivable Data Storage Lihao Xu


1 Efficient Data Dissemination and Survivable Data Storage Lihao Xu http://www.cs.wayne.edu/~lihao/

2 Ubiquitous Information Access

3 Key Building Blocks Storage Retrieval Dissemination Consumption

4 Key Building Blocks

5 Error Correcting Codes

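6 Message: symbols 1, 2, 3, …, k

7 Error Correcting Codes: Message (symbols 1, 2, 3, …, k) → Codeword (symbols 1, 2, 3, …, n - 1, n)

8 Error Correcting Codes: Message (symbols 1, 2, 3, …, k) → Codeword (symbols 1, 2, 3, …, n - 1, n) → m codeword symbols → Message (symbols 1, 2, 3, …, k)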

9 MDS (Maximum Distance Separable) Codes: m = k

10 (n,k) MDS Codes: Reed-Solomon (RS) Code

11 (n,k) MDS Codes: (4,2) B-Code
Column 1: a, d+c
Column 2: b, d+a
Column 3: c, a+b
Column 4: d, b+c
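The MDS property of the (4,2) B-Code layout can be checked directly. The following is a minimal Python sketch, assuming that "+" on the slide means bitwise XOR and that each of the four columns stores one data symbol plus one parity symbol, paired in the order the slide lists them. The names `PARITY_OF`, `encode`, and `decode` are illustrative and do not come from the presentation.

```python
from itertools import combinations

# Column i holds data symbol i plus the XOR of the two symbols listed here
# (indices 0..3 stand for a, b, c, d). Assumed from the slide's layout.
PARITY_OF = {0: (3, 2), 1: (3, 0), 2: (0, 1), 3: (1, 2)}

def encode(a, b, c, d):
    """Return the four (data, parity) columns of the (4,2) B-Code."""
    data = (a, b, c, d)
    return [(data[i], data[x] ^ data[y]) for i, (x, y) in PARITY_OF.items()]

def decode(survivors):
    """Recover (a, b, c, d) from surviving columns.

    survivors maps a column index (0..3) to its (data, parity) pair.
    """
    known = {i: data for i, (data, _) in survivors.items()}
    progress = True
    while len(known) < 4 and progress:
        progress = False
        for i, (_, parity) in survivors.items():
            x, y = PARITY_OF[i]
            # A parity symbol yields one operand once the other is known.
            if x in known and y not in known:
                known[y] = parity ^ known[x]
                progress = True
            elif y in known and x not in known:
                known[x] = parity ^ known[y]
                progress = True
    if len(known) < 4:
        raise ValueError("need at least two distinct columns")
    return tuple(known[i] for i in range(4))

# Any two of the four columns are enough to rebuild the message.
message = (0x12, 0x34, 0x56, 0x78)
columns = encode(*message)
for i, j in combinations(range(4), 2):
    assert decode({i: columns[i], j: columns[j]}) == message
```

Encoding and decoding use only XOR, with no finite-field multiplication of the kind Reed-Solomon codes need.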

12 Data Dissemination: Broadcast Scheduling

13 Data Dissemination [figure: a Wireless Server and Wireless Clients requesting items: want 1, want 2, want 1, want 3]

14 Broadcast in a Cell [figure: a Wireless Server broadcasting to Wireless Clients: want 1, want 2, want 1, want 3]

15 Broadcast Model: Model clients as random processes. Desired item is random with probability p_i for item i of length l_i. [figure: a Wireless Server and Wireless Clients: want 1, want 2, want 1, want 3]

16 Scheduling Problem: S = 2 items, l_1 = l_2. Each item consists of k packets, k large. Challenge: choose packet broadcast schedule to minimize wait for clients. [figure: schedule 1 2 1 2]

17 Prior Work: Complexity of optimal schedules (Bar-Noy, Bhatia, Naor, Schieber, Foltz); Complexity of computing optimal schedules (Kenyon, Schabanel); Error correction/detection (Bestavros)

18 Metric: Delivery Time. Delivery Time for item 1, S = 1212

19 Delivery Time: Total amount of time spent waiting for item i when starting at a given time in schedule S. The start time is the instant at which the client starts waiting for the item. S = 1212

20 Expected Delivery Time (EDT): expectation of the delivery time, with the start time uniformly distributed over schedule S.

21 EDT Calculation: S = 1212, P_1 = P_2 = 1/2

22 EDT Calculation: S = 1212, P_1 = P_2 = 1/2; DT: 2

23 EDT Calculation: S = 1212, P_1 = P_2 = 1/2; DT: 2, 3/2

24 EDT Calculation: S = 1212, P_1 = P_2 = 1/2; DT: 2, 3/2; DT_1 = 7/4

25 EDT Calculation: S = 1212, P_1 = P_2 = 1/2; DT: 2, 3/2; DT_1 = 7/4; EDT = 7/4
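The 7/4 figure can be reproduced under one reading of the slides, sketched here in Python. The assumptions are not spelled out in the transcript. Each schedule slot carries one whole item and has unit length. A client may collect an item's packets in any order but needs all of them. A client that tunes in partway through the item it wants keeps the tail and waits for the next broadcast of that item to pick up the head. The function name `expected_delivery_time` is illustrative.

```python
from fractions import Fraction

def expected_delivery_time(schedule, demand):
    """EDT of a cyclic single-channel schedule of unit-length item slots.

    schedule: list of item ids, repeated forever.
    demand:   dict mapping item id -> request probability.
    Assumes every requested item appears in the schedule.
    """
    period = len(schedule)
    edt = Fraction(0)
    for item, prob in demand.items():
        total = Fraction(0)
        for s in range(period):
            # Whole slots until the wanted item next starts (cyclic search).
            gap = next(g for g in range(1, period + 1)
                       if schedule[(s + g) % period] == item)
            if schedule[s] == item:
                # Tail now, missing head at the next occurrence: DT = gap.
                total += gap
            else:
                # Wait (gap - t), then receive 1 unit; t averages 1/2.
                total += gap + Fraction(1, 2)
        edt += prob * total / period
    return edt

half = Fraction(1, 2)
print(expected_delivery_time([1, 2], {1: half, 2: half}))  # 7/4
```

For S = 12 this gives a delivery time of 2 for a client that starts inside its own item and 3/2 on average for one that starts inside the other item. That matches the slide's values of 2 and 3/2, and their mean of 7/4.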

26 Performance with Errors: Data items consist of k packets. What happens if a packet is lost? [figure: original, transmitted, and received packet sequences 1 2 3 4 5 ... k; one packet is missing from the received sequence]

27 Performance with Errors: What happens if a packet is lost? [figure: as on slide 26, with the transmission repeated]

28 Performance with Errors: What happens if a packet is lost? [figure: as on slide 27] EDT = 3!
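The jump to EDT = 3 is consistent with a simple loss model, sketched below in Python as a numerical check. The transcript does not define the loss model, so the following are assumptions: the schedule is S = 12 with unit-length items, exactly one packet of the wanted item is lost on first reception, its position within the item is uniform, and the client has to wait one full period (2 time units) for that packet to come around again. Packet duration is neglected because k is large. The function name `edt_one_loss` is illustrative.

```python
def edt_one_loss(grid=400):
    """Midpoint-rule estimate of EDT for S = 12 with one lost packet.

    t is the client's start offset inside the current slot and u is the
    lost packet's offset inside the wanted item (both in [0, 1)).
    """
    total = 0.0
    for a in range(grid):
        t = (a + 0.5) / grid
        for b in range(grid):
            u = (b + 0.5) / grid
            # Client starts inside the item it wants: packet u is first
            # heard either later in this slot or in the next cycle.
            first_heard = u - t if u >= t else u - t + 2.0
            total += first_heard + 2.0
            # Client starts inside the other item: wait for the slot to
            # end, hear packet u, then wait a full period for the retry.
            total += (1.0 - t) + u + 2.0
    # Both start cases are equally likely; by symmetry the result is the
    # same for either item.
    return total / (2 * grid * grid)

print(round(edt_one_loss(), 2))
```

The estimate approaches 3 as the grid is refined, against 7/4 for the loss-free case.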

29 Solution – Coding: Use k of n MDS code, n = 2k. Now only need to wait for 1 additional packet. [figure: original, transmitted, and received packet sequences; one packet is missing from the received sequence]

30 Solution – Coding: EDT = 9/4 [figure: as on slide 29]

31 Solution – Coding: Use k of n MDS code, m = 2(k+1). Now only need to wait for 1 additional packet. [figure: transmitted sequence 1 2 3 4 5 ... k n, repeated; one packet is missing from the received sequence]

32 Solution – Coding: EDT = 7/4 + ε [figure: as on slide 31]

33 General Solution: Given loss probability p, what is the optimal n? [figure: as on slide 31]
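One ingredient of this question can be sketched in Python: the probability that a client can decode after hearing a single pass of n coded packets. With a k-of-n MDS code, decoding succeeds when at least k packets arrive. The sketch assumes independent packet losses with probability p, which the transcript does not state. It does not reproduce the talk's EDT-optimal n, which also has to account for the longer broadcast cycle that a larger n implies. The names `decode_probability` and `smallest_n` are illustrative.

```python
from math import comb

def decode_probability(n, k, p):
    """P(at least k of n packets arrive), each lost independently w.p. p."""
    return sum(comb(n, r) * (1 - p) ** r * p ** (n - r)
               for r in range(k, n + 1))

def smallest_n(k, p, target):
    """Smallest n whose single-pass decode probability reaches target."""
    n = k
    while decode_probability(n, k, p) < target:
        n += 1
    return n

k, p = 100, 0.1
for n in (100, 110, 120, 130):
    print(n, decode_probability(n, k, p))
print("smallest n for 99% single-pass decoding:", smallest_n(k, p, 0.99))
```

With k = 100 and p = 0.1, sending only the k original packets almost never delivers a complete item in one pass, so some redundancy is needed. The best amount of redundancy is the trade-off the slide asks about.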

34 General Solution

35

36

37 k = 100 and p = 0.1

38 General Solution k = 100

39 Two-Channel Broadcasting [figure: two Wireless Servers broadcasting to Wireless Clients: want 1, want 2, want 1, want 3]

40 Coordinating Schedule Data: Use (2k, k) MDS code to eliminate data overlap: Channel 1 sends packets 1 through k (raw data); Channel 2 sends packets k+1 through 2k. Features: Each channel is self-sufficient; No overlap between channels. S_1 = 12 12, S_2 = 12 12 (same schedule, different data)

41 Two Broadcast Channels: Scheduling for two channels: Two items with equal length and demand; Two synchronized channels of equal bandwidth; First channel’s schedule fixed at 12. What is the optimal schedule for channel 2? S_1 = 12, S_2 = ?

42 Some Schedules [figure: candidate schedules for channel 2 alongside channel 1's schedule 12: Repeat, Swap, Shift, Reshuffle, Unequal Portions, Arbitrary]

43 Some Schedules [figure: as on slide 42] EDT = 1

44 Some Schedules [figure: as on slide 42] EDT = 1; EDT = 63/64; EDT < 63/64?
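A two-channel EDT can be estimated with a small discretized simulation, sketched below in Python. It assumes the (2k, k) MDS setup from slide 40: each channel carries its own k coded packets per item, always in the same order, and a client listening to both synchronized channels is done once it holds k distinct packets. Each unit-length item slot is split into `sub` equal sub-slots, so the result carries a small discretization error. The function name `two_channel_edt` and the parameter `sub` are illustrative.

```python
def two_channel_edt(channels, demand, sub=100):
    """Estimate EDT for synchronized channels of unit-length item slots.

    channels: list of schedules (lists of item ids), all the same length.
    demand:   dict mapping item id -> request probability.
    Assumes every requested item is broadcast on at least one channel.
    """
    period = len(channels[0]) * sub
    edt = 0.0
    for item, prob in demand.items():
        total_steps = 0
        for start in range(period):
            seen = set()   # distinct (channel, offset) pieces collected
            steps = 0
            while len(seen) < sub:   # sub pieces correspond to k packets
                slot, offset = divmod((start + steps) % period, sub)
                for c, sched in enumerate(channels):
                    if sched[slot] == item:
                        seen.add((c, offset))
                steps += 1
            total_steps += steps
        # Mean steps over all start points, converted to item lengths.
        edt += prob * total_steps / (period * sub)
    return edt

demand = {1: 0.5, 2: 0.5}
print(two_channel_edt([[1, 2], [1, 2]], demand))  # channel 2 repeats 12
print(two_channel_edt([[1, 2], [2, 1]], demand))  # channel 2 swaps to 21
```

Under these assumptions both the repeated schedule and the swapped schedule come out at about 1. This agrees with the EDT = 1 value on the slide, although the transcript does not say which schedules that value belongs to.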

45 Schedule Performance: Symmetric Problem: Equal lengths; Equal demands; Equal bandwidth channels; Symmetric “fixed” schedule for 1st channel. Asymmetric Solution: Asymmetric schedules can beat any symmetric schedule for the 2nd channel. How is this possible?

46 More to Explore …: More servers/Channels; Differing levels of synchronization; Transmission Errors; Streaming Data; Bounds [figure: several Wireless Servers broadcasting to Wireless Clients: want 1, want 2, want 1, want 3]

47 Hydra: A Platform for SSS

48 Secure and Survivable Storage: Availability, Recoverability, Persistence, Confidentiality, Integrity, Scalability, Efficiency

49 Secure and Survivable Storage: Yahoo, Ebay, Amazon, Google, Banks, Your Labs, More …

50 Hydra

51 Hydra Design Goals: Portable to various OS/FS; Hardware independent; Unix FS semantics maintained; Low overhead in performance and storage; Transport independent; Easy to install, configure, scale, maintain and automate

52 Hydra and System [figure: layer stack App., Hydra, FS, I/O]

53 Hydra and System [figure: two layer stacks, each App., Hydra, FS, I/O]

54 Hydra and System [figure: three layer stacks: App., Hydra, FS, I/O; App., Hydra, FS, I/O; App., FS/Hydra, I/O]

55 Basics of Hydra: (4,2) B-Code
Column 1: a, d+c
Column 2: b, d+a
Column 3: c, a+b
Column 4: d, b+c

56 Performance Test
Platform: 2.4G P4, 512 MB, 80GB ATA/100 7200rpm, Redhat 9.0 (kernel 2.4.2.0)
Operation                Throughput (Mbps)
File Read                384
File Write               200
Memory Copy              17572
(4,2) B-Code Encoding    5522
(4,2) B-Code Decoding    22866
(4,2) RS Encoding        286
(4,2) RS Decoding        216

57 Hydra Components: Meta Data (hnode), Operations, Monitor

58 Hydra Meta Data: Code, Symbol Location, Data Layout, Security Flag, Access Rights, Extensions

59 Hydra Operations: Distribute (Write), Recover (Read), Detect, Repair, Restore, Others

60 Hydra Monitor: Connectivity, Security

61 Hydra Applications: Web Server; CDN/P2P/Data Server; Archiving; Data Security (system activity logger, forensic, file integrity checker, …); Others

62 Acknowledgement

63 lihao@cs.wayne.edu

