Malugo – a scalable peer-to-peer storage system

1 Malugo – a scalable peer-to-peer storage system.

2 What is peer-to-peer?
Oxford definition of "peer": a person who equals another in natural gifts, ability, or achievements; the equal in any respect of a person or thing.
“Peer-to-peer” is a way of structuring distributed computers so that individual nodes have symmetric roles.
Unlike the conventional client-server architecture, in which clients and servers have distinct roles, nodes in a peer-to-peer network usually act as both server and client.
Figures: a server-based architecture; a peer-to-peer network.
2016/2/29 System Software Lab, National Tsing Hua University

3 What is peer-to-peer (cont.)
Multiple peers
– Each peer is small
– The number of peers is large
– Every peer owns some resources in the network and provides them to other participants
Characteristics
– Autonomy, self-control, ad-hoc participation
– Dynamics (peers join or leave freely)
– High reliability and safety
– Reliance on the underlying network infrastructure

4 What is peer-to-peer (cont.)
Peer-to-peer applications
– File sharing: eDonkey (eMule, ed2k), BitTorrent, Gnutella, etc.
– Multimedia broadcasting: PPStream, SopCast, etc.
– Instant messaging: Windows Live Messenger, Yahoo Messenger, ICQ, etc.
– VoIP: Skype

5 Why peer-to-peer?
Demand for more computational power, larger storage capacity, and higher transmission bandwidth.
Aggregating resources:
– Sharing cost, information, and resources.
Better safety and robustness.
Autonomy.

6 Structured and unstructured P2P networks
Unstructured P2P networks:
– Links are established arbitrarily, so the overlay can easily adapt when a new peer wants to join.
– The main disadvantage is that when a peer requests a piece of data, the query has to be flooded through the network, which introduces a large traffic overhead. Because there is no correlation between nodes and the content they manage, flooding is not guaranteed to find a peer that has the desired data.
Structured P2P networks:
– Employ a globally consistent protocol to ensure that any node can efficiently route a search to the node that has the desired data. This guarantee relies on a more structured pattern of overlay links. The best-known example is the distributed hash table (DHT).
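The flooding drawback can be illustrated with a minimal sketch. It assumes a hop limit (`ttl`), which the slides do not mention, and uses a hypothetical four-peer chain; `flood_search` and the overlay layout are illustrative names, not part of any real protocol implementation.

```python
from typing import Dict, List, Set, Tuple


def flood_search(
    neighbors: Dict[str, List[str]],
    holders: Set[str],
    start: str,
    ttl: int,
) -> Tuple[bool, int]:
    """Flood a query from `start` for at most `ttl` hops.

    Returns (found, messages_sent).
    """
    if start in holders:
        return True, 0
    visited = {start}
    frontier = [start]
    messages = 0
    for _ in range(ttl):
        next_frontier = []
        for node in frontier:
            for peer in neighbors[node]:
                messages += 1  # every forwarded copy of the query costs traffic
                if peer in visited:
                    continue  # duplicate delivery, dropped by the receiver
                visited.add(peer)
                if peer in holders:
                    return True, messages
                next_frontier.append(peer)
        frontier = next_frontier
    return False, messages


# Hypothetical chain a - b - c - d; only d holds the requested data.
overlay = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(flood_search(overlay, {"d"}, "a", ttl=2))  # (False, 3)
print(flood_search(overlay, {"d"}, "a", ttl=3))  # (True, 5)
```

With the smaller hop limit the query dies out before reaching the holder, even though the data exists in the network, and every extra hop adds messages.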

7 A DHT-based P2P network
A DHT provides an efficient lookup service similar to a hash table.
A DHT is based on an abstract keyspace. A keyspace partitioning scheme splits ownership of this keyspace among the participating nodes.
Properties:
– Decentralization
– Scalability
– Fault tolerance
– Load balance
Example protocols:
– Chord, CAN, Pastry, etc.

8 Chord protocol
Each node's key is generated from its name with a consistent hash function (e.g. SHA-1).
Nodes are sorted by node key and arranged in a circle.
Each node has a successor, the next node on the identifier circle, and a predecessor, the previous node on the identifier circle.
Each node is responsible for the keyspace from its predecessor's key + 1 up to its own key.
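The ownership rule on this slide can be sketched as follows. This is a simplified, centralized model for illustration only: the 16-bit identifier width (`ID_BITS`) and the names `node_key` and `Ring` are assumptions, and a real Chord node only knows a few other nodes rather than the whole sorted ring.

```python
import hashlib
from bisect import bisect_left
from typing import List

ID_BITS = 16  # assumed identifier width for the demo; SHA-1 itself yields 160 bits


def node_key(name: str, bits: int = ID_BITS) -> int:
    """Hash a node name with SHA-1 and reduce it to a `bits`-bit identifier."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Ring:
    """Sorted identifier circle (global view, for illustration only)."""

    def __init__(self, keys: List[int]) -> None:
        self.keys = sorted(set(keys))

    def successor(self, node: int) -> int:
        """Next node on the circle after `node`."""
        return self.keys[(self.keys.index(node) + 1) % len(self.keys)]

    def predecessor(self, node: int) -> int:
        """Previous node on the circle before `node`."""
        return self.keys[self.keys.index(node) - 1]

    def owner(self, key: int) -> int:
        """Node responsible for `key`: the first node key >= key, wrapping around."""
        i = bisect_left(self.keys, key)
        return self.keys[i % len(self.keys)]


ring = Ring([10, 60, 200])
print(ring.owner(11))   # 60: keys 11..60 belong to node 60
print(ring.owner(201))  # 10: the range wraps around the circle
```

Node 60 owns every key from `predecessor(60) + 1 = 11` up to 60, which matches the rule stated above.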

9 Chord protocol (cont.) Finger table

10 Chord protocol
Chord can locate a file in O(log N) lookup steps.
Other benefits:
– Load balance
– Dynamics
– Distributed indices
– Large-scale combinational search
These properties make Chord suitable for a data storage system.
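A rough sketch of how finger tables shorten lookups, under simplifying assumptions: a static 6-bit ring with hypothetical node keys, finger tables computed from a global node list, and no joins or failures. The function names are illustrative and not taken from the slides.

```python
from bisect import bisect_left
from typing import List, Tuple

BITS = 6          # assumed identifier width for the demo
SIZE = 1 << BITS  # number of identifiers on the circle


def successor(keys: List[int], ident: int) -> int:
    """First node key at or after `ident`, wrapping around (keys must be sorted)."""
    i = bisect_left(keys, ident % SIZE)
    return keys[i % len(keys)]


def in_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b


def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b


def finger_table(keys: List[int], node: int) -> List[int]:
    """Finger i points at the successor of node + 2**i."""
    return [successor(keys, node + (1 << i)) for i in range(BITS)]


def lookup(keys: List[int], start: int, key: int) -> Tuple[int, int]:
    """Return (owner of key, number of times the query was forwarded)."""
    node, hops = start, 0
    while True:
        succ = successor(keys, node + 1)
        if in_half_open(key, node, succ):
            return succ, hops
        # Forward to the farthest finger that still precedes the key.
        for finger in reversed(finger_table(keys, node)):
            if in_open(finger, node, key):
                node, hops = finger, hops + 1
                break
        else:
            return succ, hops  # defensive fallback; not reached on a consistent ring


nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
print(lookup(nodes, 8, 54))  # (56, 2): forwarded 8 -> 42 -> 51, and 51's successor 56 owns key 54
```

Each forward jumps to the farthest known node that does not overshoot the key, which is what keeps the number of steps logarithmic in the number of nodes on typical rings.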

11 Malugo

12 Agenda: Introduction; Malugo Architecture; Overlay operations; File operations; Performance evaluation; Conclusion

13 Agenda: Introduction; Malugo Architecture; Overlay operations; File operations; Performance evaluation; Conclusion

14 We have servers around the world.
We have users around the world.
These users want to download files from, and upload files to, the servers.
How should these servers be organized to provide the service?

15 An introduction
Chord is suitable for storage systems.
Malugo is based on the Chord protocol, with some improvements:
– Locality
– Replication
– Additional load-balancing mechanisms

16 Basic idea
With multiple copies in different regions, we can achieve both availability and reliability.

17 Agenda: Introduction; Malugo Architecture; Overlay operations; File operations; Performance evaluation; Conclusion

18 Malugo Architecture
Overlay Modules
– Inter-Overlay Module
– Intra-Overlay Module
File Management Module

19 Agenda: Introduction; Malugo Architecture; Overlay operations; File operations; Performance evaluation; Conclusion

20 Malugo overlay network
Malugo achieves locality by clustering nearby nodes into groups, using a two-tier hierarchical architecture.
– The upper layer, the inter-group overlay, forms an unstructured peer-to-peer network to achieve geographic locality. A newly arriving storage peer or client first locates the proper (closest) group and then joins it (storage peer) or connects to it (client).
– The lower layer, the intra-group overlay, is based on the Chord protocol. It organizes the peers in a local area into a sub-network that provides the storage service together.
– The design also deals with the problems of the super-peer architecture.
Figure: an example of a Malugo overlay network.

21 Locating the proper group

22 Node join
A newly arriving node N first locates the proper group.
If the proper group is not close enough, N forms a new group.
– N then copies data from the neighbor groups.
Otherwise, N joins the proper group using the Chord protocol.
– N then copies data from the previous node.
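The join decision can be sketched as a small function. The slides do not say how closeness is measured, so the distance values (for example, round-trip times), the `grouping_bound` threshold, and the name `plan_join` are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple


def plan_join(
    distance_to_group: Dict[str, float],
    grouping_bound: float,
) -> Tuple[str, Optional[str]]:
    """Decide how a new node joins.

    Returns (action, closest_group). `distance_to_group` maps each known
    group to an assumed closeness measure; smaller means closer.
    """
    if not distance_to_group:
        return "form_new_group", None  # no group exists yet
    closest = min(distance_to_group, key=lambda g: distance_to_group[g])
    if distance_to_group[closest] > grouping_bound:
        # Not close enough: start a new group, then copy data from neighbor groups.
        return "form_new_group", closest
    # Close enough: join via Chord, then copy data from the previous node.
    return "join_group", closest


print(plan_join({"A": 12.0, "B": 80.0}, grouping_bound=30.0))  # ('join_group', 'A')
print(plan_join({"A": 45.0, "B": 80.0}, grouping_bound=30.0))  # ('form_new_group', 'A')
```

In the second call the closest group is still returned, since a node that forms a new group needs a neighbor to copy data from.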

23 Node leave or failure
Recover the local overlay and determine the ID section that needs to be recovered.
Check the closest neighbor root and get the information of node 60 of Group B.
Ask the root of Group B to notify its child nodes to replicate data back to Group A.
Replicate the data back to the successor.
Notify the nodes whose IDs are in the range.
2016/2/29 Anson Ho, System Software Lab, NTHU

24 Dealing with problems of super-peer architecture
The traditional super-peer architecture has two problems:
– Single point of failure
– Hot spots

25 Dealing with problems of super-peer architecture (cont.)
Figure labels: inter-group routing information; record a redundant root peer; record backup root peers within 2 hops.

26 Agenda: Introduction; Malugo Architecture; Overlay operations; File operations; Performance evaluation; Conclusion

27 File Management Module
– The file management module lets clients insert files into, and retrieve files from, the Malugo system.
– When a file is inserted into the system, it is also replicated in every group for efficient downloads.
– Caching mechanism: idle peers cache popular files to improve load balancing.

28 File insertion
The user uploads the file to the corresponding node of the closest group.
The corresponding node notifies the root of its group that a file has been inserted.
The root notifies the neighbor roots.
Each root routes the insertion message to the corresponding node of its own group.

29 Replication
Goal: provide different levels of replication and save disk space.
Conventional replication mechanisms rely on global information. However, retrieving or maintaining global information in a peer-to-peer environment usually introduces a large computing and transfer overhead.
A novel replication mechanism based on a local view:
– Replication level (L): a file is replicated every L groups.
– Example: L=2
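One possible reading of "replicated every L groups" is sketched below. It is an interpretation, not the authors' algorithm: groups are placed by hop distance from the group where the file was inserted, and a group stores a full copy when that distance is a multiple of `spacing`; other groups keep only an indicator. How `spacing` maps to the replication level L is not fixed by the transcript (the consistency check on slide 33 mentions both L and L+1 steps), so `spacing`, `place_replicas`, and the group graph are hypothetical. The sketch also uses a global breadth-first search for brevity, whereas the slide stresses a local view.

```python
from collections import deque
from typing import Dict, List


def hop_distances(neighbors: Dict[str, List[str]], origin: str) -> Dict[str, int]:
    """Breadth-first hop distance from `origin` to every reachable group."""
    dist = {origin: 0}
    queue = deque([origin])
    while queue:
        group = queue.popleft()
        for nb in neighbors[group]:
            if nb not in dist:
                dist[nb] = dist[group] + 1
                queue.append(nb)
    return dist


def place_replicas(
    neighbors: Dict[str, List[str]], origin: str, spacing: int
) -> Dict[str, str]:
    """Mark each group as holding a full 'copy' or only an 'indicator'."""
    if spacing < 1:
        raise ValueError("spacing must be at least 1")
    dist = hop_distances(neighbors, origin)
    return {g: "copy" if d % spacing == 0 else "indicator" for g, d in dist.items()}


# Hypothetical chain of groups: A - B - C - D - E, file inserted at A.
groups = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
print(place_replicas(groups, "A", spacing=2))
# {'A': 'copy', 'B': 'indicator', 'C': 'copy', 'D': 'indicator', 'E': 'copy'}
```

With `spacing=1` every group holds a copy, which corresponds to full replication; larger values trade download locality for disk space.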

30 Replication (cont.)

31 File retrieval
Locate the proper group first.
Connect to the corresponding node according to the Chord protocol.

32 Caching mechanism
Reduces the load on nodes hosting popular files.
Figure labels: busy; upper bound reached; further connections are rejected or current downloads slow down.

33 Dynamics with replication
Peers joining and leaving affect where files are located among the peers, so this must be handled.
– Nodes that hold the actual file: check that the groups within L steps do NOT have the file and that the groups at L+1 steps DO have the file.
– Nodes that hold only a file indicator: check that the node the indicator points to DOES have the file.

34 Agenda: Introduction; Malugo Architecture; Overlay operations; File operations; Performance evaluation; Conclusion

35 Performance evaluation
Simulations
– Performance comparison
– System performance under different grouping bounds
– System performance under different numbers of nodes
– System performance under different replication levels
Experiments in the real world

36 Performance evaluation (cont.)
Figures: results with different grouping bounds; comparison with a non-physically-grouping architecture (axis label: pdf).

37 Performance evaluation (cont.)
Figures: download speed under different replication levels; results with different numbers of nodes. Axis labels: number of hops; access rate (kbps).

38 Performance evaluation (cont.)
Figure: 42 servers at 9 sites in 4 cities. Axis label: access rate (kbps).

39 Agenda: Introduction; Malugo Architecture; Overlay operations; File operations; Performance evaluation; Conclusion

40 Conclusions
Malugo has the following properties:
– Self-organized
– Relies only slightly on global information
– Load-balanced
– Achieves geographic locality
We have also successfully implemented the Malugo system in the real world.
Work in progress:
– User authorization
– User/file permissions
In the future, we will develop Malugo into a storage platform that has file system properties.

