Network Computing Laboratory Scalable File Sharing System Using Distributed Hash Table Idea Proposal April 14, 2005 Presentation by Jaesun Han.


1 Network Computing Laboratory Scalable File Sharing System Using Distributed Hash Table Idea Proposal April 14, 2005 Presentation by Jaesun Han

2 Network Computing Laboratory | Korea Advanced Institute of Science and Technology
Contents
  - One-line comment
  - Motivation & problems
  - My idea
  - Key idea
  - Distributed hash table
  - P2P file sharing system using DHT
  - Technical challenges
  - Conclusion

3 One-line comment
  - Achieve a fully decentralized P2P file sharing system by distributing the file indexing structure as a distributed hash table (DHT).

4 Scalability in file sharing is a key practical issue
  - Even worse are bursts of requests for popular ("hot") files and network attacks such as DDoS.
  [Figure: Korean users and US users reach the file sharing infrastructure over the Internet; labels "Hot!!" and "Explosion".]

5 Solution Approach
  - Goal: a scalable solution for file sharing.
  - Investigate the file sharing solutions that currently exist. P2P-based file sharing currently seems the most appropriate.
  - Investigate methods to make P2P-based approaches scalable: a fully decentralized architecture for P2P-based file sharing.

6 Key Idea: Decentralized indexing
  - Existing schemes are either centralized or self-indexing. E.g., …
  - Self-indexing is not really an indexing scheme: such systems have no index at all.
  - They make up for the missing index with a flooding-based search mechanism → high search overhead.
  [Figure: Search(k.mp3) against a central index table (node/file rows: Node1 k.mp3, Node2 s.mp3, Node3 n.mp3, Node4 b.mp3) versus a distributed index table split across nodes 1-4 by name range (a-e, f-m, n-r, s-z).]

7 Key Idea
  - Distributed indexing: split the index table and distribute one part to each node.
  - Hash table for distributed indexing: makes fast lookup possible.
      Input to the hash table: file name
      Output from the hash table: node address
  - Distributed hash table for distributed indexing: split the hash table and distribute it to the nodes; lookups follow shortcut paths.
  - Result: P2P file sharing with DHT.
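The "split index table" step can be sketched as follows. This is a minimal illustration, not the presenter's implementation: the name ranges are the ones drawn in the figure, but pairing them with nodes 1-4 in order, and the function name index_node_for, are assumptions made here.

```python
# Name ranges as drawn on the slide; assigning them to nodes 1-4 in this
# order is an assumption for illustration.
NAME_RANGES = [("a", "e", 1), ("f", "m", 2), ("n", "r", 3), ("s", "z", 4)]


def index_node_for(filename: str) -> int:
    """Return the node that holds the index entries for this file name."""
    if not filename:
        raise ValueError("empty file name")
    first = filename[0].lower()
    for low, high, node in NAME_RANGES:
        if low <= first <= high:
            return node
    raise ValueError(f"no index partition for {filename!r}")


for name in ["k.mp3", "s.mp3", "n.mp3", "b.mp3"]:
    print(name, "-> index entry kept on node", index_node_for(name))
```

Splitting by first letter is easy to follow but can be uneven when many names start with the same letter, which is one reason a hash of the file name is likely a better partitioning key.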

8 DHT based File sharing: Technical Challenges
  - Routing: how does a search reach the node that holds the right part of the index?
  - Nodes often join and leave!
  [Figure: Search(k.mp3) against the distributed index table split across nodes 1-4 (a-e, f-m, n-r, s-z).]

9 Related Works: Peer-to-Peer File Sharing System
  - Sharing files among personal computers, e.g. Soribada, eDonkey, KaZaa, Gnutella
  - 33.4% of Internet traffic in a KT investigation (2004.2)
  - Millions of simultaneous users
  - Key technical issue in existing P2P file sharing systems: file indexing
  - Evolution of the indexing scheme to improve scalability:
      1st generation: centralized indexing
      2nd generation: fully decentralized self-indexing
      3rd generation: semi-centralized indexing

10 Related Works: First generation file sharing system
  - Centralized indexing (e.g. Soribada, Napster)
  - Problems: not scalable, single point of failure
  [Figure: nodes N1-N5 and a centralized directory server (napster.com) holding a file/node table (a.mp3 → N5). A node sends Search(a.mp3) to the server, receives N5's IP address, sends Request(a.mp3) to N5, and receives File(a.mp3).]

11 Related Works: Second generation file sharing system
  - Fully decentralized self-indexing (e.g. Gnutella)
  - Problems: flooding overhead, partial searching
  [Figure: Search(a.mp3) flooded among nodes N1-N9. Search result → N3, N5, N8. Selected node → N5.]
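Both problems named on this slide show up in a small simulation. The sketch below assumes a TTL-limited breadth-first flood, which is how Gnutella-style search is commonly described; the four-node topology, the holder set, and the function name flood_search are hypothetical.

```python
from collections import deque


def flood_search(graph, holders, start, ttl):
    """Flood a query from `start`; return (nodes that answered, messages sent)."""
    seen = {start}
    frontier = deque([(start, ttl)])
    hits, messages = set(), 0
    while frontier:
        node, remaining = frontier.popleft()
        if node in holders:
            hits.add(node)
        if remaining == 0:
            continue  # TTL exhausted: the query is not forwarded further
        for neighbor in graph[node]:
            messages += 1  # every forwarded copy costs one message
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return hits, messages


# Hypothetical chain topology N1 - N2 - N3 - N4; only N4 has the file.
graph = {"N1": ["N2"], "N2": ["N1", "N3"], "N3": ["N2", "N4"], "N4": ["N3"]}
print(flood_search(graph, {"N4"}, "N1", ttl=2))  # file exists but is not found
print(flood_search(graph, {"N4"}, "N1", ttl=3))  # found, at a higher message cost
```

With a small TTL the query never reaches N4 even though the file exists (partial searching); raising the TTL finds it but sends more messages (flooding overhead).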

12 Related Works: Third generation file sharing system
  - Semi-centralized indexing (e.g. eDonkey, KaZaa)
  - Problems: partial searching, vulnerable to DoS attacks
  [Figure: peers attached to supernodes; Search(a.mp3) goes through a supernode and File(a.mp3) is then transferred between peers.]

13 Distributed Hash Table, Basic (1)
  - File name → H(x) → file ID
  - Node address → H(x) → node ID
  - Each file ID is mapped to a node ID.
  [Figure: file names (a.mp3, k.txt, x.mpg, g.doc) and node addresses (k, b, n, w) are hashed with H(x) into the same ID space, next to a table of hash keys and nodes. Node IDs and the file ID ranges they cover: 30 (0-30), 71 (31-71), 89 (72-89), 127 (90-127).]
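A minimal sketch of this mapping, using the node IDs from the figure. The slide only names H(x); using SHA-1 reduced to the 0-127 ID space is an assumption made here, as are the function names.

```python
import hashlib
from bisect import bisect_left

ID_BITS = 7                    # IDs 0..127, matching the slide's figure
NODE_IDS = [30, 71, 89, 127]   # node IDs shown on the slide


def h(value: str) -> int:
    """Hash a file name or node address into the shared ID space.

    SHA-1 is an assumption; the slide does not say which H(x) is used.
    """
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << ID_BITS)


def responsible_node(file_id: int, node_ids=NODE_IDS) -> int:
    """Return the first node ID that is >= file_id, wrapping around the ring."""
    ring = sorted(node_ids)
    i = bisect_left(ring, file_id)
    return ring[i % len(ring)]


for name in ["a.mp3", "k.txt", "x.mpg", "g.doc"]:
    fid = h(name)
    print(f"{name}: file ID {fid} -> node {responsible_node(fid)}")
```

Because file names and node addresses go through the same hash, any peer can compute where an index entry belongs without asking a central server.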

14 Distributed Hash Table, Basic (2)
  - Keys and nodes are uniformly distributed and live in the same ID space.
  - Each node is responsible for the keys between its predecessor node and itself.
  [Figure: 3-bit ID ring 000-111. Keys are placed at H(g.txt), H(a.mp3), H(x.doc), H(s.mpg), H(k.mp3); index entries shown next to the nodes: g.txt(2,8), a.mp3(1), x.doc(4), s.mpg(1,4), k.mp3(2).]

15 Distributed Hash Table, Routing (1)
  - Naïve approach: each node knows only its successor node.
  - A lookup request is forwarded to the successor until Node ID < File ID ≤ Successor Node ID.
  - Worst-case performance: O(N)
  [Figure: 3-bit ID ring 000-111 with successor pointers (successor=010, successor=100, successor=111, successor=001). Lookup(H(k.mp3)) → Lookup(101) is forwarded from successor to successor.]
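The naive scheme can be sketched as a walk along successor pointers. The five-node ring below is one reading of the figure and should be treated as hypothetical; the interval test treats the successor as the owner of a key equal to its own ID, in line with the responsibility rule on the previous slide.

```python
SPACE = 8  # 3-bit ID space, as on the slide

# Hypothetical ring: node ID -> successor node ID.
SUCCESSOR = {0b001: 0b010, 0b010: 0b100, 0b100: 0b110, 0b110: 0b111, 0b111: 0b001}


def in_range(key: int, node: int, succ: int) -> bool:
    """True if key lies in the ring interval (node, succ], wrapping past 111."""
    if node == succ:  # a single-node ring owns every key
        return True
    return 0 < (key - node) % SPACE <= (succ - node) % SPACE


def naive_lookup(start: int, key: int):
    """Forward the lookup successor by successor; return (owner, hops)."""
    node, hops = start, 0
    while not in_range(key, node, SUCCESSOR[node]):
        node = SUCCESSOR[node]
        hops += 1
    return SUCCESSOR[node], hops


owner, hops = naive_lookup(0b001, 0b101)
print(f"key 101 is owned by node {owner:03b}, reached after {hops} forwards")
```

In the worst case the request visits every node on the ring, which is where the O(N) bound comes from.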

16 Distributed Hash Table, Routing (2)
  - Tree-based routing table: each node keeps shortcuts to nodes whose IDs differ from its own at each bit position.
  - A 2^m ID space needs m entries.
  - Lookup performance → O(log N)
  [Figure: binary tree over the 3-bit IDs 000-111 with nodes a, b, c, d. Shortcut tables: (011, 00x, 1xx) → (c, a, c) and (101, 11x, 0xx) → (d, d, a). Example: Lookup(101).]
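One way to read the shortcut table is as prefix routing: each hop fixes at least one more leading bit of the key. The sketch below is an illustration under that reading, not the slide's exact tables; the node set, the tie-breaking (lowest-ID candidate per entry), and the rule that a lookup stops at the closest-prefix node when no shortcut exists are all assumptions.

```python
M = 3  # ID length in bits; a 2^M ID space gives at most M table entries


def bit(x: int, i: int) -> int:
    """Bit i of x, counting from the most significant of M bits."""
    return (x >> (M - 1 - i)) & 1


def shared_prefix_len(a: int, b: int) -> int:
    """Number of leading bits on which a and b agree."""
    for i in range(M):
        if bit(a, i) != bit(b, i):
            return i
    return M


def build_table(node: int, nodes) -> dict:
    """Entry i points to a node that agrees on the first i bits and differs at bit i."""
    table = {}
    for other in sorted(nodes):
        i = shared_prefix_len(node, other)
        if i < M and i not in table:
            table[i] = other
    return table


def lookup(start: int, key: int, tables: dict):
    """Follow shortcuts toward key; return (final node, path taken)."""
    node, path = start, [start]
    while True:
        i = shared_prefix_len(node, key)
        if i == M or i not in tables[node]:
            return node, path
        node = tables[node][i]
        path.append(node)


nodes = [0b000, 0b010, 0b100, 0b110]  # hypothetical node IDs
tables = {n: build_table(n, nodes) for n in nodes}
final, path = lookup(0b010, 0b110, tables)
print([f"{n:03b}" for n in path])
```

Since every hop extends the matched prefix by at least one bit, a lookup takes at most M hops, which corresponds to the O(log N) figure on the slide when IDs are spread evenly.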

17 Distributed Hash Table, Routing (3)
  - Complete example of routing tables and the routing algorithm: lookup from node 65a1fc with key d46a1c.
  [Figure: routing tables of the nodes along the lookup path.]

18 Distributed Hash Table, Join
  - Join process: the joining node issues a lookup with its own node ID as the lookup key.
  - It gathers routing table entries while that lookup is being routed.
  [Figure: lookup from node d46a1c with key d46a1c, and routing table creation. Table rows by prefix: 0- 1- 2- 3- 4- 5- 6- 7- 8- 9- a- b- c- e- f-; d0- d1- d2- d3- d5- ….. dc- dd- de- df-; d40- d41- d42- d43- d44- d45- ….. d4f-.]
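The rows in the figure follow a simple pattern: row r lists every prefix that matches the node's ID on the first r hex digits and differs at digit r. A short sketch that generates these slots and fills them from a set of known nodes is given below; the function names and the known-node list are hypothetical, and how entries are actually collected from nodes on the join path is not specified in the slide.

```python
HEX_DIGITS = "0123456789abcdef"


def routing_table_slots(node_id: str, rows: int):
    """Row r holds the prefixes sharing r digits with node_id, differing at digit r."""
    slots = []
    for r in range(rows):
        prefix, own_digit = node_id[:r], node_id[r]
        slots.append([prefix + d for d in HEX_DIGITS if d != own_digit])
    return slots


def fill_slots(node_id: str, known_nodes, rows: int) -> dict:
    """Map each slot prefix to one known node ID with that prefix, if any."""
    table = {}
    for row in routing_table_slots(node_id, rows):
        for prefix in row:
            matches = sorted(n for n in known_nodes if n.startswith(prefix))
            if matches:
                table[prefix] = matches[0]
    return table


for row in routing_table_slots("d46a1c", rows=3):
    print(" ".join(p + "-" for p in row))

# Hypothetical nodes learned while the join lookup was being routed.
known = ["65a1fc", "d13da3", "d4213f"]
print(fill_slots("d46a1c", known, rows=3))
```

The first printed row omits d-, the second omits d4-, and the third omits d46-, matching the pattern of the table in the figure.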

19 P2P File Sharing with DHT
  - The file index is stored in the DHT.
  - Example: node a shares a new file g.txt, and node b looks up g.txt.
      1. Node a: hash g.txt → file ID = 101
      2. Node a: insert the file info with ID = 101
      3. Node b: hash g.txt → file ID = 101
      4. Node b: lookup with ID = 101, which returns addr(a)
      5. Node b: download file g.txt from node a
  - File index table entry (ID, file name, node addr): 101, g.txt, a
  [Figure: 3-bit ID space 000-111 with nodes a, b, c, d and their shortcut tables: (011, 00x, 1xx) → (c, a, c); (101, 11x, 0xx) → (d, d, a); (000, 01x, 1xx) → (a, c, c); (110, 10x, 0xx) → (d, c, a).]
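The five steps can be sketched with a tiny in-memory model. Everything here is illustrative: the class name TinyDHT, the SHA-1 hash, the node IDs, and the successor rule for choosing the indexing node are assumptions, and real routing between nodes is replaced by a direct table access.

```python
import hashlib
from bisect import bisect_left
from collections import defaultdict


class TinyDHT:
    """In-memory stand-in for a DHT that stores a file index (no networking)."""

    def __init__(self, node_ids, bits=3):
        self.space = 1 << bits
        self.ring = sorted(node_ids)
        # Per-node index table: file name -> set of addresses that share it.
        self.index = {n: defaultdict(set) for n in self.ring}

    def file_id(self, name: str) -> int:
        digest = hashlib.sha1(name.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % self.space

    def responsible(self, fid: int) -> int:
        i = bisect_left(self.ring, fid)
        return self.ring[i % len(self.ring)]

    def publish(self, name: str, owner_addr: str) -> int:
        """Steps 1-2: hash the name and insert the entry at the indexing node."""
        fid = self.file_id(name)
        self.index[self.responsible(fid)][name].add(owner_addr)
        return fid

    def find(self, name: str) -> set:
        """Steps 3-4: hash the name and ask the indexing node who shares it."""
        fid = self.file_id(name)
        return set(self.index[self.responsible(fid)].get(name, set()))


dht = TinyDHT([0b000, 0b010, 0b100, 0b110])  # hypothetical node IDs
dht.publish("g.txt", "addr(a)")              # node a shares g.txt
print(dht.find("g.txt"))                     # node b learns where to download it
```

Step 5, the download itself, happens directly between the two peers and does not involve the DHT.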

20 File Sharing with DHT: Technical Challenges
  - Frequent node join & leave → index replication & fast routing table adaptation
  - Exact-match search caused by hashing the file name → keyword search scheme
  - Hotspot problem at the node that indexes a popular file → load balancing mechanism
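The exact-match limitation follows from hashing the whole file name: a query for part of the name hashes to an unrelated ID. One possible direction, sketched below as an assumption rather than the proposal's design, is to index each keyword of a file name separately, so that every keyword could serve as its own DHT key. The tokenization rule and function names are hypothetical, and a plain dictionary stands in for the DHT.

```python
import re
from collections import defaultdict


def keywords(filename: str):
    """Split a file name into lowercase alphanumeric tokens."""
    return sorted({t for t in re.split(r"[^a-z0-9]+", filename.lower()) if t})


def build_keyword_index(filenames):
    """Map each keyword to the file names containing it (one DHT key per keyword)."""
    index = defaultdict(set)
    for name in filenames:
        for kw in keywords(name):
            index[kw].add(name)
    return index


def search(index, query: str) -> set:
    """Return the files that contain every keyword of the query."""
    sets = [index.get(kw, set()) for kw in keywords(query)]
    return set.intersection(*sets) if sets else set()


index = build_keyword_index(["blue_sky.mp3", "blue_moon.mp3"])
print(search(index, "blue moon"))
```

A side effect worth noting is that very common keywords concentrate many entries on one node, which ties back to the hotspot problem listed on the slide.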

21 Conclusion
  - A new approach for P2P file sharing systems, built on a new distributed data structure: the Distributed Hash Table (DHT)
  - Fully decentralized indexing
  - Guaranteed lookup performance of O(log N)
  - Full (not partial) search is possible
  - Robust to node failures and to network attacks such as DoS

