“Blessed are the poor in spirit: for theirs is the kingdom of heaven.”


Intelligent File Sharing Framework A THESIS IN Computer Science Changgyu Oh 5/2/2002

Contents
– Motivations
– Network Topologies
– Problem Domains
– Research Goal
– Related Works
– Intelligent File Sharing Framework
– Framework Figure
– Query Service Using Reasoning
– IS-A/Contained-In Hierarchies
– File Association Rules
– The Benefits of IFS Search
– Grouping Service
– IFS P2P vs. P2P Network
– Benefits of Dynamic Group Partition
– Dynamic Group Partition
– IP Clue Mechanism
– File Transaction in IFS
– Query Service Types
– IFS System Architecture
– Client View
– Server View
– IFS Prototype Implementation
– IFS Query Interface
– Experimental Results
– Comparative Analysis
– Contributions
– Conclusion
– Future Work
– References

Motivations
Why P2P?
– Limitations of Client/Server
– Increasing interest in sharing and collaborative computing
– Improving P2P technologies
Why P2P File Sharing?
– File reusability
– Share available resources
Significance of this research
– Increase network scalability
– Anonymity
– Flexible and powerful query

Network Topologies

Problem Domains (1)
Limitations of P2P Network
– Scalability
– Utilization of network resources
– P2P network topology
  Broadcast
  Logical mesh network

Problem Domains (2)
Limitation of Resource Source's Anonymity
– The resource source's IP address appears in the queryHit message
  Privacy and security
  How can the source node send it to the destination without revealing its IP address in public?

Problem Domains (3)
Limitation of Keyword-Based Query
– Primitive and limited
– Searches for only one file at a time
– Not flexible
– Does not satisfy users' requests

Research Goal
To increase P2P network scalability
– Message flow control (Dynamic Group Partition and Caching)
To protect publisher anonymity
– IP-Clue mechanism (Encoding/Decoding)
To increase the capacity of file querying
– File querying using intelligent reasoning, caching, and dynamic peer groups

Related Works-I
Anonymous Publication Service
The Publius system [Marc W., 2000]
– Provides document anonymity: the key is split among the n servers, and without sufficient shares of the key a server is unable to decrypt the document that it stores.
– Anonymity is based on a static, system-wide list of available servers.
– Does not support adding new servers.
The Eternity system [Ross J., 1996]
– Provides publisher anonymity by using one-way anonymous remailers.
– Server anonymity is not provided.
– Reader anonymity is not provided by open public proxies.
Query and Advertising System [Heimbigner D., 2000]
– An arbitrary name is placed at the first-level server for each client.
– The first-level server has the actual IP addresses of clients.
Freenet [Ian C., 2000]
– Provides document anonymity.
– Server anonymity is not provided.

Related Works-II
Meta Search Methods
Efficient and Effective Metasearch [Yu C., 1999]
– Representatives for each database, optimizing the relationship hierarchy
Efficient Transitive Closure Reasoning [Lee Y., 2001]
– Inheritance, classification, transitive closure reasoning
– Class/Part/Containment hierarchy
Browsing Large Digital Library Collections [Geffner S., 1999]
– Classification hierarchies to increase the capabilities of data browsing in digital libraries

Related Works-III
File Sharing Systems using Caching
The Distributed File System [Burns, R.C., 2000]
– Detecting network failures ensures that caches are consistent.
Network File System [Palmer J., 1996]
– Clients poll the server to find out when the file was last modified.
– Determines whether the cached version is valid.
Hint-Based Cooperative Caching file system [Sarkar, P., 2000]
– Helps clients make decisions based on the computer's local state.
– Reduces overhead and access latency.

Intelligent File Sharing Framework
Major Building Blocks:
– Query Service using Reasoning
– IP-Clue Mechanism: Encoding/Decoding
– Dynamic Grouping and Caching Service

Query Service Using Reasoning
Goal:
– Fast search using the File Relation Hierarchy Set
– More flexible query and directory services
Approach:
– Relationships: IS-A, Contained-In, Run-With
– File Relation Hierarchy Set: Set of Number pairs (Ν), Relation Type (Ŗ), Constraint Rule (Ω), Hierarchy Identifier (Њ)
– File Association Rules: Generalized Association Rule, Aggregated Association Rule, Constraint-based Association Rule

IS-A/Contained-In Hierarchies

File Association Rules
Generalized Association Rule
– Subtype relationship between files
– E.g., if Windows multimedia application X is a multimedia application Y, and multimedia file Z runs with multimedia application Y, then X runs Z.
Aggregated Association Rule
– A directory contains multiple sub-directories or files
– E.g., "Find the files on CS101 homework"
Constraint-based Association Rule
– File association based on constraints such as file size, network capacity, etc.
– E.g., "Find a file whose size is less than 1 MB and that can be opened with MS Word."
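The generalized association rule can be illustrated with a small sketch. It only mirrors the example above (X IS-A Y, Z runs with Y, therefore X runs Z); the dictionaries, the sample names, and the functions `ancestors` and `runnable_files` are hypothetical and are not taken from the IFS prototype.

```python
# Hypothetical facts for illustration; not data from the IFS prototype.
IS_A = {"WindowsPlayerX": "MultimediaAppY"}     # subtype -> supertype
RUN_WITH = {"MultimediaAppY": {"clip_z.mpg"}}   # application type -> files it runs

def ancestors(name, is_a):
    """Yield `name` followed by each of its IS-A supertypes."""
    seen = set()
    while name is not None and name not in seen:
        seen.add(name)
        yield name
        name = is_a.get(name)

def runnable_files(app, is_a, run_with):
    """Files the app can run, inherited through the IS-A chain."""
    files = set()
    for kind in ancestors(app, is_a):
        files |= run_with.get(kind, set())
    return files

print(runnable_files("WindowsPlayerX", IS_A, RUN_WITH))
```

Walking up the IS-A chain keeps the Run-With facts stored once, at the most general application type.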

The Benefits of IFS Search
Method                                            IFS Search   Keyword-Based Search
Keyword Search                                    Yes
File Extension Search                             Yes          No
Application Search                                Yes          No
Directory Search                                  Yes          No
Keyword Search in a certain directory             Yes          No
File Extension Search in a certain directory      Yes          No
File Search with Constraints                      Yes
Combination                                       Yes          No

Grouping Service
Goal: Increase Scalability
– Control the maximum hop count
– Control the number of message replicas generated by peer hosts
– Control the number of peer hosts used for message forwarding in each peer host's routing table
Approach:
– Group partition
– Brother relationship
– Caching
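The first two goals, a hop limit and fewer message replicas, can be sketched with a generic hop-limited flood that drops duplicates. This is not the IFS grouping algorithm itself; the function `flood`, the adjacency-list format, and the sample overlay are assumptions made for illustration.

```python
from collections import deque

def flood(graph, source, max_hops):
    """Return {peer: hop count} for every peer reached within max_hops."""
    reached = {source: 0}
    queue = deque([source])
    while queue:
        peer = queue.popleft()
        hops = reached[peer]
        if hops == max_hops:
            continue                      # hop limit reached: do not forward
        for neighbor in graph.get(peer, ()):
            if neighbor not in reached:   # each peer accepts a query only once
                reached[neighbor] = hops + 1
                queue.append(neighbor)
    return reached

# Hypothetical overlay: peer -> neighbors in its routing table.
overlay = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
print(flood(overlay, "A", max_hops=2))
```

Lowering `max_hops` or shortening each neighbor list reduces the number of peers a single query touches, which is the effect the grouping service aims for.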

IFS P2P V.S. P2P Network

Benefits of Dynamic Group Partition
Broadcast within the same group
– Robust search against node failure
– Ensures a shortest path
Increase network scalability by grouping peers
– Serverless and decentralized
– Dynamic partition
– Reduces network traffic
  Requires only one hop per group

Dynamic Group Partition

IP Clue Mechanism
Goal: Protect the identity of the resource publisher in P2P file sharing
Approach
– IP Encoding/Decoding
  Encoding the IP in source peers
  Decoding the encoded IP in destination peers
– Formula: Assume that the IP address of A is represented as [W.X.Y.Z] (e.g., [ ])
  (1) W + the size of the query
  (2) X + the first character of the query
  (3) Y + the file extension size
  (4) Z + the last character of the query message
=> Only the destination peers can recognize the IP Clue!
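A minimal sketch of the four-step formula follows. The slide does not say how a "character" becomes a number or how values above 255 are handled, so this sketch assumes the character's code point (`ord`) and leaves the sums unbounded; the function names and the sample address are hypothetical.

```python
def _offsets(query, extension):
    """Offsets from the formula: query size, first char, extension size, last char."""
    # Assumption: a character contributes its code point via ord().
    return (len(query), ord(query[0]), len(extension), ord(query[-1]))

def encode_ip(ip, query, extension):
    """Source peer: add the query-derived offsets to W, X, Y, Z."""
    parts = [int(p) for p in ip.split(".")]
    return tuple(p + o for p, o in zip(parts, _offsets(query, extension)))

def decode_ip(clue, query, extension):
    """Destination peer: subtract the same offsets to recover the address."""
    parts = [c - o for c, o in zip(clue, _offsets(query, extension))]
    return ".".join(str(p) for p in parts)

clue = encode_ip("192.0.2.7", "report", "doc")
print(clue, decode_ip(clue, "report", "doc"))
```

Decoding works only for a peer that knows the original query and file extension, which matches the claim that only the destination peers can recognize the clue.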

IP-Clue Mechanism

File Transaction in IFS

QUERY SERVICE TYPES

IFS System Architecture
Component-based Architecture
Servant Component
– Highest level of component
– Server + Client Components
Manager Components:
– Control work flow
– Assign tasks to worker components
Worker Components:
– Perform actual tasks
Service (Entity) Components:
– Task description
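The division of roles among manager, worker, and service components can be sketched as below. The class names follow the slide, but every field and method (`kind`, `payload`, `register`, `dispatch`, `perform`) is an assumption for illustration and does not reflect the actual IFS or Phex code.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Service:
    """Task description (the service/entity role)."""
    kind: str
    payload: str

class Worker:
    """Performs the actual task for one kind of service."""
    def __init__(self, kind: str, action: Callable[[str], str]):
        self.kind = kind
        self.action = action

    def perform(self, service: Service) -> str:
        return self.action(service.payload)

class Manager:
    """Controls the work flow by assigning each task to a matching worker."""
    def __init__(self) -> None:
        self.workers: Dict[str, Worker] = {}

    def register(self, worker: Worker) -> None:
        self.workers[worker.kind] = worker

    def dispatch(self, service: Service) -> str:
        return self.workers[service.kind].perform(service)

manager = Manager()
manager.register(Worker("query", str.upper))   # stand-in for a real query worker
print(manager.dispatch(Service("query", "cs101 homework")))
```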

Client View

Server View

IFS Prototype Implementation
The IFS prototype is built on top of the Gnutella Phex system
Development System Environment
– Needs at least 25 MB of free memory
– Java Virtual Machine
– Pentium III 500 MHz CPU
Event-Driven Methods
– Each task is performed based on events
Component-Based Programming
– Manager Components
– Worker Components
– Service Components

IFS Query Interface

Experimental Results Dynamic Group Partition and Cache

Comparative Analysis
Measure: Napster | Gnutella | IFS
Topology: Client/Server | Logical Mesh
Design Purpose: MP3 file sharing | File sharing in a decentralized manner | Enhanced Gnutella
Size of Routing Table: Needs a server's IP address | O(N) | O(K), where K << N
Node Join Operation: O(1)
Node Failure: Severe | Tolerable
Search Mechanism: File indexing based on keyword search | Fast reasoning based on file association rules
Description: Client/server-based P2P network; heavy traffic on servers; node failure is severe | Decentralized; heavy traffic due to the exponentially increasing replicas of query messages | Decentralized; controls network traffic; flexible query mechanism

Contributions
Proposed a conceptual framework for decentralized P2P file sharing
– Dynamic group partition and caching
– Query using fast reasoning
– IP-Clue mechanism (encoding/decoding)
Designed a component-based architecture
Implemented the framework as an extension of an existing file sharing system (Gnutella Phex)

Conclusion
The IFS system
– Supports decentralized P2P file sharing.
– Increases network scalability.
– Provides flexible file searching and querying.
– Protects resource sources' anonymity.

Future Work
Further research on the latency caused by grouping
File registration strategy for heterogeneous environments
Discover advanced mechanisms for reasoning about file relationships and file association rules
Research on grouping policies
– Grouping by peer host's network capacity
– Grouping by interests
– Grouping by context
– Grouping by location

References:
C. T. Yu, W. Meng, K.-L. Liu, W. Wu, and N. Rishe. Efficient and effective metasearch for a large number of text databases. In CIKM, pages , 1999.
Y. Lee and J. Geller. Efficient Transitive Closure Reasoning in a Combined Class/Part/Containment Hierarchy. Journal of Knowledge and Information System, 2002.
S. Geffner, D. Agrawal, A. Abbadi, and T. Smith. Browsing Large Digital Library Collections Using Classification Hierarchies. CIKM, , 1999.

References: (Continued)
M. Waldman, A. Rubin, and L. F. Cranor. Publius: A robust, tamper-evident, censorship-resistant, web publishing system. In Proc. 9th USENIX Security Symposium, pages 59-72, August 2000.
R. J. Anderson. The Eternity service. In Proceedings of the 1st International Conference on the Theory and Applications of Cryptology (PRAGOCRYPT '96), Prague, Czech Republic.
J. Palmer, R. Strong, and E. Upfal. Nonblocking membership protocols with asymmetric safety. Technical Report RJ10096 (91912), IBM Research Division, December 1997.

References: (Continued)
I. Clarke, O. Sandberg, B. Wiley, and T. Hong. Freenet: A distributed anonymous information storage and retrieval system. In Proceedings of the Workshop on Design Issues in Anonymity and Unobservability, pages 46-66, July
D. Heimbigner. Adapting Publish/Subscribe Middleware to Achieve Gnutella-like Functionality. Technical Report CU-CS , Department of Computer Science, University of Colorado, Sept
P. Sarkar and J. H. Hartman. ACM Transactions on Computer Systems (TOCS), Volume 18, Issue 4, November 2000.