
Defending Sybil Attack in Peer2Peer Networks
Md. Tanvir Al Amin 04 09 05 2064
Shah Md. Rifat Ahsan 10 09 05 2060
Adviser: Dr. Reaz Ahmed
Distributed Search Techniques

Sybil Attack
- A fundamental problem in distributed systems.
- A single user assumes many fake (sybil) identities; this has already been observed in real-world P2P systems.
- Sybil identities can become a large fraction of all identities and "out-vote" honest users in collaborative tasks.
[Figure: a malicious user launches a sybil attack among honest users.]

Sybil attack
- Present at both the application level and the P2P networking level.
- The attacker creates many fake (sybil) identities.
- Many real-world attacks have been reported, e.g. on Digg and YouTube.
- Several research works have shown how easy it is to subvert DHTs such as Chord or Kademlia using a Sybil attack.
- An automated sybil attack on YouTube for $147!

Defending against Sybil attacks
- Traditional solutions rely on central trusted authorities, which runs counter to the open membership policies of online social networks (OSNs).
- Recent proposals leverage social networks.
  - Lots of research activity recently
  - Each scheme is optimized under its own assumptions about the graph structure
  - Each scheme is evaluated on different datasets
- Schemes: SybilGuard [SIGCOMM'06], SybilLimit [Oakland'08], Ostra [NSDI'08], SumUp [NSDI'09], SybilInfer [NDSS'09], Whanau [NSDI'10], MobID [INFOCOM'10]
- All schemes analyze the graph structure to isolate Sybils.

Defending against Sybil attacks
- Recent proposals leverage social networks.
- Key insight: social links are hard to acquire in abundance.
  - Look for small cuts in the graph.
  - Conversely, look for communities around known trusted nodes.
- Why links are difficult to create:
  - Dunbar's number
  - Power-law node degrees

WHAT DO SOCIAL NETWORKS LOOK LIKE?

SybilGuard: Defending Against Sybil Attacks via Social Networks
SybilGuard is a system for detecting Sybil nodes in social graphs.
Features of SybilGuard:
- Enables an honest node to identify other nodes: a verifier node V can verify whether a suspect node S is malicious.
- Guaranteed bound on the number of sybil groups.
- Guaranteed bound on the size of sybil groups.
- Completely decentralized.
Key insight:
1. Use a social network to limit Sybils.
2. Social links are hard to acquire in abundance.
3. Look for small cuts in the graph.
[Figure: DBLP network.]

Dunbar's number
- Limits the number of stable social relationships a user can have to less than a couple of hundred.
  - Linked to the size of the neocortex region of the brain
  - Observed throughout history since hunter-gatherer societies
  - Roughly reported to be 150
- Also observed repeatedly in studies of OSN user activity.
  - Users might have a large number of contacts
  - But they regularly interact with less than a couple of hundred of them

Power-law node degrees
[Figure: node degree distributions of the U.S. highway network and the U.S. airline network.]

Path lengths and diameter
- All major networks have short path lengths, from 4.25 to 5.88 (cf. six degrees of separation).
- Facebook: 6.12 (4.2 million, October 2007), from http://blog.paulwalk.net/2007/10/08/no-degrees-of-separation/

Implications of path lengths and diameter
The small diameter and path lengths of social networks are likely to impact the design of techniques for finding paths in such networks.

Link degree correlations
- Do high-degree nodes tend to connect to other high-degree nodes, or to low-degree nodes?
- In real social networks the former holds, as shown by two metrics: the scale-free metric and the assortativity.
- This suggests that there exists a tightly connected "core" of high-degree nodes which connect to each other, with the lower-degree nodes on the fringes of the network.
- The next question: how big is the core?

Implications of link degree correlations
Spread of information: see "A Measurement-driven Analysis of Information Propagation in the Flickr Social Network" [WWW'09].

Densely connected core
- The graphs have a densely connected core comprising between 1% and 10% of the highest-degree nodes, such that removing this core completely disconnects the graph.
- Sub-logarithmic growth.


Implications of densely connected core
- The network contains a dense core of users.
- The core is necessary for the connectivity of 90% of users.
- Most short paths pass through the core.
- The core could be used for quickly disseminating information.
- So 10% of the nodes are at the core. What about the remaining nodes (the 90% at the fringe)?

What does the structure look like?
The networks contain a densely connected core of high-degree nodes, and this core links small groups of strongly clustered, low-degree nodes at the fringes of the network (an "octopus" shape).

Mixing time
- Random walk: choose each hop randomly.
- Mixing time: number of hops until uniform probability.
- Fast-mixing network: mixing time = O(log n).
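A minimal sketch of the mixing idea, assuming a small hypothetical graph (`GRAPH` and all function names are illustrative, not from the slides). On a graph whose nodes have different degrees, the walk's end point converges to a distribution proportional to node degree, i.e. uniform over edges, so the sketch measures the total variation distance to that distribution and counts hops until it is small.

```python
# Hypothetical 6-node undirected graph (connected, contains a triangle).
GRAPH = {
    0: [1, 2, 3],
    1: [0, 2],
    2: [0, 1, 3],
    3: [0, 2, 4],
    4: [3, 5],
    5: [4],
}

def step(dist, graph):
    """Advance the walk's probability distribution by one random hop."""
    nxt = {v: 0.0 for v in graph}
    for v, p in dist.items():
        for nb in graph[v]:
            nxt[nb] += p / len(graph[v])
    return nxt

def stationary(graph):
    """Limit distribution of the walk: proportional to node degree."""
    total = sum(len(nbs) for nbs in graph.values())
    return {v: len(nbs) / total for v, nbs in graph.items()}

def tv_distance(p, q):
    """Total variation distance between two distributions on the same nodes."""
    return 0.5 * sum(abs(p[v] - q[v]) for v in p)

def hops_to_mix(graph, start, eps=0.01, max_hops=1000):
    """Smallest hop count after which a walk from `start` is eps-close
    to the stationary distribution, or None if max_hops is exceeded."""
    dist = {v: 1.0 if v == start else 0.0 for v in graph}
    target = stationary(graph)
    for hops in range(max_hops + 1):
        if tv_distance(dist, target) <= eps:
            return hops
        dist = step(dist, graph)
    return None
```

Calling `hops_to_mix(GRAPH, 0)` gives the number of hops for this toy graph; in a fast-mixing network that number grows only logarithmically with n.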

Sampling by random walks
- A random walk has an o(1) chance of escaping the honest region. This is true when the number of attack edges g is bounded by o(n/log n).
- Of r walks, (1-o(1))r = Ω(r) end nodes are good!
- But we cannot distinguish good from bad nodes in the set.
[Figure: honest region and Sybil region, with escaping and non-escaping paths.]

Creating Social Link Is Hard

Social links maintained over Internet

Social network …

[Figure: social network with an honest region and a Sybil region, joined by attack edges.]
A malicious user fools an honest user and thereby creates an attack edge.

Sybil resilience & group attachment theory
- Sybil schemes find bond groups around a trusted node, but these are only a fraction of all honest nodes.
- Bond groups are hard for Sybils to infiltrate; this is not the case with identity groups.

SYBILGUARD Yu, Kaminsky, Gibbons, Flaxman, Sigcomm 2006

Problem Formulation and Objective
- Social network
  - n honest human users
  - 1+ malicious users, each with multiple sybil identities
  - SybilGuard enables an honest node to identify other nodes
  - A verifier node V can verify whether a suspect node S is malicious

SybilGuard
- Guaranteed bound on the number of sybil groups
  - Divides the n nodes into m equivalence classes
  - A group is sybil if it contains 1+ sybil nodes
- Guaranteed bound on the size of sybil groups
  - In a group, at most w sybil nodes
- Completely decentralized
  - An honest node accepts honest nodes with high probability
  - Rejects malicious nodes with high probability
  - Accepts a bounded number of sybil nodes

Random Routes
- The foundation of SybilGuard; different from a random walk.
- A random route begins at a random edge of a node.
- At every node:
  - For an incoming edge i, there is a unique outgoing edge j.
  - Thus, input to output is one-to-one mapped.
- A node A with d neighbors uniformly randomly chooses a permutation "x1, x2, ..., xd" among all permutations of 1, 2, ..., d.
- If a random route comes in from the ith edge, A uses edge xi as the next hop.
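The per-node permutation can be sketched as follows. This is an illustrative sketch, not SybilGuard's implementation: `GRAPH`, `make_routing_tables` and `random_route` are hypothetical names, and edges are identified by the neighbor at their other end.

```python
import random

# Hypothetical undirected social graph (adjacency lists).
GRAPH = {
    0: [1, 2, 3],
    1: [0, 2],
    2: [0, 1, 3],
    3: [0, 2, 4],
    4: [3, 5],
    5: [4],
}

def make_routing_tables(graph, seed=0):
    """Each node fixes one random permutation: incoming edge -> outgoing edge."""
    rng = random.Random(seed)
    tables = {}
    for node, nbs in graph.items():
        out = list(nbs)
        rng.shuffle(out)
        # A route arriving from nbs[i] always leaves toward out[i].
        tables[node] = dict(zip(nbs, out))
    return tables

def random_route(tables, start, first_hop, length):
    """Follow the route leaving `start` toward `first_hop` for `length` hops
    (length >= 1). Returns the visited nodes, including `start`."""
    route = [start, first_hop]
    prev, cur = start, first_hop
    while len(route) < length + 1:
        nxt = tables[cur][prev]  # determined entirely by the incoming edge
        route.append(nxt)
        prev, cur = cur, nxt
    return route
```

Because the next hop depends only on the incoming edge, two routes that enter a node over the same edge continue identically, which is the convergence property of random routes.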

SybilGuard Algorithm
Attack model:
- n honest users: one identity/node each
- Malicious users: multiple identities each (sybil nodes)
Node A verifies node B:
- A computes d random routes (length w)
- B computes d random routes (length w)
- If at least d/2 random routes intersect, accept B
- Else reject B
If there are few attack edges, then a sybil node's random route is less likely to reach the honest region, and vice versa.
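The acceptance rule can be sketched as below, assuming routes are given as lists of node names and that "intersect" means sharing at least one node. The function names and the example routes are hypothetical; real SybilGuard also authenticates routes, which is omitted here.

```python
def intersects(route_a, routes_b):
    """True if route_a shares at least one node with any route in routes_b."""
    nodes_b = set().union(*routes_b)
    return any(node in nodes_b for node in route_a)

def verifier_accepts(routes_a, routes_b):
    """A accepts B when at least half of A's routes intersect a route of B."""
    if not routes_a:
        return False
    hits = sum(1 for route in routes_a if intersects(route, routes_b))
    return 2 * hits >= len(routes_a)

# Hypothetical example: honest B's routes meet A's, sybil S's routes do not.
routes_a = [["A", "c", "d"], ["A", "e", "f"]]
routes_b = [["B", "d", "g"], ["B", "f", "h"]]
routes_s = [["S", "s1", "s2"], ["S", "s2", "s3"]]
```

With these example routes, `verifier_accepts(routes_a, routes_b)` is true and `verifier_accepts(routes_a, routes_s)` is false, because the sybil routes never leave the sybil region.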

Main Assumptions of SybilGuard
[Figure: honest nodes and sybil nodes, connected by attack edges.]

Properties of Random Routes
- Convergence: once two routes merge, they remain merged.
- Routes are back-traceable.
- There can be only one route of length w that traverses an edge e along a given direction at its ith hop.
- If two random routes ever share an edge in the same direction, then one of them must start in the middle of the other.
- Cycles can exist, but with low probability: Prob(diameter-k cycle) = 1/d^(k-2).

SybilGuard Algorithm
Step 1: Bootstrap the network. All users exchange signed keys. A key exchange implies that both parties are human and trustworthy.
Step 2: Choose a verifier (A) and a suspect (B). A and B send out random routes of a certain length (2 in this example). Look for intersections. A knows B is not a Sybil because multiple paths intersect, and they do so at different nodes.
[Figure: routes from A and B.]

SybilGuard Algorithm, cont.
[Figure: routes from A and B.]

SybilGuard Caveats
- Bootstrapping requires human interaction.
- Assumes that short random walks lie mostly in the honest region.
- Results in a poor threshold against colluding attackers.
- In a million-node network, each attack edge lets nearly 2000 sybil nodes be accepted.
- In a million-node network, SybilGuard cannot bound the number of sybils at all if there are more than 15,000 attack edges.

SybilLimit A Near-Optimal Social Network Defense Against Sybil Attacks

- Motivation: to mitigate the problems of SybilGuard.
- Basic insight: the social network (same as SybilGuard).
- SybilLimit novelty:
  1. Use many random routes, but shorter ones.
  2. Intersect edges, not nodes.
  3. Limit how often each edge is used.

Identity Registration
- Each node (honest or sybil) has a locally generated public/private key pair.
  - "Identity": V accepts S means V accepts S's public key K_S.
  - No PKI is assumed or needed.
- Every suspect S "registers" K_S on some other nodes.

Registration Goals
- Ensure that sybil nodes (collectively) register on only a limited number of honest nodes.
- Still provide enough "registration opportunities" for honest nodes.
[Figure: sybil region and honest region; each K marks a registered key of a sybil node or of an honest node.]

Acceptance Criteria
Accept S only if K_S is registered on sufficiently many honest nodes.
[Figure: sybil region and honest region; each K marks a registered key of a sybil node or of an honest node.]

Key Idea
- Take random "walks" of w hops.
  - Honest nodes: likely to remain in the honest region
  - Sybil nodes: must cross an attack edge to reach the honest region
- Register the key at the last hop of the "walk".
[Figure: sybil region and honest region with registered keys K.]

Verification Procedure (4 messages involved)
1. V asks S: request S's set of tails.
2. S answers: "I have three tails: A→B, C→D, E→F."
3. V finds a common tail: E→F.
4. V asks F: "Is K_S registered?"
5. F answers: "Yes."
V accepts S: tails intersect + key registered.
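A sketch of the check V performs, under simplifying assumptions: tails are modeled as directed edges `(from, to)`, and `registry` stands in for asking the node at each tail whether a key is registered there. The names `verify` and `registry` are hypothetical, and SybilLimit's limit on how often each tail is used is left out.

```python
def verify(verifier_tails, suspect_tails, suspect_key, registry):
    """Accept when V and S share a tail and S's key is registered on that tail.

    registry maps a tail (directed edge) to the set of keys registered on it.
    """
    common = set(verifier_tails) & set(suspect_tails)
    return any(suspect_key in registry.get(tail, set()) for tail in common)

# Example mirroring the slide: S holds tails A->B, C->D, E->F; V shares E->F.
suspect_tails = [("A", "B"), ("C", "D"), ("E", "F")]
verifier_tails = [("E", "F"), ("G", "H")]
registry = {("E", "F"): {"K_S"}}
```

Here `verify(verifier_tails, suspect_tails, "K_S", registry)` accepts, while a key that was never registered on the common tail is rejected.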

SybilGuard vs. SybilLimit
[Table: sybil nodes accepted by SybilGuard and by SybilLimit, by number of attack edges; the bound is "unbounded" in some ranges.]

SybilInfer: How to Win the Zombie Wars!
Prateek Mittal (MSRC Intern), George Danezis (MSR Cambridge)

SybilInfer
- Work from UIUC and Microsoft Research.
- A centralized algorithm.
- Uses the fast-mixing property of social networks to design a Bayesian classifier that classifies nodes.

Formal Model
- Assign probabilities to cuts being honest.
- Apply Bayes' theorem to obtain these probabilities.
- Next challenge: the model.
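The Bayes step can be written out as follows. This is a hedged reconstruction, not the slide's own equation: it assumes X denotes a candidate set of nodes on one side of a cut and T denotes the observed evidence (for example, random-walk traces); the symbol T is an assumption introduced here.

```latex
\Pr[X = \text{honest} \mid T]
  \;=\;
  \frac{\Pr[T \mid X = \text{honest}] \; \Pr[X = \text{honest}]}{\Pr[T]}
```

Under this reading, the remaining modeling work is presumably the likelihood term, i.e. how probable the evidence T is when X really is the honest set.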

Formal Model, cont.
[Figure: the set X.]

SYBIL PROOF DHT

Distributed Hash Table
- Interface: PUT(key, value), GET(key) → value
- Route to the peer responsible for the key
- Example: GET(sip://alice@foo), PUT(sip://alice@foo, 18.26.4.9)
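A toy, single-process sketch of the PUT/GET interface, assuming peers sit on a hash ring and a key belongs to the first peer whose ID is at or after the key's hash. The class `ToyDHT`, the peer names and the ring layout are illustrative assumptions; no real routing or networking is modeled.

```python
import hashlib
from bisect import bisect_left

def h(text, bits=32):
    """Hash a string to an integer ID on a ring of size 2**bits."""
    digest = hashlib.sha1(text.encode()).digest()
    return int.from_bytes(digest, "big") % (1 << bits)

class ToyDHT:
    def __init__(self, peer_names):
        # Ring of (peer ID, peer name), sorted by ID.
        self.ring = sorted((h(name), name) for name in peer_names)
        self.store = {name: {} for name in peer_names}

    def responsible_peer(self, key):
        """First peer clockwise from the key's hash (wrapping around)."""
        ids = [pid for pid, _ in self.ring]
        idx = bisect_left(ids, h(key)) % len(self.ring)
        return self.ring[idx][1]

    def put(self, key, value):
        self.store[self.responsible_peer(key)][key] = value

    def get(self, key):
        return self.store[self.responsible_peer(key)].get(key)

dht = ToyDHT(["peer-a", "peer-b", "peer-c"])
dht.put("sip://alice@foo", "18.26.4.9")
```

After the PUT above, `dht.get("sip://alice@foo")` returns the stored address, and the pair lives on exactly one peer.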

DHTs are subject to the Sybil attack
- The attacker creates many pseudonyms.
- This disrupts routing or stabilization.
[Figure: a lookup from s toward t carrying {ID t}.]

The Sybil attack on open DHTs
[Figure: brute-force attack; clustering attack.]

Sybil Proof DHT
How to build a sybil-resilient DHT?

Works from the MIT PDOS Group (Parallel and Distributed Operating Systems)
Quest to build a Sybil-proof DHT:
- Sybil-resistant DHT routing, 2005
- A Sybil-proof one-hop DHT, SocialNets 2008
- Whanau, NSDI 2010

A Sybil-proof one-hop DHT
Motivation:
- SybilGuard/SybilLimit: not a DHT, but a "general" Sybil defense
- An honest node accepts at most O(g log n) Sybils
Features:
- DHTs are subject to the Sybil attack
- Social networks provide useful information
- Created a Sybil-resistant one-hop DHT
  - Resistant to g = o(n/log n) attack edges
  - Table sizes and routing bandwidth O(√n log n)
  - Uses O(1) messages to route

Basic one-hop DHT design
- Construct the finger table by r random walks.
- Route to t by asking all fingers about t.
  - If r = Ω(√n log n), some finger knows t with high probability (WHP).
- The adversary cannot interfere with routing.
[Figure: s, its r fingers and t; messages {know t?}, {t?}, {t's IP address}, {forwarded message from s}.]

Properties of this solution
- Finger table size: r = O(√n log n)
- Bandwidth to construct: O(r log n) bits
- Bandwidth to query: O(r) messages
- Probability of failure: 1/poly(n)

WHĀNAU: A SYBIL-PROOF DISTRIBUTED HASH TABLE Chris Lesniewski-Laas M. Frans Kaashoek NSDI 2010

Contribution
- Whānau: an efficient Sybil-proof DHT protocol
  - GET cost: O(1) messages, one RTT latency
  - Cost to build routing tables: O(√N log N) storage/bandwidth per node (for N keys)
  - Oblivious to the number of Sybils!
- Proof of correctness
- PlanetLab implementation
- Large-scale simulations vs. a powerful attack

[Figure: social network with an honest region and a Sybil region, joined by attack edges.]

Random walks c.f. SybilLimit [Yu et al 2008]

Building tables using random walks (c.f. SybilLimit [Yu et al 2008])
What have we accomplished?
- Only a small fraction (e.g. < 50%) of bad nodes in routing tables
- The bad fraction is independent of the number of Sybil nodes

[Figure: the SETUP and LOOKUP components, with the Social Network, Routing Tables, key/value, PUT(key, value) and the PUT Queue.]

Routing table structure
- O(√n) fingers and O(√n) keys stored per node
- Fingers have random IDs and cover all keys WHP
- Lookup: query the closest finger to the target key
- Finger tables: (ID, address)
- Key tables: (key, value)
[Figure: example IDs Aardvark, Kelvin, Keynes, Zyzzyva.]

From social network to routing tables
- Finger table: randomly sample O(√n) nodes
- Most samples are honest

Honest nodes pick IDs uniformly: plenty of fingers near the key.

Sybil ID clustering attack [hypothetical scenario: 50% Sybil IDs, 50% honest IDs]: many bad fingers near the key.

Honest layered IDs mimic Sybil IDs. [Figure: Layer 0, Layer 1.]

Every range is balanced in some layer. [Figure: Layer 0, Layer 1.]

Two layers is not quite enough. [Figure: Layer 0 and Layer 1; ratio = 1 honest : 10 Sybils, ratio = 10 honest : 100 Sybils.]

Log n parallel layers is enough
- log n layered IDs for each node
- Lookup steps:
  1. Pick a random layer
  2. Pick a finger to query
  3. GOTO 1 until success or timeout
[Figure: Layer 0, Layer 1, Layer 2, …, Layer L.]
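The lookup loop can be sketched as follows. This is an illustrative sketch under stated assumptions, not Whānau's code: each layer is a list of `(finger_id, address)` pairs sorted by ID, "closest finger" is taken to mean the finger whose ID precedes the key (wrapping around), and `query` is a hypothetical callback that returns a value or `None`.

```python
import random
from bisect import bisect_right

def closest_finger(fingers, key):
    """Finger whose ID is the largest one not exceeding `key`.
    Wraps to the last finger when the key precedes every ID."""
    ids = [fid for fid, _ in fingers]
    return fingers[bisect_right(ids, key) - 1]

def lookup(key, layers, query, rng=random, max_tries=20):
    """Retry across randomly chosen layers until a finger returns a value."""
    for _ in range(max_tries):
        fingers = rng.choice(layers)                # 1. pick a random layer
        _, address = closest_finger(fingers, key)   # 2. pick a finger to query
        value = query(address, key)
        if value is not None:
            return value                            # success
    return None                                     # 3. timeout after max_tries
```

The retry matters because the finger nearest the key may be a Sybil in one layer while an honest finger covers the same range in another layer.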

From Social relations to Routing Tables
[Figure: the SETUP and LOOKUP components, with the Social Network, Routing Tables, key/value, PUT(key, value) and the PUT Queue.]

Problems
- Whanau's goal is to create a Sybil-proof DHT which ensures delivery.
- Whanau uses the idea of random walks in fast-mixing graphs.
  - Whanau has changed the basic structure of the DHT.
  - Tables contain O(√n log n) entries!
  - The DHT has become a one-hop DHT.
  - But tables that grow as O(√n) are impractical: think of a DHT with 100,000,000 users.
- How to handle churn?

OUR IDEA OF A SYBIL PROOF P2P APPLICATION

Problems of present solutions
- SybilGuard and SybilLimit result in lots of false positives.
- The system should try to capture "Sybil-like" behavior.
  - Sybil-like behavior alone is not a reliable indicator, but together with the other evidence it gives a stronger signal.
- Whanau changes the basic structure of the DHT to a multi-layer, ID-unordered, one-hop one.
  - If one needs to alter the structure of a DHT, it effectively means that the structure has an inherent design flaw which makes it vulnerable.

Problems of the present solutions
- As a design methodology, open systems are often uncontrollable.
  - Systems with feedback stabilize easily. There should be a feedback mechanism via ratings, which is not present in any of the protocols.
- In Whanau, a node has to store O(√n log n) nodes in its finger table.
  - 10^8 members in a DHT is possible in the upcoming era of distributed systems, and far more members will be a common case once IPv6 becomes general. A table size growing at this rate does not scale well.
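A quick back-of-the-envelope check of the scaling concern, assuming constant factors of 1 and base-2 logarithms (both are assumptions; the asymptotic bounds do not fix them). The function names are illustrative.

```python
import math

def sqrt_log_table_entries(n):
    """Entries for a table growing as sqrt(n) * log2(n), constants ignored."""
    return math.sqrt(n) * math.log2(n)

def log_table_entries(n):
    """Entries for a table growing as log2(n), constants ignored."""
    return math.log2(n)

n = 10**8
big = sqrt_log_table_entries(n)   # about 2.66e5 entries per node
small = log_table_entries(n)      # about 27 entries per node
```

For n = 10^8 this gives roughly 266,000 entries per node for a √n log n table, against about 27 for a log n table, which is the gap the slide is pointing at.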

Our Idea
- Primarily, we do not want to change the basic structure of a DHT in order to apply security patches to it.
- For a P2P application, our idea is to divide the responsibility into three layers:
  - Network-Access Layer
  - DHT Layer
  - Application Layer

Security
Security is imposed via four mechanisms:
- Admission control at the Application layer
- Social trust (friendship), controlled at the Application layer
- Application object (files or other shared items) rating and rating-behavior recording (determined at the Application layer)
- Routing-behavior rating, query-reply rating, and rating-behavior recording (determined at the DHT layer)

Security
A new node:
- Cannot perform any DHT lookup and cannot take part in any DHT-level ratings.
- May or may not perform application object ratings, depending on the application policy.
- May not have full permissions at the application level (depending on the application policy).
- Cannot make social relationships with another "new" or "immature" node.

Explanation of the lookup process for immature nodes:
- Any friend of an "immature" node should work as a proxy for it (only for the DHT lookup process). Content objects / shared files are exchanged directly, but immature nodes cannot access the DHT.
- All lookups for an immature node are done by its mature friends. It may want to load-balance its queries among those friends.
- The content / files shared by the child are also represented by its parents / friends. Any lookup of that content should return the IP address of a friend / parent.

The friend / parent then provides the IP address of the child, after which the file transfer can work directly. Based on the behavior of the content it provides, first-class citizens (DHT members) rate the new node. If the new node gets a bad rating, its probability of becoming a DHT member decreases.

OUR IDEA OF A SYBIL PROOF DHT

Our Idea
- We are given a social graph.
- Each node knows its friends in the social graph.
- Same assumptions as SybilGuard or Whanau:
  - Fast-mixing graphs
  - Small cut around attack edges
  - At most o(n/log n) attack edges

Our Motivation for DHT
- Isn't it possible to keep the basic routing features of a DHT while making it sybil resilient?
  - O(log n) table size
  - Lookup should take O(log n)
- We should use social information to build the DHT.

Bootstrapping the DHT
Here comes the fundamental question: how to convert a given social graph into a DHT so that
- socially connected nodes are near each other in the DHT,
- socially distant nodes are far apart in the DHT, and
- Sybil nodes require a significant amount of social engineering to become strongly connected members of a social group?

A new type of DHT
- We want to build a DHT where the distance between two nodes in the DHT space is related to their social distance.
  - I.e., two friends in the social graph are expected to be one hop apart in the DHT space.
  - Most of the queries will go through friends.
  - Hence, the probability of reaching a Sybil node is lower.
- We use the idea of Plexus, a novel DHT routing scheme based on linear block codes.
  - "Plexus: A Scalable Peer-to-Peer Protocol Enabling Efficient Subset Search", Reaz Ahmed and Raouf Boutaba, ACM/IEEE TON, Feb 2009

Plexus: Index Clustering
- C = set of cluster heads; C is a linear code, and each cluster head corresponds to a codeword.
- Advertisement P: advSet(P) ⊆ C. Query Q: qSet(Q) ⊆ C.
- Generator-matrix-based routing.
[Figure: Hamming code clusters, showing a pattern, its cluster and the cluster head.]

Linear Binary Code
C = linear binary code
- n: number of bits in a codeword
- k: dimension; there are 2^k codewords in the code
- d: minimum distance between any pair of codewords
- e.g., G_24 = [24, 12, 8]
- Generator matrix G: k codewords; all 2^k codewords can be formed by applying XOR to any combination of these k codewords.
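To make the generator-matrix idea concrete, here is a small sketch that swaps in the Hamming (7, 4) code instead of the [24, 12, 8] code named on the slide, simply because it is small enough to enumerate by eye. The matrix `G` below is one standard systematic generator for that code; the function names are illustrative.

```python
from itertools import combinations

# Rows of a systematic generator matrix for the Hamming (7, 4) code,
# written as 7-bit integers: 4 identity bits followed by 3 parity bits.
G = [0b1000110, 0b0100101, 0b0010011, 0b0001111]

def codewords(generators):
    """All codewords: the XOR of every subset of the generator rows."""
    words = set()
    for r in range(len(generators) + 1):
        for combo in combinations(generators, r):
            word = 0
            for g in combo:
                word ^= g
            words.add(word)
    return words

def min_distance(words):
    """Smallest Hamming distance between any two distinct codewords."""
    return min(bin(a ^ b).count("1") for a, b in combinations(sorted(words), 2))

CODE = codewords(G)
```

With k = 4 generators the code has 2^4 = 16 codewords, and `min_distance(CODE)` evaluates to 3, so this example is an [n, k, d] = [7, 4, 3] code.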

Plexus: Routing Table
- In a complete network each peer is responsible for a codeword.
- A peer with codeword X maintains links to k+1 peers with IDs computed as:
  - X_i = X ⊕ g_i, 1 ≤ i ≤ k
  - X_{k+1} = X ⊕ g_1 ⊕ g_2 ⊕ … ⊕ g_k
- X_{k+1} is used for:
  - Replication
  - Reducing routing cost

Plexus: Routing
- Observation: C is closed under the ⊕ operation.
- Example: route from X to Y, where Y = X ⊕ g_2 ⊕ g_3 ⊕ g_5.
  - Neighbors of X: X_1 = X ⊕ g_1, X_2 = X ⊕ g_2, …, X_k = X ⊕ g_k
  - Neighbors of X_2: X_21 = X_2 ⊕ g_1, …, X_23 = X_2 ⊕ g_3, …, X_2k = X_2 ⊕ g_k
  - Neighbors of X_23: X_231 = X_23 ⊕ g_1, …, X_235 = X_23 ⊕ g_5, …, X_23k = X_23 ⊕ g_k
  - Path: X → X_2 → X_23 → X_235 = Y
[Figure: routing tree from X over g_2, g_3, g_5 through X_2, X_3, X_5, X_23, X_25, X_35.]
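The hop-by-hop XOR routing can be sketched as follows. This is a simplified illustration, not the Plexus protocol itself: it assumes a systematic generator matrix (the Hamming (7, 4) rows below), so the leading k bits of X ⊕ Y directly name the generators to apply, and it ignores the extra X_{k+1} link. All names are hypothetical.

```python
# Systematic generator rows of the Hamming (7, 4) code (illustrative choice).
G = [0b1000110, 0b0100101, 0b0010011, 0b0001111]
N_BITS = 7

def generators_needed(x, y, generators=G, n_bits=N_BITS):
    """Indices i such that g_i appears in the XOR decomposition of x ^ y.
    Valid when x and y are codewords and the generator is systematic."""
    diff = x ^ y
    return [i for i in range(len(generators))
            if (diff >> (n_bits - 1 - i)) & 1]

def route(x, y, generators=G, n_bits=N_BITS):
    """Path of codewords from x to y, applying one generator per hop."""
    path = [x]
    for i in generators_needed(x, y, generators, n_bits):
        path.append(path[-1] ^ generators[i])
    return path

# Example: y differs from x by the second and third generators.
x = 0
y = G[1] ^ G[2]
```

Here `route(x, y)` visits `x`, then `x ^ G[1]`, then `y`, mirroring the X → X_2 → X_23 → X_235 pattern with a different generator set.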

Strengths of Plexus Routing
- Hamming-distance-based clustering and indexing
- Maximum routing hops (within a subnet):
  - ½ k in normal conditions
  - ½ k + 2 in the presence of failures
- Disjoint routing paths:
  - Source X, destination Y
  - X → Y is disjoint from X → Y_{k+1}
- Alternate routing paths:
  - Suitable for multicasting
  - Improved fault resilience
  - Improved load balancing
[Figure: X, Y, Y_{k+1}, and replica Y' with Y'_{k+1}.]

Social Network to Plexus
Now the problem reduces to assigning appropriate linear block codes to the nodes. How to do that?

Naïve Idea
- Every node u knows its friends F1(u).
- Every node u sends F1(u) to all of its friends.
- At this point every node u, in addition to F1(u), can calculate its "mutual friend list" with each of its friends.
- For any two friends u, v, their mutual friend set is F1(u) ∩ F1(v).
- Every node u can also calculate F2(u), its list of friends at exactly two hops.
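These two computations can be sketched with plain sets, assuming `friends` maps each node to its F1 set (the dictionary and function names are illustrative).

```python
def mutual_friends(friends, u, v):
    """Mutual friend set of u and v: F1(u) intersected with F1(v)."""
    return friends[u] & friends[v]

def two_hop_friends(friends, u):
    """F2(u): nodes reachable via a friend that are neither u nor in F1(u)."""
    reach = set()
    for v in friends[u]:
        reach |= friends[v]   # u learns F1(v) from each friend v
    return reach - friends[u] - {u}

# Hypothetical social graph.
friends = {
    "u": {"a", "b"},
    "a": {"u", "b", "c"},
    "b": {"u", "a"},
    "c": {"a", "d"},
    "d": {"c"},
}
```

In this example `mutual_friends(friends, "u", "a")` is `{"b"}` and `two_hop_friends(friends, "u")` is `{"c"}`; node `d` is three hops away and is not included.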

Naïve Idea
- Each node u sorts its friends according to an "influence metric".
- For each friend v of a node u: Influence(u, v) = influence of v on u = I(u, v).
- It is highly probable that a sybil node will have very low influence on an honest node via an attack edge, due to the very small number of mutual friends.
  - However, not only sybils but also a common friend of two groups will have low influence on both groups (this case is not handled in any of the algorithms).

Naïve Idea
- Each node u calculates I(u, v) and I(v, u) for all its friends v. There are 2*deg(u) such quantities.
  - C(u) = those friends v on which u has more influence than v has on u
  - P(u) = those friends v which have more influence on u than u has on v
  - R(u) = those friends v for which u and v have the same influence on each other

Naïve Idea
- max{ C(u) } = x = the friend on which u is most influential.
  - However, this does not mean x has no friend more influential than u. It means u does not have a friend on which it has more influence than it has on x.
- max{ P(u) } = y = the friend which has the highest influence on u.
  - Likewise, this does not mean y has no friends on which it has more influence than it has on u.
- max{ R(u) } = z = the friend which has the same influence on u as u has on it.

Naïve Idea
- Ix = I(x, u); Iy = I(u, y); Iz = I(u, z) = I(z, u)
- MI = { Ix, Iy, Iz }
- If max{ MI } is
  - Ix: u is an "influential" node
  - Iy: u is an "influenced" node
  - Iz: u is a "neutral" node

Naïve Idea
- Action D: if u is "influenceD", it decides not to generate any ID and to take commands from y. It sends a message to y saying that it has come under y's control.
- Action L: if u is an "influentiaL" node, it decides to generate IDs for itself and for some of F1(u) and F2(u).
- Action N: if u is Neutral, it chooses Action L or Action D by a uniform Bernoulli trial.
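The role decision and the resulting action can be sketched as below. The slides do not define the influence metric itself, so the sketch takes it as given: `influence[(u, v)]` is assumed to hold I(u, v), the influence of v on u, as a non-negative number. The function names and the tie-breaking order (influential, then influenced, then neutral) are assumptions.

```python
import random

def role_of(u, friends, influence):
    """Classify u by whichever of Ix, Iy, Iz is largest."""
    best = {"influential": -1.0, "influenced": -1.0, "neutral": -1.0}
    for v in friends[u]:
        on_v = influence[(v, u)]   # I(v, u): influence of u on v
        on_u = influence[(u, v)]   # I(u, v): influence of v on u
        if on_v > on_u:            # v belongs to C(u)
            best["influential"] = max(best["influential"], on_v)
        elif on_u > on_v:          # v belongs to P(u)
            best["influenced"] = max(best["influenced"], on_u)
        else:                      # v belongs to R(u)
            best["neutral"] = max(best["neutral"], on_u)
    return max(best, key=best.get)

def choose_action(role, rng=random):
    """Map a role to Action D or Action L; neutral nodes flip a fair coin."""
    if role == "influenced":
        return "D"
    if role == "influential":
        return "L"
    return rng.choice(["L", "D"])

# Hypothetical example: u strongly influences a, and is mildly influenced by b.
friends = {"u": {"a", "b"}}
influence = {("a", "u"): 0.8, ("u", "a"): 0.2,
             ("b", "u"): 0.1, ("u", "b"): 0.5}
```

In this example Ix = 0.8 exceeds Iy = 0.5, so `role_of("u", friends, influence)` is `"influential"` and u takes Action L.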

Now u generates IDs for itself and for the members of Gang(u). It tries to keep friends' IDs as close together as possible; members of Gang(u) which are friends of each other also get IDs as close as possible. u informs all of Gang(u) of the IDs it has generated. Members of Gang(u) then take care of ID generation for their neighbors.

But how to handle collisions? Some gossip protocol is needed!

Naïve Idea
Thus IDs will be assigned in the code space according to the nodes' "social groups".
