SecureMR: A Service Integrity Assurance Framework for MapReduce Wei Wei, Juan Du, Ting Yu, Xiaohui Gu North Carolina State University, United States Annual.

1 SecureMR: A Service Integrity Assurance Framework for MapReduce
Wei Wei, Juan Du, Ting Yu, Xiaohui Gu
North Carolina State University, United States
Annual Computer Security Applications Conference 2009

2 Outline
1. Introduction
2. Background
3. System Model
4. System Design
5. Analysis
6. Evaluation
7. Conclusion and Future Work

3 1. Introduction
- About MapReduce
  - A parallel data processing model
- About communication security threats
  - Eavesdropping attacks
  - Replay attacks
  - Denial of Service (DoS) attacks
- About replication-based techniques

4 1. Introduction (cont.)
- About SecureMR
  - A decentralized replication-based integrity verification scheme
  - Ensures the integrity of MapReduce in open systems
  - A prototype of SecureMR based on Hadoop

5 2. Background
- The data processing model of MapReduce is composed of three types of entities: a distributed file system (DFS), a master, and workers.
- MapReduce can be divided into two phases:
  i) a map phase
  ii) a reduce phase
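The two phases can be illustrated with a minimal single-process sketch. It is not Hadoop code: the function names (`map_task`, `partition`, `reduce_task`, `run_job`) are hypothetical, the master is reduced to a plain loop, and the DFS is replaced by an in-memory list of text blocks.

```python
import zlib
from collections import defaultdict


def map_task(block):
    """Map phase: emit a (word, 1) pair for every word in one input block."""
    return [(word, 1) for word in block.split()]


def partition(pairs, num_reducers):
    """Group intermediate pairs by key, one partition per reducer."""
    parts = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        # A stable hash sends every occurrence of a key to the same reducer.
        index = zlib.crc32(key.encode()) % num_reducers
        parts[index][key].append(value)
    return parts


def reduce_task(grouped):
    """Reduce phase: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def run_job(blocks, num_reducers=2):
    """Stand-in for the master: run all map tasks, then all reduce tasks."""
    intermediate = [pair for block in blocks for pair in map_task(block)]
    result = {}
    for part in partition(intermediate, num_reducers):
        result.update(reduce_task(part))
    return result
```

Calling `run_job(["a b a", "b c"])` counts words across both blocks. In a real deployment each `map_task` and `reduce_task` call would run on a different worker, which is where the integrity question arises.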

6 2. Background (cont.)

7 3. System Model
- MapReduce in Open Systems
  - The entities in MapReduce come from different domains
  - Communications and data transfers among entities go through public networks
  - SecureMR focuses on protecting the service integrity of MapReduce

8 3. System Model (cont.)
- Assumptions
  1. Each worker has a public/private key pair associated with its unique worker identifier
  2. The master is trusted, but workers are not necessarily trusted
  3. A good worker is honest and always returns the correct result for its task
  4. The DFS for MapReduce provides data integrity protection
  5. If a worker is good, then others cannot tamper with its data

9 3. System Model (cont.)
- Attack Models
  - Returning a wrong result without performing the computation, or tampering with the intermediate result to corrupt the final result
  - DoS, replay attacks, eavesdropping
  - Non-collusive malicious behavior
  - Collusive malicious behavior

10 4. System Design
- Architecture Design

11 4. System Design (cont.)
- Communication Design

12 Signing and Encryption
- Public-key cryptography (from Wikipedia)

13 4. System Design (cont.)
- Commitment Protocol

14 4. System Design (cont.)
- Verification Protocol

15 4. System Design (cont.)
- SecureMR Extension
  - An additional phase called the Verify phase

16 5. Analysis
- Security Analysis
  - An inconsistency between results returned by different mappers that are assigned the same task
  - An inconsistency between the commitment and the result generated by a mapper
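The two kinds of inconsistency can be sketched as two small checks. This is an illustration under assumptions, not the protocol from the slides: a commitment is modeled here as a plain SHA-256 digest of the mapper's intermediate result, the signature that would bind it to a worker's key is omitted, and the function names are hypothetical.

```python
import hashlib


def digest(result: bytes) -> str:
    """Assumed commitment format: SHA-256 digest of the intermediate result."""
    return hashlib.sha256(result).hexdigest()


def replica_inconsistency(commitments: dict) -> bool:
    """First kind: mappers assigned the same task committed to different results.

    `commitments` maps a mapper id to the digest that mapper committed.
    """
    return len(set(commitments.values())) > 1


def commitment_inconsistency(committed: str, received: bytes) -> bool:
    """Second kind: the result a mapper hands out does not match its commitment."""
    return digest(received) != committed
```

In this sketch the first check needs only the commitments, so whoever holds them can run it without seeing the data, while the second check has to be run by a party that actually receives the intermediate result.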

17 5. Analysis (cont.)
- No False Alarm
  - For any inconsistency detected by SecureMR, it must happen between good and bad mappers, between bad mappers, or on a bad mapper
- Non-Repudiation
  - For any inconsistency that can be observed by a good reducer or the master, SecureMR can detect it and present evidence to prove it

18 5. Analysis (cont.)
- Attacker Behavior Analysis
  - Periodical Attackers
  - Strategic Attackers
- Definitions
  - D_rate: detection rate
  - l: number of jobs
  - One master, n workers, m malicious workers (m < n)
  - b: number of blocks
  - p_b: the percentage of blocks that will be duplicated in each job
  - b · p_b: the number of duplicated blocks

19 5. Analysis (cont.)
- Periodical attackers without collusion
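The slide's own formula did not survive in this transcript, so the following is only a Monte Carlo sketch of the setting, not a reconstruction of it. The assumptions are mine: every duplicated block is processed by exactly two distinct workers chosen uniformly at random, a malicious worker misbehaves on a task with a hypothetical probability `p_m`, and, because there is no collusion, any misbehavior on a duplicated block disagrees with the other replica and is detected. The symbols n, m, b, p_b, and l follow the definitions on the previous slide.

```python
import random


def detected_in_job(n, m, b, p_b, p_m, rng):
    """Return True if one job exposes at least one misbehaving mapper."""
    duplicated = round(b * p_b)  # number of duplicated blocks, b * p_b
    for _ in range(duplicated):
        # Workers 0..m-1 are malicious; two distinct workers get the block.
        for worker in rng.sample(range(n), 2):
            if worker < m and rng.random() < p_m:
                return True
    return False


def estimate_detection_rate(n, m, b, p_b, p_m, l, trials=2000, seed=0):
    """Estimate the probability that misbehavior is detected within l jobs."""
    rng = random.Random(seed)
    hits = sum(
        any(detected_in_job(n, m, b, p_b, p_m, rng) for _ in range(l))
        for _ in range(trials)
    )
    return hits / trials
```

Under these assumptions the estimate grows with l and with p_b, which matches the intuition that more jobs and more duplication give attackers fewer chances to go unnoticed.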

20 5. Analysis (cont.)
- Periodical attackers with collusion
  - P(B_i) denotes the probability that a block will be duplicated i times
  - P(D) denotes the probability that the inconsistency caused by the misbehavior of a malicious mapper will be detected

21 5. Analysis (cont.)
- Strategic attackers
  - P(F) denotes the probability that the intermediate result that reducers receive has been tampered with

22 5. Analysis (cont.)
- Naive task scheduling algorithm
- Commitment-based task scheduling algorithm
  - Launches the duplicates of a task only after the task has been committed
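The commitment-based rule can be sketched as a small scheduler that holds back a duplicate until the original task's commitment has arrived. The class and method names are hypothetical, and the sketch assumes one duplicate per selected task; it shows only the ordering rule, not how SecureMR or Hadoop implement scheduling.

```python
from collections import deque


class CommitmentBasedScheduler:
    """Queue original tasks first; release a duplicate only after a commit."""

    def __init__(self, task_ids, duplicated_ids):
        # Each queue entry is (task_id, is_duplicate).
        self.queue = deque((task_id, False) for task_id in task_ids)
        self.to_duplicate = set(duplicated_ids)
        self.committed = set()

    def next_task(self):
        """Return the next launchable task, or None if nothing is ready."""
        return self.queue.popleft() if self.queue else None

    def on_commit(self, task_id):
        """Record a mapper's commitment and only then enqueue the duplicate."""
        if task_id in self.committed:
            return  # ignore repeated commits, e.g. from the duplicate itself
        self.committed.add(task_id)
        if task_id in self.to_duplicate:
            self.queue.append((task_id, True))
```

A naive scheduler would enqueue both copies up front. Here the duplicate of a task becomes visible only once the first mapper is already bound to a result.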

23 6. Evaluation
- Experiment Setup
  - 14 hosts provided by the Virtual Computing Lab
  - Hadoop Distributed File System (HDFS) is also deployed
  - 11 hosts as workers that offer MapReduce services and one host as a master
  - HDFS uses 13 nodes, not including the master host
  - 2.66 GHz Intel(R) Core(TM) 2 Duo, Ubuntu Linux 8.04, Sun JDK 6, and Hadoop 0.19
  - Hadoop WordCount application

24 6. Evaluation (cont.)
- Performance Analysis

25 6. Evaluation (cont.)

26 6. Evaluation (cont.)

27 7. Conclusion and Future Work
- SecureMR is a practical service integrity assurance framework for MapReduce.
- It is impossible to detect any inconsistency when all duplicated tasks are processed by a collusive group.

