SecureMR: A Service Integrity Assurance Framework for MapReduce Author: Wei Wei, Juan Du, Ting Yu, Xiaohui Gu Source: Annual Computer Security Applications.

Presentation transcript:

SecureMR: A Service Integrity Assurance Framework for MapReduce Author: Wei Wei, Juan Du, Ting Yu, Xiaohui Gu Source: Annual Computer Security Applications Conference, 2009, pp Presenter: Tsuei-Hung Sun ( 孫翠鴻 ) Date: 2010/9/17

2 Outline Introduction Motivation Contribution Scheme Security analysis Performance evaluation Comment

3 Introduction MapReduce – A parallel data processing model that simplifies data processing on large clusters. – Proposed by Google. – It mainly runs on clusters belonging to a single administrative domain. – Yahoo’s Hadoop – Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (Amazon S3).

4 Introduction Fig. The MapReduce data processing reference model. M1. M2. M3. R1. R2. R3. (Distributed File System)
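As a rough illustration of the reference model in the figure (mappers M1..M3 feeding reducers R1..R3), the sketch below runs a word count in a single process. The function names, the CRC32-based partitioning and the word-count task are all assumptions chosen for illustration; the slides do not prescribe them, and a real deployment would read blocks from the DFS and run tasks on separate workers.

```python
import zlib
from collections import defaultdict


def map_task(block):
    """Map: emit (word, 1) for every word in one input block."""
    return [(word, 1) for word in block.split()]


def partition(pairs, r):
    """Split a mapper's output into r partitions, one per reducer.

    CRC32 is used (instead of Python's randomized hash()) so that the
    same key always lands in the same partition.
    """
    parts = [[] for _ in range(r)]
    for key, value in pairs:
        parts[zlib.crc32(key.encode()) % r].append((key, value))
    return parts


def reduce_task(pairs):
    """Reduce: sum the counts of each key in one partition."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(blocks, r=3):
    """Run one map phase and one reduce phase over the input blocks."""
    intermediate = [partition(map_task(block), r) for block in blocks]
    result = {}
    for i in range(r):
        # Reducer i collects partition i from every mapper.
        reducer_input = [kv for parts in intermediate for kv in parts[i]]
        result.update(reduce_task(reducer_input))
    return result
```

Because every key is routed to exactly one reducer, merging the reducers' outputs with `update` never overwrites a count.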

5 Introduction Fig. Combine multiple map and reduce phases.

6 Introduction Data processing service integrity – Replication-based techniques – Sampling techniques – Checkpoint-based verification

7 Motivation Existing work addresses service integrity, but not the integrity of data processing services. Drawbacks of replication-based techniques – Replicating all distributed computing tasks for consistency verification is inefficient. – Centralized consistency verification over massive result data is not scalable.

8 Contribution Decentralized replication-based integrity verification for MapReduce in open systems. Achieves security properties: non-repudiation, resilience to DoS attacks and replay attacks. Security components can be easily integrated into existing MapReduce implementations. Low performance overhead. The first attempt to address data processing service integrity.

9 Scheme Three types of entities: a DFS, a master and workers. Assumptions 1. Each worker has a public/private key pair associated with a unique worker identifier. 2. The master is trusted and its public key is known to all; workers are not necessarily trusted. 3. A good worker is honest and always returns the correct result for its task, while a bad worker may behave arbitrarily. 4. The DFS provides data integrity protection, so each node can verify the integrity of data. 5. If a worker is good, then others cannot tamper with its data.

10 Scheme SecureMR - Architecture Design

11 Scheme SecureMR - Communication Design Commitment protocol Verification protocol

12 Scheme Commitment Protocol ID Map: a monotonically increasing identity of a map task. Data Loc: input data block location. sig: Master’s signature. K pubM: Mapper’s public key. sigM: Mapper’s signature. H P1,…,H Pr: hash values of the partitions of the mapper’s intermediate result. Figure labels: Scheduler, Task Executor, Commit Manager.
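The commitment idea on this slide (a mapper hashes each partition of its intermediate result and signs the hashes) can be sketched as follows. This is a minimal sketch under loud assumptions: the message layout, field names and function names are hypothetical, and HMAC-SHA256 with a shared key stands in for the mapper's public-key signature sigM only because the Python standard library has no asymmetric signatures. HMAC does not give non-repudiation, so a real implementation would need a proper signature scheme.

```python
import hashlib
import hmac
import json


def hash_partitions(partitions):
    """Compute H_P1..H_Pr: one SHA-256 digest per partition (bytes)."""
    return [hashlib.sha256(p).hexdigest() for p in partitions]


def sign(key, message):
    """Stand-in for a signature: HMAC-SHA256 (assumption, see lead-in)."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def make_commitment(mapper_key, map_id, partitions):
    """Build the mapper's commitment over its partition hashes."""
    body = {"id_map": map_id, "hashes": hash_partitions(partitions)}
    encoded = json.dumps(body, sort_keys=True).encode()
    return {"body": body, "sig_m": sign(mapper_key, encoded)}


def check_commitment(mapper_key, commitment):
    """Recompute the tag over the body and compare in constant time."""
    encoded = json.dumps(commitment["body"], sort_keys=True).encode()
    return hmac.compare_digest(sign(mapper_key, encoded), commitment["sig_m"])
```

The point of committing to hashes rather than to the data itself is that the master only has to store and compare short digests, not the intermediate results.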

13 Scheme Verification Protocol Pi: partition of the intermediate results that the reducer will process. AD M: Mapper’s address. H Pi: hash of partition Pi committed by the Committer. Req Seq: sequence number. Figure labels: Task Executor, Manager, Scheduler, Verifier, Committer, sigR.
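The reducer-side check implied by this slide can be sketched as below: the reducer hashes the partition it received from the mapper and compares the digest with the committed H Pi. The second helper assumes that Req Seq exists to reject replayed requests; the slides only call it a sequence number, so that role, along with all names here, is an assumption for illustration.

```python
import hashlib


def verify_partition(data, committed_hash):
    """Return True if the received partition matches the committed hash."""
    return hashlib.sha256(data).hexdigest() == committed_hash


class SequenceChecker:
    """Accept only request sequence numbers that strictly increase.

    Assumed replay defense: a repeated or older Req Seq is rejected.
    """

    def __init__(self):
        self.last_seen = -1

    def accept(self, req_seq):
        if req_seq <= self.last_seen:
            return False
        self.last_seen = req_seq
        return True
```

A reducer that finds a mismatch holds both the mapper's commitment and the inconsistent data, which is what makes the misbehavior attributable.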

14 Scheme Extension for Reducers and MapReduce Chain Map Phase Map Phase Reduce Phase Reduce Phase Verify Phase Add Verifier component Add Committer component

15 Security analysis Collusive Attack - Attacker behavior analysis – Periodical Attacker Naive attacker Without collusion attacker With collusion attacker – Strategic Attacker

16 Security analysis Fig. Detection Rate for Non-Collusion Naive Attacker (b = 20; Pm = 1). Fig. Detection Rate for Non-Collusion Periodical Attacker (b = 20; Pm = 0.5). b: number of blocks in one job’s input. Pm: misbehaving probability. l: number of jobs a mapper has performed when its misbehavior is detected.
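To give a feel for why the detection rate climbs with the number of tasks, here is a toy calculation. It is not the paper's analysis function (those formulas did not survive in this transcript). It assumes a deliberately simplified model: each task is independently duplicated on a good worker with probability `dup_rate`, a non-colluding attacker misbehaves on each task with probability `pm`, and any misbehavior on a duplicated task is detected. Both parameter names and the model are assumptions.

```python
def detection_probability(dup_rate, pm, num_tasks):
    """Probability of at least one detection within num_tasks tasks.

    Under the simplified model, a single task exposes the attacker with
    probability dup_rate * pm, independently of the other tasks.
    """
    per_task = dup_rate * pm
    return 1.0 - (1.0 - per_task) ** num_tasks
```

Under this model an attacker who misbehaves less often (smaller `pm`) survives longer, but the detection probability still approaches 1 as the number of tasks grows whenever `dup_rate * pm` is positive.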

17 Security analysis Fig. Detection Rate for Collusion Periodical Attacker (n = 50; Pm = 0.5; b = 20; l = 15). Fig. Misbehaving Probability vs. Duplication Rate (n = 50; b = 20; l = 15). n: total number of workers. m: number of malicious workers.

18 Performance evaluation T: time D: data transmission cost. r: number of reducers.

19 Performance evaluation Fig. Response Time vs. Number of Reduce Tasks (number of map tasks = 60; data size = 1 GB). Fig. Response Time vs. Data Size (number of map tasks = 60; number of reduce tasks = 25).

20 Performance evaluation Fig. Response Time vs. Duplication Rate. Fig. Response Time vs. Number of Reduce Tasks. Number of map tasks = 60; data size = 1 GB.

21 Comment Assign and Notify can be combined into one step. Ticket M contains some parameters that duplicate the part signed by the reducer in the request message. If the first request fails, what should the reducer do? (How are Ticket M and Req Seq renewed?) In the response message, the mapper could sign the data together with the rest of the message, which would avoid one hash, and the reducer would not need to check it.

22 References MapReduce: Simplified Data Processing on Large Clusters Computer cluster Monotonic function

23 Analysis Function Naive attacker and without collusion attacker With collusion attacker

24 Analysis Function Strategic attacker

25 Appendix MapReduce: Simplified Data Processing on Large Clusters Computer cluster: A computer cluster is a group of linked computers, working together closely, connected to each other through fast local area networks. Clusters are usually deployed to improve performance and/or availability over that of a single computer, while typically being much more cost-effective than single computers of comparable speed or availability. Monotonic function: a function that preserves the given order.

26 MapReduce in open systems – Entities in MapReduce come from different domains, which are not always trusted. – Communications and data transfers among entities go through public networks and can be eavesdropped. Most research focuses on using MapReduce to solve problems in specific applications; little work pays attention to service integrity protection.

27 Advantage vs. weakness Advantage – Inconsistencies for mappers: No False Alarm, Non-Repudiation – Ensures MapReduce data processing service integrity through scalable, decentralized replication-based verification.

28 1) Enable mappers to examine the integrity of data blocks from the DFS. 2) Enable reducers to verify the authenticity and correctness of the intermediate results generated by mappers. 3) Enable users to check whether the final result produced by reducers is authentic and correct. 4) The combination of the three ensures MapReduce data processing service integrity for users.

29 First step: ensures the integrity of inputs for MapReduce in open systems. Second step: provides reducers with the integrity assurance for their inputs. Third step: guarantees the authenticity and correctness of the final result for users.

30 DoS attacks – Sending requests to a good worker and asking for intermediate results. – Impersonating the master to send fake task assignments. Replay attack – Sending old task assignments to workers to keep them busy. Eavesdropping attack. Message tampering.

31 Non-collusive malicious behavior – Workers misbehave independently and do not necessarily agree or consult with each other. – Ex. When returning wrong results for the same input, they may return different wrong results. (detected) Collusive malicious behavior – Depends on other collusive workers, which communicate, exchange information, and make an agreement with each other. – Ex. When assigned tasks by the master, they know whether their colluders have received tasks with the same input blocks. (not detected)
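The difference between the two examples can be shown with a bare consistency check over replicated results. This is only a sketch: the function name and the use of SHA-256 digests are assumptions, and it models just the comparison step, not the full scheme.

```python
import hashlib


def replicas_consistent(results):
    """Return True if all replicas of a task produced identical output."""
    digests = {hashlib.sha256(r).hexdigest() for r in results}
    return len(digests) == 1


# Non-collusive attackers return different wrong results: inconsistent.
non_collusive = [b"wrong-A", b"wrong-B"]

# Colluders agree on the same wrong result: the check cannot tell.
collusive = [b"wrong-A", b"wrong-A"]
```

Here `replicas_consistent(non_collusive)` is False, so the misbehavior is caught, while `replicas_consistent(collusive)` is True even though both results are wrong. A replication check alone therefore catches colluders only when a replica happens to land on a good worker.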