Dark Clouds on the Horizon: Using Cloud Storage as Attack Vector and Online Slack Space. Paper by Martin Mulazzani, Sebastian Schrittwieser, Manuel Leithner, Markus Huber, and Edgar Weippl, SBA Research (Secure Business Austria Research). Presented at the 20th USENIX Security Symposium, August 8 – 11, 2011, San Francisco, CA. Presented by Chayutra Pailom, CS Department, UTSA
What is this topic about? Motivation: demonstrate the feasibility of attacks on cloud storage. Key issues: specific attacks that work under certain conditions, and the exploitation of "online slack space". Solutions: countermeasures on the client. Comparisons: will be mentioned in the background. Possible improvement: be aware of the risks of online cloud storage and protect important data with additional security measures or cryptography. Number of references: 32.
Terminology Cloud Technology No accurate definition, mostly for business Two major areas: computing & storage Picture: http://en.wikipedia.org/wiki/Cloud_computing
Terminology Cloud Storage A model of networked online storage where data is stored on multiple virtual servers, generally hosted by third parties, rather than being hosted on dedicated servers. Credit: http://www.simplywit.net/2010/05/dropbox
Terminology Slack Space Portions of a hard drive that are not fully used by the currently allocated file and which may contain data from a previously deleted file. Storing small files on a filesystem with large clusters will therefore waste disk space; such wasted disk space is called slack space. Credit: http://backup.datavelocity.com/s_slack_space.htm
Terminology Online Slack Space Based on cloud storage (≡ online): space that can be dynamically added. Data rarely fills fixed storage locations exactly. Residual data occurs when a smaller file is written into the same cluster as a previous larger file. Slack space is examined because it may contain meaningful data. In this paper, torrent files are used to evaluate online slack space.
Terminology Attack Vector A path or means by which a hacker (or cracker) can gain access to a computer or network server in order to deliver a payload or malicious outcome. A specific computer-system vulnerability, along with the path and method used to exploit it. It is a particular way to gain access to a computer in order to install malware, gain external control, or extract user data. “Exploitation”
Dark Clouds on the Horizon: Using Cloud Storage as Attack Vector and Online Slack Space
Introduction Hosting files on the Internet. Cloud storage as a shared resource. Dropbox is one of the biggest online storage services. Shared usage of users' data raises a new challenge: ensuring that only authorized access to the pool of data is possible. This paper proposes measures to prevent unauthorized data access and information leakage; the measures can also be used for other online storage services.
Introduction Paper contribution: Document the functionality of an advanced cloud storage service. Show under what circumstances unauthorized access to files stored within Dropbox is possible. Assess whether Dropbox is used to store copyright-protected material. Define online slack space and the unique problems it creates for the process of a forensic examination. Explain countermeasures to mitigate the resulting risks from attacks for user data.
Introduction The paper is organized as follows: Section 2 – Related work. Section 3 – Attack on files stored in Dropbox. Section 4 – How Dropbox can be exploited. Section 5 – Evaluation of the feasibility of the attacks. Section 6 – Proposed techniques to reduce the attacks.
Background “Cloud Computing refers to both the applications delivered as services over the Internet and the hardware and systems software in the data centers that provide those services.” ... “The datacenter hardware and software is what we will call a Cloud.”
Background Dropbox: One of the most popular cloud storage providers. Client (Dropbox program) – Server (online storage). The client keeps the data in a specified directory in sync with the servers; changes are synchronized automatically. Credit: http://www.simplywit.net/2010/05/dropbox
Background Data Deduplication. At the server: the same file is stored only once. Benefit: saves storage space at the server. At the client: calculate a hash sum or other digest before uploading. Benefit: reduces communication between client and server. Beneficial for everyone, right? However, there can be cases in which two clients present the same hash.
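The server-side half of deduplication can be sketched as a content-addressed store. This is a minimal illustration, not Dropbox's implementation: the class `DedupStore`, its fields, and the user names are hypothetical, and SHA-256 is used only because the slides name it as the deduplication hash.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self):
        self.blobs = {}   # SHA-256 hex digest -> stored bytes
        self.owners = {}  # SHA-256 hex digest -> set of user names

    def put(self, user, data):
        """Store data for user; return (digest, whether new bytes were stored)."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data  # first copy: actually store the bytes
        self.owners.setdefault(digest, set()).add(user)
        return digest, is_new


store = DedupStore()
d1, new1 = store.put("alice", b"same bytes")
d2, new2 = store.put("bob", b"same bytes")  # second copy is only linked
```

Both users end up referencing one stored blob, which is the storage saving the slide describes.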
Background Dropbox (cont.): Every file is split into chunks of up to 4 MB in size. When a file is added to the directory, the client application calculates the hash values, which are then sent to the server and compared. SHA-256 is used for data deduplication; on the server side, data is encrypted with AES-256. The client-server connection uses SSL. 25 million users. Stores more than 100 billion files. 1 million files added every 5 minutes.
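The chunking and hashing step can be sketched as follows. The 4 MB chunk size and SHA-256 come from the slide; the function name `chunk_hashes` and the exact chunk boundaries are assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB, as stated on the slide


def chunk_hashes(data, chunk_size=CHUNK_SIZE):
    """Split data into chunks of at most chunk_size bytes and hash each one."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


# Small chunk size so the example stays tiny: 10 bytes -> chunks of 4, 4, 2.
hashes = chunk_hashes(b"0123456789", chunk_size=4)
```

The client would send such a list of digests to the server, which compares them against the chunks it already stores.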
Background Dropbox (cont.): After a chunk is uploaded, Dropbox compares the hash values on the server to validate the transmission. If the hash values do not match, the upload of the chunk is repeated. Drawback: the server trusts the client software. The server can only calculate the hash values of chunks that are actually uploaded; it is unable to validate hash values that the client provides for files already stored on Dropbox. Dropbox can be operated on various operating systems.
User has to choose threat model: Danger of honest, but curious operator? Unauthorized file access by third parties? Location of data?
Unauthorized File Access Hash Value Manipulation Attack (Attack #1). The client computes hashes locally every time a new file is added; this local computation can be manipulated so that the reported hash is set arbitrarily. The hash value of the target file needs to be known. The result is unauthorized file access that is undetectable for the victim or Dropbox. Disclaimer: the attack is valid against all systems with client-side data deduplication.
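Why a trusted client-side hash leads to unauthorized access can be shown with a toy in-memory model. Everything here is hypothetical (the class `TrustingServer`, its methods, the account names); it models only the logic described on the slide, namely a server that links a chunk to an account because the client claims to hold a file with that hash.

```python
import hashlib


class TrustingServer:
    """Toy server with client-side deduplication and no proof of possession."""

    def __init__(self):
        self.chunks = {}  # digest -> data
        self.links = {}   # account -> set of digests

    def upload(self, account, data):
        digest = hashlib.sha256(data).hexdigest()
        self.chunks[digest] = data
        self.links.setdefault(account, set()).add(digest)
        return digest

    def announce(self, account, claimed_digest):
        """Client reports a hash; a known hash is linked without any proof."""
        if claimed_digest in self.chunks:
            self.links.setdefault(account, set()).add(claimed_digest)
            return True   # "already stored, no upload needed"
        return False      # unknown hash: the client would have to upload

    def download(self, account, digest):
        if digest in self.links.get(account, set()):
            return self.chunks[digest]
        raise PermissionError("chunk not linked to this account")


server = TrustingServer()
victim_digest = server.upload("victim", b"victim's private file")

# The attacker knows only the hash, never the file content.
linked = server.announce("attacker", victim_digest)
leaked = server.download("attacker", victim_digest)
```

From the server's point of view the attacker's request looks like an ordinary deduplicated upload, which is why the slide calls the attack undetectable.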
Unauthorized File Access Stolen Host ID Attack (Attack #2). The host ID is created during the setup phase of the client. The client software does not store the username and password; instead, the server authorizes the client by the 128-bit host ID key that the client provides. The algorithm used to generate the host ID is not publicly known. The host ID can be stolen, for example extracted by malware or obtained through social engineering. With it, an attacker can download all of the victim's files, including private ones. This attack can be detected and prevented by Dropbox.
Unauthorized File Access Direct Download Attack (Attack #3). The protocol used is HTTPS. To download, the client requests a file chunk from https://dl-clientXX.dropbox.com/retrieve, sending the chunk's SHA-256 hash value and a valid host ID as HTTPS POST data. The server does not check whether the chunk is linked to the submitted host ID, so an HTTPS request with any valid host ID and the hash value of the chunk is enough (but not that easy in practice). Dropbox can easily detect this unauthorized access, since the requesting host ID was not used to upload the chunk.
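The missing ownership check, and the reason the attack is nevertheless detectable, can be modelled in a few lines. This is a toy sketch with hypothetical names (`RetrieveEndpoint`, `suspicious`, the host IDs); it does not reproduce the real request format, only the decision logic the slide describes.

```python
import hashlib


class RetrieveEndpoint:
    """Toy model of a retrieve call that accepts any valid host ID."""

    def __init__(self, chunks, links):
        self.chunks = chunks   # digest -> data
        self.links = links     # host_id -> set of digests linked to it
        self.suspicious = []   # (host_id, digest) pairs worth auditing

    def retrieve(self, host_id, digest):
        if host_id not in self.links:
            raise PermissionError("unknown host ID")
        if digest not in self.links[host_id]:
            # Flaw: the chunk is served anyway. The mismatch is visible to
            # the server, which is what makes the attack detectable.
            self.suspicious.append((host_id, digest))
        return self.chunks[digest]


secret = b"chunk that belongs to the victim"
digest = hashlib.sha256(secret).hexdigest()
endpoint = RetrieveEndpoint(
    chunks={digest: secret},
    links={"victim-host": {digest}, "attacker-host": set()},
)
stolen = endpoint.retrieve("attacker-host", digest)
```

Turning the logged mismatch into a refusal would close the hole; logging alone only detects it.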
Unauthorized File Access Hiding data in the cloud (upload counterpart of Attack #3). Same as retrieval, but for storing chunks: uploading without linking. A simple HTTPS request to https://dl-clientXX.dropbox.com/store is enough. No storage quota applies, so space is unlimited. If a host ID is known, data can be pushed into other people's Dropbox. Can be detected and prevented by Dropbox. This is the “Online Slack Space”.
Attack Vector & Online Slack Space Online Slack Space. Uploading files is similar to downloading files: the Dropbox client calls https://dl-clientXX.dropbox.com/store with the hash value and the host ID as HTTPS POST data. After the upload finishes, the Dropbox software links the uploaded files to the host ID with another HTTPS request. If the linking step is omitted, a modified client can upload files without limitation, so Dropbox can be used to store data without decreasing the amount of storage available to the account. This is the “Online Slack Space”.
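The two-step store-then-link flow explains where the unlimited space comes from: if quota is charged only at the linking step, chunks that are stored but never linked cost nothing. The sketch below is a toy model under that assumption; `QuotaServer` and its methods are hypothetical names, not the real protocol.

```python
import hashlib


class QuotaServer:
    """Toy model: storing and linking are separate calls; quota counts links."""

    def __init__(self, quota_bytes):
        self.quota_bytes = quota_bytes
        self.chunks = {}  # digest -> data (stored regardless of linking)
        self.links = {}   # host_id -> set of digests

    def store(self, data):
        digest = hashlib.sha256(data).hexdigest()
        self.chunks[digest] = data
        return digest

    def used(self, host_id):
        return sum(len(self.chunks[d]) for d in self.links.get(host_id, ()))

    def link(self, host_id, digest):
        if self.used(host_id) + len(self.chunks[digest]) > self.quota_bytes:
            raise ValueError("quota exceeded")
        self.links.setdefault(host_id, set()).add(digest)


server = QuotaServer(quota_bytes=10)
# Modified client: store three 8-byte chunks and skip the linking request.
digests = [server.store(bytes([i]) * 8) for i in range(3)]
hidden_bytes = sum(len(server.chunks[d]) for d in digests)
```

Here 24 bytes sit on the server while the account's usage stays at zero; a regular client that links each chunk would hit the 10-byte quota on the second one.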
Attack Vector & Online Slack Space Attack Vector. If the host ID is known, an attacker can upload and link arbitrary files to the victim's Dropbox account: instead of linking the file to his own account with the second HTTPS request, he can use an arbitrary host ID to link the file. An attacker could use any 0-day weakness in the file preview of supported operating systems to execute code on the victim's computer, by pushing a manipulated file into the victim's Dropbox folder and waiting for the user to open that directory. A large-scale infection using Dropbox is, however, very unlikely, and an attacker who is able to retrieve the host ID already owns the system.
Unauthorized File Access Attack Detection. Once the attacker has the content of the client database, no further access to the victim's system is needed. Specific files can be accessed by obtaining only their hash values. The Stolen Host ID and Direct Download attacks are detectable to some extent. The Hash Manipulation Attack remains undetectable.
Evaluation The authors measured the time until (hidden) chunks get deleted, using random data in multiple files. Hidden upload: at least 4 weeks. Regular upload: unlimited undelete possible (> 6 months). The authors used the HTTPS attack: stealthiness was not an issue, and hash manipulation would have been equally suitable.
Evaluation The authors tested whether Dropbox is used to store files from filesharing networks (BitTorrent), as well as how long data is kept as online slack space. BitTorrent relies heavily on hashing for file identification, but it uses SHA-1 hashes to identify files and their chunks, whereas Dropbox uses SHA-256, so the .torrent file alone is not enough. Because of copyright concerns, the authors tested only files from the torrents that are not protected by copyright and avoided uploading data.
Evaluation The authors downloaded only the first 4 MB of each file: the first chunk is sufficient to determine whether a file is stored on Dropbox and to test the hash manipulation attack (HMA). From each torrent the authors used “identifying files”, e.g. screenshots. Of 100 torrent archives in total, 98 contained identifying files. The authors removed the other 2 and added another 9 that contained identifying files, for a total of 107.
Evaluation For every .torrent file and identifying file, the authors generated the SHA-256 hash value and checked whether the file was stored on Dropbox. In total there were 368 hashes (some torrents have multiple files). If a file was bigger than 4 MB, only the hash of the first chunk was generated.
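Hashing only the first chunk of a large file can be sketched as below. The 4 MB limit and SHA-256 are taken from the slides; the function name `first_chunk_sha256` and the use of a binary stream are illustrative choices, not the authors' tooling.

```python
import hashlib
import io

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB chunk size stated in the slides


def first_chunk_sha256(stream, chunk_size=CHUNK_SIZE):
    """Hash at most the first chunk_size bytes read from a binary stream."""
    return hashlib.sha256(stream.read(chunk_size)).hexdigest()


# A file smaller than one chunk is hashed whole; a larger one only up to
# the chunk boundary (a tiny chunk size keeps the example small).
small = first_chunk_sha256(io.BytesIO(b"short identifying file"))
large = first_chunk_sha256(io.BytesIO(b"abcdefgh"), chunk_size=4)
```

With a real file, the stream would come from `open(path, "rb")`.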
Evaluation Popular files on Dropbox: thepiratebay.org Top 100 torrent files. Only copyright-free content was downloaded (.sfv, .nfo, ...). 97% were retrievable (details on the next slide). Approx. 475k seeders. Interpretation: at least one of the seeders uses Dropbox.
Evaluation Of the 368 hashes, 356 (about 97%) were retrievable; the 12 hashes unknown to Dropbox were linked to 8 .torrent files. This means that for every .torrent, either the .torrent file, the content, or both are easily retrievable from Dropbox once the hashes are known. The hit rate describes how many of the hashes were retrievable from Dropbox.
Evaluation The authors analyzed the age of the .torrent files to see how quickly Dropbox users download the .torrents and the corresponding content and upload everything to Dropbox. 20% of the torrents were less than 24 hours old.
Evaluation Online Slack Space Evaluation. Test whether Dropbox can be used to hide files by uploading them without linking them to a user account. The authors generated a first set of 30 files for the online slack space test, then uploaded a second set of 55 files and deleted them afterward for the undelete test. Using multiple files with various sizes and random content minimizes the chance of an unintended hash collision.
Evaluation Long-term undelete: A free account can undo file modifications and undelete files for 30 days. The authors uploaded 55 files (30 shared + 25 private) on 7 Oct 2010. At the end of April 2011, 100% were still available, so the files remained retrievable for more than 6 months. Online slack space: The authors uploaded 30 files of various sizes over HTTPS without linking them to any account at the beginning of Jan 2011. After more than 4 weeks, all files were still retrievable. When Dropbox fixed the HTTPS download attack in late Apr 2011, 50% of the files were retrievable.
Discussion Dropbox is storing files from filesharing networks. Some of the 107 tested .torrents contained more than 2 GB of data, the biggest 7.2 GB (the .torrent file indicates the size of the content). Storing more than 2 GB on Dropbox requires a Pro account. It is very easy to hide data on Dropbox with low accountability: a malicious user can upload files without linking them to his account, resulting in unlimited storage. In a more advanced setup, the user boots from a live CD and saves all files in online slack space, leaving no trace or local evidence on the computer for a future forensic examination.
Security Recommendations Applied to all cloud storage providers Basic security primitives A strong protocol for provable data possession is needed based on cryptography or probabilistic proofs or both (using algorithm ) Upload every file, no client-side data deduplication Every service should use SSL for all communication and data transfer.
Security Recommendations Secure Dropbox: Use secure data possession protocol Use simple challenge-response mechanism below:
Security Recommendations Challenge-response mechanism: Challenge the client: Client and Server are in possession of the same file Client has to answer challenges Precomputable by the server Possible challenges: Hash a subset of data Append & XOR random bits and bytes Possibly multiple rounds Drawbacks: Challenges can be forwarded Not a real proof! But makes hash manipulation attacks detectable
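One possible shape of such a challenge is sketched below: the server picks a random nonce and a random byte range, and the client must return the hash of the nonce followed by that subset of the file. This is an assumption-laden illustration of the "hash a subset of data" idea, not the protocol proposed in the paper; the function names, the 16-byte nonce, and the 64-byte span are all hypothetical.

```python
import hashlib
import hmac
import secrets


def make_challenge(file_size, span=64):
    """Server side: pick a fresh nonce and a random byte range of the file."""
    nonce = secrets.token_bytes(16)
    offset = secrets.randbelow(max(1, file_size - span + 1))
    return nonce, offset, span


def answer(data, challenge):
    """Client side: hash the nonce together with the requested subset."""
    nonce, offset, span = challenge
    return hashlib.sha256(nonce + data[offset:offset + span]).hexdigest()


def verify(server_copy, challenge, response):
    """Server side: recompute the answer from its own copy and compare."""
    return hmac.compare_digest(answer(server_copy, challenge), response)


original = bytes(range(256)) * 4   # file held by server and honest client
guess = b"\x00" * len(original)    # attacker who knows only the file's hash

challenge = make_challenge(len(original))
honest_ok = verify(original, challenge, answer(original, challenge))
attacker_ok = verify(original, challenge, answer(guess, challenge))
```

Knowing only the SHA-256 hash of the whole file does not help to answer, which is what makes hash manipulation detectable. As the slide notes, this is not a real proof of possession: an attacker can forward the challenge to someone who does hold the file.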
Security Recommendations Secure Dropbox (cont.) Uploading chunks without linking them to a user's Dropbox should not be allowed: Dropbox should not allow clients to have unlimited storage capacity, and this makes online slack space on Dropbox infeasible. To prevent misuse of historic data and online slack space, all chunks that are not linked to a file that is retrievable by a client should be deleted. Dropbox should keep track of which files are in which Dropboxes (enforcement of data ownership).
Security Recommendations Secure Dropbox (cont.) Check for host ID activity Dynamic host ID
Conclusion Dropbox is heavily used to store data from filesharing networks. Online slack space can be used to hide files in any cloud storage. The majority of the evaluation analyzes the existence of the attack vector using torrent files. The hash manipulation attack is undetectable and applicable to all services using client-side data deduplication. These vulnerabilities are not specific to Dropbox, as the underlying communication protocol is straightforward. A data possession proof on the client side should be included by all cloud storage operators. As of April 2011, Dropbox has fixed the HTTPS up-/download attack, and the host ID is now encrypted on disk.
Sample References At Dropbox, Over 100 Billion Files Served–And Counting, retrieved May 23rd, 2011. Online at http://gigaom.com/2011/05/23/at-dropbox-over-100-billion-files-served-and-counting/. ATENIESE, G., BURNS, R., CURTMOLA, R., HERRING, J., KISSNER, L., PETERSON, Z., AND SONG, D. Provable data possession at untrusted stores. In Proceedings of the 14th ACM conference on Computer and communications security (2007), CCS ’07, ACM, pp. 598–609. WANG, Q., WANG, C., LI, J., REN, K., AND LOU, W. Enabling public verifiability and data dynamics for storage security in cloud computing.