Published by Asher Grimstead
© 2010 IBM Corporation 1 Big Data Flows vs. Wicked Leaks
Jeff Jonas, IBM Distinguished Engineer
Chief Scientist, IBM Entity Analytics
December 1, 2010
© 2010 IBM Corporation 2 Background
Early 80’s: Founded Systems Research & Development (SRD), a custom software consultancy
1989 – 2003: Built numerous systems for Las Vegas casinos, including a technology known as Non-Obvious Relationship Awareness (NORA)
2005: IBM acquires SRD; now chief scientist of IBM Entity Analytics
Personally designed and deployed +/- 100 systems, a number of which contained multi-billions of transactions describing 100’s of millions of entities
Today: My focus is in the area of ‘sensemaking on streams’, with special attention towards privacy and civil liberties protections
– Markle Foundation, Member, Task Force on National Security
– EPIC, Member, Advisory Board
© 2010 IBM Corporation 3 Data Volumes Exploding “Every two days now we create as much information as we did from the dawn of civilization up until 2003.” -Eric Schmidt, CEO Google
© 2010 IBM Corporation 4 Big Data Flows: How Many Copies?
Blog Post: How Many Copies of Your Data? Is Somewhat Like Asking: How Many Licks to the Center of the Tootsie Pop?
Often, at minimum, 144 copies
– Backups
– Internal transfers
– Other operational systems
– Operational data stores
– Data warehouses
– Data marts
– Testing systems
– Training systems
– Their backups
– External transfers (information sharing partners)
– And then their entire ecosystem (from warehouses to backups)
– And their information sharing partners
Sometimes 10,000’s of copies
© 2010 IBM Corporation 5 What’s driving information sharing and big data?
© 2010 IBM Corporation 6 Organizations Are Getting Dumber
[Chart. Labels: Time; Computing Power Growth; Sensemaking Algorithms; Available Observation Space; Context; Enterprise Amnesia]
© 2010 IBM Corporation 7 No Context
© 2010 IBM Corporation 8 Information in Context … and Accumulating
[Figure. Labels: Top 200 Customer; Job Applicant; Identity Thief; Termination “No-Rehire”]
© 2010 IBM Corporation 9 Demonstration
© 2010 IBM Corporation 10 Is This Voter Deceased?
VOTER: George F Balston; YOB: 1951; D/L: SW Karen Blvd Apt 7 Beaverton, OR; Last voted: 2008
DECEASED PERSON: George Balston; YOB: 1951; SSN: 5598; DOD: 1995
Under voter matching best practices, a match on name and year of birth alone is insufficient proof that two records describe the same person. Many different people in the U.S. share a name and year of birth, so human review is required. Unfortunately, there are thousands and thousands of cases just like this, and state election offices don’t have the staff (or budget) to review such volumes manually.
© 2010 IBM Corporation 11 Now Consider This Tertiary DMV Record
VOTER: George F Balston; YOB: 1951; D/L: SW Karen Blvd Apt 7 Beaverton, OR; Last voted: 2008
DECEASED PERSON: George Balston; YOB: 1951; SSN: 5598; DOD: 1995
DMV: George F Balston; YOB: 1951; SSN: 5598; D/L: SW Clementine Blvd Apt 210 Beaverton, OR
The DMV record contains enough features to match either the voter record (name, year of birth and driver’s license) or the deceased person record (name, year of birth and SSN). For the sake of argument, let’s say it matches the voter best.
© 2010 IBM Corporation 12 Is This Voter/DMV Person Deceased?
VOTER: George F Balston; YOB: 1951; D/L: SW Karen Blvd Apt 7 Beaverton, OR; Last voted: 2008
DMV: George F Balston; YOB: 1951; SSN: 5598; D/L: SW Clementine Blvd Apt 210 Beaverton, OR
DECEASED PERSON: George Balston; YOB: 1951; SSN: 5598; DOD: 1995
The combined voter/DMV record now shares a name, year of birth and SSN with the deceased person record. Under voter matching best practices, this evidence is sufficient to determine that this voter is in fact deceased. This case no longer needs human review.
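The three slides above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the matching logic of any IBM product. The rule "name + year of birth + at least one shared identifier" is an assumed simplification of the best practice described, the name normalization is deliberately crude, and the D/L value is a placeholder because the transcript does not show the number.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Features accumulated from every record resolved to one person."""
    names: set = field(default_factory=set)
    yobs: set = field(default_factory=set)
    ids: set = field(default_factory=set)  # (kind, value) pairs, e.g. ("SSN", "5598")


def normalize_name(name):
    # Crude normalization: keep first and last token, lowercased,
    # so "George F Balston" and "George Balston" compare equal.
    parts = name.lower().split()
    return (parts[0], parts[-1])


def make_entity(name, yob, **identifiers):
    return Entity({normalize_name(name)}, {yob}, set(identifiers.items()))


def sufficient_match(a, b):
    # Assumed rule: name AND year of birth AND at least one shared identifier.
    return bool(a.names & b.names) and bool(a.yobs & b.yobs) and bool(a.ids & b.ids)


def merge(a, b):
    # Resolving two records into one entity keeps the union of their features.
    return Entity(a.names | b.names, a.yobs | b.yobs, a.ids | b.ids)


# "DL-PLACEHOLDER" is hypothetical; the real value is not in the source.
voter = make_entity("George F Balston", 1951, DL="DL-PLACEHOLDER")
deceased = make_entity("George Balston", 1951, SSN="5598")
dmv = make_entity("George F Balston", 1951, SSN="5598", DL="DL-PLACEHOLDER")

print(sufficient_match(voter, deceased))   # False: name and YOB only, human review
print(sufficient_match(voter, dmv))        # True: name, YOB and D/L
voter_dmv = merge(voter, dmv)
print(sufficient_match(voter_dmv, deceased))  # True: name, YOB and SSN
```

The voter and deceased records never match directly. They connect only after the DMV record contributes the SSN to the voter entity.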
© 2010 IBM Corporation 13 Context Accumulates
VOTER: George F Balston; YOB: 1951; D/L: SW Karen Blvd Apt 7 Beaverton, OR; Last voted: 2008
DMV: George F Balston; YOB: 1951; SSN: 5598; D/L: SW Clementine Blvd Apt 210 Beaverton, OR
DECEASED PERSON: George Balston; YOB: 1951; SSN: 5598; DOD: 1995
As features accumulate, it becomes easier to match future identity records. As events and transactions accumulate, detection of relevance improves. Here we can see that George, who died in 1995, voted in 2008.
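A minimal sketch of how accumulated events on a resolved identity can surface something relevant, assuming the records have already been resolved to one person. The class and method names (`ResolvedPerson`, `add_event`, `alerts`) are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ResolvedPerson:
    """One resolved identity plus the events accumulated against it."""
    name: str
    year_of_death: Optional[int] = None
    events: list = field(default_factory=list)  # (year, description) tuples

    def add_event(self, year, description):
        self.events.append((year, description))
        return self.alerts()

    def alerts(self):
        # An event dated after the year of death is worth a closer look.
        if self.year_of_death is None:
            return []
        return [(y, d) for (y, d) in self.events if y > self.year_of_death]


george = ResolvedPerson("George F Balston")
print(george.add_event(2008, "voted"))  # [] : no death record known yet

# Later, the deceased person record resolves to the same identity.
george.year_of_death = 1995
print(george.alerts())                  # [(2008, 'voted')]
```

Neither record is interesting on its own. The alert appears only once both sit on the same identity, whichever record arrives first.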
© 2010 IBM Corporation 14 Flows vs. Leaks
© 2010 IBM Corporation 15 Flows vs. Leaks
Most flows are by design
– Better context
– Better prediction
Unintended disclosure flows
– External: e.g., Cyber
– Internal: e.g., Insider threats
Recent WikiLeaks
– Devastating consequences
– Could have been worse. What if the leaks had not been made public, but instead selectively and quietly passed around to others?
© 2010 IBM Corporation 16 Wicked Leaks, Prediction
Sudden change of leak cadence – so much, so fast, now considered intolerable
– May result in new harsh laws and prosecutions
Receiving stolen property (e.g., data) and benefiting from it … met with severe pursuit and prosecution
– Stealing party
– The information exchange points (e.g., WikiLeaks)
– Those publishing the contents (e.g., the media)
What if? Could result in less transparency, less accountability
© 2010 IBM Corporation 17 Protecting Big Data from Wicked Leaks
Central indexes – fewer copies of the data moved; easier to control and audit usage
Analytics in the anonymized data space – data anonymization before transfer, reducing the risk of unintended disclosure
Immutable (tamper resistant) audit logs – to help prove the system is being used within policy and law
Real-time active audits – to evaluate the actions of authorized users
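Two of the safeguards above can be sketched with the Python standard library. Both are illustrative assumptions, not the techniques used in any IBM product: anonymization before transfer is shown as a keyed one-way hash (HMAC-SHA256), and the tamper resistant audit log is shown as a simple hash chain. The names `pseudonymize` and `AuditLog` are hypothetical.

```python
import hashlib
import hmac
import json


def pseudonymize(value, key):
    # Keyed one-way hash: parties holding the same key can compare values
    # for equality without exchanging the values themselves.
    cleaned = value.strip().lower().encode("utf-8")
    return hmac.new(key, cleaned, hashlib.sha256).hexdigest()


def _digest(body):
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only log in which each entry's hash covers the previous hash."""
    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    def append(self, user, action):
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        body = {"user": user, "action": action, "prev": prev}
        self.entries.append({**body, "hash": _digest(body)})

    def verify(self):
        # Recompute every hash; any edited or reordered entry breaks the chain.
        prev = self.GENESIS
        for e in self.entries:
            body = {"user": e["user"], "action": e["action"], "prev": e["prev"]}
            if e["prev"] != prev or e["hash"] != _digest(body):
                return False
            prev = e["hash"]
        return True


key = b"illustrative-shared-key"  # assumption: key is managed out of band
token = pseudonymize("5598", key)
print(token == pseudonymize(" 5598 ", key))  # True: same value, same token

log = AuditLog()
log.append("analyst1", "search " + token)
log.append("analyst1", "view record")
print(log.verify())                          # True
log.entries[0]["action"] = "nothing to see"  # simulate tampering
print(log.verify())                          # False
```

A chain like this only makes tampering evident. Someone able to rewrite every entry could rebuild it, so in practice the latest hash would need to be stored somewhere the log's operators cannot alter. Likewise, short values such as a four-digit number are easy to guess if the key leaks.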