Published by Emerson Musgrove
© 2011 IBM Corporation 1 Privacy by Design (PbD) Confessions of an Architect Privacy by Design | Time to Take Control Toronto, Canada January 28th, 2011 Jeff Jonas, IBM Distinguished Engineer Chief Scientist, IBM Entity Analytics JeffJonas@us.ibm.com
Background. Early 80’s: Founded Systems Research & Development (SRD), a custom software consultancy. 1989 – 2003: Built numerous systems for Las Vegas casinos, including a technology known as Non-Obvious Relationship Awareness (NORA). 2005: IBM acquires SRD; now chief scientist of IBM Entity Analytics. Personally architected, designed and deployed roughly 100 systems, a number of which contained multi-billions of transactions describing hundreds of millions of entities. Selected affiliations: EPIC, Member, Advisory Board; Privacy International, Member, Advisory Board; Markle Foundation, Member, Task Force on National Security in the Information Age; Senior Associate, Center for Strategic and International Studies (CSIS).
A Late Bloomer to Privacy. 1980 – 2001: No clue whatsoever. 2001 – 2006: Slowly waking up. 2007 – 2011: Today, at best, a student of privacy.
A Journey Fraught with Reflection and Rethinking. The greater my privacy and civil liberties awareness, the greater the number of imperfections that appear in the rearview mirror.
Katrina – Missing Persons Reunification Project. Information about the status of persons quickly ends up scattered across countless databases: over 50 such web sites/organizations were identified as having victim-related data; many people were registered multiple times in the same database; many people were registered multiple times across databases; many people were registered as missing in one database and found in another. Connecting found persons previously reported as missing becomes nearly impossible: too many databases; constantly changing data.
Katrina Reunification Project Statistics. Total data sources: 15. Usable records: 1,570,000. Unique persons: 36,815. Total loved ones reunited: >100.
Katrina – Missing Persons Reunification Project. Privacy by Design: contractually authorized to delete all the data after the reunification office completed its work. Hence, a few months later, all collected data and reporting products were deleted. DESTRUCTION OF EVIDENCE! Data Decommissioning – Destruction of Accountability.
“G2” My Skunk Works Project
G2: Sensemaking on Streams. 1) Evaluate new information against previous information … as it arrives. 2) Determine if what is being observed is relevant. 3) Deliver this relevant, actionable insight fast enough to do something about it … as it’s happening. 4) Do this with sufficient accuracy and scale to really matter.
From Pixels to Pictures to Insight. [Figure labels: Observations; Contextualization; Information in Context; Relevance; Consumer (an analyst, a system, the sensor itself, etc.)]
G2: Sensemaking on Streams. Domain: people, organizations, places, things, events … proteins, asteroids, and more. Will simultaneously commingle and make sense over structured, unstructured, biographic, biometric and geospatial data. Multi-lingual. Even curious: if it is unsure, it figures out whether the question is worth researching and may choose to ask Google or maybe even a Jeopardy champion to clear up any confusion.
Harnessing Big Data. New Physics. More data: better the predictions. More data: bad data … good. More data: less compute.
Smarter Planet: Example G2 Use Cases. Traffic optimization: route suggestions pushed to drivers, just-in-time, to avert significant traffic events. Optimize individual lives: search results optimized based on predictions about where you are going next. Pandemic response: a nation able to work right through an extreme global pandemic with real-time citizen recommendations (e.g., “quarantine yourself!”).
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. ALTHOUGH EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF CREATING ANY WARRANTY OR REPRESENTATION FROM IBM (OR ITS AFFILIATES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS); OR ALTERING THE TERMS AND CONDITIONS OF THE APPLICABLE LICENSE AGREEMENT GOVERNING THE USE OF IBM SOFTWARE.
IBM InfoSphere Sensemaking V184.108.40.206. Following two years of skunk works development while guided by privacy by design goals … it is just possible that there are more privacy and civil liberties enhancing capabilities baked-in, during conception and design, than any other general purpose advanced analytics technology commercially available … on Earth … to date.
PbD: Full Attribution. ABOUT THE FEATURE: Every record knows where it came from and when. No merge/purge data survivorship processing. IMPORTANCE: The Universal Declaration of Human Rights has four articles containing the word “arbitrary”, e.g., Article 9 reads “No one shall be subjected to arbitrary arrest, detention or exile.” If you don’t know where the data came from, how can this be non-arbitrary? The ability to identify every original record is essential for reconciliation and audit.
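A minimal sketch of the full attribution idea, assuming a simple in-memory model. The names `SourceRecord`, `Entity`, `attach` and `provenance` are hypothetical and are not taken from the product: every record keeps its source and arrival time, and an entity only ever appends records, so conflicting values are kept side by side instead of being merged or purged.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class SourceRecord:
    source_system: str     # where the record came from
    record_id: str         # identifier inside that source
    received_at: datetime  # when it arrived
    attributes: tuple      # (field, value) pairs exactly as received


@dataclass
class Entity:
    entity_id: int
    records: list = field(default_factory=list)

    def attach(self, record: SourceRecord) -> None:
        # Append only: no record is merged away or overwritten.
        self.records.append(record)

    def provenance(self, field_name: str) -> list:
        """Return (value, source_system, record_id) for every record asserting the field."""
        return [
            (value, r.source_system, r.record_id)
            for r in self.records
            for name, value in r.attributes
            if name == field_name
        ]


# Hypothetical data: two sources disagree about a phone number.
entity = Entity(entity_id=1)
entity.attach(SourceRecord("crm", "A-17", datetime(2011, 1, 3, tzinfo=timezone.utc),
                           (("phone", "340-900-9000"),)))
entity.attach(SourceRecord("billing", "9921", datetime(2011, 1, 9, tzinfo=timezone.utc),
                           (("phone", "340-900-9001"),)))
phones = entity.provenance("phone")
```

Because both phone numbers survive with their origins, an auditor can trace any value back to the original record rather than to an anonymous "best" value.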
PbD: Data Tethering. ABOUT THE FEATURE: Adds, changes and deletes from source systems can be processed in real time, sub-second (not requiring periodic batch reloading). IMPORTANCE: Data currency in information sharing environments is important, e.g., when erroneous derogatory data is corrected in a source system, it is vital that such corrections are reflected everywhere, immediately.
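One way to picture data tethering is a downstream copy that applies each add, change and delete event from the source as it arrives, instead of waiting for a batch reload. The sketch below is illustrative only; `ChangeEvent` and `TetheredStore` are hypothetical names, and a real deployment would receive events from a message stream rather than direct calls.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    op: str                      # "add", "change", or "delete"
    source_system: str
    record_id: str
    attributes: Optional[dict] = None


class TetheredStore:
    """Downstream copy that stays in step with its source systems."""

    def __init__(self):
        self._records = {}

    def apply(self, event: ChangeEvent) -> None:
        key = (event.source_system, event.record_id)
        if event.op in ("add", "change"):
            # Replace the downstream copy with the source's current state.
            self._records[key] = dict(event.attributes or {})
        elif event.op == "delete":
            self._records.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")

    def get(self, source_system: str, record_id: str):
        return self._records.get((source_system, record_id))


# Hypothetical scenario: a derogatory flag entered in error is corrected, then the record is deleted.
store = TetheredStore()
store.apply(ChangeEvent("add", "watchlist", "r-1", {"name": "Pat Smith", "flagged": True}))
store.apply(ChangeEvent("change", "watchlist", "r-1", {"name": "Pat Smith", "flagged": False}))
corrected = store.get("watchlist", "r-1")
store.apply(ChangeEvent("delete", "watchlist", "r-1"))
after_delete = store.get("watchlist", "r-1")
```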
PbD: Analytics on Anonymized Data. ABOUT THE FEATURE: Owners of data can anonymize selected fields before an information transfer. Despite the cryptographic form of the data, deep predictive analytics (including some fuzzy matching) can still be accomplished when fusing this data for discovery and analysis. IMPORTANCE: With every copy of data, there is an increased risk of unintended disclosure. Data anonymized before transfer and anonymized at rest reduces the risk of unintended disclosure. And with full attribution, re-identification is possible by design, to ensure reconciliation and audit.
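The slide does not say how the anonymization is done, so the following is only a sketch of one common approach, not the product's method: each data owner normalizes selected fields and replaces them with a keyed one-way hash (HMAC-SHA256) before transfer. Records from different owners can then be matched on the tokens without exposing the raw values. The shared key, the normalization rule and the function names are all assumptions made for illustration.

```python
import hashlib
import hmac


def normalize(value: str) -> str:
    # Crude normalization so formatting differences do not defeat matching.
    return "".join(ch for ch in value.lower() if ch.isalnum())


def anonymize(value: str, key: bytes) -> str:
    # Keyed one-way hash: the raw value cannot be read from the token.
    return hmac.new(key, normalize(value).encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize_record(record: dict, fields: set, key: bytes) -> dict:
    # Only the selected fields are anonymized; the rest pass through unchanged.
    return {name: anonymize(value, key) if name in fields else value
            for name, value in record.items()}


# Hypothetical key agreed between the data owners out of band.
key = b"demo-key-shared-out-of-band"
sensitive = {"name", "phone"}

from_owner_a = anonymize_record(
    {"name": "Pat Smith", "phone": "(340) 900-9000", "balance": 361.43}, sensitive, key)
from_owner_b = anonymize_record(
    {"name": "PAT SMITH", "phone": "340-900-9000"}, sensitive, key)

same_person = (from_owner_a["name"] == from_owner_b["name"]
               and from_owner_a["phone"] == from_owner_b["phone"])
```

Exact-token matching after normalization covers only the simplest cases; the fuzzy matching mentioned on the slide would need additional techniques, such as hashing several normalized variants of each value.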
PbD: Tamper Resistant Audit Logs. ABOUT THE FEATURE: Who searches for what is logged in a consistent manner. Even the database administrator cannot alter the evidence contained in this log. IMPORTANCE: Every now and then people with access and privileges take a look at records without a legitimate business purpose, e.g., an employee of a banking system looking up their neighbor. Tamper resistant logs make it possible to audit user behavior and can have a chilling effect on misuse.
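A hash chain is one widely used building block for logs of this kind, shown here as a sketch rather than as the product's design. Each entry's hash covers the previous entry's hash, so editing an old entry breaks every later link. Note that this makes tampering evident rather than impossible: an administrator who can rewrite the whole chain could recompute it unless the latest hash is also anchored somewhere outside their control. `AuditLog` and its methods are hypothetical names.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, entry: dict) -> str:
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


class AuditLog:
    def __init__(self):
        self._entries = []  # list of (entry, hash) pairs

    def append(self, user: str, query: str) -> None:
        prev = self._entries[-1][1] if self._entries else GENESIS
        entry = {"seq": len(self._entries), "user": user, "query": query}
        self._entries.append((entry, _digest(prev, entry)))

    def verify(self) -> bool:
        """Recompute the chain; any altered entry makes verification fail."""
        prev = GENESIS
        for entry, stored in self._entries:
            if _digest(prev, entry) != stored:
                return False
            prev = stored
        return True


log = AuditLog()
log.append("teller-17", "lookup: neighbor's account")
log.append("analyst-02", "fraud investigation: case 881")
ok_before = log.verify()

# Simulate someone quietly rewriting what was searched for.
log._entries[0][0]["query"] = "lookup: routine transaction"
ok_after = log.verify()
```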
PbD: False Negative Favoring Methods. [Figure, EXISTING BEST PRACTICE: three numbered records (Patrick T Smith, 340-900-9000; Patricia Smith, 340-900-9000; Pat T Smith, 340-900-9000), with the label “Student”. The ambiguous record is assigned to whichever candidate is closest: “Closest. Hence, for sure.”]
PbD: False Negative Favoring Methods. ABOUT THE FEATURE: A false negative occurs when something that is true is not detected. Sometimes a new record can belong to two different entities. Usually systems select the stronger of the two. But had there been only one choice, it would have matched to the other. This is now properly handled, in real time. IMPORTANCE: If a new record gets arbitrarily assigned, you may have inadvertently created a false positive. False positives can adversely affect people’s lives, e.g., the police find themselves knocking down the wrong door or an innocent passenger is denied the ability to board a plane.
PbD: False Negative Favoring Methods. [Figure, NEW BEST PRACTICE: the same three numbered records (Patrick T Smith, 340-900-9000; Patricia Smith, 340-900-9000; Pat T Smith, 340-900-9000), with the labels “Student” and “100%”.]
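The difference between picking the closest candidate and favoring a false negative can be sketched as follows. The similarity scores and the 0.8 threshold are invented for illustration, and the slides do not describe how the product actually scores or handles the ambiguous case; this sketch simply declines to assign a record when more than one entity is a plausible match.

```python
def closest(scores: dict) -> str:
    # Existing practice: always take the strongest candidate.
    return max(scores, key=scores.get)


def resolve(scores: dict, threshold: float = 0.8):
    """Return the single plausible match, or None when there is no match or the choice is ambiguous."""
    plausible = [entity for entity, score in scores.items() if score >= threshold]
    if len(plausible) == 1:
        return plausible[0]
    # Zero or several plausible candidates: leave the record unassigned
    # rather than risk creating a false positive.
    return None


# Hypothetical scores for the record "Pat T Smith, 340-900-9000".
ambiguous = {"Patrick T Smith": 0.91, "Patricia Smith": 0.86}
only_one = {"Patricia Smith": 0.86}

picked_by_closest = closest(ambiguous)
picked_when_ambiguous = resolve(ambiguous)
picked_when_unique = resolve(only_one)
```

With both candidates present, the closest-match rule picks Patrick, even though the same record would have matched Patricia had she been the only candidate. That arbitrariness is what the slide warns about.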
PbD: Self-Correcting False Positives. [Figure: record 1, John T Smith Jr, 123 Main Street, 703 111-2000, DOB: 03/12/1984; record 2, John T Smith, 123 Main Street, 703 111-2000, DL: 009900991. A plausible claim these two people are the same, until record 3 comes into view: John T Smith Sr, 123 Main Street, 703 111-2000, DL: 009900991, which reveals this is a FALSE POSITIVE.]
PbD: Self-Correcting False Positives. [Figure: the same three records, John T Smith Jr (DOB: 03/12/1984), John T Smith (DL: 009900991) and John T Smith Sr (DL: 009900991), all at 123 Main Street, 703 111-2000. New Best Practice: FIXED IN REAL-TIME (not end of month).]
PbD: Self-Correcting False Positives. ABOUT THE FEATURE: A false positive is an assertion (claim) that is made, but not true. With every new data point presented, all prior assertions are re-evaluated to ensure they are still correct, and if now incorrect, these are repaired. If two people were thought to be the same because they share the same name, address and phone, and it is later discovered that they are a Jr. and a Sr. (two different people), this is now remedied in real time, not at the end of the month. IMPORTANCE: False positives can adversely affect people’s lives. Without self-correcting false positives, databases start to drift from the truth and become visibly wrong, necessitating periodic reloading to fix this. Periodic monthly reloading would mean wrong decisions are possible all month until the next reload, even though you knew beforehand.
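The Jr./Sr. example can be reproduced with a deliberately naive sketch: every time a record arrives, all entities are rebuilt from scratch, joining the strongest pairs first and never joining records that conflict on a discriminating field. The field names, the scoring and the rebuild-everything strategy are assumptions for illustration; a production system would re-evaluate incrementally, and the slides do not describe the actual algorithm.

```python
from itertools import combinations


def conflicts(a: dict, b: dict) -> bool:
    """True when both records carry the same discriminating field with different values."""
    return any(k in a and k in b and a[k] != b[k] for k in ("suffix", "dob", "dl"))


def score(a: dict, b: dict) -> int:
    if conflicts(a, b):
        return 0
    if any(a[k] != b[k] for k in ("name", "address", "phone")):
        return 0
    # A shared driver's license is stronger evidence than name, address and phone alone.
    return 2 if "dl" in a and a.get("dl") == b.get("dl") else 1


def resolve_all(records: list) -> list:
    """Rebuild every entity: strongest pairs first, never joining conflicting records."""
    clusters = [{i} for i in range(len(records))]
    pairs = sorted(combinations(range(len(records)), 2),
                   key=lambda p: score(records[p[0]], records[p[1]]), reverse=True)
    for i, j in pairs:
        if score(records[i], records[j]) == 0:
            continue
        ci = next(c for c in clusters if i in c)
        cj = next(c for c in clusters if j in c)
        if ci is cj or any(conflicts(records[x], records[y]) for x in ci for y in cj):
            continue
        clusters.remove(ci)
        clusters.remove(cj)
        clusters.append(ci | cj)
    return sorted(sorted(c) for c in clusters)


base = {"name": "John T Smith", "address": "123 Main Street", "phone": "703 111-2000"}
rec_jr = {**base, "suffix": "Jr", "dob": "03/12/1984"}
rec_plain = {**base, "dl": "009900991"}
rec_sr = {**base, "suffix": "Sr", "dl": "009900991"}

before = resolve_all([rec_jr, rec_plain])          # Jr and the plain record look like one person
after = resolve_all([rec_jr, rec_plain, rec_sr])   # Sr arrives and the earlier claim is repaired
```

In this sketch the first two records are initially resolved to one entity. Once the Sr. record arrives, the plain record is regrouped with it on the shared license number and the Jr. record stands alone.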
PbD: Information Transfer Accounting. Basic Data: Name: Mark T Smith; Address: POB 1346; City: Seattle; Phone: (310) 555-0000; Tax ID: 556-99-9999; Balance: $361.43.
PbD: Information Transfer Accounting. Who Looked (Date, Name, Why): 01/09/2010, Ken Wales, Teller trans; 11/24/2010, Susan Callie, Fraud invest.
PbD: Information Transfer Accounting. Sent Where (Date, Sent to, Why): 04/19/2010, ADP, Payroll synch; 06/01/2010, Amex, Marketing alliance; 07/16/2010, S&J Inc, Third party deal; 12/31/2010, IRS, Annual compliance.
PbD: Information Transfer Accounting. ABOUT THE FEATURE: Can record who inspected each record and store this with the record, much like a credit report has a list of recent parties who have inquired. Can record what records were transferred to secondary systems, allowing users to inspect information flows. IMPORTANCE: It is often cumbersome to learn who has seen what records or what records have been shared system-to-system. Users can now be easily provided such disclosures, increasing transparency and control, e.g., able to recall or cancel information transfers from selected sharing partners.
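As a rough sketch of information transfer accounting, a record can carry its own list of who looked and where it was sent, much like the inquiry section of a credit report. `AccountedRecord` and its methods are hypothetical; the sample values reuse the names and dates shown on the preceding slides.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class AccountedRecord:
    data: dict
    views: list = field(default_factory=list)      # (date, who, why)
    transfers: list = field(default_factory=list)  # (date, sent_to, why)

    def read(self, who: str, why: str, on: date) -> dict:
        # Every inspection is written down before the data is handed over.
        self.views.append((on, who, why))
        return dict(self.data)

    def send(self, sent_to: str, why: str, on: date) -> dict:
        # Every outbound transfer is accounted for in the same way.
        self.transfers.append((on, sent_to, why))
        return dict(self.data)

    def disclosures(self) -> dict:
        """What the data subject could be shown on request."""
        return {"who_looked": list(self.views), "sent_where": list(self.transfers)}


record = AccountedRecord({"name": "Mark T Smith", "city": "Seattle"})
record.read("Ken Wales", "Teller trans", date(2010, 1, 9))
record.send("ADP", "Payroll synch", date(2010, 4, 19))
report = record.disclosures()
```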
A Wide Number of Privacy by Design Features: Data Tethering; Analytics on Anonymized Data; Tamper Resistant Audit Log; Information Transfer Accounting; Full Attribution; False Negative Favoring; Self-Correcting False Positives. [Figure labels: By design; Mandatory]
IBM InfoSphere Sensemaking V220.127.116.11: Smarter & More Responsible
IBM InfoSphere Sensemaking V18.104.22.168. Challenge: Try to find another general purpose advanced analytics technology with more privacy and civil liberties enhancing features baked-in by design! In this competition everyone wins.
And more likeminded, nifty features to come …
IBM InfoSphere Sensemaking V22.214.171.124. Date of availability: January 28th, 2011 (TODAY!) ~~ Caveat: Limited availability, subject to lab approval ~~
Related Reference Material: “Big Data. New Physics.”; “Decommissioning Data: Destruction of Accountability”; “Source Attribution, Don’t Leave Home Without It”; “Data Tethering: Managing the Echo”; “Out-bound Record-level Accountability in Information Sharing Systems”; “To Anonymize or Not Anonymize, That is the Question”; “Immutable Audit Logs (IAL’s)”; “Big Data Flows vs. Wicked Leaks”.
Privacy-Enhancing Technology, State of the Union. Yesterday: stand-alone privacy-enhancing technologies exist; if they cost extra, adoption is low and slow; some researchers wander off, placing attention elsewhere. Today: Privacy by Design is baked in, at no additional cost; some privacy and civil liberties enhancing functionality can even be embedded without an off switch.
Finally … Privacy by design is more than just technology. Equal, if not more, attention must be paid to privacy by design when conceiving process and policy.