Inside the MATRIX: Fair Information Practices in a World of Data Mining Professor Peter Swire Ohio State University DePaul Symposium on Privacy and Identity October 15, 2004
The Challenge Federal official, involved in funding information sharing systems, recently asked me: "What can we do to address the concerns of privacy proponents so that they will stop complaining about MATRIX and other needed systems?" –Today's talk in that national security context. –This was a good-faith question from an honorable person. –He was sobered by my answer.
Overview Pattern analysis and link analysis Current MATRIX as link analysis system Open questions on effectiveness of MATRIX and overall lawfulness This talk: the hard privacy issues that exist even if we assume MATRIX is effective and lawful
Pattern & Link Analysis Pattern analysis as data mining Seek statistical correlations, then act DeRosa/CSIS describes pattern analysis issues Policy approaches: –Original MATRIX system: use data mining –Dempsey & Rosenzweig as D.C. policy compromise –ACLU and others oppose it entirely
Pattern & Link Analysis Link analysis: learn more about one suspect –More traditional police work –Warrants, subpoenas, and public records, depending on type of information –Current MATRIX system and focus of this talk
MATRIX Multi-State Anti-Terrorism Information Exchange (MATRIX) –$12 million from DHS & DOJ –Project security and access in Florida First proposed after 9/11 At the peak, 12 states had agreed to participate –Currently FL, CT, MI, OH, PA are in program –States that have left or decided not to join after actively considering it: AL, CA, CO, GA, LA, KY, OR, SC, TX, UT, WV –Privacy and cost cited as reasons not to do it
The Current MATRIX Information accessible includes criminal history records, driver's license data, vehicle registration records, and incarceration/corrections records, including digitized photographs, with significant amounts of public records data. This capability will save countless investigative hours and drastically improve the opportunity to successfully resolve investigations. The ultimate goal is to expand this capability to all states. Official site:
2 Early Objections System was created and pushed by admitted drug smuggler, Hank Asher of Seisint –This is not relevant to how we should view the current system –It made it harder to say "Trust Us" on MATRIX After 9/11, 120,000 names sent to law enforcement for high terrorism factor (HTF) –This is data mining, without individualized suspicion, with no transparency or known checks against abuse –Today, MATRIX is not a data mining application.
Jan Seisint Documents HTF based on factors including: Age, gender & ethnicity What they did with their driver's licenses Pilots or associations to pilots Proximity to dirty addresses/phone numbers Investigational data SSN anomalies Credit histories
Seisint Documents The associative links, historical residential information, and other information, such as an individual's possible relatives and associates, are deeper and more comprehensive than other commercially available database systems presently on the market.
Answering the Federal Official Privacy experts (not necessarily advocates) will have a list of questions: –About current configuration of system and its compliance with fair information practices –About system as designed (it had original, broader functions) –How system could easily evolve over time (mission creep)
Diagram: inputs (Florida, Other States; More States Supply Data; Public Records; Private Records (?)) feed MATRIX; outputs go to Police & Other State Subscribers, Intel (?), Feds (?)
The Inputs: Florida, Other States; More States Supply Data; Public Records; Private Records (?)
Questions on Inputs: Data Quality: 2003 FBI announcement that NCIC data could no longer be subject to accuracy requirements of the Privacy Act Are state criminal, prison, and similar records more accurate? If records are fixed in one place, is that correction spread to all the other databases?
Questions on Inputs: Sensitive data: Sources of identity theft -- SSNs are listed in many public records; bank account records in bankruptcy public records Known privacy concerns of American people on medical, financial, children's, & other sensitive records
Questions on Inputs: Private sector data. Was there notice & consent for these uses? For medical, credit history, and other sensitive data? Are these secondary uses appropriate? Federal data under the Privacy Act, with public oversight. What similar checks and balances for how private data is gathered and used?
Questions on Outputs: For secret/confidential data, assume good security in data center. How many people have access to the outputs of MATRIX? 800,000 uniformed police, for traffic stops, etc. Non-uniformed? Firefighters? Others?
Questions on Outputs: How to secure outputs to 1 million people? Assume few/no secrets for what the million can see about the system – Swire paper on security/obscurity Training Audit trails Anti-browsing laws & enforcement But, what can a terrorist or organized crime group learn by bribing one out of the million?
Questions on the Data Center/System: A principle: the more important the decisions made, the more important it is to have due process and fair information practices. E.g., denied for mortgage or job, so have FCRA. Decisions here might include: Arrest the person (my student Greg Smith) Deny ability to travel, enter secured spaces Deny job, on a background check Suspicion on a person's associates? Other uses over time?
Questions on the Data Center/System: Access and correction as key fair information practices. Currently no access by individual to data held in MATRIX. Instead, individual told to go to every data source and get access there. Problems include: Burdensome to go to numerous sources Data sources not all publicly listed Even if a mistake is corrected once, it often reappears
The Sobering List of Privacy Issues for the Federal Official Inputs: data quality Inputs: sensitive data Inputs: private-sector data Outputs: secrets when thousands or a million receive data Outputs: anti-browsing and good security at the edges Important decisions by government require due process Access and correction (when secrecy unlikely to work) Transparency and governance, to reduce mistakes and improve public acceptance
Is It Worth Answering Those Questions? To the Homeland Security official: –If the privacy homework assignment seems too burdensome, then the temptation is to minimize or ignore privacy issues –But the privacy homework is good policy and good government –Markle report and the need to do the privacy homework or else watch public opposition undermine the potential benefits of a system –Transparent, good governance as the touchstone
Conclusion The official who questioned me was surprised and sobered by the number of significant and difficult privacy issues in MATRIX Should be sobering to all of us how little the funders of MATRIX had worked through these issues This conference, and ongoing vigilance, are needed on these issues
Contact Information Professor Peter Swire Moritz College of Law of the Ohio State University Phone: (240) Web: