False Positives
Financial Services Technology Consortium credit card fraud analysis
–500,000 samples, 100,000 of them fraudulent
–20% false positive and 20% false negative rates: .8 * 100,000 = 80,000; .2 * 400,000 = 80,000
–Therefore, half of all samples flagged as fraud are not fraud
"Credit card fraud detection using meta-learning: Issues and initial results" by Stolfo et al.
Suppose 500 terrorists out of 200,000,000
–Same percentages: 40,000,300 flagged, of which 400 are terrorists
–False positive 0.2%; false negative 5%: 400,474 flagged, of which 475 are terrorists
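The arithmetic on this slide can be checked with a short sketch. The function name `flagged_counts` and its parameters are illustrative and not taken from the cited study; the sketch assumes each error rate applies uniformly to its group.

```python
def flagged_counts(population, positives, false_positive_rate, false_negative_rate):
    """Return (total flagged, true positives among the flagged).

    Illustrative helper: 'positives' are the real fraud cases (or terrorists),
    everyone else in the population is a negative.
    """
    negatives = population - positives
    # Real positives that the detector catches.
    true_positives = round(positives * (1 - false_negative_rate))
    # Innocent cases that the detector flags anyway.
    false_positives = round(negatives * false_positive_rate)
    return true_positives + false_positives, true_positives


# Credit card example: 500,000 samples, 100,000 fraudulent, 20% / 20% error rates.
fraud = flagged_counts(500_000, 100_000, 0.20, 0.20)

# Terrorist example with the same 20% / 20% rates.
same_rates = flagged_counts(200_000_000, 500, 0.20, 0.20)

# Terrorist example with a 0.2% false positive and 5% false negative rate.
better_rates = flagged_counts(200_000_000, 500, 0.002, 0.05)

for label, (flagged, hits) in [("fraud", fraud),
                               ("same rates", same_rates),
                               ("better rates", better_rates)]:
    print(f"{label}: {flagged:,} flagged, {hits:,} real, "
          f"{hits / flagged:.4%} of flags are correct")
```

Even with the much better rates in the last case, only about 0.12% of the flagged people are actual positives, because the negatives outnumber the positives so heavily.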
Goals of HAVA
Mandates replacement of punchcard and lever machines by 2006
Mandates the creation of statewide databases of registered voters by Jan. 2006
–Dems: disenfranchisement
–Reps: padding of voter rolls
Privacy: identity theft + FIPs
Can’t necessarily throw computers at problems
Usage of database
Any election official in the State, including any local election official, may obtain immediate electronic access to the information contained in the computerized list.
Drivers’ license database
The chief State election official and the official responsible for the State motor vehicle authority of a State shall enter into an agreement to match information in the database of the statewide voter registration system with information in the database of the motor vehicle authority … to enable each such official to verify the accuracy of the information provided on applications for voter registration.
Verification of voter info
Driver’s license number or last 4 digits of Social Security number
ACM study of databases of registered voters
http://www.acm.org/usacm/weblog/index.php?p=277#more-277
Recommendations to Election Assistance Commission (EAC)
General recommendations
States will face many technical challenges in implementing these databases in a secure, accurate, and reliable manner, while protecting sensitive information and minimizing the risk of identity theft. The databases must also be easy to use and able to withstand the kinds of extreme demands to which they are likely to be subjected on Election Day. While the current guidance recognizes some of these challenges, it addresses the technical issues only at the highest level of detail. We urge the Commission to provide more technical detail on a broader set of issues as it further develops this guidance.
The ACM committee urges the EAC to specify: (1) methods or best practices for states to use in limiting and auditing access to SSN data within their databases and (2) the usage of SSNs only for verification purposes, not as identifiers. In any case, more detailed guidance is needed regarding steps to ensure the security and privacy of SSN data.
Risks of linking to other databases
HAVA’s mandate that voter registration databases be coordinated with other statewide databases can, if not properly handled, undermine the accuracy of the voter registration data. Therefore the committee urges the EAC to pay careful attention to ensuring the accuracy of data as it develops this guidance.
Inaccuracies in databases
[T]he guidelines should include more detail on the coordination of voter registration databases with other state agency databases (e.g., DMV records, death records, felony records, and so on). Such database integration represents a major potential source of inaccurate data as a person’s address and legal name may differ among state databases due to differing policies among state agencies for sourcing, updating, and validating data.
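A minimal sketch of how the same person can look different in two agency databases. The field names, the sample records, and the `normalize` rule are all hypothetical; real matching policies vary by state and are not described in the study excerpt above.

```python
def normalize(text):
    """Lowercase, drop periods and commas, and collapse whitespace."""
    cleaned = text.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def differing_fields(voter_record, agency_record, fields=("name", "address")):
    """Return the fields whose normalized values disagree between two records."""
    return [
        f for f in fields
        if normalize(voter_record.get(f, "")) != normalize(agency_record.get(f, ""))
    ]


# Hypothetical records for one person as two agencies might store them.
voter = {"name": "Robert Smith, Jr.", "address": "12 Oak St."}
dmv = {"name": "ROBERT SMITH JR", "address": "12 Oak Street"}

mismatches = differing_fields(voter, dmv)
print(mismatches)  # fields a person would need to review by hand
```

Here simple normalization reconciles the name but not the address ("st" versus "street"), so a naive exact match would treat one voter as two different people.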
Accuracy and Auditing
Knowing when and how voter registration records are created or amended or when active status is changed to inactive is important to establishing and maintaining accuracy. The committee feels that all information gathered during the registration process, including information about applications that are rejected or incomplete, should be retained for an appropriate period in order to support auditing.
Risks of automatic merges & purges
states should … resist the temptation for automated 'merges' and 'purges' of voter registration data based on matching with other state databases. In the case that such merges and purges are carried out, we recommend that they are done with care. For example, changed, added, or deleted fields should be notated with the date and source of the change. This will make it easier for corrections to be made, as well as for databases that introduce too many errors to be identified…
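One way to notate each changed, added, or deleted field with its date and source is sketched below. The class names, field names, and source labels are hypothetical; this is an illustration of the recommendation, not a design taken from the ACM study.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class FieldChange:
    field_name: str
    old_value: Optional[str]   # None means the field did not exist before
    new_value: Optional[str]   # None means the field was deleted
    changed_on: date
    source: str                # e.g. which external database triggered the change


@dataclass
class VoterRecord:
    fields: Dict[str, str]
    history: List[FieldChange] = field(default_factory=list)

    def set_field(self, name, new_value, changed_on, source):
        """Apply a change and keep the old value, date, and source for auditing."""
        self.history.append(
            FieldChange(name, self.fields.get(name), new_value, changed_on, source)
        )
        if new_value is None:
            self.fields.pop(name, None)
        else:
            self.fields[name] = new_value


def changes_by_source(records):
    """Count recorded changes per source across all records."""
    return Counter(c.source for r in records for c in r.history)


record = VoterRecord({"name": "Pat Doe", "address": "1 Elm St"})
record.set_field("address", "9 Pine Ave", date(2006, 1, 15), "DMV match")
record.set_field("status", "inactive", date(2006, 2, 1), "death records match")
print(changes_by_source([record]))
```

Because the old value is kept alongside the date and source, a wrong change can be reversed, and per-source counts give a starting point for spotting a database that introduces too many errors.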
Limiting access to databases
[E]lection officials [should] detail fine-grained permissions for users of the database. Each user should be allowed to read (or update) only those data fields that are relevant to his or her role.
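Field-level permissions of this kind can be sketched as a per-role table of readable and updatable fields. The role names and field names below are assumptions made for illustration; the study does not prescribe any particular roles.

```python
# Hypothetical roles and the fields each one may read or update.
ROLE_PERMISSIONS = {
    "poll_worker": {
        "read": {"name", "address", "status"},
        "update": set(),
    },
    "county_registrar": {
        "read": {"name", "address", "status", "birthdate"},
        "update": {"name", "address", "status"},
    },
}


def readable_view(record, role):
    """Return only the fields this role is allowed to read."""
    allowed = ROLE_PERMISSIONS[role]["read"]
    return {k: v for k, v in record.items() if k in allowed}


def update_field(record, role, field_name, value):
    """Update one field, refusing if the role lacks update rights on it."""
    if field_name not in ROLE_PERMISSIONS[role]["update"]:
        raise PermissionError(f"{role} may not update {field_name}")
    record[field_name] = value


voter = {"name": "Pat Doe", "address": "1 Elm St", "status": "active",
         "birthdate": "1970-01-01", "ssn_last4": "1234"}

print(readable_view(voter, "poll_worker"))
update_field(voter, "county_registrar", "address", "9 Pine Ave")
```

Note that neither role in this table can read `ssn_last4`; a field stays hidden unless some role is explicitly granted it.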
Voter Privacy in the Digital Age
A study by the California Voter Foundation
http://www.calvoter.org/issues/votprivacy/pub/voterprivacy/index.html
Information collected by states
All states require voters to provide their name, address and signature
Every state but one requires voters to provide their date of birth
46 states ask voters to provide their phone number
34 states ask voters to declare their gender
30 states ask voters to provide all or part of their Social Security number
Info collected (cont’d)
27 states require voters to select a party affiliation
14 states ask voters to provide their place of birth
11 states ask voters for their drivers’ license number
9 states ask voters to declare their race
Info collected (cont’d)
4 states ask voters if they need special assistance at the polls
3 states require voters to provide a parent’s name
2 states ask voters to provide an email address
1 state, Arizona, requires voters to state their occupation
What info is redacted?
11 states redact some or all of voters’ birthdates from voter rolls; 38 do not
5 states redact voters’ phone numbers; 41 do not
All but one of the 30 states that collect Social Security numbers redact these numbers before redistribution to secondary users
Redacted info (cont’d)
2 states redact voters’ birthplaces; 12 do not
6 states that collect voters’ drivers’ license numbers redact these numbers; 5 do not
27 states give certain voters the right to remove their records from voter lists obtained by secondary users
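Redaction and record removal before a list goes to secondary users might look like the sketch below. The field names, the `voter_id` key, and the opt-out set are hypothetical; each state's actual rules differ, as the counts above show.

```python
def export_for_secondary_users(records, redacted_fields, opted_out_ids):
    """Build the list handed to secondary users.

    Drops records of voters who exercised a removal right, and strips
    the redacted fields from every remaining record.
    """
    exported = []
    for record in records:
        if record["voter_id"] in opted_out_ids:
            continue  # voter removed from lists obtained by secondary users
        exported.append(
            {k: v for k, v in record.items() if k not in redacted_fields}
        )
    return exported


# Hypothetical voter roll with two entries.
roll = [
    {"voter_id": 1, "name": "Pat Doe", "birthdate": "1970-01-01", "ssn_last4": "1234"},
    {"voter_id": 2, "name": "Lee Roe", "birthdate": "1982-05-09", "ssn_last4": "5678"},
]

public_list = export_for_secondary_users(
    roll,
    redacted_fields={"birthdate", "ssn_last4"},
    opted_out_ids={2},
)
print(public_list)
```

The original records are left unchanged; only the exported copy is filtered, so election officials keep the full data for their own use.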
Access to voter lists granted to:
Candidates and political parties - all states
Juror source list - 43 states
Unrestricted access, including commercial uses - 22 states
Scholars and academics - 4 states
Journalists - 4 states
Distribution of information
Internet? - phones?
Denial of service
CD-ROMs - who has access