Presentation on theme: "Privacy & Stylometry: Practical Attacks Against Authorship Recognition Techniques Michael Brennan and Rachel Greenstadt"— Presentation transcript:
Privacy & Stylometry: Practical Attacks Against Authorship Recognition Techniques Michael Brennan and Rachel Greenstadt
Overview 2 Introduction to Authorship Recognition The Threat to Privacy & Anonymity Experimental Results Attacking Authorship Recognition Study Setup Methodology Results Updated Work Translation Attacks Data Set Release Building an Attack Corpus Future Work
What is Authorship Recognition? The basic question: “who wrote this document?” Stylometry: The study of attributing authorship to documents based only on the linguistic style they exhibit. “Linguistic Style” Features: sentence length, word choices, syntactic structure, etc. Handwriting, content-based features, and contextual features are not considered. In this presentation, stylometry and authorship recognition are used interchangeably.
Stylometry: A Brief History The classic stylometry problem: The Federalist Papers. 85 anonymous papers to persuade ratification of the Constitution. 12 of these have disputed authorship. Stylometry has been used to show Madison authored the disputed documents. Used as a data set for countless stylometry studies. Modern Stylometry Based in Machine Learning SVMs, Genetic Algorithms, Neural Networks, Bayesian Classifiers… used extensively.
Why is Stylometry Important? Great, you can figure out who wrote a 200 year old document, so what? From the Institute for Linguistic Evidence: “In some criminal, civil, and security matters, language can be evidence… When you are faced with a suspicious document, whether you need to know who wrote it, or if it is a real threat or real suicide note, or if it is too close for comfort to some other document, you need reliable, validated methods.” Plagiarism, Forensics, Anonymity…
Stylometry: the Threat to Privacy 6 Good techniques for location privacy (Tor, Mixes, etc). But it may be insufficient! Stylometry can identify authors based on their writing. Can anonymous authors defend against this? ~6500 words to leak identity – Rao, Rohatgi “The Multidisciplinary Requirement for Privacy” – Carlisle Adams
Supervised Stylometry 7 Given a set of documents of known authorship, classify a document of unknown authorship. Classifier trained on undisputed text. Scenario: Alice the Anonymous Blogger vs. Bob the Abusive Employer. Alice blogs about abuses at Bob’s company. Blog posted anonymously (Tor, pseudonym, etc). Bob obtains words of each employee’s writing. Bob uses stylometry to identify Alice as the blogger.
Unsupervised Stylometry 8 Given a set of documents of unknown authorship, cluster them into author groups. No existing author information, “Similarity Detection” Scenario: Anonymous Forum vs. Oppressive Government. Participants organize protests. Posts are completely unlabeled (no pseudonyms) Unknown organizational structure, number of authors, etc. The government applies unsupervised stylometric techniques. # of authors may be discovered, author profiles created. Fed into supervised stylometry system to identify individuals.
The Threat in Numbers (Unsupervised) 9 Experiments with “Forum” Scenario 9 Anonymous authors, raw forum data: Only 35% Accuracy. Posts words long 5 authors, same data set, modified for longer posts: 88.2% Accuracy Passages words long Other Research on 50 Unique Authors (WritePrints): Unsupervised “Pairing”: 95%
The Threat in Numbers (Supervised) 10 “Workplace” Scenario is Well Researched Supervised Classification of 2-5 Authors: 79-99% Accuracy (IAAI ‘09). Random sets taken from 15 subject pool. 3 Different Methods Used Other Research on 50 Unique Authors (WritePrints): Supervised: 90%
Protecting Privacy: Attacking Stylometry 11 Problem Can stylometry be attacked? How so? How Easily? Evaluation In depth study on attacking multiple methods of Stylometry. Conclusion Stylometry is very vulnerable to attack by inexperienced human adversaries. Attacks can be used to protect privacy.
Related Work Adversarial stylometry is not a well-researched part of the field. Somers & Tweedie, 2003: Authorship Attribution & Pastiche. Mixed Results. Kacmarcik & Gamon, 2006: Obfuscating Document Stylometry to Preserve Anonymity. Encouraging results. Our goal: Assess the general vulnerability of authorship recognition systems when under attack by human adversaries. Results Published in “Practical Attacks Against Authorship Recognition Techniques.” (IAAI '09)
We’re Under Attack! Obfuscation Attack An author attempts to write a document in such a way that their personal writing style will not be recognized. Imitation Attack An author attempts to write a document such that the writing style will be recognized as that of another specific author.
Study Setup & Format 3 representative methods of stylometry. 15 Individual Authors. Participation had three parts: Submit 5000 words of pre-existing writing from a formal source. Write a new 500 word passage as an obfuscation attack. Task: Describe your neighborhood. Write a new 500 word passage as an imitation attack. Task: Imitate Cormac McCarthy, describe your day. Authors had no formal training or knowledge in linguistics or stylometry.
Imitating Cormac McCarthy “On the far side of the river valley the road passed through a stark black burn. Charred and limbless trunks of trees stretching away on every side. Ash moving over the road and the sagging hands of blind wire strung from the blackened lightpoles whining thinly in the wind.”
Imitation Attack Examples “Light sliced through the blinds, and construction began in the adjacent apartment. The harsh cacophony crashed through the wall.” “Hot water in the mug. Brush in the mug. The blade read ‘Wilkinson Sword’ on the layered wax paper packaging.” “He fills the coffee pot with water, after cleaning out the putrid remains of yesterday's brew. The beans are in the freezer, he remembers.”
Methodology & Evaluation Validation Experiment: Verify accuracy claims of all 3 methods. Attack Experiment: Determine accuracy when attacked using the same experimental conditions. Experimental Conditions: Each test was applied to four random sets of 2, 3, 4, and 5 distinct authors. Same sets used in each method. Cross-validation used to measure accuracy (repeated sub sampling and leave-one-out).
Machine Learning: Quick Introduction Machine Learning is AI Learn & recognize patterns automatically, make decisions. Most common incarnation of ML: Classifiers Spam Filter: Bayesian Classifier Given a message, what is the probability that it is spam or not spam given the extracted features. Important to pick good features.
A Closer Look at Linguistic Features Basic Measurements: Average syllable/word/sentence count, letter distribution, punctuation. Lexical Density Unique_Words / Total_Words Gunning-Fog Readability Index: 0.4 * ( Average_Sentence_Length * Complex_Word_Ratio ) Result: years of formal education required to read the text.
Stylometry Methods Method 1: Signature Stylometric System. Three features: word length, letter usage, punctuation usage. 95% base accuracy.
Stylometry Methods Method 2: Neural Network Classifier Based on Singh & Holmes, 9 linguistic features. Lexical density, unique word count, character counts with and without spaces, two readability indexes, sentence count, average sentence length, and average syllables per word. 78.5% base accuracy. NN: simple switches, complex overall system. Input Output Hidden
Stylometry Methods Method 3: Synonym-Based Classifier. Based on Clark & Hannon, Examines word choice when presented with all possible synonyms % accuracy. (Diagram from “A Classifier System for Author Recognition Using Synonym-Based Features” by Clark & Hannon)
Accuracy on Training Set All three methods were highly accurate in recognizing the correct author of the training set.
Obfuscation Attack Detection Obfuscation attacks brought the accuracy of each system to that of random chance or worse.
Imitation Attack Detection Imitation attacks all but completely circumvented all three authorship recognition methods.
Imitation Attack Success The imitation attacks were very successful in targeting the intended victim of the attack.
Questioning the Results Why are some authors more susceptible to attack? Why did certain methods perform better than others? Would these results hold up in a stricter domain? What if the participants were skilled in linguistics or stylometry? What features and methods could offer some resistance to these attacks?
Translation Attacks Common suggestion: run some text through a translator and back! Limited testing show this to be ineffective. Translation Test Setup: Five Subjects, (88-98% base accuracy) Single passage translated to/from German and Japanese. Tested on Neural Network & Synonym Methods 100% effectiveness in identifying true author.
Using Attacks to Improve Privacy 29 What you can do: Important: These these methods failed, others might not. Best: Try to imitate someone else. Write less. What is needed A tool that will assist in anonymizing writing styles.
Building a Tool What a style-modifying tool needs: A variety of methods & features. A large corpus of existing authors. AI-Assisted (Human oversight to preserve content) Open Source Non-Stylometry Features: Contextual Clue Identifiers (Time, Content)
Building A Corpus IRB Approved Survey: dragonauth.com We need your help! Mandatory for participation: 6000 words untampered writing 1 Obfuscation Attack 1 Imitation Attack Optional: Demographic Information Verification (untampered) passage Second Imitation Attack
Building A Corpus Early Stages of Data Collection Please Tell Others! English Targeted (For Now) Suggestions for imitation authors for other languages are welcome To participate in another language, do the imitation attack for an author of your choosing & let us know who it is. We aren't web developers!
Releasing the Original Data Set 8 Authors Consented (post-IRB approval) Available Jan 6th at psal.cs.drexel.edu Contains: Original sample passages 1 Obfuscation Attack 1 Imitation Attack
Future Work Testing more stylometric methods. Imitation attacks against other authors. Effectiveness in unique domains (aggregated short messages) Extracting additional information (demographic) Better demonstration of effective unsupervised stylometry.
Conclusion Not the end of stylometry in sensitive areas New methods should test for adversarial threats. Stylometry is useful, but can also present a threat to privacy. Attacking stylometry to preserve privacy has high potential. Potential for arms race: Developing attack-resistant methods of stylometry vs. creating new attacks to preserve privacy.
Questions? Contact: Mike Brennan: Rachel Greenstadt: psal.cs.drexel.edu Interested in learning more? “Writing Style Fingerprint Tool Easily Fooled” - New Scientist, August “Can Pseudonymity Really Guarantee Privacy?” – Josyula Rao, Pankaj Rohatgi “Writeprints: A stylometric approach to identity-level identification and similarity detection in cyberspace” – A. Abbasi, H. Chen
FAQ “When will your data set be released?” “I can think of features for a stylometry system that I bet these authors did not consider” “I know a person in a certain Three Letter Organization that has subtle ways of detecting authorship that would surely not be broken by these attacks.” “What about function words?”
Message Board Posts vs. Original Data
Study Setup: Detailed 3 representative methods of stylometry. 15 Individual Authors. Participation had three parts: Submit 5000 words of pre-existing writing from a formal source. Formal means school essays, professional reports, etc. No slang, abbreviations, casual conversation. Write a new 500 word passage as an obfuscation attack. Task: Describe your neighborhood. Write a new 500 word passage as an imitation attack. Task: Imitate Cormac McCarthy using a passage from The Road, write a third person narrative about your day starting from when you wake up. Authors had no formal training or knowledge in linguistics or stylometry.
Discussion Training Set Content and Size The amount of training text for each author not exceptionally large, but the number of authors is significantly larger than most studies. Allowed us to see interesting patterns, such as authors who were better at creating attacks. Certain authors particularly susceptible to obfuscation attacks. Do they have a “generic” writing style? Could be beneficial to study the effects of adversarial attacks in stricter domains. Participant Skill Level The lack of linguistic expertise of the participants strengthens our conclusions. It would be reasonable to expect authors with some level of expertise to do a better job at attacking these methods. Examining what features and methods might offer resistance to these attacks is a viable avenue of research.