BLUEPRINT: Robust Prevention of Cross-site Scripting Attacks for Existing Browsers Mike Ter Louw V.N. Venkatakrishnan University of Illinois at Chicago.

1 BLUEPRINT: Robust Prevention of Cross-site Scripting Attacks for Existing Browsers Mike Ter Louw V.N. Venkatakrishnan University of Illinois at Chicago

2 Outline Intro to Cross-site Scripting Objective Approach Technical details Evaluation Related work

3 Cross-site scripting (XSS) A widespread web application vulnerability – In the last few weeks… – Time magazine Top 100 influential people poll defaced by XSS (Apr 2009) – Twitter XSS worm (Apr 2009) – McAfee web site attacked (May 2009) The #1 threat on the Internet (OWASP)


5 Problem: Malicious user created content! Benign comment Pete is… Malicious comment doEvil()

6 Our Objective To develop a robust defense for cross-site scripting attacks

7 Typical Web Application Goals Allow user-created content to be expressive, containing rich HTML content – Format text ( bold, italics ) – Hyperlinks ( … ) – Embedded images Prevent scripts in user-created content Today's web browsers and standards do not make it easy to meet both goals simultaneously

8 Content Isolation User-created content should always be treated as data, never as code Need to isolate user created content as data only

9 Content Isolation for Browsers Content isolation can be achieved for future browsers – Requires changes to standards and browser parser implementations – Standards / browser revision cycles may take several years Today's browsers remain vulnerable to XSS in the near term

10 Our Goal Construct a robust defense for cross-site scripting attacks that – permits rich HTML content – works on today's browsers configured to default settings, without requiring changes of any form, including patches, plug-ins, add-ons, etc.

11 Most popular defense: Content filtering Involves sanitization of untrusted HTML by removing script content – Mainly done using regular expressions / parsing HTML Absence of strong isolation facilities for HTML has made content filtering the current main line of defense

12 Problem with Content Filtering The web application's interpretation of sanitized content may differ from the browser's interpretation Example: +ADw-SCRIPT+AD4-attack(); Web application's understanding: raw text Browser's understanding (when the page is decoded as UTF-7): <SCRIPT>attack();
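The gap on this slide can be reproduced outside a browser. The sketch below is illustrative only: `looksScriptFree` is a hypothetical naive filter and `decodeUtf7` is a simplified decoder written for this example (it assumes Node.js for `Buffer`); neither is part of Blueprint.

```javascript
// Hypothetical naive server-side filter: accepts input with no literal "<script".
function looksScriptFree(html) {
  return !/<\s*script/i.test(html);
}

// Simplified UTF-7 decoder (assumption: handles only "+<base64>-" runs,
// which carry UTF-16BE code units, and "+-" as a literal plus sign).
function decodeUtf7(input) {
  return input.replace(/\+([A-Za-z0-9+/]*)-?/g, (match, b64) => {
    if (b64 === "") return "+";
    const bytes = Buffer.from(b64, "base64");
    let out = "";
    for (let i = 0; i + 1 < bytes.length; i += 2) {
      out += String.fromCharCode((bytes[i] << 8) | bytes[i + 1]);
    }
    return out;
  });
}

const payload = "+ADw-SCRIPT+AD4-attack();";

// The filter sees harmless text and lets the payload through.
const serverView = looksScriptFree(payload);

// A parser that treats the page as UTF-7 sees markup instead.
const browserView = decodeUtf7(payload);
```

The filter and the decoder disagree about the same bytes, which is the kind of mismatch the slide describes.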

13 The parsing gap [Figure: the server-intended parse tree next to the browser-generated parse tree; node labels shown: div, text, script] XSS Cheat Sheet provides approx. 100 examples of such browser quirks

14 Our Approach: Reproduce the server-intended parse tree of untrusted content on the browser [Figure: the same tree of div and text nodes on the server and on the browser] Challenge: Parsers on existing browsers are unreliable

15 The Blueprint Approach Take control of the content interpretation process on the browser – Avoid untrusted content parsing by browser No parsing of untrusted content by browser No scripts identified in untrusted content! Robust XSS Prevention

16 High level overview Generate a parse tree of untrusted content on the server – Remove script content by applying whitelist of known-static content types Automatically generate a (trusted) JavaScript program to reconstruct this parse tree on the browser

17 Approach Overview HTML parse tree via document.createElement() et al.

18 Problem: Transporting data without invoking the browser's parser Parse tree is constructed using both JavaScript code and data – Code constructs various tree nodes (e.g. ) – Data annotates tree nodes (e.g. text content) Exposing raw data to the browser parser may lead to unpredictable behavior Our Solution: Encode data using a safe alphabet – E.g. a-z – Transport encoded data to the JavaScript interpreter
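As a rough illustration of a safe-alphabet encoding, the sketch below maps every UTF-16 code unit to four letters from `a` to `p`. This particular scheme and the names `encodeSafe` / `decodeSafe` are assumptions made for the example; the slides do not specify the encoding Blueprint actually uses.

```javascript
// 16 symbols, one per 4-bit nibble. All are plain lowercase letters,
// so the encoded form contains nothing an HTML parser treats as syntax.
const ALPHABET = "abcdefghijklmnop";

// Server side: turn arbitrary text into letters only.
function encodeSafe(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i); // UTF-16 code unit, 0..65535
    for (let shift = 12; shift >= 0; shift -= 4) {
      out += ALPHABET[(unit >> shift) & 0xf];
    }
  }
  return out;
}

// Browser side (trusted script): recover the original text as a string value.
function decodeSafe(encoded) {
  if (!/^([a-p]{4})*$/.test(encoded)) {
    throw new Error("not a valid safe-alphabet encoding");
  }
  let out = "";
  for (let i = 0; i < encoded.length; i += 4) {
    let unit = 0;
    for (let j = 0; j < 4; j++) {
      unit = (unit << 4) | ALPHABET.indexOf(encoded[i + j]);
    }
    out += String.fromCharCode(unit);
  }
  return out;
}

const encoded = encodeSafe("<script>attack()</script>");
const decoded = decodeSafe(encoded);
```

The decoded value exists only as a JavaScript string, so it can be handed to calls such as `createTextNode()` without passing through the HTML parser.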

19 Transporting data [Figure labels: HTML parse tree via document.createElement() et al.; Text node; Plain text; String variable]

20 DOM API used document: createElement(), createTextNode(), getElementById() element: appendChild(), insertBefore(), parentNode, removeChild(), setAttribute(), style[ ], style.setExpression()

21 Instrumenting web application with Blueprint

22 Transformed web application output

23 XSS Vector II: Cascading Style Sheets

24 CSS without XSS Use style object to apply style rules: style['width'] = decode( untrusted ); Dynamic properties not allowed by whitelist: style['behavior'] = … style['-moz-binding'] = …
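A minimal sketch of the property whitelist on this slide follows. The contents of `STATIC_CSS_PROPERTIES` and the function name `applyStyleRules` are hypothetical, and a plain object stands in for `element.style` so the example runs outside a browser.

```javascript
// Hypothetical whitelist of CSS properties known to be static.
const STATIC_CSS_PROPERTIES = new Set(["width", "height", "color", "font-weight"]);

// Apply each rule through the style object; return the properties refused.
function applyStyleRules(style, rules) {
  const rejected = [];
  for (const [property, value] of Object.entries(rules)) {
    const name = property.toLowerCase();
    if (STATIC_CSS_PROPERTIES.has(name)) {
      // Assigned as a value; no CSS text is concatenated or re-parsed here.
      style[name] = value;
    } else {
      rejected.push(name);
    }
  }
  return rejected;
}

const style = {}; // stand-in for element.style
const rejected = applyStyleRules(style, {
  width: "100px",
  behavior: "url(evil.htc)",
  "-moz-binding": "url(evil.xml#x)",
});
```

In the approach described by the slides, the values would first be recovered with `decode( untrusted )` before being assigned.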

25 CSS expression vector Any static property can be promoted to dynamic via expression() syntax: style['width'] = expression( attack() ); Threat exists only on Internet Explorer IE has no DOM interface to directly force static value

26 Protection against CSS expressions Use setExpression( … ) to apply style rules Forces all CSS rules to be dynamic Trusted script invoked to retrieve property value Script looks up untrusted value in array, then returns it Returned value observed to be static Evaluated unobfuscated expression() for all allowed CSS properties
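The indirection described on this slide might look roughly like the sketch below. The names `untrustedValues`, `blueprintLookup`, and `applyDynamicRule` are hypothetical, and `setExpression()` is a legacy Internet Explorer API, so a stub object stands in for `element.style` here.

```javascript
// Untrusted values are stored as data and referenced only by index.
const untrustedValues = [];

// Trusted script that the CSS expression invokes to fetch the value.
function blueprintLookup(index) {
  return untrustedValues[index];
}

// Register a rule so that the expression text contains only the trusted
// function name and an integer, never the untrusted value itself.
function applyDynamicRule(style, property, untrustedValue) {
  const index = untrustedValues.push(untrustedValue) - 1;
  style.setExpression(property, "blueprintLookup(" + index + ")");
  return index;
}

// Stub that records the call; a real IE style object would evaluate it.
const recorded = {};
const stubStyle = {
  setExpression(property, expression) {
    recorded[property] = expression;
  },
};

const index = applyDynamicRule(stubStyle, "width", "expression(attack())");
```

The slide reports that the value returned by the trusted script was observed to be treated as static; the stub cannot confirm that browser behavior, only the shape of the call.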

27 XSS vector III: Uniform Resource Identifiers (URI)

28 URI URI scheme indicates static / dynamic nature Static: https:, ftp:, mailto: Dynamic: javascript: No direct interface to URI parser to enforce a particular (whitelisted) scheme We use a 3-tiered defense

29 Evaluation

30 Effectiveness at preventing XSS attacks on existing browsers Compatibility with common use cases Performance overhead on server and browser

31 Browser evaluation 8 browsers tested: Chrome 1, Firefox 2, Firefox 3, Internet Explorer 6, Internet Explorer 7, Opera 9.6, Safari 3.1, Safari 3.2 Total over 96% market share of browsers in active use

32 Defense effectiveness XSS Cheat Sheet [Ha09] 94 XSS attack examples Designed to target server-side defenses Embedded in several syntactic contexts Developed automated test platform Identified which attacks successful on which browser Evaluated defense effectiveness All 94 attacks successfully defended on all 8 evaluated browsers

33 Compatibility Modified source code for two popular web applications: WordPress MediaWiki Modified output of two popular websites NY Times blog

34 WordPress (compatibility) Added protection for 3 low integrity outputs (per user comment to blog article) Name (plain text) Website link (anchor element) Comment body (mixed HTML) Allows testing of pages with hundreds of (relatively simple) models Tested real-world blogs, comments No negative compatibility impact observed

35 MediaWiki (compatibility) Added protection for 2 low integrity outputs Article (i.e., web page) title Article content Allows testing of large, complex models Tested Featured article from Wikipedia Content rendered very faithfully to original Problems: not in whitelist Relocate trusted script

36 Performance overhead measurements Server page generation latency Browser memory overhead Browser page rendering latency Combined effect of server and browser latencies

37 WordPress page generation latency Measured significant overhead Partly due to redundant content filter (KSES)

38 MediaWiki page generation latency Better performance than WordPress Redundant intermediate HTML stage

39 Client memory overhead Minor overhead

40 WordPress page rendering latency

41 MediaWiki page rendering latency

42 User experience impact of combined latencies Tested with Firefox 2 (mid-road performance) WordPress with 100 blog comments Low perception of delays for common case

43 Related Work Server-side (XSS-Guard, NeatHTML) – Prevent injected scripts in final output – Vulnerable to attacks exploiting parsing differences Client-side (NoMoXSS, Noxes) – Identification and prevention of data leaks – Cannot detect XSS within same origin Black box / proxy (XSS-DS, Taint inference) – Server: Detect and prevent reflected scripts – Client: Detect and prevent data leaks

44 Related work (cont.) Server and browser collaboration (BEEP, DSI, Noncespaces) – Server: Identify policy regions and declare policies – Client: Enforce policies over policy regions – Require browser changes Systems supporting benign scripts in user-created content – Caja, Web Sandbox, Facebook – Complementary to our approach

45 Conclusion Cross-site scripting attacks can be prevented entirely if browsers and web applications can come to a common understanding of the structure of untrusted content Blueprint facilitates this goal and provides a novel defense for XSS Project page: –

46 References [Ha09] Hansen, Robert. XSS Cheat Sheet [Di07] Di Paola, Stefano. Preventing XSS with Data Binding

47 XSS Detail Challenge for attacker: Embed content the browser will interpret as script Many vectors – Script tags attack(); – Script attributes: onmousemove=attack(); – CSS Style rules: width: expression( attack() ); – URI: src=javascript:void attack()

48 Encoding Search engine optimization (SEO) Screen readers View source Solutions: – Less destructive encoding – Modify reader – Add feature to browser

49 Dynamic attacks UCC added to a page dynamically must also be protected Current implementation requires remote procedure call (via XHR / AJAX) to request model Blueprint can ensure a base document free of user-embedded scripts Trusted code must then take precautions to maintain security

50 Whitelist Whitelist can be site-specific Whitelist can be grown, gradually adding content known to be static Used off-the-shelf whitelist from HTMLPurifier

51 URI Defense 3-tiered defense: 1. Character-level whitelist Only allow syntactically-inert untrusted chars 2. Parse behavior sensing a.protocol DOM property [Di07] Assumes URI parsing same for all contexts a.href, img.src, url() 3. Impact mitigation Rewrite URI pointing to redirection service Attacks execute in different origin, void of sensitive data
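The three tiers could be combined along the lines of the sketch below. It is an illustration under assumptions: the character class, the scheme list, the redirector address `https://redirect.example/?to=`, and the function name `defendUri` are all hypothetical, and the WHATWG `URL` class stands in for reading the `a.protocol` DOM property so that the example runs under Node.js.

```javascript
// Tier 1: hypothetical character-level whitelist (no quotes, parentheses,
// angle brackets, whitespace, or control characters).
const SAFE_URI_CHARS = /^[A-Za-z0-9:\/?#&=._~%+@-]*$/;

// Hypothetical whitelist of static schemes.
const ALLOWED_SCHEMES = new Set(["http:", "https:", "ftp:", "mailto:"]);

// Returns a rewritten URI, or null if the untrusted URI is refused.
function defendUri(untrusted, redirector = "https://redirect.example/?to=") {
  // Tier 1: reject any character outside the inert set.
  if (!SAFE_URI_CHARS.test(untrusted)) return null;

  // Tier 2: ask a URI parser which scheme it sees. In a browser this would
  // be the protocol property of an anchor element.
  let protocol;
  try {
    protocol = new URL(untrusted).protocol;
  } catch {
    return null; // not an absolute URI; this sketch refuses it
  }
  if (!ALLOWED_SCHEMES.has(protocol)) return null;

  // Tier 3: point the link at a redirection service on another origin.
  return redirector + encodeURIComponent(untrusted);
}

const accepted = defendUri("https://example.com/a?b=1");
const refusedByChars = defendUri("javascript:attack()");
const refusedByScheme = defendUri("javascript:void0");
```

Refusing relative URIs is a simplification of this sketch, not something the slides state.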

52 Eliminate dependency on browser parser Transform user-created content into static content models on web server – Model reflects approved content parse tree Propagate static content models into JavaScript interpreter of web browser Reconstruct server-approved parse tree using client-side model interpreter

53 Create static content model Parse untrusted HTML Prune resulting parse tree in accordance with whitelist of known-static node types Serialize parse tree into stream of benign data characters Wrap in … tags Attach trusted script for invoking model interpreter
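The pruning and serialization steps can be sketched as follows. The tree shape (`tag`, `attrs`, `children`, `text`), the whitelist contents, and the hex serialization are assumptions chosen for brevity; they are not taken from Blueprint or HTMLPurifier. The example assumes Node.js for `Buffer`.

```javascript
// Hypothetical whitelists of known-static node and attribute types.
const ALLOWED_TAGS = new Set(["div", "p", "b", "i"]);
const ALLOWED_ATTRS = new Set(["title"]);

// Return a copy of the tree containing only whitelisted nodes and attributes.
// A non-whitelisted element is dropped together with its subtree.
function prune(node) {
  if (typeof node.text === "string") return { text: node.text };
  if (!ALLOWED_TAGS.has(node.tag)) return null;
  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    if (ALLOWED_ATTRS.has(name)) attrs[name] = value;
  }
  const children = (node.children || []).map(prune).filter((c) => c !== null);
  return { tag: node.tag, attrs, children };
}

// Serialize the model into characters 0-9 and a-f only.
function serializeModel(tree) {
  return Buffer.from(JSON.stringify(tree), "utf8").toString("hex");
}

// Parse tree of an untrusted comment, as a server-side parser might produce it.
const untrustedTree = {
  tag: "div",
  attrs: { title: "comment", onmousemove: "attack()" },
  children: [
    { text: "Pete is " },
    { tag: "b", attrs: {}, children: [{ text: "here" }] },
    { tag: "script", attrs: {}, children: [{ text: "doEvil()" }] },
  ],
};

const pruned = prune(untrustedTree);
const model = serializeModel(pruned);
```

The serialized model would then be wrapped and shipped with the trusted script that invokes the model interpreter.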

54 Model interpreter Interprets model as stream of declarative statements Uses reliable DOM API to generate content – document.createElement( … ) – element.appendChild( … ) Enforces server-intended parse tree in browser
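A model interpreter in this spirit might look like the sketch below. It walks a tree-shaped model recursively rather than a stream of declarative statements, which is a simplification; the model shape and the name `buildNode` are assumptions. A small stand-in object replaces `document` so the example runs under Node.js; in a browser, the real `document` would be passed instead.

```javascript
// Rebuild the server-approved tree using DOM API calls only.
// No untrusted string is ever handed to the HTML parser.
function buildNode(doc, model) {
  if (typeof model.text === "string") {
    return doc.createTextNode(model.text);
  }
  const element = doc.createElement(model.tag);
  for (const [name, value] of Object.entries(model.attrs || {})) {
    element.setAttribute(name, value);
  }
  for (const child of model.children || []) {
    element.appendChild(buildNode(doc, child));
  }
  return element;
}

// Minimal stand-in for the DOM, just enough to observe the calls.
const fakeDocument = {
  createElement: (tag) => ({
    tag,
    attrs: {},
    children: [],
    setAttribute(name, value) {
      this.attrs[name] = value;
    },
    appendChild(child) {
      this.children.push(child);
      return child;
    },
  }),
  createTextNode: (text) => ({ text }),
};

const model = {
  tag: "div",
  attrs: { title: "comment" },
  children: [{ text: "<script>doEvil()</script>" }],
};

const root = buildNode(fakeDocument, model);
```

Because the markup-looking string is passed to `createTextNode()`, it ends up as the content of a text node rather than as a script element.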
