
1 Flexible Hardware Acceleration for Instruction-Grain Program Monitoring
Shimin Chen
Joint work with Michael Kozuch (1), Theodoros Strigkos (2), Babak Falsafi (3), Phillip B. Gibbons (1), Todd C. Mowry (1,2), Vijaya Ramachandran (4), Olatunji Ruwase (2), Michael Ryan (1), Evangelos Vlachos (2)
(1) Intel Research Pittsburgh, (2) CMU, (3) EPFL, (4) UT Austin

2 Instruction-Grain Monitoring
Software often contains bugs
–Memory corruptions, data races, …, crashes
–Security attacks often designed to exploit bugs
Instruction-grain lifeguards can help
–Dynamic monitoring: during application execution
–Instruction-grain: e.g., memory access, data flow
Enables a wide range of powerful lifeguards
[Figure: Application, Lifeguard]

3 Example Instruction-Grain Lifeguards
AddrCheck [Nethercote’04]:
–Monitor malloc/free, memory accesses
–Check if all memory accesses visit allocated memory regions
MemCheck [Nethercote & Seward ’03 ’07]: AddrCheck + check uninitialized values
–Copying partially uninitialized structures is not an error
–Lazy error detection to avoid many false positives
–Track propagation of uninitialized values
TaintCheck [Newsome & Song’05]: detect overwrite-based security exploits
–Tainted data: data from network or disk
–Track propagation of tainted data to detect violations
LockSet [Savage et al.’97]: detect data races in parallel programs
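The AddrCheck bullets can be illustrated with a minimal sketch. It assumes a toy 4 KB address space and one allocation flag per byte; the names `on_malloc`, `on_free`, and `check_access` are hypothetical and are not taken from the deck or from any real lifeguard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy address space: the size and all names are illustrative assumptions. */
#define TOY_SPACE_BYTES 4096u

static uint8_t allocated[TOY_SPACE_BYTES]; /* 1 = byte lies inside a live allocation */

/* Handler for a malloc event: mark [addr, addr + size) as allocated. */
void on_malloc(uint32_t addr, uint32_t size) {
    assert(addr <= TOY_SPACE_BYTES && size <= TOY_SPACE_BYTES - addr);
    memset(&allocated[addr], 1, size);
}

/* Handler for a free event: mark [addr, addr + size) as unallocated. */
void on_free(uint32_t addr, uint32_t size) {
    assert(addr <= TOY_SPACE_BYTES && size <= TOY_SPACE_BYTES - addr);
    memset(&allocated[addr], 0, size);
}

/* Handler for a load/store event: true only if every accessed byte is allocated. */
bool check_access(uint32_t addr, uint32_t size) {
    if (addr > TOY_SPACE_BYTES || size > TOY_SPACE_BYTES - addr)
        return false;
    for (uint32_t i = 0; i < size; i++)
        if (!allocated[addr + i])
            return false;
    return true;
}
```

A 4-byte access that straddles the end of a 16-byte allocation is reported as an error because its last bytes are unallocated.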

4 Design Space of Support Platform
Axes: generality (specific lifeguard vs. general purpose: wide range of lifeguards) and performance (good vs. poor)
–Dynamic binary instrumentation (DBI): 10-100X slowdowns [Bruening’04] [Luk et al’05] [Nethercote’04]
–General-purpose HW improving DBI: 3-8X slowdowns [Chen et al’06] [Corliss’03]
–Lifeguard-specific hardware [Crandall & Chong’04], [Dalton et al’07], [Shetty et al’06], [Shi et al’06], [Suh et al’04], [Venkataramani’07], [Venkataramani’08], [Zhou et al’07]
–This paper

5 Outline
Introduction
Background
Three Hardware Acceleration Techniques
Experimental Evaluation
Conclusion

6 Example Lifeguard: TaintCheck [Newsome & Song’05]
Purpose: detect overwrite-based security exploits
–Metadata kept for application memory and registers
–Tainted data: data from network or disk
–Track taint propagation
–Detect violation: e.g., tainted jump target address
Application          TaintCheck Lifeguard
mov %eax ← A         taint(%eax) = taint(A)
mov B ← %eax         taint(B) = taint(%eax)
add %ebx ← D         taint(%ebx) |= taint(D)
jmp *(F)             if (taint(F)==1) error;
Detect exploit before attack code takes control
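The propagation rules on this slide can be written as a small C sketch. It assumes a toy memory of 256 bytes, two registers, and one taint byte per location; every name (`taint_input`, `mem_to_reg`, `reg_to_mem`, `dest_reg_op_mem`, `jump_target_is_tainted`) is a hypothetical stand-in for a lifeguard handler, not the actual TaintCheck code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: register set and memory size are illustrative assumptions. */
enum { REG_EAX, REG_EBX, NUM_REGS };
#define TOY_MEM_BYTES 256u

static uint8_t reg_taint[NUM_REGS];
static uint8_t mem_taint[TOY_MEM_BYTES];

/* Data arrived from the network or disk: mark the location tainted. */
void taint_input(uint32_t addr) {
    assert(addr < TOY_MEM_BYTES);
    mem_taint[addr] = 1;
}

/* mov reg <- mem: taint(reg) = taint(mem) */
void mem_to_reg(int rd, uint32_t src) {
    assert(rd >= 0 && rd < NUM_REGS && src < TOY_MEM_BYTES);
    reg_taint[rd] = mem_taint[src];
}

/* mov mem <- reg: taint(mem) = taint(reg) */
void reg_to_mem(uint32_t dst, int rs) {
    assert(rs >= 0 && rs < NUM_REGS && dst < TOY_MEM_BYTES);
    mem_taint[dst] = reg_taint[rs];
}

/* add reg <- mem: taint(reg) |= taint(mem) */
void dest_reg_op_mem(int rd, uint32_t src) {
    assert(rd >= 0 && rd < NUM_REGS && src < TOY_MEM_BYTES);
    reg_taint[rd] |= mem_taint[src];
}

/* jmp *(F): the jump is a violation if the target location is tainted. */
bool jump_target_is_tainted(uint32_t target_addr) {
    assert(target_addr < TOY_MEM_BYTES);
    return mem_taint[target_addr] != 0;
}
```

Tainting location A, moving it through `%eax` into B, and then jumping through B trips the check, while a jump through an untouched location does not.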

7 TaintCheck w/ Detailed Tracking [Newsome & Song’05]
TaintCheck:
–Detect violation
–1 taint bit / application byte
TaintCheck w/ detailed tracking:
–Construct taint propagation trail
–More detailed metadata per application location
  PC of instruction that tainted this location
  “tainted from” address
Not supported by previous lifeguard-specific HW
[Figure: taint propagation trail from Input to Violation]

8 Instruction-Grain Lifeguard Metadata Characteristics
Organization varies
–per application byte/word
–size, format, semantics vary greatly
Frequently updated
–e.g., propagation tracking
Frequently checked
–e.g., memory accesses

9 Lifeguard Support
General-Purpose HW improving DBI: event capture and delivery from the application (unmodified) to the lifeguard (software)
Lifeguard event handlers update and check metadata
–Rare events: e.g., malloc/free, system calls
–Frequent events: e.g., memory access, data movement
Performance bottlenecks: metadata mapping, updates, and checks

10 Our Contributions
Event capture and delivery from the application (unmodified) to the lifeguard (software); event handlers update and check metadata
–Rare events: e.g., malloc/free, system calls
–Frequent events: e.g., memory access, data movement
Three techniques:
–Metadata-TLB (M-TLB) for metadata mapping
–Inheritance Tracking (IT) for metadata updates
–Idempotent Filters (IF) for metadata checks

11 Outline
Introduction
Background
Three Hardware Acceleration Techniques
–Metadata-TLB
–Inheritance Tracking
–Idempotent Filters
Experimental Evaluation
Conclusion

12 Metadata-TLB: Motivation
Metadata per app byte/word
–Element size may vary
Two-level structure:
–Robustness & space efficiency
Mapping: application address → metadata address
–Frequently used in almost every handler
–Can be very costly
[Figure: metadata organized as a level-1 index and level-2 chunks]

13 Example (TaintCheck)
void dest_reg_op_mem_4B (UINT32 src_addr /*%eax*/, UINT32 dest_reg /*%edx */)
// app instruction type: dest_reg ← dest_reg op mem(src_addr)
// handler operation: reg_taint(dest_reg) |= mem_taint(src_addr)
  map *mp = level1_index[src_addr>>16];      mov %eax, %ecx
                                             shr $16, %ecx
                                             mov level1_index(,%ecx,4),%ecx
  int idx = (src_addr & 0xffff)>>2;          and $0xffff, %eax
                                             shr $2, %eax
  UChar mem_taint = mp[idx];                 movzbl (%ecx,%eax,1), %eax
  reg_taint[dest_reg] |= mem_taint;          or %al, reg_taint(%edx)
  nlba ();                                   nlba
Metadata mapping takes 5 out of 8 instructions!
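The handler takes a populated `level1_index` for granted. The following sketch shows one way the two-level structure could be built lazily, which is where the space efficiency comes from. It assumes the layout implied by the handler (32-bit addresses, the top 16 bits index level 1, one metadata byte per 4-byte word); `metadata_addr` and the allocate-on-first-touch policy are assumptions, not details given in the deck.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout, mirroring the handler: each level-2 chunk covers 64 KB of
   application memory with one metadata byte per 4-byte application word. */
#define L1_ENTRIES  (1u << 16)
#define CHUNK_ELEMS (1u << 14)   /* 0x10000 bytes / 4 bytes per word */

static uint8_t *level1_index[L1_ENTRIES];

/* Map an application address to the address of its metadata byte.
   The level-2 chunk is allocated (zeroed) on first touch, so untouched
   regions of the address space cost only a NULL pointer.
   Returns NULL if the allocation fails. */
uint8_t *metadata_addr(uint32_t app_addr) {
    uint32_t l1 = app_addr >> 16;
    if (level1_index[l1] == NULL) {
        level1_index[l1] = calloc(CHUNK_ELEMS, sizeof(uint8_t));
        if (level1_index[l1] == NULL)
            return NULL;
    }
    uint32_t idx = (app_addr & 0xffffu) >> 2;
    assert(idx < CHUNK_ELEMS);
    return &level1_index[l1][idx];
}
```

All four bytes of an aligned application word share one metadata byte, and the next word maps to the next byte of the same chunk.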

14 Our Solution: Metadata-TLB
A TLB-like HW associative lookup table
LMA (Load Metadata Address) instruction:
–Application address → lifeguard metadata address
Managed by (user-mode) lifeguard software
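A software model can make the M-TLB idea concrete. This is only a sketch under stated assumptions: the table is direct-mapped with 8 entries (the slide describes an associative table and gives no size), one entry covers 64 KB of application memory, and a miss falls back to a two-level software lookup. The names `lma`, `mtlb_fill`, and `lma_miss_count` are hypothetical; the real LMA is a hardware instruction whose exact semantics are not given in the deck.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MTLB_ENTRIES 8u    /* assumed table size */
#define PAGE_SHIFT   16u   /* assumed: one entry covers 64 KB of app memory */
#define ELEM_SHIFT   2u    /* assumed: one metadata byte per 4-byte word */

typedef struct {
    bool     valid;
    uint32_t app_page;     /* app_addr >> PAGE_SHIFT */
    uint8_t *meta_base;    /* start of the metadata chunk for that page */
} mtlb_entry;

static mtlb_entry mtlb[MTLB_ENTRIES];
static unsigned   mtlb_misses;
static uint8_t   *level1_index[1u << 16];   /* software two-level structure */

/* Lifeguard-managed fill: install a translation for one application page. */
void mtlb_fill(uint32_t app_page, uint8_t *meta_base) {
    assert(meta_base != NULL);
    mtlb_entry *e = &mtlb[app_page % MTLB_ENTRIES];
    e->valid = true;
    e->app_page = app_page;
    e->meta_base = meta_base;
}

/* Slow path used on a miss: find (or lazily allocate) the level-2 chunk. */
static uint8_t *lookup_or_alloc_chunk(uint32_t app_page) {
    if (level1_index[app_page] == NULL)
        level1_index[app_page] = calloc(1u << (PAGE_SHIFT - ELEM_SHIFT), 1);
    return level1_index[app_page];
}

/* Software model of LMA: application address -> metadata address. */
uint8_t *lma(uint32_t app_addr) {
    uint32_t page = app_addr >> PAGE_SHIFT;
    mtlb_entry *e = &mtlb[page % MTLB_ENTRIES];
    if (!e->valid || e->app_page != page) {      /* miss: refill from software */
        mtlb_misses++;
        uint8_t *base = lookup_or_alloc_chunk(page);
        if (base == NULL)
            return NULL;
        mtlb_fill(page, base);
    }
    uint32_t offset = (app_addr & ((1u << PAGE_SHIFT) - 1u)) >> ELEM_SHIFT;
    return e->meta_base + offset;
}

unsigned lma_miss_count(void) { return mtlb_misses; }
```

On a hit the translation is a table lookup plus one shift-and-add. Evicting an entry loses only the cached translation, since the metadata itself stays in the level-2 chunk.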

15 Example (TaintCheck) w/ M-TLB
void dest_reg_op_mem_4B (UINT32 src_addr /*%eax*/, UINT32 dest_reg /*%edx */)
// app instruction type: dest_reg ← dest_reg op mem(src_addr)
// handler operation: reg_taint(dest_reg) |= mem_taint(src_addr)
Without M-TLB:
  map *mp = level1_index[src_addr>>16];      mov %eax, %ecx
                                             shr $16, %ecx
                                             mov level1_index(,%ecx,4),%ecx
  int idx = (src_addr & 0xffff)>>2;          and $0xffff, %eax
                                             shr $2, %eax
  UChar mem_taint = mp[idx];                 movzbl (%ecx,%eax,1), %eax
  reg_taint[dest_reg] |= mem_taint;          or %al, reg_taint(%edx)
  nlba ();                                   nlba
With M-TLB:
  UChar *p = LMA_macro(src_addr);            LMA %eax, %ecx
  UChar mem_taint = *p;                      mov (%ecx), %al
  reg_taint[dest_reg] |= mem_taint;          or %al, reg_taint(%edx)
  nlba ();                                   nlba
Reduce handler size by half!

16 Inheritance Tracking: Motivation
Propagation tracking is expensive
–Metadata updates for almost every app instruction
Previous hardware solutions track propagation
–automatically update metadata in hardware
–Problem: only support simple metadata semantics
  e.g., do not support TaintCheck w/ detailed tracking
Our goal: flexibility AND performance
Idea: inheritance structure is common, so let’s track inheritance in hardware!

17 Problem with General Inheritance Tracking
Application       Propagation Tracking        Inheritance Tracking
mov %eax ← A      taint(%eax) = taint(A)      %eax inherits from A
mov B ← %eax      taint(B) = taint(%eax)      B inherits from %eax
add %ebx ← D      taint(%ebx) |= taint(D)     insert D into %ebx’s inherit-from list
Problem: state explosion for binary operations!

18 Unary Inheritance Tracking
Many lifeguards can take advantage of unary IT:
–MemCheck
–TaintCheck
Large performance improvements if used
–Can be disabled if unary IT does not match the lifeguard
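A rough software model of unary inheritance tracking follows. It is a conservative simplification, not the state machine from the paper: a register either inherits from one memory address or holds metadata the lifeguard already knows about, non-unary operations first materialize any pending inheritance, and a store to an address flushes registers that still inherit from it. All accesses are assumed to be aligned and of the same size, and every identifier is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { REG_EAX, REG_EBX, NUM_REGS };
typedef enum {
    EV_MEM_TO_REG, EV_REG_TO_MEM, EV_MEM_TO_MEM, EV_DEST_REG_OP_MEM
} event_kind;

#define MAX_EVENTS 64u

/* IT table: per register, an optional "inherits from address" record. */
typedef struct { bool inherits; uint32_t from; } it_entry;

static it_entry   it_table[NUM_REGS];
static event_kind delivered[MAX_EVENTS];   /* events actually sent to the lifeguard */
static unsigned   n_delivered;

static void deliver(event_kind k) {
    assert(n_delivered < MAX_EVENTS);
    delivered[n_delivered++] = k;
}

/* Materialize a pending inheritance as an ordinary mem_to_reg event. */
static void flush_reg(int r) {
    if (it_table[r].inherits) {
        deliver(EV_MEM_TO_REG);
        it_table[r].inherits = false;
    }
}

/* Before addr is overwritten, flush every register still inheriting from it. */
static void flush_conflicts(uint32_t addr) {
    for (int r = 0; r < NUM_REGS; r++)
        if (it_table[r].inherits && it_table[r].from == addr)
            flush_reg(r);
}

/* mov rd <- mem: absorbed by the IT table, nothing is delivered. */
void app_mem_to_reg(int rd, uint32_t src) {
    it_table[rd].inherits = true;
    it_table[rd].from = src;
}

/* mov mem <- rs: becomes a single mem_to_mem event if rs still inherits. */
void app_reg_to_mem(uint32_t dst, int rs) {
    flush_conflicts(dst);
    deliver(it_table[rs].inherits ? EV_MEM_TO_MEM : EV_REG_TO_MEM);
}

/* rd <- rd op mem: not unary, so fall back to ordinary propagation. */
void app_dest_reg_op_mem(int rd, uint32_t src) {
    (void)src;   /* event operands are omitted in this sketch */
    flush_reg(rd);
    deliver(EV_DEST_REG_OP_MEM);
}

unsigned delivered_count(void) { return n_delivered; }

event_kind delivered_kind(unsigned i) {
    assert(i < n_delivered);
    return delivered[i];
}
```

In this model a load followed by a store of the same register collapses into one `mem_to_mem` event, which is the saving unary IT targets. The paper's design appears to do better on non-unary operations than this conservative fallback.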

19 Tracking Register Inheritance
[Figure: original event; IT table for registers with entries IT(%rs) and IT(%rd); state transition & event to deliver; transformed event; deliver event]
More details in the paper:
–IT table and state transition table details
–Conflict detection

20 Example
Application       Events before Inheritance Tracking
mov %eax ← A      mem_to_reg
mov B ← %eax      reg_to_mem
mov %ebx ← C      mem_to_reg
add %ebx ← D      dest_reg_op_mem
mov E ← %ebx      reg_to_mem
Events with Inheritance Tracking: mem_to_mem, imm_to_mem
Can significantly reduce metadata update events!

21 Idempotent Filters: Idea
Typically, metadata checks give the same result if
–Event parameters are the same and
–Metadata are the same
Idea: filter out idempotent (redundant) events
For example:
–AddrCheck:
  After checking that a memory location is allocated
  Subsequent loads/stores to the same location are safe
  Until the next free() event
–LockSet: (surprisingly)
  In between synchronization events (e.g., lock/unlock)
  Check first load to a location
  Check first store to a location
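The filtering idea can be sketched as a small cache of recently checked events that is cleared by invalidating events such as free() or lock/unlock. The sketch assumes a direct-mapped table of 64 entries keyed on (access kind, address), which matches the LockSet rule of checking the first load and the first store separately and is merely conservative for AddrCheck. The names `if_should_deliver` and `if_invalidate_all` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define IF_ENTRIES 64u   /* assumed filter size */

typedef enum { EV_LOAD, EV_STORE } access_kind;

typedef struct {
    bool        valid;
    access_kind kind;
    uint32_t    addr;
} if_entry;

static if_entry filter[IF_ENTRIES];

/* Returns true if the check event must reach the lifeguard, false if an
   identical event was already checked since the last invalidation. */
bool if_should_deliver(access_kind kind, uint32_t addr) {
    assert(kind == EV_LOAD || kind == EV_STORE);
    if_entry *e = &filter[(addr ^ (uint32_t)kind) % IF_ENTRIES];
    if (e->valid && e->kind == kind && e->addr == addr)
        return false;               /* redundant: filter it out */
    e->valid = true;                /* remember this event, evicting any other */
    e->kind = kind;
    e->addr = addr;
    return true;
}

/* Invalidating event (e.g., free() for AddrCheck, lock/unlock for LockSet):
   metadata may have changed, so earlier check results no longer hold. */
void if_invalidate_all(void) {
    memset(filter, 0, sizeof filter);
}
```

A conflict eviction only causes an extra (still correct) check to be delivered, so the filter can be small without affecting what the lifeguard detects.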

22 Outline
Introduction
Background
Three Hardware Acceleration Techniques
Experimental Evaluation
–Log-Based Architectures (LBA)
–Simulation Study (w/ reduced input sets)
–PIN-based Analysis (w/ full inputs)
Conclusion

23 Log-Based Architectures
Log-Based Architecture (LBA): event capture and delivery from the application (unmodified) to the lifeguard (software)
Lifeguard event handlers update and check metadata
–Rare events: e.g., malloc/free, system calls
–Frequent events: e.g., memory access, data movement

24 Idea: Exploiting Chip Multiprocessors
[Figure: chip multiprocessor drawn as a grid of processors (P) with LBA components]

25 Simulation Setup: Dual-Core LBA System
[Figure: Core 1 (application) and Core 2 (lifeguard) connected by log transport (e.g., L2 cache); labeled stages: capture, compress, decompress, dispatch; IT & IF; M-TLB]
Operating System: Fedora Core 5
Application and lifeguard are processes
Application is stalled when log buffer is full
Model a 2-level cache hierarchy
Extend Virtutech Simics

26 Overall Performance: TaintCheck
[Figure: LBA baseline vs. LBA optimized; callout: 1.36X]
Slowdown = (application execution time w/ lifeguard) / (application execution time w/o lifeguard)
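The slowdown factor relates to a percentage overhead as sketched below, assuming overhead is defined as slowdown minus one, which is how the percentage range quoted in the conclusion appears to line up with the slowdown factors. The symbols $T_{\text{with}}$ and $T_{\text{without}}$ are notation introduced here for the two execution times.

```latex
\[
\text{slowdown} = \frac{T_{\text{with}}}{T_{\text{without}}},
\qquad
\text{overhead} = \text{slowdown} - 1
\]
\[
\text{slowdown} = 1.36 \;\Rightarrow\; \text{overhead} = 0.36 = 36\%
\]
```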

27 Applying Our Techniques One by One
Average slowdowns:
Lifeguard                         BASE   MTLB   MTLB+IF   MTLB+IT   MTLB+IT+IF
AddrCheck                         3.23   1.90   1.02      -         -
MemCheck                          7.80   6.05   -         3.81      3.27
TaintCheck                        3.36   2.29   -         1.36      -
TaintCheck w/ detailed tracking   4.21   2.71   -         1.51      -
LockSet                           4.25   3.20   1.40      -         -
IT, IF, and M-TLB are indeed complementary
Achieve dramatically better performance

28 PIN-Based Analysis: IT
IT removes 35.8% to 82.0% of the propagation events

29 PIN-Based Analysis: IF
IF can effectively reduce check events
4-way works as well as fully-associative

30 Conclusion
Our focus: Instruction-Grain Lifeguards
Three complementary hardware techniques:
–Metadata-TLB (M-TLB)
–Inheritance Tracking (IT)
–Idempotent Filters (IF)
Flexible to support a wide range of lifeguards
–Reducing overheads by 2-3X in our experiments
–Achieving 2-51% overheads for all but MemCheck

31 Thank you!

32 People Working on LBA Project
Intel Research: Shimin Chen, Phillip B. Gibbons
University Faculty: Babak Falsafi (EPFL), Todd C. Mowry (CMU)
CMU Students: Michelle Goodstein, Olatunji Ruwase
Previous Contributors: Limor Fix (IRP), Steve Schlosser (IRP), Anastasia Ailamaki (CMU), Greg Ganger (CMU), Bin Lin (Northwestern), Radu Teodorescu (UIUC)
Theodoros Strigkos, Evangelos Vlachos, Vijaya Ramachandran (UT Austin), Mike Kozuch, Michael Ryan

