1 RAMP Jan’08 Raksha & Atlas: Prototyping & Emulation at Stanford Christos Kozyrakis work done by S. Wee, N. Njoroge, M. Dalton, H. Kannan Computer Systems.

Slide 1: Raksha & Atlas: Prototyping & Emulation at Stanford
RAMP Jan’08
Christos Kozyrakis
Work done by S. Wee, N. Njoroge, M. Dalton, H. Kannan
Computer Systems Laboratory, Stanford University

Slide 2: Outline
- Raksha: prototyping security architectures
  - Raksha goals
  - Generations of Raksha prototypes
  - Experience & lessons
- Atlas: emulating transactional memory architectures
  - Atlas goals
  - Architecture overview
  - New programmability features
  - Experience & lessons

Slide 3: Raksha Goals
- Architectural support for software security
  1. Protect existing software from attacks
     - Prevent buffer overflows, SQL injections, …
     - Based on dynamic information flow tracking (DIFT)
  2. Reduce trusted code base (TCB) for new software
     - Simplify design & verification of security guarantees
     - Using word-granularity protection on physical memory
- Robust, flexible, practical, end-to-end, fast

Slide 4: Raksha Architecture, Version 1
[Figure: processor pipeline (PC, I-Cache, Decode, RegFile, ALU, D-Cache, Traps, WB) extended with Policy Decode, Tag ALU, and Tag Check blocks]
- Modified SPARC V8 processor (Leon)
  - 4 programmable security policies using 4 bits/word
  - User-level handling of security exceptions
  - +7% logic, +0% clock cycle time over base design
- Full Linux distribution with > 120 software packages
- 1st DIFT architecture to detect high-level attacks on binaries
  - Have shared this design with 3 other institutions so far…
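A minimal sketch of the DIFT idea behind this design, assuming a simple "taint" policy: every word carries a 4-bit tag, the tag ALU ORs the source tags on an arithmetic instruction, and the tag check raises a security exception when a tainted word is used as a jump target. The names `Word`, `TAINT`, `tagged_add`, and `checked_jump` are hypothetical, and the OR propagation rule is one possible policy, not necessarily the one Raksha is configured with.

```python
from dataclasses import dataclass

TAINT = 0b0001  # hypothetical: bit 0 of the 4-bit tag marks untrusted input


@dataclass
class Word:
    value: int
    tag: int = 0  # 4-bit tag, one bit per security policy


class SecurityException(Exception):
    """Stands in for the user-level security trap."""


def tagged_add(a: Word, b: Word) -> Word:
    # Assumed propagation rule: the result inherits the OR of the source tags.
    return Word((a.value + b.value) & 0xFFFFFFFF, (a.tag | b.tag) & 0xF)


def checked_jump(target: Word) -> int:
    # Assumed check rule: a tainted word must never become the next PC.
    if target.tag & TAINT:
        raise SecurityException(f"tainted jump target {target.value:#x}")
    return target.value
```

For example, adding an untrusted offset `Word(8, TAINT)` to a trusted base `Word(0x1000)` yields a tainted address, and passing that address to `checked_jump` raises `SecurityException`.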

Slide 5: Raksha Architecture, Version 2
- Small off-core coprocessor for all DIFT functionality + state
  - Can be reused across multiple chips
- Requires minimal changes to main processor core
  - <1% for our SPARC V8 processor
- Same security features as original architecture
  - 8% performance overhead for SPECint2000
[Figure: processor core (I-Cache, D-Cache, ROB, WB) sends PC, instruction, and address to the DIFT coprocessor (Policy Decode, Tag ALU, Tag Check, Tag Cache, Tag RF), which signals security exceptions back; both connect to the L2 Cache]

Slide 6: Raksha Architecture, Version 3 (Loki)
- Supports fine-grain permission checks on physical memory
  - All words associated with a 32-bit tag
  - Permission table provides access rights for different tags
  - Trusted SW specifies permissions; HW enforces them
    - Independently from OS; checks on device accesses as well
- Reduces TCB of a full OS down to 5 KLOC
  - Invariant: malicious user/kernel code cannot access data without permission
  - Virtual memory & all device drivers outside of the TCB
[Figure: processor pipeline (PC, I-Cache, Decode, RegFile, ALU, D-Cache, Traps, WB) with a permission cache (P-cache) next to each of the I-TLB and D-TLB, plus a Check block]
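The tag-plus-permission-table check described on this slide can be sketched as follows. This is a simplified model under stated assumptions: a single protection domain, a Python dict as the permission table, and hypothetical names (`TaggedMemory`, `grant`, `AccessDenied`, the `READ`/`WRITE` bits). The real hardware caches permissions next to the TLBs and is not modeled here.

```python
READ, WRITE = 0b01, 0b10  # hypothetical encoding of access rights


class AccessDenied(Exception):
    """Stands in for the hardware permission trap."""


class TaggedMemory:
    def __init__(self, size_words: int):
        self.data = [0] * size_words
        self.tags = [0] * size_words  # one 32-bit tag per physical word
        self.perms = {}               # permission table: tag -> rights bitmask

    # Only trusted software is supposed to call these two methods.
    def set_tag(self, addr: int, tag: int) -> None:
        self.tags[addr] = tag & 0xFFFFFFFF

    def grant(self, tag: int, rights: int) -> None:
        self.perms[tag] = rights

    # The "hardware" check applied to every access.
    def _check(self, addr: int, needed: int) -> None:
        rights = self.perms.get(self.tags[addr], 0)  # default: no access
        if rights & needed != needed:
            raise AccessDenied(f"word {addr} (tag {self.tags[addr]})")

    def load(self, addr: int) -> int:
        self._check(addr, READ)
        return self.data[addr]

    def store(self, addr: int, value: int) -> None:
        self._check(addr, WRITE)
        self.data[addr] = value
```

Because the check depends only on the word's tag and the table, untrusted code (including kernel code in this model) cannot reach a word unless trusted software has granted rights for that word's tag.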

Slide 7: Experience & Lessons
- HW: a stable starting point is critical
  - Despite deficiencies, Leon has been a reasonable base
  - Good compromise of size, performance, flexibility, support
    - Even for ISA-level research
  - Can we match this with upcoming RAMP models?
- SW: full system is important (full OS + devices)
  - Enables experimentation with wide range of apps
  - Increases credibility of results
  - What is the OS story for RAMP models?
- System: need low-cost board option
  - Makes it easier to attract collaborators & disseminate design
  - What is the replacement plan for XUPv5?

Slide 8: Outline (repeated)
- Raksha: prototyping security architectures
  - Raksha goals
  - Generations of Raksha prototypes
  - Experience & lessons
- Atlas: emulating transactional memory architectures
  - Atlas goals
  - Architecture overview
  - New programmability features
  - Experience & lessons

Slide 9: Atlas Goals
- Fast: at-speed experiments with hardware TM
  - ~100x faster than simulator
- Comfortable: full-system environment
  - Full Linux OS
  - Integration with standard debugging tools
- Easy-to-use: rich support for programmability
  - Automatic detection of performance bottlenecks
  - Deterministic replay
  - Automatic detection of atomicity bugs

Slide 10: ATLAS Hardware Architecture
- 9-way CMP with hardware support for TM
  - TM support builds upon private caches & coherence protocol
  - One processor dedicated to system code
  - Uses hard-core PowerPC cores in the user & control FPGAs of the BEE2
[Figure: eight TCC PPC cores (0 to 7) attached to the User Switch; a Linux PPC with I/O; the Control Switch; Main Memory]

Slide 11: ATLAS Software Architecture
[Figure: software stack, top to bottom: Application (OpenMP+TM); TM API and ATLAS Profiler; ATLAS Runtime System; Linux OS; ATLAS HW on BEE2]
- High-level application development
  - OpenMP + TM, (Java + TM), …
- High-level application debugging
  - GDB-based for common & new features (e.g., infinite watchpoints)

Slide 12: Deterministic Replay with ReplayT
- A critical tool for multiprocessor debugging
  - Small system variations can mask bugs
- ReplayT: record & replay transaction commit order
  - Sufficient for TCC’s “all transactions, all the time” execution model
    - Serializable commit order captures all thread interactions
  - Minimal runtime & space overhead (1 byte/transaction)
[Figure: logging phase vs. replay phase. In the logging phase, the commit order of T0, T1, T2 is written to the log at commit time. In the replay phase, the commit protocol replays the logged commit order. Legend: computation, arbitration, commit, abort]
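The record-and-replay idea can be sketched in a few lines, assuming a toy model in which each transaction is a Python function applied atomically to shared memory at commit time. In the logging phase a random arbiter decides who commits next and one byte (the thread id) is logged per transaction. In the replay phase the arbiter simply follows the log. All names (`execute`, `record`, `replay`, `THREADS`) are hypothetical and none of this is the ATLAS commit protocol itself.

```python
import random


def execute(threads, pick):
    """Run all transactions; `pick` plays the role of the commit arbiter."""
    memory = {"x": 1}
    pending = {tid: list(txs) for tid, txs in threads.items()}
    order = bytearray()  # one byte per committed transaction
    while any(pending.values()):
        ready = sorted(tid for tid, txs in pending.items() if txs)
        tid = pick(ready)
        pending[tid].pop(0)(memory)  # the transaction commits atomically
        order.append(tid)
    return memory, bytes(order)


def record(threads, seed):
    # Logging phase: arbitration is nondeterministic (modeled by a seeded RNG).
    rng = random.Random(seed)
    return execute(threads, rng.choice)


def replay(threads, log):
    # Replay phase: arbitration grants commits strictly in logged order.
    logged = iter(log)

    def pick(ready):
        tid = next(logged)
        if tid not in ready:
            raise RuntimeError("log does not match this program")
        return tid

    return execute(threads, pick)


def add_one(mem):
    mem["x"] += 1


def double(mem):
    mem["x"] *= 2


def negate(mem):
    mem["x"] = -mem["x"]


# Three threads whose transactions do not commute, so commit order matters.
THREADS = {0: [add_one, add_one], 1: [double], 2: [negate, double]}
```

Calling `record(THREADS, seed)` returns the final memory and a 5-byte log. Feeding that log to `replay(THREADS, log)` reproduces the same final memory, because the serializable commit order is the only source of nondeterminism in this model.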

Slide 13: ReplayT Runtime Overhead (logging phase)
- Average slowdown is 1.05%
- Can continuously log on production runs

Slide 14: ReplayT Extensions
- Unique replay
  - Problem: maximize usefulness of test runs
  - Approach: shuffle commit order to generate unique scenarios
- Replay with monitoring code
  - Problem: replay accuracy after recompilation
  - Approach: faithfully repeat commit order if binary changes
    - E.g., printf statements inserted for monitoring purposes
- Cross-platform replay
  - Problem: debugging on multiple platforms
  - Approach: support for replaying log across platforms & ISAs
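A minimal sketch of the "unique replay" idea, assuming a commit log that stores one thread id per committed transaction. Shuffling the log yields a different commit order while keeping each thread's own transactions in program order, because the k-th occurrence of a thread id always denotes that thread's k-th transaction. The function name `unique_commit_orders` and its parameters are hypothetical, and the sketch assumes that the number of transactions per thread does not depend on the commit order.

```python
import random


def unique_commit_orders(log, count, seed=0, max_tries=1000):
    """Return up to `count` distinct commit orders that differ from `log`."""
    rng = random.Random(seed)
    seen = {bytes(log)}  # never hand back the original order
    fresh = []
    tries = 0
    while len(fresh) < count and tries < max_tries:
        tries += 1
        candidate = list(log)
        rng.shuffle(candidate)  # same thread ids, new interleaving
        candidate = bytes(candidate)
        if candidate not in seen:
            seen.add(candidate)
            fresh.append(candidate)
    return fresh
```

Each returned order could then be fed to a replay run, so that every test run exercises an interleaving that has not been tested before.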

Slide 15: Atomicity Bug Detection
- Problem: user breaks an atomic task into two transactions
  - Hard to pinpoint problem even with replay
- The AVIO proposal [Lu et al. @ ASPLOS’06]
  - Unserializable access interleavings are likely bugs
  - Whitelist unserializable interleavings from correct runs
    - Performed during application testing
  - AVIO challenges
    - Long & intrusive data collection phase
    - Long analysis phase
    - Corner cases (false positives & false negatives)
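The AVIO-style check can be sketched as follows, assuming a global trace of `(thread, op, address)` records. For two consecutive accesses by one thread to the same address, a remote access in between is flagged when the triple (previous local op, remote op, current local op) is one of the four patterns that no serial order can produce. The function name and trace format are hypothetical, and the whitelist built from correct runs is omitted.

```python
# (previous local op, interleaved remote op, current local op)
UNSERIALIZABLE = {
    ("R", "W", "R"),  # two local reads observe different values
    ("W", "W", "R"),  # local read does not see the local write
    ("W", "R", "W"),  # remote read sees an intermediate value
    ("R", "W", "W"),  # local write is based on a stale read
}


def find_violations(trace):
    """Return (prev_index, remote_index, cur_index) for each suspicious triple."""
    last = {}  # (thread, addr) -> index of that thread's previous access
    violations = []
    for i, (tid, op, addr) in enumerate(trace):
        prev = last.get((tid, addr))
        if prev is not None:
            prev_op = trace[prev][1]
            # Scan the accesses that landed between the two local ones.
            for j in range(prev + 1, i):
                r_tid, r_op, r_addr = trace[j]
                if (r_addr == addr and r_tid != tid
                        and (prev_op, r_op, op) in UNSERIALIZABLE):
                    violations.append((prev, j, i))
        last[(tid, addr)] = i
    return violations
```

For instance, a balance update split into a read and a later write, with another thread writing the balance in between, matches the `("R", "W", "W")` pattern and is reported. The same accesses with the remote write moved after the local pair are not.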

Slide 16: Atomicity Bug Detection on ATLAS
- Based on the general approach of AVIO, but:
  - Fast & non-intrusive data collection
    - Single log for each address accessed in transaction
    - Log collected during deterministic replay
  - Fast analysis
    - Interleavings examined at transaction granularity
  - More accurate analysis
    - Eliminated false negatives due to intermediate writes

Slide 17: Experience & Lessons
- HW: need multiple grades of hardware modeling
  - Enable fast prototyping of new ISA & HW features
    - Even if timing or other details are not exactly accurate
  - Atlas experience: 40+ tutorial participants enjoyed using new features in a timing-“inaccurate” system
- SW: full system is important (full OS + devices)
  - Enables experimentation with wide range of apps
- System: need low-cost board option
  - Makes it easier to attract collaborators & disseminate design
- Scalability: need access to multiple boards
  - Students will not scale design until 2nd board arrives
- ISA: unfortunately, the key to more sharing of HW & SW models
  - Difficult to share across ISAs due to differences in specification, interfaces, etc.
  - Should RAMP simply adopt SPARC?

Slide 18: Questions?

