Eliminating Array Bounds Checks, and related problems
COMP 512, Rice University
Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved.

Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved.
Comp 512, Spring 2011

The Problem
Programmers are not perfect
  - Write programs that perform out-of-bounds references
  - All of these are questionable; some are malicious
Technology for avoiding out-of-bounds references is easy
  - Bounds check each reference
  - 1970s compilers could insert such checks (PL/I)
Code with checking runs slowly
  - Obvious opportunity for optimization
  - We need compilers that implement this kind of checking
  - Buffer overflow attacks & other subtle bugs
PL.8 philosophy: check everything & optimize checks

Obvious Solution
Add a dynamic check to each reference
check(a,i) performs two tests
  - i < Min(a) raises an exception (lbcheck(a,i))
  - i > Max(a) raises an exception (ubcheck(a,i))
In a loop, the compiler may be able to move one or both tests based on knowledge of the induction variable’s behavior
Same abstraction fits structures of arrays, arrays of structures, …
Assume a[1:100], so Min(a) is 1 and Max(a) is 100
Before:
    … ← a[i]
After:
    check(a,i)
    … ← a[i]
IBM PL/I compilers implemented “check” as a built-in function (a procedure call).
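The two tests behind check(a,i) can be sketched in Python. This is a minimal sketch, not the PL/I built-in: the names Array, BoundsError, and load are hypothetical, and the array is assumed to carry its declared bounds with it.

```python
class BoundsError(Exception):
    """Raised when a subscript falls outside the declared bounds."""


class Array:
    """Hypothetical array declared as a[lo:hi], e.g. a[1:100]."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.data = [0] * (hi - lo + 1)


def lbcheck(a, i):
    # Lower-bound test: fails when i is below Min(a).
    if i < a.lo:
        raise BoundsError(f"subscript {i} is below Min(a) = {a.lo}")


def ubcheck(a, i):
    # Upper-bound test: fails when i is above Max(a).
    if i > a.hi:
        raise BoundsError(f"subscript {i} is above Max(a) = {a.hi}")


def check(a, i):
    # check(a,i) is the pair of tests, treated as one unit.
    lbcheck(a, i)
    ubcheck(a, i)


def load(a, i):
    # A checked reference: check(a,i) followed by ... <- a[i].
    check(a, i)
    return a.data[i - a.lo]
```

With a = Array(1, 100), load(a, 1) and load(a, 100) succeed, while load(a, 0) and load(a, 101) raise BoundsError.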

Obvious Solution
Implementing check in the compiler
Treat check as an atomic action in the IR
  - Each check represents a potential exit (abnormal termination)
  - Each check entails tests and control-flow operations
  - Each check has a reasonably high overhead
Easier to optimize check as an atomic operation
Implemented this way in the early IBM PL/I compilers and PL.8
Before:
    … ← a[i]
After:
    check(a,i)
    … ← a[i]
where check(a,i) expands to:
    if (i < lb(a)) then raise exception
    if (i > ub(a)) then raise exception

Obvious Solution
References in loops can be optimized
Repeated checks replaced by check of endpoints
Code motion combined with special case reasoning about arithmetic comparisons
  - Min(a) ≤ j ≤ Max(a) and Min(a) ≤ k ≤ Max(a) and j ≤ k
  - is equivalent to Min(a) ≤ j and k ≤ Max(a) and j ≤ k
  - which is equivalent to lbcheck(a,j) and ubcheck(a,k)
Before:
    for i = j to k
        check(a,i)
        … ← a[i]
After:
    if (j ≤ k) then
        check(a,j)
        check(a,k)
    for i = j to k
        … ← a[i]
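To make the saving concrete, the sketch below counts dynamic checks in both versions of the loop. It assumes a unit-stride loop with invariant bounds j and k; the function names and the (total, checks) return value are hypothetical, chosen only for illustration.

```python
class BoundsError(Exception):
    pass


def checked_sum(data, lo, hi, j, k):
    """Sum a[j..k] with one check(a,i) per iteration."""
    total, checks = 0, 0
    for i in range(j, k + 1):
        checks += 1
        if i < lo or i > hi:          # check(a,i)
            raise BoundsError(i)
        total += data[i - lo]
    return total, checks


def hoisted_sum(data, lo, hi, j, k):
    """Same loop, with the checks moved to the loop endpoints."""
    total, checks = 0, 0
    if j <= k:                        # the loop runs at least once
        checks += 2
        if j < lo:                    # lbcheck(a,j)
            raise BoundsError(j)
        if k > hi:                    # ubcheck(a,k)
            raise BoundsError(k)
    for i in range(j, k + 1):
        total += data[i - lo]         # unchecked reference
    return total, checks
```

For a[1:100] holding 1..100, both functions return the sum 5050 over i = 1 to 100; the first performs 100 checks and the second performs 2.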

Complications
Unfortunately, it is not that simple …
Our hand transformation assumes that k is invariant
If loop modifies k, transformation must be more complex
  - Pre-loop check cannot determine range of i
  - Need a pre-loop check and a post-loop check
  - Recall the loop-exit landing pads in Cytron, Lowry, & Zadeck
Original:
    for i = j to k
        check(a,i)
        … ← a[i]
        k ← fee(i)
Hoisted as before (no longer correct):
    if (j ≤ k) then
        lbcheck(a,j)
        ubcheck(a,k)
    for i = j to k
        … ← a[i]
        k ← fee(i)

Complications
Unfortunately, it is not that simple …
Loop exits normally if k ≤ Max(a)
Loop exits prematurely if k > Max(a)
  - Original loop would have referenced beyond a’s bounds.
Post-loop test determines which exit was taken …
  - “raise exception” is same as ubcheck(a,i)
Before:
    for i = j to k
        check(a,i)
        … ← a[i]
After:
    if (j ≤ k) then
        lbcheck(a,j)
    for i = j to min(k,Max(a))
        … ← a[i]
    if k > Max(a) then raise exception (die)
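A sketch of the clamped loop with a post-loop test, for the case where the loop body changes k. Several things here are assumptions rather than the slides' exact code: the loop is written as a while loop that re-reads k on every exit test, fee is any caller-supplied function, and the post-loop test is written as i ≤ k ("the original loop would have kept going") rather than k > Max(a).

```python
class BoundsError(Exception):
    pass


def run_original(a, lo, hi, j, k, fee):
    """Checked loop: one check(a,i) per iteration; k may change."""
    i = j
    while i <= k:
        if i < lo or i > hi:          # check(a,i)
            raise BoundsError(i)
        a[i - lo] = i                 # ... <- a[i]
        k = fee(i)                    # k <- fee(i)
        i += 1
    return i - j                      # number of iterations


def run_transformed(a, lo, hi, j, k, fee):
    """Pre-loop lbcheck, clamped exit test, post-loop upper-bound test."""
    if j <= k and j < lo:             # lbcheck(a,j) in the landing pad
        raise BoundsError(j)
    i = j
    while i <= min(k, hi):            # exit test clamped to Max(a)
        a[i - lo] = i                 # unchecked reference
        k = fee(i)
        i += 1
    if i <= k:                        # premature exit: i ran past Max(a)
        raise BoundsError(i)
    return i - j
```

When fee keeps k inside the bounds, both versions run the same iterations; when fee pushes k past Max(a), both write every element up to Max(a) and then raise at the same value of i.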

Minor improvement
Peeling first iteration can eliminate one extra test
This example is a minor win from code shape
Trades minor code space for minor speed
  - Eliminates one dynamic test
Before:
    for i = j to k
        check(a,i)
        … ← a[i]
After:
    if (j ≤ k) then
        lbcheck(a,j)
        … ← a[i]
    for i = j+1 to min(k,Max(a))
        … ← a[i]
    if k > Max(a) then raise exception (die)

Using Contextual Knowledge
With known loop bounds, the checks become static
Overhead of checking goes to zero in this case (pretty good)
Known lower bound eliminates pre-loop check
  - for i = 1 to k is a common case …
  - for i = 1 to 63 is less common, but still happens …
Before:
    for i = 1 to 100
        check(a,i)
        … ← a[i]
After:
    if (1 ≤ 100) then lbcheck(a,1)
    for i = 1 to 100
        … ← a[i]
    if (100 > Max(a)) then raise exception
Evaluate the tests at compile time
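One way to picture "the checks become static" is a small folding routine that decides, at compile time, which endpoint checks are still needed at run time. This is only an illustrative sketch: residual_checks is a hypothetical helper, constant bounds are passed as ints and unknown bounds as variable names, and the array bounds are assumed to be known constants.

```python
def residual_checks(lo, hi, j, k):
    """Return the endpoint checks that must remain at run time.

    lo, hi: declared array bounds (known constants).
    j, k:   loop bounds; an int if known at compile time, else a name.
    """
    if isinstance(j, int) and isinstance(k, int) and j > k:
        return []                      # zero-trip loop: nothing to check
    checks = []
    if not isinstance(j, int):
        checks.append(f"lbcheck(a,{j})")   # still a run-time test
    elif j < lo:
        checks.append("raise")             # known to fail at compile time
    if not isinstance(k, int):
        checks.append(f"ubcheck(a,{k})")
    elif k > hi:
        checks.append("raise")
    return checks
```

For a[1:100], residual_checks(1, 100, 1, 100) is empty, so the loop runs with no checking overhead, while residual_checks(1, 100, 1, "k") leaves only ubcheck(a,k).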

Details
Markstein, Cocke, and Markstein viewed this transformation as a form of strength reduction
Replaces repeated strong tests (check) inside a loop with two weaker tests (lbcheck and ubcheck)
Focuses on tests related to induction variables (as with OSR)
Strength Reduction for check
The variable used in the check, t, must be linearly related to the induction variable, i, used in the end-of-loop test
  - $i \times c_1 - t = c_2$ must hold, where $c_1$ and $c_2$ are region constants
The check must occur in an articulation node of the loop
  - The “loop” is an SCC of the CFG
  - Articulation point has property that it lies on every path through the SCC (e.g., loop header is an articulation point)
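The linear relation is what lets endpoint tests stand in for per-iteration tests: solving it for t gives t = c1 × i − c2, which is monotone in i, so its extreme values over i = j..k occur at i = j and i = k. A small sketch, with hypothetical function names, that compares the two formulations:

```python
def per_iteration_ok(lo, hi, j, k, c1, c2):
    """True if check(a,t) passes on every iteration, with t = c1*i - c2."""
    return all(lo <= c1 * i - c2 <= hi for i in range(j, k + 1))


def endpoints_ok(lo, hi, j, k, c1, c2):
    """Same answer from two tests on the first and last values of t."""
    if j > k:
        return True                    # zero-trip loop: no reference made
    t_first = c1 * j - c2
    t_last = c1 * k - c2
    # t is linear in i, so its minimum and maximum sit at the endpoints.
    return lo <= min(t_first, t_last) and max(t_first, t_last) <= hi
```

A negative c1 simply swaps which endpoint feeds the lower-bound test and which feeds the upper-bound test.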

Details
The algorithm
Create lbcheck in loop’s landing pad & copy support operations
Replace loop exit test
  - Original test was i < n
  - New test is i < min(n,Max(a))
Insert a new test after the loop exit
  - On exit, if i > ub, original loop would have tripped on check
  - Raise exception if i > ub
Before:
    for i = j to k
        check(a,i)
        … ← a[i]
After:
    if (j ≤ k) then
        lbcheck(a,j)
    for i = j to min(k,Max(a))
        … ← a[i]
    if k > Max(a) then ubcheck(a,i)
Notes on the transformed code:
  - The pre-loop lbcheck is written in terms of the first iteration
  - The post-loop test is written in terms of the new exit test
  - The original check is dead
  - The post-loop ubcheck(a,i), when reached, will always fail

Details
The algorithm
If possible, place the exit test in the entry landing pad
  - Creates local common subexpressions & simplifies test
Safety conditions:
  - Loop has a single exit
  - Loop ending branch is in an articulation point
  - Induction variable increment is ± 1
  - Upper bound is invariant in the loop
With these conditions, can place exit check in entry landing pad
  - Now, code looks like version produced by hand …
Otherwise, place it in a loop exit landing pad (CLZ)
Before:
    for i = j to k
        check(A,i)
        … ← A[i]
After:
    if (j ≤ k) then
        lbcheck(A,j)
        ubcheck(A,k)
    for i = j to k
        … ← A[i]
One minor issue is that reduced check triggers exception too early
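The "too early" issue is observable whenever the loop body has side effects. In the sketch below (function names are hypothetical), the checked loop stores into the in-bounds elements before it fails, while the version with both checks in the entry landing pad fails before storing anything.

```python
class BoundsError(Exception):
    pass


def zero_checked(data, lo, hi, j, k):
    """Original loop: check(a,i) on every iteration."""
    for i in range(j, k + 1):
        if i < lo or i > hi:
            raise BoundsError(i)
        data[i - lo] = 0


def zero_hoisted(data, lo, hi, j, k):
    """Both checks placed in the entry landing pad."""
    if j <= k:
        if j < lo:                     # lbcheck(a,j)
            raise BoundsError(j)
        if k > hi:                     # ubcheck(a,k)
            raise BoundsError(k)
    for i in range(j, k + 1):
        data[i - lo] = 0
```

With a[1:5] filled with ones and the loop running from 3 to 7, zero_checked leaves [1, 1, 0, 0, 0] behind before raising, whereas zero_hoisted raises with the array untouched. Whether that difference matters depends on the language's rules for the program state after an exception.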

Extensions to Markstein, Cocke, Markstein
Local Improvements
check can be redundant
  - Value number them or use special case algorithm
One check can subsume another (multiple references)
  - lbcheck(a,i) and lbcheck(a,j) ⇒ lbcheck(a,i) if i ≤ j
  - ubcheck(a,i) and ubcheck(a,j) ⇒ ubcheck(a,j) if i ≤ j
  - If the subsumption is local, applying this insight is easy
Global check elimination
Subsuming checks can cover the entry or exit of the loop
Similar to available expressions and very busy expressions
Can formulate check hoisting & sinking as data-flow (DF) problems, too
  - Similar results to MC&M’s OSR of range checking
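Local subsumption is easy to sketch when the subscripts in one basic block differ only by constants, so their ordering is known. The representation below is an assumption made for illustration: each check is a tuple (kind, array, variable, offset) standing for lbcheck or ubcheck on array[variable + offset], and every check in the list is assumed to execute unconditionally.

```python
def subsume(checks):
    """Keep one lbcheck and one ubcheck per (array, variable) pair.

    The lbcheck with the smallest offset implies all other lbchecks;
    the ubcheck with the largest offset implies all other ubchecks.
    """
    best = {}
    for kind, array, var, offset in checks:
        key = (kind, array, var)
        if key not in best:
            best[key] = offset
        elif kind == "lb":
            best[key] = min(best[key], offset)
        else:
            best[key] = max(best[key], offset)
    return sorted((kind, array, var, off)
                  for (kind, array, var), off in best.items())
```

For references to a[i] and a[i+1] in the same block, the four original tests reduce to lbcheck(a,i) and ubcheck(a,i+1).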

Control Flow
The examples show simple loops with no internal control flow
Conditionals pose serious problem
  - Evaluate check if reference is evaluated
  - Other references may subsume the guarded reference
  - Markstein, Cocke, Markstein does not address the issue
  - Other strategies (Gupta, LCM) have problems
Room for further work on this issue
  - Replication, more aggressive code motion, …
    for i ← j to k
        …
        if (f(i)) then
            … ← a[i+c]
        …
        … ← a[i]
For the guarded reference a[i+c]:
  - c = 0 ⇒ check is subsumed
  - c ≥ 1 ⇒ lbcheck is subsumed
  - c ≤ -1 ⇒ ubcheck is subsumed
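The three cases for c can be written as a small decision function. The sketch assumes that the unguarded check(a,i) is known to execute on every iteration and to hold, which is precisely the part that is hard to establish in the presence of control flow; guarded_checks_needed is a hypothetical name.

```python
def guarded_checks_needed(c):
    """Tests still required on the guarded reference a[i+c],
    given that Min(a) <= i <= Max(a) is already established."""
    needed = []
    if c < 0:
        needed.append("lbcheck")       # i+c may fall below Min(a)
    if c > 0:
        needed.append("ubcheck")       # i+c may rise above Max(a)
    return needed
```

So c = 0 needs no test at all on the guarded reference, c ≥ 1 needs only the upper-bound test, and c ≤ -1 needs only the lower-bound test.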

And What About Pointers?
Pointer checking is, in principle, a similar problem …
Easy to see in simple cases
Original:
    for (i = 1; i <= n; i++)
        *p++ = 0;
With a check on each reference:
    for (i = 1; i <= n; i++) {
        check(p);
        *p++ = 0;
    }
With the checks hoisted:
    if (1 <= n) {
        check(p);
        check(p+n*sizeof(*p));
    }
    for (i = 1; i <= n; i++)
        *p++ = 0;
Even this case has difficulties
What does p reference? How big is it?
  - check needs to know
Can use run-time tags on the pointer or the object
  - Requires more memory references
  - Need sizes on any object whose address is taken (&x)
Can eliminate some checks (& tags) statically
  - Known sizes & unambiguous pointers
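Run-time tags can be modeled as a pointer that carries a reference to its object, so check can find the object's size. The sketch below is a Python model, not C: TaggedPtr is a hypothetical type, offsets are counted in elements rather than bytes, and the hoisted version checks the first and the last element actually written (offset n-1), which is an assumption about what the endpoint test should cover.

```python
from dataclasses import dataclass


class BoundsError(Exception):
    pass


@dataclass
class TaggedPtr:
    buf: list   # the object pointed into; len(buf) serves as the size tag
    off: int    # element offset of the pointer within buf


def check(p, delta=0):
    # Valid only if p + delta still lies inside the tagged object.
    pos = p.off + delta
    if pos < 0 or pos >= len(p.buf):
        raise BoundsError(pos)


def zero_checked(p, n):
    """for (i = 1; i <= n; i++) { check(p); *p++ = 0; }"""
    checks = 0
    for _ in range(n):
        check(p)
        checks += 1
        p.buf[p.off] = 0
        p.off += 1
    return checks


def zero_hoisted(p, n):
    """Checks on the first and last element, then an unchecked loop."""
    checks = 0
    if n >= 1:
        check(p)           # first element written
        check(p, n - 1)    # last element written
        checks = 2
    for _ in range(n):
        p.buf[p.off] = 0
        p.off += 1
    return checks
```

Both versions zero the same n elements; the checked loop performs n checks and the hoisted one performs two, at the cost of carrying the size information with the pointer.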

Bibliography
“Optimization of Range Checking,” V. Markstein, J. Cocke, P. Markstein, Proceedings of the 1982 ACM SIGPLAN Conference on Compiler Construction, SIGPLAN Notices, pages 114-119.
“Optimizing Array Bound Checks Using Flow Analysis,” R. Gupta, ACM Letters on Programming Languages and Systems (LOPLAS), 2(1-4), pages 135-150.
