R Environment and Variable Lookup (Apr. 2012)


1 R Environment and Variable Lookup, Apr. 2012

2 Outline
• R Environment and Variable Lookup
• R Byte-Code Interpreter Variable Cache Mechanism
• Unboxed Value Cache Proposal
• Others

3 R Environment Organization
• Environment frames
  – Frames are connected as a tree structure

    // One variable binding cell structure
    struct listsxp_struct {
        struct SEXPREC *carval;  // value of the symbol
        struct SEXPREC *cdrval;  // next binding cell
        struct SEXPREC *tagval;  // symbol
    };

    // One frame structure
    struct envsxp_struct {
        struct SEXPREC *frame;
        struct SEXPREC *enclos;   // parent
        struct SEXPREC *hashtab;  // optional
    };

[Diagram: Frames A, B and C linked through enclos. A frame's "frame" field points to a list of variable binding cells (tag = symbol SEXP, car = value SEXP, cdr = next cell) that ends in R_NilValue; its "hashtab" field points to the optional hashtable.]

4 R Environment Organization (2)
• Hashtable
  – Implemented as a VECSXP structure
      A vector in which each element is an R SEXP object (listsxp_struct)
  – Calculating the bucket number
      Hash(symbol) & hashTableMask

[Diagram: a hashtable VECSXP object with Bucket 0 to Bucket 3. Each occupied bucket points to a chain of variable binding cells (tag = symbol SEXP, car = value SEXP, cdr = next cell) that ends in R_NilValue.]
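
The bucket computation above can be sketched in a few lines of C. This is a simplified illustration, not R's source: the `Cell` type, the table size of 4, the djb2 string hash and all function names are assumptions made for the example, and symbols are compared with `strcmp` here for readability.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified binding cell (not R's SEXPREC). */
typedef struct Cell {
    const char *symbol;   /* stands in for the tag (symbol SEXP) */
    double value;         /* stands in for the car (value SEXP) */
    struct Cell *next;    /* stands in for the cdr; NULL plays R_NilValue */
} Cell;

#define TABLE_SIZE 4u                  /* power of two, so masking works */
#define TABLE_MASK (TABLE_SIZE - 1u)

/* Illustrative string hash (djb2); an assumption, not R's hash function. */
unsigned hash_symbol(const char *s) {
    unsigned h = 5381u;
    while (*s) h = h * 33u + (unsigned char)*s++;
    return h;
}

/* Bucket number as on the slide: Hash(symbol) & hashTableMask. */
unsigned bucket_of(const char *symbol) {
    return hash_symbol(symbol) & TABLE_MASK;
}

/* Prepend a cell to the chain of its bucket. */
void table_insert(Cell *table[TABLE_SIZE], Cell *cell) {
    assert(cell != NULL);
    unsigned b = bucket_of(cell->symbol);
    cell->next = table[b];
    table[b] = cell;
}

/* Walk one bucket's chain, comparing symbols; NULL when absent. */
Cell *table_find(Cell *table[TABLE_SIZE], const char *symbol) {
    for (Cell *c = table[bucket_of(symbol)]; c != NULL; c = c->next)
        if (strcmp(c->symbol, symbol) == 0) return c;
    return NULL;
}
```

Masking only selects a valid bucket when the table size is a power of two, which is why the sketch fixes the size at 4.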

5 R Environment – Variable Lookup
• Steps
  – Get the environment frame
      From the current execution frame
      Or from the recursive lookup
  – Check whether it has a hashtable
      No: start from the first binding cell, search the list, compare symbols
      Yes: calculate the hash bucket number, get that bucket's first binding cell, search the list, compare symbols
  – Not found: return R_NilValue
  – Found: may return the binding cell
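
The lookup steps above could be written roughly as follows. This is a minimal sketch under stated assumptions: `Cell`, `Env`, `hash_symbol`, `find_in_frame` and `find_var` are hypothetical names, `NULL` stands in for R_NilValue, and the hash function is an arbitrary string hash rather than R's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Cell {
    const char *symbol;
    double value;
    struct Cell *next;    /* NULL plays the role of R_NilValue */
} Cell;

typedef struct Env {
    Cell *frame;          /* binding list, used when hashtab is NULL */
    struct Env *enclos;   /* parent frame, NULL at the root */
    Cell **hashtab;       /* optional bucket array, or NULL */
    unsigned mask;        /* bucket count - 1 (power of two) when hashed */
} Env;

/* Illustrative string hash (djb2); an assumption, not R's hash function. */
unsigned hash_symbol(const char *s) {
    unsigned h = 5381u;
    while (*s) h = h * 33u + (unsigned char)*s++;
    return h;
}

/* List search: compare symbols cell by cell. */
Cell *search_list(Cell *c, const char *symbol) {
    for (; c != NULL; c = c->next)
        if (strcmp(c->symbol, symbol) == 0) return c;
    return NULL;
}

/* Look in one frame only, through the hashtable when there is one. */
Cell *find_in_frame(const Env *env, const char *symbol) {
    assert(env != NULL);
    if (env->hashtab == NULL)
        return search_list(env->frame, symbol);
    return search_list(env->hashtab[hash_symbol(symbol) & env->mask], symbol);
}

/* Recursive lookup: try the frame, then each enclosing frame in turn. */
Cell *find_var(const Env *env, const char *symbol) {
    for (; env != NULL; env = env->enclos) {
        Cell *c = find_in_frame(env, symbol);
        if (c != NULL) return c;
    }
    return NULL;          /* not found */
}
```

The loop in `find_var` is the "recursive lookup" written iteratively: each miss moves one step up the enclos chain.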

6 R Byte-Code Symbol
• In R byte-code, each symbol has an index
• A simple optimization
  – Use the index value to look up the variable directly

Source:

    run <- function() { b <- a + 202; print(b); };

Instructions after byte-code compiling:

    GETVAR 1
    LDCONST 2
    ADD 3
    SETVAR 4
    POP
    GETFUN 5
    MAKEPROM 6
    CALL 7
    RETURN

Constant table:

    Idx  Value
    1    a
    2    202
    3    a+202
    4    b
    5    print
    6    list(.Code, list(7L, GETVAR.OP, 0L, RETURN.OP), list(b))
    7    print(b)

7 R Byte-Code Interpreter Variable Cache
• A cache that stores the "bindings"
  – Not the exact value
• Cache size: 128
  – "More than 90% of the closures in base have constant pools with fewer than 128 entries when compiled"
• Wasted cache space
  – "On average about 1/3 of constant pool entries are symbols"
  – Optimization: re-order the constant table (not implemented)

[Diagram: cache entries at indexes 0 to 3, each pointing to a variable binding cell (tag = symbol SEXP, car = value SEXP).]

8 R Byte-Code Interpreter Variable Cache (2)
• Cache storage
  – On the stack (by default)

[Diagram: the stack with the previous frame, the current frame and the stack top; the cache is a block of 128 entries, each holding one variable's binding cell.]

9 R Byte-Code Interpreter Variable Cache (3)
• The reason to cache the binding cell, not the exact value
  – Easy for child frames to modify the value

    AA <- function() {
        a <- 5
        AAA()
        print(a);
    };

    # Case 1
    AAA <- function() {
        a <<- 100
    };

    # Case 2
    AAA <- function() {
        …  # remove the parent frame's variable "a"
    };

  – Case 1: AAA sets the binding cell's value to 100. Back in AA, the value read through the cache is the right value.
  – Case 2: AAA sets the binding cell's value to the "unbound" value. Back in AA, the value read through the cache is the "unbound" value, so AA tries to look up "a" in its parent frame.

[Diagram: frames A, AA and AAA linked by enclos; the cache in AA's frame points to the binding cell with symbol a and value 5.]
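
The two cases above can be illustrated with a cache slot that holds a pointer to the binding cell. This is a sketch only: the `Cell` and `CellSlot` types, the `unbound` flag (standing in for the unbound-value marker) and the function names are assumptions for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Cell {
    const char *symbol;
    double value;
    bool unbound;        /* stands in for the unbound-value marker */
} Cell;

/* A cache slot that stores a pointer to the binding cell, not a copy. */
typedef struct {
    Cell *cell;
} CellSlot;

/* Case 1: a child frame assigns through the shared cell (like a <<- 100). */
void child_assign(Cell *cell, double v) {
    cell->value = v;
    cell->unbound = false;
}

/* Case 2: a child frame removes the variable; the cell is marked unbound. */
void child_remove(Cell *cell) {
    cell->unbound = true;
}

/* Read through the cache. Returns false when the cached cell is unusable,
   in which case the caller has to fall back to a full lookup. */
bool cached_get(const CellSlot *slot, double *out) {
    assert(out != NULL);
    if (slot->cell == NULL || slot->cell->unbound) return false;
    *out = slot->cell->value;
    return true;
}
```

A cache that copied the value 5 would keep returning 5 after the child's assignment; the pointer version sees 100 with no invalidation step.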

10 R Byte-Code Interpreter Variable Cache (4)
• Cache target: only variables defined in the current frame
  – Not variables found in a parent frame
  – Reason: a variable defined later in an intermediate frame would be missed
• Example: suppose AAA caches grandparent A's "a"

    A <- function() { a <- 5; AA() };

    AA <- function() { AAA() };

    AAA <- function() {
        b <- a;    # cache a
        AAAA()
        print(a);  # use a
    };

    AAAA <- function() {
        ...  # define "a" in AA's frame
    };

  – The second use of "a" should see the one in AA's frame
  – Using the cached one gives incorrect semantics

[Diagram: frames A, AA, AAA and AAAA linked by enclos. A's binding cell a → 5 is cached in AAA's frame; AA's binding cell a → 100 is defined later.]

11 R Byte-Code Interpreter Variable Cache Steps
• Used in SETVAR, GETVAR and similar instructions
• Two modes
  – SmallCache: constant table size <= 128
      Use the symbol index as a direct reference
      Get the binding cell
  – Normal: constant table size > 128
      Symbol index % 128 gives the reference number
      Get the binding cell
      Compare the binding cell's symbol
• Cache initial value
  – R_NilValue
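
The two indexing modes above might look like this in C. It is a sketch under assumptions: `VarCache`, `cache_init`, `cache_slot` and `cache_get` are hypothetical names, indexes are 0-based, `NULL` stands in for the initial R_NilValue, and symbols are compared by pointer as if they were interned.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SIZE 128

typedef struct Cell {
    const char *symbol;   /* compared by pointer, like an interned symbol */
    double value;
} Cell;

typedef struct {
    Cell *slots[CACHE_SIZE];  /* NULL stands in for the initial R_NilValue */
    int small_mode;           /* nonzero: constant table has <= 128 entries */
} VarCache;

void cache_init(VarCache *c, int n_constants) {
    for (int i = 0; i < CACHE_SIZE; i++) c->slots[i] = NULL;
    c->small_mode = (n_constants <= CACHE_SIZE);
}

/* SmallCache: the symbol index is the slot. Normal: index % 128. */
int cache_slot(const VarCache *c, int symbol_index) {
    assert(symbol_index >= 0);
    if (c->small_mode) {
        assert(symbol_index < CACHE_SIZE);
        return symbol_index;
    }
    return symbol_index % CACHE_SIZE;
}

/* In Normal mode several indexes share a slot, so the cell's symbol
   must be compared before the cell can be trusted. */
Cell *cache_get(const VarCache *c, int symbol_index, const char *symbol) {
    Cell *cell = c->slots[cache_slot(c, symbol_index)];
    if (cell == NULL) return NULL;
    if (!c->small_mode && cell->symbol != symbol) return NULL;
    return cell;
}
```

The symbol comparison is what SmallCache mode saves: with at most 128 constants, no two symbols can land in the same slot.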

12 R Byte-Code Interpreter Variable Cache Steps (2)
• SETVAR
  – Finding Cell step
      SmallCache
        – Get the binding cell directly by symbol index; may return R_NilValue
      Normal
        – Get the binding cell by symbol index and symbol
            » If the cell has the right symbol and its value is not unbound, return the cell
            » Otherwise use the base method to find the variable in the local frame
            » If the new cell (ncell) is found in the current frame, update the local cache and return ncell
            » If it is not found and the cell from the previous step is not null but holds the unbound value, clean the local cache (the value was removed entirely, so there is nothing to cache)
  – Setting Value step
      Use the cell to update the value directly
      If the cell is R_NilValue, use the base method to define a variable in the local frame
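
The Normal-mode SETVAR path above can be sketched as two small functions, one per step. All names here (`Frame`, `VarCache`, `frame_find`, `frame_define`, `find_cell`, `setvar`) are hypothetical, the frame is a fixed-size array for brevity, and `NULL` stands in for R_NilValue; treat it as an illustration of the steps rather than the interpreter's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SIZE 128
#define MAX_VARS 8

typedef struct Cell {
    const char *symbol;  /* compared by pointer, like an interned symbol */
    double value;
    bool unbound;        /* stands in for the unbound-value marker */
} Cell;

/* Hypothetical local frame: a small fixed array of cells. */
typedef struct {
    Cell cells[MAX_VARS];
    int count;
} Frame;

typedef struct {
    Cell *slots[CACHE_SIZE];   /* NULL stands in for R_NilValue */
} VarCache;

/* "Base method" lookup, local frame only; skips removed variables. */
Cell *frame_find(Frame *f, const char *symbol) {
    for (int i = 0; i < f->count; i++)
        if (f->cells[i].symbol == symbol && !f->cells[i].unbound)
            return &f->cells[i];
    return NULL;
}

/* "Base method" definition: reuse the symbol's cell or append a new one. */
Cell *frame_define(Frame *f, const char *symbol, double v) {
    Cell *c = NULL;
    for (int i = 0; i < f->count && c == NULL; i++)
        if (f->cells[i].symbol == symbol) c = &f->cells[i];
    if (c == NULL) {
        assert(f->count < MAX_VARS);
        c = &f->cells[f->count++];
        c->symbol = symbol;
    }
    c->value = v;
    c->unbound = false;
    return c;
}

/* Finding Cell step, Normal mode. */
Cell *find_cell(VarCache *cache, Frame *f, int idx, const char *symbol) {
    Cell **slot = &cache->slots[idx % CACHE_SIZE];
    Cell *cell = *slot;
    if (cell != NULL && cell->symbol == symbol && !cell->unbound)
        return cell;                      /* cache hit */
    Cell *ncell = frame_find(f, symbol);  /* base lookup in the local frame */
    if (ncell != NULL)
        *slot = ncell;                    /* found: update the cache */
    else if (cell != NULL && cell->unbound)
        *slot = NULL;                     /* removed: clean the cache */
    return ncell;
}

/* SETVAR: Finding Cell step, then Setting Value step. */
void setvar(VarCache *cache, Frame *f, int idx, const char *symbol, double v) {
    Cell *cell = find_cell(cache, f, idx, symbol);
    if (cell != NULL)
        cell->value = v;                  /* update directly through the cell */
    else
        frame_define(f, symbol, v);       /* no cell yet: define the variable */
}
```

Note that the first `setvar` on a fresh variable defines it without touching the cache; the slot is only filled when a later lookup finds the new cell.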

13 R Byte-Code Interpreter Variable Cache Steps (3)
• SETVAR cache update in Normal mode
  – SETVAR, first time
      Finding Cell step: no valid binding cell
      Setting Value step: use the base method to define a variable
  – SETVAR, second time
      Finding Cell step: find the cell and update the cache
      Setting Value step: use the cell to update the value directly
• SETVAR in SmallCache mode (pure SETVAR)
  – SETVAR, first time
      Finding Cell step: no valid binding cell
      Setting Value step: use the base method to define a variable
  – SETVAR, second time
      Finding Cell step: no valid binding cell, because this mode only uses the direct lookup
        – The cache is still not updated
      Setting Value step: use the base method to define a variable

14 R Byte-Code Interpreter Variable Cache Steps (4)
• GETVAR
  – SmallCache mode
      Look up the cell directly
        – Invalid cell: go to Normal mode
        – Valid cell
            » Check the value type; may return the value directly or force a promise
  – Normal mode
      Get the binding cell by symbol index and symbol
        – If the cell has the right symbol and its value is not unbound, return the cell
        – Otherwise use the base method to find the variable in the local frame
        – If the new cell (ncell) is found in the current frame, update the local cache and return ncell
        – If it is not found and the cell from the previous step is not null but holds the unbound value, clean the local cache (the value was removed entirely, so there is nothing to cache)
      Use the returned cell to get the value
        – May return the valid value or raise an error
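
A compact sketch of the GETVAR flow above: a direct index lookup first, then the Normal-mode path on a miss. The `Context` type, the `slow_lookups` counter (added only to make cache hits observable) and the function names are assumptions; promise forcing and lookup in enclosing frames are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SIZE 128

typedef struct Cell {
    const char *symbol;   /* compared by pointer, like an interned symbol */
    double value;
    bool unbound;         /* stands in for the unbound-value marker */
    struct Cell *next;
} Cell;

typedef struct {
    Cell *frame;               /* local binding list */
    Cell *cache[CACHE_SIZE];   /* NULL stands in for R_NilValue */
    int slow_lookups;          /* counts base-method searches (illustration) */
} Context;

/* Base method: list search in the local frame. */
Cell *frame_find(Context *cx, const char *symbol) {
    cx->slow_lookups++;
    for (Cell *c = cx->frame; c != NULL; c = c->next)
        if (c->symbol == symbol && !c->unbound) return c;
    return NULL;
}

/* Normal-mode path: validate the cached cell, else search and refresh. */
Cell *find_cell_normal(Context *cx, int idx, const char *symbol) {
    Cell **slot = &cx->cache[idx % CACHE_SIZE];
    Cell *cell = *slot;
    if (cell != NULL && cell->symbol == symbol && !cell->unbound)
        return cell;
    Cell *ncell = frame_find(cx, symbol);
    if (ncell != NULL)
        *slot = ncell;                   /* update the local cache */
    else if (cell != NULL && cell->unbound)
        *slot = NULL;                    /* clean the local cache */
    return ncell;
}

/* GETVAR in SmallCache mode. Returns false when the variable is not
   resolved here (the real interpreter would continue or signal an error). */
bool getvar_small(Context *cx, int idx, const char *symbol, double *out) {
    assert(idx >= 0 && idx < CACHE_SIZE);
    Cell *cell = cx->cache[idx];                   /* direct lookup */
    if (cell == NULL || cell->unbound)             /* invalid cell */
        cell = find_cell_normal(cx, idx, symbol);  /* go to Normal mode */
    if (cell == NULL) return false;
    *out = cell->value;
    return true;
}
```

Calling `getvar_small` twice for the same symbol performs one base-method search: the first call fills the slot and the second is served from it.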

15 R Byte-Code Interpreter Variable Cache Steps (5)
• GETVAR in SmallCache mode following a SETVAR
  – SETVAR: does not update the cache at all
  – GETVAR, first time
      Go to Normal mode, update the cache
      Return the value
  – GETVAR, second time
      Return the value using the cache

    run <- function() {
        a <- 101;
        b <- a;  # get var first time
        b <- a;  # get var second time
        b <- a;  # get var third time
    };

    PC  STMT
    1   LDCONST, 1
    3   SETVAR, 2     (set a)
    5   POP
    6   GETVAR, 2     (get a: Normal mode, and update the cache)
    8   SETVAR, 3
    10  POP
    11  GETVAR, 2     (get a: from the cache)
    13  SETVAR, 3
    15  POP
    16  GETVAR, 2
    18  SETVAR, 3
    20  INVISIBLE
    21  RETURN

16 R Byte-Code Interpreter Variable Cache Mechanism
• Others
  – There is some additional code to handle
      Unbound values
      Forcing a promise if the symbol's value is a promise
      Missing value handling
• Some conclusions
  – The cache mechanism is correct
  – But it is very complex because of R's complex semantics
  – Optimizing the cache mechanism is possible
      E.g. caching a parent frame's variables
  – But that would be very complex

17 Unboxed Value Cache Proposal
• Basic assumptions
  – Do not change the current cache mechanism
  – Add an additional cache only for unboxed values
      Something like a local register file
  – Rules should be simple
• Basic logic
  – GETVAR: get the variable through the byte-code interpreter logic, unbox it and populate the cache
  – SETVAR: only update the register file
  – Context change
      Box the value and write it back using the byte-code interpreter logic
      Context change: function call, return, …

18 Basic Cache Design
• One cache and one cache state
  – The cache stores only the value, not the binding cell
      Each cell is 64 bits wide and stores an unboxed Real, Int or Logical
  – Each cache state is one of
      Not valid: no value available; it has to be fetched and unboxed
      Valid: an unboxed version is stored in the cache
      Modified: the value in the cache has been modified and must be written back later
  – The cache state also stores the type of the value
  – Global cache counter NumModified
      How many cache cells hold modified values
      If > 0, a write-back is needed at a context change

[Diagram: the cache and, beside it, the cache state.]
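
One possible data layout for this design is sketched below. The names (`UCell`, `UState`, `UType`, `UnboxedCache` and the three functions) and the size of 128 entries are assumptions for illustration; the slides do not give the proposal's actual declarations.

```c
#include <assert.h>
#include <stdint.h>

#define UCACHE_SIZE 128   /* assumed; the slides do not state a size */

typedef enum { U_INVALID = 0, U_VALID, U_MODIFIED } UState;
typedef enum { U_REAL, U_INT, U_LOGICAL } UType;

/* One 64-bit cell that can hold an unboxed Real, Int or Logical. */
typedef union {
    double   real;
    int32_t  integer;
    int32_t  logical;
    uint64_t bits;
} UCell;

typedef struct {
    UCell  cells[UCACHE_SIZE];
    UState state[UCACHE_SIZE];  /* per-cell state ...          */
    UType  type[UCACHE_SIZE];   /* ... together with its type  */
    int    num_modified;        /* global counter NumModified  */
} UnboxedCache;

void ucache_reset(UnboxedCache *c) {
    for (int i = 0; i < UCACHE_SIZE; i++) c->state[i] = U_INVALID;
    c->num_modified = 0;
}

/* Populate a cell after the variable was fetched and unboxed. */
void ucache_fill_real(UnboxedCache *c, int idx, double v) {
    assert(c->state[idx] != U_MODIFIED);  /* would lose a pending write */
    c->cells[idx].real = v;
    c->type[idx] = U_REAL;
    c->state[idx] = U_VALID;
}

/* Store an unboxed value; the cell now needs a write-back later. */
void ucache_set_real(UnboxedCache *c, int idx, double v) {
    if (c->state[idx] != U_MODIFIED) c->num_modified++;
    c->cells[idx].real = v;
    c->type[idx] = U_REAL;
    c->state[idx] = U_MODIFIED;
}
```

Counting a cell only on its first transition to Modified keeps `num_modified` equal to the number of dirty cells, so a context change with a zero counter can skip the cache scan.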

19 Code Transformation for the Cache
• The current sequence (example)

    PC  STMT
    ...
    18  GETVAR, 2
    20  GUARD, 2, 20
    23  UNBOXREAL

  – It is hard to populate the cache from this sequence
• Need to combine
  – GETVAR + UNBOXREAL
• About GUARD
  – No context change, so no additional guard is needed
• Define a new instruction to replace the sequence
  – GETUNBOXREAL

20 Logic of GETUNBOXREAL
• GETUNBOXREAL
  – Check the cache's state
  – Valid/Modified: directly return the unboxed value
  – Not valid (first time or context changed)
      Get the variable first, then execute the guard logic
        – May fall back to the unoptimized code
      On success
        – Populate the cache with the unboxed value, set the valid state, and return the value
• Also define SETUNBOX
  – If the value on top of the stack is unboxed, use SETUNBOX to replace SETVAR
  – The shape of the stack is known at compile time
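
The GETUNBOXREAL and SETUNBOX logic above could be sketched as follows. Everything named here is hypothetical: `Boxed` stands in for a boxed R value, the `vars` array stands in for "get the variable through the interpreter", and `GET_DEOPT` models falling back to the unoptimized code when the guard fails.

```c
#include <assert.h>
#include <stddef.h>

#define UCACHE_SLOTS 8

typedef enum { B_REAL, B_STRING } BoxType;

/* Stand-in for a boxed value; only real scalars can be unboxed here. */
typedef struct {
    BoxType type;
    double real;
} Boxed;

typedef enum { U_INVALID = 0, U_VALID, U_MODIFIED } UState;

typedef struct {
    double real[UCACHE_SLOTS];
    UState state[UCACHE_SLOTS];
    int num_modified;
} UCache;

typedef enum { GET_OK, GET_DEOPT } GetResult;

/* GETUNBOXREAL: serve from the cache, or fetch, guard, unbox and fill. */
GetResult get_unbox_real(UCache *c, const Boxed *vars, int idx, double *out) {
    assert(idx >= 0 && idx < UCACHE_SLOTS && out != NULL);
    if (c->state[idx] != U_INVALID) {       /* Valid or Modified */
        *out = c->real[idx];
        return GET_OK;
    }
    const Boxed *b = &vars[idx];            /* get the variable first */
    if (b->type != B_REAL)                  /* guard fails */
        return GET_DEOPT;                   /* fall back to unoptimized code */
    c->real[idx] = b->real;                 /* populate the cache */
    c->state[idx] = U_VALID;
    *out = b->real;
    return GET_OK;
}

/* SETUNBOX: the value on the stack is already unboxed; update the cache only. */
void set_unbox_real(UCache *c, int idx, double v) {
    assert(idx >= 0 && idx < UCACHE_SLOTS);
    if (c->state[idx] != U_MODIFIED) c->num_modified++;
    c->real[idx] = v;
    c->state[idx] = U_MODIFIED;
}
```

Once a slot is Valid or Modified, later reads never touch the boxed variable again until something resets the slot to Not valid.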

21 Write-Back Policy
• On a context change
  – Function call, return
  – Check the global state
      NumModified = 0: no action
      NumModified > 0: iterate over the cache
        – Use the index to look up the symbol
        – Box the value according to its type
        – Set the variable back
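
A minimal sketch of the write-back pass, assuming the boxed variables can be addressed by the same index as the cache. The types and names (`Boxed`, `UCache`, `box_value`, `write_back`) are hypothetical, and marking every entry Not valid after the pass is an assumption of this sketch rather than something the slide states.

```c
#include <assert.h>
#include <stdint.h>

#define UCACHE_SLOTS 8

typedef enum { U_INVALID = 0, U_VALID, U_MODIFIED } UState;
typedef enum { T_REAL, T_INT, T_LOGICAL } UType;

/* Stand-in for a boxed value stored in the frame. */
typedef struct {
    UType type;
    double real;
    int32_t integer;   /* also used for logicals */
} Boxed;

typedef struct {
    union { double real; int32_t integer; } cell[UCACHE_SLOTS];
    UState state[UCACHE_SLOTS];
    UType type[UCACHE_SLOTS];
    int num_modified;
} UCache;

/* Box one cached value according to its recorded type. */
Boxed box_value(const UCache *c, int i) {
    Boxed b = { c->type[i], 0.0, 0 };
    if (c->type[i] == T_REAL) b.real = c->cell[i].real;
    else b.integer = c->cell[i].integer;
    return b;
}

/* Called at a context change; returns how many variables were written. */
int write_back(UCache *c, Boxed *vars) {
    int written = 0;
    assert(c->num_modified >= 0);
    if (c->num_modified > 0) {               /* NumModified = 0: no scan */
        for (int i = 0; i < UCACHE_SLOTS; i++) {
            if (c->state[i] != U_MODIFIED) continue;
            vars[i] = box_value(c, i);       /* set the variable back */
            written++;
        }
    }
    /* Assumption: cached values are not trusted across a context change. */
    for (int i = 0; i < UCACHE_SLOTS; i++) c->state[i] = U_INVALID;
    c->num_modified = 0;
    return written;
}
```

Only Modified entries are boxed, so a function that reads unboxed values without assigning them pays no boxing cost at calls and returns.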

22 Some Findings in the Latest R-2.15.0
• R-2.15.0
  – Released Mar. 2012
  – Many function-level improvements
  – No changes found in the R interpreter, the byte-code interpreter or the runtime
      Very rough performance evaluation: no big changes in micro-tests
      Our current working version is R-2.14.1
• Another finding
  – Since R-2.14.0 there is a package called "parallel"
  – A high-level parallel wrapper for some coarse-grained computation tasks
      R-2.15.0: some new APIs, something like map/reduce style

