Parsing APL for Static Analysis. Speaker: Anders Schack-Nielsen, Ph.D. Sept. 23rd, 2014.


1 Parsing APL for Static Analysis. Speaker: Anders Schack-Nielsen, Ph.D. Sept. 23rd, 2014

2 Outline
- Background and Motivation
- Variable Types
- Static Analysis Tool
- Parsing APL
- Kind Inference
- BNF Grammar

3 Background
APL codebase in SimCorp:
- 68,000 functions
- 1.7 million lines of code
- 215 APL developers actively developing and maintaining this codebase
- Additional functions and developers covering utilities, etc.

4 Motivation – example
- Programmer A writes function foo. A makes certain assumptions about the input arguments. A documents his assumptions.
- Programmer K writes function bar and calls foo. K has read the header of foo so he knows what sort of arguments to supply. For good measure he also tests it.
    ∇ foo args
      ⍝ : args should be...
      mat1 mat2 strings←args
      ... (implicit assumptions)
    ∇
    ∇ bar
      ...
      foo mat1 mat2 strings
      ...
    ∇

5 Motivation – what can go wrong?
- Translating: A’s assumptions → documentation of foo → K’s understanding
  - A lot can be missed, misinterpreted, or left out
  - Test might not catch this
- Maintenance:
  - Updates to foo
  - Updates to bar
  - Assumptions change – requires a synchronous update in three places to be correct.

6 Solution: Variable Types
- Formalize assumptions – make them checkable.
- Introduce variable types and static analysis.
  - Check header specification.
  - Check foo against its header.
  - Check the call to foo from bar.
    ∇ foo args
      ⍝ : args[1] : mat1 As vtINT[;]
      ⍝ : [2] : mat2 As vtINT[mat1:1;mat1:2]
      ⍝ : [3] : strings As vtCHAR[][]
      mat1 mat2 strings←args
      ...
    ∇
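
A minimal sketch of how header annotations like these could be read into a checkable form. It is written in Python for illustration and is not the tool's code. The names VarType, parse_type, parse_alternatives and ranks are hypothetical. It assumes that each bracket group is one nesting level, that the number of semicolon-separated slots in a group is the rank at that level, and that a slot such as mat1:1 refers to an axis length of mat1; the slides suggest this reading but do not spell it out.

```python
import re
from dataclasses import dataclass

# Shorthand stated in the talk: vtSTRING stands for vtCHAR[].
SHORTHANDS = {"vtSTRING": "vtCHAR[]"}


@dataclass(frozen=True)
class VarType:
    base: str      # element type name, e.g. "vtINT"
    levels: tuple  # one tuple of dimension specs per nesting level (assumed reading)


def parse_type(text):
    """Parse one annotation such as 'vtINT[;]' into a VarType."""
    text = text.strip()
    head = re.match(r"[A-Za-z_]\w*", text)
    if head is None:
        raise ValueError(f"no type name in {text!r}")
    base, rest = head.group(0), text[head.end():]
    if base in SHORTHANDS:
        # Expand the shorthand, then parse the expanded annotation.
        return parse_type(SHORTHANDS[base] + rest)
    if re.fullmatch(r"(\[[^\[\]]*\])*", rest) is None:
        raise ValueError(f"malformed brackets in {text!r}")
    groups = re.findall(r"\[([^\[\]]*)\]", rest)
    levels = tuple(tuple(d.strip() for d in g.split(";")) for g in groups)
    return VarType(base, levels)


def parse_alternatives(text):
    """Split 'T1|T2' into its alternatives and parse each one."""
    return [parse_type(part) for part in text.split("|")]


def ranks(vtype):
    """Rank at each nesting level: the number of dimension slots per bracket group."""
    return tuple(len(dims) for dims in vtype.levels)
```

Under these assumptions, vtINT[;] comes out as a rank-2 array, vtCHAR[][] as a vector of vectors, and vtSTRING[] as the same type as vtCHAR[][].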

7 Static Analysis Tool
- First type checker was introduced in SimCorp 10 years ago.
  - Worked well, but had many flaws.
- Recently, the tool has been rewritten from scratch.
  - Many interesting challenges, e.g. parsing APL.
  - 8k lines of F# including 500 lines of FsLex/FsYacc.
  - Understands the semantics of all APL symbols and control-flow constructs.
  - New type checker catches many things the old did not, e.g. potentially all rank errors.

8 Real life example
    ∇ r←y textStringRemove x;h
      ⍝ 2: y As vtSTRING|vtSTRING[] : (string1)(string2).....
      ⍝ 3: x As vtSTRING|vtCHAR[;] : text vector or matrix
      ⍝ 4: r As vtSTRING|vtCHAR[;] : resulting text vector or m...
    ∇
    ...
    dbsource←' 'textStringRemove dbsource
    tokens←'('textSplitAt')'textStringRemove dbsource
    ...
vtSTRING is a short-hand for vtCHAR[]

9 Parsing APL
- APL is statically un-parsable!
- However, it becomes parsable with only a few very minor restrictions.
- In fact, we can make an LALR(1) parser:
  - It is possible to define a completely disambiguated BNF grammar, allowing us to code-generate the parser using Yacc.
  - I.e. we can parse APL from left to right with only a single token lookahead and no backtracking.

10 Parsing APL
The expression x/¨y has two different parse trees, depending on the kind of x:
- x is a function: MonadicApply(OperatorApply(Each, OperatorApply(Reduce, FunctionVariable(x))), ArrayVariable(y))
- x is an array: DyadicApply(ArrayVariable(x), OperatorApply(Each, Replicate), ArrayVariable(y))

11 Parsing APL
- Values come in 3 kinds: Arrays, Functions, and Operators.
- Sequences of Arrays form vectors.
- Functions associate to the right.
- Operators associate to the left.
- Parsing needs complete kind information.
Solution: Separate parsing in two steps with a kind inference algorithm sandwiched in-between:
1. Parse control-flow and matching parentheses, effectively representing expressions as mere token trees.
2. Do kind inference on the token trees.
3. Parse the token trees as full-fledged expressions.
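
A minimal sketch of the parenthesis-matching part of step 1, in Python for illustration rather than the F# of the actual tool. The function name to_token_tree and the nested-list representation are hypothetical, and control-flow parsing is left out. Tokens are assumed to be already lexed into strings.

```python
def to_token_tree(tokens):
    """Group a flat token list into nested lists, one list per parenthesized group.

    No kinds are assigned here; the result is a "mere token tree" in which
    only the parentheses have been matched up.
    """
    root = []
    stack = [root]  # innermost currently open group is stack[-1]
    for tok in tokens:
        if tok == "(":
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unmatched ')'")
            stack.pop()
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unmatched '('")
    return root
```

For the tokens of (x/y)+z this gives a tree whose first element is the group x, /, y, followed by + and z. Whether / is Reduce or Replicate is deliberately left open until the kinds are known.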

12 Kind Inference
- Kind inference naturally proceeds from left to right. Consider e.g. “x.y”, “x/y”, “x[y]”.
- Left-to-right, depth-first scan:
  - The kind of an individual token can be inferred from the kinds of the tokens to its left.
  - Parenthesized expressions can have their compound kind inferred based on the kinds of their subparts.
  - Tag all tokens with their kind and all left-parentheses with the compound kind they enclose.
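
A minimal sketch of the token-level part of such a scan, in Python for illustration. It handles only a flat token list (no parentheses or brackets). The environment env, the table PRIMITIVE_KINDS and the function infer_kinds are hypothetical names. The rule used for the dot (namespace indexer after an array, dyadic operator otherwise) is an assumption made for this sketch, not a rule taken from the tool.

```python
# Kinds: "A" array, "F" function, "M" monadic operator, "D" dyadic operator,
# "." namespace indexer. The table below is a small illustrative subset.
PRIMITIVE_KINDS = {"+": "F", "-": "F", "×": "F", "/": "M", "¨": "M", "∘": "D"}


def infer_kinds(tokens, env):
    """Tag each token with a kind, scanning left to right.

    env maps user-defined names to their kinds (hypothetical environment).
    """
    kinds = []
    for tok in tokens:
        if tok == ".":
            # Assumed rule: the dot depends on the kind of the token to its left.
            left = kinds[-1] if kinds else None
            kinds.append("." if left == "A" else "D")
        elif tok in PRIMITIVE_KINDS:
            kinds.append(PRIMITIVE_KINDS[tok])
        elif tok in env:
            kinds.append(env[tok])
        elif tok[:1].isdigit():
            kinds.append("A")  # numeric literal
        else:
            raise KeyError(f"unknown name {tok!r}")
    return kinds
```

With ns, x and y declared as arrays and f as a function, the tokens of ns.x are tagged A, ., A, while the dot in x+.×y is tagged D because a function stands to its left.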

13 Inferring compound kinds
Kind sequence rewrite algorithm. Uses an elaboration into 5 kinds: Array (A), Function (F), Namespace indexer (.), Monadic operator (M), and Dyadic operator (D).
    K → K (done)
    A A Ks → A Ks
    A. Ks → Ks
    A F Ks → A (done)
    K D D Ks → A (done) // outer product
    F F Ks → A (done)
    F A Ks → A (done)
    [AF] M Ks → F Ks
    K D A A Ks → K D A Ks
    K D A. Ks → K D Ks
    [AF] D [AF] Ks → F Ks
*Assumes a minor preprocessing step that wraps “A. F” with parentheses. Also slightly simplified assuming no “A. D” or “A. M”.
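
The rewrite rules can be read as a first-match loop, sketched here in Python for illustration. The sketch assumes that rules are tried in the order listed, that K matches any single kind, that [AF] matches A or F, and that Ks matches any remainder, including an empty one. The function name compound_kind is hypothetical, and sequences that match no rule are reported as errors, which the slide does not specify.

```python
def compound_kind(kinds):
    """Reduce a sequence of kinds ("A", "F", ".", "M", "D") to one compound kind."""
    ks = list(kinds)
    while True:
        if not ks:
            raise ValueError("empty kind sequence")
        if len(ks) == 1:
            return ks[0]                               # K -> K (done)
        if ks[:2] == ["A", "A"]:
            ks = ks[1:]                                # A A Ks -> A Ks
        elif ks[:2] == ["A", "."]:
            ks = ks[2:]                                # A. Ks -> Ks
        elif ks[:2] == ["A", "F"]:
            return "A"                                 # A F Ks -> A (done)
        elif ks[1:3] == ["D", "D"]:
            return "A"                                 # K D D Ks -> A (done), outer product
        elif ks[:2] in (["F", "F"], ["F", "A"]):
            return "A"                                 # F F Ks / F A Ks -> A (done)
        elif ks[0] in ("A", "F") and ks[1] == "M":
            ks = ["F"] + ks[2:]                        # [AF] M Ks -> F Ks
        elif ks[1:4] == ["D", "A", "A"]:
            ks = ks[:3] + ks[4:]                       # K D A A Ks -> K D A Ks
        elif ks[1:4] == ["D", "A", "."]:
            ks = ks[:2] + ks[4:]                       # K D A. Ks -> K D Ks
        elif (len(ks) >= 3 and ks[0] in ("A", "F")
              and ks[1] == "D" and ks[2] in ("A", "F")):
            ks = ["F"] + ks[3:]                        # [AF] D [AF] Ks -> F Ks
        else:
            raise ValueError(f"no rule applies to {ks}")
```

The order matters for sequences such as F D A A: the array-merging rule has to fire before the dyadic-operator rule, which is how the listed order already arranges it. Every rewriting step shortens the sequence, so the loop terminates.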

14 BNF Grammar (sample excerpt)
    Expr:
      | Vector Func Expr    { DyadicApply(vector $1, $2, $3) }
      | FuncLeftmost Expr   { MonadicApply($1, $2) }
      | Vector              { vector $1 }
    Vector:
      | SimpleExprLeftmost                { [$1] }
      | SimpleExprLeftmost SimpleVector   { $1 :: $2 }
    SimpleExprLeftmost:
      | AtomicExpr                          { $1 }
      | Vector LBRACKET IdxList RBRACKET    { Index(vector $1, $3) }
      | NameSpaceExprLeftmost AtomicExpr    { NameSpace($1, $2) }
    AtomicExpr:
      | LPAREN Expr RPAREN   { $2 }
      | IDARRAY              { IdenArray($1) }
      | INT                  { Value(parseInt($1)) }
      | FLOAT                { Value(Float(parseDouble($1))) }
      | STRING               { Value(parseStringValue($1)) }
      | APLVALUE             { Value(AplNil(parseNiladic($1))) }
    Func:
      | Func MonadicOperator                     { MonadicOpApply($1, $2) }
      | Func DyadicOperatorFuncFunc SimpleFunc   { DyadicOpApply($2, FF($1, $3)) }
      | JOT DyadicOperatorFuncFunc SimpleFunc    { DyadicOpApply($2, FF(AplFunction(OuterProduct), $3)) }
      | SimpleFunc                               { $1 }
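
To illustrate what the three Expr productions build, here is a hand-written recursive-descent sketch in Python. It is a swapped-in technique for illustration only; the tool itself uses a parser generated by FsYacc from the grammar. The sketch assumes every token is already tagged with kind A or F, and it ignores operators, brackets, parentheses and namespaces. The names parse_expr, _expr and _vector and the tuple-based tree shape are hypothetical.

```python
def parse_expr(tokens):
    """Parse a list of (kind, text) pairs, kind being "A" or "F", into a tree."""
    tree, pos = _expr(tokens, 0)
    if pos != len(tokens):
        raise ValueError(f"unexpected token at position {pos}")
    return tree


def _vector(tokens, pos):
    """Collect a maximal run of adjacent array tokens (they form one vector)."""
    items = []
    while pos < len(tokens) and tokens[pos][0] == "A":
        items.append(tokens[pos][1])
        pos += 1
    return items, pos


def _expr(tokens, pos):
    if pos >= len(tokens):
        raise ValueError("expected an expression")
    if tokens[pos][0] == "F":
        # Leading function: monadic application to everything on its right.
        func = tokens[pos][1]
        arg, pos = _expr(tokens, pos + 1)
        return ("MonadicApply", func, arg), pos
    items, pos = _vector(tokens, pos)
    if not items:
        raise ValueError(f"expected an array or function at position {pos}")
    left = ("Vector", items)
    if pos < len(tokens) and tokens[pos][0] == "F":
        # Vector Func Expr: the right argument is the whole remaining expression.
        func = tokens[pos][1]
        right, pos = _expr(tokens, pos + 1)
        return ("DyadicApply", left, func, right), pos
    return left, pos
```

Because the right argument is always the entire rest of the expression, a - b - c comes out as a - (b - c), which is the right-associativity of functions that the grammar encodes.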

15 Restrictions – the fine print
What were those restrictions to allow parsing?
- Defined operators need a static description of whether their operands are functions or arrays.
  - This is not a problem in practice.
- We need an environment describing all global variables and functions.
  - We need this anyway to typecheck function calls.
- (Minor quirk related to the :Until-:AndIf construction.)
