
Perl 6 Internals Dan Sugalski TPC 5.0 “Here there be dragons”




1 Perl 6 Internals Dan Sugalski TPC 5.0 “Here there be dragons”

2 The big goals of perl 6's internals  Speed  Extendibility  Cleanliness  Compatibility  Modularity  Thread Safety  Flexibility

3 Some global decisions  The core will be in C. (Like it or not, it's appropriate for code at this level)  The core must be modular, so pieces can be swapped out without rebuilding  It must be fast  Long-term binary compatibility is a must  Your average perl coder or extension writer shouldn't need any info about the guts  Things should generally be thought out, documented, and engineered

4 The quick overview  Parser  Compiler  Optimizer  Runtime engine

5 [Diagram: processing pipeline. Stages: Parser, Compiler, Optimizer, Interpreter. Data passed between stages: Syntax Tree, Unoptimized Bytecode, Optimized Bytecode. Other labels: Fully-laden Interpreter, Precompiled Bytecode]

6 The parser  Where the whole thing starts  Generally takes source of some sort and turns it into a syntax tree

7 The Bytecode Compiler  Turns a syntax tree into bytecode  Performs some simple optimization

8 The optimizer  Takes the plain bytecode from the compiler and abuses it heavily  An optional step, generally skipped for compile-and-go execution  Should be able to work on small parts of a program for JIT optimization

9 The Interpreter  Takes compiled (and possibly optimized) bytecode and does something with it  Generally that something is execute, but it might also be:  Save to disk  Translate to another format (.NET, Java bytecode)  Compile to machine code

10 The Parser “Double, double, toil and trouble Fire burn, and cauldron bubble”

11 Parser goals  Extendible in perl  More powerful than what we have now  Retargetable  Self-contained and removable

12 Parsing perl isn't easy  May well be one of the toughest languages to properly parse  If we get perl right, other languages are easy. Or at least easier  We have the full power of perl to draw on to do the parsing (including the regex engine and Damian's Bizarre Idea du Jour)

13 The parser will be in C  We will be using C for the parser  A full set of callbacks will be available to hook into the parser in lots of places  Adding new parsing rules (probably with regexes describing them) will be easy  The parser will be extendable via perl code

14 The Compiler “Mmmmm, tasty!”

15 From syntax tree to bytecode  The compiler takes a syntax tree and turns it into bytecode  Very little optimization is done here.  Optimization is expensive and optional  Pretty straightforward—this isn't rocket science

16 The Optimizer “We can rebuild it. Make it better, faster, stronger”

17 The Optimizer  Takes plain bytecode and makes it faster  Does all the sorts of things that you expect an optimizer to do—code motion, loop unrolling, common subexpression work, etc.  Will be an iterative process  This will be interesting, as perl's a pain to optimize  An optional step, of course

18 Things that make optimizing perl tough  Active data  Runtime redefinitions of everything  Really, really late binding (Waiting for Godot late)  Perl programmers are used to more predictable runtime characteristics than, say, C programmers.

19 The Interpreter “Polly want a cracker?”

20 Interpreter goals  Fast  Tuned for perl  Language neutral where possible  Event capable  Sandboxable  Asynchronous I/O built in  Built with an eye towards TIL and/or native code compilation  Better debugging support than perl 5

21 The perl 6 interpreter is a software CPU  Complete with registers and an assembly language  This can make translating perl 6 bytecode into native machine code easier  There's a lot of literature on building optimizing compilers that can be leveraged  While more complex than a pure stack-based machine, it's also faster  Opcode dispatch needs to be faster than perl 5  Opcode functions can be written in perl

22 CPU specs  64 int, float, string, and PMC registers  A segmented multiple stack architecture  Interrupt-capable (for events)  Pretty much completely position independent: everything is referenced via register, pad entry, or name
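
A minimal sketch of the "software CPU" idea from slides 21 and 22 may help here: a register file plus a dispatch loop over fixed-format instructions. Everything in it is an assumption made for illustration (the `Instr` layout, the opcode names, the `run` and `demo_multiply` functions); the slides do not define an opcode set, and only the integer register file is modeled.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; this is not the Perl 6 opcode set. */
enum { NUM_REGS = 64 };

typedef enum { OP_SET, OP_ADD, OP_MUL, OP_END } Opcode;

typedef struct {
    Opcode op;
    int dst;     /* destination register index */
    long a, b;   /* source register indexes; for OP_SET, a is an immediate */
} Instr;

typedef struct {
    long int_reg[NUM_REGS];   /* integer register file only, for brevity */
} Cpu;

/* Execute until OP_END; returns the number of instructions run before it.
 * Source register indexes are trusted in this sketch. */
size_t run(Cpu *cpu, const Instr *code)
{
    size_t executed = 0;
    for (const Instr *pc = code; ; pc++, executed++) {
        assert(pc->dst >= 0 && pc->dst < NUM_REGS);
        switch (pc->op) {
        case OP_SET:
            cpu->int_reg[pc->dst] = pc->a;
            break;
        case OP_ADD:
            cpu->int_reg[pc->dst] = cpu->int_reg[pc->a] + cpu->int_reg[pc->b];
            break;
        case OP_MUL:
            cpu->int_reg[pc->dst] = cpu->int_reg[pc->a] * cpu->int_reg[pc->b];
            break;
        case OP_END:
            return executed;
        }
    }
}

/* Loads 6 and 7 into two registers, multiplies them into a third. */
long demo_multiply(void)
{
    static const Instr program[] = {
        { OP_SET, 0, 6, 0 },
        { OP_SET, 1, 7, 0 },
        { OP_MUL, 2, 0, 1 },
        { OP_END, 0, 0, 0 },
    };
    Cpu cpu = { {0} };
    run(&cpu, program);
    return cpu.int_reg[2];
}
```

Because operands name registers rather than stack slots, each instruction carries its own inputs and outputs, which is the property that makes a later translation to native code more direct.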

23 The regex engine  The regex engine is going to be part of the perl 6 CPU, not separate as it is now  A good incentive to get opcode dispatch fast  Makes expanding the regex engine a bit easier  Details will be hidden as a set of regex opcodes

24 A few words on the stack system  Each register file has an associated stack  All registers of a particular type can be pushed onto or popped off the stack in one go  Individual registers or groups of registers can be pushed or popped  The stacks are all segmented so we're not relying on finding contiguous chunks of memory for them  There's also a set of call and scratch stacks
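
The stack rules above can be sketched as follows: a whole register file is pushed or popped in one go, and the stack grows in fixed-size segments so no large contiguous block is ever needed. The names (`Segment`, `IntRegFile`, `push_all`, `pop_all`) and the segment size are assumptions for illustration, not taken from the slides.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { NUM_REGS = 64, FRAMES_PER_SEGMENT = 4 };

/* One stack segment holds a few saved copies of the register file. */
typedef struct Segment {
    long frames[FRAMES_PER_SEGMENT][NUM_REGS];
    int used;               /* number of frames stored in this segment */
    struct Segment *prev;   /* older segment, or NULL */
} Segment;

typedef struct {
    long reg[NUM_REGS];     /* the live integer registers */
    Segment *top;           /* newest segment of the associated stack */
} IntRegFile;

/* Save all registers at once. Returns 0 on success, -1 if out of memory. */
int push_all(IntRegFile *rf)
{
    if (rf->top == NULL || rf->top->used == FRAMES_PER_SEGMENT) {
        Segment *seg = malloc(sizeof *seg);   /* grow by one segment */
        if (seg == NULL)
            return -1;
        seg->used = 0;
        seg->prev = rf->top;
        rf->top = seg;
    }
    memcpy(rf->top->frames[rf->top->used], rf->reg, sizeof rf->reg);
    rf->top->used++;
    return 0;
}

/* Restore all registers at once. Returns 0 on success, -1 if stack empty. */
int pop_all(IntRegFile *rf)
{
    Segment *seg = rf->top;
    if (seg == NULL)
        return -1;
    assert(seg->used > 0);   /* empty segments are never left on the chain */
    seg->used--;
    memcpy(rf->reg, seg->frames[seg->used], sizeof rf->reg);
    if (seg->used == 0) {    /* release a drained segment */
        rf->top = seg->prev;
        free(seg);
    }
    return 0;
}
```

Pushing six frames with these sizes spans two segments; popping them returns the registers in reverse order and frees both segments.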

25 Bytecode “Could you say that a little differently?”

26 What is bytecode?  A distilled version of a program  Machine language for the PVM  Can contain a lot of 'extra' information, including full source  Designed to be platform independent  Should be mostly mappable as shared data (modulo the fixup sections)

27 Data Structures “Vtables and strings and floats, oh my!”

28 Variables  [Diagram of the base structure, fields: Vtable Pointer, Data Pointer, Integer Value, Float Value, Flags, Synchronization, GC Data]  Generically called a PMC  Bigger than Perl 5's base data structure  Synchronization data built-in  Same for all variable types  GC data is not part of base structure

29 Scalars  Built off the base PMC structure  Use the integer and float areas as caches  Data pointer points off to string, large int, or large float  Vtable functions determine how it all works
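
As a rough illustration of slides 28 and 29, the base structure and the "integer and float areas as caches" idea might look like the sketch below. The field names follow the slide's diagram labels; the flag bits, the helper functions, and the choice to leave GC data out of the struct are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

struct VTable;   /* left opaque in this sketch */

/* Hypothetical flag bits recording which cached value is current. */
enum { PMC_INT_VALID = 1, PMC_FLOAT_VALID = 2 };

typedef struct PMC {
    const struct VTable *vtable;  /* table of access functions */
    void *data;                   /* string, large int, or large float */
    long int_val;                 /* integer cache */
    double float_val;             /* float cache */
    unsigned flags;
    void *sync;                   /* synchronization data, opaque here */
    /* GC data is kept outside the base structure, per the slide. */
} PMC;

/* Store an integer; any previously cached float is now stale. */
void pmc_set_int(PMC *p, long v)
{
    p->int_val = v;
    p->flags = PMC_INT_VALID;
}

/* Fetch the float value, filling the float cache on first use. */
double pmc_get_float(PMC *p)
{
    if (!(p->flags & PMC_FLOAT_VALID)) {
        assert(p->flags & PMC_INT_VALID);   /* sketch handles ints only */
        p->float_val = (double)p->int_val;
        p->flags |= PMC_FLOAT_VALID;
    }
    return p->float_val;
}
```

The second call to `pmc_get_float` on an unchanged value skips the conversion, which is the point of carrying both cache areas in every PMC.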

30 Arrays  Built off the base PMC structure  Data pointer points to array data  All perl 6 arrays are typed  May have an array of scalars, strings, integers, or floats  Arrays only take up enough memory to hold their types

31 Hashes  Built off the base PMC structure  Data pointer points to hash data  All perl 6 hashes are typed  May have a hash of scalars, strings, integers, or floats  Hashes only take up enough memory to hold their types  Hashing function is overridable

32 Strings  [Diagram of the string header, fields: Encoding, Type, Buffer Start, Buffer Length, String Length, String Size, Flags, Unused]  Strings are sort of abstract  Perl 6 can mix and match string data (Unicode, ASCII, EBCDIC, etc)  New string types can be loaded on the fly
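
One way to read the string header above is sketched here: the header points at an encoding descriptor, and per-encoding functions do the work. This is an interpretation, not the deck's definition. In particular, treating Buffer Length as bytes allocated, String Size as bytes in use, and String Length as a character count is an assumption, as are all function and type names.

```c
#include <assert.h>
#include <stddef.h>

/* Per-encoding behaviour lives behind function pointers (hypothetical). */
typedef struct Encoding {
    const char *name;
    size_t (*char_count)(const unsigned char *buf, size_t nbytes);
} Encoding;

typedef struct String {
    const Encoding *encoding;
    unsigned type;             /* meaning not specified on the slide */
    unsigned char *buf_start;
    size_t buf_len;            /* assumed: bytes allocated */
    size_t str_len;            /* assumed: length in characters */
    size_t str_size;           /* assumed: bytes in use */
    unsigned flags;
} String;

static size_t ascii_count(const unsigned char *buf, size_t n)
{
    (void)buf;
    return n;                  /* one byte per character */
}

static size_t utf8_count(const unsigned char *buf, size_t n)
{
    size_t chars = 0;
    for (size_t i = 0; i < n; i++)
        if ((buf[i] & 0xC0) != 0x80)   /* skip continuation bytes */
            chars++;
    return chars;
}

const Encoding ENC_ASCII = { "ascii", ascii_count };
const Encoding ENC_UTF8  = { "utf8",  utf8_count };

/* Point a header at existing bytes and let the encoding count characters. */
void string_bind(String *s, const Encoding *enc,
                 unsigned char *buf, size_t used, size_t capacity)
{
    assert(used <= capacity);
    s->encoding  = enc;
    s->type      = 0;
    s->buf_start = buf;
    s->buf_len   = capacity;
    s->str_size  = used;
    s->str_len   = enc->char_count(buf, used);
    s->flags     = 0;
}
```

Adding a new string type in this scheme means supplying another `Encoding` value, which matches the slide's claim that string types can be loaded on the fly.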

33 String handling  Perl 6 has no 'built-in' string support—all string support is via loadable libraries  There'll be Unicode, ASCII, and EBCDIC support provided (at least) to start

34 Numbers  [Diagram of the shared header, fields: Buffer Pointer, Length, Exponent, Flags]  Bigints and bigfloats share the same header  Arbitrary-length floating point and integer numbers are supported  Perl automagically upgrades ints and floats when needed
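
The automatic upgrade mentioned above can be illustrated with an overflow-checked addition. This sketch is deliberately simplified: it upgrades to a `double` as a stand-in, where the slide describes arbitrary-length bigints and bigfloats. The `Num` type and `num_add` are hypothetical names.

```c
#include <assert.h>
#include <limits.h>

/* Result is either a native integer or an upgraded value. */
typedef struct {
    int is_float;   /* 1 if the result had to be upgraded */
    long i;         /* valid when is_float == 0 */
    double f;       /* valid when is_float == 1 */
} Num;

/* Add two native ints; upgrade instead of overflowing. */
Num num_add(long a, long b)
{
    Num r = { 0, 0, 0.0 };
    int overflows = (b > 0 && a > LONG_MAX - b) ||
                    (b < 0 && a < LONG_MIN - b);
    if (overflows) {
        /* Stand-in for a bigint: a real implementation would keep
         * full precision in a variable-length buffer. */
        r.is_float = 1;
        r.f = (double)a + (double)b;
    } else {
        r.i = a + b;
    }
    return r;
}
```

The overflow test runs before the addition, so the signed arithmetic never overflows and the behaviour stays well defined in C.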

35 Vtables  All variable data access is done through a table of functions that the variable carries around with it  This allows us faster access, since code paths are specialized for just the functions they need to perform  Isolates us from the implementation of variables internally  Allows special purpose behaviour (like perl 5's magic) to be attached without cost to the rest of perl

36 Vtables (cont'd)  Makes thread safety easier  A little bit more overhead because of the extra level of indirection, but the smaller functions make up for that  Vtable functions can be written in perl. (Each class with objects blessed into it will have at least one)  There may be more than one vtable per package

37 Vtables hide data manipulation  Pretty much all the code to handle data manipulation will be done via variable vtables  This allows the variable implementation to change without perl needing to know  Allows far more flexibility in what you can make a variable do  Shortens the code path for data functions and trims out extraneous conditionals

38 For example: Fetching the string value of a scalar
For scalars with strings:
    String *get_str(PMC *my_PMC) {
        return my_PMC->data_pointer;
    }
For an int-only scalar:
    String *get_str(PMC *my_PMC) {
        my_PMC->data_pointer = make_string(my_PMC->integer);
        my_PMC->vtable = int_and_string_vtable;
        return my_PMC->data_pointer;
    }
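
The slide's two `get_str` functions can be fleshed out into a compilable sketch that shows the vtable swap happening. The slide's `String` type and `make_string` are not defined in the deck, so this version substitutes a plain C string and `snprintf`; the `pmc_*` helper names are likewise assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct PMC PMC;

typedef struct {
    const char *(*get_str)(PMC *self);
} VTable;

struct PMC {
    const VTable *vtable;
    char *data_pointer;   /* plain C string stands in for String */
    long integer;
};

/* Scalar that already has a string: just hand it back. */
static const char *str_get_str(PMC *p)
{
    return p->data_pointer;
}
static const VTable int_and_string_vtable = { str_get_str };

/* Int-only scalar: build the string once, then switch vtables so
 * later calls take the short path above. */
static const char *int_get_str(PMC *p)
{
    char buf[32];
    snprintf(buf, sizeof buf, "%ld", p->integer);
    char *copy = malloc(strlen(buf) + 1);
    if (copy == NULL)
        return NULL;
    strcpy(copy, buf);
    p->data_pointer = copy;
    p->vtable = &int_and_string_vtable;
    return p->data_pointer;
}
static const VTable int_only_vtable = { int_get_str };

void pmc_init_int(PMC *p, long value)
{
    p->vtable = &int_only_vtable;
    p->data_pointer = NULL;
    p->integer = value;
}

/* Callers never test the variable's state; they just dispatch. */
const char *pmc_get_str(PMC *p)
{
    assert(p != NULL && p->vtable != NULL);
    return p->vtable->get_str(p);
}

int pmc_has_cached_string(const PMC *p)
{
    return p->vtable == &int_and_string_vtable;
}
```

Neither `get_str` body contains a conditional on the scalar's state: the state is encoded in which vtable the PMC currently carries, which is the "trims out extraneous conditionals" claim from slide 37.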

39 Memory Management “Now where did I put that?”

40 Getting headers  All the fixed-size things (PMCs, string/number headers) get allocated from arenas  All headers, with the exception of PMCs (maybe) are moveable by the garbage collector  Non-PMC header allocation is very fast  PMC allocation is only mostly fast

41 Buffer Management  Anything that isn't a fixed size gets allocated from the buffer pools  All buffered data, with the exception of data allocated in special pools, is moveable by the garbage collector  Because of GC, allocation is very quick

42 Garbage Collection “Bring out yer dead!”

43 The perl 6 GC is a copying collector  Everything except PMCs is moveable in Perl 6  PMCs might be moveable too  We get a compact memory heap out of this, which allows for fast allocation  Perl 6 will release empty memory back to the system when it can  Refcounts are used only to note object lifetimes, not for GC  Refcounts, for the most part, are dead
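
To make the link between a copying collector and fast allocation concrete, here is a small two-space sketch: allocation is a pointer bump, and collection copies live buffers into the other space and updates their headers. It is an illustration under stated assumptions, not the Perl 6 collector. Liveness is set by hand through a `live` field instead of being traced, header slots are never reused, and all names and sizes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { POOL_BYTES = 256, MAX_HEADERS = 16 };

/* Fixed-position header; the buffer it points at may move. */
typedef struct {
    unsigned char *start;
    size_t len;
    int live;   /* set by hand here; a real collector would trace */
} BufHeader;

typedef struct {
    unsigned char space[2][POOL_BYTES];   /* from-space and to-space */
    int active;                           /* index of the space in use */
    size_t top;                           /* bump pointer */
    BufHeader headers[MAX_HEADERS];
    size_t nheaders;
} Pool;

void pool_init(Pool *p)
{
    p->active = 0;
    p->top = 0;
    p->nheaders = 0;
}

/* Copy live buffers to the other space, packing them together. */
void pool_collect(Pool *p)
{
    int to = 1 - p->active;
    size_t top = 0;
    for (size_t i = 0; i < p->nheaders; i++) {
        BufHeader *h = &p->headers[i];
        if (!h->live)
            continue;
        memcpy(&p->space[to][top], h->start, h->len);
        h->start = &p->space[to][top];   /* header follows the move */
        top += h->len;
    }
    p->active = to;
    p->top = top;
}

/* Allocation is a bounds check plus a pointer bump. */
BufHeader *pool_alloc(Pool *p, size_t len)
{
    if (p->nheaders == MAX_HEADERS || len > POOL_BYTES)
        return NULL;
    if (p->top + len > POOL_BYTES) {
        pool_collect(p);
        if (p->top + len > POOL_BYTES)
            return NULL;
    }
    BufHeader *h = &p->headers[p->nheaders++];
    h->start = &p->space[p->active][p->top];
    h->len = len;
    h->live = 1;
    p->top += len;
    return h;
}
```

Code that holds a `BufHeader *` keeps working across a collection because only the `start` field changes, which is why the slides distinguish moveable buffers from their headers.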

44 GC considerations for Objects  Garbage collection and object death are now separate things  Perl's guarantee of timely object death is stronger  We still don't guarantee perfect collection (but it sucks less)  We still refcount for real perl references, but only 2 bits are used  Objects with more than two simultaneous references won't get collected until a full dead variable scan is made
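
The two-bit reference count described above can be sketched as a saturating counter. The encoding is an assumption: values 0 to 2 are exact counts and the value 3 is treated as a sticky "more than two" marker, so such objects are left for the full dead-variable scan. The slide does not specify the bit layout, and the `rc_*` names are hypothetical.

```c
#include <assert.h>

enum { RC_MASK = 3, RC_STICKY = 3 };

typedef struct {
    unsigned flags;   /* low two bits hold the reference count */
} Obj;

static unsigned rc_get(const Obj *o)
{
    return o->flags & (unsigned)RC_MASK;
}

static void rc_store(Obj *o, unsigned rc)
{
    o->flags = (o->flags & ~(unsigned)RC_MASK) | rc;
}

/* Add a reference; the count sticks once it passes two. */
void rc_inc(Obj *o)
{
    unsigned rc = rc_get(o);
    if (rc != RC_STICKY)
        rc_store(o, rc + 1);
}

/* Drop a reference. Returns 1 if the object is now known dead,
 * 0 if it is still referenced or its true count is unknown. */
int rc_dec(Obj *o)
{
    unsigned rc = rc_get(o);
    if (rc == RC_STICKY)
        return 0;            /* only a full scan can decide */
    assert(rc > 0);
    rc_store(o, rc - 1);
    return rc - 1 == 0;
}

int rc_needs_full_scan(const Obj *o)
{
    return rc_get(o) == RC_STICKY;
}
```

With this encoding, an object that ever had a third simultaneous reference never reports itself dead through `rc_dec`, matching the slide's note that such objects wait for a full scan.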

45 Extensions beware!  Since we have no refcounts, extensions must tell perl when they hold on to PMCs  Not a huge deal, as we piggy-back on the cross-interpreter PMC tracking we use for threads  No more struct PMC; in extensions...

46 Extending Perl 6

47 Extensions Made Easier  Perl 6 will have a real API  The API is multilevel  Simple for embedders  More complex for extension authors  Pretty messy for vtable or opcode writers  Binary compatibility is a very strong consideration

48 Embedding  Guaranteed stable and binary compatible for the life of perl 6  Very simple API  Create interpreter  Destroy interpreter  Parse source  Run code  Register native functions

49 Extensions  Much simpler interface to perl's internals  The gory details are hidden  Stable binary compatibility is a very strong goal  We may add functions or options, but we won't take them away  Extensions built for perl 6.0.1 should still run with perl 6.8.12 without rebuilding  Manipulating perl data should be much easier  If you have to resort to Inline to wrap a library then it means we've not got it right

50 Extensions (cont)  Inline, or something like it, is probably going to be the standard for extending perl  XS, when you have to resort to it, will be far less nasty than it is now

51 Homegrown Opcodes and Vtables  This is part of the grubby inside of perl 6  You can use any of the internal routines of perl  If you do, though, you may run into backward-compatibility issues at some point. (If it's not part of the embedding, utility, or extension API, we make no promises)  There's no guarantee that calling conventions won't change.  No guarantees that perl 6.4 will even use vtables or opcodes

52 Utility library  Perl 6 will provide a set of utility routines to handle common tasks  String manipulation  Encoding changes (Shift-JIS to Unicode, EBCDIC to ASCII)  Conversion routines (string to int or float)  Extended precision math (int and float)  These will be stable, like the rest of the API

53 Variations on a Theme “Toccata and Fugue in perl minor by Wall”

54 The source doesn't have to be perl  The parser isn't obligated to be parsing perl  Input source could be Python, Ruby, Java, or INTERCAL  The full perl parser is optional

55 The interpreter doesn't have to interpret  The interpreter is the destination for bytecode, but it doesn't have to interpret it  It might save directly to disk  It might translate the bytecode into an alternate form (Java bytecode, .NET code, or executable code, for example)  The interpreter might translate to machine code on the fly, as a sort of JIT compiler. (Well, really a TIL, but...)

