1 Parametric Polymorphism for Popular Programming Languages Andrew Kennedy Microsoft Research Cambridge.

2 Or: Forall for all Andrew Kennedy Microsoft Research Cambridge (Joint work with Don Syme)

3 Curriculum Vitae for FOOLs http://research.microsoft.com/~akenn

4 Parametric polymorphism
Parameterize types and code by types
- Concept: Strachey (1967)
- Language: ML (Milner, 1975), Clu (Liskov, 1975)
- Foundations: System F (Girard, 1971), Polymorphic lambda calculus (Reynolds, 1974)
- Engineering benefits are well-known (code re-use & strong typing)
- Implementation techniques are well-researched

5 Polymorphic Programming Languages Standard ML O’Caml Eiffel Ada GJ C++ Mercury Miranda Pizza Haskell Clu

6 Widely-used Polymorphic Programming Languages C++

7 Widely-used Strongly-typed Polymorphic Programming Languages

8 In 2004? C# Visual Basic? Java Cobol, Fortran, …?

9 This talk
The .NET “generics” project:
- What was challenging?
- What was surprising?
- What’s left?

10 What is the .NET CLR (Common Language Runtime)?
For our purposes, the CLR:
- Executes MS-IL (Intermediate Language) programs using just-in-time or way-ahead-of-time compilation
- Provides an object-oriented common type system
- Provides managed services: garbage collection, stack-walking, reflection, persistence, remote objects
- Ensures security through type-checking (verification) and code access security (permissions + stack inspection)
- Supports multiple source languages and interop between them

11 Themes
- Design: Can multiple languages be accommodated by a single design? What were the design trade-offs?
- Implementation: How can run-time types be implemented efficiently?
- Theory: How expressive is it?
- Practice: Would you like to program in it?
- Future: Have we done enough?

12 Timeline of generics project
May 1999    Don Syme presents proposal to C# and CLR teams
Feb 2000    Initial prototype of extension to CLR
Jan 2002    Our code is integrated into the product team’s code base
Nov 2002    Anders Hejlsberg announces generics at OOPSLA’02
late 2004?  Product release of .NET v1.2 with generics
Feb 2001    Product release of .NET v1.0

13 Design

14 Design for multiple languages
- C++: Can I write class C<T> : T
- ML: Functors are cool!
- Visual Basic: Don’t touch my language!
- C++: Give me template specialization
- C++: And template meta-programming
- Java: Run-time types please
- Scheme: Why should I care?
- C#: Just give me decent collection classes
- Haskell: Rank-n types? Existentials? Kinds? Type classes?
- Eiffel: All generic types covariant please

15 Some design goals
- Simplicity: Don’t surprise the programmer with odd restrictions
- Consistency: Fit with the object model of .NET
- Separate compilation: Type-check once, instantiate anywhere

16 Non-goals
- C++ style template meta-programming: Leave this to source-language compilers
- Higher-order polymorphism, existentials: Hey, let’s get the basics right first!

17 What’s in the design?
Type parameterization for all declarations:
- classes, e.g. class Set<T>
- interfaces, e.g. interface IComparable<T>
- structs, e.g. struct HashBucket<…>
- methods, e.g. static void Reverse<T>(T[] arr)
- delegates (“first-class methods”), e.g. delegate void Action<T>(T arg)

18 What’s in the design (2)?
Bounds on type parameters:
- single class bound (“must extend”), e.g. class Grid<T> where T : Control
- multiple interface bounds (“must implement”), e.g. class Set<T> where T : IComparable<T>

19 Simplicity => no odd restrictions
    interface IComparable<T> { int CompareTo(T other); }
    class Set<T> : IEnumerable<T> where T : IComparable<T> {
      private TreeNode<T> root;
      public static Set<T> empty = new Set<T>();
      public void Add(T x) { … }
      public bool HasMember(T x) { … }
    }
    Set<Set<…>> s = new Set<Set<…>>();
- Type arguments can be value or reference types
- Even statics can use type parameter
- Bounds can reference type parameter (“F-bounded polymorphism”)
- Interfaces and superclass can be instantiated
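The points on this slide can be tried out with a small stand-in. The following is only a sketch, not the talk's `Set` class: `SimpleSet`, `Empty`, `Count` and `Find` are hypothetical names, and a sorted `List<T>` replaces the tree of `TreeNode`s. It uses the `IComparable<T>` interface from the released .NET libraries as the F-bounded constraint and shows a static field whose type mentions the type parameter.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the slide's Set<T>: a sorted list instead of a tree.
sealed class SimpleSet<T> where T : IComparable<T>   // bound mentions T itself
{
    private readonly List<T> items = new List<T>();  // kept sorted, no duplicates

    // A static that uses the type parameter: one Empty per instantiation.
    public static readonly SimpleSet<T> Empty = new SimpleSet<T>();

    public int Count { get { return items.Count; } }

    public void Add(T x)
    {
        int i = Find(x);
        if (i < 0) items.Insert(~i, x);               // ~i is the insertion point
    }

    public bool HasMember(T x) { return Find(x) >= 0; }

    // Binary search; returns the index, or the complement of the insertion point.
    private int Find(T x)
    {
        int lo = 0, hi = items.Count - 1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            int c = items[mid].CompareTo(x);
            if (c == 0) return mid;
            if (c < 0) lo = mid + 1; else hi = mid - 1;
        }
        return ~lo;
    }
}
```

Because `int` and `string` both implement `IComparable<T>` for themselves, `SimpleSet<int>` and `SimpleSet<string>` are both accepted, covering a value type and a reference type with the same definition.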

20 Consistency => preserve types at run-time
- Type-safe serialization:
    Object obj = formatter.Deserialize(file);
    LinkedList<…> list = (LinkedList<…>) obj;
- Interop with legacy code:
    // Just wrap existing Stack until we get round to re-implementing it
    class GStack<T> {
      Stack st;
      public void Push(T x) { st.Push(x); }
      public T Pop() { return (T) st.Pop(); }
      …
- Reflection:
    object obj; …
    Type ty = obj.GetType().GetGenericArguments()[0];
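A minimal sketch of what preserved run-time types make possible, written against the released .NET reflection API (`Type.IsGenericType`, `Type.GetGenericArguments`). The class and method names here are hypothetical, and `List<int>` from `System.Collections.Generic` merely stands in for the slide's `LinkedList`.

```csharp
using System;
using System.Collections.Generic;

static class RuntimeTypesDemo
{
    // Returns the first type argument of obj's run-time type,
    // or null when obj is not an instance of a generic type.
    public static Type FirstTypeArgument(object obj)
    {
        Type ty = obj.GetType();
        return ty.IsGenericType ? ty.GetGenericArguments()[0] : null;
    }

    // A checked test against one exact instantiation: a List<string>
    // is not a List<int>, even though both come from the same definition.
    public static bool IsListOfInt(object obj)
    {
        return obj is List<int>;
    }
}
```

With erased generics neither method could be written: the instantiation would not be recoverable from the object.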

21 Separate compilation => restrict generic definitions
- No dispatch through a type parameter
    class C<T> { void meth() { T.othermeth(); } }   // don’t know what’s in T
- No inheritance from a type parameter
    class Weird<T> : T { … }   // don’t know what’s in T

22 Implementation

23 Compiling polymorphism, as was
Two main techniques:
- Specialize code for each instantiation
  - C++ templates, MLton & SML.NET monomorphization
  - good performance
  - code bloat
- Share code for all instantiations
  - Either use a single representation for all types (ML, Haskell)
  - Or restrict instantiations to “pointer” types (Java)
  - no code bloat
  - poor performance (extra boxing operations required on primitive values)

24 Compiling polymorphism in the Common Language Runtime
- Polymorphism is built in to the intermediate language (IL) and the execution engine
- CLR performs “just-in-time” type specialization
- Code sharing avoids bloat
- Performance is (almost) as good as hand-specialized code

25 Code sharing
- Rule: share field layout and code if type arguments have the same representation
- Examples:
  - Representation and code for methods in Set<string> can also be used for Set<object> (string and object are both 32-bit pointers)
  - Representation and code for Set<int> is different from Set<long> (int uses 32 bits, long uses 64 bits)
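The sharing rule can be approximated in a few lines. This is a deliberately simplified sketch, not the CLR's actual policy: it assumes every reference type is a pointer of the same size, and it conservatively refuses to share between two different value types even if they happen to have the same size. `SharingRule` and `CanShareCode` are hypothetical names.

```csharp
using System;

static class SharingRule
{
    // Simplified approximation of "same representation":
    // identical types trivially share; otherwise both must be reference types.
    public static bool CanShareCode(Type a, Type b)
    {
        if (a == b) return true;
        return !a.IsValueType && !b.IsValueType;
    }
}
```

Under this approximation `string` and `object` share one body of code, while `int` and `long` each get their own.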

26 Exact run-time types
- We want to support
    if (x is Set<string>) { ... } else if (x is Set<object>) { ... }
- But representation and code is shared between compatible instantiations, e.g. Set<string> and Set<object>
- So there’s a conflict to resolve…
- …and we don’t want to add lots of overhead to languages that don’t use run-time types (ML, Haskell)

27 Object representation in the CLR
- normal object representation: type = vtable pointer
    vtable ptr, fields
- array representation: type is inside object
    vtable ptr, element type, no. of elements, elements

28 Object representation for generics
- Array-style: store the instantiation directly in the object?
  - extra word (possibly more for multi-parameter types) per object instance
  - e.g. every list cell in ML or Haskell would use an extra word
- Alternative: make vtable copies, store instantiation info in the vtable
  - extra space (vtable size) per type instantiation
  - expect no. of instantiations << no. of objects
  - so we chose this option

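29 Object representation for generics
- x : Set<string> is laid out as vtable ptr, fields; its vtable records the instantiation string
- y : Set<object> is laid out as vtable ptr, fields; its vtable records the instantiation object
- Each vtable has its own slots for Add, HasMember, ToArray, …
- The slots in both vtables point at the same code for Add, code for HasMember and code for ToArray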

30 Type parameters in shared code
- Run-time types with embedded type parameters, e.g.
    class TreeSet<T> { void Add(T item) { ..new TreeNode<T>(..).. } }
- Q: Where do we get T from if code for Add is shared?
  A: It’s always obtainable from instantiation info in this object
- Q: How do we look up the type rep for TreeNode<T> efficiently at run-time?
  A: We keep a “dictionary” of such type reps in the vtable for TreeSet<T>

31-37 Dictionaries in action
    class Set<T> {
      …
      public void Add(T x) { … …new TreeNode<T>()… }
      public T[] ToArray() { … …new T[]… }
    }
    Set<string> s = new Set<string>();
    s.Add("a");
    Set<Set<string>> ss = new Set<Set<string>>();
    ss.Add(s);
    Set<string>[] ssa = ss.ToArray();
    string[] sa = s.ToArray();
Dictionary entries appear in each vtable as the statements run:
- new Set<string>(): the vtable for Set<string> holds string, then … vtable slots …
- s.Add("a"): the vtable for Set<string> gains TreeNode<string>
- new Set<Set<string>>(): the vtable for Set<Set<string>> holds Set<string>, then … vtable slots …
- ss.Add(s): the vtable for Set<Set<string>> gains TreeNode<Set<string>>
- ss.ToArray(): the vtable for Set<Set<string>> gains Set<string>[]
- s.ToArray(): the vtable for Set<string> gains string[]

38 x86 code for new TreeNode<T>
    ; Retrieve dictionary entry from vtable
    mov  ESI, dword ptr [EDI]
    mov  EAX, dword ptr [ESI+24]
    mov  EAX, dword ptr [EAX]
    add  EAX, 4
    mov  dword ptr [EBP-0CH], EAX
    mov  EAX, dword ptr [EBP-0CH]
    mov  EBX, dword ptr [EAX]
    ; If non-null then skip
    test EBX, EBX
    jne  SHORT G_M003_IG06
  G_M003_IG05:
    ; Look up handle the slow way
    push dword ptr [EBP-0CH]
    push ESI
    mov  EDX, 0x1b000002
    mov  ECX, 0x903ea0
    call @RuntimeHandle
    jmp  SHORT G_M003_IG07
  G_M003_IG06:
    mov  EAX, EBX
  G_M003_IG07:
    ; Create the object with run-time type
    mov  ECX, EAX
    call @newClassSmall
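The fast-path/slow-path shape of this listing can be modelled at source level. The sketch below is only an illustration of lazy dictionary filling, not the CLR's data structure: `TypeDictionary`, `Get`, `SlowLookups` and the `slowLookup` callback are all hypothetical names, and a `Type` object stands in for a run-time type handle.

```csharp
using System;

// Hypothetical model of the per-instantiation dictionary kept in a vtable.
// Slots start out empty (null) and are filled the first time they are needed.
sealed class TypeDictionary
{
    private readonly Type[] slots;
    private readonly Func<int, Type> slowLookup;   // stands in for the slow run-time lookup

    public int SlowLookups { get; private set; }   // counts how often the slow path ran

    public TypeDictionary(int size, Func<int, Type> slowLookup)
    {
        slots = new Type[size];
        this.slowLookup = slowLookup;
    }

    public Type Get(int index)
    {
        Type cached = slots[index];
        if (cached != null) return cached;   // "if non-null then skip"

        SlowLookups++;                       // "look up handle the slow way"
        cached = slowLookup(index);
        slots[index] = cached;               // later calls take the fast path
        return cached;
    }
}
```

Only the first request for a given slot pays for the slow lookup; every later request is a load and a null test, which is the trade-off the lazy scheme relies on.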

39 Is it worth it?
- With no dictionaries, just run-time look-up: new Set<T>() is 10x to 100x slower than normal object creation
- With lazy dictionary look-up: new Set<T>() is ~10% slower than normal object creation

40 Shared code for polymorphic methods
Polymorphic methods:
- Specialize per instantiation on demand
- Again share code between instantiations where possible
- Run-time types issue solved by “dictionary-passing” style

41 Performance
- Non-generic quicksort:
    void Quicksort(object[] arr, IComparer comp)
- Generic quicksort:
    void GQuicksort<T>(T[] arr, GIComparer<T> comp)
- Compare on element types int, string, double
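For concreteness, here is one way the generic signature could be filled in. This is a sketch rather than the benchmarked code from the talk: it uses the `IComparer<T>` interface from the released `System.Collections.Generic` namespace in place of the slide's `GIComparer`, and the `Sorting` class and private `Sort` helper are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

static class Sorting
{
    // Generic quicksort: no casts and no boxing of elements such as int or double.
    public static void GQuicksort<T>(T[] arr, IComparer<T> comp)
    {
        Sort(arr, 0, arr.Length - 1, comp);
    }

    // Sorts arr[lo..hi] in place using Hoare-style partitioning.
    private static void Sort<T>(T[] arr, int lo, int hi, IComparer<T> comp)
    {
        if (lo >= hi) return;
        T pivot = arr[lo + (hi - lo) / 2];
        int i = lo, j = hi;
        while (i <= j)
        {
            while (comp.Compare(arr[i], pivot) < 0) i++;
            while (comp.Compare(arr[j], pivot) > 0) j--;
            if (i <= j)
            {
                T tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
                i++; j--;
            }
        }
        Sort(arr, lo, j, comp);
        Sort(arr, i, hi, comp);
    }
}
```

A call such as `Sorting.GQuicksort(new[] { 3, 1, 2 }, Comparer<int>.Default)` works on an `int[]` directly, whereas the `object[]` version has to box every element before sorting.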

42 Performance

43 Theory

44 Transposing F to C#
- As musical keys, F and C♯ are far apart
- As programming languages, (System) F and (Generic) C♯ are far apart
- But: Polymorphism in Generic C♯ is as expressive as polymorphism in System F

45 System F and C♯
System F                                        Generic C♯
Structural equivalence for types                Name equivalence for types
No subtyping                                    Subtyping & inheritance
First-class functions                           Virtual methods
Quantified types (“first-class polymorphism”)   Parameterized classes & polymorphic methods

46 System F into C♯
Despite the differences, we can formalize a translation from System F into (Generic) C♯ that
- is fully type-preserving (no loss of information)
- is sound (preserves program behaviour)
- makes crucial use of the fact that polymorphic virtual methods express first-class polymorphism
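A small instance shows the idea behind the translation, without claiming to reproduce the formal encoding. In the sketch below, the System F type ∀X. X → X is represented by an interface with one generic method, and a value of that type is an object implementing it. The names `IPolyId`, `Id` and `SystemFDemo` are hypothetical.

```csharp
using System;

// Represents the quantified type  ∀X. X → X  as a named interface.
interface IPolyId
{
    T Apply<T>(T x);
}

// The polymorphic identity function, packaged as an object.
sealed class Id : IPolyId
{
    public T Apply<T>(T x) { return x; }
}

static class SystemFDemo
{
    // Takes a polymorphic function as an argument and applies it at two
    // different types, which needs first-class polymorphism.
    public static string UseAtTwoTypes(IPolyId f)
    {
        int n = f.Apply<int>(42);
        string s = f.Apply<string>("!");
        return n + s;
    }
}
```

The parameter `f` stays polymorphic inside `UseAtTwoTypes`; a plain generic method parameterized over a single `T` could not apply its argument at both `int` and `string`.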

47 Polymorphic virtual methods
- Define an interface or abstract class:
    interface Sorter { void Sort<T>(T[] a, IComparer<T> c); }
- Implement the interface:
    class QuickSort : Sorter { ... }
    class MergeSort : Sorter { ... }
- Use instances at many type instantiations:
    void TestSorter(Sorter s, int[] ia, string[] sa) {
      s.Sort<int>(ia, IntComparer);
      s.Sort<string>(sa, StringComparer);
    }
    TestSorter(new QuickSort(), ...);
    TestSorter(new MergeSort(), ...);
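The slide's outline can be made runnable along the following lines. This is a sketch under assumptions: the interface is renamed `ISorter`, the two implementations (`InsertionSorter` and `LibrarySorter`, both hypothetical) replace the elided `QuickSort` and `MergeSort` bodies, and the comparers come from the released .NET libraries rather than the slide's `IntComparer`.

```csharp
using System;
using System.Collections.Generic;

interface ISorter
{
    // A polymorphic virtual method: dispatched on the receiver, generic in T.
    void Sort<T>(T[] a, IComparer<T> c);
}

sealed class InsertionSorter : ISorter
{
    public void Sort<T>(T[] a, IComparer<T> c)
    {
        for (int i = 1; i < a.Length; i++)
        {
            T key = a[i];
            int j = i - 1;
            // Shift larger elements right to make room for key.
            while (j >= 0 && c.Compare(a[j], key) > 0) { a[j + 1] = a[j]; j--; }
            a[j + 1] = key;
        }
    }
}

sealed class LibrarySorter : ISorter
{
    // Delegates to the framework's Array.Sort<T>(T[], IComparer<T>).
    public void Sort<T>(T[] a, IComparer<T> c) { Array.Sort(a, c); }
}

static class SorterDemo
{
    // One sorter object, used at two different type instantiations.
    public static void TestSorter(ISorter s, int[] ia, string[] sa)
    {
        s.Sort<int>(ia, Comparer<int>.Default);
        s.Sort<string>(sa, StringComparer.Ordinal);
    }
}
```

Which `Sort<T>` body runs is decided by the run-time class of `s`, while `T` is supplied at each call site, so the callee cannot be specialized ahead of time by simple expansion.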

48 Compare:
- Define an SML signature:
    signature Sorter = sig
      val Sort : 'a array * ('a * 'a -> order) -> unit
    end
- Define structures that match the signature:
    structure QuickSort :> Sorter = ...
    structure MergeSort :> Sorter = ...
- Use structures at many type instantiations:
    functor TestSorter(S : Sorter) = struct
      fun test (ia, sa) = (S.Sort(ia, Int.compare); S.Sort(sa, String.compare))
    end
    structure TestQS = TestSorter(QuickSort); TestQS.test(...);
    structure TestMS = TestSorter(MergeSort); TestMS.test(...);

49 Or (Russo first-class modules):
- Define an SML signature:
    signature Sorter = sig
      val Sort : 'a array * ('a * 'a -> order) -> unit
    end
- Define structures that match the signature:
    structure QuickSort :> Sorter = ...
    structure MergeSort :> Sorter = ...
- Use a function to test the structures:
    fun TestSorter (s, ia, sa) =
      let structure S as Sorter = s
      in (S.Sort(ia, Int.compare); S.Sort(sa, String.compare)) end
    TestSorter ([structure QuickSort as Sorter], ...);
    TestSorter ([structure MergeSort as Sorter], ...);

50 Observations
- Translation from System F to C#
  - is global
  - generates new class names for (families of) polymorphic types
- The generics design for Java (GJ) also supports polymorphic virtual methods
- C++ has “template methods” but not virtual ones, for good reason: it compiles by expansion
- Distinctiveness of polymorphic virtual methods
  - shows up in (type-passing) implementations (e.g. CLR)
  - requires execution-time type application

51 Practice

52 Type inference?
- ML and Haskell have type inference
- C# programs must be explicitly typed
- Is this a problem in practice?
  - not for the most frequent application: collection classes
  - but try parser combinators in C#...

53 Parser combinators (Sestoft)
    class SeqSnd<T,U> : Parser<U> {
      Parser<T> tp;
      Parser<U> up;
      public SeqSnd(Parser<T> tp, Parser<U> up) { this.tp = tp; this.up = up; }
      public Result<U> Parse(ISource src) {
        Result<T> tr = tp.Parse(src);
        if (tr.Success) {
          Result<U> ur = up.Parse(tr.Source);
          if (ur.Success)
            return new Succ<U>(ur.Value, ur.Source);
        }
        return new Fail<U>();
      }
    }

54 On the other hand…
.NET generics are supported by
- debugger
- profiler
- class browser
- GUI development environment

55 Try it!
- Rotor = shared-source release of CLR and C#
    http://msdn.microsoft.com/NET/sscli
- Generics + Rotor = Gyro
- Gyro extends Rotor with generics support in CLR and C#
    http://research.microsoft.com/projects/clrgen

56 Future

57 Extension: Variance
- Should we add variance? e.g. IEnumerator<T> (covariant in T), IComparer<T> (contravariant in T)
- Can even use this to support “broken” Eiffel:
    // invariant in T
    class Cell<T> { T val; void Set(T newval) { val = newval; } T Get() { return val; } }
    // covariant in T: the cast in Set is a run-time check
    class Cell<T> { T val; void Set(object newval) { val = (T) newval; } T Get() { return val; } }
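The same trade of static safety for a run-time check already exists in the CLR for arrays, which are covariant in their element type. The sketch below uses that existing behaviour as an analogy for the second `Cell` class; it is not the proposed generic variance feature itself, and `CovarianceDemo` and `StoreFails` are hypothetical names.

```csharp
using System;

static class CovarianceDemo
{
    // Tries to store value into cells[0]. Returns true when the run-time
    // element-type check rejects the store, false when the store succeeds.
    public static bool StoreFails(object[] cells, object value)
    {
        try
        {
            cells[0] = value;   // checked against the array's actual element type
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }
}
```

A `string[]` may be passed where `object[]` is expected, so storing an `int` through that reference type-checks at compile time and is only caught when the store executes.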

58 Extension: Parameterize by superclass
Can type-check given sufficient constraints:
    class D {
      virtual void m1() { … }
      virtual void m2() { … }
    }
    class C<T> : T where T : D {          // T must extend D
      int f;
      override void m2(T x) { …x.m1()… }  // Override method D.m2; know m1 exists because of constraint on T
      new virtual void m3() { … }         // New method, name can clash with method from T
    }

59 Extension: Parameterized by superclass (2)
- Provides a kind of “mixin” facility
- Unfortunately, implementation isn’t easy
- We’d like to share rep & code for C<P> and C<Q> for reference types P and Q, but it may be the case that
  - object size of C<P> ≠ size of C<Q>
  - field offset of C<P>.f ≠ offset of C<Q>.f
  - vtable slot of C<P>.m3 ≠ slot of C<Q>.m3
  => abandon sharing, or do more run-time lookup

60 Open problem
- Most widely used polymorphic library is probably C++ STL (Standard Template Library)
- STL gets expressivity and efficiency from checking and compiling instantiations separately
  - Really: ML functors can’t match it
- How can we achieve the same expressivity and efficiency with compile-time-checked parametric polymorphism?

61 Questions?
