
1 Telescoping Languages: A Framework for Generating High-Performance Problem-Solving Systems. Ken Kennedy, Center for High Performance Software, Rice University. http://www.cs.rice.edu/~ken/Presentations/Telescope.pdf Center for High Performance Software Research

2 Collaborators Bradley Broom Arun Chauhan Keith Cooper Jack Dongarra Rob Fowler Lennart Johnsson Chuck Koelbel Cheryl McCosh John Mellor-Crummey Linda Torczon

3 Philosophy Compiler Technology = Off-Line Processing —Goals: improved performance and language usability –Making it practical to use the full power of the language —Trade-off: preprocessing time versus execution time —Rule: performance of both compiler and application must be acceptable to the end user Examples —Macro expansion –PL/I interpretive macro facility –Fixed macros can be compiled: 10x improvement with compilation —Transmeta “Code Morphing” –Dynamic compilation of machine code

4 Center for High Performance Software Research Making Languages Usable It was our belief that if FORTRAN, during its first months, were to translate any reasonable “scientific” source program into an object program only half as fast as its hand-coded counterpart, then acceptance of our system would be in serious danger... I believe that had we failed to produce efficient programs, the widespread use of languages like FORTRAN would have been seriously delayed. — John Backus

5 A Java Experiment Scientific Programming In Java —Goal: make it possible to use the full object-oriented power for scientific applications –Many scientific implementations mimic Fortran style OwlPack Benchmark Suite —Three versions of LINPACK in Java –Fortran style –Lite object-oriented style –Full polymorphism: no differences for type Experiment —Compare running times for different styles on same Java VM —Evaluate potential for compiler optimization

6 Performance Results: results using JDK 1.2 JIT on Sun Ultra 5 (chart not preserved in this transcript)

7 Programming Productivity Challenges —programming is hard —professional programmers are in short supply —high performance will continue to be important One Strategy: Make the End User a Programmer —professional programmers develop components —users integrate components using: –problem-solving environments (PSEs) based on scripting languages (possibly graphical). Examples: Visual Basic, Tcl/Tk, AVS, Khoros Compilation for High Performance —translate scripts and components to common intermediate language —optimize the resulting program using interprocedural methods

8-13 Script-Based Programming (one diagram, built up over six slides). Labels, in order of appearance: Script, Component Library, User Library, Translator, Intermediate Code, Global Optimizer, Code Generator. Problem: long compilation times, even for short scripts! Problem: expert knowledge on specialization lost

14-16 Telescoping Languages (one diagram, built up over three slides). Labels, in order of appearance: L1 Class Library, Compiler Generator, L1 Compiler, Script, Script Translator, Vendor Compiler, Optimized Application. Annotations: could run for hours; understands library calls as primitives

17 Center for High Performance Software Research Telescoping Languages: Advantages Compile times can be reasonable —More compilation time can be spent on libraries —Script compilations can be fast –Components reused from scripts may be included in libraries High-level optimizations can be included —Based on specifications of the library designer –Properties often cannot be determined by compilers –Properties may be hidden after low-level code generation User retains substantive control over language performance —Mature code can be built into a library and incorporated into language Reliability can be improved —Specialization by compilation framework, not user

18 Center for High Performance Software Research Applications Matlab Compiler —Automatically generated from LAPACK or ScaLAPACK –With help via annotations from the designer Generator for ARPACK —Library developer maintains code in Matlab —Currently recodes in Fortran by hand — could be automated Flexible Data Distributions —Failing of HPF: inflexible distributions —Data distribution == collection of interfaces that meet specs —Compiler applies standard transformations Generator for Grid Computations —GrADS: automatic generation of NetSolve

19 Center for High Performance Software Research Application: Matlab for Signal Processing Automatically generated from LAPACK or ScaLAPACK —With help via annotations from the designer Special project: Signal Processing Applications written in Matlab —Users want simplicity and performance —Matlab currently gives them the first but not the second –Codes rewritten in C for communications devices —Run signal processing procedures through the generator –Many code modules reused

20 Center for High Performance Software Research Application: POOMA Procedure library for computational hydrodynamics —Distributed data structures –vectors, arrays, tensors —Coded in C++ —Context optimizations coded into template expansion mechanism –20-line program compiles for over an hour on 32 processors —Enhanced reliability Telescoping languages —Generate POOMA from simpler libraries for Fortran and Java

21 Center for High Performance Software Research Requirements of Script Compilation Scripts must generate efficient programs —Comparable to those generated from standard interprocedural methods —Avoid need to recode in standard language Script compile times should be proportional to length of script —Not a function of the complexity of the library —Principle of “least astonishment”

22 Telescoping Languages (diagram repeated; labels: Script, L1 Compiler, Script Translator, Vendor Compiler, Optimized Application; annotation: understands library calls as primitives)

23 Center for High Performance Software Research Script Compilation Algorithm Propagate variable property information throughout the program —Use jump functions to propagate through calls to library Apply high-level transformations —Driven by information about properties —Ensure that process applies to expanded code Select and substitute specialized variants for library calls —At each call site, determine the best approximation to parameter properties that is reflected by a specialized fragment in the code database –Use a method similar to “unification” —Substitute fragment from database for call –This could contain a call to a lower-level library routine.
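The three steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the system described in the talk: the `Props` record, the `JUMP` table, the `VARIANTS` database, and the routine names (`transpose`, `add`, `solve`, `solve_symmetric`, `solve_general`) are all hypothetical, and simple first-match lookup stands in for the unification-like matching the slide mentions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Props:
    """Properties tracked per variable; None means 'unknown'."""
    rank: Optional[int] = None
    symmetric: Optional[bool] = None

# Hypothetical jump functions: map argument properties to result properties,
# so a script can be analyzed without re-analyzing the library source.
JUMP = {
    "transpose": lambda a: Props(rank=a.rank, symmetric=a.symmetric),
    "add": lambda a, b: Props(
        rank=a.rank if a.rank == b.rank else None,
        symmetric=True if (a.symmetric and b.symmetric) else None,
    ),
}

# Hypothetical code database: specialized variants, most specific first.
VARIANTS = {
    "solve": [
        ({"symmetric": True}, "solve_symmetric"),
        ({}, "solve_general"),  # fallback that assumes nothing
    ],
}

def select_variant(routine, props):
    """Return the first variant whose required properties are known to hold."""
    for required, name in VARIANTS[routine]:
        if all(getattr(props, key) == value for key, value in required.items()):
            return name
    raise LookupError(routine)

def compile_script(script, env):
    """script: list of (target, routine, arg_names); env: name -> Props.

    One pass over the script, so the cost grows with script length only.
    """
    plan = []
    for target, routine, args in script:
        arg_props = [env[a] for a in args]
        if routine in JUMP:
            env[target] = JUMP[routine](*arg_props)
            plan.append((target, routine))
        else:
            plan.append((target, select_variant(routine, arg_props[0])))
            env[target] = Props()  # nothing known about the result
    return plan

SCRIPT = [
    ("T", "transpose", ["A"]),
    ("S", "add", ["A", "T"]),
    ("x", "solve", ["S", "b"]),
]
```

When `A` is known to be symmetric, that fact flows through `transpose` and `add` and the symmetric solver is chosen; when nothing is known about `A`, the general fallback is chosen instead.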

24 Telescoping Languages (diagram repeated; labels: L1 Class Library, Compiler Generator, L1 Compiler; annotation: could run for hours)

25 Center for High Performance Software Research Library Analysis and Preparation Discovery of Critical Properties and Propagator Construction Analysis of Transformation Specifications —Construction of a specification-driven translator for use in compiling scripts Code Specialization for Different Sets of Parameter Properties

26 Center for High Performance Software Research Library Analysis and Preparation Discovery of Critical Properties and Propagator Construction —Which properties of parameters affect optimization –Examples: value, type, rank and size of matrix

27 Discovery of Critical Properties From specifications by the library designer —If the matrix is triangular, then… From examining the code itself —Look at a promising optimization point —Determine conditions under which we can make significant optimizations —See if any of these conditions can be mapped back to parameter properties From sample calling programs provided by the designer, e.g. call average(shift(A,-1), shift(A,+1)) –Can save on memory accesses

28 Examining the Code
Example from LAPACK
    subroutine VMP(C, A, B, m, n, s)
    integer m,n,s; real A(n), B(n), C(m)
    i = 1
    do j = 1, n
      C(i) = C(i) + A(j)*B(j)
      i = i + s
    enddo
    end VMP
Vectorizable if s != 0

29 Center for High Performance Software Research Library Analysis and Preparation Discovery of Critical Properties and Propagator Construction —Which properties of parameters affect optimization –Examples: value, type, rank and size of matrix —Construction of jump functions for the library calls –With respect to critical properties

30 Center for High Performance Software Research Library Analysis and Preparation Discovery of Critical Properties and Propagator Construction —Which properties of parameters affect optimization –Examples: value, type, rank and size of matrix —Construction of jump functions for the library calls –With respect to critical properties Analysis of Transformation Specifications —Construction of a specification-driven translator for use in compiling scripts

31 High-level Identities Often library developer knows high-level identities —Difficult for the compiler to discern —Optimization should be performed on sequences of calls rather than code remaining after expansion Example: Push and Pop —Designer: Push(x) followed by y = Pop() becomes y = x –Ignore possibility of overflow in Push Example: Trigonometric Functions —Sin and Cos used in same loop, both computed using expensive calls to the trig library —Recognize that cos(x) and sin(x) can be computed by a single call to sincos(x,s,c) in a little more than the time required for sin(x).
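A minimal Python sketch of rewriting at the level of call sequences, using the two identities on this slide. The tuple encoding of calls and the `rewrite` function are assumptions made for illustration; only adjacent calls are matched, and, as on the slide, overflow in Push is ignored.

```python
def rewrite(calls):
    """Apply designer-supplied identities to a straight-line call sequence.

    Hypothetical encoding:
      ("push", x)      Push(x)
      ("pop", y)       y = Pop()
      ("sin", s, x)    s = sin(x)
      ("cos", c, x)    c = cos(x)
    """
    out = []
    for call in calls:
        prev = out[-1] if out else None
        if call[0] == "pop" and prev is not None and prev[0] == "push":
            # Push(x) followed by y = Pop()  becomes  y = x
            out.pop()
            out.append(("assign", call[1], prev[1]))
        elif (
            prev is not None
            and {call[0], prev[0]} == {"sin", "cos"}
            and call[2] == prev[2]      # same argument
            and prev[1] != call[2]      # first result did not overwrite it
        ):
            # sin(x) and cos(x) on the same x  become  sincos(x, s, c)
            out.pop()
            s_call, c_call = (call, prev) if call[0] == "sin" else (prev, call)
            out.append(("sincos", call[2], s_call[1], c_call[1]))
        else:
            out.append(call)
    return out
```

Working on calls rather than on expanded code is what makes these patterns visible: after inlining, the stack manipulation or the trig kernels would no longer be recognizable as Push/Pop or sin/cos.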

32-33 Contextual Expansions
Out of Core Arrays
—Operations Get(I,J) and GetRow(I,Lo,N)
Get in a loop
    Do I
      Do J
        … Get(I,J)
      Enddo
When can we vectorize?
—Turn into GetRow
—Answer: if Get is not involved in a recurrence.
–How can we know?
Vector versions of library routines can often be constructed
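The effect of turning Get in a loop into GetRow can be sketched as follows. The `OutOfCoreArray` class, its fetch counter, and the reading of GetRow(I,Lo,N) as "N elements of row I starting at Lo" are assumptions made for illustration, and indices are 0-based here. The inner loop only reads elements and never feeds a result back into a later Get, so there is no recurrence and the rewrite is safe.

```python
class OutOfCoreArray:
    """Toy stand-in for an array kept in slow backing storage."""

    def __init__(self, rows):
        self._rows = rows
        self.fetches = 0  # simulated trips to backing storage

    def get(self, i, j):
        self.fetches += 1
        return self._rows[i][j]

    def get_row(self, i, lo, n):
        # Assumed meaning: n elements of row i, starting at column lo.
        self.fetches += 1
        return self._rows[i][lo:lo + n]

def row_sums_scalar(a, nrows, ncols):
    """Original form: one Get per element inside the J loop."""
    return [sum(a.get(i, j) for j in range(ncols)) for i in range(nrows)]

def row_sums_vector(a, nrows, ncols):
    """Vectorized form: the J loop is replaced by a single GetRow."""
    return [sum(a.get_row(i, 0, ncols)) for i in range(nrows)]
```

For a 2-by-3 array the scalar form makes six fetches and the vector form makes two, while both return the same sums.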

34 Center for High Performance Software Research Library Analysis and Preparation Discovery of Critical Properties and Propagator Construction —Which properties of parameters affect optimization –Examples: value, type, rank and size of matrix —Construction of jump functions for the library calls –With respect to critical properties Analysis of Transformation Specifications —Construction of a specification-driven translator for use in compiling scripts Code Specialization for Different Sets of Parameter Properties —For each set, assume and optimize to produce specialized code

35 Code Selection Example
Library compiler develops inlining tables
    subroutine VMP(C, A, B, m, n, s)
    integer m,n,s; real A(n), B(n), C(m)
    i = 1
    do j = 1, n
      C(i) = C(i) + A(j)*B(j)
      i = i + s
    enddo
    end VMP
Inlining Table: vector
    case on s:
      ==0:     C(1) = C(1) + sum(A(1:n)*B(1:n))
      !=0:     C(1:1+(n-1)*s:s) = C(1:1+(n-1)*s:s) + A(1:n)*B(1:n)
      default: call VMP(C,A,B,m,n,s)
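The inlining table can be checked against the original loop with a small Python sketch. All function names here are hypothetical, indices are 0-based, and the non-zero case is restricted to positive strides; anything else falls back to the general routine, mirroring the table's default entry. With floating-point data the `s == 0` variant may round differently from the loop because it sums the products first, so the check below uses integers.

```python
def vmp_reference(C, A, B, n, s):
    """Direct transcription of the VMP loop (0-based)."""
    i = 0
    for j in range(n):
        C[i] += A[j] * B[j]
        i += s

def vmp_s_zero(C, A, B, n):
    """s == 0: every iteration updates C[0], so the loop is a reduction."""
    C[0] += sum(A[j] * B[j] for j in range(n))

def vmp_s_positive(C, A, B, n, s):
    """s > 0: iterations touch distinct elements, so all updates are
    independent and can be computed before any is stored."""
    idx = range(0, n * s, s)  # 0, s, ..., (n-1)*s
    updates = [C[i] + A[j] * B[j] for j, i in enumerate(idx)]
    for i, value in zip(idx, updates):
        C[i] = value

def vmp(C, A, B, n, s):
    """Select a specialized variant from what is known about s."""
    if s == 0:
        vmp_s_zero(C, A, B, n)
    elif s > 0:
        vmp_s_positive(C, A, B, n, s)
    else:
        vmp_reference(C, A, B, n, s)  # default: call the general routine
```

In a telescoping-language setting the choice would be made at script compile time from propagated properties of `s` rather than by a runtime test; the runtime `if` here only keeps the sketch self-contained.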

36 Center for High Performance Software Research Application: Matlab for Signal Processing Signal processing users want simplicity, programming power, and performance —Currently over 500,000 Matlab licenses Matlab gives them simplicity and power but not performance —Codes prototyped in Matlab —Codes rewritten in C for communications devices –Users would rather not do this Telescoping Languages: —Many signal processing code modules reused over and over —Run these procedures through the language generator –Produce Matlab SP, a high-level domain-specific environment

37 Center for High Performance Software Research Matlab SP: Preliminary Findings Optimizations That Pay Off —Vectorization –Wins because of hand coded vector/matrix primitives —Elimination of common array subexpressions —Optimization of array allocation and reshape operations New Optimizations —Procedure vectorization –Interchange call and loop after distribution —Procedure strength reduction –Subdivide procedure in to variant and invariant components –Use invariant component only once

38 Procedure Strength Reduction
Procedure called in loop
    for i = 1:N
      x = f(c1, c2, i, c3)
    end
Becomes
    f_μ(c1, c2, c3)
    for i = 1:N
      x = f_Δ(i)
    end
Further improvements possible —Use code differentiation to compute differences –ADIFOR
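A minimal Python sketch of the transformation on this slide. The body of `f` is invented for illustration; the names `f_invariant` and `f_variant` are hypothetical stand-ins for the two pieces the procedure is split into. The invariant piece runs once before the loop and the variant piece runs on every iteration.

```python
import math

def f(c1, c2, i, c3):
    """Original procedure: recomputes loop-invariant work on every call."""
    scale = math.sqrt(c1 * c1 + c2 * c2) + c3  # depends only on c1, c2, c3
    return scale * i                           # depends on i

def f_invariant(c1, c2, c3):
    """Invariant component: run once, returns the variant component."""
    scale = math.sqrt(c1 * c1 + c2 * c2) + c3

    def f_variant(i):
        # Variant component: the only work left inside the loop.
        return scale * i

    return f_variant

def loop_original(n, c1, c2, c3):
    return [f(c1, c2, i, c3) for i in range(1, n + 1)]

def loop_reduced(n, c1, c2, c3):
    f_variant = f_invariant(c1, c2, c3)  # hoisted out of the loop
    return [f_variant(i) for i in range(1, n + 1)]
```

Both loops produce the same values, but the reduced form evaluates the square root once instead of `n` times.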

39 Center for High Performance Software Research Procedure Strength Reduction Performance

40 Center for High Performance Software Research http://www.cs.rice.edu/~ken/Presentations/Telescope.pdf Summary Optimization enables language power —Principle: encourage rather than discourage use of powerful features –Good programming practice should be rewarded Programming support is challenging —Particularly with application and platform complexity on the rise –Compounded by the shortage of IT professionals Strategy: make end users into application developers —Telescoping languages: Framework for generating high-level problem-solving systems —Must produce high-quality code –Avoid the need to recode by hand

41 Center for High Performance Software Research Summary PITAC: Focus on long-term, high-risk research The scalable infrastructure should be a scalable problem-solver —Access to information is not enough —Linked computation is not enough Programming support is still relatively primitive —Application and platform complexity increasing —Compounded by the shortage of IT professionals Strategy: make end users into application developers —Professional programmers focus on components —End users build applications in scripting systems Telescoping languages: —Framework for generation of high-level problem-solving systems

42 Software Support for High-Performance Problem Solving (With Application to Grid Programming). Ken Kennedy, Center for High Performance Software, Rice University. http://www.cs.rice.edu/~ken/Presentations/GridTelescope.pdf

43 Collaborators Bradley Broom Arun Chauhan Keith Cooper Jack Dongarra Rob Fowler Dennis Gannon Lennart Johnsson John Mellor-Crummey John Reynders Linda Torczon

44 Center for High Performance Software Research Lessons from PITAC Findings —Research funding increasingly focused on short term —Universities weakened –Impact on workforce —Industry cannot fill the gap –Return on investment: 24 percent versus 66 percent Refocus Research on Long-Term, High-Risk Problems —Requires an expansion of the base Invest in Key Areas —Software —Scalable Information Infrastructure —High Performance Computing —Social, Economic, and Workforce Issues (Education)

45 Center for High Performance Software Research Two IT Grand Challenges The Internet as Problem-Solving Engine —Challenge: How do we develop applications and manage their execution? –Reliable performance under varying load –Accessibility to ordinary scientists and engineers —GrADS Project Software Productivity —Challenge: How do we increase the nation’s productivity in software development –Too much software to be written, too few developers –Application and platform complexity increasing —Idea: make it possible for end users to be application developers

46 Center for High Performance Software Research Grids are “Hot” Computational Data Information Access Knowledge DISCOM SinRG APGrid TeraGrid

47-52 National Distributed Problem Solving (one diagram, built up over six slides; surviving labels: Supercomputer, Database)

53 Center for High Performance Software Research Today: Globus Developed by Ian Foster and Carl Kesselman —Grew from the I-Way (SC-95) Basic Services for distributed computing —Resource discovery and information services —User authentication and access control —Job initiation —Communication services (Nexus and MPI) Applications are programmed by hand —Many applications —User responsible for resource mapping and all communication –Existing users acknowledge how hard this is

54-58 GrADSoft Architecture (one diagram, shown five times with different regions highlighted). Goal: reliable performance on dynamically changing resources. Component labels: Source Application, Whole-Program Compiler, Libraries, Software Components, Configurable Object Program, Binder, Resource Negotiator, Scheduler, Grid Runtime System, Real-time Performance Monitor, Performance Problem, Performance Feedback, Negotiation. Highlighted regions: Execution Environment (slides 55-56), Program Preparation System (slide 57), Problem-Solving Environments (slide 58)


