Portability by Automatic Translation: Two Case Studies Yishai A. Feldman The Interdisciplinary Center Herzliya, Israel.

First Case: Bogart
Large-Scale Translation from Assembly Language to C
Joint work with Doron A. Friedman

The Problem
- 400,000 lines of IBM 370 assembly code
- Customers downsizing mainframes
- Hand-optimized code over 15 calendar years
- Live system

Success Criteria
- Portability
- Efficiency
- Minimum manual work
But:
- Readability is not important

Difficult Assembly Features
- Registers
- Condition code
- Untyped language
- Unstructured code
- Large unstructured memory areas
- Portability:
  - Different byte order
  - Different word size
  - Different pointer size
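
The byte-order item can be made concrete with a small sketch. IBM 370 stores words big-endian, so translated C code that reads a word through a pointer cast sees a different value on a little-endian host. The helper names `load_be32` and `store_be32` below are hypothetical and do not come from the slides; they only illustrate one way to access memory independently of the host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the slides): read a 32-bit word stored
   in big-endian order, whatever the byte order of the host machine. */
uint32_t load_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: write a 32-bit word in big-endian order. */
void store_be32(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}
```

Because the helpers work byte by byte, the same source gives the same result on big-endian and little-endian hosts, at the cost of a few shifts per access.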

The Simulating Translator

Assembly:

             STM   R14,R12,12(R13)
             LR    R12,R15
             USING HORNER,R12
             ST    R13,SAV+4
             LA    R13,SAV
             LA    R7,COEF
             L     R5,0(R7)
             LA    R9,0
    LOOP     CR    R9,R2
             BNL   OUT
             LA    R9,1(R9)
             LA    R7,4(R7)
             MR    R4,R3
             A     R5,0(R7)
             B     LOOP
    OUT      LR    R0,R5
             LM    R1,R12,24(R13)
             BR    R14

Simulating translator:

    void HORNER(tagSAPReg *Reg)
    {
        T_stm(14,12,((Reg[13].ucp+12)),Reg);
        Reg[12].sw = Reg[15].sw;
        SAV[1] = Reg[13].sw;
        Reg[13].pv = &(SAV[0]);
        Reg[7].pv = &(COEF[0]);
        Reg[5].sw = *(sWord *)Reg[7].ucp;
        Reg[9].sw = 0;
    LOOP:
        if ((Reg[9].sw) >= Reg[2].sw) goto OUT;
        Reg[9].sw += 1;
        Reg[7].sw += 4;
        T_mult(&Reg[4],Reg[3].sw);
        Reg[5].sw += *((sWord *)(Reg[7].ucp));
        goto LOOP;
    OUT:
        Reg[0].sw = Reg[5].sw;
        T_lm(1,12,((Reg[13].ucp+24)),Reg);
        return;
    }
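
The simulated code relies on a register type and helper routines that the slides do not show. The following is a minimal sketch of what they might look like. The field names `sw`, `uw`, `ucp` and `pv` and the name `T_mult` appear in the slides, but the definitions here are assumptions. The body of `T_mult` follows the IBM 370 `MR` instruction: the multiplicand comes from the odd register of an even/odd pair, and the 64-bit product is left in the pair with the high half in the even register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t  sWord;
typedef uint32_t uWord;

/* Assumed layout of one simulated register: the same storage viewed as a
   signed word, an unsigned word, a byte pointer, or an untyped pointer. */
typedef union {
    sWord          sw;
    uWord          uw;
    unsigned char *ucp;
    void          *pv;
} tagSAPReg;

/* Sketch of MR: 'pair' points at the even register of an even/odd pair. */
void T_mult(tagSAPReg *pair, sWord multiplier)
{
    assert(pair != NULL);
    int64_t  product = (int64_t)pair[1].sw * (int64_t)multiplier;
    uint64_t bits    = (uint64_t)product;      /* two's-complement bit pattern */
    pair[0].uw = (uWord)(bits >> 32);          /* high 32 bits, even register */
    pair[1].uw = (uWord)(bits & 0xFFFFFFFFu);  /* low 32 bits, odd register   */
}
```

On a host with 64-bit pointers the union is wider than a 370 register, which is one instance of the pointer-size problem: arithmetic on `sw` no longer covers the whole pointer held in `ucp`.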

Bogart: Better Optimizing General-purpose Abstract Representation Translator

Translation by Abstraction, Transformation, and Reimplementation
- Control-flow and data-flow analysis
- Typing by constraint propagation
- Automatic cliche recognition
[Diagram: Abstraction → Transformation → Re-implementation]

The Code Produced by Bogart

Simulating translator:

    void HORNER(tagSAPReg *Reg)
    {
        T_stm(14,12,((Reg[13].ucp+12)),Reg);
        Reg[12].sw = Reg[15].sw;
        SAV[1] = Reg[13].sw;
        Reg[13].pv = &(SAV[0]);
        Reg[7].pv = &(COEF[0]);
        Reg[5].sw = *(sWord *)Reg[7].ucp;
        Reg[9].sw = 0;
    LOOP:
        if ((Reg[9].sw) >= Reg[2].sw) goto OUT;
        Reg[9].sw += 1;
        Reg[7].sw += 4;
        T_mult(&Reg[4],Reg[3].sw);
        Reg[5].sw += *((sWord *)(Reg[7].ucp));
        goto LOOP;
    OUT:
        Reg[0].sw = Reg[5].sw;
        T_lm(1,12,((Reg[13].ucp+24)),Reg);
        return;
    }

Bogart:

    sWord HORNER(sWord r2sw, sWord r3sw)
    {
        sWord r5sw;
        sWordPtr r7swp;
        sWord r9sw;

        r7swp = (sWord *)(&COEF[0]);
        r5sw = *r7swp;
        r9sw = 0;
        while (r9sw < r2sw) {
            r9sw++;
            r7swp++;
            r5sw = r5sw*r3sw + *r7swp;
        }
        return r5sw;
    }
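
To try the Bogart version of `HORNER` on its own, it needs definitions for `sWord`, `sWordPtr` and `COEF`, none of which the slides show. The sketch below supplies assumed ones: 32-bit words and a three-entry coefficient table for the polynomial x^2 + 2x + 3, highest degree first. Reading `r2sw` as the polynomial degree and `r3sw` as the evaluation point is an interpretation of the code, and the bounds assertion is an addition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t sWord;      /* assumed: 32-bit signed word */
typedef sWord  *sWordPtr;

/* Assumed data: coefficients of x^2 + 2x + 3, highest degree first. */
sWord COEF[] = { 1, 2, 3 };

/* Bogart-style routine from the slide, plus a bounds check. */
sWord HORNER(sWord r2sw, sWord r3sw)
{
    sWord r5sw;
    sWordPtr r7swp;
    sWord r9sw;

    /* The loop reads COEF[1..r2sw], so COEF must hold r2sw + 1 entries. */
    assert(r2sw >= 0 && (size_t)r2sw < sizeof COEF / sizeof COEF[0]);

    r7swp = (sWord *)(&COEF[0]);
    r5sw = *r7swp;
    r9sw = 0;
    while (r9sw < r2sw) {
        r9sw++;
        r7swp++;
        r5sw = r5sw * r3sw + *r7swp;
    }
    return r5sw;
}
```

With this table, `HORNER(2, 2)` evaluates 1*4 + 2*2 + 3 and returns 11.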

Accumulating Compound Expressions

Assembly:

    L     R11,NIG
    L     R5,0(R4)
    BAL   R14,INCTAB
    LPR   R7,R0
    MR    R2,R7
    LR    R0,R3
    BR    R14

Simulating translator:

    Reg[11].sw = NIG;
    Reg[5].sw = *(sWord *)Reg[4].ucp;
    INCTAB(Reg);
    Reg[7].sw = labs(Reg[0].sw);
    T_mult(&Reg[2], Reg[7].sw);
    Reg[0].sw = Reg[3].sw;
    return;

Bogart:

    return r3sw * labs(INCTAB(NIG,(*r4swp)));

Example: Condition Code Support

Assembly:

    CGLOOP   CH    R7,0(R5,R4)
             SRL   R2,1
             BH    CGADD
             BNH   CGSUB

Simulating translator:

    if (Reg[7].sw == *((sHalf *)((Reg[4].ucp+Reg[5].sw))))
        __CC = _CZero;
    else if (Reg[7].sw < *((sHalf *)((Reg[4].ucp+Reg[5].sw))))
        __CC = _COne;
    else
        __CC = _CTwo;
    Reg[2].uw >>= 1;
    if (__CC & 0x4) goto CGADD;
    if (__CC & 0x3) goto CGSUB;

Bogart:

    r2sh >>= 1;
    temp = r4ucp + r5sh;
    if (r7sh > temp) goto CGADD;
    if (r7sh <= temp) goto CGSUB;
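
The masks `0x4` and `0x3` in the simulated code suggest that the condition code is stored one-hot, with one bit per outcome of the compare. The sketch below spells that reading out. The enumerator names and values are assumptions standing in for `_CZero`, `_COne` and `_CTwo`, whose real definitions are not shown, and the function names are hypothetical. It contrasts the simulation style, which stores a code and tests it later, with the abstraction style, which fuses compare and branch into one C relation.

```c
#include <stdint.h>

/* Assumed one-hot encoding, inferred from the masks 0x4 and 0x3. */
enum cond_code { CC_EQUAL = 0x1, CC_LOW = 0x2, CC_HIGH = 0x4 };

/* Simulated compare: derive the condition code from two signed operands. */
enum cond_code compare(int32_t a, int32_t b)
{
    if (a == b) return CC_EQUAL;
    if (a < b)  return CC_LOW;
    return CC_HIGH;
}

/* Simulation style for BH: test the stored code against a mask. */
int branch_high_simulated(int32_t a, int32_t b)
{
    enum cond_code cc = compare(a, b);
    return (cc & CC_HIGH) != 0;
}

/* Simulation style for BNH: "not high" means equal or low. */
int branch_not_high_simulated(int32_t a, int32_t b)
{
    enum cond_code cc = compare(a, b);
    return (cc & (CC_EQUAL | CC_LOW)) != 0;
}

/* Abstraction style: the compare and the branch become one relation. */
int branch_high_direct(int32_t a, int32_t b)
{
    return a > b;
}
```

Both styles agree on every input; the direct form simply avoids materializing the condition code when data-flow analysis shows that only one branch consumes it.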

The Plan Representation

    CGR      CH    R2,GMF
             BL    CONGR
             SRL   R2,1
             B     CGR
    CONGR    ....

Global Type Analysis by Constraint Propagation

Time and Space Comparison: Bogart and Simulating Translator
[Table: time (sec.) and space (bytes) for the simulator and for Bogart on the routines BIN, HORNER, RANDOM, and SAPDBMS; the values did not survive extraction.]
(SAPDBMS is a central Sapiens module)

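Time Performance on Several Platforms (for example routine BIN)
[Table: columns IBM 370, RS/6000, AS/400, PC (DOS); rows Original Assembly, Simulator, Bogart, Hand Crafted. Surviving entries, in extraction order: "0.32" followed by three dashes; "failed" and "1.26"; "1, 1, 1, 1".]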

Results
- Bogart produces more portable code
- Bogart supports a larger portion of the source language
- Bogart requires less manual work in code preparation
- Bogart produces more efficient code in terms of time and space performance

Conclusions
- Translation by abstraction produces better results than simulation on all criteria
- Simulation is simpler and faster to implement
- Simulated code is easier for the programmers to debug
- The advantages of the abstraction approach grow in the long term
- “Research-then-transfer” versus “Industry-as-laboratory” (Colin Potts, 1993)

New Developments: Bogart → Falcon 2000

Second Case: MIDAS
Automatic High-Quality Reengineering of Database Programs by Temporal Abstraction
Joint work with Yossi Cohen

The Problem: Legacy Database Software
- Much legacy software is data processing (DP), with many database-related programs
- Conversion from older models (indexed-sequential, hierarchical, network) to relational/object-oriented databases
- Need to convert:
  - Schema
  - Data
  - Software

Network vs. Relational Databases

Network Database Program (1)

    MOVE 0 TO STATUS1.
    PERFORM UNTIL STATUS1 IS NOT EQUAL TO ZERO
        FETCH NEXT STUDENT WITHIN DEPT-OF-STUDENT
            AT END MOVE 1 TO STATUS1
        IF STATUS1 IS EQUAL TO 0 THEN
            IF STUDENT-DEGREE IS EQUAL TO 2 THEN
                MOVE 0 TO GRADES-SUM
                MOVE 0 TO GRADES-COUNT
                PERFORM SUM-STUDENT-GRADES
                DIVIDE GRADES-SUM BY GRADES-COUNT
                    GIVING GRADES-AVG
                IF GRADES-AVG > 95 THEN
                    DISPLAY..., GRADES-AVG
                END-IF
            END-IF
        END-IF
    END-PERFORM.

Network Database Program (2)

    SUM-STUDENT-GRADES.
        MOVE 0 TO STATUS2
        PERFORM UNTIL STATUS2 IS NOT EQUAL TO ZERO
            FETCH NEXT GRADES WITHIN STUDENT-OF-GRADES
                AT END MOVE 1 TO STATUS2
            IF STATUS2 IS EQUAL TO 0 THEN
                ADD GRD-GRADE TO GRADES-SUM
                ADD 1 TO GRADES-COUNT
            END-IF
        END-PERFORM.

Naive Translation (1)

    EXEC SQL DECLARE CRS1 CURSOR FOR
        SELECT... FROM STUDENT
        WHERE DEPT-NAME = :DEPT-NAME
    END-EXEC.
    EXEC SQL DECLARE CRS2 CURSOR FOR
        SELECT... FROM GRADES
        WHERE STUDENT-ID = :STUDENT-ID
    END-EXEC.

Naive Translation (2)

    MOVE 0 TO STATUS1
    EXEC SQL OPEN CRS1 END-EXEC
    PERFORM UNTIL STATUS1 IS NOT EQUAL TO 0
        EXEC SQL FETCH CRS1 INTO... END-EXEC.
        IF SQL-STATUS = SQL-NOT-FOUND THEN MOVE 1 TO STATUS1.
        IF STATUS1 IS EQUAL TO 0 THEN
            IF STUDENT-DEGREE IS EQUAL TO 2 THEN
                MOVE 0 TO GRADES-SUM
                MOVE 0 TO GRADES-COUNT
                PERFORM SUM-STUDENT-GRADES
                DIVIDE GRADES-SUM BY GRADES-COUNT GIVING GRADES-AVG
                IF GRADES-AVG > 95 THEN
                    DISPLAY..., GRADES-AVG
                END-IF
            END-IF
        END-IF
    END-PERFORM.
    EXEC SQL CLOSE CRS1 END-EXEC.

Naive Translation (3)

    SUM-STUDENT-GRADES.
        MOVE 0 TO STATUS2
        EXEC SQL OPEN CRS2 END-EXEC.
        PERFORM UNTIL STATUS2 IS NOT EQUAL TO 0
            EXEC SQL FETCH CRS2 INTO... END-EXEC.
            IF SQL-STATUS = SQL-NOT-FOUND
                THEN MOVE 1 TO STATUS2.
            IF STATUS2 IS EQUAL TO 0 THEN
                ADD GRD-GRADE TO GRADES-SUM
                ADD 1 TO GRADES-COUNT
            END-IF
        END-PERFORM.
        EXEC SQL CLOSE CRS2 END-EXEC.

MIDAS: Translation by Abstraction, Transformation, and Re-implementation
[Diagram: Network DB code → (Abstraction) → Plan → (Temporal Abstraction) → Temporal Plan → (Re-implementation) → Relational DB code]

MIDAS Translation

    EXEC SQL DECLARE CRS1 CURSOR FOR
        SELECT STUDENT.STUDENT-ID, FIRST-NAME, LAST-NAME, AVG(GRADE)
        FROM STUDENT, GRADES
        WHERE DEGREE = 2
          AND DEPT-NAME = :DEPT-NAME
          AND GRADES.STUDENT-ID = STUDENT.STUDENT-ID
        GROUP BY STUDENT.STUDENT-ID, FIRST-NAME, LAST-NAME
        HAVING AVG(GRADE) > 95
    END-EXEC.
    PERFORM UNTIL SQL-STATUS = SQL-NOT-FOUND
        EXEC SQL FETCH CRS1 INTO... END-EXEC.
        DISPLAY..., GRADES-AVG
    END-PERFORM
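
The difference between the naive and the MIDAS translation can be mimicked in C over in-memory arrays. This is only an analogy, not how MIDAS or a database engine works. The struct and function names are hypothetical, the department filter is omitted for brevity, student ids are assumed to be unique and to lie in `[0, MAX_ID)`, and students without grades are skipped in both versions. The first function rescans the grades once per qualifying student, as the nested cursors do; the second makes one pass over each table, in the spirit of a join with `GROUP BY` and `HAVING`.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ID 64  /* assumed upper bound on student ids in this sketch */

/* Hypothetical in-memory stand-ins for the STUDENT and GRADES tables. */
struct student { int id; int degree; };
struct grade   { int student_id; int grade; };

/* Navigational style: one inner scan of the grades per qualifying student. */
int count_high_avg_nested(const struct student *s, size_t ns,
                          const struct grade *g, size_t ng, int threshold)
{
    int hits = 0;
    for (size_t i = 0; i < ns; i++) {
        if (s[i].degree != 2)
            continue;
        long sum = 0, count = 0;
        for (size_t j = 0; j < ng; j++) {
            if (g[j].student_id == s[i].id) {
                sum += g[j].grade;
                count++;
            }
        }
        if (count > 0 && (double)sum / (double)count > (double)threshold)
            hits++;
    }
    return hits;
}

/* Set-oriented style: filter, join and group in one pass per table. */
int count_high_avg_grouped(const struct student *s, size_t ns,
                           const struct grade *g, size_t ng, int threshold)
{
    long sum[MAX_ID] = {0};
    long count[MAX_ID] = {0};
    int wanted[MAX_ID] = {0};
    int hits = 0;

    for (size_t i = 0; i < ns; i++) {            /* WHERE DEGREE = 2 */
        assert(s[i].id >= 0 && s[i].id < MAX_ID);
        if (s[i].degree == 2)
            wanted[s[i].id] = 1;
    }
    for (size_t j = 0; j < ng; j++) {            /* join and GROUP BY */
        int id = g[j].student_id;
        assert(id >= 0 && id < MAX_ID);
        if (wanted[id]) {
            sum[id] += g[j].grade;
            count[id]++;
        }
    }
    for (int id = 0; id < MAX_ID; id++) {        /* HAVING AVG(GRADE) > threshold */
        if (count[id] > 0 &&
            (double)sum[id] / (double)count[id] > (double)threshold)
            hits++;
    }
    return hits;
}
```

The nested version does work proportional to the number of students times the number of grades, while the grouped version is linear in the size of both tables. In the database setting the single query also gives the optimizer the whole computation to plan.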

The Internal Representation: Query Graphs
- Temporal abstraction
- Generate / Join
- Filter
- Map
- Aggregate
- Wide-spectrum formalism
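
One loose way to read the operator names above is as stages of a stream pipeline. The sketch below is an illustration under that assumption only; it is not the query-graph representation used by MIDAS, and every name in it is hypothetical. A single loop generates the items, filters them, maps each survivor, and folds the results into an aggregate.

```c
#include <assert.h>
#include <stddef.h>

typedef int  (*pred_fn)(int);        /* Filter: keep an item or not       */
typedef int  (*map_fn)(int);         /* Map: transform one item           */
typedef long (*fold_fn)(long, int);  /* Aggregate: fold an item into acc  */

/* Run a fixed generate -> filter -> map -> aggregate pipeline over 'src'. */
long run_pipeline(const int *src, size_t n,
                  pred_fn keep, map_fn f, fold_fn acc, long init)
{
    assert(src != NULL || n == 0);
    long result = init;
    for (size_t i = 0; i < n; i++) {       /* Generate */
        if (!keep(src[i]))                 /* Filter   */
            continue;
        result = acc(result, f(src[i]));   /* Map, then Aggregate */
    }
    return result;
}

/* Example stages: sum the squares of the even items. */
int  is_even(int x)          { return x % 2 == 0; }
int  square(int x)           { return x * x; }
long add(long total, int x)  { return total + x; }
```

For the input `{1, 2, 3, 4}`, calling `run_pipeline(src, 4, is_even, square, add, 0)` keeps 2 and 4, squares them to 4 and 16, and returns 20.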

Filtering Transformation

Join-Down Transformation

Accumulation Transformation

Performance Results

Conclusions
- Translation by abstraction, transformation, and re-implementation demonstrated in two domains
- Query graphs as abstraction for database operations
- Adapts to different schema transformations
- Scalability

Conclusions and Future Work
- Appropriate domain
- Same host language
- Few cliches give wide coverage
- Important commercially
- Generalizations
- Other legacy models
- OODB / 4GL as targets

Questions? Papers can be downloaded from