Tips and Tricks for Producing Easily Maintainable Code in SIR or Using SIR Compiler Directives to produce Data Driven Systems Frances Williams Institute.

Presentation transcript:

Tips and Tricks for Producing Easily Maintainable Code in SIR
or
Using SIR Compiler Directives to Produce Data Driven Systems
Frances Williams
Institute for Social and Economic Research, University of Essex

Introduction
Data driven systems
SIR compiler directives
  – GLOBAL
  – CIF
  – DO REPEAT
Brief description of the project
Towards a data driven system
Hidden time bombs

Data Driven Systems
A large system
  – Lots of retrievals / programs
  – Requires substantial modification each year
Aim to define what needs changing up top
As little modification to underlying code as possible

Compiler Directives – GLOBAL
Defined anywhere within the SIR environment
  – Keeps its value across RETRIEVALs
  – Used in places where variables cannot be used
      GLOBAL RECNAM = AREC
      ……………..
      Process rec <RECNAM>
Value substituted at compile time
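The compile-time substitution described on this slide can be modelled outside SIR. The sketch below is Python, not SIR syntax, and it assumes globals are referenced in source text as `<NAME>`. The function name `substitute_globals` is hypothetical and exists only for illustration.

```python
import re


def substitute_globals(source: str, global_values: dict[str, str]) -> str:
    """Replace each <NAME> reference with the text of the matching global."""

    def lookup(match: re.Match) -> str:
        name = match.group(1)
        # Unknown references are left untouched so they are easy to spot.
        return global_values.get(name, match.group(0))

    return re.sub(r"<(\w+)>", lookup, source)


# Models GLOBAL RECNAM = AREC followed by a reference to the global.
expanded = substitute_globals("Process rec <RECNAM>", {"RECNAM": "AREC"})
print(expanded)
```

Because the replacement is purely textual, the global can stand in for a record name, which an ordinary run-time variable cannot do.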

Compiler Directives – CIF
Compile IF
Determines whether code is to be compiled or not
Usually used with GLOBALs
  GLOBAL WANTA = 1
  CIF DEF
  .Call FAM.WANTA
  CIF FALSE
  .Write “WANTA not called”
  CIF END
  CIF EQ, 1

Compiler Directives – DO REPEAT
For repetitive pieces of code
Expanded by the compiler
Can have any number of repeat symbols, each of which maps to a parameter list
Parameter lists should be of the same length
Cannot be nested
Extremely useful

Compiler Directives – DO REPEAT
  DO REPEAT rtype = AREC BREC CREC /
            v1 = AVAR1 BVAR1 CVAR1 /
            v2 = AVAR2 BVAR2 CVAR2 /
  .process rec rtype
  .Compute v1 = v2
  .end rec
  END REPEAT

Compiler Directives – DO REPEAT
This is expanded by the compiler to:
  Process rec AREC
  .Compute AVAR1 = AVAR2
  End rec
  Process rec BREC
  .Compute BVAR1 = BVAR2
  End rec
  Process rec CREC
  .Compute CVAR1 = CVAR2
  End rec
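The expansion above can be reproduced with a small model. This is a Python sketch of the idea, not the SIR compiler's actual algorithm: it replaces whole space-separated words only, and the function name `expand_do_repeat` is hypothetical.

```python
def expand_do_repeat(symbols: dict[str, list[str]], body: list[str]) -> list[str]:
    """Emit one copy of body per position in the parameter lists."""
    lengths = {len(values) for values in symbols.values()}
    if len(lengths) != 1:
        raise ValueError("parameter lists must all be the same length")

    expanded = []
    for i in range(lengths.pop()):
        for line in body:
            words = line.split(" ")
            # Swap each repeat symbol for the i-th entry of its list.
            expanded.append(
                " ".join(symbols[w][i] if w in symbols else w for w in words)
            )
    return expanded


expanded = expand_do_repeat(
    {
        "rtype": ["AREC", "BREC", "CREC"],
        "v1": ["AVAR1", "BVAR1", "CVAR1"],
        "v2": ["AVAR2", "BVAR2", "CVAR2"],
    },
    [".process rec rtype", ".Compute v1 = v2", ".end rec"],
)
print("\n".join(expanded))
```

The length check mirrors the rule that parameter lists should be of the same length: a short list would otherwise leave some copies of the body without a value to substitute.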

Brief Description of Project
Very brief
Large social science survey
  – Interview people each year
  – Interview the same people each year
  – ~20% of questions change
  – 13th year
  – Data added to Survey database
  – Converted to User database

Brief Description of Project (cont)
Conversion process
  – Conversion done once a year
      – This year’s data added
  – Structure flattened
  – Variable names changed
  – Derived variables calculated
  – Imputations done
  – Weightings calculated
  – Output into SPSS, SAS and Stata

Brief Description of Project (cont)
Code written long ago by researchers, not programmers
It works but …
  – Needs a lot of modifying each year
  – Difficult to know where things need changing
  – It contains errors
Tight deadlines
Aim to create a data driven system

Brief Description of Project (cont)
If it ain’t broke, don’t fix it
  – Particularly if you don’t have much time
  – But if code needs changing anyway …
Step by step approach
  – Choose one section to rewrite
  – Rewrite so minimal changes are required in future
  – Make sure changes work
  – More can be done next year
Have rewritten the code for derived variables

Towards a Data Driven System
Derived variables
  – Non-questionnaire variables calculated from questions asked or from other derived variables
  – Same each year: calculated from core variables
  – Some easy to calculate, some difficult
  – Rewrite so underlying code never needs changing

Towards a Data Driven System
Variable naming
  – Wave prefix plus root name
  – Wave prefix
      – ‘A’ year 1
      – ‘B’ year 2
      – ‘M’ year 13
  – Root name invariable, e.g. HSROOM
  – At wave 13, MHSROOM
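The naming scheme maps a wave number to a single letter and prepends it to the root name. A minimal Python sketch of that mapping follows. The function names `wave_prefix` and `wave_variable` are hypothetical, and the upper limit assumes the prefix stays a single letter A to Z.

```python
from string import ascii_uppercase


def wave_prefix(wave: int) -> str:
    """Return the prefix letter for a wave: 1 -> 'A', 2 -> 'B', 13 -> 'M'."""
    if not 1 <= wave <= len(ascii_uppercase):
        raise ValueError("single-letter prefixes only cover waves 1 to 26")
    return ascii_uppercase[wave - 1]


def wave_variable(wave: int, root: str) -> str:
    """Build the database variable name for a root name at a given wave."""
    return wave_prefix(wave) + root


name_at_wave_13 = wave_variable(13, "HSROOM")
print(name_at_wave_13)
```

With single-letter prefixes the scheme has no letter left for wave 27, so the sketch raises an error at that point rather than guessing a name.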

Towards a Data Driven System – GLOBALS
  GLOBAL WP = M
  GLOBAL CURYR = 2003
Globals for values and conditions that might change
VOTE
  – Calculated from one of two variables, VOTE3 or VOTE5

Towards a Data Driven System – GLOBALS
Code incorrect!
  GLOBAL MAXVOTE = 17

Towards a Data Driven System – GLOBALS
Ask respondents about income received from non-earnings
  – Pensions
  – Child benefit
  – Disability payments
  – Income support
  – Rent etc

Towards a Data Driven System – GLOBALS
Create derived variables
  – State benefit payments
  – Non-state pension payments
  – Rent payments etc
Payments change over time
Replace hard coded conditions with GLOBALS

Towards a Data Driven System – GLOBALS
  ………..
  ifthen (FICODE eq 1 or
          (FICODE GE 5 and FICODE LE 42))
  C Current condition for state benefit payments
  ………………………..
  end if

Towards a Data Driven System – GLOBALS
  GLOBAL FIBCOND =
  $FICODE eq 1 or (FICODE GE 5 and FICODE LE 42)$
  .ifthen (<FIBCOND>)
  ……………………………..
  .end if
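The benefit of naming the condition once can be shown in any language. The Python sketch below is an analogy rather than SIR: a run-time function stands in for the compile-time global. The condition itself is the one on the slide, while `state_benefit_total` and the sample payments are hypothetical.

```python
def is_state_benefit(ficode: int) -> bool:
    """Current condition for state benefit payments, defined in one place."""
    return ficode == 1 or 5 <= ficode <= 42


def state_benefit_total(payments: list[tuple[int, float]]) -> float:
    """Sum the amounts whose FICODE meets the state benefit condition."""
    return sum(amount for ficode, amount in payments if is_state_benefit(ficode))


# Hypothetical (FICODE, amount) pairs, for illustration only.
sample = [(1, 50.0), (3, 20.0), (42, 10.0), (43, 5.0)]
total = state_benefit_total(sample)
print(total)
```

When the set of qualifying codes changes, only `is_state_benefit` needs editing, just as only the GLOBAL definition would change in the SIR version.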

Towards a Data Driven System – CIF
Lots of retrievals which update database
Want to test code first
  GLOBAL UPDATE = UPDATE,
  GLOBAL DEBUG = 1,
  C
  Retrieval
  ……………
  CIF DEF
  .Put vars VARA = VARA
  CIF FALSE
  .CIF EQ, 1
  .Write “ VARA would be updated to ”, VARA
  .CIF END
  CIF END
  ………..
  End retrieval
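The slide's pattern is a switch between a real update and a report of what would have been updated. A rough Python model follows, in which a run-time flag replaces the compile-time CIF test. The function `put_var` and its parameters are hypothetical.

```python
def put_var(record: dict, name: str, value, update: bool, log=print) -> None:
    """Write value into record when updating; otherwise only report it."""
    if update:
        record[name] = value
    else:
        log(f"{name} would be updated to {value}")


record = {"VARA": 1}
messages = []

# Dry run: the record is left alone and the intended change is logged.
put_var(record, "VARA", 2, update=False, log=messages.append)
dry_run_value = record["VARA"]

# Real run: the record is changed and nothing is logged.
put_var(record, "VARA", 2, update=True, log=messages.append)
print(dry_run_value, record["VARA"], messages)
```

One difference from CIF is worth noting: here both branches exist at run time, whereas CIF decides which branch is compiled at all.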

Towards a Data Driven System – DO REPEAT and GLOBALS
Determines which job is the last job a respondent has had
  – Current job if there is one, otherwise use latest job last time respondent was interviewed
  – Procedure about 250 lines in length
  – New bits to be added each year throughout the procedure
Lots of similar pieces of code, e.g. highest educational qualifications

Towards a Data Driven System – DO REPEAT and GLOBALS
  GLOBAL RWP = $L K J I H G F E D C B A$
  GLOBAL RVAL = $ $
  C
  BEGIN
  .process record XWAVEID with (PID)
  C
  .do repeat WP = <RWP> /
             VAL = <RVAL> /
  .get vars WP!HID WP!PNO
  .ifthen (WP!IVFIO eq 1)
  .old rec is WP!INDRESP (WP!HID WP!PNO)
  .get vars JLID=WP!JLID
  .ifthen ( JLID ge 0 and JLID le 12 )
  .compute JLID = VAL
  .end if
  .end record
  .exit begin
  .end if
  . end repeat
  .end rec
  END BEGIN
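The RWP global appears to list the prefixes of all earlier waves, most recent first, for a current wave prefix of M. Assuming that reading, the list could be derived from the current prefix rather than typed by hand. The Python sketch below shows this; `previous_wave_prefixes` is a hypothetical helper, not part of SIR.

```python
from string import ascii_uppercase


def previous_wave_prefixes(current_prefix: str) -> str:
    """List the prefixes of earlier waves, most recent first, space-separated."""
    position = ascii_uppercase.index(current_prefix)
    return " ".join(reversed(ascii_uppercase[:position]))


rwp = previous_wave_prefixes("M")
print(rwp)
```

Generated this way, the value for the next wave follows from changing the single current-prefix setting.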

Towards a Data Driven System – DO REPEAT and GLOBALS
Code never needs changing
Globals defined once at beginning of code
Work every year (until year 27!)
Used as often as required

Hidden Time Bombs
Database variables have wave prefix
Procedure calculates identifier of last job as JLID and then updates DB variable JLID
Calculates ‘latest job found’ as temporary variable LJLID
Fine until wave 12 (wave prefix ‘L’, so the DB variable is also named LJLID)
Even more at wave 14
  – N used as counter in many places

Hidden Time Bombs
Ensure names are different from any DB variable name
  – Current
  – Or future
  – E.g. X_TEMPV if underscores are never used for DB names
  – Much easier with SIR XS
Would not have detected if run in non-updating mode
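A check for this kind of collision can be automated. The Python sketch below assumes database variable names are a single-letter wave prefix followed by a root name, and reports the first wave at which a temporary name would clash. `first_colliding_wave` and the root set used here are hypothetical.

```python
from string import ascii_uppercase
from typing import Optional


def first_colliding_wave(temp_name: str, db_roots: set[str]) -> Optional[int]:
    """Return the first wave whose prefixed DB names include temp_name."""
    for wave, prefix in enumerate(ascii_uppercase, start=1):
        if any(prefix + root == temp_name for root in db_roots):
            return wave
    return None


roots = {"JLID", "HSROOM"}  # illustrative subset of root names
ljlid_wave = first_colliding_wave("LJLID", roots)
safe_wave = first_colliding_wave("X_TEMPV", roots)
print(ljlid_wave, safe_wave)
```

A temporary name containing a character that never occurs in database names, such as the underscore in X_TEMPV, cannot match any prefixed root, so the check returns `None` for it.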

Conclusion
Make use of SIR compiler directives to produce a data driven system
Rewrite sections so that they will never need rewriting again
No need to do everything at once
Name temporary variables carefully