Tips and Tricks for Producing Easily Maintainable Code in SIR or Using SIR Compiler Directives to produce Data Driven Systems Frances Williams Institute.


1 Tips and Tricks for Producing Easily Maintainable Code in SIR
or
Using SIR Compiler Directives to produce Data Driven Systems
Frances Williams
Institute for Social and Economic Research
University of Essex

2 Introduction
Data driven systems
SIR Compiler Directives
  – GLOBAL
  – CIF
  – DO REPEAT
Brief description of the project
Towards a data driven system
Hidden time-bombs

3 Data Driven Systems
A large system
  – Lots of retrievals / programs
  – Requires substantial modification each year
Aim to define what needs changing up top
As little modification to underlying code as possible

4 Compiler Directives - GLOBAL
Defined anywhere within SIR environment
  – Keeps value across RETRIEVALs
  – Used in places where variables cannot be used
      GLOBAL RECNAM = AREC
      ……………..
      Process rec <RECNAM>
Value substituted at compile time

5 Compiler Directives CIF
Compile IF
Determines whether code is to be compiled or not
Usually used with GLOBALs
    GLOBAL WANTA = 1
    CIF DEF <WANTA>
    .Call FAM.WANTA
    CIF FALSE
    .Write “WANTA not called”
    CIF END
    CIF EQ <WANTA>, 1

6 Compiler Directives – DO REPEAT
For repetitive pieces of code
Expanded by the compiler
Can have any number of repeat symbols which map to a parameter list
Parameter lists should be of the same length
Cannot be nested
Extremely useful

7 Compiler Directives – DO REPEAT
    DO REPEAT rtype = AREC BREC CREC /
              v1 = AVAR1 BVAR1 CVAR1 /
              v2 = AVAR2 BVAR2 CVAR2 /
    .process rec rtype
    .Compute v1 = v2
    . end rec
    END REPEAT

8 Compiler Directives – DO REPEAT
This is expanded by the compiler to:
    Process rec AREC
    .Compute AVAR1 = AVAR2
    End rec
    Process rec BREC
    .Compute BVAR1 = BVAR2
    End rec
    Process rec CREC
    .Compute CVAR1 = CVAR2
    End rec
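The expansion on slides 7 and 8 can be imitated outside SIR. The following is a minimal Python sketch, not SIR code and not the compiler's actual algorithm: the function name expand_do_repeat and the whole-word matching rule are assumptions made for illustration. It repeats a body once per position in the parameter lists and substitutes each repeat symbol with the value at that position.

```python
import re


def expand_do_repeat(symbols, body):
    """Emulate DO REPEAT expansion by plain text substitution.

    symbols maps each repeat symbol to its parameter list.
    All lists must be the same length, as slide 6 recommends.
    (Hypothetical helper; matching is case-sensitive here.)
    """
    lengths = {len(values) for values in symbols.values()}
    if len(lengths) != 1:
        raise ValueError("parameter lists must be the same length")
    count = lengths.pop()

    # Match repeat symbols as whole words only, so that "v1" is not
    # replaced inside a longer name.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in symbols) + r")\b"
    )

    blocks = []
    for i in range(count):
        # Single pass per copy: substituted text is not scanned again.
        blocks.append(pattern.sub(lambda m: symbols[m.group(1)][i], body))
    return "\n".join(blocks)


symbols = {
    "rtype": ["AREC", "BREC", "CREC"],
    "v1": ["AVAR1", "BVAR1", "CVAR1"],
    "v2": ["AVAR2", "BVAR2", "CVAR2"],
}
body = ".process rec rtype\n.Compute v1 = v2\n.end rec"
print(expand_do_repeat(symbols, body))
```

The length check mirrors the advice that parameter lists should be of the same length; what SIR itself does with unequal lists is not stated in the slides.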

9 Brief Description of Project
Very Brief
Large Social Science Survey
  – Interview 15000 people each year
  – Interview the same people each year
  – ~20% questions change
  – 13th year
  – Data added to Survey database
  – Converted to User database

10 Brief Description of Project (cont)
Conversion Process
  – Conversion done once a year
      This year’s data added
  – Structure flattened
  – Variable names changed
  – Derived variables calculated
  – Imputations done
  – Weightings calculated
  – Output into SPSS, SAS and Stata

11 Brief Description of Project (cont)
Code written long ago by researchers, not programmers
It works but …
  – Needs a lot of modifying each year
  – Difficult to know where things need changing
  – It contains errors
Tight deadlines
Aim to create data driven system

12 Brief Description of Project (cont)
If it ain’t broke don’t fix it
  – Particularly if you don’t have much time
  – But if code needs changing anyway …
Step by Step approach
  – Choose one section to rewrite
  – Rewrite so minimal changes required in future
  – Make sure changes work
  – More can be done next year
Have rewritten the code for derived variables

13 Towards a Data Driven System
Derived Variables
  – Non questionnaire variables calculated from questions asked or other derived variables
  – Same each year – calculated from core variables
  – Some easy to calculate, some difficult
  – Re-write so underlying code never needs changing

14 Towards a Data Driven System
Variable naming
  – Wave prefix plus root name
  – Wave prefix
      ‘A’ year 1
      ‘B’ year 2
      ‘M’ year 13
  – Root name invariable - HSROOM
  – At wave 13, MHSROOM
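The naming rule on slide 14 is simple enough to state as code. This is a minimal Python sketch, assuming the prefix is always a single letter with ‘A’ for year 1, ‘B’ for year 2 and so on; the function names wave_prefix and wave_variable are hypothetical and appear nowhere in the project.

```python
import string


def wave_prefix(wave):
    """Return the single-letter prefix for a wave (1 -> 'A', 13 -> 'M')."""
    if not 1 <= wave <= 26:
        # Assumption: one letter per wave, so the scheme has 26 prefixes.
        raise ValueError("single-letter wave prefixes cover waves 1 to 26 only")
    return string.ascii_uppercase[wave - 1]


def wave_variable(wave, root):
    """Build a database variable name from a wave number and a root name."""
    return wave_prefix(wave) + root


print(wave_variable(13, "HSROOM"))  # MHSROOM
```

Under that single-letter assumption the scheme runs out of prefixes after wave 26, which is why the sketch rejects wave 27.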

15 Towards a Data Driven System GLOBALS
    GLOBAL WP = M
    GLOBAL CURYR = 2003
Globals for values and conditions that might change
VOTE
  – Calculated from one of two variables, VOTE3 or VOTE5

16 Towards a Data Driven System GLOBALS
Code incorrect!
    GLOBAL MAXVOTE = 17

17 Towards a Data Driven System GLOBALS
Ask respondents about income received from non-earnings
  – Pensions
  – Child benefit
  – Disability payments
  – Income support
  – Rent etc

18 Towards a Data Driven System GLOBALS
Create derived variables
  – State benefit payments
  – Non-state pension payments
  – Rent payments etc
Payments change over time
Replace hard coded conditions with GLOBALS

19 Towards a Data Driven System GLOBALS
    ………..
    ifthen (FICODE eq 1 or
            (FICODE GE 5 and FICODE LE 42))
    C Current condition for state benefit payments
    ………………………..
    end if

20 Towards a Data Driven System GLOBALS
    GLOBAL FIBCOND =
      $FICODE eq 1 or (FICODE GE 5 and FICODE LE 42)$
    .ifthen (<FIBCOND>)
    ……………………………..
    .end if
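The effect of moving the condition into the FIBCOND global can be imitated with plain text substitution. This is a minimal Python sketch, not SIR: the function substitute_globals is hypothetical, and the angle-bracket placeholder form it recognises is an assumption made for the sketch. The condition text is taken from slide 20.

```python
import re

# Defined once "up top"; only this table changes when the condition changes.
GLOBALS = {
    "FIBCOND": "FICODE eq 1 or (FICODE GE 5 and FICODE LE 42)",
}


def substitute_globals(source, globals_table):
    """Replace each <NAME> placeholder with the text of the global NAME.

    Mimics "value substituted at compile time"; unknown names raise KeyError.
    """
    def replace(match):
        name = match.group(1)
        if name not in globals_table:
            raise KeyError(f"undefined global: {name}")
        return globals_table[name]

    return re.sub(r"<(\w+)>", replace, source)


print(substitute_globals(".ifthen (<FIBCOND>)", GLOBALS))
```

When the set of state benefit codes changes, only the entry in the table is edited; the line that uses it stays as it is.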

21 Towards a Data Driven System CIF
Lots of retrievals which update database
Want to test code first
    GLOBAL UPDATE = UPDATE,
    GLOBAL DEBUG = 1,
    C
    Retrieval
    ……………
    CIF DEF <UPDATE>
    .Put vars VARA = VARA
    CIF FALSE
    .CIF EQ <DEBUG>, 1
    .Write “ VARA would be updated to ”, VARA
    .CIF END
    CIF END
    ………..
    End retrieval
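The test-before-update pattern on slide 21 has a straightforward analogue in other languages. This is a minimal Python sketch under stated assumptions: it uses run-time flags, whereas CIF decides at compile time which lines exist at all, and the function name put_var and the dictionary standing in for a database record are hypothetical.

```python
def put_var(record, name, value, update=False, debug=False):
    """Write value into record only in updating mode.

    In non-updating mode the record is left untouched; with debug on,
    a message describing the skipped update is returned instead.
    """
    messages = []
    if update:
        record[name] = value
    elif debug:
        messages.append(f"{name} would be updated to {value}")
    return messages


record = {"VARA": 1}

# Dry run: nothing is written, but the intended change is reported.
print(put_var(record, "VARA", 2, debug=True))

# Updating run: the value is written and no message is produced.
put_var(record, "VARA", 2, update=True)
print(record)
```

A dry run of this kind only shows what would be written; it does not exercise the write itself.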

22 Towards a Data Driven System DO REPEAT and GLOBALS
Determines which job is the last job a respondent has had
  – Current job if there is one, otherwise use latest job last time respondent was interviewed
  – Procedure about 250 lines in length
  – New bits to be added each year throughout the procedure
Lots of similar pieces of code e.g. highest educational qualifications

23 Towards a Data Driven System DO REPEAT and GLOBALS
    GLOBAL RWP = $L K J I H G F E D C B A$
    GLOBAL RVAL = $24 23 22 21 20 19 18 17 16 15 14 13 $
    C
    BEGIN
    .process record XWAVEID with (PID)
    C
    .do repeat WP = <RWP> /
               VAL = <RVAL> /
    .get vars WP!HID WP!PNO
    .ifthen (WP!IVFIO eq 1)
    .old rec is WP!INDRESP (WP!HID WP!PNO)
    .get vars JLID=WP!JLID
    .ifthen ( JLID ge 0 and JLID le 12 )
    .compute JLID = VAL
    .end if
    .end record
    .exit begin
    .end if
    . end repeat
    .end rec
    END BEGIN

24 Towards a Data Driven System DO REPEAT and GLOBALS
Code never needs changing
Globals defined once at beginning of code
Work every year (until year 27!)
Used as often as required

25 Hidden Time Bombs
Database variables have wave prefix
Procedure calculates identifier of last job as JLID and then updates DB variable JLID
Calculates ‘latest job found’ as temporary variable LJLID
Fine until wave 12, when the wave prefix L makes LJLID a DB variable name
Even more at wave 14
  – N used as counter in many places

26 Hidden Time Bombs
Ensure names are different from any DB variable name
  – Current
  – Or future
  – E.g. X_TEMPV if underscores never used for DB names
  – Much easier with SIR XS
Would not have detected if run in non-updating mode
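A check of this kind can be automated before a new wave is added. This is a minimal Python sketch, assuming single-letter wave prefixes (‘A’ for wave 1 onwards) and case-insensitive names; the function collision_waves is hypothetical. It reports the waves at which a temporary variable name would coincide with a wave-prefixed database variable.

```python
import string


def collision_waves(temp_name, root_names, max_wave=26):
    """Return the waves at which temp_name equals prefix + some root name."""
    if not 1 <= max_wave <= 26:
        raise ValueError("single-letter wave prefixes cover waves 1 to 26 only")
    hits = []
    for wave in range(1, max_wave + 1):
        prefix = string.ascii_uppercase[wave - 1]
        # Compare in upper case: names are assumed case-insensitive here.
        if any(temp_name.upper() == prefix + root.upper() for root in root_names):
            hits.append(wave)
    return hits


roots = ["JLID", "HSROOM"]
print(collision_waves("LJLID", roots))    # [12]
print(collision_waves("X_TEMPV", roots))  # []
```

The temporary name LJLID clashes at wave 12 because the twelfth prefix is L, whereas a name containing an underscore can never match if underscores are never used in DB names.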

27 Conclusion
Make use of SIR compiler directives to produce data driven system
Rewrite sections so that they will never need rewriting again
No need to do everything at once
Name temporary variables carefully

