# Combining Lags and Arrays [and a little macro] Lynn Lethbridge SHRUG OCT 28, 2011.

## Presentation on theme: "Combining Lags and Arrays [and a little macro] Lynn Lethbridge SHRUG OCT 28, 2011."— Presentation transcript:

Combining Lags and Arrays [and a little macro] Lynn Lethbridge SHRUG OCT 28, 2011

Lag Function  A tool to manipulate data across observations  Can lag back as far as you like (default is a lag of 1)  Can be frustrating! Don’t put a lag in a conditional statement; create a variable with the lag value first and use the variable in the conditional statement

Example 1a Obs x sex z 1 1 M. 2 1 M 1 3 3 F. 4 5 M 3 5 5 F. 6 8 F. 7 8 F. 8 8 F. 9 9 F. 10 10 M 9 data work.temp; input x sex \$; lx=lag(x); if sex='M' then z=lx; drop lx; cards; 1 M 3 F 5 M 5 F 8 F 9 F 10 M ; run;

Example 1b data work.temp; input x sex \$; if sex='M' then z=lag(x); cards; 1 M 3 F 5 M 5 F 8 F 9 F 10 M ; run; Obs x sex z 1 1 M. 2 1 M 1 3 3 F. 4 5 M 1 5 5 F. 6 8 F. 7 8 F. 8 8 F. 9 9 F. 10 10 M 5

Arrays  Allows you to assign a single name to multiple variables  Can use an array to manipulate data across variables without having to write them out multiple times  Can create new variables in arrays

data work.temp; input id sex \$ a b c; array oldvars {*} a b c; array newvars {*} var1-var3; do i=1 to dim(oldvars); if sex='f' then newvars{i}=oldvars{i}; end; drop i; cards; 1 m 1 3 4 1 m 2 2 4 3 m 6 2 6 3 f 7 1 5 4 m 8 4 7 4 f 1 4 8 5 f 4 9 5 6 m 3 8 9 ; run;

Obs id sex a b c var1 var2 var3 1 1 m 1 3 4... 2 1 m 2 2 4... 3 3 m 6 2 6... 4 3 f 7 1 5 7 1 5 5 4 m 8 4 7... 6 4 f 1 4 8 1 4 8 7 5 f 4 9 5 4 9 5 8 6 m 3 8 9...

Lags and Arrays  Lags allow you to work across observations  Arrays allow you to work across variables  Combining the two lets you work efficiently with your data as a matrix

Combined Example  Suppose you have multiple observations per ID  Ordered first by ID and then by date  You want to know if a code appears twice in a row for any individual ID across multiple variables

data work.temp; input id date a \$ b \$ c \$; cards; 1 12951 a1 b2 g5 1 12999 a4 a1 v4 1 13777 s3 f5 g5 3 11111 a1 a1 g5 3 12222 h6 i8 t5 3 13333 b2 f4 f4 3 14444 d4 f4 s3 4 10000 c1 g4 d4 4 11111 f4 d4 s3 5 9900 d1 s2 w2 5 10000 a1 s2 d1 5 13456 a1 a2 a1 5 14566 f4 f3 f4 5 15999 d1 a1 a1 ; run; Data

Lags and Arrays data work.temp2; set work.temp; lagid=lag(id); array vars {*} a b c; array newvars {*} \$ na nb nc; do i=1 to 3; lagvars=lag(vars{i}); if id=lagid and lagvars=vars{i} then newvars{i}=vars{i}; end; drop lagid i lagvars; run;

Printed Output Obs id date a b c na nb nc 1 1 17JUN95 a1 b2 g5 2 1 04AUG95 a4 a1 v4 3 1 20SEP97 s3 f5 g5 4 3 03JUN90 a1 a1 g5 5 3 18JUN93 h6 i8 t5 6 3 03JUL96 b2 f4 f4 7 3 19JUL99 d4 f4 s3 f4 8 4 19MAY87 c1 g4 d4 9 4 03JUN90 f4 d4 s3 10 5 08FEB87 d1 s2 w2 11 5 19MAY87 a1 s2 d1 s2 12 5 03NOV96 a1 a2 a1 a1 13 5 18NOV99 f4 f3 f4 14 5 21OCT03 d1 a1 a1

Higher Order Lag data work.temp; input id a ; la3=lag3(a); cards; 1 1 2 3 3 4 4 5 4 6 5 7 6 8 ; run; Obs a la3 1 1. 2 2. 3 3. 4 4 1 5 5 2 6 6 3 7 7 4 8 8 5

Higher Order Lag with a Macro %let num=3; data work.temp; input id a ; la&num=lag&num(a); cards; 1 1 2 3 3 4 4 5 4 6 5 7 6 8 ; run; Obs a la3 1 1. 2 2. 3 3. 4 4 1 5 5 2 6 6 3 7 7 4 8 8 5