Published by Sally Rachford. Modified about 1 year ago.

1
Spectre (Clinical) %unistats

A flexible macro to give you “proc univariate” descriptive statistics with category counts and percentages (plus optional p-values)

2
%unistats

Using the %unistats macro you will find it easy to create tables of descriptive statistics plus category counts with percentages. Adding p-values is easy, and the layout of the report is very much under your control. You will see a number of demonstrations as well as how to tailor the output to meet your needs. The code you see and the output produced are exactly as run.

3
Demo 1

Here is the very first demonstration. You will see the code, and the output produced will be on the next slide. The sasuser.demog dataset is subset on FASCD=1 and then sorted. The %popfmt macro is then called to calculate the population totals, and %unistats is called to produce the report. Note that unicatrep=yes must be specified in the call to %unistats, as no report is produced by default.

4
Demo 1 code

proc sort data=sasuser.demog(where=(fascd=1))
          out=demog(drop=fascd);
  by patno invid;
run;

%popfmt(demog,trtcd,uniqueid=patno invid)

%unistats(dsin=demog,unicatrep=yes,
          stats=N Min Mean Max STD.,
          varlist=sexcd racecd age weight);

5
Demo 1 output

__________________________________________________________________________________________________
Number of Patients (%) /
Descriptive Stat              Ambicor (1g/day)    Betamaxin (500mg/day)
                                   (N=9)                  (N=8)
__________________________________________________________________________________________________
GENDER CODE
   MALE                          7 ( 77.8)              1 ( 12.5)
   FEMALE                        2 ( 22.2)              7 ( 87.5)

RACE CODE
   CAUCASIAN                     5 ( 55.6)              1 ( 12.5)
   BLACK                         2 ( 22.2)              3 ( 37.5)
   ASIAN                         2 ( 22.2)              3 ( 37.5)
   OTHER                         0 (  0.0)              1 ( 12.5)

AGE (YEARS)
   N                             9                      8
   Min
   Mean
   Max
   STD

WEIGHT (KG)
   N                             9                      8
   Min
   Mean
   Max
   STD

6
Demo 1 output comments

The previous output could be improved, and you will be shown how. Bad points: “Descriptive Stat” has been truncated; the variable labels are all in capitals (these are the actual variable labels); the gender and race text is in capitals; and the min and max age would be better shown as integers. Good points: the report fits the page, the decimal points are aligned, the wider “Betamaxin” column fits well, and the statistics labels stay as they are while being correctly mapped to the statistics keywords for “proc univariate”.
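For orientation, the statistics requested in the report (N, Min, Mean, Max, STD) correspond to keywords on the OUTPUT statement of “proc univariate”. The sketch below is not the macro's own code; it only illustrates that keyword mapping in base SAS. The dataset name ages, the output variable names and the age values are all hypothetical and are not the sasuser.demog data.

```sas
/* Hypothetical ages for illustration only; not the sasuser.demog values */
data ages;
  input trtcd age @@;
  datalines;
1 34 1 41 1 29 2 38 2 45 2 31
;
run;

proc sort data=ages;
  by trtcd;
run;

/* One OUTPUT keyword per requested statistic: N, Min, Mean, Max, STD */
proc univariate data=ages noprint;
  by trtcd;
  var age;
  output out=agestats n=age_n min=age_min mean=age_mean
                      max=age_max std=age_std;
run;

proc print data=agestats noobs;
run;
```

The printed dataset has one row per treatment code, which is the shape a report macro would need before transposing the statistics into treatment columns.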

7
Demo 2 code

proc sort data=sasuser.demog(where=(fascd=1))
          out=demog(drop=fascd);
  by patno invid;
run;

data demog;
  set demog;
  label sexcd="Gender"
        racecd="Race"
        age="Age (yrs)"
        weight="Weight (kg)"
        ;
run;

%popfmt(demog,trtcd,uniqueid=patno invid)

%unistats(dsin=demog,unicatrep=yes,
          trtlabel="Number of Patients (%) /" "Descriptive Statistics" " ",
          lowcasevarlist=sexcd racecd,
          stats=N Min Mean Max STD.,
          varlist=sexcd racecd age weight);

8
Demo 2 output

__________________________________________________________________________________________________
Number of Patients (%) /
Descriptive Statistics        Ambicor (1g/day)    Betamaxin (500mg/day)
                                   (N=9)                  (N=8)
__________________________________________________________________________________________________
Gender
   Male                          7 ( 77.8)              1 ( 12.5)
   Female                        2 ( 22.2)              7 ( 87.5)

Race
   Caucasian                     5 ( 55.6)              1 ( 12.5)
   Black                         2 ( 22.2)              3 ( 37.5)
   Asian                         2 ( 22.2)              3 ( 37.5)
   Other                         0 (  0.0)              1 ( 12.5)

Age (yrs)
   N                             9                      8
   Min
   Mean
   Max
   STD

Weight (kg)
   N                             9                      8
   Min
   Mean
   Max
   STD

9
Demo 2 output comments

The new variable labels look better, the “Descriptive Stat” problem has been fixed and setting the text to lowercase has helped. Some issues remain with the numeric formats, especially the min and max age, which should be integers. Before fixing the numeric formats, how about a “Total” column? That one is easy: just specify total=yes in the macro call.

10
Demo 3 code

proc sort data=sasuser.demog(where=(fascd=1))
          out=demog(drop=fascd);
  by patno invid;
run;

data demog;
  set demog;
  label sexcd="Gender"
        racecd="Race"
        age="Age (yrs)"
        weight="Weight (kg)"
        ;
run;

%popfmt(demog,trtcd,uniqueid=patno invid)

%unistats(dsin=demog,unicatrep=yes,total=yes,
          trtlabel="Number of Patients (%) /" "Descriptive Statistics" " ",
          lowcasevarlist=sexcd racecd,
          stats=N Min Mean Max STD.,
          varlist=sexcd racecd age weight);

11
Demo 3 output

__________________________________________________________________________________________________
Number of Patients (%) /
Descriptive Statistics        Ambicor (1g/day)    Betamaxin (500mg/day)        Total
                                   (N=9)                  (N=8)               (N=17)
__________________________________________________________________________________________________
Gender
   Male                          7 ( 77.8)              1 ( 12.5)            8 ( 47.1)
   Female                        2 ( 22.2)              7 ( 87.5)            9 ( 52.9)

Race
   Caucasian                     5 ( 55.6)              1 ( 12.5)            6 ( 35.3)
   Black                         2 ( 22.2)              3 ( 37.5)            5 ( 29.4)
   Asian                         2 ( 22.2)              3 ( 37.5)            5 ( 29.4)
   Other                         0 (  0.0)              1 ( 12.5)            1 (  5.9)

Age (yrs)
   N
   Min
   Mean
   Max
   STD

Weight (kg)
   N
   Min
   Mean
   Max
   STD

12
Demo 3 output comments

You can see that the “Total” column has been added and that this much narrower column fits well on the report. I could have changed the “Total” label in the call to the %popfmt macro, but I let it default to “Total”. Suppose the drug names were longer? The columns could become too wide. To fix that, the format of the treatment variable can be changed, as you will see next.
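The counts and percentages in the Gender rows can be checked in base SAS. This sketch is not how %unistats or %popfmt work internally; it simply rebuilds the Gender cell counts from the Demo 3 output (the dataset and variable names are made up here) and lets “proc freq” compute the per-treatment column percentages and the overall percentages that appear in the “Total” column.

```sas
/* Cell counts copied from the Gender rows of the Demo 3 output.
   Dataset and variable names are hypothetical. */
data gender_counts;
  length trt $ 12 sex $ 6;
  input trt $ sex $ count;
  datalines;
Ambicor Male 7
Ambicor Female 2
Betamaxin Male 1
Betamaxin Female 7
;
run;

proc freq data=gender_counts;
  weight count;                      /* each row stands for COUNT patients */
  tables sex*trt / norow nopercent;  /* column percentages per treatment   */
  tables sex;                        /* overall percentages (Total column) */
run;
```

The column percentages for Male come out as 77.78 and 12.50, and the one-way table gives 47.06 for Male overall, which the report shows rounded to one decimal place.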

13
Demo 4 code

proc format;
  value trtnarr ;
run;

proc sort data=sasuser.demog(where=(fascd=1))
          out=demog(drop=fascd);
  by patno invid;
run;

data demog;
  set demog;
  format trtcd trtnarr.;
  label sexcd="Gender"
        racecd="Race"
        age="Age (yrs)"
        weight="Weight (kg)"
        ;
run;

%popfmt(demog,trtcd,uniqueid=patno invid)

%unistats(dsin=demog,unicatrep=yes,total=yes,
          trtlabel="Number of Patients (%) /" "Descriptive Statistics" " ",
          lowcasevarlist=sexcd racecd,
          stats=N Min Mean Max STD.,
          varlist=sexcd racecd age weight);

14
Demo 4 output

__________________________________________________________________________________________________
Number of Patients (%) /
Descriptive Statistics            Ambicor       Betamaxin
                                 (1g/day)      (500mg/day)       Total
                                   (N=9)          (N=8)         (N=17)
__________________________________________________________________________________________________
Gender
   Male                          7 ( 77.8)      1 ( 12.5)      8 ( 47.1)
   Female                        2 ( 22.2)      7 ( 87.5)      9 ( 52.9)

Race
   Caucasian                     5 ( 55.6)      1 ( 12.5)      6 ( 35.3)
   Black                         2 ( 22.2)      3 ( 37.5)      5 ( 29.4)
   Asian                         2 ( 22.2)      3 ( 37.5)      5 ( 29.4)
   Other                         0 (  0.0)      1 ( 12.5)      1 (  5.9)

Age (yrs)
   N
   Min
   Mean
   Max
   STD

Weight (kg)
   N
   Min
   Mean
   Max
   STD

15
Demo 4 output comments

You can see that the treatment arm columns have been made narrower by using the split character in the trtnarr format and assigning that format to the treatment arm variable. There is another problem in the output, to do with “race”, that is not obvious. On the case report form there was a category for “Hispanic”. This does not appear in the output because there were no Hispanic patients. But if it is defined in the format for race, this category can be forced to appear using the allcatvars= parameter.
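The idea of showing a category that is defined in the format but absent from the data has a base SAS counterpart. The sketch below is an assumption-laden illustration, not the macro's implementation: it uses “proc summary” with the COMPLETETYPES option and PRELOADFMT on the CLASS statement. The numeric race codes and the racefmt format are hypothetical (the real racecd coding is not shown in these slides); the counts per category are the totals from the Demo 4 output.

```sas
proc format;
  /* Hypothetical codes; the real racecd coding is not shown in the slides */
  value racefmt 1="Caucasian" 2="Black" 3="Asian" 4="Hispanic" 5="Other";
run;

/* 6 Caucasian, 5 Black, 5 Asian, 0 Hispanic, 1 Other (Demo 4 totals) */
data race;
  input racecd @@;
  datalines;
1 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 5
;
run;

/* PRELOADFMT loads every formatted value; COMPLETETYPES keeps the
   levels that have no observations */
proc summary data=race completetypes nway;
  class racecd / preloadfmt;
  format racecd racefmt.;
  output out=racecounts(drop=_type_ rename=(_freq_=count));
run;

proc print data=racecounts noobs;
run;
```

The output dataset should contain a row for Hispanic with a count of 0, alongside the four categories that occur in the data.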

16
Demo 5 code

proc format;
  value trtnarr ;
run;

proc sort data=sasuser.demog(where=(fascd=1))
          out=demog(drop=fascd);
  by patno invid;
run;

data demog;
  set demog;
  format trtcd trtnarr.;
  label sexcd="Gender"
        racecd="Race"
        age="Age (yrs)"
        weight="Weight (kg)"
        ;
run;

%popfmt(demog,trtcd,uniqueid=patno invid)

%unistats(dsin=demog,unicatrep=yes,total=yes,
          trtlabel="Number of Patients (%) /" "Descriptive Statistics" " ",
          lowcasevarlist=sexcd racecd,allcatvars=racecd,
          stats=N Min Mean Max STD.,
          varlist=sexcd racecd age weight);

17
Demo 5 output

__________________________________________________________________________________________________
Number of Patients (%) /
Descriptive Statistics            Ambicor       Betamaxin
                                 (1g/day)      (500mg/day)       Total
                                   (N=9)          (N=8)         (N=17)
__________________________________________________________________________________________________
Gender
   Male                          7 ( 77.8)      1 ( 12.5)      8 ( 47.1)
   Female                        2 ( 22.2)      7 ( 87.5)      9 ( 52.9)

Race
   Caucasian                     5 ( 55.6)      1 ( 12.5)      6 ( 35.3)
   Black                         2 ( 22.2)      3 ( 37.5)      5 ( 29.4)
   Asian                         2 ( 22.2)      3 ( 37.5)      5 ( 29.4)
   Hispanic                      0 (  0.0)      0 (  0.0)      0 (  0.0)
   Other                         0 (  0.0)      1 ( 12.5)      1 (  5.9)

Age (yrs)
   N
   Min
   Mean
   Max
   STD

Weight (kg)
   N
   Min
   Mean
   Max
   STD

18
Demo 5 output comments

You can see that the “Hispanic” row has been added with zero counts and percentages. Now the numeric formatting will be addressed. We can change the format used for min and max and tell the weight variable to use an extra decimal place. Take a look at the change to the code and the output.

19
Demo 6 code

proc format;
  value trtnarr ;
run;

proc sort data=sasuser.demog(where=(fascd=1))
          out=demog(drop=fascd);
  by patno invid;
run;

data demog;
  set demog;
  format trtcd trtnarr.;
  label sexcd="Gender"
        racecd="Race"
        age="Age (yrs)"
        weight="Weight (kg)"
        ;
run;

%popfmt(demog,trtcd,uniqueid=patno invid)

%unistats(dsin=demog,unicatrep=yes,total=yes,
          trtlabel="Number of Patients (%) /" "Descriptive Statistics" " ",
          lowcasevarlist=sexcd racecd,allcatvars=racecd,
          stats=N Min Mean Max STD.,minfmt=3.,maxfmt=3.,
          varlist=sexcd racecd age weight/+1);

20
Demo 6 output

__________________________________________________________________________________________________
Number of Patients (%) /
Descriptive Statistics            Ambicor       Betamaxin
                                 (1g/day)      (500mg/day)       Total
                                   (N=9)          (N=8)         (N=17)
__________________________________________________________________________________________________
Gender
   Male                          7 ( 77.8)      1 ( 12.5)      8 ( 47.1)
   Female                        2 ( 22.2)      7 ( 87.5)      9 ( 52.9)

Race
   Caucasian                     5 ( 55.6)      1 ( 12.5)      6 ( 35.3)
   Black                         2 ( 22.2)      3 ( 37.5)      5 ( 29.4)
   Asian                         2 ( 22.2)      3 ( 37.5)      5 ( 29.4)
   Hispanic                      0 (  0.0)      0 (  0.0)      0 (  0.0)
   Other                         0 (  0.0)      1 ( 12.5)      1 (  5.9)

Age (yrs)
   N
   Min
   Mean
   Max
   STD

Weight (kg)
   N
   Min
   Mean
   Max
   STD

21
Demo 6 output comments

The formatting for the age variable is now correct, but the weight variable formatting is not that good: weight is showing too many decimal places. This can be fixed by telling the macro to use the “mean” format for the min and the max of weight. Notice that the next call has “/m” after the weight variable instead of “/+1”.

22
Demo 7 code

proc format;
  value trtnarr ;
run;

proc sort data=sasuser.demog(where=(fascd=1))
          out=demog(drop=fascd);
  by patno invid;
run;

data demog;
  set demog;
  format trtcd trtnarr.;
  label sexcd="Gender"
        racecd="Race"
        age="Age (yrs)"
        weight="Weight (kg)"
        ;
run;

%popfmt(demog,trtcd,uniqueid=patno invid)

%unistats(dsin=demog,unicatrep=yes,total=yes,
          trtlabel="Number of Patients (%) /" "Descriptive Statistics" " ",
          lowcasevarlist=sexcd racecd,allcatvars=racecd,
          stats=N Min Mean Max STD.,minfmt=3.,maxfmt=3.,
          varlist=sexcd racecd age weight/m);

23
Demo 7 output

__________________________________________________________________________________________________
Number of Patients (%) /
Descriptive Statistics            Ambicor       Betamaxin
                                 (1g/day)      (500mg/day)       Total
                                   (N=9)          (N=8)         (N=17)
__________________________________________________________________________________________________
Gender
   Male                          7 ( 77.8)      1 ( 12.5)      8 ( 47.1)
   Female                        2 ( 22.2)      7 ( 87.5)      9 ( 52.9)

Race
   Caucasian                     5 ( 55.6)      1 ( 12.5)      6 ( 35.3)
   Black                         2 ( 22.2)      3 ( 37.5)      5 ( 29.4)
   Asian                         2 ( 22.2)      3 ( 37.5)      5 ( 29.4)
   Hispanic                      0 (  0.0)      0 (  0.0)      0 (  0.0)
   Other                         0 (  0.0)      1 ( 12.5)      1 (  5.9)

Age (yrs)
   N
   Min
   Mean
   Max
   STD

Weight (kg)
   N
   Min
   Mean
   Max
   STD

24
Demo 7 output comments

The weight formatting is now fixed. Sometimes you will want to look at values in two different ways. For “age” it might be good to look at age categories as well as descriptive statistics. To do this, an extra variable needs to be created. To show the two variables next to each other, without a new variable label for the second one, the second variable can be assigned the label “A0”x so that no visible label is displayed. The macro will then throw one blank line instead of two, keeping the two sets of values paired.

25
Demo 8 code

proc format;
  value trtnarr ;
  value agecat
    0-15="<16 yrs"
    16-25=" yrs"
    26-40=" yrs"
    41-65=" yrs"
    66-high=">65 years"
    ;
run;

proc sort data=sasuser.demog(where=(fascd=1))
          out=demog(drop=fascd);
  by patno invid;
run;

data demog;
  set demog;
  agecat=age;
  format trtcd trtnarr. agecat agecat.;
  label sexcd="Gender"
        racecd="Race"
        agecat="Age (yrs)"
        age="A0"x
        weight="Weight (kg)"
        ;
run;

%popfmt(demog,trtcd,uniqueid=patno invid)

%unistats(dsin=demog,unicatrep=yes,total=yes,
          trtlabel="Number of Patients (%) /" "Descriptive Statistics" " ",
          lowcasevarlist=sexcd racecd,
          stats=N Min Mean Max STD.,minfmt=3.,maxfmt=3.,
          varlist=sexcd racecd agecat age weight/m,
          allcatvars=racecd agecat);

26
Demo 8 output

__________________________________________________________________________________________________
Number of Patients (%) /
Descriptive Statistics            Ambicor       Betamaxin
                                 (1g/day)      (500mg/day)       Total
                                   (N=9)          (N=8)         (N=17)
__________________________________________________________________________________________________
Gender
   Male                          7 ( 77.8)      1 ( 12.5)      8 ( 47.1)
   Female                        2 ( 22.2)      7 ( 87.5)      9 ( 52.9)

Race
   Caucasian                     5 ( 55.6)      1 ( 12.5)      6 ( 35.3)
   Black                         2 ( 22.2)      3 ( 37.5)      5 ( 29.4)
   Asian                         2 ( 22.2)      3 ( 37.5)      5 ( 29.4)
   Hispanic                      0 (  0.0)      0 (  0.0)      0 (  0.0)
   Other                         0 (  0.0)      1 ( 12.5)      1 (  5.9)

Age (yrs)
   <16 yrs                       0 (  0.0)      0 (  0.0)      0 (  0.0)
    yrs                          4 ( 44.4)      4 ( 50.0)      8 ( 47.1)
    yrs                          5 ( 55.6)      4 ( 50.0)      9 ( 52.9)
    yrs                          0 (  0.0)      0 (  0.0)      0 (  0.0)
   >65 years                     0 (  0.0)      0 (  0.0)      0 (  0.0)

   N
   Min
   Mean
   Max
   STD

Weight (kg)
   N
   Min
   Mean
   Max
   STD

27
Demo 8 output comments

You can see that the age categories have been added, that the age descriptive statistics following them have no label and that there is only one blank line between the two. The %unistats macro calls another macro named %unipvals if you request that p-values be created. For this demonstration we will ask for p-values for all the variables. Let us see what happens to the output when we pass all the variables to the pvarlist= parameter.

28
Demo 9 code

proc format;
  value trtnarr ;
  value agecat
    0-15="<16 yrs"
    16-25=" yrs"
    26-40=" yrs"
    41-65=" yrs"
    66-high=">65 years"
    ;
run;

proc sort data=sasuser.demog(where=(fascd=1))
          out=demog(drop=fascd);
  by patno invid;
run;

data demog;
  set demog;
  agecat=age;
  format trtcd trtnarr. agecat agecat.;
  label sexcd="Gender"
        racecd="Race"
        agecat="Age (yrs)"
        age="A0"x
        weight="Weight (kg)"
        ;
run;

%popfmt(demog,trtcd,uniqueid=patno invid)

%unistats(dsin=demog,unicatrep=yes,total=yes,
          trtlabel="Number of Patients (%) /" "Descriptive Statistics" " ",
          lowcasevarlist=sexcd racecd,
          stats=N Min Mean Max STD.,minfmt=3.,maxfmt=3.,
          varlist=sexcd racecd agecat age weight/m,
          pvarlist=sexcd racecd agecat age weight,
          allcatvars=racecd agecat);

29
Demo 9 output

__________________________________________________________________________________________________
Number of Patients (%) /
Descriptive Statistics            Ambicor       Betamaxin
                                 (1g/day)      (500mg/day)       Total
                                   (N=9)          (N=8)         (N=17)      p-value
__________________________________________________________________________________________________
Gender
   Male                          7 ( 77.8)      1 ( 12.5)      8 ( 47.1)    0.015^
   Female                        2 ( 22.2)      7 ( 87.5)      9 ( 52.9)

Race
   Caucasian                     5 ( 55.6)      1 ( 12.5)      6 ( 35.3)    0.239^
   Black                         2 ( 22.2)      3 ( 37.5)      5 ( 29.4)
   Asian                         2 ( 22.2)      3 ( 37.5)      5 ( 29.4)
   Hispanic                      0 (  0.0)      0 (  0.0)      0 (  0.0)
   Other                         0 (  0.0)      1 ( 12.5)      1 (  5.9)

Age (yrs)                                                                   0.999^
    yrs                          4 ( 44.4)      4 ( 50.0)      8 ( 47.1)
    yrs                          5 ( 55.6)      4 ( 50.0)      9 ( 52.9)
    yrs                          0 (  0.0)      0 (  0.0)      0 (  0.0)
   >65 years                     0 (  0.0)      0 (  0.0)      0 (  0.0)

   N
   Min
   Mean                                                                          #
   Max
   STD

Weight (kg)
   N
   Min
   Mean                                                                          #
   Max
   STD

30
Demo 9 output comments

You can see that a p-value column has been added and that the p-values have symbols after them. The symbols tell you which test was used to calculate the p-value. The “^” after the p-values for the categorical variables indicates that Fisher's Exact test has been used. The macro made this decision based on the low expected cell counts, using Cochran's recommendation. For higher expected counts it will use the Chi-squared test. For continuous numeric values it uses the Student's t test and displays the p-value in the “Mean” row (or in the first row if there is no “Mean” row).
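The choice between the two categorical tests can be explored in base SAS with “proc freq”. This is only a sketch of the underlying statistics, not the logic inside %unipvals; the dataset and variable names are hypothetical and the cell counts are copied from the Gender rows of the Demo 9 output. The EXPECTED option prints the expected cell counts that a rule such as Cochran's would look at.

```sas
/* Gender cell counts from the Demo 9 output; names are hypothetical */
data gender_counts;
  length trt $ 12 sex $ 6;
  input trt $ sex $ count;
  datalines;
Ambicor Male 7
Ambicor Female 2
Betamaxin Male 1
Betamaxin Female 7
;
run;

proc freq data=gender_counts;
  weight count;
  /* CHISQ: chi-square tests; FISHER: Fisher's Exact test;
     EXPECTED: expected count in each cell */
  tables sex*trt / chisq fisher expected norow nopercent;
run;
```

All four expected counts are below 5 (between about 3.8 and 4.8), which is the situation in which Fisher's Exact test is usually preferred. The two-sided Fisher p-value works out to about 0.015, matching the Gender p-value in the Demo 9 output, while the chi-square p-value is about 0.007.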

31
Demo 9 output comments (cont.)

The macro gives you a lot of control over how p-values are calculated and displayed. You can even put them in the footnotes if you prefer, but that will not be demonstrated in this slide show. What I will do next is make the Chi-squared test apply to the Gender variable and Fisher's Exact test apply to the other categorical variables. I will also add footnotes to explain what the p-value symbols mean.

32
Demo 10 code

proc format;
  value trtnarr ;
  value agecat
    0-15="<16 yrs"
    16-25=" yrs"
    26-40=" yrs"
    41-65=" yrs"
    66-high=">65 years"
    ;
run;

proc sort data=sasuser.demog(where=(fascd=1))
          out=demog(drop=fascd);
  by patno invid;
run;

data demog;
  set demog;
  agecat=age;
  format trtcd trtnarr. agecat agecat.;
  label sexcd="Gender"
        racecd="Race"
        agecat="Age (yrs)"
        age="A0"x
        weight="Weight (kg)"
        ;
run;

%popfmt(demog,trtcd,uniqueid=patno invid)

%lafootnote(1,"%sysfunc(repeat(_,200))")
%lafootnote(2,"~ Chi-square test")
%lafootnote(3,"^ Fisher's Exact test")
%lafootnote(4,"# Student's t test")

%unistats(dsin=demog,unicatrep=yes,total=yes,
          trtlabel="Number of Patients (%) /" "Descriptive Statistics" " ",
          lowcasevarlist=sexcd racecd,
          stats=N Min Mean Max STD.,minfmt=3.,maxfmt=3.,
          varlist=sexcd racecd agecat age weight/m,
          pvarlist=sexcd racecd agecat age weight,
          chisqvarlist=sexcd,
          fishervarlist=racecd agecat,
          allcatvars=racecd agecat);

33
Demo 10 output

__________________________________________________________________________________________________
Number of Patients (%) /
Descriptive Statistics            Ambicor       Betamaxin
                                 (1g/day)      (500mg/day)       Total
                                   (N=9)          (N=8)         (N=17)      p-value
__________________________________________________________________________________________________
Gender
   Male                          7 ( 77.8)      1 ( 12.5)      8 ( 47.1)    0.007~
   Female                        2 ( 22.2)      7 ( 87.5)      9 ( 52.9)

Race
   Caucasian                     5 ( 55.6)      1 ( 12.5)      6 ( 35.3)    0.239^
   Black                         2 ( 22.2)      3 ( 37.5)      5 ( 29.4)
   Asian                         2 ( 22.2)      3 ( 37.5)      5 ( 29.4)
   Hispanic                      0 (  0.0)      0 (  0.0)      0 (  0.0)
   Other                         0 (  0.0)      1 ( 12.5)      1 (  5.9)

Age (yrs)                                                                   0.999^
    yrs                          4 ( 44.4)      4 ( 50.0)      8 ( 47.1)
    yrs                          5 ( 55.6)      4 ( 50.0)      9 ( 52.9)
    yrs                          0 (  0.0)      0 (  0.0)      0 (  0.0)
   >65 years                     0 (  0.0)      0 (  0.0)      0 (  0.0)

   N
   Min
   Mean                                                                          #
   Max
   STD

Weight (kg)
   N
   Min
   Mean                                                                          #
   Max
   STD
__________________________________________________________________________________________________
~ Chi-square test
^ Fisher's Exact test
# Student's t test

34
Demo 10 comments

You can see that the Chi-squared test was used for Gender and Fisher's Exact test was used for the other two categorical variables. The footnotes explain the p-value symbols and were left-aligned using the %lafootnote macro. By now you will have noticed another, more mundane problem: the code and output are becoming hard to read because the writing is becoming too small. It is time to bring this slide show to an end.

35
Ending comments for %unistats

What you have seen in this slide show is just a tiny demonstration of what the %unistats macro can do for you. It is an extremely flexible macro. You can manipulate its output dataset if you want to and display it in any form you choose. You can also calculate extra p-values separately, add them to the output dataset and then display the results. In this way the macro could help you in your efficacy reporting. There is a much more detailed demonstration of %unistats on the Spectre (Clinical) web site. Just Google “unistats spectre” to find it.
