Combining (with SQL) HRP223 – 2012 November 05, 2011

Copyright © Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and international treaties. Unauthorized reproduction of this presentation, or any portion of it, may result in severe civil and criminal penalties and will be prosecuted to maximum extent possible under the law.

PROC SQL - Set Operators NO GUI (“noh gooey”)
Outer Union Corresponding: concatenates the rows from both queries
Union: unique rows from both queries
Except: rows that are part of the first query but not the second
Intersect: rows common to both queries

outer union corresponding
You can concatenate data files. I rarely use it.
proc sql;
  create table isOuter as
    select dude from baseline
    outer union corresponding
    select dude from followup;
quit;

union
You can also concatenate data files and keep only the unique records:
proc sql;
  create table isUnion as
    select dude from baseline
    union
    select dude from followup;
quit;

except
Say you needed everyone who did not come back. Start out with the baseline group and remove the people who came back.
proc sql;
  select id from baseline
  except
  select id from followup;
quit;

intersect
Say you wanted to know who came back. In other words, what IDs are in both files?
proc sql;
  select id from baseline
  intersect
  select id from followup;
quit;
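The four operators can be compared side by side on two tiny tables. The sketch below is an illustration rather than part of the original slides: the contents of baseline and followup are made up, and the table names isExcept and isIntersect are hypothetical additions that follow the naming used above.

```sas
/* Hypothetical data: three people at baseline, two of whom return,
   plus one ID that appears only at follow-up. */
data baseline;
  input id;
  datalines;
1
2
3
;
run;

data followup;
  input id;
  datalines;
2
3
4
;
run;

proc sql;
  /* Concatenate: keeps all six rows, duplicates included. */
  create table isOuter as
    select id from baseline
    outer union corresponding
    select id from followup;

  /* Unique IDs found in either table: 1, 2, 3, 4. */
  create table isUnion as
    select id from baseline
    union
    select id from followup;

  /* In baseline but not in followup: 1. */
  create table isExcept as
    select id from baseline
    except
    select id from followup;

  /* In both tables: 2, 3. */
  create table isIntersect as
    select id from baseline
    intersect
    select id from followup;
quit;
```

Swapping the order of the two queries changes the result only for except, which would then return the ID that appears at follow-up alone.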

PROC SQL - Set Operators
When you have tables (with more than one column) with the same structure, you can combine them with these set operators. Be extremely careful: SAS/SQL is forgiving about the structure of the tables, so you may not notice problems in the data. By default the columns are matched by position, so for this to work as intended the two tables must have the same variables, in the same order, and the variables must be of the same type (variables with the same name must both be character or both be numeric). Use the keyword corresponding to have it match like-named variables instead.

corresponding
The columns do not need to have matching names, or even the same length, and the operator will still combine them. Use corresponding to help spot this problem.
data fName;
  firstName = "Raymond";
run;
data lName;
  lastName = "Balise";
run;
proc sql;
  create table name as
    select firstName from fName
    union
    select lastName from lName
  ;
quit;
* create table name as ... union corresponding ...;
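The effect of corresponding is easiest to see when two tables hold the same variables in a different order. The following is a minimal sketch under assumed data: the tables visit1 and visit2 and the variables id and site are hypothetical and do not come from the slides.

```sas
/* Hypothetical tables: same variables, different column order. */
data visit1;
  id = 1;
  site = "Palo Alto";
run;

data visit2;
  length site $ 9;   /* declared first, so site becomes column 1 */
  site = "Stanford";
  id = 2;
run;

proc sql;
  /* A plain union lines columns up by position, so id (numeric)
     would be paired with site (character). SAS should reject that
     query with a type-mismatch error, so it is left commented out:
     select * from visit1 union select * from visit2;            */

  /* With corresponding, columns are paired by name instead. */
  create table visits as
    select * from visit1
    union corresponding
    select * from visit2;
quit;
```

Had both columns been character, the positional version would have run without complaint and silently mixed the values, which is the situation the slide warns about.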

Working with Repeated Keys
A file tracking diagnoses or treatments will have multiple records for some people. If you want to count the number of records for a person, specify which variable(s) to group by. Count the records in the group with count(*), or count the nonmissing values with count(variableName).
data dx;
  input id dxCode;
  datalines;
1 42
1 17
2 42
3 2
3 42
3 .
3 .A
;
proc sql;
  create table counts as
    select id,
           count(*) as countRecords,
           count(dxCode) as countDx,
           count(distinct(dxCode)) as cntDistDx
      from dx
      group by id;
quit;

Repeat Counting
I want to know:
how many people I have
how many diagnoses each person has
how many distinct diagnoses each person has
You can sort the data and count or use the SQL commands on grouped data.
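Both routes can be sketched on the dx data from the previous slide, repeated here so the example runs on its own. The output names nPeople, dxSorted, and countsByStep are hypothetical, and the sort-and-count step is one common way to do it with a DATA step, not necessarily the method used in the course.

```sas
data dx;
  input id dxCode;
  datalines;
1 42
1 17
2 42
3 2
3 42
3 .
3 .A
;
run;

/* SQL route: how many people are in the file? */
proc sql;
  select count(distinct id) as nPeople
    from dx;
quit;

/* Sort-and-count route: records per person. */
proc sort data=dx out=dxSorted;
  by id;
run;

data countsByStep;
  set dxSorted;
  by id;
  if first.id then countRecords = 0;  /* reset at the start of each person */
  countRecords + 1;                   /* sum statement retains the running total */
  if last.id then output;             /* write one row per person */
  keep id countRecords;
run;
```

With this data the SQL query reports three people, and countsByStep holds 2, 1, and 4 records for IDs 1, 2, and 3.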

How many records?
Select ID to be included in the new data set.
Add an Advanced expression as a Computed Column and select the count() function.

It automatically groups the data by ID when you do the count(*) function.

Other Aggregates
To get the counts of diagnoses and/or the distinct diagnoses, drag the diagnosis (DX) variable over to the select variable list and choose the appropriate summary statistic.
