Ready To Become Really Productive Using PROC SQL?
Sunil Gupta, Gupta Programming


1 Ready To Become Really Productive Using PROC SQL?
Sunil Gupta
Sunil@GuptaProgramming.com
Gupta Programming

2 DATA Step vs. PROC SQL Camps
- DATA Step Die-Hards
- PROC SQL Die-Hards
- Happy Camper: use the best of both worlds
Factors: first learned in SAS, first encounter of flexibility, first frustration of complexity

3 Using PROC SQL, can you identify:
- at least four ways to select and create columns?
- at least five ways to create macro variables?
- at least four ways to subset tables?
- at least three ways to subquery tables?
Everyone should have a copy of the one-page summary sheet to use as a reference when coding in PROC SQL.

4 PROC SQL Summary Sheet

5 Using PROC SQL to be Productive
Topics Covered
- Table Access, Structure and Retrieval
- Creating Columns and Macro Variables
- Useful References, Webcasts and Podcasts
Topics Not Covered
- Table Content for Data Entry
- Joining Tables to Combine Data

6 Four PROC SQL components: Columns, Joins, Conditions, Sorts

    proc sql;
      select name, sex          /* Columns    */
        from sashelp.class      /* Joins      */
        where sex = 'F'         /* Conditions */
        order by name;          /* Sorts      */
    quit;

About 80% of PROC SQL queries can be written with 20% of the syntax.

7 13 PROC SQL Examples
- Example 1: Essential building block components
- Example 2a, 2b, 2c, 2d, 2e: Four options for selecting columns
- Example 3c: Creating column using summary functions
- Example 3f: Creating column using select-case
- Example 4a, 4b: Subsetting table using calculated column or function
- Example 5a, 5b: Using subquery conditions to create rows
- Example 6: Creating macro variables

8 SASHELP.CLASS Data Set

    Obs  Name     Sex  Age  Height  Weight
      1  Alice    F     13    56.5    84.0
      2  Barbara  F     13    65.3    98.0
      3  Carol    F     14    62.8   102.5
      4  Jane     F     12    59.8    84.5
      5  Janet    F     15    62.5   112.5
      6  Joyce    F     11    51.3    50.5
      7  Judy     F     14    64.3    90.0
      8  Louise   F     12    56.3    77.0
      9  Mary     F     15    66.5   112.0
     10  Alfred   M     14    69.0   112.5
     11  Henry    M     14    63.5   102.5
     12  James    M     12    57.3    83.0
     13  Jeffrey  M     13    62.5    84.0
     14  John     M     12    59.0    99.5
     15  Philip   M     16    72.0   150.0
     16  Robert   M     12    64.8   128.0
     17  Ronald   M     15    67.0   133.0
     18  Thomas   M     11    57.5    85.0
     19  William  M     15    66.5   112.0

9 Example 1: Four components: Columns, Joins, Conditions, Sorts

    proc sql;
      select name
        from sashelp.class
        where sex = 'F'
        order by name;
    quit;

Output:

    Name
    Alice
    Barbara
    Carol
    Jane
    Janet
    Joyce
    Judy
    Louise
    Mary

Only one SAS statement.

10 Example 2. Four options for Selecting Column Definitions
a. Basic Structure
b. Column Attributes such as length=
c. All Columns with '*'
d. Distinct Columns such as distinct
e. Distinct Columns without Order By

11 Example 2a. Select name and sex for all females

    proc sql;
      select name, sex
        from sashelp.class
        where sex = 'F'
        order by name;
    quit;

Output:

    Name     Sex
    Alice    F
    Barbara  F
    Carol    F
    Jane     F
    Janet    F
    Joyce    F
    Judy     F
    Louise   F
    Mary     F

Multiple columns; control column and record order.

12 Example 2b. Define attributes for name: label, format and length

    proc sql;
      select name label = 'My label' format = $10. length = 10
        from sashelp.class
        where sex = 'F'
        order by name;
    quit;

Output:

    My label
    Alice
    Barbara
    Carol
    Jane
    Janet
    Joyce
    Judy
    Louise
    Mary

Column attributes go after the column name.

13 Example 2c. Select all columns in table for all females

    proc sql;
      select *
        from sashelp.class
        where sex = 'F'
        order by name;
    quit;

Output:

    Name     Sex  Age  Height  Weight
    Alice    F     13    56.5      84
    Barbara  F     13    65.3      98
    Carol    F     14    62.8   102.5
    Jane     F     12    59.8    84.5
    Janet    F     15    62.5   112.5
    Joyce    F     11    51.3    50.5
    Judy     F     14    64.3      90
    Louise   F     12    56.3      77
    Mary     F     15    66.5     112

The wildcard selects all columns.

14 Example 2c (continued). Select all columns except weight for all females

    proc sql;
      select *
        from sashelp.class (drop = weight)
        where sex = 'F'
        order by name;
    quit;

Output:

    Name     Sex  Age  Height
    Alice    F     13    56.5
    Barbara  F     13    65.3
    Carol    F     14    62.8
    Jane     F     12    59.8
    Janet    F     15    62.5
    Joyce    F     11    51.3
    Judy     F     14    64.3
    Louise   F     12    56.3
    Mary     F     15    66.5

Data set options are still valid.

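15 Example 2d. Select distinct sex for all females

    proc sql;
      select distinct sex
        from sashelp.class
        where sex = 'F'
        order by name;
    quit;

Output: Sex = F, repeated nine times (one row per female).

DISTINCT selects unique combinations of the selected columns. Generally, order by the same column that follows DISTINCT (here, sex); ordering by name, which is not selected, causes sex to repeat for each record.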

16 Example 2e. Select distinct sex for all females without repeats

    proc sql;
      select distinct sex
        from sashelp.class
        where sex = 'F';
    quit;

Output:

    Sex
    F

Excluding the order by name clause results in one record, because sex is no longer repeated for each record.
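As a small illustration of DISTINCT applied to a combination of columns, the sketch below lists each sex and age pair once and orders by the same columns that follow DISTINCT. The choice of sex and age is an assumption made for illustration; it is not one of the deck's examples.

```sas
proc sql;
  /* One row per unique sex/age combination in sashelp.class */
  select distinct sex, age
    from sashelp.class
    order by sex, age;   /* ORDER BY uses only the DISTINCT columns */
quit;
```

Because the ORDER BY columns are also the selected columns, no extra column is pulled into the query and no combination repeats.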

17 Example 3. Six options for Creating Column Definitions
a. Functions such as int((age + 150)/10)
b. Functions such as max(height, weight)
c. Summary Functions such as sum(weight)
d. Constant such as 'my constant'
e. Character String Expression such as name || ',' || sex
f. Select-Case Condition
What is the difference between b and c? Which option is used for conditional processing? Remember to add length, format and label attributes.
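Options a, b, d and e are not shown as worked examples in the deck, so here is a minimal sketch that puts them in one query. The column aliases (age_grp, max_hw, const_col, name_sex), the attribute values and the use of trim() are assumptions added for illustration.

```sas
proc sql;
  select name,
         /* a. function of a single column */
         int((age + 150) / 10)    as age_grp   label = 'Age group',
         /* b. two arguments: larger of height and weight within each row */
         max(height, weight)      as max_hw    format = 5.1,
         /* d. constant repeated on every row */
         'my constant'            as const_col length = 11,
         /* e. character string expression; trim() drops trailing blanks in name */
         trim(name) || ',' || sex as name_sex  length = 10
    from sashelp.class;
quit;
```

On the question of b versus c: with two or more arguments, max() works across columns within a single row, while a summary function such as sum(weight) works down a column across rows.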

18 Example 3c1. Select weight and percent of total weight using summary functions

    proc sql;
      select weight,
             ((weight / sum(weight)) * 100) as wpercnt length = 8 format = 4.1
        from sashelp.class;
    quit;

Sample output:

    Weight  wpercnt
     112.5      5.9
        84      4.4
        98      5.1
     102.5      5.3

- Once specified, new columns can be referenced with CALCULATED.
- Re-merging technique: the summary value sum(weight) is merged back onto every detail row.

19 Example 3c2. Select sex, weight and percent of total weight by sex

    proc sql;
      select sex, weight,
             sum(weight) as sum_weight,
             ((weight / sum(weight)) * 100) as wpercnt length = 8 format = 4.1
        from sashelp.class
        group by sex;
    quit;

Sample output:

    Sex  Weight  sum_weight  wpercnt
    F        90         811     11.0
    F      84.5         811     10.4

Group by prevents overall summaries.

20 Example 3f. Select age and new column depthead based on age values

    proc sql;
      select age,
             case
               when age < 13 then 1
               when age between 13 and 15 then 2
               when age > 15 then 3
               else .
             end as depthead length = 3
        from sashelp.class;
    quit;

Sample output:

    Age  depthead
     14         2

- WHEN conditions should be mutually exclusive.
- THEN accepts any valid expression.
- Tip: include an ELSE to handle rows that match no condition.
- END is required.

21 Example 4. Four options for Subsetting Tables
a. Calculated variable
b. Function such as index()
c. Direct variable
d. Summary Function using having clause
Option d allows subsetting by a summary function in one step, instead of the multiple steps needed in the DATA step.
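Options c and d have no worked example in the deck. The sketch below shows one possible form of each; the cutoffs (age > 14, average weight > 100) and the alias avg_wgt are assumptions chosen for illustration.

```sas
proc sql;
  /* c. Direct variable: WHERE filters detail rows before any grouping */
  select name, age
    from sashelp.class
    where age > 14;

  /* d. Summary function: HAVING filters groups after GROUP BY summarizes them */
  select sex, avg(weight) as avg_wgt format = 6.2
    from sashelp.class
    group by sex
    having avg(weight) > 100;
quit;
```

With the SASHELP.CLASS values listed on slide 8, the female average is about 90.1 and the male average about 109.0, so only the male group should pass the HAVING condition.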

22 Example 4a. Select age and new column depthead based on age values for depthead = 3

    proc sql;
      select age,
             case
               when age < 13 then 1
               when age between 13 and 15 then 2
               when age > 15 then 3
               else .
             end as depthead length = 3
        from sashelp.class
        where calculated depthead = 3;
    quit;

Output:

    Age  depthead
     16         3

23 Example 4b. Select name, sex where name contains 'J'

    proc sql;
      select name, sex
        from sashelp.class
        where index(name, 'J') > 0;
    quit;

Output:

    Name     Sex
    James    M
    Jane     F
    Janet    F
    Jeffrey  M
    John     M
    Joyce    F
    Judy     F

24 Example 5. Two options for Subqueries (One Column)
a. Returns one value
b. Returns multiple values
Best practice is to apply a three-part approach:
1. Subquery Results
2. Population without condition
3. Confirm Subset
What do subqueries resemble? What is the difference between the WHERE and HAVING clauses? For multiple columns, you can create a subquery table or an intermediate table and apply multiple column conditions.

25 Example 5a. Select sex, weight where weight is greater than the average weight

    proc sql;
      select sex, weight
        from sashelp.class
        having weight > (select m_wgt from mean_wgt);
    quit;

- Expect one value from the subquery.
- Which column condition is applied?
- The result from the mean_wgt table is used in the outer query.

26 Example 5a. Select sex, weight where weight is greater than the average weight

    proc sql;
      create table mean_wgt as
        select avg(weight) as m_wgt
          from sashelp.class;

      select m_wgt from mean_wgt;
    quit;

First create the m_wgt column in the mean_wgt table.

27 Example 5a. Select sex, weight where weight is greater than the average weight

1. Subquery Result

    100.0263

2. Sample Population

    Sex  Weight
    F      50.5
    F        77
    F        84
    F      84.5
    F        90

3. Final Subset (all weights > 100)

    Sex  Weight
    M     112.5
    F     102.5
    M
    F     112.5
    F       112
    M       150
    M       128
    M       133
    M       112
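The intermediate mean_wgt table makes the three-part check easy to follow. As an alternative sketch, the same subset can be requested in one step by placing the summary query inline; using WHERE here instead of HAVING is an assumption of this sketch, not the deck's form.

```sas
proc sql;
  /* Inline subquery returns one value: the overall average weight */
  select sex, weight
    from sashelp.class
    where weight > (select avg(weight) from sashelp.class);
quit;
```

Since the subquery returns the same single average, the rows returned should match the final subset from the three-part approach.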

28 Example 5b. Select age where age does not equal any female ages

    proc sql;
      select sex, age
        from sashelp.class
        having age not in (select distinct age
                             from sashelp.class
                             where sex = 'F');
    quit;

- Expect multiple values from the subquery.
- Without a group by clause, having and where clauses provide similar results.

29 Example 5b. Select age where age does not equal any female ages

1. Subquery Result

    Age
     11
     12
     13
     14
     15

2. Population

    Sex  Age        Sex  Age
    F     11        M     11
    F     12        M     12
    F               M
    F     13        M     13
    F               M     14
    F     14        M
    F               M     15
    F     15        M
    F               M     16

3. Final Subset

    Sex  Age
    M     16

30 Example 6. Five options for Creating Macro Variables

    Option                        Macro variables            Values
    a. Into :                     1 macro variable           1 value
    b. Into : separated by        1 macro variable           multiple values
    c. Into : - :                 multiple macro variables   1 value each
    d. Summary Function into :    1 macro variable           1 value
    e. Select-case into :         1 macro variable           1 or more values

Key questions:
1. Number of macro variables
2. Number of values
3. Delimiter value (' ', ',', '/', etc.)

31 Example 6. Create macro variable storing male names

    proc sql;
      select name as male
        into :male_name separated by ', '
        from sashelp.class
        where sex = 'M';
    quit;

    %put Names of Males = &male_name;

SAS Log:

    Names of Males = Alfred, Henry, James, Jeffrey, John, Philip, Robert, Ronald, Thomas, William

1 macro variable, multiple values.
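The deck shows only the separated by form. The sketch below illustrates three of the other INTO forms from the options list (a, c and d). The macro variable names first_name, male1 to male3 and wsum, and the added order by name, are assumptions for illustration.

```sas
proc sql noprint;
  /* a. INTO : stores one value (the first row returned) in one macro variable */
  select name into :first_name
    from sashelp.class
    where sex = 'M'
    order by name;

  /* c. INTO : - : stores one value in each of several macro variables */
  select name into :male1 - :male3
    from sashelp.class
    where sex = 'M'
    order by name;

  /* d. Summary function INTO : stores one summary value in one macro variable */
  select sum(weight) into :wsum
    from sashelp.class;
quit;

%put First male   = &first_name;
%put Males 1 to 3 = &male1 &male2 &male3;
%put Total weight = &wsum;
```

The NOPRINT option suppresses the query output, which is usually what is wanted when the only goal is to fill macro variables.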

32 Summary
PROC SQL components: Columns, Joins, Conditions, Sorts
1. Generally all or no columns are specified.
2. New column names are specified at the end of the expression (expression AS name) instead of at the beginning, as in the DATA step.
3. Select-case syntax is the only method for conditional processing.
4. Syntax is easier to code and maintain because the structure is more standard.

33 Four Dimensions of PROC SQL over DATA Step

    proc sql;
      select name, weight
        from sashelp.class
        where sex = 'F'
        group by sex
        having weight > avg(weight);
    quit;

Four dimensions: Group By, Summary Functions, Conditions, Subsets.
How many steps are performed and in what order?

34 Four Dimensions of PROC SQL over DATA Step

    proc sql;
      select name, weight
        from sashelp.class
        where sex = 'F'
        group by sex
        having weight > avg(weight);
    quit;

Output:

    Name     Weight
    Janet     112.5
    Carol     102.5
    Mary        112
    Barbara      98

Four Steps in One PROC SQL
1. Select female records
2. Group by sex
3. Select weights > average group weight
4. Display name and weight
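For comparison, here is a rough sketch of how the same four steps might look without PROC SQL. It is one possible multi-step equivalent, not the author's code; the data set names females, avg_wgt and above_avg and the variable avg_weight are hypothetical.

```sas
/* Step 1: select female records */
data females;
  set sashelp.class;
  where sex = 'F';
run;

/* Step 2: group by sex and compute the average weight per group */
proc means data = females noprint nway;
  class sex;
  var weight;
  output out = avg_wgt (keep = sex avg_weight) mean = avg_weight;
run;

/* Steps 3 and 4: keep weights above the group average, then name and weight only */
data above_avg (keep = name weight);
  merge females avg_wgt;
  by sex;                      /* females holds a single sex value, so no sort is needed */
  if weight > avg_weight;
run;

proc print data = above_avg noobs;
run;
```

The PROC SQL version folds the subset, the grouping, the summary and the comparison into a single statement, which is the point the slide makes.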

35 Comparing Data Step with PROC SQL

    Data Step                             PROC SQL
    SAS Functions, Data set options       SAS Functions, COALESCE(), Data set options
    If-Then Statements                    Case-Select Clause
    Do Loop, Output                       Joins can simulate Do Loop
    Space to separate variables           Comma to separate variables
    New variable = valid expression;      Valid expression AS new variable
    IF/Where Statements                   WHERE for details/HAVING for summaries
    Multiple SAS Statements               One SAS Statement
    By default, includes all variables    By default, excludes all variables
    Many-to-many merge                    Cartesian Product is better
    By default, If A or B; Full Outer     By default, If A and B; Inner Join
    Can recycle data set names            Requires new data set names
    N/A                                   Unique PROC SQL keywords

36 Unique PROC SQL Keywords

    Keyword             Description                                      Example
    AS                  Creating new columns                             ((weight/sum(weight))*100) as wpercent
    CALCULATED          Referencing new columns after being specified    where calculated wgroup = 'high'
    DISTINCT            Displaying unique combination of variables       distinct patno
    INTO :              Creating macro variables                         sum(weight) into :wsum
    CHAR / DATE / NUM   Variable type when creating variables            client char label = 'Client' length = 25
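The type keywords in the last row apply when a table is defined column by column. A minimal sketch, assuming a hypothetical work.clients table with made-up columns and one made-up row:

```sas
proc sql;
  /* Define an empty table: each column gets a type and optional attributes */
  create table work.clients
    (client   char(25) label = 'Client',
     visit_dt num      format = date9. label = 'Visit date',
     n_visits num      label = 'Number of visits');

  /* Hypothetical row, added only so the query below returns something */
  insert into work.clients
    values ('Acme Labs', '15JAN2010'd, 3);

  select client, visit_dt, n_visits
    from work.clients
    where n_visits > 1;
quit;
```

Here the date is stored as a numeric column with a date format, which is how SAS holds dates regardless of the type keyword used.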

37 Summary to be PROC SQL Productive
- Four selecting column options
- Six creating column options
- Four subsetting table options
- Two sorting options
- Three subquery options
- Five macro variable creation options
Helpful when preparing for the SAS Advanced Certification exam.

38 New SAS e-Guide to Increase Productivity
Key Benefits
- Cut/paste task-based syntax from the PDF file
- Internal/external hyperlinks
- Comprehensive example reference under 10 pages
- Better prepare for the advanced SAS certification exam
www.sascommunity.org/wiki/Quick_Results_with_Proc_SQL

39 Top 10 PROC SQL References
- Lafler, Kirk Paul, PROC SQL Book, Papers, Tips and Techniques. Webcast: http://support.sas.com/publishing/bbu/webinar/Lafler_junewebinar.wmv
- http://www.sascommunity.org/wiki/Quick_Results_with_Proc_SQL

40 Ready To Become Really Productive Using PROC SQL?
Sunil Gupta
Sunil@GuptaProgramming.com
Gupta Programming

