Statistical Computing Spring 2014

Using Proc IML

What is IML?
SAS vs R
- SAS: procedures (PROCs) and datasets
- R: functions/operations and matrices/vectors
Proc IML
- IML = Interactive Matrix Language
- R-like programming inside of SAS
- Pros: more flexible
- Cons: programs are not validated
Applications
- Simulate data
- Matrix algebra (e.g. contrasts, algorithms)
- Many things you could normally only do in R
- Graphics

The Matrix
- A matrix is a collection of numbers ordered by rows and columns.
- Matrices are characterized by their number of rows and columns.
- The elements of a matrix are referred to first by their row, then by their column.

Special Matrices
- A 1 x 1 matrix is also known as a scalar.
- r x 1 or 1 x c matrices are known as vectors.
- A diagonal matrix is a square matrix where the off-diagonal elements are zero.
- An identity matrix is a diagonal matrix where the diagonal elements are 1. These are also denoted by I_c, where c is the dimension of the matrix.

Creating Matrices in IML
    PROC IML;
    A = 1;                 /* CREATE A SCALAR */
    B = {1 2 3};           /* CREATE A ROW VECTOR OF LENGTH 3 */
    C = {4, 5, 6};         /* CREATE A COLUMN VECTOR OF LENGTH 3 */
    D = {1 2, 3 4, 5 .};   /* CREATE A 3 BY 2 MATRIX WHERE THE 3,2 ELEMENT IS MISSING */
    PRINT A B C D;         /* DISPLAY THE MATRICES IN THE OUTPUT */
    QUIT;
Note: you can assign characters instead of numbers, but matrix algebra won't work on them.

Manipulating Matrices
Using brackets inside the specification allows you to request repeats
- A={[2] 'Yes', [2] 'No'} is equivalent to A={'Yes' 'Yes', 'No' 'No'}
- SAS: {[# Repeats] Value}, R: rep(value, number of times)
Select a single element
- A={1 2, 3 4}
- To select the number 3: A2=A[2,1]
Select a row or column
- To select the first row: A3=A[1, ]
- To select the first column: A4=A[ ,1]
Select a submatrix
- B={1 2 0 0, 3 4 0 0}
- To select the A matrix from within B: A_new=B[1:2,1:2] or B[,{1 2}]

Manipulating Matrices (cont.)
Row and column labels
- To define row and column labels, first create a vector with the labels, then use PRINT B[rowname=label_vector].
- COLNAME, FORMAT, and LABEL can be used in the same way.
- To assign them permanently, use MATTRIB matrix ROWNAME= COLNAME=. The MATTRIB statement associates printing characteristics with matrices.
- This then allows you to index using the matrix attributes (e.g. A["True",]).
Selecting elements with logical arguments
- Instead of listing the specific elements, use a logical argument.
- With A={1 2 3 4}, B=A[loc(A>2)] gives B={3 4}.
Replace elements
- Option 1: reassign specific elements. A[2]=7 will yield A={1 7 3 4}.
- Option 2: reassign by a rule. Starting again from A={1 2 3 4}, A[loc(A>2)]=0 will yield A={1 2 0 0}.

Manipulating Matrices in IML
    PROC IML;
    REPEAT_O1={[2] "YES" [2] "NO"};       /* USING REPETITION FACTORS TO FILL THE MATRIX */
    REPEAT_O2={"YES" "YES" "NO" "NO"};    /* REPEATING ELEMENTS MANUALLY */
    PRINT REPEAT_O1 REPEAT_O2;
    A={1 2, 3 4};                         /* DEFINE MATRIX */
    A1=A[2,1];                            /* SELECT THE ELEMENT IN THE 2ND ROW, FIRST COLUMN: A1 SHOULD EQUAL 3 */
    A2=A[1,];                             /* SELECT THE FIRST ROW: A2 SHOULD EQUAL THE 1 X 2 ROW VECTOR {1 2} */
    A3=A[,1];                             /* SELECT THE FIRST COLUMN: A3 SHOULD EQUAL THE 2 X 1 COLUMN VECTOR {1,3} */
    B={1 2 0 0, 3 4 0 0};                 /* DEFINE A MATRIX B WITH TWO SUBMATRICES: A AND A 2 X 2 NULL MATRIX */
    A_NEW=B[1:2,1:2];                     /* RECOVER THE A MATRIX FROM B */
    A_NEW2=B[,{1 2}];                     /* RECOVER THE A MATRIX FROM B, ANOTHER WAY TO WRITE IT */
    C_ROWNM={M F};                        /* SET ROW NAMES FOR MATRIX C */
    C_COLNM={TRUE FALSE};                 /* SET COL NAMES FOR MATRIX C */
    C={10 25, 9 18};
    /* MODIFYING PRINTED OUTPUT FOR MATRIX C */
    PRINT A A1 A2 A3 B A_NEW C[ROWNAME=C_ROWNM COLNAME=C_COLNM FORMAT=6.1 LABEL="MY MATRIX"];

Manipulating Matrices in IML (continuing the same PROC IML session)
    C_NEW=C;                       /* CREATING A DUPLICATE MATRIX */
    /* PERMANENTLY CHANGING OUTPUT FORMAT */
    MATTRIB C_NEW ROWNAME=C_ROWNM COLNAME=C_COLNM FORMAT=6.1 LABEL="MY MATRIX";
    PRINT C C_NEW;                 /* COMPARING DIFFERENT APPROACHES */
    D=A[LOC(A>1)];                 /* SELECTING ONLY ELEMENTS THAT MEET RULE, NOTE THAT MATRIX STRUCTURE NOT RETAINED */
    PRINT A D;
    E_TEMP=A;                      /* CREATING A DUPLICATE MATRIX */
    E_TEMP[1,1]=25;                /* CHANGING A SINGLE ELEMENT */
    PRINT E_TEMP;
    E_TEMP[LOC(E_TEMP>5)]=.;       /* SETTING ALL ELEMENTS MEETING RULE TO MISSING */
    QUIT;

Creating Special Matrices
Identity matrix
- I(r): identity matrix of size r
Dummy matrix
- j(nrow,ncol,x): nrow = number of rows, ncol = number of columns, x = fill value
Diagonal matrix
- diag(vector)
- diag(matrix)
- Note you can also accomplish this by using a Kronecker product ( @ ) to multiply the desired matrix by an identity matrix

Creating Special Matrices
Block diagonal matrix
- Block(M1, M2, …)
Repeat
- Repeat(matrix,nrow,ncol) repeats the specified matrix for the number of rows and columns given.
Shape
- Shape(vector,nrow,ncol) repeats the given vector row-wise for the number of rows and number of columns given. Note that the number of cells to repeat must be a multiple of the vector length.
Generate a sequence
- Do(start,finish,by) creates a vector using the specified skip pattern. For example do(-1,0,0.5) would return {-1 -0.5 0}. In R you can use seq(start,finish,by).
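The constructor functions on the two slides above can be tried together in one short session. This is only a sketch: the matrix names and the small input values are made up for illustration, and the comments state the shapes I would expect from the descriptions above.

```sas
proc iml;
I3   = I(3);                     /* 3 x 3 identity matrix */
Ones = j(2, 3, 1);               /* 2 x 3 matrix filled with 1 */
Dg   = diag({1 2 3});            /* diagonal matrix with 1, 2, 3 on the diagonal */
Blk  = block({1 2, 3 4}, {5});   /* 3 x 3 block-diagonal matrix */
Rep  = repeat({1 2}, 2, 2);      /* 2 x 4 matrix: {1 2 1 2, 1 2 1 2} */
Shp  = shape({1 2 3}, 2, 3);     /* 2 x 3 matrix: {1 2 3, 1 2 3} */
Seq  = do(-1, 0, 0.5);           /* row vector {-1 -0.5 0} */
Kron = I(2) @ {1 2, 3 4};        /* Kronecker product: {1 2, 3 4} repeated down the diagonal */
print I3, Ones, Dg, Blk, Rep, Shp, Seq, Kron;
quit;
```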

Brief Introduction to Matrix Algebra

Matrix Addition and Subtraction
- To add or subtract two matrices, they both must have the same number of rows and columns.
- The addition or subtraction is element-wise.
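As a worked illustration of element-wise addition, here are the two matrices that the matrix-addition code later in these slides uses (A = {1 3, 2 5} and B = {-5 2, 7 0}):

```latex
A + B =
\begin{pmatrix} 1 & 3 \\ 2 & 5 \end{pmatrix}
+
\begin{pmatrix} -5 & 2 \\ 7 & 0 \end{pmatrix}
=
\begin{pmatrix} 1 + (-5) & 3 + 2 \\ 2 + 7 & 5 + 0 \end{pmatrix}
=
\begin{pmatrix} -4 & 5 \\ 9 & 5 \end{pmatrix}
```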

Matrix Multiplication and Division
- Scalar-by-matrix multiplication and division is an element-wise operation and is commutative.
- Multiplication of vectors and matrices
  - Not commutative (AB ≠ BA)
  - Requires that the number of columns in A equals the number of rows in B
  - The resulting matrix R has as many rows as A and as many columns as B

Multiplication and Division (cont.)
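A small worked product makes the non-commutativity concrete. The matrices are the ones the multiplication code later in these slides uses (A = {2 3, 4 5} and B = {1 6, 2 0}); each entry is a row of the left matrix times a column of the right matrix.

```latex
AB =
\begin{pmatrix} 2 & 3 \\ 4 & 5 \end{pmatrix}
\begin{pmatrix} 1 & 6 \\ 2 & 0 \end{pmatrix}
=
\begin{pmatrix} 2\cdot 1 + 3\cdot 2 & 2\cdot 6 + 3\cdot 0 \\ 4\cdot 1 + 5\cdot 2 & 4\cdot 6 + 5\cdot 0 \end{pmatrix}
=
\begin{pmatrix} 8 & 12 \\ 14 & 24 \end{pmatrix}

BA =
\begin{pmatrix} 1 & 6 \\ 2 & 0 \end{pmatrix}
\begin{pmatrix} 2 & 3 \\ 4 & 5 \end{pmatrix}
=
\begin{pmatrix} 1\cdot 2 + 6\cdot 4 & 1\cdot 3 + 6\cdot 5 \\ 2\cdot 2 + 0\cdot 4 & 2\cdot 3 + 0\cdot 5 \end{pmatrix}
=
\begin{pmatrix} 26 & 33 \\ 4 & 6 \end{pmatrix}
```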

Special Properties
- Transpose: A' = (a_ji)
- Inverse (indicated with a -1 superscript): the inverse of a number is that number which, when multiplied by the original number, gives a product of 1. For a matrix, the product of a matrix and its inverse is the identity matrix.
  - Must be a square matrix

IML Commands for Special Matrices
    Function       IML Code
    Transpose      ` (backtick after the matrix)
    Determinant    Det(matrix)
    Inverse        Inv(matrix)
    Trace          Trace(matrix)
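A minimal sketch of these four operations on one small matrix follows. The matrix and the variable names are illustrative; the values in the comments were worked out by hand for this 2 x 2 case, and the sketch assumes the trace function is spelled TRACE.

```sas
proc iml;
A   = {2 3, 4 5};
At  = A`;          /* transpose: {2 4, 3 5}; t(A) is an equivalent spelling */
dA  = det(A);      /* 2*5 - 3*4 = -2 */
Ai  = inv(A);      /* {-2.5 1.5, 2 -1} */
trA = trace(A);    /* 2 + 5 = 7 */
chk = A * Ai;      /* should be the 2 x 2 identity, up to rounding */
print At dA Ai trA chk;
quit;
```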

Matrix Algebra in IML

Matrix Operators: Arithmetic
    Operation                       IML Code
    Addition                        +
    Subtraction                     -
    Division, element-wise          /
    Multiplication, element-wise    #
    Multiplication, matrix          *
    Power, element-wise             ##
    Power, matrix                   **
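The difference between the element-wise and matrix versions of multiplication and power is easiest to see side by side. The following is a sketch with made-up 2 x 2 inputs; the commented results were computed by hand.

```sas
proc iml;
A = {1 2, 3 4};
B = {5 6, 7 8};
ElemProd = A # B;     /* element-wise: {5 12, 21 32} */
MatProd  = A * B;     /* matrix product: {19 22, 43 50} */
ElemPow  = A ## 2;    /* each element squared: {1 4, 9 16} */
MatPow   = A ** 2;    /* A*A: {7 10, 15 22} */
ElemDiv  = A / B;     /* element-wise division */
print ElemProd MatProd ElemPow MatPow ElemDiv;
quit;
```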

Matrix Algebra in IML
    PROC IML;
    *MATRIX ADDITION;
    A={1 3, 2 5};      /* DEFINE MATRIX */
    B={-5 2, 7 0};     /* DEFINE MATRIX */
    C=A+B;             /* ADD A AND B */
    PRINT C;
    *MATRIX MULTIPLICATION;
    A={2 3, 4 5};      /* DEFINE MATRIX */
    B={1 6, 2 0};      /* DEFINE MATRIX */
    AB=A*B;            /* MULTIPLY A BY B */
    BA=B*A;            /* MULTIPLY B BY A */
    PRINT A B AB BA;   /* NOTE THAT MULTIPLICATION IS NOT COMMUTATIVE, AB DOESN'T EQUAL BA */
    QUIT;

Matrix Operators: Comparison
- Element-wise comparison of matrices; the result is a matrix of 0 (False) and 1 (True)
- Comparisons
  - Less than (<), less than or equal to (<=)
  - Greater than (>), greater than or equal to (>=)
  - Equal to (=), not equal to (^=)
- Can create compound arguments using logical operators
  - And (&)
  - Or (|)
  - Not (^)
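A short sketch of how these comparisons behave on a matrix, and how they combine with LOC to pick out positions. The matrix and names are illustrative only.

```sas
proc iml;
A = {1 2, 3 4};
Gt2    = (A > 2);              /* {0 0, 1 1} */
Middle = (A > 1) & (A < 4);    /* {0 1, 1 0} */
NotGt2 = ^(A > 2);             /* {1 1, 0 0} */
idx    = loc(A > 2);           /* positions 3 and 4, counting along rows */
print Gt2 Middle NotGt2 idx;
quit;
```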

Solving Systems of Equations
Solve the following system of equations:
    3x + 2y - 4z  = 11
    5x - 4y       = 9
         3y + 10z = 42
When the problem is rewritten in terms of matrices, AX=B implies that X=A^-1 B.
    PROC IML;
    M={3 2 -4, 5 -4 0, 0 3 10};   /* COEFFICIENT MATRIX */
    B={11, 9, 42};                /* RIGHT-HAND SIDE */
    A=SOLVE(M,B);
    PRINT A;
    QUIT;

Solving Systems of Equations (cont.)
To solve, we can rearrange:
    PROC IML;
    A={3 2 -4, 5 -4 0, 0 3 10};
    B={11, 9, 42};
    OPT1=SOLVE(A,B);
    OPT2=INV(A)*B;
    PRINT OPT1 OPT2;
    QUIT;
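The system encoded by A = {3 2 -4, 5 -4 0, 0 3 10} and B = {11, 9, 42} can also be solved by hand as a check on what SOLVE and INV(A)*B should print. The unknown names x, y, z are an assumption; the slides do not name them.

```latex
\begin{aligned}
5x - 4y = 9 &\;\Rightarrow\; x = \tfrac{9 + 4y}{5}, \qquad
3y + 10z = 42 \;\Rightarrow\; z = \tfrac{42 - 3y}{10} \\
3x + 2y - 4z = 11 &\;\Rightarrow\; 6(9 + 4y) + 20y - 4(42 - 3y) = 110
\;\Rightarrow\; 56y = 224 \;\Rightarrow\; y = 4 \\
x = \tfrac{9 + 16}{5} &= 5, \qquad z = \tfrac{42 - 12}{10} = 3 \\
\text{Check: } 3(5) + 2(4) - 4(3) &= 11, \quad 5(5) - 4(4) = 9, \quad 3(4) + 10(3) = 42
\end{aligned}
```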

Working with SAS Datasets

Opening a SAS Dataset
- Before you can access a SAS dataset, you must first submit a command to open it.
- To simply read from an existing dataset, submit a USE statement.
  - USE <SAS Dataset> VAR <Variable Names> WHERE expression;
- To read and write to an existing dataset, use the EDIT statement.
  - In addition to READ, you can also EDIT, DELETE, and PURGE observations from a dataset that has been opened using EDIT.
- Each dataset must only be opened once.

Reading in Datasets
Create matrices from a SAS dataset
- Create a vector for each variable
- Create a matrix containing multiple variables
- Select all observations or a subset
To transfer data from a SAS dataset to a matrix
- SETIN: specifies an open dataset as the current input dataset
- READ: transfers the dataset into a matrix
  - READ <range> VAR operand <WHERE (expression)> INTO name;
  - READ ALL VAR {VAR1} WHERE (VAR1>80) INTO MYMAT;
  - Range: specifies a range of observations
  - Operand: selects a set of variables
  - Expression: an expression that is evaluated as being true or false
  - Name: names a target matrix for the data
- Use the WHERE clause to conditionally select observations from within the specified range

Comparison Operators
    Operation                                           IML Code
    Less than                                           <
    Less than or equal to                               <=
    Equal to                                            =
    Greater than                                        >
    Greater than or equal to                            >=
    Not equal to                                        ^=
    Contains a given string                             ?
    Does not contain a given string                     ^?
    Begins with a given string                          =:
    Sounds like or is spelled like a given string       =*

Sorting SAS Datasets
- First close the dataset
- SORT dataset OUT=new_dataset BY var_name;
- Can use the keyword DESCENDING to denote the alternative sort order
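As a sketch of the SORT statement in use, the following sorts a copy of a dataset from tallest to shortest and reads it back. It assumes the Sashelp.Class sample dataset is available and not currently open; the output dataset name ClassSorted is made up for illustration.

```sas
proc iml;
/* assumption: Sashelp.Class exists and is closed in this session */
sort Sashelp.Class out=Work.ClassSorted by descending Height;

use Work.ClassSorted;
read all var {Name Height};    /* creates vectors Name and Height */
close Work.ClassSorted;
print Name Height;
quit;
```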

Creating Datasets from Matrices
When you create a dataset
- Columns become variables
- Rows become observations
Statements
- CREATE: opens a new SAS dataset for I/O
- APPEND: writes to the dataset
Syntax
    CREATE SAS-data-set FROM matrix <[COLNAME=column-name ROWNAME=row-name]>;
    CREATE SAS-data-set VAR variable-names;
    APPEND FROM matrix-name;
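Both forms of CREATE can be sketched with a tiny example. The dataset names (FromMat, FromVars) and the variable names are hypothetical; the first form writes a whole matrix, the second writes one variable per vector.

```sas
proc iml;
/* form 1: write a matrix, naming its columns */
M  = {1 10, 2 20, 3 30};
cn = {"id" "score"};
create Work.FromMat from M[colname=cn];
append from M;
close Work.FromMat;

/* form 2: write vectors as variables */
id    = {1, 2, 3};
score = {10, 20, 30};
create Work.FromVars var {id score};
append;
close Work.FromVars;
quit;
```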

Data Management Commands
    Command          Description
    APPEND           Adds observations to the end of a SAS dataset
    CLOSE            Closes a SAS dataset
    CREATE           Creates and opens a new SAS dataset for input and output
    DELETE           Marks observations for deletion in a SAS dataset
    EDIT             Opens an existing SAS dataset for I/O
    FIND             Finds observations
    READ             Reads observations into IML variables
    REPLACE          Writes observations back into a SAS dataset
    RESET DEFLIB     Names default libname
    SETIN            Selects an open SAS dataset for input
    SETOUT           Selects an open SAS dataset for output
    SHOW CONTENTS    Shows contents of the current input SAS dataset
    SHOW DATASETS    Shows SAS datasets currently open
    SORT             Sorts a SAS dataset
    SUMMARY          Produces summary statistics for numeric variables
    USE              Opens an existing SAS dataset for input
Related functions and subroutines
- DATASETS function: obtains members in a data library. This function returns a character matrix that contains the names of the SAS datasets in a library.
- CONTENTS function: obtains variables in a member. This function returns a character matrix that contains the variable names for the SAS dataset specified by libname and memname. The variable list is returned in alphabetical order.
- DELETE subroutine: deletes a SAS dataset member in a specified library.

Reading in SAS data with IML
    *CREATING A SAS DATASET TO WORK WITH;
    DATA MYDATA;
       SET SASHELP.CARS;
    RUN;
    PROC IML;
    USE MYDATA VAR {MSRP MPG_CITY MPG_HIGHWAY};            /* OPEN DATASET */
    READ ALL VAR _ALL_ WHERE (MSRP<12000) INTO CAR_MAT;    /* READ DATASET */
    Z=NROW(CAR_MAT);                                       /* FIGURE OUT HOW MANY ROWS */
    PRINT Z CAR_MAT[COLNAME={MSRP CITY HWY}];              /* LOOK AT DATA */
    QUIT;

Analyzing Data & Writing Programs

Subscript Operations
- Commands that can be applied to obtain summary statistics on matrices
- Select a single element, row, column, or submatrix
- Similar to the APPLY function in R
SUMMARY
- Produces summary statistics on the numeric variables of a SAS dataset. If you want them by subgroup, use the CLASS option.
- SUMMARY VAR {VARIABLE LIST} <CLASS (By Variables)> STAT (Desired stats) <OPT (SAVE)>
Reduction operators
    Operation           Operator
    Addition            +
    Multiplication      #
    Mean                :
    Sum of squares      ##
    Maximum             <>
    Minimum             ><
    Index of maximum    <:>
    Index of minimum    >:<
Additional operators
- Concatenation: horizontal ||, vertical //
- Number of rows: nrow(matrix); number of columns: ncol(matrix)
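Reduction operators go inside the subscript brackets, in the position of the dimension being collapsed. A minimal sketch with made-up inputs follows; the commented values were worked out by hand.

```sas
proc iml;
A = {1 2, 3 4};
v = {3 9 4};

ColSum  = A[+, ];     /* collapse rows: column sums {4 6} */
RowMean = A[, :];     /* collapse columns: row means {1.5, 3.5} */
Total   = v[+];       /* 3 + 9 + 4 = 16 */
SumSq   = v[##];      /* 9 + 81 + 16 = 106 */
MaxVal  = v[<>];      /* 9 */
MaxIdx  = v[<:>];     /* 2, the position of the maximum */

H  = A || A;          /* horizontal concatenation: 2 x 4 */
V  = A // A;          /* vertical concatenation: 4 x 2 */
nr = nrow(V);         /* 4 */
nc = ncol(H);         /* 4 */
print ColSum RowMean Total SumSq MaxVal MaxIdx nr nc;
quit;
```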

Types of Statements
Control statements
- Direct the flow of execution
- E.g. IF-THEN/ELSE statement
Functions and CALL statements
- Perform special tasks or user-defined operations
Command statements
- Perform special processing such as setting options, displaying windows, and handling input and output

Control Statements
    Statement              Description
    PROC IML; QUIT;        Initiates and ends an IML session
    DO; END;               Specifies a group of statements
    Iterative DO; END;     Defines an iteration loop
    IF-THEN; ELSE;         Conditionally routes execution
    START; FINISH;         Defines a module
    RUN;                   Executes a module

IF-THEN/ELSE statements
    IF expression THEN statement-one;
    ELSE statement-two;
- IML evaluates the expression and uses the result to decide whether statement-one or statement-two is executed.
- You may also nest IF-THEN/ELSE statements.
    PROC IML;
    A={12 22 33};
    IF MAX(A)<20 THEN P=1;
    ELSE P=0;
    PRINT P;
    QUIT;

DO groups
- Several statements can be grouped together into a compound statement to be executed as a unit.
    DO;
       Statements;
    END;
- You can combine DO groups with IF/ELSE.
    IF (X<Y) THEN DO;
       Z=X+Y;
    END;
    ELSE DO;
       Z=X-Y;
    END;
- The iterative DO <WHILE/UNTIL expression> repeats a set of statements a number of times defined by the index.
- If DO WHILE is used, the expression is evaluated at the beginning of each loop, with iterations continuing until the expression is false. If the expression begins false, the loop does not run.
- If DO UNTIL is used, the expression is evaluated at the end of the loop; this means that the loop will always execute at least once.
    PROC IML;
    Y=0;
    DO I=1 TO 3;
       Y=Y+1;
       PRINT Y;
    END;

    COUNT=1;
    DO WHILE(COUNT<3);
       COUNT=COUNT+1;
       PRINT "WHILE";
    END;
    DO UNTIL(COUNT>3);
       COUNT=COUNT+1;      /* INCREMENT SO THE LOOP TERMINATES */
       PRINT "UNTIL";
    END;
    QUIT;

Interacting with Procs
Option One
- Write the data to a SAS dataset by using the CREATE and APPEND statements
- Use the SUBMIT statement to call a SAS procedure that analyzes the data
- Read the results of the analysis into IML matrices using USE and READ statements
Option Two
- Do what can only be done in IML
- Write the data back out to a SAS dataset
- Call PROCs normally
ODS TRACE ON; / ODS TRACE OFF;
- Placed before and after a proc, these print the names of the various output objects to the log. Useful for requesting/saving specific parts of the analysis.
To use PROCs
    SUBMIT;
       Statements;
    ENDSUBMIT;
- As with macros, you can list variables that already exist in IML and that you would like to use in the proc. Then, inside the SUBMIT block, refer to these variables using &Varname.
- Substitutions take place before the block is processed, so no macro variable is created.
- If you use SUBMIT *, you indicate a wildcard so that any of the existing variables can be referenced.
- Any variable inside the SUBMIT block that is referenced (&var) but not created in the IML procedure does not get substituted. This is used for creating true macros.

Interacting with Procs
    PROC IML;
    Q={2 5 7 9};
    CREATE MYDATA VAR{Q};
    APPEND;
    CLOSE MYDATA;
    *Table="Moments";
    SUBMIT;
       *SUBMIT table;
       PROC UNIVARIATE DATA=MYDATA;
          VAR Q;
          ODS OUTPUT MOMENTS=MOMENTS;
          * ODS OUTPUT MOMENTS=&Table;
       RUN;
    ENDSUBMIT;
    USE MOMENTS;
    READ ALL VAR{NVALUE1 LABEL1};
    CLOSE MOMENTS;
    LABL="MY OUTPUT";
    PRINT NVALUE1[ROWNAME=LABEL1 LABEL=LABL];
    QUIT;

Modules
Modules are used for two purposes
- To create a user-defined subroutine or function.
- To define variables that are local to the module.
    START MODULE-NAME OPTIONS;
       STATEMENTS;
    FINISH MODULE-NAME;
To execute the module use
- RUN MODULE-NAME; looks for a user-defined module first, then a built-in subroutine
- CALL MODULE-NAME; looks for a built-in subroutine first, then a user-defined module
A function is a special type of module that only returns a specific value.
    START MODULE;
       STATEMENTS;
       RETURN(VARIABLE);
    FINISH MODULE;
- Any variables created inside the module but not mentioned in the RETURN statement will not be retained for future use.
It is possible to store and load modules (like a macro library, or source() in R)
    STORE MODULE=MODULE-NAME;
    LOAD MODULE=MODULE-NAME;
- These will retain a program after IML has exited.
Example
    start MyMod(a,b,c);
       a=sqrt(c);
       b=log(c);
    finish;
    run MyMod(S,L,{1 2 4 9});
Execution of the module statements creates matrices S and L, which contain the square roots and natural logs, respectively, of the elements of the third argument.
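The contrast between a function module (value returned with RETURN) and a subroutine module (results returned through its arguments) can be sketched as follows. The Standardize module and its variable names are hypothetical, written only to illustrate local variables; MyMod is the example from the slide.

```sas
proc iml;
/* function module: returns one value; m, d, s are local to the module */
start Standardize(x);
   m = x[:];                           /* mean of all elements */
   d = x - m;                          /* deviations from the mean */
   s = sqrt( d[##] / (ncol(d) - 1) );  /* sample standard deviation of a row vector */
   return( d / s );
finish;

/* subroutine module: fills in its first two arguments */
start MyMod(a, b, c);
   a = sqrt(c);
   b = log(c);
finish;

z = Standardize({2 4 6});      /* mean 4, sd 2, so z = {-1 0 1} */
run MyMod(S, L, {1 2 4 9});    /* S = square roots, L = natural logs */
print z, S, L;
quit;
```

After the call, m, d, and s do not exist in the main session, which is the sense in which variables not passed back by the module are not retained.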

Creating a Permanent Module Library
- Permanent libraries maintain functions for multiple users.
- Equivalent to datasets stored in a permanent library vs. the work folder.
    LIBNAME SOURCEFILE 'PATH';           /* LIBREF MUST MATCH THE ONE USED IN RESET STORAGE */
    PROC IML;
    START FUNC1(X);
       RETURN(X+1);
    FINISH;
    START FUNC2(X);
       RETURN(X**2);
    FINISH;
    RESET STORAGE=SOURCEFILE.LIBRARY;    /* LIBREF.CATALOG WHERE MODULES ARE STORED */
    STORE MODULE=_ALL_;
    QUIT;

Command Statements
    Statement    Description
    FREE         Frees memory associated with a matrix
    LOAD         Loads a matrix or module from a storage library
    MATTRIB      Associates printing attributes with matrices
    PRINT        Prints a matrix or message
    RESET        Sets various system options
    REMOVE       Removes a matrix or module from library storage
    SHOW         Displays system information
    STORE        Stores a matrix or module in the storage library
Example
    STORE A B;       /* STORE THE MATRICES */
    SHOW STORAGE;    /* MAKE SURE THE MATRICES ARE SAVED */
    FREE A B;        /* FREE THE RAM */
    LOAD A B;

Using R

Calling R from within IML
Check whether your SAS session has permission to call R
- PROC OPTIONS OPTION=RLANG;
- If not, you will have to add the -RLANG option at startup
Similar to calling procs
    SUBMIT / R;
       R statements
    ENDSUBMIT;
Export
- ExportDataSetToR: SAS dataset -> R data frame
- ExportMatrixToR: IML matrix -> R matrix
Import
- ImportDataSetFromR: R expression -> SAS dataset
- ImportMatrixFromR: R expression -> SAS matrix
R objects tend to be complex, so you can only transfer something that has been coerced to a data frame.

SAS to R and back again
    proc iml;
    /* Comparison of matrix operations in IML and R */
    print "---------- SAS/IML Results -----------------";
    x = 1:3;                                   /* vector of sequence 1,2,3 */
    m = {1 2 3, 4 5 6, 7 8 9};                 /* 3 x 3 matrix */
    q = m * t(x);                              /* matrix multiplication */
    print q;
    print "------------- R Results --------------------";
    submit / R;
      rx <- matrix( 1:3, nrow=1)               # vector of sequence 1,2,3
      rm <- matrix( 1:9, nrow=3, byrow=TRUE)   # 3 x 3 matrix
      rq <- rm %*% t(rx)                       # matrix multiplication
      print(rq)
    endsubmit;

    hist(p, freq=FALSE)    # histogram
    lines(est)             # kde overlay

    proc iml;
    use Sashelp.Class;
    read all var {Weight Height};
    close Sashelp.Class;
    /* send matrices to R */
    call ExportMatrixToR(Weight, "w");
    call ExportMatrixToR(Height, "h");
    submit / R;
      Model    <- lm(w ~ h, na.action="na.exclude")    # a
      ParamEst <- coef(Model)                          # b
      Pred     <- fitted(Model)
      Resid    <- residuals(Model)
    endsubmit;
    call ImportMatrixFromR(pe, "ParamEst");
    print pe[r={"Intercept" "Height"}];
    ht = T( do(55, 70, 5) );
    A = j(nrow(ht),1,1) || ht;
    pred_wt = A * pe;
    print ht pred_wt;

    YVar = "Weight";
    XVar = "Height";
    submit XVar YVar / R;
      Model <- lm(&YVar ~ &XVar, data=Class, na.action="na.exclude")
      print (Model$call)
    endsubmit;

Special Data Issues Sample code for several specialized utilities See also Rick Wicklin’s “DO LOOP” Blog

Simulating Data in IML
    proc iml;
    do i=1 to 3;
       call randseed(9087235);
       mean=i*2;
       var=0.5*i;
       y = j(400,1);
       call randgen(y, 'normal');
       z = j(100,1);
       call randgen(z, 'normal', mean, var);
       x = y // z;
       create a var {"x"};
       append;
       close a;
       submit;
          proc univariate data=a noprint;
             var x;
             histogram / kernel;
          run;
       endsubmit;
    end;
    quit;

How to Reshape a Matrix to Long Format
    proc iml;
    start LongForm(X);                          /* convert numerical matrix to long format */
       R = repeat(T(1:nrow(X)), 1, ncol(X));    /* row index */
       C = repeat(1:ncol(X), nrow(X));          /* col index */
       return( colvec(X) || colvec(R) || colvec(C) );
    finish;
    R = shape(1:25, 5, 5);                      /* initial 5 x 5 matrix */
    S = LongForm(R);                            /* 25 x 3 matrix */
    create Long from S[c={"value" "row" "col"}];
    append from S;
    quit;
    proc print data=Long nobs;
    run;

Simulating Multivariate samples
Suppose that you want to simulate k samples (each with N observations) from a multivariate normal distribution with a given mean vector and covariance matrix. Because all of the samples are drawn from the same distribution, one way to generate k samples is to generate a single "mega sample" with kN observations, and then use an index variable to indicate that the first N observations belong to the first sample, the next N observations belong to the second sample, and so forth. This article implements that technique.
I have previously shown how to use the RANDNORMAL function in SAS/IML to simulate multivariate normal data. Now suppose that you want to generate 10 samples, where each sample contains five observations from a trivariate normal distribution. You can generate 5 x 10 = 50 observations as follows:
    proc iml;
    N = 5;                 /* size of each sample */
    NumSamples = 10;       /* number of samples */
    /* specify population mean and covariance */
    Mean = {1, 2, 3};
    Cov = {3 2 1, 2 4 0, 1 0 5};
    call randseed(4321);
    X = RandNormal(N*NumSamples, Mean, Cov);
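The slide stops before the index variable is built. One possible way to finish the "mega sample" idea is sketched below: it creates an ID column whose first N entries are 1, the next N are 2, and so on, then writes everything to a dataset. The names SampleID, Z, and MVN and the column labels are assumptions for illustration, and the mean is written here as a row vector.

```sas
proc iml;
N = 5;                   /* size of each sample */
NumSamples = 10;         /* number of samples */
Mean = {1 2 3};          /* row vector of means (assumption: row form) */
Cov  = {3 2 1, 2 4 0, 1 0 5};
call randseed(4321);
X = RandNormal(N*NumSamples, Mean, Cov);                /* 50 x 3 matrix */

/* each row of the 10 x 5 matrix holds one sample number;
   colvec reads it row by row: 1,1,1,1,1,2,2,... */
SampleID = colvec( repeat(T(1:NumSamples), 1, N) );

Z = SampleID || X;                                      /* 50 x 4 matrix */
create MVN from Z[colname={"SampleID" "x1" "x2" "x3"}];
append from Z;
close MVN;
quit;
```

The resulting dataset can then be analyzed by sample, for example with a BY SampleID statement in a procedure.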

MISC