# Unit II: Matrices and models

Introduction: What is GSL?
The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. It provides a well-defined C language interface. GSL functions are thread-safe in the sense that they do not use static variables: memory is always associated with objects and not with functions.

Compiling and linking
The library header files are installed in their own `gsl` directory, so include statements should carry a `gsl/` directory prefix, for example `#include <gsl/gsl_math.h>`.
Compile: `gcc -c example.c`
Link: `gcc example.o -lgsl -lgslcblas -lm`

I. Naming conventions
Every function in GSL begins with `gsl_`. The prefix that follows names the type of object the function works on, e.g. `gsl_matrix_...` or `gsl_vector_...`, and the first argument of such a function is the object acted upon.

Apophenia
Apophenia is an open statistical library for working with data sets and statistical models. It provides functions on the same level as those of the typical stats package. Apophenia's functions begin with `apop_`. Major functions begin with the data type they act on, like `apop_data_...` or `apop_model_...`.

II. Basic matrix and vector operations
A matrix is an array of numbers with dimensions M (rows) by N (columns). In the 3 by 6 matrix pictured on the original slide, element (2,3) is 3. A vector can be considered a 1 x M matrix.

The simplest operations on matrices are:
Element-by-element addition, multiplication, and so on. The same can be done with vectors. The following slides list the functions GSL provides for these operations.

Vectors
Vectors are defined by a `gsl_vector` structure, which describes a slice of a block. Different vectors can be created that point to the same block. A vector slice is a set of equally spaced elements of an area of memory.

Vectors: operations with vectors
Topics covered: structure, vector allocation, accessing elements of vectors, vector operations.

Structure of a vector
The `gsl_vector` structure contains five components: the `size`, the `stride`, a pointer to the memory where the elements are stored (`data`), a pointer to the block owned by the vector, if any (`block`), and an ownership flag (`owner`).

The structure is very simple and looks like this:

    typedef struct {
        size_t size;
        size_t stride;
        double * data;
        gsl_block * block;
        int owner;
    } gsl_vector;

Vector allocation
The functions for allocating and accessing vectors are defined in `gsl_vector.h`. The functions for allocating memory to a vector follow the style of `malloc` and `free`.
`gsl_vector * gsl_vector_alloc (size_t n)`: creates a vector of length `n`.
`gsl_vector * gsl_vector_calloc (size_t n)`: allocates memory for a vector of length `n` and initializes all the elements of the vector to zero.

`void gsl_vector_free (gsl_vector * v)`: frees a previously allocated vector `v`.

Accessing vector elements
The functions for accessing the elements of a vector are:
`double gsl_vector_get (const gsl_vector * v, size_t i)`: returns the `i`-th element of a vector `v`. If `i` lies outside the allowed range of 0 to `n-1`, the error handler is invoked and 0 is returned.

`void gsl_vector_set (gsl_vector * v, size_t i, double x)`: sets the value of the `i`-th element of a vector `v` to `x`. If `i` lies outside the allowed range of 0 to `n-1`, the error handler is invoked.
`double * gsl_vector_ptr (gsl_vector * v, size_t i)` and `const double * gsl_vector_const_ptr (const gsl_vector * v, size_t i)`: these functions return a pointer to the `i`-th element of a vector `v`.

Vector operations
`int gsl_vector_add (gsl_vector * a, const gsl_vector * b)`: adds the elements of vector `b` to the elements of vector `a`. The result `a_i + b_i` is stored in `a` and `b` remains unchanged. The two vectors must have the same length.
`int gsl_vector_sub (gsl_vector * a, const gsl_vector * b)`: subtracts the elements of vector `b` from the elements of vector `a`. The result is stored in `a`.

`int gsl_vector_mul (gsl_vector * a, const gsl_vector * b)`: multiplies the elements of vector `a` by the elements of vector `b`, element by element. The result is stored in `a`.
`int gsl_vector_div (gsl_vector * a, const gsl_vector * b)`: divides the elements of vector `a` by the elements of vector `b`, element by element. The result is stored in `a`.
`int gsl_vector_scale (gsl_vector * a, const double x)`: multiplies the elements of vector `a` by the constant factor `x`.
`int gsl_vector_add_constant (gsl_vector * a, const double x)`: adds the constant value `x` to the elements of the vector `a`.

Example

    #include <stdio.h>
    #include <gsl/gsl_vector.h>

    int main (void)
    {
      int i;
      gsl_vector * v = gsl_vector_alloc (3);

      /* fill the vector so that v_i = 1.23 + i */
      for (i = 0; i < 3; i++)
        {
          gsl_vector_set (v, i, 1.23 + i);
        }

      /* print the elements; this produces the output below */
      for (i = 0; i < 3; i++)
        {
          printf ("v_%d = %g\n", i, gsl_vector_get (v, i));
        }

      gsl_vector_free (v);
      return 0;
    }

Output:

    $ ./a.out
    v_0 = 1.23
    v_1 = 2.23
    v_2 = 3.23

Matrices
Matrices are defined by a `gsl_matrix` structure, which describes a generalized slice of a block. Like a vector, it represents a set of elements in an area of memory, but uses two indices instead of one.

The `gsl_matrix` structure contains six components:
the two dimensions of the matrix, a physical dimension, a pointer to the memory where the elements of the matrix are stored (`data`), a pointer to the block owned by the matrix, if any (`block`), and an ownership flag (`owner`). The physical dimension determines the memory layout and can differ from the matrix dimension to allow the use of submatrices.

The `gsl_matrix` structure:

    typedef struct {
        size_t size1;
        size_t size2;
        size_t tda;
        double * data;
        gsl_block * block;
        int owner;
    } gsl_matrix;

Matrices are stored in row-major order, meaning that each row of elements forms a contiguous block in memory. The number of rows is `size1`. The range of valid row indices runs from 0 to `size1-1`. Similarly, `size2` is the number of columns. The range of valid column indices runs from 0 to `size2-1`. The physical row dimension `tda`, or trailing dimension, specifies the size of a row of the matrix as laid out in memory.

Example
In the following matrix `size1` is 3, `size2` is 4, and `tda` is 8. The physical memory layout of the matrix begins in the top left-hand corner and proceeds from left to right along each row in turn.

    00 01 02 03 XX XX XX XX
    10 11 12 13 XX XX XX XX
    20 21 22 23 XX XX XX XX

Here `XX` represents unused memory locations. The functions for allocating and accessing matrices are defined in `gsl_matrix.h`.

Accessing matrix elements
The functions for accessing the elements of a matrix use the same range-checking system as vectors.
`double gsl_matrix_get (const gsl_matrix * m, size_t i, size_t j)`: returns the `(i,j)`-th element of a matrix `m`.
`void gsl_matrix_set (gsl_matrix * m, size_t i, size_t j, double x)`: sets the value of the `(i,j)`-th element of a matrix `m` to `x`.

`double * gsl_matrix_ptr (gsl_matrix * m, size_t i, size_t j)` and `const double * gsl_matrix_const_ptr (const gsl_matrix * m, size_t i, size_t j)`: these functions return a pointer to the `(i,j)`-th element of a matrix `m`.

Matrix operations
`int gsl_matrix_add (gsl_matrix * a, const gsl_matrix * b)`: adds the elements of matrix `b` to the elements of matrix `a`. The result `a(i,j) + b(i,j)` is stored in `a` and `b` remains unchanged. The two matrices must have the same dimensions.
`int gsl_matrix_sub (gsl_matrix * a, const gsl_matrix * b)`: subtracts the elements of matrix `b` from the elements of matrix `a`. The result is stored in `a`.

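Matrix operations (continued)
`int gsl_matrix_mul_elements (gsl_matrix * a, const gsl_matrix * b)`: multiplies the elements of `a` by the elements of `b`, element by element. This is not the matrix product.
`int gsl_matrix_div_elements (gsl_matrix * a, const gsl_matrix * b)`: divides the elements of `a` by the elements of `b`, element by element.
`int gsl_matrix_scale (gsl_matrix * a, const double x)`: multiplies the elements of `a` by the constant factor `x`.
`int gsl_matrix_add_constant (gsl_matrix * a, const double x)`: adds the constant value `x` to the elements of `a`.
In each case the result is stored in `a`.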

Example

    #include <stdio.h>
    #include <gsl/gsl_matrix.h>

    int main (void)
    {
      int i, j;
      gsl_matrix * m = gsl_matrix_alloc (10, 3);

      /* fill every element with a value that encodes its row and column */
      for (i = 0; i < 10; i++)
        {
          for (j = 0; j < 3; j++)
            {
              gsl_matrix_set (m, i, j, 0.23 + 100*i + j);
            }
        }

      gsl_matrix_free (m);
      return 0;
    }

Apophenia

Uses of Apophenia
It can be used for simple stats-package-like fitting of models, where the user gathers data, cleans it, and runs a series of regressions. The library can also serve as input to the design of other systems, like fitting a model and then using the fitted model to generate agents in a simulation, or designing hierarchical models built from simpler base models.

Workflow of a typical fitting-a-model project
1. Read the raw data into the database using `apop_text_to_db`.
2. Use SQL queries handled by `apop_query` to massage the data as needed.
3. Use `apop_query_to_data` to pull some of the data into an in-memory `apop_data` set.
4. Call a model estimation function such as `apop_estimate` to fit the parameters.
5. Interrogate the returned estimate by dumping it to the screen with `apop_model_print` or sending its parameters and variance-covariance matrices to additional tests, or send the model's output as the input to another model.

`apop_data`
The `apop_data` structure represents a data set. It joins together a `gsl_vector`, a `gsl_matrix`, an `apop_name`, and a table of strings. It can be used everywhere a `gsl_matrix` or a `gsl_vector` can be used.

`apop_data` (continued)
The structure basically includes six parts: a vector; a matrix; a grid of text elements; a vector of weights; names for everything (row names, a vector name, matrix column names, text names); and a link to a second page of data.

Example: consider data for a weighted OLS regression. It includes an outcome variable in the vector, independent (explanatory) variables in the matrix and text grid, replicate weights, and column names labeling the variables.
[Example table from the original slide, with column names in bold, is not reproduced here.]

Apophenia will generally assume that one row across all of these elements describes a single observation or data point. `apop_data_get`, `apop_data_set`, and `apop_data_ptr` are used to access individual elements. These functions consider the vector to be the -1st column, so using the data set in the example, `apop_data_get(sample_set, .row=0, .col=-1) == 1`.

Reading data: this can be done using `apop_text_to_data`, or `apop_text_to_db` followed by `apop_query_to_data`.
Subsets of data can be generated as required using `APOP_DATA_ROWS`.

Means of creating an `apop_data` set
`apop_query_to_text`, `apop_query_to_data`, `apop_matrix_to_data`, `apop_vector_to_data`, `apop_data_alloc`

`apop_query_to_text`
Dumps the results of a query into an array of strings.
Returns: an `apop_data` structure with the `text` element filled.
Argument (`fmt`): a printf-style SQL query.
If `apop_opts.db_name_column` matches a column of the output table, then that column is used for row names, and therefore will not be included in the `text`.

`apop_query_to_data`
Queries the database and dumps the result into an `apop_data` set. If `apop_opts.db_name_column` is set (it defaults to `"row_names"`) and the name of a column matches it, then the row names are read from that column.
Returns: `NULL` if no rows are returned; otherwise an `apop_data` set with the data in place. Most data will be in the `matrix` element of the output.

`apop_query_to_vector`
Queries the database and dumps the first column of the result into a `gsl_vector`. It uses `apop_query_to_data` internally, then throws away all but the first column of the matrix. If `apop_opts.db_name_column` is set, that column is ignored: it is put into the names of the `apop_data` set and then discarded when only the `gsl_matrix` part of that set is used. If the query returns zero rows of data or no columns, the function returns `NULL`.
Returns: a `gsl_vector` holding the first column of the returned matrix.

`apop_matrix_to_data`
Wraps an `apop_data` structure around an existing `gsl_matrix`. The matrix is not copied, but is pointed to by the new `apop_data` struct.
Parameters: `m`, the existing matrix you'd like to turn into an `apop_data` structure.
Returns: the `apop_data` structure whose `matrix` pointer points to the input matrix. The rest of the struct is basically blank.

`apop_vector_to_data`
Wraps an `apop_data` structure around an existing `gsl_vector`. The vector is not copied, but is pointed to by the new `apop_data` struct.
Parameters: `v`, the data vector.
Returns: an allocated, ready-to-use `apop_data` structure.

`apop_data_alloc`
Allocates an `apop_data` structure, to be filled with data.
Three arguments, like `apop_data_alloc(2,3,4)`: vector size, matrix rows, matrix columns. If the first argument is zero, you get a `NULL` vector.
Two arguments, `apop_data_alloc(2,3)`, would allocate just a matrix, leaving the vector `NULL`.
One argument, `apop_data_alloc(2)`, would allocate just a vector, leaving the matrix `NULL`.
Zero arguments, `apop_data_alloc()`, will produce a basically blank set, with `out->matrix == out->vector == NULL`.

Get, set, and point

Set
`apop_data_set(in, row, col, data)` is much like the GSL's `gsl_matrix_set(in->matrix, row, col, data)`, but with some differences. The `apop_data` set has names, so we can get and set elements using those names; the versions that take a column or row name use `apop_name_find` for the search. The `apop_data` set also has both matrix and vector elements.

Set (continued)
For the versions that take a column number, column -1 is the vector element. For the versions that take a column name, the vector is searched last: if the name is not found among the matrix columns but matches the vector name, column -1 is returned. If you give both a `.row` and a `.rowname`, the name is used; similarly for `.col` and `.colname`. The column (like all defaults) is zero unless stated otherwise, so `apop_data_get(dataset, 1)` gets item (1, 0) from the matrix element of `dataset`.

Signature of `apop_data_set`

    int apop_data_set (apop_data * data, const size_t row, const int col,
                       const double val, const char * colname,
                       const char * rowname, const char * page)

Example: set a data element.
The first three calls below all set row 3, column 8, of `d` to 5; the last call is invalid because the value does not follow the colname.

    apop_data_set(d, 3, 8, 5);
    apop_data_set(d, .row = 3, .col = 8, .val = 5);
    apop_data_set(d, .row = 3, .colname = "Column 8", .val = 5);
    apop_data_set(d, .row = 3, .colname = "Column 8", 5);  // invalid

Parameters:
`data`: the data set. Must not be `NULL`.
`row`: the row number of the desired element. If `rowname==NULL`, default is zero.
`col`: the column number of the desired element. -1 indicates the vector. If `colname==NULL`, default is zero.
`rowname`: the row name of the desired element. If `NULL`, use the row number.
`colname`: the column name of the desired element. If `NULL`, use the column number.
`page`: the case-insensitive name of the page on which the element is found. If `NULL`, use the first page.
`val`: the value to give the point.
Returns: the value at the given location.

`apop_data_get`

    double apop_data_get (const apop_data * data, const size_t row, const int col,
                          const char * rowname, const char * colname,
                          const char * page)

Returns the data element at the given point.

`apop_data_ptr`

    double * apop_data_ptr (apop_data * data, const int row, const int col,
                            const char * rowname, const char * colname,
                            const char * page)

Gets a pointer to an element of an `apop_data` set.

All of these functions use the designated initializers syntax for inputs,
e.g. `apop_text_to_db("infile.txt", "intable", 0, 1, NULL);`

Forming partitioned matrices

The entire data set can be copied.
Two data matrices can be stacked one on top of the other (stack rows). Two data matrices can be stacked one to the right of the other (stack columns). Two data vectors can be stacked. For this we use the `apop_data_stack` function.

`apop_data_stack`

    apop_data * apop_data_stack (apop_data * m1, apop_data * m2,
                                 char posn, char inplace)

Puts the first data set either on top of or to the left of the second data set. By default the function returns a new data set, meaning that until you `apop_data_free()` the original data sets, you will be taking up twice as much memory.

Parameters:
`m1`: the upper/leftmost data set (default = `NULL`).
`m2`: the second data set (default = `NULL`).
`posn`: if `'r'`, stack rows of `m1`'s matrix above rows of `m2`'s; if `'c'`, stack columns of `m1`'s matrix to the left of `m2`'s (default = `'r'`).
`inplace`: if `'i'`, `'y'`, or 1, use `apop_matrix_realloc` and `apop_vector_realloc` to modify `m1` in place; otherwise, allocate a new data set, leaving `m1` unchanged (default = `'n'`).
Returns: the stacked data, either in a new `apop_data` set or in `m1`.

Shunting data: copying structures

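[Table from the original slide: conversion methods between data formats. Both the FROM rows and the TO columns are: text file, DB table, `double[]`, `gsl_vector`, `gsl_matrix`, `apop_data`. Each cell held one of the method codes listed on the next slide.]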

Methods of conversion
C = copying, F = function call, Q = querying, P = printing, V = views, S = subelements.

C: copying