
Published by Sydney Gunn. Modified over 2 years ago.

1
Shunting data: copying structures

2
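Conversion table (rows: FROM, columns: TO)
TO: Text file | DB table | double[] | gsl_vector | gsl_matrix | apop_data
FROM Text file: C F F
FROM DB table: Q Q Q Q
FROM double[]: C F F
FROM gsl_vector: P P F C F F
FROM gsl_matrix: P P F V C F
FROM apop_data: P P F S S C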

3
METHODS OF CONVERSION
– C: copying
– F: function call
– Q: querying
– P: printing
– V: views
– S: subelements

4
C – copying
The gsl_vector_memcpy function is used:
int gsl_vector_memcpy(gsl_vector * dest, const gsl_vector * src)
– This function copies the elements of the vector src into the vector dest. The two vectors must have the same length.
– It assumes that the destination to which data is copied has already been allocated.

5
apop_data* apop_data_copy(const apop_data * in)
– Copies one apop_data structure to another. That is, all data is duplicated.
memmove(&second, &first, sizeof(datatype))
– It goes to the location of first and blindly copies what it finds to the location of second, up to the size of one datatype.
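The word "blindly" matters when the structure holds pointers. The following is a minimal sketch in plain C, assuming a hypothetical record type invented for this illustration: memmove copies the pointer itself, not the data it points to, so both records end up sharing the same storage.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record type, used only for this illustration. */
typedef struct {
    int id;
    double *values;   /* points to storage owned elsewhere */
} record;

/* Copy one record over another byte for byte, as memmove does.
   The values pointer is copied, not the array it points to. */
void record_shallow_copy(record *second, const record *first) {
    assert(second != NULL && first != NULL);
    memmove(second, first, sizeof(record));
}
```

After the copy, a change made through second.values is visible through first.values as well. A full duplicate of the underlying data needs a dedicated copy function, which is the role apop_data_copy plays for apop_data.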

6
apop_system(): int apop_system(const char * fmt, ...)
– Calls system(), but with printf-style arguments.
– E.g.: char filenames[] = "apop_asst.c apop_asst.o"; apop_system("ls -l %s", filenames);
– Returns: the return value of the system() call.
int gsl_matrix_memcpy(gsl_matrix * dest, const gsl_matrix * src)
– This function copies the elements of the matrix src into the matrix dest. The two matrices must have the same size.
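The printf-style convenience of apop_system can be approximated with the standard library alone. This is a sketch of the idea, not Apophenia's implementation: format_command and system_printf are hypothetical names, and the fixed 1024-byte buffer is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: expand a printf-style format into buf.
   Returns the number of characters needed, as vsnprintf does. */
int format_command(char *buf, size_t size, const char *fmt, ...) {
    assert(buf != NULL && fmt != NULL);
    va_list args;
    va_start(args, fmt);
    int needed = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return needed;
}

/* Hypothetical helper: format the command, then hand it to system().
   Returns -1 if the command does not fit in the buffer. */
int system_printf(const char *fmt, ...) {
    char command[1024];
    assert(fmt != NULL);
    va_list args;
    va_start(args, fmt);
    int needed = vsnprintf(command, sizeof command, fmt, args);
    va_end(args);
    if (needed < 0 || (size_t)needed >= sizeof command)
        return -1;
    return system(command);
}
```

A call such as system_printf("ls -l %s", filenames) would then mirror the apop_system example on the slide.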

7
F – Function call
These functions are designed to convert one format to another.
There are two ways to declare a matrix of doubles in C:
– using pointers to declare a list of pointers to pointers
– using an automatically allocated array with double subscripts
The second method is more convenient, but it allows the matrix to be declared only once.
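The two declaration styles can be sketched in plain C as follows. The function names and the 2 x 3 size are assumptions chosen for illustration. Note that the brace initializer of the automatic array is available only at the point of declaration.

```c
#include <assert.h>
#include <stdlib.h>

#define ROWS 2
#define COLS 3

/* Way 1: a list of pointers to rows (pointer to pointers),
   allocated at run time and zero-filled. */
double **alloc_rows(size_t rows, size_t cols) {
    double **m = malloc(rows * sizeof *m);
    assert(m != NULL);
    for (size_t i = 0; i < rows; i++) {
        m[i] = calloc(cols, sizeof **m);
        assert(m[i] != NULL);
    }
    return m;
}

void free_rows(double **m, size_t rows) {
    for (size_t i = 0; i < rows; i++)
        free(m[i]);
    free(m);
}

double sum_rows(double **m, size_t rows, size_t cols) {
    double total = 0.0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            total += m[i][j];
    return total;
}

/* Way 2: an automatically allocated array with double subscripts.
   The whole matrix is filled in once, where it is declared. */
double sum_auto(void) {
    double a[ROWS][COLS] = {{1, 2, 3}, {4, 5, 6}};
    double total = 0.0;
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            total += a[i][j];
    return total;
}
```

Both forms are indexed as m[i][j], but only the first can be sized at run time, and only the second can be filled with a brace list.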

8
F – Function Call

9
P – printing
The output can be directed to the screen, a file, a database, or the system. The apop_opts.output_type setting controls this. It has the following choices:
– s: print to screen (default)
– f: print to file
– d: store the result in a database table
– p: write to the pipe in apop_opts.output_pipe

10
Q – querying
Queries can be used to get data from the database. There are four ways:
– apop_query_to_float
– apop_query_to_vector
– apop_query_to_data
– apop_query_to_matrix

11
int apop_query(const char * fmt, ...)
– Sends a query to the database that returns no data.
– As with the apop_query_to_... functions, the query can include printf-style format specifiers, such as: apop_query("create table %s(id, name, age);", tablename).

12
apop_data* apop_query_to_data(const char * fmt, ...)
– Queries the database, and dumps the result into an apop_data set.
– Most data will be in the matrix element of the output. Column names are appropriately placed.
double apop_query_to_float(const char * fmt, ...)
– Queries the database, and dumps the result into a single double-precision floating point number.

13
– apop_query_to_float calls apop_query_to_data and returns the (0,0)th element of the returned matrix. Thus, if your query returns multiple lines, you will get no warning, and the function will return the first in the list.
gsl_matrix* apop_query_to_matrix(const char * fmt, ...)
– Queries the database, and dumps the result into a matrix.
– Uses apop_query_to_data and returns just the matrix part.
– Returns: a gsl_matrix.

14
gsl_vector* apop_query_to_vector(const char * fmt, ...)
– Queries the database, and dumps the first column of the result into a gsl_vector.
– Uses apop_query_to_data internally, then throws away all but the first column of the matrix.
– Returns: a gsl_vector holding the first column of the returned matrix. Thus, if your query returns multiple columns, you will get no warning, and the function will return only the first.

15
S – subelements
Sometimes only some data items need to be pulled out of the entire set. The copying functions from F above can be used for this.

16
V – views
Views work much like database views: they hold subsets of the original matrices. Changes made to the original data are reflected in the views and vice versa. The following macros for gsl_matrix are used:
– Apop_matrix_row(m, row, v)
– Apop_matrix_col(m, col, v)
– Apop_submatrix(m, srow, scol, nrows, ncols, o)

17
Apop_matrix_col(m, col, v)
– After this call, v will hold a vector view of the colth column of m.
– E.g.: Apop_matrix_col(m, 5, col_v)
– It produces a gsl_vector named col_v holding column 5 of m (indices start at zero).
Apop_matrix_row(m, row, v)
– After this call, v will hold a vector view of the rowth row of m.
– E.g.: Apop_matrix_row(m, 3, row_v)
– It produces a gsl_vector named row_v holding row 3 of m (indices start at zero).

18
Apop_submatrix(m, srow, scol, nrows, ncols, o)
It pulls a pointer to a submatrix into a gsl_matrix.
Parameters:
– m: the root matrix
– srow: the first row (in the root matrix) of the top of the submatrix
– scol: the first column (in the root matrix) of the left edge of the submatrix
– nrows: number of rows in the submatrix
– ncols: number of columns in the submatrix
– o: the name of the resulting submatrix

19
Example
Apop_submatrix(m, 2, 4, 6, 8, submat)
– It produces a gsl_matrix * named submat whose (0,0)th element is at (2,4) in the original matrix.
For data sets we use these functions with row/column names:
Apop_row_t(m, "fourth row", row_v)
Apop_col_t(m, "fifth column", col_v)
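The reason changes show up on both sides is that a view owns no data: it is only a pointer into the parent plus a stride. The sketch below is a plain C stand-in for that idea, not the GSL or Apophenia implementation. The vec_view type and its functions are hypothetical, and the matrix is assumed to be stored row-major in a flat array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vector view: no storage of its own,
   just a pointer into the parent matrix plus a stride. */
typedef struct {
    double *data;    /* first element of the view inside the parent */
    size_t size;     /* number of elements in the view */
    size_t stride;   /* distance between consecutive elements */
} vec_view;

/* View of column col of a row-major rows x cols matrix. */
vec_view column_view(double *m, size_t rows, size_t cols, size_t col) {
    assert(m != NULL && col < cols);
    vec_view v = { m + col, rows, cols };
    return v;
}

/* View of row `row` of a row-major rows x cols matrix. */
vec_view row_view(double *m, size_t rows, size_t cols, size_t row) {
    assert(m != NULL && row < rows);
    vec_view v = { m + row * cols, cols, 1 };
    return v;
}

double view_get(vec_view v, size_t i) {
    assert(i < v.size);
    return v.data[i * v.stride];
}

void view_set(vec_view v, size_t i, double x) {
    assert(i < v.size);
    v.data[i * v.stride] = x;   /* writes straight into the parent */
}
```

Writing through view_set changes the parent matrix, and changing the parent changes what view_get returns, because both refer to the same memory.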

20
LINEAR ALGEBRA

21
apop_data* apop_dot(const apop_data * d1, const apop_data * d2, char form1, char form2)
A convenience function for dot products.
– d1 may be a vector or a matrix, and the same for d2, so this function can do vector dot matrix, matrix dot matrix, and so on.
– If d1 includes both a vector and a matrix, then later parameters will indicate which to use.
– form1 and form2 are flags for each matrix indicating what to do with it: 't' for transpose, 'v' for vector, 0 to use the matrix as it is.

22
E.g.: apop_dot(X, X, 't', 0) computes X'X, i.e. it takes the dot product of X with itself, where the first copy of X is transposed while the second is not. If the first element is a vector, it is always taken to be a row; if the second element is a vector, it is always taken to be a column.
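To make the X'X product concrete, here is a minimal plain C sketch that computes it by hand for a row-major matrix. It does not use apop_dot. The function name xtx and the flat-array layout are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Compute out = X'X for a row-major n x k matrix X.
   out must have room for k x k doubles. */
void xtx(const double *x, size_t n, size_t k, double *out) {
    assert(x != NULL && out != NULL);
    for (size_t i = 0; i < k; i++) {
        for (size_t j = 0; j < k; j++) {
            double sum = 0.0;
            /* Entry (i, j) is the dot product of columns i and j of X. */
            for (size_t r = 0; r < n; r++)
                sum += x[r * k + i] * x[r * k + j];
            out[i * k + j] = sum;
        }
    }
}
```

For the 3 x 2 matrix with rows (1, 2), (3, 4), (5, 6), the result is the 2 x 2 matrix with rows (35, 44) and (44, 56).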

23
int gsl_blas_ddot(const gsl_vector * x, const gsl_vector * y, double * result)
– It computes the dot product of the vectors x and y and stores it in result.
– E.g.: double dotprod; gsl_blas_ddot(x, y, &dotprod);
The Basic Linear Algebra Subprograms (BLAS) define a set of fundamental operations on vectors and matrices which can be used to create optimized higher-level linear algebra functionality. The functions are declared in the file gsl_blas.h.

24
MATRIX INVERSION AND EQUATION SOLVING

25
gsl_matrix* apop_matrix_inverse(const gsl_matrix * in)
– Inverts a matrix. The in matrix is not destroyed in the process. You may need to call apop_matrix_determinant first to check that your input is invertible, or use apop_det_and_inv to do both at once.
Parameters: in is the matrix to be inverted.
Returns: its inverse.

26
double apop_matrix_determinant(const gsl_matrix * in)
– Finds the determinant of a matrix. The in matrix is not destroyed in the process.
– See also apop_matrix_inverse, or apop_det_and_inv to do both at once.
Parameters: in is the matrix whose determinant is to be found.
Returns: the determinant.

27
double apop_det_and_inv(const gsl_matrix * in, gsl_matrix ** out, int calc_det, int calc_inv)
Calculates the determinant of a matrix, its inverse, or both, via LU decomposition. The in matrix is not destroyed in the process.

28
Parameters:
– in: the matrix to be inverted/determined.
– out: if you want an inverse, this is where to place the matrix to be filled with the inverse. It will be allocated by the function.
– calc_det: 0: do not calculate the determinant. 1: do.
– calc_inv: 0: do not calculate the inverse. 1: do.
Returns: if calc_det == 1, the determinant; otherwise, just zero. If calc_inv != 0, then *out is pointed to the matrix inverse.
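As a rough illustration of how a determinant falls out of an LU-style factorization, the sketch below runs Gaussian elimination with partial pivoting in plain C: the determinant is the product of the pivots, with the sign flipped once per row swap. This is not the Apophenia or GSL code. The function names are hypothetical, and the input is copied first so that, as with the functions above, the caller's matrix is not destroyed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static double abs_d(double x) { return x < 0 ? -x : x; }

/* Determinant of a row-major n x n matrix by elimination with
   partial pivoting. Works on a private copy of the input. */
double det_by_elimination(const double *in, size_t n) {
    assert(in != NULL && n > 0);
    double *a = malloc(n * n * sizeof *a);
    assert(a != NULL);
    memcpy(a, in, n * n * sizeof *a);

    double det = 1.0;
    for (size_t col = 0; col < n; col++) {
        /* Pick the largest remaining entry in this column as pivot. */
        size_t pivot = col;
        for (size_t r = col + 1; r < n; r++)
            if (abs_d(a[r * n + col]) > abs_d(a[pivot * n + col]))
                pivot = r;
        if (a[pivot * n + col] == 0.0) {   /* singular matrix */
            free(a);
            return 0.0;
        }
        if (pivot != col) {                /* a row swap flips the sign */
            for (size_t j = 0; j < n; j++) {
                double tmp = a[col * n + j];
                a[col * n + j] = a[pivot * n + j];
                a[pivot * n + j] = tmp;
            }
            det = -det;
        }
        det *= a[col * n + col];
        /* Eliminate the entries below the pivot. */
        for (size_t r = col + 1; r < n; r++) {
            double factor = a[r * n + col] / a[col * n + col];
            for (size_t j = col; j < n; j++)
                a[r * n + j] -= factor * a[col * n + j];
        }
    }
    free(a);
    return det;
}
```

A determinant of zero signals that the matrix has no inverse, which is why checking the determinant before inverting is suggested above. With floating-point data, a very small nonzero value deserves the same caution.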

29
Numbers
Besides ordinary values, floating-point numbers can take three special values:
– INFINITY
– -INFINITY
– NAN (not a number; mainly used for missing data)
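A small sketch of how NAN can serve as a missing-data marker, using only the standard math.h macros. The function name mean_skip_missing is hypothetical. Because NAN compares unequal to everything, including itself, the test for a missing value must use isnan rather than ==.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Mean of the non-missing entries, where NAN marks a missing value.
   Returns NAN when every entry is missing. */
double mean_skip_missing(const double *x, size_t n) {
    assert(x != NULL);
    double sum = 0.0;
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (isnan(x[i]))
            continue;            /* skip missing observations */
        sum += x[i];
        count++;
    }
    return count ? sum / count : NAN;
}
```

Without the isnan check, a single NAN in the data would turn the whole sum, and therefore the mean, into NAN.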

30
MODELS

31
apop_model
Similar to apop_data, it encapsulates model information in a uniform manner. It allows models to be used in various functions that can take any model as input. A model is intermediate between data and parameters. From there, a model can go in three directions.

32
1. X → β: given data, estimate parameters (OLS parameters or covariance)
2. β → X: given parameters, generate artificial data (Monte Carlo)
3. (X, β) → p: given both parameters and data, estimate their likelihood or probability (Bayesian estimation)

33
apop_model* apop_estimate(apop_data * d, apop_model m)
Estimates the parameters of a model given data. This function copies the input model, preps it, and calls m.estimate(d, &m). If your model has no estimate method, then it assumes apop_maximum_likelihood(d, m), with the default MLE params.

34
Parameters:
– d: the data
– m: the model
Returns: a pointer to an output model, which typically matches the input model but has its parameters element filled in.
E.g.: apop_model *est = apop_estimate(data, apop_normal);

35
#include <apop.h>
int main(void) {
    /* Read the text file "data" into the database table "d". */
    apop_text_to_db(.text_file="data", .tabname="d");
    /* Pull the whole table into an apop_data set. */
    apop_data *data = apop_query_to_data("select * from d");
    /* Estimate an OLS model on the data and display it. */
    apop_model *est = apop_estimate(data, apop_ols);
    apop_model_show(est);
}
OLS: ordinary least squares

36
Examples
– Cook's distance
– Network data
– MLE models
– Utility maximization

37
Cook's distance
It is an estimate of how much each data point affects a regression. In a practical ordinary least squares analysis, Cook's distance can be used in several ways:
1) to indicate data points that are particularly worth checking for validity;
2) to indicate regions of the design space where it would be good to be able to obtain more data points.
It is named after the American statistician R. Dennis Cook, who introduced the concept in 1977.

38
Cook's distance measures the effect of deleting a given observation. Data points with large residuals (outliers) and/or high leverage may distort the outcome and accuracy of a regression. Points with a large Cook's distance are considered to merit closer examination in the analysis.
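The "effect of deleting a given observation" can be written out as the standard textbook formula for Cook's distance. The notation is an assumption, not taken from the slides: n observations, p estimated parameters, fitted values \hat{y}_j, fitted values \hat{y}_{j(i)} with observation i deleted, residuals e_i, and leverages h_{ii} from the hat matrix H = X(X'X)^{-1}X'.

```latex
D_i \;=\; \frac{\sum_{j=1}^{n} \bigl(\hat{y}_j - \hat{y}_{j(i)}\bigr)^2}{p\, s^2}
    \;=\; \frac{e_i^2}{p\, s^2} \cdot \frac{h_{ii}}{(1 - h_{ii})^2},
\qquad
s^2 = \frac{\sum_{j=1}^{n} e_j^2}{n - p}
```

The second form shows why both large residuals and high leverage matter: D_i grows with e_i^2 and with h_{ii}.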

39
MLE models
Maximum-likelihood estimation (MLE) is a method of estimating the parameters of a statistical model. When applied to a data set and given a statistical model, maximum-likelihood estimation provides estimates for the model's parameters.

40
Example
One may be interested in the heights of adult female penguins, but be unable to measure the height of every single penguin in a population due to cost or time constraints. Assuming that the heights are normally (Gaussian) distributed with some unknown mean and variance, the mean and variance can be estimated with MLE while only knowing the heights of some sample of the overall population. MLE would accomplish this by taking the mean and variance as parameters and finding particular parametric values that make the observed results the most probable (given the model).
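For the Normal model, the maximizing values have a well-known closed form, so the penguin example can be sketched without a numerical optimizer. The code below is plain C, not Apophenia. The type and function names are hypothetical, and no real height data is implied.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double mean;
    double variance;
} normal_params;

/* Closed-form maximum-likelihood estimates for a Normal model:
   the sample mean, and the average squared deviation from it
   (divisor n, not the n - 1 of the unbiased estimator). */
normal_params normal_mle(const double *x, size_t n) {
    assert(x != NULL && n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    double mean = sum / n;

    double ss = 0.0;
    for (size_t i = 0; i < n; i++)
        ss += (x[i] - mean) * (x[i] - mean);

    normal_params p = { mean, ss / n };
    return p;
}
```

For models without a closed form, the same estimates have to be found by numerically searching for the parameters that maximize the likelihood.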

41
GRAPHICS

42
Graphics
Gnuplot is a free, command-driven, interactive, function and data plotting program. Any mathematical expression accepted by C, FORTRAN, Pascal, or BASIC may be plotted. The precedence of operators is determined by the specifications of the C programming language.

43
plot and splot are the primary commands in Gnuplot. They plot functions and data in many ways. plot is used to plot 2-d functions and data, while splot plots 3-d surfaces and data.

44
Syntax
plot {[ranges]} {[function] | {"[datafile]" {datafile-modifiers}}} {axes [axes]} {[title-spec]} {with [style]} {, {definitions,} [function] ...}
where either a [function] or the name of a data file enclosed in quotes is supplied.

45
To plot functions, simply type plot [function] at the gnuplot> prompt. For example:
gnuplot> plot sin(x)/x
gnuplot> splot sin(x*y/20)
gnuplot> plot sin(x) title 'Sine Function', tan(x) title 'Tangent'

46
Discrete data contained in a file can be displayed by specifying the name of the data file (enclosed in quotes) on the plot or splot command line. Data files should have the data arranged in columns of numbers. Columns should be separated by white space (tabs or spaces) only, with no commas. Lines beginning with a # character are treated as comments and are ignored by Gnuplot. A blank line in the data file results in a break in the line connecting data points.

47
Customization of the axis ranges, axis labels, and plot title, as well as many other features, is specified using the set command. Specific examples of the set command follow.

48
Create a title: > set title "Force-Deflection Data"
Put a label on the x-axis: > set xlabel "Deflection (meters)"
Put a label on the y-axis: > set ylabel "Force (kN)"
Change the x-axis range: > set xrange [0.001:0.005]
Change the y-axis range: > set yrange [20:500]
Have Gnuplot determine ranges: > set autoscale
Move the key: > set key 0.01,100

49
Delete the key: > unset key
Put a label on the plot: > set label "yield point" at 0.003, 260
Remove all labels: > unset label
Plot using log-axes: > set logscale
Plot using log-axes on y-axis: > unset logscale; set logscale y
Change the tic-marks: > set xtics (0.002,0.004,0.006,0.008)
Return to the default tics: > unset xtics; set xtics auto
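Since Gnuplot is command-driven, a C program can prepare a plot simply by writing out the same commands as text. The sketch below assembles a short script from the set and plot commands shown above. The helper name write_plot_script and the data file name used in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write a short Gnuplot script into buf.
   Returns the number of characters needed, as snprintf does. */
int write_plot_script(char *buf, size_t size, const char *title,
                      const char *xlabel, const char *ylabel,
                      const char *datafile) {
    assert(buf != NULL && title != NULL && xlabel != NULL);
    assert(ylabel != NULL && datafile != NULL);
    return snprintf(buf, size,
                    "set title \"%s\"\n"
                    "set xlabel \"%s\"\n"
                    "set ylabel \"%s\"\n"
                    "plot \"%s\" with lines\n",
                    title, xlabel, ylabel, datafile);
}
```

Calling it with "Force-Deflection Data", "Deflection (meters)", "Force (kN)", and a data file such as "force.dat" yields a four-line script that can be saved to a file and loaded into Gnuplot.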

