1 Data analysis in MATLAB Christian Ruff

2 Why use MATLAB to analyse data?
One single programme can be used for:
–importing single-subject data from any format
–re-arranging for multi-subject analyses
–statistical tests
–plotting results
→ Errors are less likely
→ One single script for analysis and documentation
→ This can even be used by your experimental COGENT script (online analysis)
→ Ultimately, MATLAB is **much** more flexible than SPSS or EXCEL, especially for graphs
Nuisances:
–some details of SPSS procedures are not available (but can be found on the web)
–use is not as intuitive as SPSS buttons, but help and doc are available

3 Outline
How to:
(1) Import single-subject data from any format
(2) Inspect single-subject data for distribution / outliers etc.
(3) Re-arrange data for multi-subject analyses
(4) Perform statistical tests
→ all as steps in one single script

5 (1) Importing data: Reading in files
MATLAB can read in many different types of files, using different functions. These can be listed with help fileformats.
Examples are:
–xlsread: EXCEL data
–dlmread: delimited numeric text (tab, whitespace, or any other delimiter)
–csvread: comma-separated numbers
–textread: any mixture of text and numbers
–importdata: any formatted data as a full file (looks for the most appropriate function to use)
–fopen with fgetl/fscanf/fread: any data line by line, formatted, or as binary, but needs extensive user specification of the format
help and doc give instructions and examples.
MATLAB can also be used to save data in the corresponding formats (e.g., dlmwrite, csvwrite, fopen/fwrite/fprintf).
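As a minimal sketch of the read/write functions listed above, the following writes a small numeric matrix to a tab-delimited text file and reads it back. The data values and the temporary file name are made up for illustration. Newer MATLAB releases recommend other functions for this job, so treat the calls as matching the release the slides were written for.

```matlab
% Hypothetical single-subject data: column 1 = condition, column 2 = RT in ms.
subject_data = [1 512; 2 640; 1 498; 2 701];

% Write to a temporary tab-delimited text file (the file name is arbitrary).
data_file = [tempname '.txt'];
dlmwrite(data_file, subject_data, 'delimiter', '\t');

% Read the file back in; dlmread returns a numeric matrix.
imported = dlmread(data_file, '\t');

delete(data_file);   % tidy up the temporary file
```

Because dlmread returns a plain numeric matrix, the imported data can be used directly in the later steps.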

6 (1) Importing data: Types of variables
Data can be stored in files in very different formats (see e.g. the different field types in Excel sheets).
Three elementary formats are:
–Strings: characters (such as letters); cannot be (sensibly) manipulated numerically; e.g., variable names or condition descriptions
 example_string = '123.456';
–Doubles: used for numbers; can be numerically manipulated. Doubles are not stored character by character (as strings are), but as whole values
 example_number = 123.456;
–Logicals: used for Boolean logic, so they can take only the values 0 (false) and 1 (true); they can be numerically manipulated, but this rarely makes sense; often used for indexing
 example_logical = logical(123.456);
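The three elementary types can be compared side by side in a short sketch. The reaction times are hypothetical values, added only to show how a logical vector is used for indexing.

```matlab
example_string  = '123.456';          % char array: 7 separate characters
example_number  = 123.456;            % double: one numeric value
example_logical = logical(123.456);   % any non-zero number becomes true (1)

n_chars  = numel(example_string);     % number of characters in the string
n_values = numel(example_number);     % a scalar double has one element

% Logicals are mostly used for indexing, e.g. selecting fast trials.
rts      = [450 900 520 1300];        % hypothetical reaction times in ms
fast     = rts < 600;                 % logical vector, true where RT < 600
fast_rts = rts(fast);                 % keeps only the fast trials
```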

7 (1) Importing data: Variable conversion
Raw data files often contain mixtures of strings and numbers, and numerical values are often represented as strings in imported data.
After importing data into MATLAB, the format of each variable can be seen by typing whos, or tested with isnumeric, ischar, or islogical.
The MATLAB workspace contains an array editor that is similar to Excel.
Strings can be converted into doubles with str2double or str2num; this turns numbers in "text format" into numbers that you can do computations with (double applied to a string returns its character codes, not the number)
 e.g. example_number = str2double(example_string);
Doubles can be converted into strings with num2str or sprintf (char would interpret the number as a character code); this makes it possible to include numbers in text that you want to write into a file
 e.g. example_string = num2str(example_number);
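A short sketch of the conversions in both directions. Note that double applied to a string returns the character codes and not the number the text spells out, which is why str2double is used here. The label text is a made-up example.

```matlab
example_string = '123.456';

% str2double (or str2num) interprets the text as a number.
example_number = str2double(example_string);

% double() returns the character codes instead, which is rarely what you want.
char_codes = double(example_string);

% num2str (or sprintf) turns a number back into text, e.g. for file output.
back_to_string = num2str(example_number);
label = ['Mean RT: ' num2str(512.3) ' ms'];
```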

8 (1) Importing data: Variable formats
Relevant variable formats include:
–Matrices:
 - contain m (x n x o…) elements, can be accessed by row or column
 - all elements in a matrix are forced to be in the same format
 → matrices are well suited for storing numbers
 → matrices are not ideal for strings of different lengths (e.g. words)
–Cells:
 - contain m (x n x o…) elements; the contents are accessed element by element
 - each element can be of a different format and length
 → well suited for storing string variables, and mixtures of variables
 → not ideal for storing only number variables that have to be accessed and manipulated as a group (e.g., by row and column)
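The difference between the two formats can be sketched as follows. The condition labels and values are hypothetical.

```matlab
% Matrix: every element has the same type; index by row and column.
rt_matrix  = [512 640; 498 701];       % hypothetical RTs, rows = trials
first_row  = rt_matrix(1, :);          % whole first row
second_col = rt_matrix(:, 2);          % whole second column

% Cell array: each element can have a different type and length.
mixed = {'congruent', 'incongruent'; 512, [640 701 655]};
second_label = mixed{1, 2};            % curly braces return the content
n_values     = numel(mixed{2, 2});     % length of the vector stored in one cell
```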

9 (1) Importing data: Variable formats
Relevant variable formats include:
–Structures:
 - contain m (x n x o…) elements that all have several fields
 - each field in any element can contain any variable (e.g., string, numerical) in any format (e.g., cell, matrices…)
 - the fields of different elements can easily be combined if they have the same format
 → well suited for different variables that are nevertheless linked (e.g., data from different subjects)
 → not ideal for storing only number variables that have to be accessed and manipulated as a group (e.g., by row and column)
 → easy to combine one field of different elements into a matrix (e.g., different trials)
 → see strucdem
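A minimal sketch of a structure array with one element per subject, and of combining one field across elements into a matrix. The field names and values are assumptions chosen for illustration.

```matlab
% One element per subject; all elements share the same fields.
subj(1).name = 'S01';  subj(1).rt = [512 498 530];
subj(2).name = 'S02';  subj(2).rt = [640 701 655];

% Combine one field across elements into a matrix (rows = subjects).
all_rt  = cat(1, subj.rt);             % 2-by-3 matrix
mean_rt = mean(all_rt, 2);             % one mean per subject
names   = {subj.name};                 % cell array of strings
```

Combining works here because the rt field has the same size in every element.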

10 (1) Importing data: Transforming variables
Arrays / cells / structures can easily be converted into each other:
–Numerical array to cell: num2cell or mat2cell ↔ cell2mat
–String array to cell: cellstr ↔ char
–Structure to array: struct2array ↔ struct
–Structure to cell: struct2cell ↔ cell2struct
Arrays / cells / structures can be appended or combined:
–Numerical arrays: [123;456] or cat
–String arrays: strvcat or strcat
–Cells: cat
–Structures: cat
→ If the dimensions of the to-be-combined variables are known, then all of these operations can also be performed simply by indexing (e.g. num3(1,:) = num1; num3(2,:) = num2;)
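Some of these conversions can be sketched as round trips. All variable names and values below are illustrative.

```matlab
num  = [1 2; 3 4];
c    = num2cell(num);                  % 2-by-2 cell, one number per cell
back = cell2mat(c);                    % numeric matrix again

words    = cellstr(['ab '; 'cde']);    % cell of strings, trailing blanks removed
char_arr = char(words);                % padded char array again

s      = struct('name', 'S01', 'rt', 512);
fields = struct2cell(s);               % field contents as a cell column
s2     = cell2struct(fields, {'name'; 'rt'}, 1);

stacked = cat(1, [1 2 3], [4 5 6]);    % same result as [1 2 3; 4 5 6]
```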

12 (2) Inspecting data: Descriptive statistics
Descriptive statistics: mean, median, min, max, prctile, range, var, std, skewness, kurtosis, cdfplot
 - many of these also have versions for data with missing values, named with the prefix "nan" (e.g., nanmean)
Visualisation of distribution:
 - Histogram: hist, also available with superimposed normal distribution: histfit
 - Test for normal distribution:
   - visually with normplot
   - statistically with lillietest (when testing for normality), kstest (when testing against any specified distribution) or kstest2 (when testing whether two samples come from the same distribution)
 - Scatterplot of two variables: scatter, also available for several variables: plotmatrix
 - Line plot of data against one dimension (e.g., time): plot, or two dimensions: plot3
 - Visual check for outliers: boxplot (or check for impact of outliers with trimmean)
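A minimal sketch of a first look at single-subject data, using only base MATLAB functions so that it runs without the Statistics Toolbox. The reaction times are hypothetical, and the outlier rule (three median absolute deviations) is an illustrative assumption that does not come from the slides.

```matlab
rts = [480 510 495 520 505 1500];      % hypothetical RTs in ms, one slow trial

m   = mean(rts);                       % pulled upwards by the slow trial
med = median(rts);                     % robust to the slow trial
sd  = std(rts);
lo  = min(rts);
hi  = max(rts);

% Illustrative rule (an assumption, not from the slides): flag values more
% than 3 median absolute deviations away from the median.
abs_dev    = abs(rts - med);
is_outlier = abs_dev > 3 * median(abs_dev);
clean_rts  = rts(~is_outlier);

% hist(rts) (base MATLAB) or boxplot(rts) / normplot(rts) (Statistics
% Toolbox) would show the same pattern graphically.
```

A large gap between mean and median, as here, is already a hint that a few extreme values dominate the mean.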

14 (3) Transforming data for multi-subject analyses
Matrices are by far the most convenient data format for statistical analyses:
–Most descriptive-statistics commands work along dimensions of matrices
 e.g., mean(matrix,1) averages down the rows (one mean per column), mean(matrix,2) averages across the columns (one mean per row), etc.
–Matrices can easily be indexed with logicals
 e.g., rows = (matrix(:,2)==1); data(:,1) = matrix(rows,1);
–Condition indices can easily be created as matrices
 e.g., data(:,[2:3]) = fullfact([2 12]);
–Matrices can be easily transformed with
 sort and sortrows → to sort data
 flipud, fliplr, flipdim, rot90 → to flip or rotate dimensions
 reshape → to change dimensions
 squeeze → to remove (singleton) dimensions
 shiftdim, circshift → to shift dimensions
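The matrix operations above can be sketched on a small example. The layout (column 1 = reaction time, column 2 = condition code) is an assumption made for illustration.

```matlab
% Hypothetical data: column 1 = RT in ms, column 2 = condition (1 or 2).
matrix = [512 1; 640 2; 498 1; 701 2];

col_means = mean(matrix, 1);           % averages down the rows: 1-by-2
row_means = mean(matrix, 2);           % averages across columns: 4-by-1

rows  = (matrix(:, 2) == 1);           % logical index for condition 1
cond1 = matrix(rows, :);               % all columns of the condition-1 trials

sorted   = sortrows(matrix, 1);        % sort whole rows by RT
reshaped = reshape(matrix(:, 1), 2, 2);% RTs refilled column by column
```

Note that reshape fills the new matrix column by column, so the order of the trials determines where each value ends up.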

16 (4) Statistics: mean comparison
The MATLAB Statistics Toolbox contains functions for many (non-)parametric tests (help stats).
These ask for data in different input formats (see help and doc).
They give out all relevant statistics as variables, and/or as tables (if displayopt = 'on').
Comparing several independent measures: anova1, anova2, anovan, manova1, kruskalwallis
Comparing several dependent (or mixed) measures: rmaov1, rmaov2, bwaov2, rmaov31, rmaov32, rmaov33, friedman, epsGG, epsHF (all repeated measures ANOVAs from http://www.mathworks.com/matlabcentral/fileexchange)
Post-hoc contrasts: multcompare, grpstats
Comparing two independent measures: ttest2, ranksum
Comparing two dependent variables: ttest, signtest, signrank
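A hedged sketch of how some of these tests are called, assuming the Statistics Toolbox is installed. The two conditions contain hypothetical reaction times. The same numbers are treated first as independent groups and then as paired measurements, purely to show the call syntax.

```matlab
% Requires the Statistics Toolbox. Data are hypothetical RTs in ms.
cond_a = [512 498 530 505 520 515];
cond_b = [640 701 655 690 660 675];

% Two independent groups: h = 1 means the null hypothesis is rejected at 5 %.
[h_ind, p_ind] = ttest2(cond_a, cond_b);

% Same subjects in both conditions: paired t-test and its rank-based analogue.
[h_dep, p_dep] = ttest(cond_a, cond_b);
p_rank = signrank(cond_a, cond_b);

% One-way ANOVA on the same two groups; 'off' suppresses table and figure.
group   = [ones(1, 6), 2 * ones(1, 6)];
p_anova = anova1([cond_a, cond_b], group, 'off');
```

All results come back as ordinary variables, so they can be stored, compared, or written to a file within the same script.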

17 (4) Statistics: association / dimension reduction
Bivariate associations:
–correlation: corrcoef
–linear regression: regress or robustfit (weighted to minimise impact of outliers)
–nonlinear regression (e.g. logistic regression): nlinfit
Multivariate associations:
–canoncorr, manova1, mdscale, classify, cluster
Dimension reduction:
–princomp, factoran
Bootstrapping is available: bootstrp
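A minimal sketch of a bivariate analysis on made-up data with a known slope of 2 and intercept of 1. corrcoef is part of base MATLAB, while regress is assumed to be available from the Statistics Toolbox.

```matlab
% Hypothetical data: a linear trend plus small fixed deviations.
x = (1:10)';
y = 2 * x + 1 + [0.3 -0.2 0.1 -0.4 0.2 -0.1 0.4 -0.3 0.1 -0.1]';

% Correlation (base MATLAB): r is 2-by-2, r(1,2) is the x-y correlation.
r = corrcoef(x, y);

% Linear regression (Statistics Toolbox): regress needs an explicit
% column of ones for the intercept.
X = [ones(size(x)), x];
b = regress(y, X);                     % b(1) = intercept, b(2) = slope
```

Leaving out the column of ones would force the fitted line through the origin, which is a common source of confusing results.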

18 (4) Statistics: many other useful things
The Statistics Toolbox contains functions for many statistical distributions (beta, binomial, exponential, gamma, Poisson, Weibull…):
–fits
–cumulative and probability density functions and their inverses
–random number generation
Efficient design of factorial experiments (e.g. fullfact; randn)
Advanced statistical methods are either implemented (e.g., hidden Markov models, decision trees) or can be found on the web:
–http://www.statsci.org/matlab
–http://www.mathworks.com/matlabcentral/fileexchange
If you want to know more, look at the excellent MATLAB documentation at:
–http://www.mathworks.com/access/helpdesk/help/techdoc/
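As an illustration of the distribution functions, the following sketch uses the normal distribution, which is not in the list above but follows the same naming pattern (cdf, inv, pdf, fit). It assumes the Statistics Toolbox, and the data values are hypothetical.

```matlab
% Requires the Statistics Toolbox.
p_below = normcdf(1.96);               % P(Z <= 1.96) for a standard normal
z_crit  = norminv(0.975);              % inverse: the 97.5th percentile
dens    = normpdf(0);                  % density at the mean, 1/sqrt(2*pi)

% Fit a normal distribution to hypothetical data.
x = [480 510 495 520 505 490];
[mu_hat, sigma_hat] = normfit(x);      % estimated mean and standard deviation
```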

