Presentation is loading. Please wait.

Presentation is loading. Please wait.

MATLAB Statistic Basic

Similar presentations


Presentation on theme: "MATLAB Statistic Basic"— Presentation transcript:

1 MATLAB Statistic Basic
Application Engineer Jeffrey Liu

2 In this course, you will learn…
MATLAB basic operation Import data from file. Calculate summary statistics. Create visualizations of data. Fit models to data. Perform hypothesis tests. Use analysis of variance to test for differences in data groups. Generate random numbers for simulations.

3 Course Outline MATLAB introduction Importing and Organizing Data
Exploring Data Distributions Hypothesis Tests Analysis of Variance Regression Random Numbers and Simulation

4 Technical Computing Workflow
Files Software Hardware Access Code & Applications Explore & Discover Reporting and Documentation Outputs for Design Deployment Share Automate Data Analysis & Modeling Algorithm Development Application Development

5 How to Use This Manual Code font is used for code, function names and URLs. It also occurs on the slides in one of the lower corners as reference to relevant files or commands for the example. >> command_line_code [a,b,c] = command_in_file(d,f); At times, code may run off the line. The line continuation character (…) is used to show this. These are valid MATLAB statements if typed as shown, including carriage returns. >> [CFlowAmounts, CFlowDates, TFactors] = ... cfamounts(couponRate, settle, maturity); Menu items, options and key names are highlighted in bold in the notes sections. Use Ctrl+C to break out of execution. Click on File, Set Path ... to open the Path Browser. >> try this at the prompt

6 Customizing the Desktop
You can customize the layout of the desktop: Try Customize the desktop layout. Use the Layout dropdown menu in the Environment section of the Home tab to restore the default layout. Obviously we start with the appearance and function of the various desktop components. Go through each window; briefly explain its purpose Show resizing and moving of windows Docking and undocking Give them time to play Show how to restore default layout (good motivator: “has anyone messed up their desktop beyond recognition yet?”) Recommended Activities: Additional Activities: The Story: Gotchas: FAQs/Common Difficulties: Extras/References: Will my setup be preserved next time I start MATLAB? Yes Customizing the Desktop Save and restore layouts. Click to access window actions. Window actions Click and drag header bars to reposition components. Multiple components can occupy the same area, in which case they will be accessible via window tabs. Resize & reposition Undock a component to a separate window. Dock a window back to the desktop. Click and drag dividers to resize components.

7 Variables in the Base Workspace
Data imported using the Import Wizard or the Import Tool is stored in the form of MATLAB variables in the base workspace (MATLAB memory). The contents of the base workspace are shown in the Workspace browser window. The browser displays the names of the variables currently in memory and (optionally) information about the variables. Right-clicking on the Name button at the top of the Workspace browser displays a context menu from which you can choose which information to display about the variables. Data is stored in MATLAB variables of various classes (types). Different classes store different kinds of information. The Import Wizard and Import Tool automatically choose appropriate classes for the variables that will store your data. Double arrays are the default type for numerical data in MATLAB. “Array” means a variable with multiple values, stored in a regular layout, such as the table of gas price data. “Double” refers to “double precision,” which is a standard format for storing floating-point numbers – i.e., numbers that generally may have a fractional part, such as Unless you specify a different name, the imported data is stored in a variable with the same name as the file (without the extension). You can rename a variable by right-clicking it in the Workspace browser and choosing Rename from the context menu. Try Change the properties displayed in the Workspace browser to show the size and class of the variable gasprices. We imported data... where is it? Go through the try column Best to show the “class” field (for the next page) Recommended Activities: Additional Activities: The Story: Gotchas: FAQs/Common Difficulties: Extras/References: Variables in the Base Workspace 1990 NaN 1.87 3.63 2.65 4.59 3.16 1 2.05 2.82 1.16 1991 1.96 1.92 3.45 2.9 4.5 3.46 1.3 2.49 3.01 1.14 1992 1.89 1.73 3.56 3.27 4.53 3.58 1.5 3.06 1.13 1993 1.57 3.41 3.07 3.68 4.16 1.56 2.88 2.84 1.11 1994 1.84 1.45 3.59 3.52 3.7 4.36 1.48 2.87 2.99 1995 1.95 1.53 4.26 3.96 4 4.43 2.94 3.21 1.15 1996 2.12 1.61 4.41 3.94 4.39 3.64 1.25 3.18 3.34 1.23 1997 1.62 3.53 4.07 3.26 1.47 3.83 1998 1.63 1.38 3.87 3.84 1.49 3.04 4.06 1.06 1999 1.72 1.52 3.85 3.42 1.79 3.8 4.29 1.17 2000 1.94 1.86 3.77 3.65 2.01 4.18 4.58 1.51 2001 1.71 3.51 3.4 3.57 2.2 3.76 4.13 1.46 2002 1.76 1.69 3.62 3.67 3.74 3.15 2.24 1.36 2003 2.19 1.99 4.35 3.47 2.04 4.11 4.7 1.59 2004 2.72 2.37 4.99 5.24 5.29 3.93 2.03 4.51 5.56 1.88 2005 3.23 2.89 5.46 5.66 5.74 4.28 2.22 5.28 5.97 2.3 2006 3.54 5.88 6.03 6.1 4.47 2.31 5.92 6.36 2.59 2007 6.6 6.88 6.73 4.49 2.4 6.21 7.13 2.8 2008 4.45 4.08 7.51 7.75 7.63 2.45 5.83 7.42 19 numeric data  “double precision” 11

8 The Variable Editor The Variable Editor The Story:
You can view and edit the contents of variables in the base workspace using the Variable Editor. Double-clicking the data type icon next to the variable name in the Workspace browser opens the Variable Editor in a new window on the MATLAB desktop, with the contents of the variable displayed. To edit data in a variable, simply select the portion of the data you want to edit and type in new values. When you click away from your changes, the new data values are stored. Other editing options such as sorting the data according to a given column or deleting rows or columns are available from the Edit section of the Variable tab. When multiple variables are open in the Variable Editor, you can switch the view from one variable to another using the tabs under the toolstrip. Alternatively, variables can be tiled for comparison, using the options on the Tile section of the View tab. Try Open gasprices in the Variable Editor. Change the missing value (NaN) to 1.96. We can see the variables that are in memory, but what is their content? Open a variable or two. Edit the NaN value in data. (Should be 1.96) Recommended Activities: Additional Activities: The Story: Gotchas: FAQs/Common Difficulties: Extras/References: Understand that you can edit values, but this interface is NOT Excel. Which variables are shown is static (a new variable will not automatically open), but the view of the values is “live” (changes update in real time). The Variable Editor

9 New Variables New Variables The Story: Recommended Activities:
You can also use the Variable Editor to create new variables in the base workspace by extracting data from other variables. The variable gasprices is a 19-by-11 array of values, with each column representing a different variable (country) in the data, and each row representing a different (annual) observation. This format – observations × variables – is typical of how multivariate data is stored in a single MATLAB array. You can use the Variable Editor to extract individual variables (columns) giving gas price data for individual countries. To do this, click the numbered column header for the variable you want to select and click New from Selection  New Numeric Array in the Variable section of the Variable tab. A variable called gasprices1 is created in the base workspace. It can be renamed by right-clicking it in the Workspace browser and choosing Rename from the context menu. You can shift-click and Ctrl-click to select multiple portions of data. When the selected data spans multiple columns, you can create a single (matrix) variable from the selection by clicking New from Selection  New Numeric Array, or multiple (single-column) variables by clicking New from Selection  New Column Vectors. Try In the Variable Editor, select the first column of gasprices, and create a new array from this selection. In the Workspace browser, rename the variable Year. Create separate variables for France, Germany, and Mexico (columns 4, 5, and 8, respectively) by making a multiple-column selection and creating new column vectors. We have the table of gas prices, but we might want to extract specific columns (Year and individual countries). Go through the try column Show a couple of examples, have them do the others (#3 in try column) Show different ways to rename in the Workspace Open a new variable in the Editor to verify the result. Recommended Activities: Additional Activities: The Story: Gotchas: FAQs/Common Difficulties: Extras/References: Can you extract multiple columns/rows? Yes, but only as a block (e.g. can do columns 3-6, but can’t do columns 2, 3, 5 and 7 as one new variable) New Variables

10 Shortcuts Shortcuts The Story: Recommended Activities: Gotchas:
When you are finished with a particular task in MATLAB, you may want to clean up the desktop before moving on to another task. Some useful commands are clear Clears base workspace variables from memory clc Clears the Command Window (moves prompt to the top) close Closes the current figure After you begin to use MATLAB regularly, you will probably find other sequences of commands that you execute repeatedly. MATLAB shortcuts allow you to execute the commands at the push of a button. To create a shortcut, click the Create New Command Shortcut button ( ) on the quick access toolbar. A shortcut consists of a name that explains its purpose and a sequence of commands (called a callback) that are executed when the shortcut button is clicked. After you create your first shortcut, a Shortcuts tab becomes visible on the toolstrip and is always available. You can choose an icon for your shortcut, and organize shortcuts into categories that have their own sections in the Shortcuts tab. Try Create a shortcut button that executes the callback. clear clc close all Add this shortcut to the Quick Access toolbar. All done. Let’s clean up. Cleaning up is a common task – nice to have a quick way to do it. This leads into the next chapter (issuing commands) Work through the try column Recommended Activities: Additional Activities: The Story: Gotchas: FAQs/Common Difficulties: Extras/References: Stress that these particular commands are not reversible! Shortcuts

11 Navigating the Help Browser

12 Help and Documentation
MATLAB help and documentation can show you The various ways to call a function The algorithm implemented by a function Examples of how to use a function Links to related functions Tutorials and background information Basic syntactic help appears automatically after you enter the name of a function and an open parenthesis: To display help on a particular function to the Command Window, type >> help rand >> x = rand( For complete documentation on the function, type >> doc rand The MATLAB Help browser opens to the appropriate documentation. If you want to search or browse the documentation, enter >> doc to open the Help browser to its main page. You can then search by keywords or browse by topic. You can also bring up this syntax help by pressing ctrl+F1. Help is also available “on the fly”. You can highlight any command or function name and press F1 to bring up a pop-up help window on that function. Try Look for help on matrix creation functions. >> doc elmat >> doc gallery Look for help on random numbers. >> help rand >> doc rand Scroll to the “See Also” section and follow the link to rng to see how to control random number generation. >> doc randfun >> docsearch random Available demos >> demos We’ve just seen a big list of matrix functions (plus a couple of others we’ve seen along the way, such as linspace). How do we know what these functions are and what they do and how to use them? Work through try column Recommended Activities: Additional Activities: The Story: Gotchas: FAQs/Common Difficulties: Extras/References: Browse for functions (Shift-F1) isn’t mentioned. Help and Documentation search browse help doc docsearch

13 Importing Data (1/2) Files Command Excel, text, csv, or binary
.xpt in SAS .mat Multimedia, scientific Web, XML Command load → .mat xlsread → .xls / .xlsx /.csv Import Data textscan .xls .csv .txt .xpt dataset

14 Example for importing data
load carbig [num,txt,raw] = xlsread(‘Housing.xlsx'); num : Numeric data, returned as a matrix of double values txt : Text data, returned as a cell array.  raw : Unprocessed data from the worksheet, returned as a cell array.  Filename = csvread('TaiwanVix.csv‘ ,1 ,1); Import Data button

15 Variable Type Logical : 0/1 Ordinal : ordinal category
Char : character Nominal : nominal category Numeric : numerical data Dataset : a table contains title and content Cell : “containers” for data

16 What Is a Table? What Is a Table? Each column is a named variable
In this configuration, you typically want to index into columns by name, and index into rows across all columns (i.e., keep observations together). You can store each column as a separate MATLAB variable, but this makes it harder to index into multiple variables for the same row (observation). A matrix makes it easy to index into entire rows or columns, but does not allow indexing by name, or different data types in different columns. Many data sets conform to a particular tabular arrangement with the following properties: Each column of data has the same type (text, numeric, logical (T/F), etc.). Different columns, however, can have different types. Each column typically has a unique name. MATLAB provides the table data type for storing and manipulating tabular data. Each column has the same number of rows. That is, the data can be thought of some number of observations (rows) of a set of different variables (columns). What Is a Table? IDNumber LastName Sex Age Systolic Diastolic Height Weight Health YPL-320 Smith Male 38 124 93 1.8 80 Excellent GLI-532 Johnson 43 109 77 1.75 74 Fair PNI-258 Williams Female 125 83 1.63 59 Good MIJ-579 Jones 40 117 75 1.7 60 XLK-030 Brown 49 122 54 TFP-518 Davis 46 121 70 1.73 64 LPD-746 Miller 33 130 88 ATA-945 Wilson 115 82 VNL-702 Moore 28 78 LQW-768 Taylor 31 118 86 1.68 QFY-472 Anderson 45 114 58 UJG-627 Thomas 42 68 62 Poor XUE-826 Jackson 25 127 79 TRW-072 White 39 95 1.83 92 ELG-976 Harris 36 1.65 KOQ-996 Martin 48 YUZ-646 Thompson 32 87 XBR-291 Garcia 27 123 KPW-846 Martinez 37 119 1.78 81 Each column is a named variable Each row is a set of observations Mixed data types

17 Bringing Data into Tables
The first step in the statistical analysis process is to bring the data into the MATLAB environment. Common file formats for statistical data include Excel®, delimited text, and specific binary formats such as the SAS transport file format. You can also use the readtable function to import data directly into a table from spreadsheets or delimited text files: >> data = readtable('bodymeasures.csv'); Although the NHANES data is stored in SAS transport files, the same data is provided in different file formats in the course materials to demonstrate importing from these common formats. There are other import functions available in MATLAB and Statistics Toolbox for importing data from other file formats. The output from these functions depends on the function and the data in the file. You can use conversion functions of the form type2table to convert data from other data types to a table. You can also use the table function to build a table from variables in the MATLAB workspace. You can import data from spreadsheets and delimited text files interactively using the Import Tool. For example, to read from an .xpt file, you can use the Statistics Toolbox function xptread: >> dataDS = xptread('demographics.xpt'); This returns data in the form of a dataset array. To convert to a table, use dataset2table: >> data = dataset2table(dataDS); You can generate code from the Import Tool that you can use to import data from the same file or a file with the same structure. Try >> edit importtables Bringing Data into Tables 1 3 2 >> patients = readtable('patients.xlsx','ReadRowNames',true);

18 Storing Data as a Table Variable names A table containing 8 variables
When importing data from a file with the Import Tool, you can specify the type of the output. Choosing the Table option will bring the data into the MATLAB workspace as a single table variable. By default the name of the table will be derived from the file name. Like a matrix, the table will be m-by-n, where m is the number of data observations (rows) and n is the number of variables (columns). The n variables will be named, using header information in the file, if possible. The names of the variables in the table must follow the same rules for names of variables in the workspace. If necessary, the Import Tool will modify the header information to make valid variable names. You can use the readtable function to import data into a table programmatically. Alternatively, you can create a table directly from other variables in the workspace using the table function. You can convert data stored in other data types to tables using conversion functions such as array2table and cell2table. Try Use the Import Tool to import data from smartphoneSpecs.xls as a table. Note that, by default, the first two columns of text are not selected. Change the selection to include all the data. You may want to rename the table from smartphoneSpecs to something shorter (such as phones). You can do this either before or after importing. After importing, open the table in the Variable Editor. Check that the variable names have been correctly inferred from the file. Programmatically read the gas price data as a table. >> gpricetable = ... readtable('gasprices.csv','Headerlines',4); Storing Data as a Table Variable names Manufacturer Name Releasedate Weight Height Width ... Apple iPhone 29-Jun-2007 136.1 11.4 6.1 iPhone 3G 11-Jul-2008 133.0 11.6 6.2 iPhone 3GS 24-Jun-2010 iPhone 4 4-Oct-2010 5.9 iPhone 4S 14-Oct-2011 138.9 Motorola Droid 17-Oct-2009 170.1 Droid 2 12-Aug-2010 169.0 6.0 A table containing 8 variables each with 100 observations 19-by-10

19 Extracting Data from a Table
To perform analysis on the data contained in a table, it is typically necessary to extract the data from the table before passing it to another function (such as plot). You can index into variables in a table using dot (.) notation to reference the variables by name: >> weight = phones.Weight; This extracts the actual variable that was stored in the table. The result will therefore be of the same size and type as that variable. You can also use curly braces ({}) to index into variables in a table. You can index numerically >> hw = phones{:,5:6}; or by name >> hw = phones{:,{'Height','Width'}}; Note that, when you extract multiple variables, all variables must be of the same type. Try Use dot referencing to extract the weights. Note the size and type of the result. >> weight = under150.Weight; Repeat with numeric and named indexing. >> weight = under150{:,3}; >> weight = under150{:,'Weight'}; Is there a correlation between weight and speed? >> speed = under150.Processorspeed; >> scatter(weight,speed) Use numeric indexing to extract the height, width, and depth as a matrix. >> dims = under150{:,4:6} Use named indexing to extract the height, width, and depth as a matrix. >> vars = {'Height','Width','Depth'}; >> dims = under150{:,vars}; Extracting Data from a Table 100-by-1 >> patients.Weight 100-by-8 Manufacturer Name Releasedate Weight Height Width ... Apple iPhone 29-Jun-2007 136.1 11.4 6.1 iPhone 3G 11-Jul-2008 133.0 11.6 6.2 iPhone 3GS 24-Jun-2010 iPhone 4 4-Oct-2010 5.9 iPhone 4S 14-Oct-2011 138.9 Motorola Droid 17-Oct-2009 170.1 Droid 2 12-Aug-2010 169.0 6.0 19-by-10 Remember: “Curly for content”! 19-by-1 >> dims = patients{:,3:5}; >> dims = patients{:, {'Age', 'Systolic', 'Diastolic'}}; 100-by-3

20 Making New Tables from Existing Tables
Extracting Portions of a Table You can index into a table using regular array indexing with parentheses: >> AppleDims = phones(1:5,[2,4:7]); As with any array indexing, the result is a subset of the original data. In this case, AppleDims will be a 5-by-5 table. To index into multiple variables, use a cell array of strings: >> v = {'Name','Weight','Height','Width','Depth'}; >> AppleDims = phones(1:5,v); One benefit of storing data as a table is that you can index into the variables using either a number or a string corresponding to a variable name: >> AppleWeight = phones(1:5,'Weight'); Try View the information for the lightest and heaviest phones. Note the size and type of the result. >> byWeight([1,end],:) Use both numeric and named indexing to create a new table containing just the weights. Note the size and type of the result. >> weightTable = phones(:,4) >> weightTable = phones(:,'Weight') Use numeric indexing to create a new table containing some of the specifications for phones weighing less than 150 grams. >> under150 = byWeight(1:14,[1,2,4:8]); Create the same table using named indexing. >> vars = {'Manufacturer','Name','Weight',... 'Height','Width','Depth','Processorspeed'}; >> under150 = byWeight(1:14,vars) Making New Tables from Existing Tables >> ageData = patients(1:5,[1,3,6:8]); >> ageData = patients(1:5, {'LastName', 'Age', 'Height', 'Weight', 'Health'}); Manufacturer Name Releasedate Weight Height Width ... Apple iPhone 29-Jun-2007 136.1 11.4 6.1 iPhone 3G 11-Jul-2008 133.0 11.6 6.2 iPhone 3GS 24-Jun-2010 iPhone 4 4-Oct-2010 5.9 iPhone 4S 14-Oct-2011 138.9 Motorola Droid 17-Oct-2009 170.1 Droid 2 12-Aug-2010 169.0 6.0 19-by-10 5-by-5 Name Weight Height Width Depth iPhone 136.1 11.4 6.1 1.17 iPhone 3G 133.0 11.6 6.2 1.22 iPhone 3GS iPhone 4 138.9 5.9 0.94 iPhone 4S 170.1 100-by-8 5-by-5

21 Modifying Tables Modifying Tables
As with any array in MATLAB, you can modify tables using indexed assignment: >> phones.Volume = volume; >> phones{:,'Volume'} = volume; >> phones{:,11} = volume; If the specified variable exists in the table, it will be overwritten. If not, it will be added. You can view all the properties (metadata) associated with a table by typing >> phones.Properties Individual properties are referenced using dot notation: >> phones.Properties.VariableNames = ... {'A','B','C','D','E','F','G','H','I','J'}; Try Calculate the volume and density for each phone. Add these to the table as new variables. >> volume = prod(dims,2); >> under150.Volume = volume; >> density = weight./volume; >> under150.Density = density; View the metadata associated with the table of phone specifications. >> under150.Properties Add strings for the units. >> under150.Properties.VariableUnits = ... {'','','g','cm','cm','cm',... 'MHz','cm^3','g/cm^3'} Modifying Tables 100-by-1 >> patients.BMI = patients.Weight./patients.Height.^2; 19-by-1 Manufacturer Name Releasedate Weight ... Volume Apple iPhone 29-Jun-2007 136.1 81.4 iPhone 3G 11-Jul-2008 133.0 87.3 iPhone 3GS 24-Jun-2010 85.0 iPhone 4 4-Oct-2010 63.0 iPhone 4S 14-Oct-2011 138.9 Motorola Droid 17-Oct-2009 170.1 96.8 Droid 2 12-Aug-2010 169.0 96.5 19-by-11 A B C D E F ... Apple iPhone 29-Jun-2007 136.1 11.4 6.1 iPhone 3G 11-Jul-2008 133.0 11.6 6.2 iPhone 3GS 24-Jun-2010 100-by-8 100-by-9

22 Exporting Tables Exporting Tables 'healthyPatients.xls'
You can use the writetable function to export a table to a delimited text file or Excel spreadsheet. By default, writetable uses the file name to determine the appropriate output format: >> writetable(phones,'filename.xls') >> writetable(phones,'filename.txt') Use extra arguments to specify options such as the spreadsheet range or the text delimiter: >> writetable(phones,'filename.xls',... 'Range','C5:L23') >> writetable(phones,'filename.txt',... 'Delimiter','\t') Try Save under150 to an Excel spreadsheet and to a delimited text file. >> writetable(under150,'phoneDensity.xls') >> writetable(under150,'phoneDensity.txt') Exporting Tables >> writetable(healthyPatients, ) 'healthyPatients.xls' 'healthyPatients.txt'

23 Relationship between different types
cell2mat : transform cell array to matrix num2cell / mat2cell : transform numerical data or matrix to cell If X is cell array : X{} → ? X( ) → cell X {3:6} → matrix Date format datenum : transform date to number datestr : transform date number to date string num2cell mat2cell cell2mat ? x{3} x(3) x{3:6} [x{3:6}] , , ,

24 Merging data Use “join” to merge “dataset” variables
Import files with dataset array >> Variables = dataset('XLSFile', 'filename'); Merge two datasets >> New = join(A,B, 'Keys', 'date', 'Type', 'outer', 'MergeKeys',true); Use ”Merge” to combine time series variables Transfer data to time series object >> Variables = fints(date, data, 'var_name'); Merge two time series with date >> New = merge(fints1, fints2); >> join_test.m >> fints_test.m

25 Exercise 1 Create a variable contains S&P500, and money supply in Ex1.xlsx Step 1 : load data with dataset array hint : dataset Step 2 : Convert the date format hint : datenum Step 3 : merge the data (use date as a key) hint : join Try the difference between ‘inner’ and ‘outer’ in join function >> Sol1.m

26 Plot Plenty of plot resources in PLOTS page. Plot, Plot3
Choose a variable in Workspace (One or more) Choose a corresponding PLOT method Plot, Plot3 Scatter, Scatter3 Pie Hist Boxplot Contour

27 ? Measures of Shape Centrality Skewness & kurtosis >> mean
1.1575 Centrality >> mean >> geomean >> harmmean >> Mode >> Median Skewness & kurtosis >> skewness >> kurtosis

28 Measures of Spread Standard deviation Variance
>> std(x) >> std(x, flag) Variance >> var Mean/Median absolute deviation >> mad Interquartile range >> iqr >> quantile(x, 0.3) Range >> range range std iqr mad

29 Correlation Correlation and covariance
>> [RHO,PVAL] = corr(x, y, 'name', value) >> [RHO,PVAL] = corr(X, 'name', value) Type : Pearson, Kendall, Spearman >> RHO = corr(X,Y) >> plotmatrix(X) >> corr(X) plotmatrix 𝑐𝑜𝑣 𝑥 1 , 𝑥 2 =𝐸 𝑥 1 − 𝜇 1 ( 𝑥 2 − 𝜇 2 ) >>descriptive_stat.m

30 Grouped Data and Distribution
Separate the scatter plot by group >>gscatter(x, y, group) >>gplotmatrix(x, y, group) >>boxplot(x) Distribution >>disttool >>dfittool Comparing distributions >>probplot('normal', data.Weight) >> groupeddata.m >> weightdistrib.m

31 Exercise 2 1. Load the dataset array stored in the file daxReturns.mat. 2. Use dfittool to find a suitable distribution for the index returns (first variable). 3. Calculate and display the 1% percentile of the fitted distribution. Bonus Generate a 25-by-1000 matrix of random values from the fitted return distribution. Bonus Transform the random returns into 1000 price series and create a histogram of the final prices. >> Sol2.m

32 Testing Normal Distributions
T-test with single population >>[h,p,ci,stats] = ttest(x, h0, ‘alpha’, alpha) Paired T-test >>[h,p,ci,stats] = ttest(x, y, ‘alpha’, alpha) Multiple samples with equal population >>[h,p,ci,stats] = ttest2(x, y, ‘alpha’, alpha) Multiple samples without equal population >>[h,p,ci,stats] = ttest2(x, y, ‘alpha’, alpha, ‘Vartype’, ‘unequal’) Means Variances One population ttest ztest vartest Multiple populations ttest2 vartest2 vartestn >> hypo_test.m

33 Analysis of Variance One-way ANOVA Multi-way ANOVA
>>[p,table,stats] = anova1 (X, group) Multi-way ANOVA >>[p,table,stats] = anovan (y, group) Y must be vector H0: groups all have same mean H1: no they don’t >> p = anova1(data.Height,data.Overweight);

34 What is regression? Type of predictive modeling
Specify a model that describes Y as a function of X Estimate a set of coefficients that minimizes the difference between predicted and actual Typically, minimize the sum of the squared errors (the sum of the squared residuals) y = mx +b Observed Value Predicted Distance

35 Types of Regression: Linear Regression
Implies that Y is a linear function of the regression coefficients Common examples: Straight line 𝑌=𝐵0+𝐵1𝑋1 Plane 𝑌=𝐵0+𝐵1𝑋1 +𝐵2𝑋2 Polynomial 𝑌=𝐵0+𝐵1𝑋13+𝐵2𝑋12 +𝐵3𝑋1 Polynomial with cross terms 𝑌=𝐵0+𝐵1𝑋12+𝐵2(𝑋1∗𝑋2)+ 𝐵 3 𝑋22

36 Regression (Least-Square)
Method 1: >>[b,bint,r,rint,stats] = regress(y,X,alpha) Method 2: fitlm – 估計設定好的模型 >>mdl = fitlm(ds,modelspec) >>mdl = fitlm(X,y) predict – 將估計出的模型對新樣本做預測 >>[ypred,yci] = predict(mdl,Xnew) Method 3: Curve Fitting Toolbox

37 Exercise 3 Find the predictor of calories with regression
Step 1 : load the data Hint : cereal.m Step 2 : Define independent and dependent variables Hint : dependent - Calories independent - Fat, Sugar, Protein… Step 3 : fit the linear model Step 4 : See the result and see the plot the residual >> Sol_Cereal.m

38 Function Index Import Data Visulization ANOVA xlsread Csvread dataset
plot, plot3 scatter, scatter3 hist pie plotmatrix anova anovan Regression regress fitlm fitglm Variable Type Descriptive Statistics num2cell mat2cell cell2mat datenum datestr fints mean, geomean, harmmean mode, median, skewness, kurtosis std, var iqr, range corr cov Random rand randn randi Plot Data Processing Hypothesis test gplotmatrix gscatter join merge ztest, ttest, ttest2 vartest, vartest2, vartestn

39 Useful resources MATLAB Academic portal –www.mathworks.com/academia
–Video and text tutorials, books, etc. MATLAB documentation –In product or at MATLAB Central –File exchange, newsgroups, blogs, contests, …

40 Thanks for your attention


Download ppt "MATLAB Statistic Basic"

Similar presentations


Ads by Google