APPLIED DATA ANALYSIS/ANALYTICS using STATA M.A.Isiaka FCMA, ACA, CIIA, ANIMN, PhD Department of Economics, Accounting & Finance College of Management.


APPLIED DATA ANALYSIS/ANALYTICS using STATA M.A.Isiaka FCMA, ACA, CIIA, ANIMN, PhD Department of Economics, Accounting & Finance College of Management Sciences Bells University of Technology, Ota, Ogun State, Nigeria.

STATA Environment

Inspecting and Describing Data
Load Data Set-1.
Use the list command to display the two variables (price and quantity).
Use the browse command to open a spreadsheet view of the data in a new window.
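A minimal sketch of these steps. Data Set-1 itself is not reproduced here, so a few invented price/quantity pairs stand in for it; the values are purely illustrative.

```stata
* Toy stand-in for Data Set-1: values invented purely for illustration
clear
input price quantity
10 100
12 90
15 75
18 60
20 50
end

* Print both variables in the Results window
list price quantity

* Open the read-only Data Browser (needs the graphical interface)
browse
```

With the real file, the input block would be replaced by a use command pointing at wherever Data Set-1 is stored.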

Examine Descriptive Statistics
Use the summarize command to display the number of observations, mean, standard deviation, minimum and maximum.
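As a sketch, again with invented stand-in values rather than the actual Data Set-1, summarize can be called on both variables at once. Adding the detail option is optional and reports percentiles, variance, skewness and kurtosis as well.

```stata
* Toy stand-in for Data Set-1: values invented purely for illustration
clear
input price quantity
10 100
12 90
15 75
18 60
20 50
end

* Observations, mean, standard deviation, minimum, maximum
summarize price quantity

* Extended output for a single variable
summarize price, detail
```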

Plot of the Data
Use the plot command to display the y-variable (vertical axis) against the x-variable (horizontal axis).
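A hedged sketch follows. In recent Stata releases the usual graphical counterpart is scatter, which takes the y-variable first and the x-variable second; whether the older text-mode plot command is still available depends on the installed version. Treating price as the y-variable is an assumption made for illustration, and the data are invented.

```stata
* Toy stand-in for Data Set-1: values invented purely for illustration
clear
input price quantity
10 100
12 90
15 75
18 60
20 50
end

* y-variable (vertical axis) first, x-variable (horizontal axis) second
scatter price quantity

* Equivalent long form
twoway scatter price quantity
```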

Generating new Variables
Use the generate command to create the log of each variable.
Use the list command to view all the new variables. What are your observations?
Use the plot command to view the relationship between the log variables.
Repeat the plot using:
gr7 command
scatter command
line command
twoway connect
Comment briefly on the results.
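The steps above can be sketched as follows. The variable names ln_price and ln_quantity are illustrative choices, the data are invented, and the legacy gr7 command is left out because its availability depends on the Stata version.

```stata
* Toy stand-in for Data Set-1: values invented purely for illustration
clear
input price quantity
10 100
12 90
15 75
18 60
20 50
end

* Natural logs of both variables (ln() returns missing for values <= 0)
generate ln_price = ln(price)
generate ln_quantity = ln(quantity)

* Inspect the original and transformed variables side by side
list price quantity ln_price ln_quantity

* Three views of the same log-log relationship
scatter ln_price ln_quantity
line ln_price ln_quantity, sort
twoway connected ln_price ln_quantity, sort
```

The sort option orders the points by the x-variable before joining them, which keeps the line from doubling back on itself.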

Basic Regression
Use the regress command with the robust option on the base variables and on the transformed variables.
Use the predict command to obtain the estimated values of the dependent variable.
Display the original and estimated values of the dependent variable.
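A minimal sketch, assuming quantity is the dependent variable and price the regressor (the slide does not say which is which). The data are invented, and the names ending in _hat are illustrative.

```stata
* Toy stand-in for Data Set-1: values invented purely for illustration
clear
input price quantity
10 100
12 90
15 75
18 60
20 50
end

* Levels regression with heteroskedasticity-robust standard errors
regress quantity price, robust
predict quantity_hat, xb

* Log-log regression on the transformed variables
generate ln_price = ln(price)
generate ln_quantity = ln(quantity)
regress ln_quantity ln_price, robust
predict ln_quantity_hat, xb

* Original and fitted values side by side
list quantity quantity_hat ln_quantity ln_quantity_hat
```

predict uses the most recent estimation results, so each predict call has to follow the regress it belongs to.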

Analysis of Survey Data
Load the census dataset.

Codebook
Id          unique identifier
Fullname    Firstname LASTNAME
Age         age in years
Gender      1 = male, 0 = female
Smoke       0 = non-smoker, 1 = smoker
Bloodtype   A, B, O, AB
Race        race (white, black, Hispanic, other)
Weight      weight in kg
Height      height in centimeters
Diabetes    0 = no diabetes, 1 = diabetes
Hand        0 = right-handed, 1 = left-handed
Dentist     numeric, number of visits

Use the describe command to examine the nature of the variables. How many respondents are there?
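As a sketch of the describe step: the census file is not available here, so the auto dataset that ships with Stata is used as a stand-in. With the real data, the first line would be a use command pointing at the census file.

```stata
* auto.dta ships with Stata and stands in for the census dataset
sysuse auto, clear

* Storage type, display format and label of every variable
describe

* Number of observations (respondents, in the census data)
count
display "respondents: " r(N)
```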

Examine Box Plot
Use graph box with the by() option to examine the distribution of weight for male and female respondents. Do the same for height.
Generate bmi as weight divided by the square of height, multiplied by a scaling constant (value not given).
Based on the box plot, which gender has the higher bmi?
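A hedged sketch with invented values. Lowercase variable names are assumed, and the multiplier 10000 is an assumption: it converts height from centimeters to meters so that bmi comes out in the conventional kg per square meter.

```stata
* Toy stand-in for the census data: values invented for illustration
clear
input gender weight height
1 80 180
1 72 175
1 90 185
0 60 165
0 55 160
0 68 170
end
label define genderlbl 0 "female" 1 "male"
label values gender genderlbl

* One box per gender, for weight and then height
graph box weight, by(gender) name(box_weight, replace)
graph box height, by(gender) name(box_height, replace)

* ASSUMPTION: 10000 converts cm^2 to m^2 (height is recorded in cm)
generate bmi = weight / height^2 * 10000
graph box bmi, by(gender) name(box_bmi, replace)
```

The name() option keeps each graph in memory under its own name instead of overwriting the previous one.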

Constructing Histograms
Construct a frequency histogram of BMI by smoking status. Is the BMI distribution symmetrical for both smokers and non-smokers?
Using a histogram and a box plot, examine the age distribution of the respondents.
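A sketch on simulated data, since the census file is not reproduced here. The variable names smoke, age and bmi are assumed to be lowercase, and the simulated distributions say nothing about the real respondents.

```stata
* Simulated stand-in data: not the real census values
clear
set seed 2024
set obs 200
generate smoke = runiform() < 0.3
generate age = floor(18 + 60*runiform())
generate bmi = rnormal(25, 4)

* Frequency histogram of BMI, one panel per smoking status
histogram bmi, frequency by(smoke) name(hist_bmi, replace)

* Age distribution: histogram and box plot
histogram age, frequency name(hist_age, replace)
graph box age, name(box_age, replace)
```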

Further Practices based on Dataset-3
How many observations are in this dataset?
How many variables are in this dataset?

Further Practices based on Dataset-3 (continued)
Using the information from describe and from the data browser, identify the type of each variable.
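One possible way to answer the counting and variable-type questions, sketched on the auto dataset shipped with Stata because Dataset-3 is not available here.

```stata
* auto.dta ships with Stata and stands in for Dataset-3
sysuse auto, clear

* Header only: number of observations and variables
describe, short
display "observations: " c(N)
display "variables:    " c(k)

* Storage type of each variable, then string and numeric variables separately
describe
ds, has(type string)
ds, has(type numeric)

* Compact overview: observations, distinct values, mean, min, max per variable
codebook, compact
```

Storage type only separates string from numeric. Whether a numeric variable is nominal, ordinal, discrete or continuous still has to be judged from its values and meaning.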

Further Practices based on Dataset-3 (continued)
Generally, apgar scores of 3 or below are considered critical, scores of 4-6 are considered low and scores of 7+ are considered normal. Create a new variable named status with the values 1 = critical, 2 = low and 3 = normal.
What type of variable is status?
nominal categorical
ordinal categorical
continuous numeric
discrete numeric
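A minimal sketch of the recoding, assuming the score is stored in a variable called apgar5 (the name used in the later questions). The scores below are invented, and the label name statuslbl is illustrative.

```stata
* Toy apgar5 scores, invented for illustration (the last one is missing)
clear
input apgar5
0
3
4
6
7
10
.
end

* Build status from the cut-offs; missing scores stay missing
generate status = .
replace status = 1 if apgar5 <= 3
replace status = 2 if inrange(apgar5, 4, 6)
replace status = 3 if apgar5 >= 7 & !missing(apgar5)

* Attach readable labels to the three codes
label define statuslbl 1 "critical" 2 "low" 3 "normal"
label values status statuslbl

tabulate status, missing
```

The !missing() check matters because Stata treats a missing value as larger than any number, so apgar5 >= 7 alone would classify missing scores as normal.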

Further Practices based on Dataset
i. What percent of babies are considered “critical” according to the apgar5 score?
ii. What percent of babies are considered “normal” according to the apgar5 score?
iii. What is the mean systolic blood pressure?
iv. What is the median systolic blood pressure?
v. What is the 90th centile of systolic blood pressure?
vi. What is the range for systolic blood pressure?
vii. What is the interquartile range (Q3-Q1)?
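These questions can be approached along the following lines. The data are simulated, and sbp is a hypothetical name for the systolic blood pressure variable, so the numbers produced are not answers to the exercise.

```stata
* Simulated stand-in data; sbp is a hypothetical variable name
clear
set seed 2024
set obs 100
generate apgar5 = floor(11*runiform())
generate sbp = round(rnormal(47, 11))

* status from apgar5: 0-3 critical, 4-6 low, 7-10 normal
recode apgar5 (0/3 = 1) (4/6 = 2) (7/10 = 3), generate(status)

* i-ii: the Percent column gives the share in each category
tabulate status

* iii-vii: summarize, detail stores the pieces needed in r()
summarize sbp, detail
display "mean   = " r(mean)
display "median = " r(p50)
display "p90    = " r(p90)
display "range  = " r(max)-r(min)
display "IQR    = " r(p75)-r(p25)

* Alternative for v: centile also reports a confidence interval
centile sbp, centile(90)
```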

Further Practices based on Dataset
a) What is the variance?
b) What is the standard deviation?
c) What is the skewness?
d) What is the kurtosis?
e) 25% of observations have a systolic blood pressure below _______.
f) What is the mean systolic blood pressure for babies with a diagnosis of germinal matrix hemorrhage? (Hint: use “summarize by”; review the example of the age summary by smoking status.)
g) What is the mean systolic blood pressure for babies without a diagnosis of germinal matrix hemorrhage?
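A sketch on simulated data. Both sbp (systolic blood pressure) and gmh (1 = germinal matrix hemorrhage, 0 = none) are hypothetical variable names; the real dataset may use different ones.

```stata
* Simulated stand-in data; sbp and gmh are hypothetical variable names
clear
set seed 2024
set obs 100
generate gmh = runiform() < 0.15
generate sbp = round(rnormal(47, 11))

* a-e: variance, standard deviation, skewness, kurtosis, 25th percentile
summarize sbp, detail
display "variance = " r(Var)
display "sd       = " r(sd)
display "skewness = " r(skewness)
display "kurtosis = " r(kurtosis)
display "p25      = " r(p25)

* f-g: mean systolic blood pressure with and without the diagnosis
bysort gmh: summarize sbp
tabstat sbp, by(gmh) statistics(mean n)
```

Stata reports raw kurtosis rather than excess kurtosis, so a normal distribution has a value of 3.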

THANK YOU