Baburao Kamble (Ph.D) University of Nebraska-Lincoln Data Analysis Using R Week2: Data Structure, Types and Manipulation in R.


1 Baburao Kamble (Ph.D) University of Nebraska-Lincoln Data Analysis Using R Week2: Data Structure, Types and Manipulation in R

2 Understanding Data
Data is facts and statistics collected together for reference or analysis. Data has different types and structures.

Observation ID   Date          Tmin                 WindType          CropHealth Status
1                10/15/2009    25                   Type1             Poor
2                11/01/2009    34                   Type2             Improved
3                10/21/2009    28                   Type1             Excellent
4                10/28/2009    52                   Type1             Poor
(identifier)     (date         (continuous          (nominal          (ordinal variable /
                 variable)     variable)            variable)         categorical variable)

3 Importance of Data Type and Data Structure
• The first step in any data analysis is the creation of a dataset from data sources.
• Data sources can include text files, spreadsheets, statistical packages, and database management systems.
• R contains a wide variety of structures for holding data, including scalars, vectors, arrays, data frames, and lists.
• A small investment in learning data structures and data types pays off in the long run.
• You are going to use R for years to come, so it is better to learn a little about the underlying data types and data structures in R.

4 Data Type A data type is a (potentially infinite) class of concrete objects that all share some property. Examples: integers: [15, 50, 67]; character data: [lion, cat, dog, food]. R has a wide variety of data types, including double, integer, complex, logical, character, factor, and dates and times, plus special values for missing data (NA), not-a-number results (NaN), and infinity (Inf).

5 Data Type: Integer Integers are whole numbers with no decimal point, with or without a sign (+/-).
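
A minimal sketch of how whole numbers behave in R. The variable names x, y, and z are illustrative and not from the slides. One detail worth knowing: a whole number typed without a suffix is stored as a double, and the L suffix requests an integer.

```r
# A number typed without a suffix is a double, even if it looks whole.
y <- 15
# The L suffix asks R to store the value as an integer.
x <- 15L

class(x)        # "integer"
class(y)        # "numeric"
is.integer(x)   # TRUE
is.integer(y)   # FALSE

# as.integer() converts a whole-valued double, sign included.
z <- as.integer(-67)
class(z)        # "integer"
```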

6 Data Type: Double The double data type stores double-precision floating-point numbers and covers a far wider range of magnitudes, both very large and very small, than the integer type. Doubles are numbers like 145.335, 6.67 and 9.81.
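
As a rough illustration of the range a double covers compared with an integer, the sketch below inspects one of the slide's example values and the limits R reports in .Machine. The variable name g is illustrative.

```r
g <- 9.81
typeof(g)       # "double"
is.double(g)    # TRUE

# Limits of the two numeric types on the current machine.
.Machine$integer.max   # 2147483647
.Machine$double.xmax   # largest finite double, roughly 1.8e308
.Machine$double.xmin   # smallest positive normalized double, roughly 2.2e-308
```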

7 Data Type: Complex A complex value is represented as a pair of floating-point numbers: the real part and the imaginary part. You will rarely need complex numbers in statistical data analysis.
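
A short sketch of the pair-of-numbers idea, using an arbitrary value chosen for illustration.

```r
# Build a complex number from its two floating-point parts.
z <- complex(real = 3, imaginary = 4)

class(z)   # "complex"
Re(z)      # 3  (real part)
Im(z)      # 4  (imaginary part)
Mod(z)     # 5  (modulus, sqrt(3^2 + 4^2))
```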

8 Data Type: Logical A logical value is a primitive data type that can be either TRUE or FALSE. The most commonly used logical operators are and, or, and not, represented by &, | and !, respectively.
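
The three operators (& for and, | for or, ! for not) work element by element on logical vectors. In this sketch the tmin values come from the table on slide 2, while the calm vector is a hypothetical second condition made up for illustration.

```r
tmin <- c(25, 34, 28, 52)             # Tmin column from slide 2
warm <- tmin > 30                     # FALSE TRUE FALSE TRUE
calm <- c(TRUE, TRUE, FALSE, FALSE)   # hypothetical flag, not from the slides

warm & calm   # FALSE  TRUE FALSE FALSE  (both must be TRUE)
warm | calm   #  TRUE  TRUE FALSE  TRUE  (either may be TRUE)
!warm         #  TRUE FALSE  TRUE FALSE  (negation)
```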

9 Data Type: Character A character object is used to represent string values in R. A character object is written as a collection of characters between straight double quotes (" ").
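
A brief sketch of character objects, reusing the example words from slide 4. The variable names are illustrative.

```r
animal <- "lion"
class(animal)     # "character"
nchar(animal)     # 4
toupper(animal)   # "LION"

# A character vector holds several strings; paste() joins them element-wise.
pets <- c("cat", "dog")
paste(animal, pets)   # "lion cat" "lion dog"
```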

10 Data Type: Factor Conceptually, factors are variables in R which take on a limited number of different values; such variables are often referred to as categorical variables. Factor objects can be created from character objects or from numeric objects, using the function factor.
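
A sketch of factor() applied to the two categorical columns from slide 2. The ordering Poor < Improved < Excellent is an assumption made here for illustration; the slides call the variable ordinal but do not state the order.

```r
# Nominal variable: categories with no inherent order.
wind <- factor(c("Type1", "Type2", "Type1", "Type1"))
levels(wind)   # "Type1" "Type2"
table(wind)    # counts per level: Type1 = 3, Type2 = 1

# Ordinal variable: the level order below is assumed, not given in the slides.
health <- factor(c("Poor", "Improved", "Excellent", "Poor"),
                 levels = c("Poor", "Improved", "Excellent"),
                 ordered = TRUE)
health[2] > health[1]   # TRUE, because Improved ranks above Poor
as.integer(health)      # 1 2 3 1 (the underlying level codes)
```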

11 Data Type: Dates and Times Time series data and observational data need calendar dates and times for analysis. Dates usually arrive as character objects (a collection of characters between double quotes) and are converted to date objects before analysis, for example with the function as.Date.
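
One common way to get calendar dates into R is to convert character strings with as.Date(). The sketch below uses the dates from slide 2 and assumes they are in month/day/year order, which matches values such as 10/15/2009.

```r
# Dates from slide 2, read as month/day/year (an assumption about the format).
d <- as.Date(c("10/15/2009", "11/01/2009", "10/21/2009", "10/28/2009"),
             format = "%m/%d/%Y")

class(d)                    # "Date"
format(d[1], "%Y-%m-%d")    # "2009-10-15"
d[2] - d[1]                 # Time difference of 17 days
sort(d)                     # dates sort chronologically, unlike raw strings
```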

12 Data Type: Missing data Missing data in R appears as NA (Not Available). NA is not a string or a numeric value, but an indicator of a missing value. Recoding missing values: weather data and some other data formats often use codes such as -9999 for missing values, and these codes should be recoded to NA before analysis.
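
A minimal sketch of recoding a sentinel value to NA. The readings are hypothetical: they reuse the Tmin values from slide 2 with the second one replaced by -9999 to stand in for a missing observation.

```r
# Hypothetical readings; -9999 marks a missing observation.
tmin <- c(25, -9999, 28, 52)

# Recode the sentinel to NA so R treats it as missing.
tmin[tmin == -9999] <- NA

is.na(tmin)                # FALSE  TRUE FALSE FALSE
mean(tmin)                 # NA, because one value is missing
mean(tmin, na.rm = TRUE)   # 35, the mean of the three remaining values
```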

13 Data Type: Infinite values In R there are two functions to check for infinite values, is.finite and is.infinite. The first function returns TRUE if the number is finite; the second one returns TRUE if the number is infinite.
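
The sketch below applies both functions to an illustrative vector. Note that NA and NaN are neither finite nor infinite, so both functions return FALSE for them.

```r
x <- c(1, Inf, -Inf, NA, NaN, 1 / 0)   # 1 / 0 evaluates to Inf

is.finite(x)     #  TRUE FALSE FALSE FALSE FALSE FALSE
is.infinite(x)   # FALSE  TRUE  TRUE FALSE FALSE  TRUE
```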

14 Let's talk about data structures

15 Data Structure Data structures are the ways of organizing data in R. Data structures play a central role in data analysis using R. You interact with data structures even more often than with functions written in R. R has a wide variety of data structures, including vectors (numeric, character, and logical), matrices, arrays, data frames, and lists.

16 Data Structure: Vectors Vectors can be thought of as contiguous cells containing data. Vectors are one-dimensional arrays that can hold numeric data, character data, or logical data. In the slide's example, a is a numeric vector, b is a character vector, and c is a logical vector.
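
The code that defined a, b, and c on the slide is not part of this transcript, so the values below are assumed purely for illustration. The sketch shows one vector of each kind, built with c().

```r
# Values are assumed; the slide's original definitions were not transcribed.
a <- c(1, 2, 5, 3, 6, -2, 4)                   # numeric vector
b <- c("one", "two", "three")                  # character vector
c <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE)   # logical vector

class(a)   # "numeric"
class(b)   # "character"
class(c)   # "logical"

# Square brackets select elements by position. Calling c() still works here
# because R looks for a function when a name is used as a call.
a[c(2, 4)]   # 2 3
length(b)    # 3
```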

17 Data Structure: Matrices and arrays Matrices are nothing more than 2-dimensional vectors. Arrays are a natural extension of matrices: they are similar but can have more than two dimensions. Arrays are among the most efficient data structures for inserting, retrieving, and indexing data.
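
A small sketch of the step from matrix to array, with arbitrary values. Both are filled column by column, and both are indexed with one position per dimension.

```r
# A 2 x 3 matrix, filled column by column with 1..6.
m <- matrix(1:6, nrow = 2, ncol = 3)
dim(m)      # 2 3
m[1, 2]     # 3  (row 1, column 2)
m[2, 3]     # 6  (row 2, column 3)

# A 2 x 3 x 4 array: the same idea with a third dimension.
arr <- array(1:24, dim = c(2, 3, 4))
dim(arr)       # 2 3 4
arr[2, 3, 4]   # 24 (the last element)
```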

18 Data Structure: Data Frame A data frame is more general than a matrix in that different columns can contain different modes of data (numeric, character, etc.).
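
As a sketch, the table from slide 2 can be stored as a data frame in which each column keeps its own type. The column names, the month/day/year date format, and the level order of crop_health are assumptions made for illustration.

```r
obs <- data.frame(
  id          = 1:4,
  date        = as.Date(c("10/15/2009", "11/01/2009", "10/21/2009", "10/28/2009"),
                        format = "%m/%d/%Y"),
  tmin        = c(25, 34, 28, 52),
  wind_type   = factor(c("Type1", "Type2", "Type1", "Type1")),
  crop_health = factor(c("Poor", "Improved", "Excellent", "Poor"),
                       levels = c("Poor", "Improved", "Excellent"),
                       ordered = TRUE)   # level order is assumed
)

str(obs)                 # one line per column, showing its type
dim(obs)                 # 4 5
mean(obs$tmin)           # 34.75
obs[obs$tmin > 30, ]     # rows 2 and 4
```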

19 Data Structure: List Lists are the most complex of the R data structures. A list allows you to gather a variety of (possibly unrelated) objects under one name.
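
A minimal sketch of a list holding objects of different kinds and lengths. The list name, its element names, and the value "Site A" are hypothetical.

```r
# All names and values here are illustrative.
station <- list(
  name  = "Site A",                 # a single string
  tmin  = c(25, 34, 28, 52),        # a numeric vector
  flags = c(TRUE, FALSE),           # a logical vector
  grid  = matrix(1:4, nrow = 2)     # a matrix
)

length(station)        # 4
names(station)         # "name" "tmin" "flags" "grid"
station$name           # "Site A"
station[["tmin"]][2]   # 34
```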

20 Overview of Data Structures

21 Questions? Please practice the commands in R.

