Analysis of the results of the experiences conducted by FAO in the use of GPS for crop area measurement Elisabetta Carfagna


Analysis of the results of the experiences conducted by FAO in the use of GPS for crop area measurement Elisabetta Carfagna University of Bologna, Department of Statistics FAO Addis Abeba, 27 – 28 November 2008

Aim of the experiences
Project GCP/INT/903/FRA, Statistics Division of FAO
Pilot surveys in Cameroon, Niger, Madagascar and Senegal
Aim:
– Assessing whether areas can be measured on the ground with good accuracy using a standard GPS receiver

Main characteristics of the data set
The sample of 207 plots was not selected with a statistical sampling technique (purposive sample)
Kinds of GPS used:
– Garmin 12 xl (G12)
– Garmin 72 (G72)
– Garmin 60 (G60)
– Garmin Etrex Ventura (GE)
– Magellan Explorist 400 (M400)

Countries
Cameroon: 36 plots measured with G60, G72 and M400
Niger: 46 plots measured with G12, G72 and M400 (45 with GE)
Senegal: 75 plots measured with G60, G72 and M400
Madagascar: 86 plots allocated, but GPS measurements are available for only 50 plots: 24 plots with GE only and 26 plots with G72 only
– Other measurements not available due to technical difficulties, for example problems with the signal?
Including the 50 plots in Madagascar: 207 plots
In many cases, the measurement with a given kind of GPS was repeated three times

Size of plots and tree canopy cover
Size of plots ranges from 27 to 34,700 square meters, median , mean , standard deviation
Tree canopy cover:
– 28 plots have dense cover (cover = 1)
– 5 plots have partial cover (cover = 2)
– 124 plots have no cover (cover = 3)
– Total 157 (no information about tree canopy cover is reported for Madagascar)
Partial tree canopy cover is barely represented; thus we cannot assess whether this kind of cover affects GPS measurements.

Tree canopy cover by country
Cameroon
– 28 plots with dense cover, measured with G60, G72 and M400
– 5 plots with partial cover, measured with G60, G72 and M400
– 3 plots with no cover, measured with G60, G72 and M400
Niger
– no plots with dense or partial cover
– 46 plots with no cover, measured with G12, G72 and M400 (45 with GE)
Senegal
– no plots with dense or partial cover
– 75 plots with no cover, measured with G60, G72 and M400

Weather conditions
– 18 plots measured in cloudy weather (climate 1)
– 5 plots measured in rain (climate 2)
– 182 plots measured in sunny weather (climate 3)
We cannot say much about weather conditions other than sunny

Position
Information on position is available for 205 plots:
– 172 on a plain (position 4)
– 5 on the top of a hill (position 1)
– 11 on the side of a hill (position 2)
– 17 at the foot of a hill (position 3)
Our conclusions are therefore valid almost only for plots on a plain

Position by country
Cameroon
– all 36 plots on a plain
Madagascar
– 4 plots G72, 1 GE on the top of a hill
– 5 plots G72, 6 GE on the side of a hill
– 8 plots G72, 9 GE at the foot of a hill
– 8 plots G72, 9 GE on a plain
Niger
– all 46 plots on a plain, measured with G12, G72 and M400 (45 with GE)
Senegal
– all 75 plots on a plain, measured with G60, G72 and M400

Summary statistics, whole data set
Difference between the area measured with meter and compass and the area measured with GPS
– Mean 98 square meters, median 68
Relative difference = the difference divided by the real measure
– Mean 8.3%, median 3.7%
– The area measured with compass and meter is generally larger than the area measured with GPS
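
The two quantities defined on this slide can be sketched in a few lines of Python. This is only an illustration: it assumes, as the later slides suggest, that the compass-and-meter area is treated as the real measure and that the difference is taken as compass minus GPS. The function name and the sample areas are hypothetical and are not data from the FAO surveys.

```python
from statistics import mean, median


def differences(compass_m2, gps_m2):
    """Return per-plot differences and relative differences.

    difference          = compass area - GPS area
    relative difference = difference / compass area
    (assumes the compass-and-meter area is the reference measure)
    """
    diffs = [c - g for c, g in zip(compass_m2, gps_m2)]
    rel = [d / c for d, c in zip(diffs, compass_m2)]
    return diffs, rel


# Hypothetical areas in square meters, for illustration only.
compass = [1000.0, 2500.0, 400.0]
gps = [950.0, 2450.0, 380.0]

diffs, rel = differences(compass, gps)
print("difference: mean", mean(diffs), "median", median(diffs))
print("relative difference: mean", mean(rel), "median", median(rel))
```

Reporting both the mean and the median, as the slide does, is useful because a few badly measured plots can pull the mean well above the median.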

Estimate the real measures through the measures made by GPS with a linear regression model
R-squared =
Prob > F =
      | Coef.  Std. Err.  t  P>|t|  [95% Conf. Interval]
super |
cons  |
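
A simple linear regression of this kind (real measure regressed on the GPS measure) can be sketched as follows. This is a minimal ordinary least squares illustration under stated assumptions, not the presentation's own computation: the function name `fit_line` and the sample data are hypothetical, and only the slope, intercept and R-squared are computed, not the standard errors or confidence intervals of the original output.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - residual sum of squares / total sum of squares
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical areas in square meters, for illustration only.
gps_area = [380.0, 950.0, 1480.0, 2450.0, 5100.0]      # GPS measure
compass_area = [400.0, 1000.0, 1500.0, 2500.0, 5200.0]  # real measure

slope, intercept, r2 = fit_line(gps_area, compass_area)
print(f"slope={slope:.4f} intercept={intercept:.2f} R-squared={r2:.4f}")
```

A slope close to 1 and an intercept close to 0 would indicate that the GPS measure tracks the real measure without systematic bias; R-squared indicates how tightly the points follow the fitted line.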

How precise can the measure of a plot area given by the GPS receiver be? It depends on the tree canopy cover

Plots   all cover   dense cover   no cover
5%      0.7%        3.0%          0.6%
10%     1.4%        5.3%          1.2%
25%     3.6%        17.8%         3.1%
50%     10.5%       37.2%         8.7%

Does the plot size affect the precision of measurements? No evidence
[Figures: Difference; Relative difference]

Does the plot size affect the precision of measurements? 3 clusters
Let us identify clusters of plots with similar area

Does the plot size affect the precision of measurements? 3 clusters
Cluster 1, medium size: Obs  Mean  Std. Dev.  Min  Max
Cluster 2, small size: Obs  Mean  Std. Dev.  Min  Max
Cluster 3, large size: Obs  Mean  Std. Dev.  Min  Max

[Figure: Difference, by cluster: small size, medium size, large size]

[Figure: Relative difference, by cluster: small size, medium size, large size]

Does the type of GPS receiver have an impact on the accuracy? Yes
The Garmin 60 is the only GPS receiver used in the FAO experiences that produced almost unbiased measures.
Variable   | Obs  Mean  Std. Dev.  Min  Max  Median
s_1        |
super      |
difference |
rel_diff   |

Estimate the compass measures through the measures made by GPS with a linear regression model G
s_1   | Coef.  Std. Err.  t  P>|t|  [95% Conf. Interval]
super |
cons  |
Prob > F =
R-squared =

Estimate the compass measures through the measures made by GPS with a linear regression model
G60, medium plot size
s_1   | Coef.  Std. Err.  t  P>|t|  [95% Conf. Interval]
super |
cons  |
Prob > F =
R-squared =

Estimate the compass measures through the measures made by GPS with a linear regression model
G60, small plot size
s_1   | Coef.  Std. Err.  t  P>|t|  [95% Conf. Interval]
super |
cons  |
Prob > F =
R-squared =
Better R-squared than with plots of medium size >

G60 measurement 1
Student's t test for two variables observed on one sample
– The difference (real measure minus measure with GPS) is assumed to be normally distributed
– The variances of the two variables on the sample are assumed to be equal, although unknown
One-sample t test
Variable | Obs  Mean  Std. Err.  Std. Dev.  [95% Conf. Interval]
Differ~e |
Degrees of freedom: 108
Ho: mean(Difference) = 0
Ha: mean < 0      Ha: mean != 0      Ha: mean > 0
t =               t =                t =
P < t =           P > |t| =          P > t =
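
The test on this slide is a one-sample t test applied to the per-plot differences. A minimal sketch of the test statistic is given below; the function name and the sample data are hypothetical, and the p-values, which require the Student's t distribution, are left to a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(compass_m2, gps_m2):
    """Return (t, degrees_of_freedom) for Ho: mean(difference) = 0.

    The difference is compass area minus GPS area for each plot.
    """
    d = [c - g for c, g in zip(compass_m2, gps_m2)]
    n = len(d)
    std_err = stdev(d) / sqrt(n)  # standard error of the mean difference
    return mean(d) / std_err, n - 1


# Hypothetical areas in square meters, for illustration only.
compass = [410.0, 1020.0, 1490.0, 2530.0, 5150.0, 760.0]
gps = [400.0, 1000.0, 1500.0, 2500.0, 5100.0, 770.0]

t, dof = paired_t_statistic(compass, gps)
print(f"t = {t:.3f} with {dof} degrees of freedom")
```

With 108 degrees of freedom, as reported on the slide, the sample contains 109 paired measurements.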

Non parametric tests
Test whether the paired differences have median zero
Assumption: the differences are continuous random variables, symmetric, independent and with the same median.
Wilcoxon signed-rank test
Two-sided test:
Ho: median of compass – G60 = 0 vs. H1: median of compass – G60 different from 0
Pr(number of positive >= 55 or number of negative >= 55) = min(1, 2*Binomial(n = 106, x >= 55, p = 0.5)) =
Ho not rejected
Ho: S_1 = SUPER
z =
Prob > |z| =
Ho not rejected
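
The binomial expression on this slide, min(1, 2*Binomial(n, x >= k, p = 0.5)), can be evaluated exactly with integer arithmetic. The sketch below is an illustration under assumptions: the function name is hypothetical, and the split of the 106 non-zero differences into 55 of one sign and 51 of the other is inferred from the slide's n = 106 and x >= 55, with zero differences assumed to have been dropped.

```python
from math import comb


def sign_test_two_sided(n_positive, n_negative):
    """Two-sided sign test p-value for Ho: median difference = 0.

    Computes min(1, 2 * P(X >= k)) where X ~ Binomial(n, 0.5),
    n = n_positive + n_negative and k = max(n_positive, n_negative).
    """
    n = n_positive + n_negative
    k = max(n_positive, n_negative)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2.0 * upper_tail)


# Counts inferred from the slide (n = 106, x >= 55); an assumption.
p_value = sign_test_two_sided(55, 51)
print(f"two-sided sign test p-value = {p_value:.4f}")
```

A p-value far above the usual significance levels means that positive and negative differences are about equally frequent, which is consistent with the slide's conclusion that Ho is not rejected for the G60.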

G12
All measurements made on plots without tree canopy cover
Variable   | Obs  Mean  Std. Dev.  Min  Max  Median
s_1        |
super      |
Difference |
Rel_diff   |

G12
One-sample t test
Ho: mean(Difference) = 0
Ha: mean < 0      Ha: mean != 0      Ha: mean > 0
t =               t =                t =
P < t =           P > |t| =          P > t =
Wilcoxon signed-rank test
Ho: s_1 = super
z =
Prob > |z| =
Sign test: Ho rejected

G12 regression
S_1   | Coef.  Std. Err.  t  P>|t|  [95% Conf. Interval]
SUPER |
cons  |
R-squared =
Better R-squared than with G >
But only in Niger and without tree canopy cover

M400
Parametric and non parametric tests: Ho rejected
Regression
S_1   | Coef.  Std. Err.  t  P>|t|  [95% Conf. Interval]
SUPER |
cons  |
R-squared =
Not as good as R-squared with G <

G72
Parametric and non parametric tests: Ho rejected
Regression
s_1   | Coef.  Std. Err.  t  P>|t|  [95% Conf. Interval]
super |
cons  |
R-squared =
Not as good as R-squared with G <
Difference: mean 81, median 46
Relative difference: mean 8.3%, median 3.2%

G60 measurement by tree canopy cover

Estimate compass measures by G60 with dense tree canopy cover
s_1   | Coef.  Std. Err.  t  P>|t|  [95% Conf. Interval]
super |
_cons |
Prob > F =
Adj R-squared =

Estimate compass measures by G60 without tree canopy cover
s_1   | Coef.  Std. Err.  t  P>|t|  [95% Conf. Interval]
super |
_cons |
Prob > F =
Adj R-squared =

Does the precision of measurement improve when the measurement is repeated?
R-squared for measurements 1, 2, 3
G  G  G  GE  M

Can we trust what the plot workers declare? No
s_1  | Coef.  Std. Err.  t  P>|t|  [95% Conf. Interval]
s_11 |
cons |
R-squared =

Can we trust what the field enumerators declare? No
s_1  | Coef.  Std. Err.  t  P>|t|  [95% Conf. Interval]
s_12 |
cons |
Prob > F =
R-squared =

How much faster is it to measure areas with GPS?
– The traditional method takes a long time
– Measuring an area with a GPS takes the time needed to walk around the plot, plus possible additional manipulations
– Magellan 400: GPS measurements take more than 4 times less time than the traditional measurements
– For large plots, this ratio can go up to 17 times
– No significant differences among the receivers tested
– Magellan 400 is more complex to use, according to the field enumerators

Thank you for your kind attention