Published by Alexis McConnell. Modified over 2 years ago.

1
Estimation of Finite Population Mean Using Ranked Set Two-stage Sampling Design
By U C Sud and Dwidesh Mishra, IASRI, New Delhi

2
Introduction
The method of Ranked Set Sampling (RSS) was first introduced by McIntyre (1952) as a cost-efficient alternative to simple random sampling for situations where outside information is available that allows one to rank small sets of sampling units according to the character of interest without actually quantifying the units.
McIntyre was concerned with estimating agricultural yields, where the ranking could be done on the basis of visual inspection.
One of the strengths of the method, however, is that its implementation and performance require only that ranking be possible; they do not depend in any way on how the ranking is accomplished.

3
The Method of RSS
A basic cycle of the method involves the random selection of $m^2$ units from the population. These units are randomly partitioned into m subsets, each containing m sampling units. The members of every subset are ranked according to the character of interest.
Then the lowest ranked member is quantified from the first set, the second lowest ranked member is quantified from the second set, and so on until the highest ranked member of the last set is quantified.
This yields m quantifications from among the $m^2$ selected units. Since m is usually taken as small in order to facilitate the ranking, there may not be enough measurements for reasonable inference, so the basic cycle is repeated r times to give n = mr quantifications out of $m^2 r$ selected units.
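The cycle described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `ranked_set_sample`, the `rank_key` argument and the toy population are assumptions made here, and ranking by the true value (perfect ranking) stands in for visual inspection.

```python
import random


def ranked_set_sample(population, m, r, rank_key=None, rng=None):
    """Return the n = m * r quantified units of a balanced ranked set sample.

    population : list of units (plain numbers in this sketch)
    m          : set size
    r          : number of cycles
    rank_key   : function used to rank the units inside a set; by default the
                 unit's own value is used, i.e. perfect ranking (an assumption)
    """
    rng = rng or random.Random()
    rank_key = rank_key or (lambda unit: unit)
    quantified = []
    for _ in range(r):
        # One cycle: select m * m units at random and split them into m sets.
        # Cycles are drawn independently of each other in this sketch.
        units = rng.sample(population, m * m)
        sets = [units[i * m:(i + 1) * m] for i in range(m)]
        for i, unit_set in enumerate(sets):
            ranked = sorted(unit_set, key=rank_key)
            # Quantify the (i + 1)-th lowest ranked unit of the (i + 1)-th set.
            quantified.append(ranked[i])
    return quantified


population = list(range(1, 1001))  # hypothetical population of 1000 units
sample = ranked_set_sample(population, m=3, r=4, rng=random.Random(0))
print(len(sample))  # 12 quantifications out of 3 * 3 * 4 = 36 selected units
```

Only the unit picked from each set would actually be measured in the field; the other units in the set are used for ranking alone.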

4
Let us take a set size m = 3 with r = 4.
Then the sampling scheme can be shown by the following diagram.
[Diagram: rows are cycles, columns are ranks]
Here each row indicates a judgement ordered sample for each cycle. Encircled units are quantified. Out of 36 units drawn, 12 units have been quantified.

5
Contd.
Let $X_{11}, X_{12}, \dots, X_{1m}, X_{21}, X_{22}, \dots, X_{2m}, \dots, X_{m1}, \dots, X_{mm}$ be independent random variables all having the same cumulative distribution function $F(x)$. Also let $X_{i(1)}, X_{i(2)}, \dots, X_{i(m)}$ denote the corresponding order statistics of $X_{i1}, X_{i2}, \dots, X_{im}$ $(i = 1, 2, \dots, m)$.
Then $X_{1(1)}, X_{2(2)}, \dots, X_{m(m)}$ is the ranked set sample (considering one cycle only), since $X_{i(i)}$ is the i-th order statistic in the i-th sample.
The values $X_{ij}$ for the randomly drawn units can be arranged as in the following diagram:
$$
\begin{array}{c|cccc}
\text{Set} & & & & \\ \hline
1 & X_{11} & X_{12} & \cdots & X_{1m} \\
2 & X_{21} & X_{22} & \cdots & X_{2m} \\
\vdots & \vdots & \vdots & & \vdots \\
m & X_{m1} & X_{m2} & \cdots & X_{mm}
\end{array}
$$

6
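Contd.
After ranking the units appear as:
$$
\begin{array}{c|cccc}
\text{Set} & & & & \\ \hline
1 & X_{1(1)} & X_{1(2)} & \cdots & X_{1(m)} \\
2 & X_{2(1)} & X_{2(2)} & \cdots & X_{2(m)} \\
\vdots & \vdots & \vdots & & \vdots \\
m & X_{m(1)} & X_{m(2)} & \cdots & X_{m(m)}
\end{array}
$$
The quantified units appear as:
$$
\begin{array}{c|cccc}
\text{Set} & & & & \\ \hline
1 & X_{1(1)} & & & \\
2 & & X_{2(2)} & & \\
\vdots & & & \ddots & \\
m & & & & X_{m(m)}
\end{array}
$$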

7
Examples
RSS is very useful in environmental and ecological sampling where exact measurement (or quantification) of a selected unit is either difficult or expensive in terms of time, money or labor, but where ranking of a small set of selected units according to the characteristic of interest can be done with reasonable success on the basis of visual inspection or other rough method not requiring actual measurement.
Thus if the interest lies in estimating the mean height of the sampled trees, then measurement of the height of the trees could pose a problem, but it would be relatively easy to rank small sets of trees on the basis of visual inspection.
In situations where visual inspection is not directly available, ranking can be done on the basis of a covariate that is more accessible and also correlated with the character of interest.
Thus for estimating volume of trees one can carry out ranking on the basis of diameter of the trees.

8
Theory of RSS
Performance of the RSS estimator is generally benchmarked against that of the simple random sampling (SRS) estimator with the same number of quantifications. For this purpose, one may employ either the relative precision,
$$RP = \frac{\operatorname{var}(\bar{X}_{SRS})}{\operatorname{var}(\bar{X}_{RSS})},$$
or the relative savings,
$$RS = 1 - \frac{1}{RP} = \frac{\operatorname{var}(\bar{X}_{SRS}) - \operatorname{var}(\bar{X}_{RSS})}{\operatorname{var}(\bar{X}_{SRS})}.$$
There was little follow-up on McIntyre's (1952) proposal until the late 1960s, when Halls and Dell (1966) published a field evaluation and Takahasi and Wakimoto (1968) developed the statistical theory for the RSS method. When sampling is from a continuous population and the ranking is perfect, Takahasi and Wakimoto proved that the RSS sample mean $\bar{X}_{RSS}$ is unbiased for the population mean $\mu$ and is at least as efficient as the SRS sample mean $\bar{X}_{SRS}$ based on the same number of quantifications.
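A small Monte Carlo experiment can illustrate this comparison. The sketch below is not taken from the presentation: it assumes a standard normal population, perfect ranking, and the usual definitions of relative precision (variance of the SRS mean divided by variance of the RSS mean, both with n = mr quantifications) and relative savings (one minus the reciprocal of the relative precision). All function names are hypothetical.

```python
import random
import statistics


def rss_mean(rng, m, r):
    """Mean of a balanced ranked set sample from N(0, 1), perfect ranking."""
    values = []
    for _ in range(r):
        for i in range(m):
            ranked = sorted(rng.gauss(0.0, 1.0) for _ in range(m))
            values.append(ranked[i])  # i-th order statistic of the i-th set
    return statistics.fmean(values)


def srs_mean(rng, n):
    """Mean of a simple random sample of n units from N(0, 1)."""
    return statistics.fmean([rng.gauss(0.0, 1.0) for _ in range(n)])


def relative_precision(m, r, reps=20000, seed=1):
    """Monte Carlo estimates of (RP, RS) for set size m and r cycles."""
    rng = random.Random(seed)
    var_srs = statistics.pvariance([srs_mean(rng, m * r) for _ in range(reps)])
    var_rss = statistics.pvariance([rss_mean(rng, m, r) for _ in range(reps)])
    rp = var_srs / var_rss
    return rp, 1.0 - 1.0 / rp


rp, rs = relative_precision(m=3, r=1)
print(f"RP ~ {rp:.2f}, RS ~ {rs:.2f}")
```

For this setting the estimated relative precision should come out above 1, in line with the result that RSS is at least as efficient as SRS; the exact figure depends on the seed and on the number of replications.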

9
Contd.
They also obtained the variance of the RSS estimator as
$$\operatorname{var}(\bar{X}_{RSS}) = \frac{1}{mr}\left[\sigma^2 - \frac{1}{m}\sum_{i=1}^{m}\left(\mu_{(i:m)} - \mu\right)^2\right],$$
where $\sigma^2$ is the population variance, $\mu$ is the population mean and $\mu_{(i:m)}$ is the expected i-th out of m order statistic from the population. They also established the bound on the relative precision RP,
$$1 \le RP \le \frac{m+1}{2},$$
or, in terms of the relative savings RS,
$$0 \le RS \le \frac{m-1}{m+1}.$$
The upper bound indicates that ranked set sampling can result in very substantial savings when compared with simple random sampling. Specifically, the method can result in savings in the number of quantifications by as much as 33, 50, 60, 67 percent when m = 2, 3, 4, 5 respectively.
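The quoted percentages can be checked with a few lines of arithmetic. The sketch assumes the bound in its usual form, relative precision at most (m + 1)/2, so that the relative savings, one minus the reciprocal of the relative precision, is at most (m - 1)/(m + 1). The function name is hypothetical.

```python
from fractions import Fraction


def max_relative_savings(m):
    """Largest relative savings implied by a relative precision of (m + 1) / 2."""
    max_rp = Fraction(m + 1, 2)
    return 1 - 1 / max_rp  # simplifies to (m - 1) / (m + 1)


for m in (2, 3, 4, 5):
    print(m, round(100 * float(max_relative_savings(m))))
# prints 33, 50, 60 and 67 (percent) for m = 2, 3, 4, 5
```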

10
Review
Stokes (1977) considered the use of a concomitant variable at the estimation stage in the context of RSS.
Stokes (1980) dealt with the problem of estimation of the population variance.
Dell and Clutter (1972) considered the problem of ranking errors.
Yu and Lam (1997) developed a regression estimator for RSS.

11
RSS in the Context of Finite Population Sampling
Early developments in RSS were concerned with sampling from an infinite population. Patil et al. (1994) were the first to consider the situation of sampling from a finite population. Explicit expressions were obtained for the variance of the RSS estimator and for its precision relative to that of simple random sampling without replacement. Krishna (2002) extended the theory of RSS to the case of sampling from a finite population by utilising a Horvitz-Thompson estimator for the estimation of the finite population mean.
Calculation of […] is tedious.

12
RSS for Two-Stage Sampling Designs
However, the contributions made by Patil et al. (1994) and Krishna (2002) were limited to the case of uni-stage sampling designs.
Three different cases have been studied. In the first case, SRS is used at the 1st stage of sampling and RSS at the 2nd stage. In the second case, RSS is used at the 1st stage and SRS at the 2nd stage. In the third case, RSS is used at both stages of sampling. In each case, the efficiency of the RSS based estimators has been compared, with the help of real data, with that of the SRS based estimators obtained when SRS is used at both stages of sampling.
Let there be a finite population of N primary stage units (psus), where the a-th psu is of size M. Let […] be the value of the b-th secondary stage unit (ssu) of the a-th psu.

13
Contd.
[…] = mean per ssu in the a-th psu
[…] = population mean
Case 1: SRS at first stage and RSS at second stage
Let a sample of size n be drawn from N by SRSWOR. Also, let a set of size m be selected at random and without replacement from M using RSS. Without any loss of generality we assume that […]

14
Case 1: SRS at first stage and RSS at second stage
Define the event […] such that the k-th ranked unit in the subset is the s-th ranked unit in the population of ssus. Also write […], and let […] denote the […]-dimensional column vector having […] as its s-th component.

15
Contd.
It may be noted that […] is given by […]
If […] is the quantification of the k-th ranked unit from the set, then […]
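The expressions on this slide did not survive, but the probability involved can be illustrated with a standard counting argument. Assuming perfect ranking and a set of size m drawn by SRSWOR from M ranked units, the chance that the k-th ranked unit of the set has population rank s is C(s-1, k-1) C(M-s, m-k) / C(M, m). The sketch below uses that form; the function names and the four ssu values are hypothetical.

```python
from math import comb


def rank_probabilities(M, m, k):
    """P(k-th ranked unit of a size-m SRSWOR set has population rank s), s = 1..M."""
    total = comb(M, m)
    return [comb(s - 1, k - 1) * comb(M - s, m - k) / total
            for s in range(1, M + 1)]


def expected_order_statistic(values, m, k):
    """Expected quantification of the k-th ranked unit of a set of size m."""
    ordered = sorted(values)
    probs = rank_probabilities(len(ordered), m, k)
    return sum(p * y for p, y in zip(probs, ordered))


ssu_values = [10.0, 12.0, 15.0, 19.0]  # hypothetical psu with M = 4 ssus
probs = rank_probabilities(M=4, m=2, k=1)
print(probs)  # probabilities 3/6, 2/6, 1/6, 0 for ranks 1 to 4

# Averaging the expected order statistics over k = 1..m gives the psu mean.
average = sum(expected_order_statistic(ssu_values, 2, k) for k in (1, 2)) / 2
print(average, sum(ssu_values) / len(ssu_values))
```

The last line is the finite-population counterpart of the unbiasedness property: the expected order statistics of a set average out to the mean of the units from which the set is drawn.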

16
Contd.

17
Contd.
[…] is the component wise square of […]
Next, we study the joint distribution of the order statistics from two disjoint sets. Let two disjoint sets each of size m be drawn without replacement from […]. Write […] for the event that the k-th ranked unit from set 1 has rank s and the j-th ranked unit from set 2 has rank t in the population of size M. We define […]

18
Contd.
Following Patil et al. (1994), it may be seen that […]
Let […] be the matrix with […] as its (s,t)th component. Notice that […], since […]. Let […] and […] be the quantifications of the k-th and j-th ranked units from set 1 and set 2, respectively. Then, […]
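The closed-form expression for these joint probabilities was not preserved here. For small populations the same quantities can be obtained by brute-force enumeration of all ordered pairs of disjoint sets, which is the technique used in the sketch below in place of the formula. The function name is hypothetical, perfect ranking is assumed, and the approach is only practical for small M.

```python
from itertools import combinations


def joint_rank_matrix(M, m, k, j):
    """Joint rank probabilities for two disjoint sets, by enumeration.

    Entry [s - 1][t - 1] is the probability that the k-th ranked unit of
    set 1 has population rank s and the j-th ranked unit of set 2 has
    population rank t, when two disjoint sets of size m are drawn without
    replacement from units ranked 1..M (requires M >= 2 * m).
    """
    counts = [[0] * M for _ in range(M)]
    n_pairs = 0
    ranks = range(1, M + 1)
    for set1 in combinations(ranks, m):
        rest = [u for u in ranks if u not in set1]
        for set2 in combinations(rest, m):
            s = set1[k - 1]  # combinations() yields each set in sorted order
            t = set2[j - 1]
            counts[s - 1][t - 1] += 1
            n_pairs += 1
    return [[c / n_pairs for c in row] for row in counts]


P = joint_rank_matrix(M=6, m=2, k=1, j=1)
print(sum(map(sum, P)))               # total probability
print([P[s][s] for s in range(6)])    # diagonal entries
```

Because the two sets are disjoint, the same unit cannot be picked from both, so every diagonal entry is zero.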

19
Contd.
The covariance between […] is given by […]

20
Contd.
Let mr sets, each of size m, be selected randomly using RSS and without replacement from the a-th psu. Let the lowest ranked unit be quantified in each of the first r sets: […]
In each of the next r sets, the second ranked unit is quantified to give: […]
This process continues until the highest ranked unit is quantified in each of the last r sets: […]

21
Contd.
Theorem 1. The estimator […] is unbiased and the variance of […] is given by […]
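The estimator itself is not legible on this slide. One natural form, assumed here purely for illustration, is the average over the n selected psus of the mean of the mr quantified ssus in each psu. The sketch follows the Case 1 procedure (SRSWOR of psus, then mr disjoint ranked sets within each selected psu) with perfect ranking, equal psu sizes and hypothetical data and function names.

```python
import random


def rss_within_psu(ssu_values, m, r, rng):
    """Quantify m * r ssus from one psu using m * r disjoint sets of size m."""
    # Needs at least m * m * r ssus in the psu.
    units = rng.sample(ssu_values, m * m * r)
    sets = [units[i * m:(i + 1) * m] for i in range(m * r)]
    quantified = []
    for idx, unit_set in enumerate(sets):
        k = idx // r  # first r sets give rank 1, the next r sets rank 2, ...
        quantified.append(sorted(unit_set)[k])  # perfect ranking assumed
    return quantified


def case1_estimate(population, n, m, r, rng=None):
    """SRSWOR of n psus at stage 1, RSS of ssus at stage 2 (assumed estimator)."""
    rng = rng or random.Random()
    selected_psus = rng.sample(population, n)
    psu_means = []
    for ssu_values in selected_psus:
        quantified = rss_within_psu(ssu_values, m, r, rng)
        psu_means.append(sum(quantified) / len(quantified))
    return sum(psu_means) / n


# Hypothetical population: 9 psus with 4 ssus each (not the data of the study).
population = [[10 + 2 * a + b for b in range(4)] for a in range(9)]
estimate = case1_estimate(population, n=3, m=2, r=1, rng=random.Random(42))
print(estimate)
```

Averaging this estimate over many repeated draws gives a value close to the population mean of the hypothetical data (19.5), which is what unbiasedness would lead one to expect.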

22
Proof of the results
The matrix […] is symmetric with zeroes on the diagonal; it is calculated by […]
A program was written in Turbo C to calculate T.
Proof: To prove that the estimator is unbiased, we proceed as follows: […]

23
Contd.

24
Contd.
After centering […]

25
Contd.

26
Case 2: RSS at first stage and SRS at second stage
Assume that a sample of size m is selected by SRSWOR from the a-th psu, a = 1, 2, …, N. Further, we assume that a set of size n is selected from N by RSS. Also, as in Case 1, we assume that the psus are arranged in increasing order.
Define the event […] such that the a-th ranked unit in the subset is the s-th ranked unit in the population of psus. Define […] to be the row vector having

27
Contd.
[…] as its s-th component, s = 1, 2, …, N; a = 1, 2, …, n.
[…] = sample mean for the a-th psu.

28
Contd.
To study the joint distribution of the order statistics from disjoint sets, each of size n, drawn without replacement using RSS, let […] be the event that the a-th ranked unit from set 1 has rank s in the population and the c-th ranked unit from set 2 has rank t in the population.

29
Contd.
Let […] and […] be the quantifications of the a-th and c-th ranked units from set 1 and set 2, respectively. Then, […]
Moments of the estimator of population mean:
Let nr sets, each of size n, be selected randomly and without replacement from a population of N psus. Let the lowest ranked unit be quantified in each of the first r sets: […]

30
Contd.
Similarly, in each of the next r sets, the second ranked unit is quantified to give: […]
This process continues until the highest ranked unit is quantified in each of the last r sets: […]
Thus, the proposed estimator of the population mean, when the sample at the first stage is selected by RSS and at the second stage by SRS, is given by […]
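Since the estimator's formula is missing above, the following sketch assumes one plausible form: the average of the nr second-stage sample means, one for each psu picked through ranking. Psus are ranked here by their true means, which stands in for whatever judgement or auxiliary ranking would be used in practice. The function name, the `rank_key` argument and the data are hypothetical.

```python
import random


def case2_estimate(population, n, r, m, rank_key=None, rng=None):
    """RSS of psus at stage 1, SRSWOR of m ssus at stage 2 (assumed estimator)."""
    rng = rng or random.Random()
    # Perfect ranking by the true psu mean is an assumption of this sketch.
    rank_key = rank_key or (lambda psu: sum(psu) / len(psu))
    # n * r disjoint sets of n psus each; needs at least n * n * r psus.
    psus = rng.sample(population, n * n * r)
    sets = [psus[i * n:(i + 1) * n] for i in range(n * r)]
    sample_means = []
    for idx, psu_set in enumerate(sets):
        a = idx // r  # first r sets give rank 1, the next r sets rank 2, ...
        chosen_psu = sorted(psu_set, key=rank_key)[a]
        ssus = rng.sample(chosen_psu, m)  # SRSWOR within the chosen psu
        sample_means.append(sum(ssus) / m)
    return sum(sample_means) / (n * r)


# Hypothetical population: 9 psus with 4 ssus each (not the data of the study).
population = [[10 + 2 * a + b for b in range(4)] for a in range(9)]
estimate = case2_estimate(population, n=3, r=1, m=2, rng=random.Random(7))
print(estimate)
```

With N = 9 psus, n = 3 and r = 1, the three ranked sets use up all nine psus, which is the smallest layout in which the without-replacement requirement can be met.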

31
Case 3: RSS at both the stages
On the same lines as in Case 1, it can be shown that […] is unbiased and that the variance of […] = […] + […]

32
3. Empirical Study
For the purpose of comparing the RSS and the SRS based estimators, an empirical study was carried out in which a part of the data on wheat crop for an experimental station, as given in Singh et al. (1979), was taken (Set I). The data comprised 9 fields, each field having 4 plots. (The population values of […] were […] and […] respectively.) For the RSS protocol, plots in each field were ranked according to the perceived weight of wheat yield. Using these data, estimators of the population mean based on RSS and SRS were considered for the three cases dealt with earlier.

33
Another data set, given in Singh and Mangat (1996), on outstanding loans of farmers affiliated to cooperatives was utilized to compare the performance of RSS and SRS based estimators (Set II). (The population values of […] were […] and […] respectively.) The data comprised 9 blocks and 4 societies in each block.
Finally, data on the number of persons in a household, given in Raj (1971), were also utilized to compare the performance of RSS and SRS based estimators (Set III). (The population values of […] were 7052 and […] respectively.) Here also the data comprised 9 households and 4 persons in a household.

34
Table 2.1 Per cent gain in precision of RSS based estimators over SRS based estimators
Columns: Case, Stage, Design, Estimator, S.E. of the estimator, Per cent gain in precision

Set I
Case  Design at stage 1  Design at stage 2
1     SRS                RSS
2     RSS                SRS
3     RSS                RSS
4     SRS                SRS

35
Set II
Case  Design at stage 1  Design at stage 2
1     SRS                RSS
2     RSS                SRS
3     RSS                RSS
4     SRS                SRS

36
Set III
Case  Design at stage 1  Design at stage 2
1     SRS                RSS
2     RSS                SRS
3     RSS                RSS
4     SRS                SRS

37
References:
Dell, T.R. and Clutter, J.L. (1972). Ranked set sampling theory with order statistics background. Biometrics, 28.
Halls, L.K. and Dell, T.R. (1966). Trial of ranked set sampling for forage yields. Forest Science, 12.
Krishna, Pravin (2002). Some aspects of ranked set sampling from finite population. M.Sc. Thesis, I.A.R.I., New Delhi-12.
McIntyre, G.A. (1952). A method of unbiased selective sampling using ranked sets. Australian Journal of Agricultural Research, 3.
Patil, G.P., Sinha, A.K. and Taillie, C. (1993). Ranked set sampling from a finite population in the presence of a trend on a site. Journal of Applied Statistical Science, Vol. 1, No. 1.
Patil, G.P., Sinha, A.K. and Taillie, C. (1994). Ranked set sampling. Handbook of Statistics, 12 (eds. Patil, G.P. and Rao, C.R.), North-Holland, Amsterdam.
Patil, G.P., Sinha, A.K. and Taillie, C. (1995). Finite population corrections for ranked set sampling. Annals of the Institute of Statistical Mathematics, Vol. 47, No. 4.

38
Raj, D. (1971). The Design of Sample Surveys. McGraw-Hill Book Co., New York.
Singh, D., Singh, P. and Kumar, P. (1979). Hand Book on Sampling Methods. Indian Agricultural Statistics Research Institute, New Delhi.
Singh, R. and Mangat, N.P.S. (1996). Elements of Survey Sampling. Kluwer Academic Publishers, pp 388.
Stokes, S.L. (1977). Ranked set sampling with concomitant variables. Communications in Statistics, Theory and Methods, 6.
Stokes, S.L. (1980). Estimation of variance using judgement ordered ranked set samples. Biometrics, 36.

39
Takahasi, K. and Wakimoto, K. (1968). On unbiased estimates of the population mean based on the sample stratified by means of ordering. Annals of the Institute of Statistical Mathematics, 20.
Yu, Philip L.H. and Lam, K. (1997). Regression estimator in ranked set sampling. Biometrics, 53.

40
THANKS
