Presentation on theme: "Estimation of Finite Population Mean Using Ranked Set Two-stage Sampling Design By U C Sud and Dwidesh Mishra IASRI, New Delhi-110012."— Presentation transcript:
1 Estimation of Finite Population Mean Using Ranked Set Two-stage Sampling Design By U C Sud and Dwidesh Mishra IASRI, New Delhi
2 Introduction
The method of Ranked Set Sampling (RSS) was first introduced by McIntyre (1952) as a cost-efficient alternative to simple random sampling for situations where outside information is available that allows one to rank small sets of sampling units according to the character of interest without actually quantifying the units. McIntyre was concerned with estimating agricultural yields, where the ranking could be done on the basis of visual inspection. One of the strengths of the method, however, is that its implementation and performance require only that ranking be possible; they do not depend in any way on how the ranking is accomplished.
3 The Method of RSS
A basic cycle of the method involves the random selection of $m^2$ units from the population. These units are randomly partitioned into m subsets, each containing m sampling units. The members of every subset are ranked according to the character of interest. Then the lowest ranked member is quantified from the first set, the second lowest ranked member is quantified from the second set, and so on until the highest ranked member of the last set is quantified. This yields m quantifications from among the $m^2$ selected units. Since m is usually taken to be small in order to facilitate the ranking, there may not be enough measurements for reasonable inference, so the basic cycle is repeated r times to give n = mr quantifications out of $m^2 r$ selected units.
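The cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program: the function name `ranked_set_sample`, the `rank_key` argument (standing in for visual inspection or a covariate) and the choice to draw all $m^2 r$ units at once without replacement are assumptions made here.

```python
import random

def ranked_set_sample(population, m, r, rank_key=None, rng=None):
    """Return the n = m*r quantified units of a ranked set sample.

    rank_key is a hypothetical ranking function (judgement or covariate);
    by default units are ranked on their own value, i.e. perfect ranking.
    """
    rng = rng or random.Random()
    rank_key = rank_key or (lambda unit: unit)
    # Draw m*m*r units without replacement (needs len(population) >= m*m*r).
    drawn = rng.sample(population, m * m * r)
    quantified = []
    for cycle in range(r):
        cycle_units = drawn[cycle * m * m:(cycle + 1) * m * m]
        for i in range(m):
            # The i-th subset of the cycle: m randomly chosen units.
            subset = cycle_units[i * m:(i + 1) * m]
            ranked = sorted(subset, key=rank_key)
            # Quantify only the unit holding rank i+1 in the i-th subset.
            quantified.append(ranked[i])
    return quantified

# Example: m = 3, r = 4 gives 12 quantified units out of 36 drawn.
sample = ranked_set_sample(list(range(100)), m=3, r=4, rng=random.Random(1))
```

Only the returned units would need to be measured in practice; the other units in each subset are used for ranking alone.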
4 Let us take a set size m = 3 with r = 4. The sampling scheme can then be shown by a diagram in which each row is a judgement-ordered sample for one cycle and the encircled units are the quantified ones. Out of 36 units drawn, 12 units have been quantified.
[Diagram: cycles 1 to 4 against ranks 1, 2, 3]
5 Contd.
Let $X_{11}, X_{12}, \ldots, X_{1m}, X_{21}, \ldots, X_{2m}, \ldots, X_{m1}, \ldots, X_{mm}$ be independent random variables all having the same cumulative distribution function $F(x)$. Also let $X_{i(1)}, X_{i(2)}, \ldots, X_{i(m)}$ denote the order statistics of $X_{i1}, X_{i2}, \ldots, X_{im}$ $(i = 1, 2, \ldots, m)$. Then $X_{1(1)}, X_{2(2)}, \ldots, X_{m(m)}$ is the ranked set sample (considering one cycle only), since $X_{i(i)}$ is the i-th order statistic in the i-th sample. The values $X_{ij}$ for the randomly drawn units can be arranged as follows:
Set 1: $X_{11}, X_{12}, \ldots, X_{1m}$
Set 2: $X_{21}, X_{22}, \ldots, X_{2m}$
...
Set m: $X_{m1}, X_{m2}, \ldots, X_{mm}$
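6 Contd.
After ranking, the units appear as:
Set 1: $X_{1(1)}, X_{1(2)}, \ldots, X_{1(m)}$
Set 2: $X_{2(1)}, X_{2(2)}, \ldots, X_{2(m)}$
...
Set m: $X_{m(1)}, X_{m(2)}, \ldots, X_{m(m)}$
The quantified units appear as: $X_{1(1)}, X_{2(2)}, \ldots, X_{m(m)}$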
7 Examples
RSS is very useful in environmental and ecological sampling where exact measurement (or quantification) of a selected unit is either difficult or expensive in terms of time, money or labor, but where ranking of a small set of selected units according to the characteristic of interest can be done with reasonable success on the basis of visual inspection or another rough method not requiring actual measurement.
Thus, if the interest lies in estimating the mean height of the sampled trees, measuring the height of the trees could pose a problem, but it would be relatively easy to rank small sets of trees on the basis of visual inspection.
In situations where visual inspection is not directly available, ranking can be done on the basis of a covariate that is more accessible and also correlated with the character of interest. Thus, for estimating the volume of trees one can carry out ranking on the basis of the diameter of the trees.
8 Theory of RSS
Performance of the RSS estimator is generally benchmarked against that of the simple random sampling (SRS) estimator with the same number of quantifications. For this purpose, one may employ either the relative precision, $RP = \mathrm{Var}(\bar{X}_{SRS}) / \mathrm{Var}(\bar{X}_{RSS})$, or the relative savings, $RS = 1 - 1/RP$.
There was little follow-up on McIntyre's (1952) proposal until the late 1960s, when Halls and Dell (1966) published a field evaluation and Takahasi and Wakimoto (1968) developed the statistical theory for the RSS method. When sampling is from a continuous population and the ranking is perfect, Takahasi and Wakimoto proved that $\bar{X}_{RSS}$ is unbiased for the population mean $\mu$ and is at least as efficient as $\bar{X}_{SRS}$.
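9 Contd.
They also obtained the variance of the RSS estimator as
$\mathrm{Var}(\bar{X}_{RSS}) = \frac{1}{mr}\left[\sigma^2 - \frac{1}{m}\sum_{i=1}^{m}\left(\mu_{(i:m)} - \mu\right)^2\right]$
where $\sigma^2$ is the population variance, $\mu$ is the population mean and $\mu_{(i:m)}$ is the expected i-th out of m order statistic from the population. They also established the bound on the relative precision, $1 \le RP \le (m+1)/2$, or equivalently on the relative savings, $0 \le RS \le (m-1)/(m+1)$.
The upper bound indicates that ranked set sampling can result in very substantial savings when compared with simple random sampling. Specifically, the method can reduce the number of quantifications by as much as 33, 50, 60 and 67 percent when m = 2, 3, 4 and 5 respectively.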
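The quoted percentages can be checked with a short calculation. The sketch below assumes the standard Takahasi and Wakimoto upper bound on relative precision, $(m+1)/2$, which gives a maximum relative savings of $(m-1)/(m+1)$; the function names are illustrative only.

```python
from fractions import Fraction

def max_relative_precision(m):
    """Upper bound (m + 1) / 2 on the relative precision of RSS over SRS."""
    return Fraction(m + 1, 2)

def max_relative_savings(m):
    """Largest possible relative savings, 1 - 1/RP = (m - 1) / (m + 1)."""
    return 1 - 1 / max_relative_precision(m)

for m in (2, 3, 4, 5):
    savings_pct = round(100 * float(max_relative_savings(m)))
    print(m, savings_pct)  # prints 33, 50, 60, 67 for m = 2, 3, 4, 5
```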
10 Review
Stokes (1977) considered the use of a concomitant variable at the estimation stage in the context of RSS.
Stokes (1980) dealt with the problem of estimating the population variance.
Dell and Clutter (1972) considered the problem of ranking errors.
Yu and Lam (1997) developed a regression estimator for RSS.
11 RSS in the Context of Finite Population Sampling
Early developments in RSS were concerned with sampling from an infinite population.
Patil et al. (1994) were the first to consider the situation of sampling from a finite population. Explicit expressions were obtained for the variance of the RSS estimator and for its precision relative to that of simple random sampling without replacement.
Krishna (2002) extended the theory of RSS to the case of sampling from a finite population by utilising a Horvitz-Thompson estimator for the estimation of the finite population mean.
Calculation of […] is tedious.
12 RSS for Two-stage Sampling Designs
However, the contributions made by Patil et al. (1994) and Krishna (2002) were limited to the case of uni-stage sampling designs.
Three different cases have been studied. In the first case, SRS is used at the first stage of sampling and RSS at the second stage. In the second case, RSS is used at the first stage and SRS at the second stage. In the third case, RSS is used at both stages of sampling. In each case, the efficiency of the RSS based estimators has been compared, with the help of real data, with that of the SRS based estimators obtained when SRS is used at both stages of sampling.
Let there be a finite population of N primary stage units (psu's), the a-th of which is of size M. Let […] be the value of the b-th secondary stage unit (ssu) of the a-th psu.
13 Contd.
[…] = mean per ssu in the a-th psu
[…] = population mean
Case 1: SRS at first stage and RSS at second stage
Let a sample of n psu's be drawn from the N psu's by SRSWOR. Also, let a set of size m be selected at random and without replacement from the M ssu's using RSS. Without any loss of generality we assume that the ssu values within each psu are arranged in increasing order.
14 Case 1: SRS at first stage and RSS at second stage
Define the event […] such that the k-th ranked unit in the subset is the s-th ranked unit in the population of ssu's. Also write […], and let […] denote the […]-dimensional column vector having […] as its s-th component.
15 Contd.
It may be noted that […] is given by […]. If […] is the quantification of the k-th ranked unit from the set, then […]
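The slide's own expression is not available in the transcript. As a hedged illustration, the sketch below assumes it is the standard result for SRSWOR: the k-th ranked unit of a set of size m drawn from M units holds population rank s with probability $\binom{s-1}{k-1}\binom{M-s}{m-k} / \binom{M}{m}$. The function names are hypothetical; the mean and variance of the k-th ranked quantification then follow by weighting the ordered population values and their squares.

```python
from math import comb

def rank_probabilities(M, m, k):
    """P(k-th ranked unit of a size-m SRSWOR set has population rank s), s = 1..M."""
    total = comb(M, m)
    return [comb(s - 1, k - 1) * comb(M - s, m - k) / total
            for s in range(1, M + 1)]

def order_stat_moments(values, m, k):
    """Mean and variance of the k-th ranked quantification from a set of size m."""
    x = sorted(values)                      # units arranged in increasing order
    p = rank_probabilities(len(x), m, k)
    mean = sum(pi * xi for pi, xi in zip(p, x))
    second = sum(pi * xi * xi for pi, xi in zip(p, x))  # uses squared values
    return mean, second - mean ** 2

# With M = 4, m = 2: averaging the means for k = 1 and k = 2
# recovers the population mean 2.5.
mean_low, var_low = order_stat_moments([1, 2, 3, 4], m=2, k=1)
mean_high, var_high = order_stat_moments([1, 2, 3, 4], m=2, k=2)
```

The last two lines illustrate why averaging over all ranks k = 1, ..., m gives an unbiased estimate of the mean under perfect ranking.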
17 Contd.
[…] is the component-wise square of […].
Next, we study the joint distribution of the order statistics from two disjoint sets. Let two disjoint sets, each of size […], be drawn without replacement from […]. Write […] for the event that the k-th ranked unit from set 1 has rank s and the j-th ranked unit from set 2 has rank t in the population of size […]. We define […]
18 Contd.
Following Patil et al. (1994), it may be seen that […]. Let […] be the matrix with […] as its (s,t)-th component. Notice that […], since […]. Let […] and […] be the quantifications of the k-th and j-th ranked units from set 1 and set 2, respectively. Then […]
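The joint probabilities for two disjoint sets can be illustrated by brute force when the population is small. The sketch below does not use the closed-form expression of Patil et al.; it simply enumerates every ordered pair of disjoint sets of size m from M units, assuming each pair is equally likely under sampling without replacement. The function name `joint_rank_matrix` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def joint_rank_matrix(M, m, k, j):
    """Entry [s-1][t-1] is P(k-th ranked unit of set 1 has population rank s
    and j-th ranked unit of set 2 has population rank t)."""
    counts = [[0] * M for _ in range(M)]
    total = 0
    ranks = range(1, M + 1)
    for set1 in combinations(ranks, m):           # tuples come out sorted
        rest = [u for u in ranks if u not in set1]
        for set2 in combinations(rest, m):        # disjoint from set1
            counts[set1[k - 1] - 1][set2[j - 1] - 1] += 1
            total += 1
    return [[Fraction(c, total) for c in row] for row in counts]

# Lowest ranked unit from both sets, M = 6 and m = 2.
B = joint_rank_matrix(6, 2, 1, 1)
```

Because the two sets are disjoint, the same unit cannot be quantified in both, so the diagonal is zero; when k = j the matrix is also symmetric.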
20 Contd.
Let mr sets, each of size m, be selected randomly and without replacement from the a-th psu using RSS. Let the lowest ranked unit be quantified in each of the first r sets: […]
In each of the next r sets, the second ranked unit is quantified to give: […]
This process continues until the highest ranked unit is quantified in each of the last r sets: […]
21 Contd.
Theorem 1: The estimator […] is unbiased, and the variance of […] is given by […]
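The Case 1 scheme (SRSWOR of psu's, then RSS within each selected psu) can be sketched as follows. The estimator's formula is not available in the transcript, so this sketch assumes the natural form: the mean of the mr quantified values within each selected psu, averaged over the n selected psu's. Perfect ranking on the true values, equal psu sizes and the function names are also assumptions made for illustration.

```python
import random

def rss_within_psu(psu_values, m, r, rng):
    """Quantify m*r ssu's of one psu using m*r disjoint sets of size m."""
    # Needs M >= m*m*r ssu's in the psu (sets are drawn without replacement).
    units = rng.sample(psu_values, m * m * r)
    sets = [sorted(units[i * m:(i + 1) * m]) for i in range(m * r)]
    quantified = []
    for k in range(m):
        # The k-th block of r sets contributes its (k+1)-th ranked unit.
        for ranked in sets[k * r:(k + 1) * r]:
            quantified.append(ranked[k])
    return quantified

def case1_estimate(population, n, m, r, rng):
    """population is a list of N psu's, each a list of M ssu values."""
    selected = rng.sample(population, n)          # SRSWOR at the first stage
    psu_means = []
    for psu in selected:
        q = rss_within_psu(psu, m, r, rng)        # RSS at the second stage
        psu_means.append(sum(q) / len(q))
    return sum(psu_means) / n

# Example with 9 psu's of 4 ssu's each, as in the empirical study layout.
toy_population = [[4 * a + b for b in range(4)] for a in range(9)]
estimate = case1_estimate(toy_population, n=3, m=2, r=1, rng=random.Random(0))
```

With M = 4 ssu's per psu, the only feasible choice under these assumptions is m = 2 and r = 1, since $m^2 r$ units are needed in each psu.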
22 Proof of the results
The matrix […] is symmetric with zeroes on the diagonal; it is calculated by […]. A program has been written in Turbo C to calculate T.
Proof: To prove that the estimator […] is unbiased, we proceed as follows: […]
26 Case 2: RSS at first stage and SRS at second stage
Assume that a sample of size m is selected by SRSWOR from the a-th psu, a = 1, 2, …, N. Further, we assume that a set of size n is selected from the N psu's by RSS. Also, as in Case 1, we assume that the psu's are arranged in increasing order.
Define the event […] such that the a-th ranked unit in the subset is the s-th ranked unit in the population of psu's. Define […] to be the […] row vector having
27 Contd.
[…] as its s-th component, s = 1, 2, …, N; a = 1, 2, …, n.
[…] = sample mean for the a-th psu.
28 Contd.
To study the joint distribution of the order statistics from disjoint sets, each of size n, drawn without replacement using RSS, let […] be the event that the a-th ranked unit from set 1 has rank s in the population and the c-th ranked unit from set 2 has rank t in the population.
29 Contd.
Let […] and […] be the quantifications of the a-th and c-th ranked units from set 1 and set 2, respectively. Then […]
Moments of the estimator of population mean:
Let nr sets, each of size n, be selected randomly and without replacement from a population of N psu's. Let the lowest ranked unit be quantified in each of the first r sets: […]
30 Contd.
Similarly, in each of the next r sets, the second ranked unit is quantified to give: […]
This process continues until the highest ranked unit is quantified in each of the last r sets: […]
Thus, the proposed estimator of the population mean, when the sample at the first stage is selected by RSS and at the second stage by SRS, is given by […]
31 Case III: RSS at both the stages
On the same lines as in Case 1, it can be shown that […] is unbiased and the variance of […] = […] + […]
32 3. Empirical Study
For the purpose of comparing the RSS and the SRS based estimators, an empirical study was carried out using part of the wheat crop data for an experimental station given in Singh et al. (1979). The data comprised 9 fields, each having 4 plots (Set I). (The population values of […] were […] and […] respectively.)
For the RSS protocol, plots in each field were ranked according to the perceived weight of wheat yield. Using these data, estimators of the population mean based on RSS and SRS were considered for the three cases dealt with earlier.
33 Another data set, given in Singh and Mangat (1996), on outstanding loans of farmers affiliated to cooperatives was utilized to compare the performance of RSS and SRS based estimators (Set II). (The population values of […] were […] and […] respectively.) The data comprised 9 blocks with 4 societies in each block.
Finally, data on the number of persons in a household given in Raj (1971) were also utilized to compare the performance of RSS and SRS based estimators (Set III). (The population values of […] were 7052 and […] respectively.) Here also the data comprised 9 households and 4 persons in a household.
34 Per cent gain in precision
Table 2.1 Per cent gain in precision of RSS based estimators over SRS based estimators
Set I
Case   Design   S.E. of the estimator   Per cent gain in precision
1      SRS      5.39                    10.21
2      RSS      5.60                    1.85
3      -        5.33                    12.46
4      -        5.94                    -
36 Set III
Case   Design   S.E. of the estimator   Per cent gain in precision
1      SRS      0.199                   15.57
2      RSS      0.205                   12.19
3      -        0.194                   18.55
4      -        0.230                   -
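The transcript does not state how the per cent gain was computed. The Set III figures are reproduced, to within rounding of the standard errors, if the gain is taken as 100 × (S.E. of the baseline / S.E. of the RSS based estimator − 1), with row 4 (which has no gain) read as the SRS-at-both-stages baseline. Both the formula and that reading of row 4 are inferences, not statements from the source; the sketch below simply checks them against the Set III values.

```python
def percent_gain(se_baseline, se_rss):
    """Per cent gain, assumed here to be based on the ratio of standard errors."""
    return 100.0 * (se_baseline / se_rss - 1.0)

# Set III standard errors as transcribed; row 4 is taken as the baseline.
se_baseline = 0.230
se_by_case = {1: 0.199, 2: 0.205, 3: 0.194}
reported_gain = {1: 15.57, 2: 12.19, 3: 18.55}

for case, se in se_by_case.items():
    gain = percent_gain(se_baseline, se)
    # Agreement is only to about 0.01 because the S.E. values are rounded.
    print(case, round(gain, 2), reported_gain[case])
```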
37 References:
Dell, T.R. and Clutter, J.L. (1972). Ranked set sampling theory with order statistics background. Biometrics, 28.
Halls, L.K. and Dell, T.R. (1966). Trial of ranked set sampling for forage yields. Forest Science, 12.
Krishna, Pravin (2002). Some aspects of ranked set sampling from finite population. M.Sc. Thesis, I.A.R.I., New Delhi-12.
McIntyre, G.A. (1952). A method of unbiased selective sampling using ranked sets. Australian Journal of Agricultural Research, 3.
Patil, G.P., Sinha, A.K. and Taillie, C. (1993). Ranked set sampling from a finite population in the presence of a trend on a site. Journal of Applied Statistical Science, Vol. 1, No. 1.
Patil, G.P., Sinha, A.K. and Taillie, C. (1994). Ranked set sampling. Handbook of Statistics, 12 (eds. Patil, G.P. and Rao, C.R.), North-Holland, Amsterdam.
Patil, G.P., Sinha, A.K. and Taillie, C. (1995). Finite population corrections for ranked set sampling. Annals of the Institute of Statistical Mathematics, Vol. 47, No. 4.
38 Raj, D. (1971). The Design of Sample Surveys. McGraw-Hill Book Co., New York.
Singh, D., Singh, P. and Kumar, P. (1979). Hand Book on Sampling Methods. Indian Agricultural Statistics Research Institute, New Delhi.
Singh, R. and Mangat, N.P.S. (1996). Elements of Survey Sampling. Kluwer Academic Publishers, pp 388.
Stokes, S.L. (1977). Ranked set sampling with concomitant variables. Communications in Statistics, Theory and Methods, 6.
Stokes, S.L. (1980). Estimation of variance using judgment ordered ranked set samples. Biometrics, 36.
39 Takahasi, K. and Wakimoto, K. (1968). On unbiased estimates of the population mean based on the sample stratified by means of ordering. Annals of the Institute of Statistical Mathematics, 20, 1-31.
Yu, Philip L.H. and Lam, K. (1997). Regression estimator in ranked set sampling. Biometrics, 53.