Presentation on theme: "Sampling learning about..." (presentation transcript)
Why? A population is a defined group of entities that can be the subject of study. The nature of a population can vary greatly, from everyone living in a town or country to the marbles in a bag. Any statistical investigation must clearly define who, or what, is included in the population. If you wish to create a statistical profile of a population, you can either study every member of the population (a census), or study a smaller group selected at random from the population (a sample). If the sample is truly random and large enough, you can treat the statistical profile of the sample as a valid estimate of the profile of the whole population. A good example is a blood sample; its composition is assumed to be the same as that of the rest of the blood in your body.
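The difference between a census and a sample can be sketched in a few lines of Python. This is only an illustration: the population of marble weights is made up, and the names census_mean and sample_mean are hypothetical helpers, not part of the presentation.

```python
import random
import statistics

def census_mean(population):
    """Census: measure every element, giving the exact population mean."""
    return statistics.mean(population)

def sample_mean(population, n, rng):
    """Sample: measure only n elements drawn at random without replacement."""
    return statistics.mean(rng.sample(population, n))

rng = random.Random(0)  # fixed seed, used only so the sketch is repeatable
# Assumption: a made-up population of 10,000 marble weights in grams
population = [rng.gauss(5.0, 0.5) for _ in range(10_000)]

print("census mean:", round(census_mean(population), 3))
print("sample mean:", round(sample_mean(population, 500, rng), 3))
```

With a random sample of 500 marbles the two figures should be close, although the sample mean is an estimate and will vary from one draw to the next.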
First, you must define the population and the Sampling Frame, the list from which the sample will actually be drawn. For example, if the population is the registered voters in a country, the sampling frame can be the electoral register. Each frame is distinct, and defines a population that can differ from the one defined by another list, for example a telephone directory. How? The next step is to choose, and describe, the sampling method. Ideally this is a probability sampling method, as opposed to a non-probability method. What are the differences?
Non-Probability sampling methods selectively exclude elements of the population: some elements have no chance of being selected, or their chance of selection is unknown. For example, a street survey automatically excludes everyone not walking past the interviewer. Non-probability methods typically introduce significant exclusion bias, which can invalidate the conclusions of the survey. Probability sampling methods select elements of the population at random: every element has a known, non-zero chance of being selected, and in the simplest designs that chance is equal for every element. There are four main methods...
Simple Random Each element in the population is given an equal chance of being selected. By chance, the sample can over-represent a particular sub-population, but this is not an intentional bias, only an artifact of probability. A simple method is to assign each element a unique number, and to use a random number generator or similar tool to select elements for the sample. Simple Random does not guarantee that small sub-populations of special interest to the researcher will appear in the sample.
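The numbering method described on this slide might look like the following in Python. It is a minimal sketch: the function name simple_random_sample and the seed parameter are assumptions made for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n elements so that every element has an equal chance of selection."""
    rng = random.Random(seed)        # seed is optional, for repeatable draws
    ids = range(len(population))     # assign each element a unique number
    chosen = rng.sample(ids, n)      # pick n distinct numbers at random
    return [population[i] for i in chosen]

# Hypothetical sampling frame: 100 numbered register entries
frame = [f"voter-{i:03d}" for i in range(100)]
print(simple_random_sample(frame, 5, seed=42))
```

Because random.Random.sample draws without replacement, no element can be selected twice.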
Systematic Sampling Each element in the population is given an equal chance of being selected. The population is first organised, for example in numeric order, and the sample is taken at regular intervals from a randomised start point. For example, every tenth element is sampled, starting with the third element. The method is only valid if the start point is chosen at random. Systematic Sampling ensures that elements from throughout the ordered population are selected, avoiding the chance clustering that Simple Random can produce. Bias can be introduced if there is a periodic cycle within the population, as the sampling interval can coincide with the periodicity.
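As a rough sketch of the procedure on this slide, the Python below orders the population, picks a random start point within the first interval, and then takes every k-th element. The function name systematic_sample and its parameters are hypothetical.

```python
import random

def systematic_sample(population, interval, seed=None):
    """Order the population, then take every `interval`-th element
    from a randomly chosen start point within the first interval."""
    rng = random.Random(seed)
    ordered = sorted(population)        # organise, e.g. in numeric order
    start = rng.randrange(interval)     # random start index: 0 .. interval-1
    return ordered[start::interval]

# Hypothetical population: elements numbered 1 to 100
population = list(range(1, 101))
print(systematic_sample(population, interval=10, seed=7))
```

The slide's own example, every tenth element starting with the third, corresponds to the slice `sorted(population)[2::10]`, which for this population selects elements 3, 13, 23 and so on.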
Stratified Sampling Each element in the population is given an equal chance of being selected. The population is first divided into distinct sub-populations (strata) according to a common criterion, and the sample is taken by either Simple Random or Systematic Sampling from each sub-population, with each sub-sample size in proportion to the size of the sub-population relative to the population as a whole. Stratified Sampling allows for the study of specific sub-groups within a population, while also giving a statistical profile of the population as a whole. Bias can be introduced if the sample purposefully excludes sub-populations of minimal size.
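Proportional allocation, as described on this slide, could be sketched as follows. The function name stratified_sample, the key argument and the example data are assumptions for illustration, and each stratum is sampled by simple random selection.

```python
import random
from collections import defaultdict

def stratified_sample(population, key, n, seed=None):
    """Split the population into strata with `key`, then draw a simple random
    sub-sample from each stratum in proportion to the stratum's size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in population:
        strata[key(element)].append(element)

    total = len(population)
    sample = []
    for members in strata.values():
        # Proportional allocation. Rounding can make the overall size differ
        # slightly from n, and a very small stratum can round down to zero.
        k = round(n * len(members) / total)
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 people in group "a", 30 in "b", 10 in "c"
people = [("a", i) for i in range(60)] + [("b", i) for i in range(30)] + [("c", i) for i in range(10)]
chosen = stratified_sample(people, key=lambda p: p[0], n=10, seed=1)
print(sorted(chosen))
```

The comment about rounding to zero is the mechanical form of the bias the slide warns about: if very small sub-populations matter, one option is to give each stratum a minimum sub-sample size and record that choice.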
Cluster Sampling Similar in principle to Stratified Sampling, but using pre-existing sub-populations (clusters) to represent the total population. An example would be to sample a single town to represent all towns of similar size. As the wider population is inferred from the cluster, a complete sampling frame listing every element is not necessary; a list of the clusters is enough. Cluster sampling can be more convenient to perform than other whole-population methods. For example, an interviewer can interview all houses on a block, and infer to the population of the district. Bias can be introduced if the chosen clusters differ systematically from the rest of the population.
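One way to sketch the block-of-houses example in Python is shown below. This assumes a single-stage design in which whole clusters are chosen at random and every element inside a chosen cluster is studied; the function name cluster_sample and the district data are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Choose whole clusters at random and keep every element inside them.
    `clusters` maps a cluster name to the list of its elements."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # only cluster names are needed
    return {name: clusters[name] for name in chosen}

# Hypothetical district: houses grouped by block
district = {
    "block A": ["house 1", "house 2", "house 3"],
    "block B": ["house 4", "house 5"],
    "block C": ["house 6", "house 7", "house 8", "house 9"],
    "block D": ["house 10", "house 11"],
}
print(cluster_sample(district, n_clusters=2, seed=3))
```

Note that the random draw only needs the list of block names, which is why a full list of individual houses is not required up front.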
Non-Probabilistic Sampling These methods have significant bias, as only some elements of the population have a chance of being selected. Quota Sampling begins with a population organisation similar to Stratified Sampling, but a quota (a target sample size) is then stipulated for each group and filled with the first elements that become available, thereby excluding all elements that are not early in the selection order. Accidental Sampling is open to most possible forms of bias; classic examples include street surveys, which exclude everyone who is not on the right side of the street, looks too busy to talk, is actually at work, or whom the interviewer cannot be bothered to ask.
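The exclusion effect of Quota Sampling can be made concrete with a small sketch. It assumes quotas are filled in arrival order with no random selection at all; the function name quota_sample and the example data are hypothetical.

```python
def quota_sample(arrivals, key, quotas):
    """Fill each group's quota with the first matching elements to arrive.
    No random selection takes place, so late arrivals can never be chosen."""
    filled = {group: [] for group in quotas}
    for person in arrivals:
        group = key(person)
        if group in filled and len(filled[group]) < quotas[group]:
            filled[group].append(person)
    return filled

# Hypothetical passers-by, listed in the order they walk past the interviewer
passers_by = [("young", 1), ("old", 2), ("young", 3), ("young", 4), ("old", 5)]
print(quota_sample(passers_by, key=lambda p: p[0], quotas={"young": 2, "old": 1}))
```

Here passers-by 4 and 5 have no chance of selection once the quotas are full, which is the bias described on the slide.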
At All Times Document each stage of the process: define the population; name the sampling method; justify your choice of sampling method; describe how you plan to take the sample; identify sources of bias, and state how you will minimise these.