# Computing in Archaeology


Aims
To become familiar with sampling practices in an archaeological context.

Introduction to Sampling
An area of excavation is a sample of the complete site, which in itself is a sample of all sites of that type. The same goes for artefact assemblages. The essence of all sampling is to gain the maximum amount of information by measuring or testing just a part of the available material (Fletcher & Lock 2005, 66).

Archaeological sample
Sampled population Target population

Formal definitions (Fletcher & Lock 2005, 66)
Population: the whole group or set of objects about which inference is to be made.
Sampling frame: a list of the items, units or objects that could be sampled.
Variable: a characteristic which is to be measured for the units, such as weight of spearheads.

Formal definitions (Fletcher & Lock 2005, 67)
Sample: the subset or part of the population that is selected.
Sample size: the number in the sample. A sample size of 5 is considered small, while, formally, a sample size of 50 is large. The sample size may be stated as a percentage of the sampling frame, e.g. a 10% sample.

Sampling strategies
a simple random sample (a "probability sample" in US usage)
a systematic sample
a stratified sample
a cluster sample

Population: 100 units (100 obsidian spearheads), numbered 1 to 100.

A simple random number sample

Random sampling
If we have a population of 100 spearheads, we simply pick 10 random numbers between 1 and 100 (i.e. a 10% sample). Computers can help generate random sequences, but are not necessary. You must avoid bias in your selection, as a biased sample can draw scrutiny from others.
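As a rough illustration of the 10% random sample described above, the following Python sketch draws ten units without replacement from a list of 100. The catalogue numbers, the function name `simple_random_sample` and the seed value are assumptions made for this example and do not come from the slides.

```python
import random

# Hypothetical catalogue numbers for the 100 spearheads in the sampling frame.
population = list(range(1, 101))


def simple_random_sample(units, fraction, seed=None):
    """Draw a simple random sample, without replacement, of the given fraction."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    sample_size = round(len(units) * fraction)
    return sorted(rng.sample(units, sample_size))


# A 10% sample: every unit has the same chance of being selected.
sample = simple_random_sample(population, 0.10, seed=2005)
print(sample)
```

Recording the seed alongside the results is one way to let others reproduce the selection and check that it was not hand-picked.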

[Figure: a simple random number sample marked on the grid of units 1 to 100]

A systematic sample

Systematic sampling
To take a systematic approach, we could choose every number ending in 4. Once again this would give us our 10% sample. This method has the advantage of being easy to design, but it can mislead if the units have inherent patterning in their order.
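The "every number ending in 4" rule can be sketched in Python as taking a starting unit and then every tenth unit after it. The function name `systematic_sample` and its parameters are illustrative assumptions, not part of the original slides.

```python
# Hypothetical catalogue numbers for the 100 units in the sampling frame.
population = list(range(1, 101))


def systematic_sample(units, interval, start):
    """Take the unit at 1-based position `start`, then every `interval`-th unit after it."""
    return units[start - 1::interval]


# Start at unit 4 and step by 10: all the numbers ending in 4.
sample = systematic_sample(population, interval=10, start=4)
print(sample)  # [4, 14, 24, 34, 44, 54, 64, 74, 84, 94]
```

A common variation, assumed here rather than taken from the slides, is to choose the starting unit at random from the first interval so that the selection is not fixed in advance.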

[Figure: a systematic sample marked on the grid of units 1 to 100]

A stratified sample

Stratified sampling
Here we divide the units into groups (strata) and take a random sample from each: 5 from the top and 5 from the bottom, or 5 from the left and 5 from the right, etc.
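A minimal Python sketch of the "5 from the top, 5 from the bottom" idea follows. It assumes, purely for illustration, that units 1 to 50 form the top stratum and units 51 to 100 the bottom stratum; the function name `stratified_sample` is likewise hypothetical.

```python
import random

# Hypothetical catalogue numbers for the 100 units in the sampling frame.
population = list(range(1, 101))


def stratified_sample(strata, per_stratum, seed=None):
    """Draw a simple random sample of `per_stratum` units from each stratum."""
    rng = random.Random(seed)
    sample = []
    for stratum in strata:
        sample.extend(rng.sample(stratum, per_stratum))
    return sorted(sample)


# Assumed strata: "top" is units 1-50, "bottom" is units 51-100.
top = population[:50]
bottom = population[50:]
sample = stratified_sample([top, bottom], per_stratum=5, seed=2005)
print(sample)
```

Unlike a simple random sample of ten, this design guarantees that both halves are represented equally.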

[Figures: two stratified samples marked on the grid of units 1 to 100]

A cluster sample

Cluster sampling
Rather than select individual items, select clusters or groups of items that are close together. This may result in biased values.
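Cluster sampling could be sketched as follows: the units are first grouped into clusters of neighbours, and whole clusters are then selected at random. The cluster size of 5, the choice of 2 clusters and the function name `cluster_sample` are assumptions chosen so that the result is again a 10% sample.

```python
import random

# Hypothetical catalogue numbers for the 100 units in the sampling frame.
population = list(range(1, 101))


def cluster_sample(units, cluster_size, n_clusters, seed=None):
    """Group adjacent units into clusters, then select whole clusters at random."""
    rng = random.Random(seed)
    clusters = [units[i:i + cluster_size] for i in range(0, len(units), cluster_size)]
    chosen = rng.sample(clusters, n_clusters)
    return sorted(unit for cluster in chosen for unit in cluster)


# Two clusters of five neighbouring units each: ten units in total.
sample = cluster_sample(population, cluster_size=5, n_clusters=2, seed=2005)
print(sample)
```

Because all ten units come from only two places in the sequence, neighbouring units that resemble each other can pull the sample statistics away from the population values.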

[Figures: two cluster samples marked on the grid of units 1 to 100]

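Downside to systematic sampling
A regularly spaced sample can totally miss a context that falls between the sampled units, as with the context marked in the slide's figure.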
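To make the risk concrete, the sketch below checks a systematic sample against a hypothetical context. The context's extent (units 25 to 33) is invented for illustration and is not taken from the slide's figure.

```python
# Hypothetical catalogue numbers for the 100 units in the sampling frame.
population = list(range(1, 101))

# Systematic sample: every unit whose number ends in 4.
systematic = [u for u in population if u % 10 == 4]

# Hypothetical context occupying units 25 to 33, between sampled units 24 and 34.
context_units = set(range(25, 34))

# Which sampled units fall inside the context?
hits = [u for u in systematic if u in context_units]
print(hits)  # [] : the sample misses the context entirely
```

Any feature narrower than the sampling interval can slip through in this way, which is one reason to compare the interval with the expected size of the features of interest.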

Common sample statistics:
x̄ – the sample mean
s – the sample standard deviation
p – the sample proportion (i.e. the proportion of the sample having a particular characteristic)
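The three statistics above can be computed with Python's standard `statistics` module, as in this small sketch. The weights and the "reworked" flags are made-up values for a hypothetical sample of ten spearheads, not real data.

```python
import statistics

# Hypothetical weights (in grams) for a sample of ten spearheads; not real data.
weights = [41.2, 38.5, 44.0, 39.8, 42.1, 40.3, 37.9, 43.6, 41.0, 39.6]

# Hypothetical characteristic recorded per spearhead: does it show reworking?
reworked = [True, False, False, True, False, False, False, True, False, False]

x_bar = statistics.mean(weights)   # the sample mean
s = statistics.stdev(weights)      # the sample standard deviation (n - 1 divisor)
p = sum(reworked) / len(reworked)  # the sample proportion with the characteristic

print(round(x_bar, 2), round(s, 2), p)
```

Note that `statistics.stdev` divides by n - 1, the usual choice when the sample is used to estimate a population value, whereas `statistics.pstdev` divides by n.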

Stats
The true population values for these statistics are usually unknown, and are formally denoted by Greek letters.

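Common sample statistics:
Each sample statistic is a known value that serves as an estimate for an unknown population value.
x̄ – the sample mean, an estimate for μ – the population mean
s – the sample standard deviation, an estimate for σ – the population standard deviation
p – the sample proportion, an estimate for π – the population proportion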

The central-limit theorem (the law of averages)
In order to comment on how good an estimate the sample statistics are, the nature of their distribution needs to be known. See Fletcher & Lock 2005, Digging Numbers (2nd ed.), Oxbow, 70-9.
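One way to see what "the nature of their distribution" means is to simulate it. The sketch below draws many samples of 50 from a hypothetical, deliberately skewed population and looks at how the sample means are spread. Under the central-limit theorem those means should cluster around μ with a spread close to σ/√n. The population, its exponential shape, the sample size and the number of repeats are all assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(2005)  # fixed seed so the simulation is repeatable

# Hypothetical, skewed population of 10,000 artefact weights (mean about 40 g).
population = [rng.expovariate(1 / 40) for _ in range(10_000)]
mu = statistics.mean(population)       # population mean
sigma = statistics.pstdev(population)  # population standard deviation


def sample_means(units, sample_size, repeats):
    """Return the mean of each of `repeats` simple random samples."""
    return [statistics.mean(rng.sample(units, sample_size)) for _ in range(repeats)]


n = 50
means = sample_means(population, sample_size=n, repeats=2_000)

# The sample means centre on mu, and their spread is close to sigma / sqrt(n).
print("population mean:", round(mu, 2))
print("mean of sample means:", round(statistics.mean(means), 2))
print("sigma / sqrt(n):", round(sigma / n ** 0.5, 2))
print("sd of sample means:", round(statistics.stdev(means), 2))
```

Even though individual weights in this made-up population are far from normally distributed, a histogram of `means` would look roughly bell-shaped, which is what allows statements about how good an estimate x̄ is.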
