Published by Jennifer Nelson. Modified about 1 year ago.

1
Introduction to Program MARK Stephen J. Dinsmore Iowa State University

2
Lecture outline Introduction – modeling and inference Parameters Model structure The input file PIMs, design matrices and more in MARK Analysis tips

3
Motivation for modeling What are your goals? To just “analyze data”, or do you seek a deeper understanding of a complex process? What questions are you interested in answering? How will you use the information? By definition, a model is an approximation of truth and not truth itself!

4
Introduction Models, modeling, and estimation Process: capture, tag, release, recapture The “art” is balancing effort across each of these categories

5
Population characteristics Open versus closed populations –Assumptions –Results of assumption violations Understanding this distinction is a critical step in the modeling process.

6
Methods for marking Leg bands, neck collars Standard i.d. tags PIT tags Radio collars/transmitters Camera “traps”

7
Encounter techniques Live resightings (mainly birds) Live captures (sturgeon, many others) Dead recoveries (waterfowl)

8
Summarizing encounters Release and recapture data for each animal are summarized in an encounter history. A separate encounter history should be constructed for each animal. Encounter histories consist of strings of 1’s (animal was encountered) and 0’s (not encountered) in most cases.

9
What can we estimate? Survival (S; or apparent survival φ) Population size (N) Emigration/immigration (γ″, γ′) Movement probabilities (δ) Reproduction/recruitment (F) Rate of population change (λ) Occupancy rate (ψ)

10
Models in MARK Live encounters (Cormack-Jolly-Seber) Dead recoveries (band recovery) Joint live and dead encounters Known fate (radio telemetry) Closed captures Robust design Multi-strata Pradel lambda models Patch occupancy Nest survival And the list is still growing…

11
Features of MARK Parameter estimation (model averaging) Multiple attribute groups (age, sex classes) Individual, group, and time covariates Unequal time intervals AIC model selection Quasi-likelihood theory (over-dispersion)

12
Questions?

13
Basic data – encounter histories LLLL –Live recaptures, known fate –This example codes for 4 occasions LDLD –Joint live-dead recoveries –This example codes for 2 occasions

14
Live encounters Example of possible outcomes [tree diagram]: Release → Live (φ) → Seen (p) or Not seen (1-p); Release → Dead or emigrated (1-φ)

15
Live encounters Example – 5 encounter occasions –LLLLL (1=encountered, 0=not encountered) –Estimate φ (apparent survival; time 1 to time t-1) and p (conditional capture probability; time 2 to time t). –The last φ and the last p are confounded without some constraint on one of them (MARK reports them as a product).

16
Live encounters Example encounter histories
–11111 - φ₁p₂ φ₂p₃ φ₃p₄ φ₄p₅
–11001 - φ₁p₂ φ₂(1-p₃) φ₃(1-p₄) φ₄p₅
–00011 - φ₄p₅
–11110 - φ₁p₂ φ₂p₃ φ₃p₄ [φ₄(1-p₅)+(1-φ₄)]
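
A minimal Python sketch of how these live-encounter probabilities are assembled, conditional on first release. The function name, the 0-based list indexing, and the example values are assumptions for illustration and are not part of MARK; `phi` stands for apparent survival and `p` for capture probability.

```python
def cjs_history_probability(history, phi, p):
    """Probability of a live-encounter history, conditional on first release.

    history: string of '1'/'0', one character per occasion (T occasions).
    phi[i]:  apparent survival from occasion i+1 to occasion i+2 (length T-1).
    p[i]:    capture probability at occasion i+2 (length T-1).
    """
    n_occasions = len(history)
    first = history.index("1")   # occasion of first release
    last = history.rindex("1")   # last occasion the animal was seen

    prob = 1.0
    # Between first and last sighting the animal is known to be alive,
    # so each interval contributes phi times p (seen) or (1 - p) (missed).
    for i in range(first, last):
        seen = history[i + 1] == "1"
        prob *= phi[i] * (p[i] if seen else 1.0 - p[i])

    # After the last sighting the animal either died/emigrated or survived
    # and was missed; build that term backwards from the final interval.
    never_seen_again = 1.0
    for i in range(n_occasions - 2, last - 1, -1):
        never_seen_again = (1.0 - phi[i]) + phi[i] * (1.0 - p[i]) * never_seen_again

    return prob * never_seen_again


# Hypothetical values: constant survival 0.8 and capture probability 0.5.
phi = [0.8, 0.8, 0.8, 0.8]
p = [0.5, 0.5, 0.5, 0.5]
print(cjs_history_probability("11110", phi, p))
```

For the history 11110 the final bracketed term is exactly the one on the slide: survived and missed, or did not survive.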

17
Model assumptions 1) Tagged individuals are representative of the population of interest. 2) Numbers of releases are known. 3) Tagging is accurate, no tag loss, no misread tags, etc. 4) Releases are “instantaneous” (relative to time interval between releases). 5) Fates of individuals and cohorts are independent. 6) Individuals in each identifiable group (age or sex class, etc.) have the same survival and capture probability.

18
Dead recoveries Example of possible outcomes [tree diagram]: Release → Live (S); Release → Die (1-S) → Reported (r) or Not reported (1-r)

19
Dead recoveries Example – 3 encounter occasions –LDLDLD –Estimate survival (S) and reporting probability (r).

20
Dead recoveries Example encounter histories
–100001 - S₁S₂(1-S₃)r₃
–001001 - S₂(1-S₃)r₃
–000010 - (1-S₃)(1-r₃)+S₃
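
The same bookkeeping for dead recoveries can be sketched in Python. This is an illustration under stated assumptions (LD-paired histories, one release per history, hypothetical function name and example values), not MARK's own code.

```python
def dead_recovery_probability(history, S, r):
    """Probability of an LD-format dead-recovery history.

    history: string of L/D pairs, e.g. '100001' for 3 occasions.
    S[i]:    survival over interval i+1.
    r[i]:    probability a death in interval i+1 is reported.
    """
    live = history[0::2]   # L positions: release occasions
    dead = history[1::2]   # D positions: recovery intervals
    n_intervals = len(dead)
    release = live.index("1")

    if "1" in dead:
        recovered = dead.index("1")
        prob = 1.0
        for i in range(release, recovered):
            prob *= S[i]                        # survived these intervals
        return prob * (1.0 - S[recovered]) * r[recovered]

    # Never recovered: one minus the chance of dying and being reported
    # in any interval after release.
    reported = 0.0
    alive = 1.0
    for i in range(release, n_intervals):
        reported += alive * (1.0 - S[i]) * r[i]
        alive *= S[i]
    return 1.0 - reported


# Hypothetical values for three intervals.
S = [0.9, 0.8, 0.7]
r = [0.2, 0.3, 0.4]
print(dead_recovery_probability("000010", S, r))  # equals (1-S3)(1-r3) + S3
```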

21
Joint live-dead Example of possible outcomes [tree diagram]: Release → Live (S) → Seen (p) or Not seen (1-p); Release → Die (1-S) → Reported (r) or Not reported (1-r)

22
Joint live-dead Example – 3 encounter occasions –LDLDLD –Estimate survival (S), reporting probability (r), capture probability (p), and fidelity (F).

23
Known fate Mainly used for radio telemetry data where capture probability is 1.0. Example of possible outcomes [tree diagram]: Release → Live (S) or Die (1-S)

24
Known fate Example – 4 encounter occasions –LDLDLDLD –Estimate survival (S) only

25
Known fate Example encounter histories
–10101010 - S₁S₂S₃S₄
–10101100 - S₁S₂(1-S₃)
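
Because detection is certain, a known-fate history translates almost directly into a product of survival terms. The sketch below assumes the LD pair coding 10 = survived the interval, 11 = died in the interval, 00 = not monitored in that interval; the function name and example values are hypothetical.

```python
def known_fate_probability(history, S):
    """Probability of a known-fate history written as L/D pairs."""
    prob = 1.0
    for start in range(0, len(history), 2):
        pair = history[start:start + 2]
        interval = start // 2
        if pair == "10":
            prob *= S[interval]          # alive at the end of the interval
        elif pair == "11":
            prob *= 1.0 - S[interval]    # died during the interval
        elif pair == "00":
            continue                     # not monitored: contributes nothing
        else:
            raise ValueError(f"unexpected pair {pair!r} in interval {interval + 1}")
    return prob


# Hypothetical interval survival rates.
S = [0.9, 0.8, 0.7, 0.6]
print(known_fate_probability("10101100", S))  # S1 * S2 * (1 - S3)
```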

26
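Closed captures Example of possible outcomes [tree diagram]: Release → Seen (p) or Not seen (1-p); once seen, on later occasions → Seen (c) or Not seen (1-c); if not yet seen, on later occasions → Seen (p) or Not seen (1-p)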

27
Closed captures Example – 4 encounter occasions –LLLL –Estimate initial capture (p) and recapture (c) probabilities and population size (N).

28
Closed captures Example encounter histories
–1111 - p₁c₂c₃c₄
–1011 - p₁(1-c₂)c₃c₄
–0101 - (1-p₁)p₂(1-c₃)c₄
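
A short Python sketch of the rule behind these closed-capture terms: use p until the first capture and c afterwards. It covers only the probability of a single observed history (the full likelihood also involves N and the never-captured animals), and the function name and values are assumptions for illustration.

```python
def closed_capture_probability(history, p, c):
    """Probability of a closed-capture history.

    p[i]: initial capture probability at occasion i+1.
    c[i]: recapture probability at occasion i+1 (c[0] is never used).
    """
    prob = 1.0
    caught_before = False
    for i, outcome in enumerate(history):
        rate = c[i] if caught_before else p[i]
        if outcome == "1":
            prob *= rate
            caught_before = True   # all later occasions use c
        else:
            prob *= 1.0 - rate
    return prob


# Hypothetical values: p = 0.3 and c = 0.6 on every occasion.
p = [0.3, 0.3, 0.3, 0.3]
c = [0.6, 0.6, 0.6, 0.6]
print(closed_capture_probability("0101", p, c))  # (1-p1) p2 (1-c3) c4
```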

29
Robust design Useful model that incorporates features of open and closed capture-recapture theory. Can estimate all i-1 survival rates for i primary sampling periods, not just i-2 as with the CJS model. Provides an estimate of population size for each primary sampling period. Can estimate temporary emigration (γ).

30
Robust design [Diagram: primary sampling periods, each labeled p, c, N, linked by survival (S) and temporary emigration (γ″, γ′).]

31
Summary Quick introduction to basic models and their structure. Next: introduction to MARK including the input file, basic modeling (PIMs and design matrices), model selection, and inference.

32
Questions?

33
Getting started in MARK The input file –Required: encounter history and frequency; each record always ends with a semicolon (;) –Optional: comment area (/* comment */), covariates –Can be coded as 1 individual per line, or summarized with multiple individuals per line or in m-arrays

34
Examples of input data CJS model 1111110 1 0 ; Robust design 0011000010010000001 0 0 0 0 0 84; Nest survival /*BYWG B-013 99-085 */14 18 21 11;

35
Program MARK Many of the procedures I’ll demonstrate can be done in more than one way in MARK! –Examples include building models using PIMs or design matrices (I prefer the latter) and selecting model(s) for inference.

36
MARK vocabulary PIMs (Parameter Index Matrices) Design matrices Link functions AIC (Akaike’s Information Criterion) Model selection

37
Model building in MARK A priori biological hypotheses → PIM → design matrix (β) → link function → real parameters

38
About PIMs PIMs provide one means of constraining the parameters in a model. Each PIM indexes a different parameter for each group (for live recaptures data with 3 groups, there will be 3 PIMs for apparent survival and 3 PIMs for recapture probability). Remember: the values in the PIMs correspond to estimable parameters and NOT the number of occasions. I recommend leaving the PIMs alone, unless you need to add age effects.

39
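PIMs [Figure: release cohorts and the occasions over which each is followed. Cohort 1: occasions 1-5; cohort 2: occasions 2-5; cohort 3: occasions 3-5; cohort 4: occasions 4-5.]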

40
PIMs From the previous slide, this is what the PIM would look like in MARK for 5 occasions and 4 estimates of φ:
1 2 3 4
  2 3 4
    3 4
      4
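
The triangular layout is easy to generate. The Python sketch below builds a fully time-dependent PIM and a constant PIM as nested lists; both helper names are hypothetical, and MARK builds these for you through its PIM windows.

```python
def time_pim(n_occasions, start=1):
    """Upper-triangular PIM with one parameter index per time interval.

    Row i holds the indices for animals first released at occasion i+1.
    """
    n_intervals = n_occasions - 1
    return [[start + j for j in range(i, n_intervals)] for i in range(n_intervals)]


def constant_pim(n_occasions, index=1):
    """PIM in which every cell shares a single parameter index."""
    n_intervals = n_occasions - 1
    return [[index] * (n_intervals - i) for i in range(n_intervals)]


for row in time_pim(5):
    print(row)
# The recapture PIM would typically start at the next free index,
# for example time_pim(5, start=5) for p.
```

The values are parameter indices, not occasion numbers: two cells with the same index are forced to share one estimate.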

41
More PIMs
 1  2  3  4  5
    6  7  8  9
      10 11 12
         13 14
            15

1 1 1 1 1
  1 1 1 1
    1 1 1
      1 1
        1

42
More PIMs
1 1 2 1 1
  1 2 1 1
    2 1 1
      1 1
        1

1 6 7 8 9
  2 7 8 9
    3 8 9
      4 9
        5

43
Design matrices A useful way to further “constrain” the parameters as they appear in the PIMs. The only way to introduce time trends (linear, etc.) and covariates into models. The structure of the design matrix will depend on constraints placed in the PIMs.

44
Design matrices Basic concept: MARK allows the user to apply a linear regression model as a constraint on any parameter (e.g., survival) with the use of a design matrix. Here, the response variable (a rate such as survival) is expressed as a linear regression function of 1 or more factors.

45
Design matrices Basic linear model is Y = Xβ + ε –Y is the response variable (e.g., survival) –X is a matrix of “dummy” variables (1’s and 0’s) –β is the vector of coefficients (intercept and slopes) –ε is a vector of random error terms

46
Design matrix - example Suppose we want to determine if male and female Mourning Doves have different survival rates. –In linear regression, we have Yᵢ = β₀ + β₁xᵢ + εᵢ –Each variate Yᵢ is the sum of the intercept (β₀), the product of the slope (β₁) and the variable x (xᵢ), and the random error term (εᵢ). –But what is xᵢ? It is a “dummy” variable specifying sex (for example, 0 for male, 1 for female). –The test is whether the slope (β₁) is different from zero. –If β₁ is not different from 0, then there is no sex effect.

47
Design matrices In MARK, rows in the design matrix correspond to the parameters set in the PIMs and columns correspond to the βᵢ. You cannot add any structure to the design matrix that is missing from the PIMs (hence my preference to leave the PIMs alone except to add age effects).
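
To make the row/column correspondence concrete, here is a small Python sketch for the two-group sex example, assuming one survival parameter per sex in the PIMs and the dummy coding 0 = male, 1 = female. The β values and the helper name are hypothetical, and the result is on the link (transformed) scale, not a survival rate.

```python
def linear_predictor(design_matrix, beta):
    """Multiply each design-matrix row by the beta vector (X times beta)."""
    return [sum(x * b for x, b in zip(row, beta)) for row in design_matrix]


# Rows = parameters from the PIMs, columns = betas.
design_matrix = [
    [1, 0],  # parameter 1: male survival   (intercept only)
    [1, 1],  # parameter 2: female survival (intercept + sex effect)
]
beta = [1.5, -0.4]  # hypothetical intercept and sex effect

print(linear_predictor(design_matrix, beta))
```

Setting the second column to all zeros would remove the sex effect, which is the reduced model the slope test compares against.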

48
Link functions The rates (e.g., survival) in the linear regression model must be transformed in MARK. Several transformations are available, each having different properties. In MARK, we will primarily use the logit and sin link functions. MARK is good at setting the default link function for you.
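
As a sketch of what the two links do, the Python functions below back-transform a value on the regression scale to a rate between 0 and 1. The function names are assumptions; the formulas are the standard logit back-transform and the sin form (sin(x) + 1) / 2.

```python
import math


def inverse_logit(eta):
    """Back-transform from the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def inverse_sin(eta):
    """Back-transform for the sin link: (sin(eta) + 1) / 2."""
    return (math.sin(eta) + 1.0) / 2.0


for eta in (-2.0, 0.0, 2.0):
    print(eta, round(inverse_logit(eta), 4), round(inverse_sin(eta), 4))
```

Both map any real number into the interval from 0 to 1; the logit is monotonic, which is one reason it is the usual choice when covariates are in the design matrix, while the sin link is periodic.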

49
Likelihoods Likelihood = Pr{encounter history}^observed, where “observed” is the number of animals with that encounter history. logₑ(likelihood) = observed × logₑ[Pr{encounter history}]

50
Covariates Individual – some unique characteristic of each individual in the population such as body mass at capture or fork length. Group – a characteristic of the entire group such as sex or age class. Time – a unique time-specific characteristic such as river flow data or temperature.

51
Covariates Every study should be incorporating covariates! Some recommendations: –Individual covariates often apply to survival (e.g., mass at capture, size measures, fitness measures, habitat, etc.). –Group covariates can affect survival (e.g., weather) or capture probability (e.g., effort). –Time covariates often influence capture.

52
Goodness-of-fit This is an area where further work is needed. Best overall goodness-of-fit test is in Program RELEASE (included in MARK). GOF for CJS model based on results of Tests 2 and 3 in RELEASE. No good GOF tests for complex models. Ad hoc procedure for robust design.

53
AIC model selection AIC = Akaike’s Information Criterion From Information Theory, which is one of many ways to objectively assess the relative importance of a set of models. Remember – the AIC best model is not “the model”, but rather is the model within the set that had the best support, given the data.

54
AIC model selection AIC = -2ln(L) + 2K –L is the likelihood of the model and K is the number of parameters in that model. –A smaller -2ln(L) (that is, a larger log likelihood) means a better fit. –The +2K term is a “penalty” for adding more parameters, although this is balanced by an improved model fit. Message: There is an important trade-off between fit and # of parameters, and AIC provides an objective means of balancing this.
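
The trade-off is easy to see with numbers. This Python sketch applies the formula on the slide to two hypothetical models; the log-likelihoods, parameter counts, and function names are made up for illustration.

```python
def aic(log_likelihood, n_parameters):
    """AIC = -2 ln(L) + 2K."""
    return -2.0 * log_likelihood + 2.0 * n_parameters


def delta_aic(aic_values):
    """Difference between each model's AIC and the smallest AIC in the set."""
    best = min(aic_values)
    return [value - best for value in aic_values]


# Hypothetical models: the second fits better but uses two more parameters.
simple = aic(log_likelihood=-100.0, n_parameters=3)   # 206.0
complex_ = aic(log_likelihood=-98.5, n_parameters=5)  # 207.0
print(delta_aic([simple, complex_]))                  # [0.0, 1.0]
```

Here the better fit of the larger model does not quite pay for its two extra parameters, so the simpler model ranks first.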

55
Quasi-likelihood theory QAIC - a way to account for over-dispersion in the data. Over-dispersion results from a lack of independence, e.g., animals that travel in family groups. In MARK, we use ad hoc procedures to estimate ĉ (c-hat, a variance inflation factor). Result: variance is inflated.
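
A minimal Python sketch of the adjustment, using the usual quasi-likelihood form QAIC = -2ln(L)/ĉ + 2K and inflating a standard error by the square root of ĉ. The function names and numbers are hypothetical.

```python
import math


def qaic(log_likelihood, n_parameters, c_hat):
    """Quasi-likelihood AIC: the deviance term is divided by c-hat."""
    return -2.0 * log_likelihood / c_hat + 2.0 * n_parameters


def inflated_se(se, c_hat):
    """Variance is multiplied by c-hat, so the SE grows by sqrt(c-hat)."""
    return se * math.sqrt(c_hat)


# Hypothetical model with ln(L) = -100, K = 3, and c-hat = 2.
print(qaic(-100.0, 3, c_hat=2.0))      # 106.0
print(inflated_se(0.05, c_hat=2.0))    # about 0.0707
```

With ĉ = 1 the result reduces to ordinary AIC; larger values of ĉ shrink the fit term and so favor simpler models.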

56
Model averaging Which parameter estimate do you report when you have estimates from 10 models? The estimate from the best model? All estimates? Or, some “average”? Model averaging incorporates this model selection uncertainty into parameter estimates. Best used when there are several competing models (ΔAIC < 2).
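
One common way to form that “average” is to weight each model's estimate by its Akaike weight. The Python sketch below shows the arithmetic for point estimates only; the AIC values, survival estimates, and function names are hypothetical, and MARK's own model-averaging output also reports an unconditional variance that is not computed here.

```python
import math


def akaike_weights(aic_values):
    """Normalized weights exp(-delta/2), where delta = AIC - min(AIC)."""
    best = min(aic_values)
    relative = [math.exp(-0.5 * (value - best)) for value in aic_values]
    total = sum(relative)
    return [value / total for value in relative]


def model_averaged_estimate(estimates, weights):
    """Weighted mean of the per-model estimates."""
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical model set: three models and their survival estimates.
aic_values = [206.0, 207.0, 210.0]
survival = [0.80, 0.84, 0.90]
weights = akaike_weights(aic_values)
print([round(w, 3) for w in weights])
print(round(model_averaged_estimate(survival, weights), 4))
```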

57
Number of parameters With complex models, MARK has a hard time correctly counting the number of parameters when parameter estimates are close to a boundary (e.g., near 0 or 1). Sin link function is best, logit link function sometimes performs poorly. Message: always check MARK to be sure parameters are counted correctly.

58
Model notation Describe models concisely (limited space in MARK). Some basic nomenclature: –Full time variation (t) –Linear time trends (T) –No variation (.) –Additive effects (t+temperature) –Multiplicative effects (group*t)

59
Model notation Examples –φ(.) means φ₁ = φ₂ = φ₃ … –φ(t) means φ₁, φ₂, … φₜ₋₁ are each estimated separately –φ(t+Mass) means φ₁, φ₂, … φₜ₋₁ are each a function of body mass

60
Model notation Keep model names simple, but descriptive –Parameters are written with sources of variation listed in parentheses. –φ(t) p(t) –φ(T+Weight) p(t+effort) –φ(t*group) p(T)

61
How do we “test” for effects? For example, how would we know that weight influenced the survival of a bird? Need to consider models with and without weight. Model selection results: –Are models with weight among the “best” in the model set? –Look at the β for weight – does its confidence interval overlap zero? –Likelihood ratio tests Remember this is all conditional on the model set.

62
Developing candidate models Inference is conditional on the set of models we consider. Considerable effort should go into developing a concise set of models for consideration. How many? Typically, 5-20 models will suffice. Models should address realistic questions and should not include factors known to be unimportant. Message: use what you already know.

63
Study design considerations Trade-off between sample marked and recapture probability. What is an adequate sample size? Consider question to be asked – estimate population size, or survival, or lambda?

64
Discussion Additional discussion topics: –Model assumptions and the results of assumption violations. –Developing the set of candidate models. –Selecting the appropriate model for analyses. –Others? This afternoon – an example in MARK.

65
Models – a review

66
Patch occupancy Presence-absence data Define a “patch” – ponds, islands, plots, etc. Multiple visits to each site Assumes closure during sampling period Parameters: –Ψ –p –ε and γ (robust design only)

67
Patch occupancy Modeling details: –Handles missing visits (coded as “.” in EH) –Covariates Robust design formulations: –Psi and epsilon –Psi and gamma –Psi(1), epsilon, gamma

68
Nest survival Required data for each nest: 1) The day the nest was found (k). 2) The last day the nest was checked alive (l). 3) The last day the nest was checked (m). 4) Nest fate (0 = successful, 1 = failure) (f). 5) The number of nests with this encounter history.

69
Input file – example Nest survival group=1; 151111711.24; 69901601.90; 10262911682.21; 11192411651.88; 15222201652.03; Nest survival group=2; 48801721.99; 7182111631.40; 17303001671.77; 19283311701.28;

70
Coding the data Coding the triplet k, l, and m:
1) k=1, l=3, m=5, fate=1 → S₁S₂[1-S₃S₄]
2) k=1, l=3, m=3, fate=0 → S₁S₂
3) k=1, l=3, m=3, fate=1 is invalid (can’t be alive and dead on day 3)
4) k=1, l=1, m=3, fate=1 → [1-S₁S₂]
5) k=1, l=1, m=1, fate=0 or 1 is invalid (nest was active only on day 1)
See MARK help file for more details
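
The coding rules above can be sketched as a small Python function that turns (k, l, m, fate) into a probability. The function name, the dictionary of daily survival rates, and the rule that a successful nest must have l = m are assumptions for illustration; check them against the MARK help file before relying on them.

```python
def nest_history_probability(k, l, m, fate, daily_survival):
    """Probability of one nest record.

    k: day found, l: last day checked alive, m: last day checked.
    fate: 0 = successful, 1 = failed.
    daily_survival[d]: survival from day d to day d+1.
    """
    if not k <= l <= m:
        raise ValueError("need k <= l <= m")
    if k == m:
        raise ValueError("nest was observed on only one day")
    if fate == 1 and l == m:
        raise ValueError("nest cannot be alive and failed on the same day")
    if fate == 0 and l != m:
        raise ValueError("assumed: a successful nest has l == m")

    prob = 1.0
    for day in range(k, l):
        prob *= daily_survival[day]        # known to survive days k..l
    if fate == 1:
        survived_gap = 1.0
        for day in range(l, m):
            survived_gap *= daily_survival[day]
        prob *= 1.0 - survived_gap         # failed somewhere between l and m
    return prob


# Hypothetical daily survival rates for days 1 to 4.
dsr = {1: 0.9, 2: 0.8, 3: 0.7, 4: 0.6}
print(nest_history_probability(1, 3, 5, 1, dsr))  # S1 S2 [1 - S3 S4]
```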

71
Model assumptions 1) Homogeneity of daily nest survival rates. 2) Nest fates are independent. 3) All visits to nests are recorded. 4) Nest discovery and subsequent checks do not influence nest survival. 5) Nest checks are independent of fate. 6) Nest fates are correctly determined. 7) Age of nest at discovery is correctly determined.

72
Estimate nesting success? For constant nest survival, period success is the daily survival rate (DSR) raised to the power of the period length in days. What happens if there is: –Temporal variation in nest survival? –Covariates? –A combination of both?
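
A small Python sketch of both cases: a constant DSR raised to the period length, and the product of day-specific rates when survival varies over time. The function names and the numbers are hypothetical.

```python
import math


def period_success_constant(dsr, period_days):
    """Nest success for a constant daily survival rate."""
    return dsr ** period_days


def period_success_varying(daily_rates):
    """Nest success when each day has its own survival rate."""
    return math.prod(daily_rates)


# Hypothetical: DSR of 0.97 over a 24-day nesting period.
print(round(period_success_constant(0.97, 24), 4))

# With temporal variation, the answer depends on which days the nest is
# active, which is why the choice of start date matters.
early = [0.95] * 12 + [0.98] * 12
print(round(period_success_varying(early), 4))
```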

73
Temporal variation [Figure: a 10-day interval] Which 10-day interval provides the “best” estimate of nest success?

74
Getting the “best” estimate Need a start date for “best” estimate – when? Does simple mean work? What about bias between observed and true nest initiation dates? Use Horvitz-Thompson estimator to correct for this bias.

75
Other considerations Stage-specific survival Divide EH into parts: –Incubation – 1 10 10 0 1; –Nestling – 10 20 22 1 1; Nest age

76
Model-based predictions MARK provides a regression equation that can be used for predictions
logit(S male) = 3.53 + 0.28×1, S male = 0.9784
logit(S female) = 3.53 + 0.28×0, S female = 0.9716
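
The back-transformation behind these predictions can be checked with a few lines of Python. The coefficients are the rounded values shown on the slide, so the results agree with the slide's figures only to about three decimal places; the function names are assumptions.

```python
import math


def inverse_logit(eta):
    """Convert a value on the logit scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def predicted_survival(is_male, intercept=3.53, sex_effect=0.28):
    """Predicted survival from the slide's equation (1 = male, 0 = female)."""
    return inverse_logit(intercept + sex_effect * is_male)


print(round(predicted_survival(1), 3))  # male
print(round(predicted_survival(0), 3))  # female
```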
