Introduction to Program MARK


1 Introduction to Program MARK
Stephen J. Dinsmore Iowa State University

2 Lecture outline
Introduction – modeling and inference
Parameters
Model structure
The input file
PIMs, design matrices and more in MARK
Analysis tips

3 Motivation for modeling
What are your goals? To just “analyze data”, or do you seek a deeper understanding of a complex process? What questions are you interested in answering? How will you use the information? By definition, a model is an approximation of truth and not truth itself!

4 Introduction Models, modeling, and estimation
Process: capture, tag, release, recapture The “art” is balancing effort across each of these categories

5 Population characteristics
Open versus closed populations Assumptions Results of assumption violations Understanding this distinction is a critical step in the modeling process.

6 Methods for marking Leg bands, neck collars Standard i.d. tags
PIT tags Radio collars/transmitters Camera “traps”

7 Encounter techniques Live resightings (mainly birds)
Live captures (sturgeon, many others) Dead recoveries (waterfowl)

8 Summarizing encounters
Release and recapture data for each animal are summarized in an encounter history. A separate encounter history should be constructed for each animal. Encounter histories consist of strings of 1’s (animal was encountered) and 0’s (not encountered) in most cases.
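
A minimal sketch of how encounter histories might be assembled from raw capture records. The animal IDs, the `records` dictionary, and the function name are hypothetical and are not part of MARK.

```python
def encounter_history(occasions_seen, n_occasions):
    """Return a string of 1's and 0's, one character per encounter occasion."""
    seen = set(occasions_seen)
    return "".join("1" if occ in seen else "0" for occ in range(1, n_occasions + 1))

# Hypothetical records: animal id -> occasions (1-based) on which it was encountered.
records = {"A01": [1, 2, 5], "A02": [2], "A03": [1, 3, 4, 5]}

# One encounter history per animal, for a study with 5 occasions.
histories = {animal: encounter_history(occ, 5) for animal, occ in records.items()}
```

Each animal gets exactly one string whose length equals the number of occasions.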

9 What can we estimate?
Survival (S; or apparent survival φ)
Population size (N)
Emigration/immigration (γ″, γ′)
Movement probabilities (δ)
Reproduction/recruitment (F)
Rate of population change (λ)
Occupancy rate (ψ)

10 Models in MARK Live encounters (Cormack-Jolly-Seber)
Dead recoveries (band recovery) Joint live and dead encounters Known fate (radio telemetry) Closed captures Robust design Multi-strata Pradel lambda models Patch occupancy Nest survival And the list is still growing…

11 Features of MARK Parameter estimation (model averaging)
Multiple attribute groups (age, sex classes) Individual, group, and time covariates Unequal time intervals AIC model selection Quasi-likelihood theory (over-dispersion)

12 Questions?

13 Basic data – encounter histories
LLLL: live recaptures. This example codes for 4 occasions.
LDLD: joint live-dead recoveries, known fate. This example codes for 2 occasions.

14 Live encounters
Example of possible outcomes:
Release → live (φ) → seen (p) or not seen (1-p)
Release → dead or emigrated (1-φ)

15 Live encounters Example – 5 encounter occasions
LLLLL (1=encountered, 0=not encountered) Estimate φ (apparent survival; time 1 to time t-1) and p (conditional capture probability; time 2 to time t). Last φ and last p are confounded without some constraint on one of them (MARK reports them as a product).

16 Live encounters Example encounter histories 11111 - φ1 p2 φ2 p3 φ3 p4 φ4 p5
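
The probability of any live-encounter history can be computed the same way as the 11111 example. The sketch below is an illustration of the standard Cormack-Jolly-Seber calculation, not MARK's own code; the function name and the list layout for the parameters are assumptions made for the example.

```python
def cjs_history_probability(history, phi, p):
    """Probability of a live-encounter history, conditional on first release.

    phi[j] is apparent survival over interval j+1 (occasion j+1 to j+2).
    p[j] is the capture probability at occasion j+2 (so p[0] is p2).
    """
    first = history.index("1")
    last = history.rindex("1")
    prob = 1.0
    # Between first and last encounter the animal is known to be alive.
    for occ in range(first + 1, last + 1):
        prob *= phi[occ - 1]
        prob *= p[occ - 1] if history[occ] == "1" else 1.0 - p[occ - 1]
    # After the last encounter: either it died/emigrated, or survived unseen.
    chi = 1.0
    for occ in range(len(history) - 1, last, -1):
        chi = (1.0 - phi[occ - 1]) + phi[occ - 1] * (1.0 - p[occ - 1]) * chi
    return prob * chi

# Hypothetical parameter values for a 5-occasion study.
phi = [0.8, 0.8, 0.8, 0.8]
p = [0.5, 0.5, 0.5, 0.5]
prob_11111 = cjs_history_probability("11111", phi, p)  # phi1 p2 phi2 p3 phi3 p4 phi4 p5
prob_11110 = cjs_history_probability("11110", phi, p)
```

For 11110 the trailing 0 contributes the term (1-φ4) + φ4(1-p5), because the animal may have died or may have survived without being seen.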

17 Model assumptions Tagged individuals are representative of the population of interest. Numbers of releases are known. Tagging is accurate, no tag loss, no misread tags, etc. Releases are “instantaneous” (relative to time interval between releases). Fates of individuals and cohorts are independent. Individuals in each identifiable group (age or sex class, etc.) have the same survival and capture probability.

18 Dead recoveries
Example of possible outcomes:
Release → live (S)
Release → die (1-S) → reported (r) or not reported (1-r)

19 Dead recoveries Example – 3 encounter occasions LDLDLD
Estimate survival (S) and reporting probability (r).

20 Dead recoveries Example encounter histories 100001 - S1S2(1-S3)r3
(1-S3)(1- r3)+S3

21 Joint live-dead
Example of possible outcomes:
Release → live (S) → seen (p) or not seen (1-p)
Release → die (1-S) → reported (r) or not reported (1-r)

22 Joint live-dead Example – 3 encounter occasions LDLDLD
Estimate survival (S), reporting probability (r), capture probability (p), and fidelity (F).

23 Known fate Mainly used for radio telemetry data where capture probability is 1.0.
Example of possible outcomes:
Release → live (S) or die (1-S)

24 Known fate Example – 4 encounter occasions LDLDLDLD
Estimate survival (S) only

25 Known fate Example encounter histories 10101010 - S1S2S3S4

26 Closed captures
Example of possible outcomes:
Release → seen (p) → seen again (c) or not seen (1-c)
Release → not seen (1-p) → seen (p) or not seen (1-p)

27 Closed captures Example – 4 encounter occasions LLLL
Estimate initial capture (p) and recapture (c) probabilities and population size (N).

28 Closed captures Example encounter histories
1111 - p1c2c3c4
0101 - (1-p1)p2(1-c3)c4
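
A small sketch of the rule behind these expressions: p applies until an animal is first caught, and c applies on every occasion after that. The function name and the parameter values are hypothetical. The second expression on the slide, (1-p1)p2(1-c3)c4, is what this rule gives for history 0101.

```python
def closed_history_probability(history, p, c):
    """Probability of a closed-capture history.

    p[i] is the initial capture probability at occasion i+1.
    c[i] is the recapture probability at occasion i+1 (c[0] is never used).
    """
    prob = 1.0
    caught_before = False
    for i, h in enumerate(history):
        rate = c[i] if caught_before else p[i]
        prob *= rate if h == "1" else 1.0 - rate
        if h == "1":
            caught_before = True
    return prob

# Hypothetical values: constant p = 0.3 and constant c = 0.5 over 4 occasions.
p = [0.3, 0.3, 0.3, 0.3]
c = [None, 0.5, 0.5, 0.5]
prob_1111 = closed_history_probability("1111", p, c)  # p1 c2 c3 c4
prob_0101 = closed_history_probability("0101", p, c)  # (1-p1) p2 (1-c3) c4
```

This covers only the per-history term; estimating N also involves the animals that were never caught, which is not shown here.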

29 Robust design Useful model that incorporates features of open and closed C-R theory. Can estimate all survival rates (i-1), not just i-2 as with CJS model. Estimate of population size for each primary sampling period. Can estimate temporary emigration (γ).

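30 Robust design [Diagram: two primary sampling periods, each providing p, c, and N, separated by an open interval with parameters S, γ″, and γ′]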

31 Summary Quick introduction to basic models and their structure.
Next: introduction to MARK including the input file, basic modeling (PIMs and design matrices), model selection, and inference.

32 Questions?

33 Getting started in MARK
The input file Required: encounter history, frequency, always ends with a ; Optional: comment area (/* comment */), covariates Can be coded as 1 individual per line, or summarized with multiple individuals per line or in m-arrays
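
As an illustration of the summarized format (multiple individuals per line), the sketch below collapses a list of encounter histories into "history frequency;" records. The helper name and the sample data are hypothetical; group columns and covariates are left out.

```python
from collections import Counter

def to_mark_input(histories, comment=None):
    """Collapse encounter histories into 'history frequency;' lines."""
    counts = Counter(histories)
    lines = []
    if comment:
        lines.append(f"/* {comment} */")  # optional comment area
    for history, freq in sorted(counts.items(), reverse=True):
        lines.append(f"{history} {freq};")  # every record ends with a semicolon
    return "\n".join(lines)

# Hypothetical data: three animals, two of them with the same history.
input_text = to_mark_input(["110", "110", "101"], comment="example data")
```

Coding one individual per line instead would simply mean writing each history with a frequency of 1.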

34 Examples of input data
CJS model: 1111110 1 0 ;
Robust design: [example records not preserved in this transcript]
Nest survival: /*BYWG B */ [example record not preserved in this transcript]

35 Program MARK Many of the procedures I’ll demonstrate can be done in more than one way in MARK! Examples include building models using PIMs or design matrices (I prefer the latter) and selecting model(s) for inference.

36 MARK vocabulary PIMs (Parameter Index Matrices) Design matrices
Link functions AIC (Akaike’s Information Criterion) Model selection

37 Model building in MARK
A priori biological hypotheses → PIMs → design matrix (β) → link function → real parameters

38 About PIMs PIMs provide one means of constraining the parameters in a model. Each PIM indexes a different parameter for each group (for live recaptures data with 3 groups, there will be 3 PIMs for apparent survival and 3 PIMs for recapture probability). Remember: the values in the PIMs correspond to estimable parameters and NOT the number of occasions. I recommend leaving the PIMs alone, unless you need to add age effects.

39 PIMs Cohort

40 PIMs From the previous slide, this is what the PIM would look like in MARK for 5 occasions and 4 estimates of φ:
1 2 3 4
  2 3 4
    3 4
      4
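
The triangular layout of a PIM can be generated mechanically. This is a sketch with hypothetical helper names, representing each PIM as a list of rows (one row per release cohort); it is not how MARK stores PIMs internally.

```python
def time_pim(n_occasions, start=1):
    """PIM with one parameter index per time interval (full time variation)."""
    n = n_occasions - 1
    return [[start + col for col in range(row, n)] for row in range(n)]

def constant_pim(n_occasions, index=1):
    """PIM in which every cell shares one parameter index (no variation)."""
    n = n_occasions - 1
    return [[index] * (n - row) for row in range(n)]

phi_t = time_pim(5)             # apparent survival, indices 1-4
p_t = time_pim(5, start=5)      # recapture probability, indices 5-8
phi_dot = constant_pim(5)       # a single survival parameter
```

The largest index across all PIMs is the number of parameters the PIMs define, which is why the values index parameters rather than occasions.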

41 More PIMs 13 14 15 1 1 1 1 1 1

42 More PIMs 2 1 1 1 1 1 3 8 9 4 9 5

43 Design matrices A useful way to further “constrain” the parameters as they appear in the PIMs. The only way to introduce time trends (linear, etc.) and covariates into models. The structure of the design matrix will depend on constraints placed in the PIMs.

44 Design matrices Basic concept: MARK allows the user to apply a linear regression model as a constraint on any parameter (e.g., survival) with the use of a design matrix. Here, the response variable (a rate such as survival) is expressed as a linear regression function of 1 or more factors.

45 Design matrices Basic linear model is Y = Xβ + ε
Y is the response variable (e.g., survival), X is a matrix of “dummy” variables (1’s and 0’s), β is a vector of coefficients (intercept and slopes), and ε is a vector of random error terms

46 Design matrix - example
Suppose we want to determine if male and female Mourning Doves have different survival rates. In linear regression, we have Yi = β0 + β1xi + εi Each variate Yi is the sum of the intercept (β0), the product of the slope (β1) and the variable x (xi), and the random error term (εi). But what is xi? It is a “dummy” variable specifying sex (for example, 0 for male, 1 for female). The test is whether the slope (β1) is different from zero. If β1 is not different from 0, then no sex effect.
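
A sketch of the Mourning Dove example as a two-row design matrix. The β values are invented for illustration, and the logit back-transformation is an assumption (it is one of the link functions MARK offers).

```python
import math

def inverse_logit(x):
    """Back-transform a linear predictor to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

# Rows: male, female. Columns: intercept (beta0), sex dummy (beta1).
design_matrix = [
    [1, 0],  # male:   x = 0
    [1, 1],  # female: x = 1
]
beta = [1.0, -0.5]  # hypothetical estimates, not from any real analysis

# Linear predictor for each row, then the real parameter (survival).
linear = [sum(x * b for x, b in zip(row, beta)) for row in design_matrix]
survival_male, survival_female = [inverse_logit(v) for v in linear]
```

If β1 were 0, the two rows would give the same linear predictor and therefore the same survival, which is the "no sex effect" case.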

47 Design matrices In MARK, rows in the design matrix correspond to the parameters set in the PIMs and columns correspond to the βi. You cannot add any structure to the design matrix that is missing from the PIMs (hence my preference to leave the PIMs alone except to add age effects).

48 Link functions The rates (e.g., survival) in the linear regression model must be transformed in MARK. Several transformations are available, each having different properties. In MARK, we will primarily use the logit and sin link functions. MARK is good at setting the default link function for you.
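
For reference, a sketch of the two transformations written as functions that map a β value to a rate between 0 and 1. The function names are hypothetical; the formulas are the usual definitions of the logit and sin links.

```python
import math

def logit_link(x):
    """Logit link: rate = exp(x) / (1 + exp(x))."""
    return math.exp(x) / (1.0 + math.exp(x))

def sin_link(x):
    """Sin link: rate = (sin(x) + 1) / 2."""
    return (math.sin(x) + 1.0) / 2.0

# Both links map x = 0 to a rate of 0.5.
mid_logit = logit_link(0.0)
mid_sin = sin_link(0.0)

# The sin link reaches the boundary exactly; the logit only approaches it.
top_sin = sin_link(math.pi / 2)
near_top_logit = logit_link(10.0)
```

One property visible here is that the logit link never returns exactly 0 or 1, while the sin link does.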

49 Likelihoods
Likelihood = Pr{encounter history}^observed, where observed is the number of animals with that encounter history
Loge(likelihood) = observed × loge[Pr{encounter history}]
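
A sketch of how these terms combine across encounter histories: the log likelihood is the sum, over distinct histories, of the number observed times the log of the history's probability. The counts and probabilities below are hypothetical.

```python
import math

def log_likelihood(history_counts, history_probs):
    """Sum of observed * log(Pr{encounter history}) over all distinct histories."""
    return sum(n * math.log(history_probs[h]) for h, n in history_counts.items())

# Hypothetical two-occasion example: 3 animals with history 11, 1 with history 10.
counts = {"11": 3, "10": 1}
probs = {"11": 0.4, "10": 0.6}
ll = log_likelihood(counts, probs)
```

Maximizing this quantity over the model parameters is what produces the parameter estimates.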

50 Covariates Individual - some unique characteristic of each individual in the population such as body mass at capture or fork length. Group– a characteristic of the entire group such as sex or age class. Time – a unique time-specific characteristic such as river flow data or temperature.

51 Covariates Every study should be incorporating covariates!
Some recommendations: Individual covariates often apply to survival (e.g., mass at capture, size measures, fitness measures, habitat, etc.). Group covariates can affect survival (e.g., weather) or capture probability (e.g., effort). Time covariates often influence capture.

52 Goodness-of-fit This is an area where further work is needed.
Best overall goodness-of-fit test is in Program RELEASE (included in MARK). GOF for CJS model based on results of Tests 2 and 3 in RELEASE. No good GOF tests for complex models. Ad hoc procedure for robust design.

53 AIC model selection AIC = Akaike’s Information Criterion
From Information Theory, which is one of many ways to objectively assess the relative importance of a set of models. Remember – the AIC best model is not “the model”, but rather is the model within the set that had the best support, given the data.

54 AIC model selection AIC = -2ln(L) +2K
L is the likelihood of the model and K is the number of parameters in that model. A smaller -2ln(L) (equivalently, a larger log likelihood) means a better fit. The +2K term is a “penalty” for adding more parameters, which is offset only when the extra parameters improve the fit enough. Message: There is an important trade-off between fit and # of parameters, and AIC provides an objective means of balancing this.
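
A worked sketch of the formula for a small candidate set. The model names, log likelihoods, and parameter counts are invented for illustration.

```python
def aic(log_likelihood, n_parameters):
    """AIC = -2 ln(L) + 2K."""
    return -2.0 * log_likelihood + 2.0 * n_parameters

# Hypothetical candidate models: name -> (log likelihood, K).
candidates = {
    "phi(.) p(.)": (-120.0, 2),
    "phi(t) p(.)": (-117.5, 5),
    "phi(t) p(t)": (-116.9, 8),
}

aic_table = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best_aic = min(aic_table.values())
delta_aic = {name: value - best_aic for name, value in aic_table.items()}
best_model = min(aic_table, key=aic_table.get)
```

In this made-up set the most complex model has the highest log likelihood, but the penalty for its 8 parameters leaves the simplest model with the lowest AIC.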

55 Quasi-likelihood theory
QAIC - a way to account for over-dispersion in the data. Over-dispersion results from a lack of independence, e.g., animals that travel in family groups. In MARK, we use ad hoc procedures to estimate ĉ (c-hat, a variance inflation factor). Result: estimated variances are inflated.
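
A sketch of how a variance inflation factor enters the calculations, using the commonly cited form QAIC = -2ln(L)/c-hat + 2K and standard errors multiplied by the square root of c-hat. The function names and numbers are hypothetical, and whether K is increased by one for estimating c-hat varies between references, so it is not done here.

```python
import math

def qaic(log_likelihood, n_parameters, c_hat):
    """Quasi-likelihood AIC: the deviance term is divided by c-hat."""
    return -2.0 * log_likelihood / c_hat + 2.0 * n_parameters

def inflate_se(standard_error, c_hat):
    """Inflate a standard error by sqrt(c-hat), i.e. the variance by c-hat."""
    return standard_error * math.sqrt(c_hat)

# With c-hat = 1 (no over-dispersion) QAIC equals AIC.
qaic_no_overdispersion = qaic(-120.0, 2, 1.0)
qaic_overdispersed = qaic(-120.0, 2, 2.0)
wider_se = inflate_se(0.05, 4.0)
```

Because the fit term shrinks while the 2K penalty does not, over-dispersion shifts support toward simpler models.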

56 Model averaging Which parameter estimate do you report when you have estimates from 10 models? The estimate from the best model? All estimates? Or, some “average”? Model averaging incorporates this model selection uncertainty into parameter estimates. Best used when there are several competing models (ΔAIC < 2).
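
One common way to form the "average" is to weight each model's estimate by its Akaike weight. The sketch below assumes that approach; the AIC values and survival estimates are invented, and the calculation of the model-averaged variance is omitted.

```python
import math

def akaike_weights(aic_values):
    """Weight for each model: exp(-delta/2), normalized to sum to 1."""
    best = min(aic_values)
    relative = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(relative)
    return [r / total for r in relative]

def model_average(estimates, weights):
    """Weighted average of one parameter's estimates across models."""
    return sum(e * w for e, w in zip(estimates, weights))

# Hypothetical AIC values and survival estimates from three competing models.
aic_values = [244.0, 245.0, 249.8]
survival_estimates = [0.80, 0.76, 0.78]
weights = akaike_weights(aic_values)
averaged_survival = model_average(survival_estimates, weights)
```

Models with little support contribute little to the average, so the result stays close to the estimates from the best-supported models.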

57 Number of parameters With complex models, MARK has a hard time correctly counting the number of parameters when parameter estimates are close to a boundary (e.g., near 0 or 1). Sin link function is best, logit link function sometimes performs poorly. Message: always check MARK to be sure parameters are counted correctly.

58 Model notation Describe models concisely (limited space in MARK).
Some basic nomenclature: Full time variation (t) Linear time trends (T) No variation (.) Additive effects (t+temperature) Multiplicative effects (group*t)

59 Model notation Examples
φ(.) means φ1 = φ2 = φ3 …
φ(t) means φ1, φ2, … φt-1
φ(t+Mass) means φ1, φ2, … φt-1 are each a function of body mass

60 Model notation Keep model names simple, but descriptive
Parameters are written with sources of variation listed in parentheses.
φ(t) p(t)
φ(T+Weight) p(t+effort)
φ(t*group) p(T)

61 How do we “test” for effects?
For example, how would we know that weight influenced the survival of a bird? Need to consider models with and without weight. Model selection results: Are models with weight among the “best” in the model set? Look at the β for weight – does its confidence interval overlap zero? Likelihood ratio tests. Remember this is all conditional on the model set.

62 Developing candidate models
Inference is conditional on the set of models we consider. Considerable effort should go into developing a concise set of models for consideration. How many? Typically, 5-20 models will suffice. Models should address realistic questions and should not include factors known to be unimportant. Message: use what you already know.

63 Study design considerations
Trade-off between sample marked and recapture probability. What is an adequate sample size? Consider question to be asked – estimate population size, or survival, or lambda?

64 Discussion Additional discussion topics:
Model assumptions and the results of assumption violations. Developing the set of candidate models. Selecting the appropriate model for analyses. Others? This afternoon – an example in MARK.

65 Models – a review

66 Patch occupancy Presence-absence data
Define a “patch” – ponds, islands, plots, etc. Multiple visits to each site. Assumes closure during sampling period. Parameters: ψ, p, and (robust design only) ε and γ.

67 Patch occupancy
Modeling details:
Handles missing visits (coded as “.” in EH)
Covariates
Robust design formulations:
Psi and epsilon
Psi and gamma
Psi(1), epsilon, gamma

68 Nest survival Required data for each nest:
The day the nest was found (k). The last day the nest was checked alive (l). The last day the nest was checked (m). Nest fate (0 = successful, 1 = failure) (f). The number of nests with this encounter history.

69 Input file – example Nest survival group=1; 1 5 11 1 1 71 1.24;
; ; ; ; ; Nest survival group=2; ; ; ; ;

70 Coding the data Coding the triplet k, l, and m:
k=1, l=3, m=5, fate=1 → S1S2 [1-S3S4]
k=1, l=3, m=3, fate=0 → S1S2
k=1, l=3, m=3, fate=1 is invalid (can’t be alive and dead on day 3)
k=1, l=1, m=3, fate=1 → [1-S1S2]
k=1, l=1, m=1, fate=0 or 1 is invalid (nest was active only on day 1)
See MARK help file for more details
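
The coding rules above can be sketched as a function that returns the probability of a nest's record given daily survival rates. The function name and the rates are hypothetical. Treating a successful nest with m different from l as invalid is an assumption of this sketch; the MARK help file is the authority on valid codings.

```python
import math

def nest_history_probability(k, l, m, fate, dsr):
    """Probability of a nest record; dsr[i] is daily survival S(i+1)."""
    if not (k <= l <= m):
        raise ValueError("need k <= l <= m")
    if k == l == m:
        raise ValueError("nest was active only on one day")
    if fate == 1 and l == m:
        raise ValueError("nest cannot be alive and failed on the same day")
    if fate == 0 and l != m:
        raise ValueError("assumed here: successful nests have l equal to m")
    # Survived every day from discovery (k) to last seen alive (l).
    alive = math.prod(dsr[day - 1] for day in range(k, l))
    if fate == 0:
        return alive
    # Failed somewhere between day l and the last check on day m.
    failed = 1.0 - math.prod(dsr[day - 1] for day in range(l, m))
    return alive * failed

dsr = [0.9, 0.8, 0.7, 0.6]  # hypothetical S1..S4
prob_fail_late = nest_history_probability(1, 3, 5, 1, dsr)   # S1S2 [1-S3S4]
prob_success = nest_history_probability(1, 3, 3, 0, dsr)     # S1S2
prob_fail_early = nest_history_probability(1, 1, 3, 1, dsr)  # [1-S1S2]
```

The two invalid cases listed on the slide raise a ValueError rather than returning a probability.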

71 Model assumptions Homogeneity of daily nest survival rates.
Nest fates are independent. All visits to nests are recorded. Nest discovery and subsequent checks do not influence nest survival. Nest checks are independent of fate. Nest fates are correctly determined. Age of nest at discovery is correctly determined.

72 Estimate nesting success?
For constant nest survival, period success is the daily survival rate (DSR) raised to the power of the period length in days. What happens if there is: Temporal variation in nest survival? Covariates? A combination of both?
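
A sketch of the two cases. With a constant DSR, period success is a single power; with temporal variation, one natural approach (assumed here) is to multiply the day-specific rates over the nesting period. The function names, the 24-day period, and the rates are hypothetical.

```python
import math

def period_success_constant(dsr, n_days):
    """Nest success under a constant daily survival rate: DSR ** n_days."""
    return dsr ** n_days

def period_success_varying(daily_rates):
    """Nest success when daily survival differs by day: product of the rates."""
    return math.prod(daily_rates)

# Hypothetical 24-day nesting period.
constant_success = period_success_constant(0.95, 24)

# Hypothetical declining daily survival over the same 24 days.
declining_rates = [0.97 - 0.001 * day for day in range(24)]
varying_success = period_success_varying(declining_rates)
```

With time-varying rates the answer depends on which days the period covers, so a start date has to be chosen before a single success estimate can be reported.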

73 Temporal variation Which 10-day interval provides the “best” estimate of nest success? [Figure: two alternative 10-day intervals]

74 Getting the “best” estimate
Need a start date for “best” estimate – when? Does simple mean work? What about bias between observed and true nest initiation dates? Use Horvitz-Thompson estimator to correct for this bias.

75 Other considerations Stage-specific survival Divide EH into parts:
Incubation – ; Nestling – ; Nest age

76 Model-based predictions
MARK provides a regression equation that can be used for predictions Logit Smale = *1, Smale = Logit Sfemale = *0, Sfemale =

