Multiple Sequence Analysis: a contextualized narrative approach to longitudinal data University of Stirling, September 2007 Gary Pollock Department of.


Multiple Sequence Analysis: a contextualized narrative approach to longitudinal data University of Stirling, September 2007 Gary Pollock Department of Sociology Manchester Metropolitan University

Longitudinal processes
start and end times: event history analysis (EHA)
competing risk, multi-episode (EHA)
contiguous states as a single dependent variable (DV): sequence analysis (SA)
i.e. SA offers an alternative (complementary) approach to EHA

Sequence analysis using OMA
1. Sequences of statuses are processed by...
2. Optimal Matching Analysis (OMA), which results in...
3. A distance matrix representing the closeness (proximity) of each sequence to all others, which can then be processed by...
4. Cluster analysis, which leads to the construction of...
5. A typology of sequence categories

Single Sequences
social class (S/N/M), e.g.
case 1: SSSSSSSSSS
case 2: NNNNNSSSSS
case 3: NNNNNMMMMM
etc.
Case Analysis: the resulting typology is an end in itself
Variable Analysis: the typology is used as a predictor or a dependent variable
Class, employment status, qualifications, housing, marital status... can all be analysed in this way, giving a range of typologies, but these don't account for interactions, as each typology is arrived at independently
Why not combine sequence data prior to analysis in order to capture interactions?

Analysis: process
Create sequence data file
Determine what to do with internal gaps (fill, delete or skip)
Determine the costs to be used in the OMA (indel and substitution). These are the parameters which define the distances between the sequences. They work by giving low distance scores to similar sequences and high scores to dissimilar sequences
Perform the OMA (though there are other SA techniques)
Weight the distance scores to account for different sequence lengths
Perform cluster analysis
Analyse clusters (i. sequence progression ii. covariates)
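The gap-handling and length-weighting steps above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the slides do not say how gaps were filled or how distances were weighted, so carrying the last observed state forward and dividing by the longer sequence length are both assumptions, and the function names and the "-" gap marker are hypothetical.

```python
def fill_internal_gaps(seq, gap="-"):
    """Fill internal gaps by carrying the last observed state forward.

    Assumption: 'fill' means last-observation-carried-forward.
    Leading and trailing gaps are not internal and are left untouched.
    """
    states = list(seq)
    observed = [i for i, s in enumerate(states) if s != gap]
    if not observed:
        return seq
    first, last = observed[0], observed[-1]
    for i in range(first + 1, last):
        if states[i] == gap:
            states[i] = states[i - 1]
    return "".join(states)


def delete_internal_gaps(seq, gap="-"):
    """Drop internal gaps, which shortens the sequence."""
    observed = [i for i, s in enumerate(seq) if s != gap]
    if not observed:
        return seq
    first, last = observed[0], observed[-1]
    core = seq[first:last + 1].replace(gap, "")
    return seq[:first] + core + seq[last + 1:]


def normalise_by_length(distance, seq_a, seq_b):
    """Assumption: weight a distance by the length of the longer sequence."""
    return distance / max(len(seq_a), len(seq_b))


print(fill_internal_gaps("NN-SS--M"))
print(delete_internal_gaps("NN-SS--M"))
print(normalise_by_length(10, "SSSSSSSSSS", "NNNNNSSSSS"))
```

Filling keeps every sequence on the common time axis, whereas deleting shortens sequences and makes the later length weighting more important.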

Indel and substitution costs
case 1: SSSSSSSSSS
case 2: NNNNNSSSSS
case 3: NNNNNMMMMM
If INDEL = 1 and SUBS = 2 (often a default setting):
1,2 = 10
1,3 = 20
2,3 = 10
If INDEL = 1 and SUBS = 2 (NM, MN, SM, MS) and 1.5 (NS, SN):
1,2 = 7.5
1,3 = 17.5
2,3 = 10
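The distances on this slide can be reproduced with a short dynamic-programming sketch of optimal matching. The function name `om_distance` and its parameters are illustrative, not taken from any particular software package; the recurrence is the standard edit-distance one with configurable indel and substitution costs.

```python
def om_distance(a, b, indel=1.0, sub_cost=None, default_sub=2.0):
    """Optimal matching distance between two state sequences.

    sub_cost maps (state_x, state_y) pairs to a substitution cost;
    pairs not listed fall back to default_sub. Identical states cost 0.
    """
    sub_cost = sub_cost or {}
    n, m = len(a), len(b)
    # d[i][j] = cheapest way to turn a[:i] into b[:j]
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel
    for j in range(1, m + 1):
        d[0][j] = j * indel
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                s = 0.0
            else:
                s = sub_cost.get((a[i - 1], b[j - 1]), default_sub)
            d[i][j] = min(
                d[i - 1][j] + indel,      # delete from a
                d[i][j - 1] + indel,      # insert into a
                d[i - 1][j - 1] + s,      # substitute (or match)
            )
    return d[n][m]


case1, case2, case3 = "SSSSSSSSSS", "NNNNNSSSSS", "NNNNNMMMMM"

# Uniform substitution cost of 2
print(om_distance(case1, case2), om_distance(case1, case3), om_distance(case2, case3))

# Cheaper N<->S substitution (1.5); all other substitutions stay at 2
cheaper_ns = {("N", "S"): 1.5, ("S", "N"): 1.5}
print(om_distance(case1, case2, sub_cost=cheaper_ns),
      om_distance(case1, case3, sub_cost=cheaper_ns),
      om_distance(case2, case3, sub_cost=cheaper_ns))
```

With indel = 1, a substitution costing 2 is never cheaper than one deletion plus one insertion, which is why lowering a substitution cost below 2 is what changes the distances.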

Data: BHPS born tracked from age 21 to 29
data shifted to a common time axis
class and qualifications examined here (housing, marriage, employment status and fertility status also processed)
All internal gaps filled
All sequence lengths included
Year on year transitions used to inform substitution costs

Sequence gaps over waves A to N

Data: BHPS

Single Sequences: class
0 = no job yet
1 = Service class Higher
2 = Service class Lower
3 = Non-manual
4 = Self
5 = Skilled
6 = Unskilled

Proportions of time spent in a particular class

Year on year class transitions

Year on year class transitions: off diagonal proportions (N = 1512)

Class substitution costs

        None  sch   scl   nm    self  skil  unsk
None    0.0   1.8   1.8   1.8   1.8   1.8   1.8
sch     1.8   0.0   1.2   1.3   1.8   1.7   1.7
scl     1.8   1.2   0.0   1.1   1.7   1.3   1.3
nm      1.8   1.3   1.1   0.0   1.7   1.6   1.3
self    1.8   1.8   1.7   1.7   0.0   1.6   1.6
skil    1.8   1.7   1.3   1.6   1.6   0.0   1.2
unsk    1.8   1.7   1.3   1.3   1.6   1.2   0.0
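The slides say only that year on year transitions "inform" the substitution costs; the exact formula is not given. One common convention, assumed here purely for illustration, sets the cost between two states to 2 minus the two observed transition rates between them, so that frequent transitions become cheap substitutions. The function names are hypothetical and the three input sequences are the toy cases from the earlier slide, not BHPS data.

```python
from collections import Counter


def transition_rates(sequences):
    """Rate of moving from state x to state y between adjacent time points."""
    pair_counts = Counter()
    origin_counts = Counter()
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            pair_counts[(prev, nxt)] += 1
            origin_counts[prev] += 1
    return {pair: n / origin_counts[pair[0]] for pair, n in pair_counts.items()}


def substitution_costs(sequences, states):
    """Assumed rule: cost(x, y) = 2 - rate(x -> y) - rate(y -> x)."""
    rates = transition_rates(sequences)
    costs = {}
    for x in states:
        for y in states:
            if x == y:
                costs[(x, y)] = 0.0
            else:
                costs[(x, y)] = 2.0 - rates.get((x, y), 0.0) - rates.get((y, x), 0.0)
    return costs


toy = ["SSSSSSSSSS", "NNNNNSSSSS", "NNNNNMMMMM"]
costs = substitution_costs(toy, states="SNM")
for pair in [("N", "S"), ("N", "M"), ("S", "M")]:
    print(pair, round(costs[pair], 2))
```

This rule always yields a symmetric matrix with values between 0 and 2, which matches the class matrix above; a different rule would be needed to produce an asymmetric matrix.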

Cluster analysis of class sequences
An eight cluster solution produces the following:

Clus  % cases  description
1     17       non-manual, little if any mobility
2     12       service class, lower, little mobility
3     13       unskilled, little mobility
4     12       moving from unskilled to skilled work
5     15       mixed
6      6       skilled, little mobility
7     19       upwards mobility, NM, SCL, SCH
8      5       self employed, little mobility
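To show how a distance matrix turns into a cluster solution, here is a minimal agglomerative clustering sketch. The slides do not name the clustering method, so average linkage is an assumption, and the four-sequence distance matrix is made up for illustration rather than drawn from the BHPS results.

```python
def average_linkage(dist, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    dist is a symmetric matrix (list of lists) of pairwise distances;
    closeness between clusters is the mean distance over member pairs.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[x] for j in clusters[y]]
                avg = sum(dist[i][j] for i, j in pairs) / len(pairs)
                if best is None or avg < best[0]:
                    best = (avg, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical distances between four sequences (indices 0..3)
example_dist = [
    [0, 2, 10, 12],
    [2, 0, 9, 11],
    [10, 9, 0, 3],
    [12, 11, 3, 0],
]
print(average_linkage(example_dist, n_clusters=2))
```

The number of clusters is chosen by the analyst, which is one reason the typology is best read as exploratory.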

Single Sequences: highest qualification
1 = HE
2 = Post GCSE/O grade
3 = GCSE / O grade
4 = Other
5 = None/at school

Proportions of time spent in highest qualification statuses

Year on year changes in HEQ

Year on year changes in HEQ: off diagonal proportions (N = 248)

HEQ substitution costs

        None  HE    A     O     oth   none
None    0.0   2.0   2.0   2.0   2.0   2.0
HE      1.8   0.0   2.0   2.0   2.0   2.0
A       1.8   1.1   0.0   2.0   2.0   2.0
O       1.8   1.8   1.2   0.0   2.0   2.0
Other   1.8   1.7   1.6   1.7   0.0   1.8
None    1.8   1.8   1.6   1.7   1.7   0.0

Cluster analysis of HEQ sequences
A seven cluster solution produces the following:

Clus  % cases  description
1     17       from GCSE to post-GCSE
2      7       late post GCSE to HE
3     30       post GCSE, stable
4     13       early post GCSE to HE
5      6       no qualifications
6     14       GCSE, stable
7     11       other, stable

Multiple Sequence Analysis (MSA)
combine different sequences prior to OMA processing
e.g. class, qualifications (housing, marital and fertility statuses) are combined in a single measure
the sequences represent a narrative of change (or stability) on the measured dimensions
the resulting typology can be analysed using case and variable methods as before, but is in itself a representation of complex time-embedded associations between the source variables

Multiple Sequences: class and highest qualification
1st digit:
1 = HE
2 = Post GCSE/O grade
3 = GCSE / O grade
4 = Other
5 = None/at school
2nd digit:
0 = no job yet
1 = Service class Higher
2 = Service class Lower
3 = Non-manual
4 = Self
5 = Skilled
6 = Unskilled
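The two-digit coding above amounts to pairing the qualification code and the class code at each time point. A minimal sketch follows; the function name and the example person are hypothetical, and each sequence is assumed to be a string of single-character codes on the common time axis.

```python
QUAL_CODES = "12345"      # 1st digit: highest qualification
CLASS_CODES = "0123456"   # 2nd digit: class


def combine_sequences(qual_seq, class_seq):
    """Pair qualification and class codes year by year into two-digit states."""
    if len(qual_seq) != len(class_seq):
        raise ValueError("sequences must cover the same time points")
    return [q + c for q, c in zip(qual_seq, class_seq)]


# Every possible combined state: 5 qualification codes x 7 class codes
ALL_STATES = [q + c for q in QUAL_CODES for c in CLASS_CODES]

# Hypothetical person: gains qualifications while moving between classes
combined = combine_sequences("3322211", "0663322")
print(combined)
print(len(ALL_STATES))
```

Because the combined states are two characters long, they are kept as a list of tokens rather than joined into one string, so that each time point remains a single state when distances are computed.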

Year on year changes
This is a large (35 by 35) matrix
Calculation of substitution costs as for single sequence structure
Frequent transitions: (2.9%) (2.3%) (2.6%) (2.6%) (4.5%) (5.9%) (2.4%)

Sequence analysis of class-HEQ data

Clus  %   description
1     11  post GCSE, NM, stable
2      8  post GCSE HE, NM SCL
3      5  no quals, self emp unsk
4     10  GCSE, mixed emp (self, sk, unsk)
5      7  post GCSE, NM SCL
6      7  GCSE, NM both stable
7      4  post GCSE skilled, both stable
8      6  from unsk and sk SCH, HE
9     15  mixed
10     4  other quals and SCL, SCH
11     3  post GCSE, SCH/SCL switching
12     6  other and sk/unsk, stable
13     8  post GCSE, unsk stable
14     2  post GCSE, self, stable

Advantages of MSA
Is not limited to a single sequence measure
Is not limited to a single event type
Articulates the full scope of related sequences together

Issues
Increasing complexity of the measure as new variables are drawn in
Computing time / software switching
Lack of formal rules in executing the OMA and clustering processes
Largely exploratory: scope to develop in relation to EHA