MATH:7450 (22M:305) Topics in Topology: Scientific and Engineering Applications of Algebraic Topology Nov 22, 2013: Topological methods for exploring low-density.


1 MATH:7450 (22M:305) Topics in Topology: Scientific and Engineering Applications of Algebraic Topology. Nov 22, 2013: Topological methods for exploring low-density states in biomolecular folding pathways. Fall 2013 course offered through the University of Iowa Division of Continuing Education. Isabel K. Darcy, Department of Mathematics, Applied Mathematical and Computational Sciences, University of Iowa

2 You can join the live lecture on Wednesday and Friday either by going to
or by joining via the regular classroom. NOTE: to ask questions, you need to join via the regular classroom.

3 IMA Annual Program Year Workshop, December 9-13, 2013
Topological Structures in Computational Biology
Tuesday December 10, 2013, 11:30am-12:20pm: Pek Lum (Ayasdi, Inc.)
Friday December 13, 2013, 9:00am-9:50am: Monica Nicolau (Stanford University)

5 Example: Point cloud data representing a hand.
Data Set: Point cloud data representing a hand. Function f : Data Set → R. Example: x-coordinate, f : (x, y, z) ↦ x

6 Put data into overlapping bins. Example: f^{-1}(a_i, b_i)
[Figure: overlapping intervals (a_i, b_i) covering the range of f] Function f : Data Set → R. Ex 1: x-coordinate, f : (x, y, z) ↦ x

7 D) Cluster each bin & create network.
Vertex = a cluster of a bin. Edge = nonempty intersection between clusters. Need covering. Resolution multiscale.

10 Example: Point cloud data representing a hand.
A) Data Set. Example: Point cloud data representing a hand.
B) Function f : Data Set → R. Example: x-coordinate, f : (x, y, z) ↦ x
C) Put data into overlapping bins. Example: f^{-1}(a_i, b_i)
D) Cluster each bin & create network. Vertex = a cluster of a bin. Edge = nonempty intersection between clusters
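
The pipeline on this slide (filter function, overlapping bins, clustering, network) can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the lecture: the helper names (`overlapping_intervals`, `single_linkage`, `mapper`), the closed intervals, the single-linkage clustering rule, and the parameters `n_bins`, `overlap`, and `eps` are all assumptions chosen for brevity. The toy data set is a circle rather than a hand, with the x-coordinate as filter.

```python
import numpy as np


def overlapping_intervals(lo, hi, n_bins, overlap):
    """Cover [lo, hi] with n_bins equal intervals; neighbors share a fraction `overlap`."""
    length = (hi - lo) / (n_bins - (n_bins - 1) * overlap)
    step = length * (1 - overlap)
    intervals = [(lo + i * step, lo + i * step + length) for i in range(n_bins)]
    intervals[-1] = (intervals[-1][0], hi)  # guard against rounding at the right end
    return intervals


def single_linkage(points, eps):
    """Label points so that two points share a label if hops of length <= eps join them."""
    labels = -np.ones(len(points), dtype=int)
    current = 0
    for seed in range(len(points)):
        if labels[seed] != -1:
            continue
        labels[seed] = current
        stack = [seed]
        while stack:
            i = stack.pop()
            dists = np.linalg.norm(points - points[i], axis=1)
            for j in np.where((dists <= eps) & (labels == -1))[0]:
                labels[j] = current
                stack.append(j)
        current += 1
    return labels


def mapper(data, filter_values, n_bins=5, overlap=0.3, eps=0.5):
    """Return (vertices, edges); each vertex is a frozenset of row indices of `data`."""
    vertices = []
    lo, hi = filter_values.min(), filter_values.max()
    for a, b in overlapping_intervals(lo, hi, n_bins, overlap):
        # Preimage of the bin under the filter (closed interval for simplicity).
        idx = np.where((filter_values >= a) & (filter_values <= b))[0]
        if len(idx) == 0:
            continue
        labels = single_linkage(data[idx], eps)
        for c in np.unique(labels):
            vertices.append(frozenset(int(i) for i in idx[labels == c]))
    # Edge = nonempty intersection between two clusters.
    edges = {(i, j)
             for i in range(len(vertices))
             for j in range(i + 1, len(vertices))
             if vertices[i] & vertices[j]}
    return vertices, edges


# Toy data: 100 points on the unit circle, filter = x-coordinate.
theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
vertices, edges = mapper(circle, circle[:, 0], n_bins=4, overlap=0.3, eps=0.2)
print(len(vertices), "vertices,", len(edges), "edges")
```

With these settings the two middle bins each split into an upper and a lower arc, so the resulting network is a loop, which reflects the shape of the circle.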

11 Choose filter

13 http://scitation.aip.org/content/aip/journal/jcp/130/14/10.1063/1

14 Data: Contact maps from 2,800 Serial Replica Exchange Molecular Dynamics (SREMD) simulations of the GCAA tetraloop on the distributed computing platform. 760 trajectories with a complete unfolding event; 550 trajectories with a complete refolding event. Goal: To determine secondary structure pathways between the folded and unfolded states.

15 Problem: Many more folded and unfolded conformations than intermediate conformations
How to distinguish intermediate conformations from noise? Solution: Choose f : space of conformations → R, f(conformation) = density

17 550 trajectories with a complete refolding event
2952 configurations

18 Distance = Hamming distance
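
As a rough illustration of how a density filter can be built on top of the Hamming distance, the sketch below treats each contact map as a flattened 0/1 vector. The Gaussian kernel, the `bandwidth` parameter, the function names, and the toy contact maps are assumptions made for this example; the slides do not specify the density estimator.

```python
import numpy as np


def hamming_distances(contact_maps):
    """Pairwise Hamming distances between rows of a 0/1 array (one contact map per row)."""
    maps = np.asarray(contact_maps, dtype=int)
    # Entry (i, j) counts the positions where map i and map j disagree.
    return (maps[:, None, :] != maps[None, :, :]).sum(axis=2)


def density_filter(distances, bandwidth):
    """Kernel density estimate at each conformation (Gaussian kernel is an assumption)."""
    return np.exp(-(distances / bandwidth) ** 2).mean(axis=1)


# Hypothetical toy data: three near-identical maps and one outlier.
toy_maps = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
D = hamming_distances(toy_maps)
density = density_filter(D, bandwidth=1.0)
print(D)
print(density)
```

Conformations in crowded regions (here the first three maps) receive a higher filter value than the isolated one. The broadcasted comparison uses memory quadratic in the number of maps, so a real data set would likely need a chunked computation.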

21 760 trajectories with a complete unfolding event
4330 configurations

22 Mapper applied to SNP data:
Bartlett et al., "An eQTL biological data visualization challenge and approaches from the visualization community," BMC Bioinformatics 2012, 13(Suppl 8):S8

23 Monday December 09, 2013 9:00am-9:50am Visualizing and Exploring Molecular Simulation Data via Energy Landscape Metaphor Yusu Wang (The Ohio State University)

24 Motivation
Let S = set of conformations of the survivin protein. Energy landscape E : S → R, E(conformation) = energy of the conformation

25 Data from: 20,000 conformations obtained via replica exchange molecular dynamics. The backbone = 46 alpha-carbon atoms, so each conformation is described by a 1035-dimensional vector of pairwise distances (46 · 45 / 2 = 1035) capturing the protein shape. Intrinsic dimensionality of the conformational manifold has been estimated at around 20.
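
The count of 1035 is the number of unordered pairs among 46 atoms. The sketch below shows one way such a feature vector could be computed from 3D coordinates; the function name and the random stand-in coordinates are hypothetical, not the survivin data.

```python
import numpy as np


def pairwise_distance_vector(coords):
    """Flatten all pairwise distances between atoms (rows of coords) into one vector."""
    coords = np.asarray(coords, dtype=float)
    i, j = np.triu_indices(len(coords), k=1)  # every pair (i, j) with i < j
    return np.linalg.norm(coords[i] - coords[j], axis=1)


# Hypothetical backbone: 46 random points standing in for alpha-carbon positions.
rng = np.random.default_rng(0)
backbone = rng.normal(size=(46, 3))
features = pairwise_distance_vector(backbone)
print(features.shape)  # (1035,)
```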

26 Contour tree
Level set = f^{-1}(r) = { x in M | f(x) = r }. A contour = a connected component of a level set. Let C_q = the contour in M that is collapsed to q. Let TopoComp(edge) = ∪_{q in edge} C_q. (1940's: Reeb graph.) How do you embed the tree?

27 Given f : M^d → R, find g : R^2 → R such that (1) f and g share the same contour tree, and (2) the area of TopoComp(edge) of g is the same as the volume of the corresponding TopoComp(edge) of f, for each edge in the contour tree. Expands upon Weber's Topological Landscapes, 2007

28 f : M^2 → R

31 Figure 8: (a) Slice-and-dice and (b) Voronoi treemap layouts of terrains in Figure 6.

39 FIG. 2. (Color) (a) NMR structure of the GCAA tetraloop. (b) Contact map for the native state.
Bases are numbered from 1 to 12 and native base-pair contacts (dotted lines) are numbered 1–4.

40 FIG. 3. (Color online) Graphical representation of pathways by MAPPER.
(a) Unfolding pathway. (b) Folding pathway. In both cases, the top row graphs are the outputs from MAPPER, while the bottom row depicts the mean contact maps of the corresponding clusters. For clarity in mean contact maps, we drop those mean contacts lower than 0.4. The node colors from red to blue indicate the density from high to low, and the labels (e.g., 100%) show the percentage of configurations of the same level included in the cluster corresponding to the node. We dropped all the clusters of size smaller than 3% of the level size. (a) shows that unfolding has a single dominant pathway characterized by unzipping from the end base pair. (b) shows that the folding process has two dominant pathways, passing through either the formation of the closing base pair or the end base pair. A noisy cluster consisting of 3% of the level size is also shown in (b), which accounts for reptation, i.e., sliding of the two strands of the stem.

42 FIG. 4. (Color online) Transition probability from two intermediate states.
Lag time is 2 ps. The left four nodes (extended structures in Fig. 3(b)) are merged into node U, and the right three nodes (folded structures) are collected in node F. The two intermediate states on pathways are denoted by I1 and I2, respectively. The transition probabilities from I1 and I2 to other states are noted as numbers on the arrows. One can see that I1 and I2 are kinetically separated.
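
Transition probabilities of this kind are commonly estimated by counting state-to-state jumps at a fixed lag and normalizing each row. The sketch below assumes trajectories have already been discretized into the four states U, I1, I2, F; the state encoding, the function name, the toy trajectories, and the lag of one frame are all hypothetical and are not taken from the paper.

```python
import numpy as np

U, I1, I2, F = 0, 1, 2, 3  # hypothetical integer labels for the four states


def transition_matrix(trajectories, n_states, lag):
    """Row-normalized counts of jumps state[t] -> state[t + lag]; lag is in frames (>= 1)."""
    counts = np.zeros((n_states, n_states))
    for traj in trajectories:
        for start, end in zip(traj[:-lag], traj[lag:]):
            counts[start, end] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observed jumps stay all zero instead of dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


# Toy trajectories, invented for illustration only.
toy_trajectories = [
    [U, I1, F, F],
    [U, I2, F],
    [U, I1, U],
]
P = transition_matrix(toy_trajectories, n_states=4, lag=1)
print(P)
```

In this toy example the entries `P[I1, I2]` and `P[I2, I1]` are both zero, which is the pattern the caption describes as kinetic separation of the two intermediates.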

43 FIG. 5. (Color online) The number of end base-pair clusters found by K-means.
Here k ranges from 2 to 80 with step 2. For each k, 20 experiments are repeated with K-means clustering. The number of clusters with the end base pair formed is recorded. The star is the median of such numbers and the bar delimits the distribution range from 10% to 90%. Starting from around k = 25, such clusters appear with at least 1/2 probability. Around k = 55, such clusters begin to split. The instability of K-means clusters increases as k grows, as indicated by the expanding ranges.
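
The repeated-runs experiment in this caption can be imitated on synthetic data. The sketch below uses a plain Lloyd-style K-means written with NumPy and counts, for each run, the clusters whose mean value in one chosen column exceeds 0.5 (a stand-in for "end base pair formed"). The data, the threshold, the column index, and all function names are assumptions for illustration; the toy scan also uses a much smaller range of k than the figure.

```python
import numpy as np


def kmeans(data, k, rng, n_iter=50):
    """Basic Lloyd iterations from k randomly chosen data points."""
    centers = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = data[labels == c]
            if len(members):  # an empty cluster keeps its previous center
                centers[c] = members.mean(axis=0)
    return labels, centers


def count_feature_clusters(data, k, feature, rng, threshold=0.5):
    """Number of non-empty clusters whose mean in column `feature` exceeds `threshold`."""
    labels, centers = kmeans(data, k, rng)
    occupied = np.unique(labels)
    return int((centers[occupied, feature] > threshold).sum())


def stability_scan(data, ks, feature, n_repeats=20, seed=0):
    """For each k, return (median, 10th percentile, 90th percentile) over repeated runs."""
    rng = np.random.default_rng(seed)
    summary = {}
    for k in ks:
        counts = [count_feature_clusters(data, k, feature, rng) for _ in range(n_repeats)]
        summary[k] = (np.median(counts), np.percentile(counts, 10), np.percentile(counts, 90))
    return summary


# Synthetic 0/1 data: 200 "dense" rows with column 0 off, 10 "rare" rows with column 0 on.
data_rng = np.random.default_rng(1)
dense = (data_rng.random((200, 6)) < 0.9).astype(float)
dense[:, 0] = 0.0
rare = (data_rng.random((10, 6)) < 0.5).astype(float)
rare[:, 0] = 1.0
toy_data = np.vstack([dense, rare])

summary = stability_scan(toy_data, ks=range(2, 12, 2), feature=0)
for k, (median, low, high) in summary.items():
    print(k, median, low, high)
```

A widening gap between the 10th and 90th percentiles as k grows would correspond to the instability that the caption reports.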

44 FIG. 6. (Color online) K-means clustering fails to capture the low-density intermediate states with one end base pair formed.
The illustration here chooses k = 30 for K-means clustering. (a) shows how end base-pair formed structures are distributed in different K-means clusters; (b) illustrates the mean structures of the top eight K-means clusters (gray) which contain base-pair formed structures. K-means splits the MAPPER cluster and lumps the pieces with the densest clusters.

54 Example: Point cloud data representing a hand.
Data Set: Point cloud data representing a hand. Function f : Data Set → R. Example: x-coordinate, f : (x, y, z) ↦ x. Put data into overlapping bins. Example: f^{-1}(a_i, b_i)

55 An eQTL biological data visualization challenge and approaches from the visualization community, Bartlett et al. BMC Bioinformatics 2012, 13(Suppl 8):S8

