S. Herget, R.Ranzinger, K.Maass and C.- W.v.d.Lieth Presented by Yingxin Guo GlycoCT—a unifying sequence format for carbohydrates.


1 S. Herget, R.Ranzinger, K.Maass and C.- W.v.d.Lieth Presented by Yingxin Guo GlycoCT—a unifying sequence format for carbohydrates

2 An overview of the sequence formats used in glycobioinformatics

3 Special structural features

4 Uniqueness: a central requirement for encoding carbohydrate sequences
Why: A unique encoding can serve as the primary key in a database. It also benefits the implementation of exact structure search.
How: Apply strict sorting rules. Define a controlled vocabulary. Support the encoding of uncertain linkages and unspecified monosaccharides.

5 General idea of GlycoCT

6 Basic monosaccharide namespace

7 Basic residue (RES) entities in GlycoCT. Substituents and other entities.

8 Modeling the topology Residue entities are modeled in RES section. Linkages are modeled in LIN section. Atom replacement schema.
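The split into a residue section and a linkage section can be sketched as a small data model. This is only an illustration, not the GlycoCT reference implementation: the names `Residue`, `Linkage` and `serialize`, the residue names, and the simplified line syntax are all assumptions, and the atom replacement information is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    """One entry of the RES section (hypothetical, simplified)."""
    index: int
    kind: str   # assumed tags: "b" for basetype, "s" for substituent
    name: str


@dataclass(frozen=True)
class Linkage:
    """One entry of the LIN section: an edge from a parent to a child residue."""
    index: int
    parent: int
    parent_pos: int
    child: int
    child_pos: int


def serialize(residues, linkages):
    """Write residues and linkages into two separate sections."""
    lines = ["RES"]
    lines += [f"{r.index}{r.kind}:{r.name}" for r in residues]
    lines.append("LIN")
    lines += [
        f"{l.index}:{l.parent}({l.parent_pos}+{l.child_pos}){l.child}"
        for l in linkages
    ]
    return "\n".join(lines)


# Illustrative structure: one basetype carrying a substituent and a second basetype.
residues = [
    Residue(1, "b", "glucose"),
    Residue(2, "s", "n-acetyl"),
    Residue(3, "b", "galactose"),
]
linkages = [
    Linkage(1, parent=1, parent_pos=2, child=2, child_pos=1),
    Linkage(2, parent=1, parent_pos=4, child=3, child_pos=1),
]
text = serialize(residues, linkages)
print(text)
```

Keeping residues and linkages in separate lists means the topology is carried entirely by the index references in the linkage entries, which is the point the slide makes.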

9 Encoding linkage

10 Encoding Repeating units

11 Encoding alternative units

12 Encoding underdetermined units

13 Sorting
Why: One central requirement is to generate a unique representation for all carbohydrates. Sorting determines the order in which elements appear.
How: GlycoCT uses a set of hierarchical rules to define the ordering of residues, linkages and special structural features. These comprise four algorithms: residue comparison, linkage comparison, underdetermined subtree comparison, and alternative subtree comparison.

14 Residue comparison
Applied when multiple starting points exist.
Rules, applied in order: (1) number of child residues; (2) length of the longest branch; (3) number of terminal residues; (4) number of branching points; (5) lexical order.
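The hierarchical residue rules map naturally onto a tuple sort key, where each later element only breaks ties left by the earlier ones. The sketch below rests on several assumptions not stated on the slide: "child residues" is read as all descendants, larger values sort first for the four numeric rules, and the lexical rule compares a plain residue name. The `Node` class and all function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A residue with its child residues (hypothetical tree model)."""
    name: str
    children: list = field(default_factory=list)


def count_descendants(node):
    # Assumption: "number of child residues" means all residues below this one.
    return sum(1 + count_descendants(c) for c in node.children)


def longest_branch(node):
    # Number of edges on the longest path from this residue to a terminal one.
    if not node.children:
        return 0
    return 1 + max(longest_branch(c) for c in node.children)


def count_terminals(node):
    # A residue without children counts as one terminal residue.
    if not node.children:
        return 1
    return sum(count_terminals(c) for c in node.children)


def count_branch_points(node):
    # A branching point is a residue with more than one child.
    own = 1 if len(node.children) > 1 else 0
    return own + sum(count_branch_points(c) for c in node.children)


def residue_key(node):
    # Negated counts put larger values first (assumed direction); name is the final tie-break.
    return (
        -count_descendants(node),
        -longest_branch(node),
        -count_terminals(node),
        -count_branch_points(node),
        node.name,
    )


branched = Node("glc", [Node("gal"), Node("man")])
linear = Node("glc", [Node("gal", [Node("man")])])
single = Node("gal")

ordered = sorted([single, branched, linear], key=residue_key)
```

Here `linear` and `branched` tie on the first rule (two descendants each), so the second rule decides and the longer branch comes first; `single` has no descendants and sorts last.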

15 Linkage comparison
Decides the internal order of the RES and LIN sections.
Rules, applied in order: (1) number of bonds between parent and child residues; (2) atom linkage position at the parent residue; (3) atom linkage position at the child residue; (4) linkage type at the parent residue; (5) comparison of the child residues with the residue comparison algorithm.
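The linkage rules can be sketched the same way, as a tuple key whose fields follow the rule order. This is a simplified illustration under stated assumptions: ascending order is assumed for every field, the linkage type is a placeholder string, and the final rule (full residue comparison of the children) is reduced to comparing a child name. The `Edge` class and `linkage_key` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A linkage between a parent and a child residue (hypothetical model)."""
    bond_count: int    # rule 1: number of bonds between parent and child
    parent_pos: int    # rule 2: linkage position at the parent residue
    child_pos: int     # rule 3: linkage position at the child residue
    parent_type: str   # rule 4: linkage type at the parent (placeholder values)
    child_name: str    # rule 5: stand-in for the full residue comparison


def linkage_key(edge):
    # Earlier fields dominate; later fields only break ties.
    return (
        edge.bond_count,
        edge.parent_pos,
        edge.child_pos,
        edge.parent_type,
        edge.child_name,
    )


edges = [
    Edge(1, 4, 1, "o", "gal"),
    Edge(1, 2, 1, "d", "n-acetyl"),
    Edge(1, 4, 1, "o", "fuc"),
]
ordered_edges = sorted(edges, key=linkage_key)
child_order = [e.child_name for e in ordered_edges]
```

The two linkages at parent position 4 tie on the first four rules, so only the child comparison separates them.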

16 Underdetermined subtree & alternative subtree comparison
The encoding of underdetermined (UND) and alternative (ALT) units is handled separately from the description of the other topological features. The rules of the residue and linkage comparison algorithms are applied within each UND and ALT to determine its internal order. The reducing residues of the UNDs and ALTs are then compared with the residue comparison algorithm. If two compared UNDs are identical, their parent residues and their linkages to the main graph are compared.

17 First application and results
All the monosaccharides from CarbBank were translated to the naming defined by GlycoCT. The 1439 different names in CarbBank resulted in 474 different basetypes and 29 different substituents, reducing the number of distinct residues by 65%.
Two main reasons for the reduction: the separation of monosaccharides into basetype and substituents, and the unique encoding for monosaccharides.
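The 65% figure can be checked with a few lines of arithmetic, assuming it compares the 1439 CarbBank names against the combined count of basetypes and substituents (that reading is an assumption, not something the slide states).

```python
carbbank_names = 1439
basetypes = 474
substituents = 29

# Assumption: the distinct GlycoCT entities are basetypes plus substituents.
glycoct_entities = basetypes + substituents
reduction = 1 - glycoct_entities / carbbank_names

print(f"{glycoct_entities} entities, reduction {reduction:.1%}")
```

Under that reading the numbers on the slide are consistent with each other.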

18 Conclusion
GlycoCT offers a superset of the capabilities of all known sequence formats in glycobioinformatics. It supports structurally undetermined sequences. The consistent naming scheme for monosaccharides can be easily maintained.

