Published by Vincent Stevens. Modified over 6 years ago.
1
STRING Protein networks from data and text mining
Lars Juhl Jensen
2
9.6 million proteins
3
functional associations
4
guilt by association
6
genomic context
7
gene fusion
8
Korbel et al., Nature Biotechnology, 2004
9
phylogenetic profiles
10
Korbel et al., Nature Biotechnology, 2004
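A minimal sketch of the phylogenetic-profile idea: each gene is described by the set of genomes in which it occurs, and genes with similar presence/absence patterns are candidates for a functional association. The Jaccard measure and the function name `jaccard` are illustrative assumptions, not the scoring scheme used by STRING.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity of two phylogenetic profiles.

    Each profile is the collection of genomes in which a gene is present.
    Returns a value between 0.0 (no shared genomes) and 1.0 (identical).
    """
    a, b = set(profile_a), set(profile_b)
    union = a | b
    # Two empty profiles carry no evidence, so treat them as dissimilar.
    return len(a & b) / len(union) if union else 0.0
```

Two genes present in genomes {g1, g2, g3} and {g2, g3, g4} share two of four genomes in total, so their similarity under this measure is 0.5.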
11
experimental data
12
gene coexpression
14
physical interactions
15
Jensen & Bork, Science, 2008
16
curated knowledge
17
protein complexes
19
pathways
20
Letunic & Bork, Trends in Biochemical Sciences, 2008
21
many databases
22
different formats
23
different identifiers
24
variable quality
25
not comparable
26
hard work
27
parsers
28
mapping files
29
quality scores
30
von Mering et al., Nucleic Acids Research, 2005
31
score calibration
32
von Mering et al., Nucleic Acids Research, 2005
33
implicit weighting by quality
34
common scale
35
missing most of the data
36
>10 km
37
too much to read
38
computer
39
as smart as a dog
40
teach it specific tricks
43
named entity recognition
44
comprehensive lexicon
45
cyclin dependent kinase 1
46
CDC2
47
orthographic variation
48
spaces and hyphens
49
cyclin dependent kinase 1
50
cyclin-dependent kinase 1
51
prefixes and suffixes
52
CDC2
53
hCdc2
54
“black list”
55
SDS
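The matching rules on the preceding slides can be sketched as a small lookup. This is an illustration under stated assumptions, not the actual tagger: the tiny lexicon, the identifier `CDK1`, case-insensitive matching, and the one-letter species prefixes `h`, `m`, `r` are all hypothetical choices.

```python
import re

# Hypothetical mini-lexicon: every synonym maps to one canonical identifier.
LEXICON = {
    "cyclin dependent kinase 1": "CDK1",
    "CDC2": "CDK1",
    "SDS": "SDS",  # in the lexicon, but black-listed below
}

# Hypothetical black list of names that are too ambiguous to tag.
BLACK_LIST = {"SDS"}

def normalize(name):
    """Lowercase and drop spaces and hyphens so spelling variants collide."""
    return re.sub(r"[\s\-]+", "", name.lower())

NORMALIZED_LEXICON = {normalize(k): v for k, v in LEXICON.items()}
NORMALIZED_BLACK_LIST = {normalize(n) for n in BLACK_LIST}

def lookup(mention):
    """Return the canonical identifier for a mention, or None if not tagged."""
    key = normalize(mention)
    if key in NORMALIZED_BLACK_LIST:
        return None
    if key in NORMALIZED_LEXICON:
        return NORMALIZED_LEXICON[key]
    # Assumed rule: allow a one-letter species prefix such as the "h" in "hCdc2".
    if len(key) > 1 and key[0] in "hmr" and key[1:] in NORMALIZED_LEXICON:
        return NORMALIZED_LEXICON[key[1:]]
    return None
```

With these rules, "cyclin dependent kinase 1", "cyclin-dependent kinase 1", "CDC2" and "hCdc2" all resolve to the same identifier, while "SDS" is skipped.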
56
co-mentioning
57
counting
58
within documents
59
within paragraphs
60
within sentences
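Counting co-mentions at the three levels could look roughly like this. It is a sketch with simplifying assumptions: names are found by plain substring matching, paragraphs are separated by blank lines, and sentences end at ".", "!" or "?". The function names are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

def pairs_in(text, names):
    """Return the set of alphabetically ordered name pairs that both occur in text."""
    found = sorted(n for n in names if n in text)
    return set(combinations(found, 2))

def count_comentions(documents, names):
    """Count co-mentioned name pairs per document, paragraph and sentence."""
    counts = {"document": Counter(), "paragraph": Counter(), "sentence": Counter()}
    for doc in documents:
        counts["document"].update(pairs_in(doc, names))
        for paragraph in doc.split("\n\n"):
            counts["paragraph"].update(pairs_in(paragraph, names))
            # Naive sentence splitter: break after ., ! or ? followed by whitespace.
            for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
                counts["sentence"].update(pairs_in(sentence, names))
    return counts
```

A pair mentioned in the same sentence is also counted at the paragraph and document levels, so the three counters can later be weighted differently when they are turned into a score.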
61
quality scores
62
score calibration
63
integration
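Once every evidence channel is calibrated to a common probabilistic scale, one simple way to integrate the channels is to assume they are independent and combine them as 1 minus the product of (1 - score). The slides do not spell out the formula, so the sketch below is an assumption for illustration, and `combine_scores` is a hypothetical name.

```python
from math import prod

def combine_scores(scores):
    """Combine per-channel probabilities, assuming independent evidence.

    Returns the probability that at least one channel is correct.
    """
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range [0, 1]: {s}")
    return 1.0 - prod(1.0 - s for s in scores)
```

Two channels that each support an association with probability 0.5 give a combined score of 0.75, so several pieces of weak evidence can add up to a confident association.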
64
visualization
65
string-db.org
Szklarczyk et al., Nucleic Acids Research, 2015
66
web resource
67
download files
68
REST API
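A sketch of how a request to the REST API might be assembled. The URL layout (`/api/<format>/<method>`) and the parameter names `identifiers`, `species` and `required_score` reflect the public STRING API documentation as far as we know and should be checked against the current version; the helper `network_url` is hypothetical. The code only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

STRING_API = "https://string-db.org/api"

def network_url(identifiers, species, output_format="tsv", required_score=None):
    """Build a URL for the assumed 'network' method of the STRING API.

    identifiers    -- list of protein names or identifiers
    species        -- NCBI taxonomy identifier, e.g. 9606 for human
    required_score -- optional minimum score on a 0-1000 scale (assumed)
    """
    # Multiple identifiers are joined with a carriage return, encoded as %0D.
    params = {"identifiers": "\r".join(identifiers), "species": species}
    if required_score is not None:
        params["required_score"] = required_score
    return f"{STRING_API}/{output_format}/network?{urlencode(params)}"
```

For example, `network_url(["CDK1"], 9606)` yields a URL that could be fetched with any HTTP client to retrieve the network around CDK1 in human as tab-separated text.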
69
Bioconductor package
70
Cytoscape App
71
protein query
76
disease query
80
Acknowledgments
Damian Szklarczyk, John "Scooter" Morris, Helen Cook, Michael Kuhn, Stefan Wyder, Milan Simonovic, Alberto Santos, Nadezhda Doncheva, Alexander Roth, Peer Bork, Christian von Mering