Discrete Kernels.

Slides:



Advertisements
Similar presentations
Network analysis Sushmita Roy BMI/CS 576
Advertisements

Lecture 5 Graph Theory. Graphs Graphs are the most useful model with computer science such as logical design, formal languages, communication network,
CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.
Fast Jensen-Shannon Graph Kernel Bai Lu and Edwin Hancock Department of Computer Science University of York Supported by a Royal Society Wolfson Research.
The multi-layered organization of information in living systems
Hidden Markov Models: Applications in Bioinformatics Gleb Haynatzki, Ph.D. Creighton University March 31, 2003.
Random Walks Ben Hescott CS591a1 November 18, 2002.
Mining and Searching Massive Graphs (Networks)
1 Representing Graphs. 2 Adjacency Matrix Suppose we have a graph G with n nodes. The adjacency matrix is the n x n matrix A=[a ij ] with: a ij = 1 if.
1 Graph Mining Applications to Machine Learning Problems Max Planck Institute for Biological Cybernetics Koji Tsuda.
Lecture 21: Spectral Clustering
Graph Based Semi- Supervised Learning Fei Wang Department of Statistical Science Cornell University.
Mismatch string kernels for discriminative protein classification By Leslie. et.al Presented by Yan Wang.
Network Statistics Gesine Reinert. Yeast protein interactions.
Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, Slides for Chapter 1:
Modularity in Biological networks.  Hypothesis: Biological function are carried by discrete functional modules.  Hartwell, L.-H., Hopfield, J. J., Leibler,
Biological networks Tutorial 12. Protein-Protein interactions –STRING Protein and genetic interactions –BioGRID Signaling pathways –SPIKE Network visualization.
Centrality Measures These measure a nodes importance or prominence in the network. The more central a node is in a network the more significant it is to.
Graph, Search Algorithms Ka-Lok Ng Department of Bioinformatics Asia University.
Protein Homology Detection Using String Alignment Kernels Jean-Phillippe Vert, Tatsuya Akutsu.
CSE 321 Discrete Structures Winter 2008 Lecture 25 Graph Theory.
1 NHDC and PHDC: Local and Global Heat Diffusion Based Classifiers Haixuan Yang Group Meeting Sep 26, 2005.
CMPT-825 (Natural Language Processing) Presentation on Zipf’s Law & Edit distance with extensions Presented by: Kaustav Mukherjee School of Computing Science,
Systematic Analysis of Interactome: A New Trend in Bioinformatics KOCSEA Technical Symposium 2010 Young-Rae Cho, Ph.D. Assistant Professor Department of.
Network Measures Social Media Mining. 2 Measures and Metrics 2 Social Media Mining Network Measures Klout.
GRAPH Learning Outcomes Students should be able to:
Lecture7 Topic1: Graph spectral analysis/Graph spectral clustering and its application to metabolic networks Topic 2: Concept of Line Graphs Topic 3: Introduction.
Soon-Hyung Yook, Sungmin Lee, Yup Kim Kyung Hee University NSPCS 08 Unified centrality measure of complex networks.
Network Analysis and Application Yao Fu
Presentation: Random Walk Betweenness, J. Govorčin Laboratory for Data Technologies, Faculty of Information Studies, Novo mesto – September 22, 2011 Random.
1 Burning a graph as a model of social contagion Anthony Bonato Ryerson University Institute of Software Chinese Academy of Sciences.
Part 1: Biological Networks 1.Protein-protein interaction networks 2.Regulatory networks 3.Expression networks 4.Metabolic networks 5.… more biological.
Support Vector Machines Reading: Ben-Hur and Weston, “A User’s Guide to Support Vector Machines” (linked from class web page)
Soon-Hyung Yook, Sungmin Lee, Yup Kim Kyung Hee University NSPCS 08 Unified centrality measure of complex networks: a dynamical approach to a topological.
Networks Igor Segota Statistical physics presentation.
Roee Litman, Alexander Bronstein, Michael Bronstein
1 CISC 841 Bioinformatics (Fall 2007) Kernel engineering and applications of SVMs.
Lecture 3 1.Different centrality measures of nodes 2.Hierarchical Clustering 3.Line graphs.
341- INTRODUCTION TO BIOINFORMATICS Overview of the Course Material 1.
Graphs A graphs is an abstract representation of a set of objects, called vertices or nodes, where some pairs of the objects are connected by links, called.
Enhanced Regulatory Sequence Prediction Using Gapped k-mer Features 王荣 14S
Graph Theory. undirected graph node: a, b, c, d, e, f edge: (a, b), (a, c), (b, c), (b, e), (c, d), (c, f), (d, e), (d, f), (e, f) subgraph.
Class 2: Graph Theory IST402. Can one walk across the seven bridges and never cross the same bridge twice? Network Science: Graph Theory THE BRIDGES OF.
Class 2: Graph Theory IST402.
School of Information Sciences University of Pittsburgh TELCOM2125: Network Science and Analysis Konstantinos Pelechrinis Spring 2015.
Graph Representations And Traversals. Graphs Graph : – Set of Vertices (Nodes) – Set of Edges connecting vertices (u, v) : edge connecting Origin: u Destination:
Network applications Sushmita Roy BMI/CS 576 Dec 9 th, 2014.
A Fast Kernel for Attributed Graphs Yu Su University of California at Santa Barbara with Fangqiu Han, Richard E. Harang, and Xifeng Yan.
Hidden Markov Models BMI/CS 576
Graph Spectral Image Smoothing
Random Walk for Similarity Testing in Complex Networks
Shan Lu, Jieqi Kang, Weibo Gong, Don Towsley UMASS Amherst
CSCI2950-C Lecture 12 Networks
GraDe-SVM: Graph-Diffused Classification for the Analysis of Somatic Mutations in Cancer Morteza H.Chalabi, Fabio Vandin Hello.
Analysis of bio-molecular networks through RANKS (RAnking of Nodes
Random walks on complex networks
Network Science: A Short Introduction i3 Workshop
Example 1: Undirected adjacency graph Discretized polygon
Why Social Graphs Are Different Communities Finding Triangles
Intro to Alignment Algorithms: Global and Local
Graph Operations And Representation
CSCI2950-C Lecture 13 Network Motifs; Network Integration
Walks, Paths, and Circuits
Clustering Coefficients
Ilan Ben-Bassat Omri Weinstein
Asymmetric Transitivity Preserving Graph Embedding
Graphs G = (V, E) V are the vertices; E are the edges.
Partially labeled classification with Markov random walks
Shan Lu, Jieqi Kang, Weibo Gong, Don Towsley UMASS Amherst
Label propagation algorithm
Presentation transcript:

Discrete Kernels

Kernels for Sequences ACGGTTCAA ATATCGCGGGAA Similarity between sequences of different lengths ACGGTTCAA ATATCGCGGGAA

Count Kernel Inner product between symbol counts Extension: Spectrum kernels (Leslie et al., 2002) Counts the number of k-mers (k-grams) efficiently

Motivations for graph analysis Existing methods assume ” tables” Structured data beyond this framework → New methods for analysis … Osaka Female 31 ×× 0002 Tokyo Male 40 ○○ 0001 Address Sex Age Name Serial Num

Graph Structures in Biology DNA Sequence RNA Compounds A C G H C CG U UA H C C O H C C フェノール H C H H

Graph Kernels Going to define the kernel function (Kashima, Tsuda, Inokuchi, ICML 2003) Going to define the kernel function Both vertex and edges are labeled

Label path Sequence of vertex and edge labels Generated by random walking Uniform initial, transition, terminal probabilities

Path-probability vector

Kernel definition Kernels for paths Take expectation over all possible paths! Marginalized kernels for graphs

Classification of Protein 3D structures Graphs for protein 3D structures Node: Secondary structure elements Edge: Distance of two elements Calculate the similarity by graph kernels Borgwardt et al. “Protein function prediction via graph kernels”, ISMB2005

Biological Networks Protein-protein physical interaction Metabolic networks Gene regulatory networks Network induced from sequence similarity Thousands of nodes (genes/proteins) 100000s of edges (interactions)

Physical Interaction Network Undirected graphs of proteins Edge exists if two proteins physically interact Docking (Key – Keyhole) Interacting proteins tend to have the same biological function

Metabolic Network

Diffusion kernels (Kondor and Lafferty, 2002) Function prediction by SVM using a network Kernels are needed ! Define closeness of two nodes Has to be positive definite How Close?

Definition of Diffusion Kernel A: Adjacency matrix,   D: Diagonal matrix of Degrees L = D-A: Graph Laplacian Matrix Diffusion kernel matrix :Diffusion paramater Matrix exponential, not elementwise exponential

Computation of Matrix Exponential Definition Eigen-decomposition

Adjacency Matrix and Degree Matrix

Graph Laplacian Matrix L

Actual Values of Diffusion Kernels Closeness from the “central node”