CS4030: Bio-Computing Revision Lecture


DNA Replication
– Prior to cell division, all the genetic instructions must be copied so that each new cell will have a complete set.
– DNA polymerase is the enzyme that copies DNA.
– It reads the old strand in the 3´ to 5´ direction.

Over time, genes accumulate mutations
– Causes: environmental factors (radiation, oxidation); mistakes in replication or repair
– Types: deletions, duplications, insertions, inversions, point mutations

Deletions
– Codon deletion: ACG ATA GCG TAT GTA TAG CCG…
  – Effect depends on the protein, position, etc.
  – Almost always deleterious
  – Sometimes lethal
– Frame shift mutation: ACG ATA GCG TAT GTA TAG CCG… becomes ACG ATA GCG ATG TAT AGC CG?…
  – Almost always lethal

Why align sequences?
– The draft human genome is available
– Automated gene finding is possible
– Gene: AGTACGTATCGTATAGCGTAA. What does it do?
– One approach: is there a similar gene in another species?
  – Align sequences with known genes
  – Find the gene with the best match

Are there other sequences like this one?
1) Huge public databases: GenBank, Swissprot, etc.
2) Sequence comparison is the most powerful and reliable method to determine evolutionary relationships between genes
3) Similarity searching is based on alignment
4) BLAST and FASTA provide rapid similarity searching
  a. rapid = approximate (heuristic)
  b. false positive and false negative scores

Similarity ≠ Homology
1) 25% similarity over 100 AAs is strong evidence for homology
2) Homology is an evolutionary statement which means descent from a common ancestor
  – common 3D structure
  – usually common function
  – homology is all or nothing; you cannot say "50% homologous"

Comparing two sequences
Point mutations, easy:
ACGTCTGATACGCCGTATAGTCTATCT
ACGTCTGATTCGCCCTATCGTCTATCT
Indels are difficult, must align sequences:
ACGTCTGATACGCCGTATAGTCTATCT
CTGATTCGCATCGTCTATCT
Aligned:
ACGTCTGATACGCCGTATAGTCTATCT
----CTGATTCGC---ATCGTCTATCT

Scoring a sequence alignment
Match score: +1; mismatch score: +0; gap penalty: -1
ACGTCTGATACGCCGTATAGTCTATCT
    ||||| |||   || ||||||||
----CTGATTCGC---ATCGTCTATCT
Matches: 18 × (+1); mismatches: 2 × 0; gaps: 7 × (-1)
Score = +11
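The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function name `score_alignment` and its parameters are illustrative and do not come from the lecture.

```python
def score_alignment(top, bottom, match=1, mismatch=0, gap=-1):
    """Score two already-aligned strings of equal length; '-' marks a gap."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have equal length")
    total = 0
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            total += gap          # every gap position costs the same
        elif a == b:
            total += match
        else:
            total += mismatch
    return total


top = "ACGTCTGATACGCCGTATAGTCTATCT"
bottom = "----CTGATTCGC---ATCGTCTATCT"
print(score_alignment(top, bottom))  # 18 matches, 2 mismatches, 7 gaps -> 11
```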

Origination and length penalties
We want to find alignments that are evolutionarily likely. Which of the following alignments seems more likely to you?
ACGTCTGATACGCCGTATAGTCTATCT
ACGTCTGAT-------ATAGTCTATCT
or
ACGTCTGATACGCCGTATAGTCTATCT
AC-T-TGA--CG-CGT-TA-TCTATCT
We can favour the first by penalizing a new gap more than the extension of an existing gap.

Scoring a sequence alignment (2)
Match/mismatch score: +1/+0; origination/length penalty: -2/-1
ACGTCTGATACGCCGTATAGTCTATCT
    ||||| |||   || ||||||||
----CTGATTCGC---ATCGTCTATCT
Matches: 18 × (+1); mismatches: 2 × 0; origination: 2 × (-2); length: 7 × (-1)
Score = +7
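A sketch of the same scoring with separate origination and length penalties follows. The name `score_affine` is hypothetical; the code assumes that each run of consecutive gaps in one sequence pays the origination penalty once and the length penalty once per gap position, which is how the slide's total of +7 is obtained.

```python
def score_affine(top, bottom, match=1, mismatch=0, origination=-2, length=-1):
    """Score an alignment with an origination (gap-open) and a length penalty."""
    total = 0
    prev_gap_row = None  # which sequence carried the gap in the previous column
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            row = "top" if a == "-" else "bottom"
            if row != prev_gap_row:
                total += origination   # a new gap starts here
            total += length            # every gap position pays the length penalty
            prev_gap_row = row
        else:
            total += match if a == b else mismatch
            prev_gap_row = None
    return total


top = "ACGTCTGATACGCCGTATAGTCTATCT"
print(score_affine(top, "----CTGATTCGC---ATCGTCTATCT"))  # 18 - 2*2 - 7 = 7
```

Applied to the two candidate alignments from the origination/length slide, the single 7-base gap scores 20 - 2 - 7 = 11, while the version with six scattered gaps scores 20 - 12 - 7 = 1, even though both score 13 under the simple gap penalty.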

Scoring Similarity
1) Can only score aligned sequences
2) DNA is usually scored as identical or not
3) Modified scoring for gaps: single vs. multiple base gaps (gap extension)
4) AAs have varying degrees of similarity
  – a. number of mutations to convert one to another
  – b. chemical similarity
  – c. observed mutation frequencies
5) PAM matrix calculated from observed mutations in protein families

DNA Scoring Matrix
Identity:
   A  T  C  G
A  1  0  0  0
T  0  1  0  0
C  0  0  1  0
G  0  0  0  1
BLAST: 5 on the diagonal (match), -4 for a mismatch
Transition/Transversion: 1 on the diagonal (match), -5 for a transversion

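The dynamic programming concept
Suppose we are aligning: ACTCG and ACAGTAG
Last position choices:
– G over G: +1, leaving ACTC and ACAGTA still to align
– G over -: -1, leaving ACTC and ACAGTAG still to align
– - over G: -1, leaving ACTCG and ACAGTA still to align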

We can use a table
Suppose we are aligning: A with A …
(table: one sequence along each axis, 0 in the top-left cell)

Needleman-Wunsch: Step 1
– Each sequence along one axis
– Gap penalty multiples in first row/column
– 0 in [1,1]

Needleman-Wunsch: Step 2
– Vertical/horizontal move: score + (simple) gap penalty
– Diagonal move: score + match/mismatch score
– Take the MAX of the three possibilities


Needleman-Wunsch: Step 2 (contd)
– Fill out the rest of the table likewise…
– The optimal alignment score is found in the lower-right corner

But what is the optimal alignment?
To reconstruct the optimal alignment, we must determine where the MAX at each step came from…

A path corresponds to an alignment
– Vertical move = GAP in top sequence
– Horizontal move = GAP in left sequence
– Diagonal move = ALIGN both positions
One path from the previous table, with the corresponding alignment (start at the end):
AC--TCG
ACAGTAG
Score = +2
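The table fill and the traceback can be sketched together as follows. The function name `needleman_wunsch`, the placement of `top` across the columns and `left` down the rows, and the tie-breaking order in the traceback (diagonal, then vertical, then horizontal) are assumptions of this sketch; a different tie-break can return a different alignment with the same optimal score.

```python
def needleman_wunsch(top, left, match=1, mismatch=0, gap=-1):
    """Global alignment with a simple gap penalty; returns (score, top_aln, left_aln)."""
    rows, cols = len(left) + 1, len(top) + 1
    score = [[0] * cols for _ in range(rows)]
    # Step 1: gap penalty multiples in the first row and column.
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        score[i][0] = i * gap
    # Step 2: each cell is the MAX of the diagonal, vertical and horizontal moves.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if left[i - 1] == top[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Traceback: start in the lower-right corner and work backwards.
    i, j = len(left), len(top)
    top_aln, left_aln = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and left[i - 1] == top[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top_aln.append(top[j - 1]); left_aln.append(left[i - 1])
            i, j = i - 1, j - 1                      # diagonal: align both
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top_aln.append("-"); left_aln.append(left[i - 1])
            i -= 1                                   # vertical: gap in top sequence
        else:
            top_aln.append(top[j - 1]); left_aln.append("-")
            j -= 1                                   # horizontal: gap in left sequence
    return score[-1][-1], "".join(reversed(top_aln)), "".join(reversed(left_aln))


best, top_aln, left_aln = needleman_wunsch("ACTCG", "ACAGTAG")
print(best)       # optimal score for the lecture example: 2
print(top_aln)
print(left_aln)
```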

Semi-global alignment
Suppose we are aligning: GCG and GGCG. Which do you prefer?
G-CG
GGCG
or
-GCG
GGCG
Semi-global alignment allows gaps at the ends for free.

Semi-global alignment
– Initialize the first row and column to all 0s
– Allow free horizontal/vertical moves in the last row and column
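A minimal sketch of the semi-global variant, assuming the same simple scores as before (match +1, mismatch 0, gap -1). Allowing free moves along the last row and column is implemented here by taking the best value found anywhere in the last row or last column; the function name `semi_global_score` is illustrative.

```python
def semi_global_score(top, left, match=1, mismatch=0, gap=-1):
    """Best alignment score when gaps at either end of either sequence are free."""
    rows, cols = len(left) + 1, len(top) + 1
    # First row and column stay at 0: leading gaps cost nothing.
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if left[i - 1] == top[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trailing gaps are free, so the answer is the best cell in the
    # last row or the last column rather than the lower-right corner only.
    return max(max(score[-1]), max(row[-1] for row in score))


print(semi_global_score("GCG", "GGCG"))  # -GCG over GGCG: three matches -> 3
```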

Local alignment
– Global alignment: score the entire alignment
– Semi-global alignment: allow unscored gaps at the beginning or end of either sequence
– Local alignment: find the best matching subsequence, e.g. in CGATG and AAATGGA
– This is achieved by allowing a 4th alternative at each position in the table: zero, if the other alternatives are negative
– Smith-Waterman algorithm (1981)

Local alignment
Mismatch = -1 this time. Sequences: CGATG and AAATGGA
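The fourth alternative (zero) is the only change needed to turn the global table into a local one. The sketch below assumes match +1, mismatch -1 and gap -1 for this example; the function name and the choice to trace back from the first highest-scoring cell are illustrative.

```python
def smith_waterman(top, left, match=1, mismatch=-1, gap=-1):
    """Local alignment; returns (best score, aligned part of top, aligned part of left)."""
    rows, cols = len(left) + 1, len(top) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if left[i - 1] == top[j - 1] else mismatch
            score[i][j] = max(0,                           # 4th alternative
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the best cell until a zero is reached.
    i, j = best_pos
    top_aln, left_aln = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if left[i - 1] == top[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            top_aln.append(top[j - 1]); left_aln.append(left[i - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            top_aln.append("-"); left_aln.append(left[i - 1])
            i -= 1
        else:
            top_aln.append(top[j - 1]); left_aln.append("-")
            j -= 1
    return best, "".join(reversed(top_aln)), "".join(reversed(left_aln))


print(smith_waterman("CGATG", "AAATGGA"))  # (3, 'ATG', 'ATG')
```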

CBA - Artificial Immune Systems
Classical Immunity
– The purpose of the immune system is defence
– Innate and acquired immunity
  – Innate is the first line of defence. Germ-line encoded (passed from parents) and quite static (but not totally static)
  – Adaptive (acquired). Somatic (cellular) and acquired by the host over its lifetime. Very dynamic.
  – These two interact and affect each other

Multiple layers of the immune system

Innate Immunity
– May take days to remove an infection; if it fails, then the adaptive response may take over
– Macrophages and neutrophils are actors
  – They bind to common (known) things. This knowledge has been evolved and passed from generation to generation.

Processes within the Immune System (very basically)
– Negative Selection
  – Censoring, in the thymus gland, of T-cells that recognise self
  – Defining normal system behaviour
– Clonal Selection
  – Proliferation and differentiation of cells when they have recognised something
  – Generalise and learn
– Self vs Non-Self

Clonal Selection

Immune Responses

A Framework for AIS Algorithms [De Castro and Timmis, 2002]
The AIS sits between the Application and the Solution and is built in layers:
– Representation (shape-space): binary, integer, real-valued, symbolic
– Affinity: Euclidean, Manhattan, Hamming
– Algorithms: bone marrow models, clonal selection, negative selection, positive selection, immune network models

Lecture 4: The Representation Layer
Shape-Space [Perelson, 1989]
– An antibody can recognise any antigen whose complement lies within a small surrounding region of width $\epsilon$ (the cross-reactivity threshold)
– This results in a volume $V_\epsilon$ known as the recognition region of the antibody
(figure labels: $V_\epsilon$, V, S)

The Affinity Layer
– Computationally, the degree of interaction of an antibody-antigen or antibody-antibody pair can be evaluated by a distance or affinity measure
– The choice of affinity measure is crucial:
  – It alters the shape-space topology
  – It will introduce an inductive bias into the algorithm
  – It needs to take into account the data-set used and the problem you are trying to solve

The Affinity Layer: Affinity
(figure) Affinity through shape similarity. On the left, a region where all antigens present the same affinity with the given antibody. On the right, antigens in region b have a higher affinity than those in region a.

The Affinity Layer: Hamming Shape Space
$D = \sum_{i=1}^{L} \delta_i$, where $\delta_i = 1$ if $Ab_i \neq Ag_i$ and $\delta_i = 0$ otherwise (XOR operator)

The Affinity Layer: Hamming Shape Space
(a) Hamming distance (b) r-contiguous bits rule
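Both measures are short enough to sketch directly. The function names are illustrative, and the sketch assumes one common reading of the r-contiguous bits rule: two strings match when they agree in at least r consecutive positions. A complementarity-based variant would count differing positions instead.

```python
def hamming_distance(ab, ag):
    """Number of positions at which antibody and antigen differ (XOR count)."""
    if len(ab) != len(ag):
        raise ValueError("strings must have equal length")
    return sum(a != g for a, g in zip(ab, ag))


def r_contiguous_match(ab, ag, r):
    """True if the two strings agree in at least r contiguous positions."""
    run = 0
    for a, g in zip(ab, ag):
        run = run + 1 if a == g else 0   # length of the current run of agreement
        if run >= r:
            return True
    return False


print(hamming_distance("10110", "10011"))       # differ at positions 3 and 5 -> 2
print(r_contiguous_match("10110", "10011", 2))  # first two bits agree -> True
print(r_contiguous_match("10110", "10011", 3))  # no run of three -> False
```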

Mutation - Binary
– Single point mutation
– Multi-point mutation

Affinity Proportional Mutation
– Affinity maturation is controlled
– Proportional to antigenic affinity
– $\alpha(D^*) = \exp(-\rho D^*)$
– $\alpha$ = mutation rate
– $D^*$ = affinity
– $\rho$ = control parameter
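A small sketch of affinity proportional mutation on bitstrings, assuming the slide's formula reads α(D*) = exp(-ρ·D*) with the affinity D* normalised to [0, 1]. The function names and the default ρ = 5.0 are illustrative choices, not values from the lecture.

```python
import math
import random


def mutation_rate(affinity, rho=5.0):
    """alpha(D*) = exp(-rho * D*): high affinity gives a low mutation rate."""
    return math.exp(-rho * affinity)


def mutate(bits, affinity, rho=5.0, rng=random):
    """Flip each bit independently with probability alpha(D*)."""
    rate = mutation_rate(affinity, rho)
    return [b ^ 1 if rng.random() < rate else b for b in bits]


print(mutation_rate(0.0))           # 1.0: worst affinity, every bit flips
print(mutate([0, 1, 0], 0.0))       # [1, 0, 1]
print(mutation_rate(1.0) < mutation_rate(0.5))  # True
```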

The Algorithms Layer
– Bone marrow models (Hightower, Oprea, Kim)
– Clonal selection: CLONALG (De Castro), B-Cell (Kelsey)
– Negative selection: Forrest, Dasgupta, Kim, …
– Network models
  – Continuous models: Jerne, Farmer
  – Discrete models: RAIN (Timmis), aiNet (De Castro)

The Algorithms Layer: Clonal Selection (CLONALG)
1. Initialisation: create a random population of individuals (P)
2. Antigenic presentation: for each antigenic pattern in the data-set S do:
  a. Affinity evaluation: present it to the population P and determine its affinity with each element of the population
  b. Clonal selection and expansion: select the n highest-affinity elements of P and generate clones in proportion to their affinity with the antigen (higher affinity = more clones)
  c. Affinity maturation: mutate each clone (high affinity = low mutation rate and vice versa), add the mutated individuals to population P, and reselect the best individual to be kept as memory m of the antigen presented
  d. Metadynamics: replace a number r of individuals with low affinity with randomly generated new ones
3. Cycle: repeat step 2 until a certain stopping criterion is met
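The CLONALG steps can be sketched for bitstring individuals as below. This is a simplified illustration, not De Castro's reference implementation: the affinity function (fraction of agreeing bits), the clone count rule, the mutation rate exp(-rho × affinity), the fixed generation count as stopping criterion, and all parameter names and defaults are assumptions made for the sketch.

```python
import math
import random


def affinity(ab, ag):
    """Fraction of positions where antibody and antigen agree (1.0 = identical)."""
    return sum(a == g for a, g in zip(ab, ag)) / len(ag)


def clonalg(antigens, pop_size=20, n_select=5, clone_factor=10,
            n_replace=3, rho=3.0, generations=30, seed=0):
    rng = random.Random(seed)
    length = len(antigens[0])

    def random_antibody():
        return [rng.randint(0, 1) for _ in range(length)]

    population = [random_antibody() for _ in range(pop_size)]   # 1. initialisation
    memory = {}                                                  # one memory cell per antigen
    for _ in range(generations):                                 # 3. cycle
        for k, ag in enumerate(antigens):                        # 2. antigenic presentation
            # 2a. affinity evaluation: rank the population against this antigen
            population.sort(key=lambda ab: affinity(ab, ag), reverse=True)
            clones = []
            for ab in population[:n_select]:                     # 2b. selection and expansion
                aff = affinity(ab, ag)
                n_clones = max(1, round(clone_factor * aff))     # higher affinity, more clones
                rate = math.exp(-rho * aff)                      # 2c. higher affinity, less mutation
                for _ in range(n_clones):
                    clones.append([b ^ 1 if rng.random() < rate else b for b in ab])
            population.extend(clones)
            population.sort(key=lambda ab: affinity(ab, ag), reverse=True)
            best = population[0]
            if k not in memory or affinity(best, ag) > affinity(memory[k], ag):
                memory[k] = list(best)                           # keep best as memory m
            # 2d. metadynamics: drop the lowest-affinity individuals, add random ones
            population = population[:pop_size - n_replace]
            population += [random_antibody() for _ in range(n_replace)]
    return memory


antigens = [[1, 0, 1, 1, 0, 0, 1, 0], [0, 0, 0, 1, 1, 1, 1, 0]]
memory = clonalg(antigens)
for k, ag in enumerate(antigens):
    print(k, memory[k], affinity(memory[k], ag))
```

Truncating the population back to a fixed size after each presentation is a design choice of this sketch; the slides only say that mutated individuals are added to P and that r low-affinity individuals are replaced.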

Naive Application of Clonal Selection
– Generate a set of detectors capable of identifying simple digits
– Represented as a simple bitmap

Representation
– Each individual is a bitstring
– Use Hamming distance as the affinity metric

Evolution of Detectors
(figure: clones and mutated clones)

Lecture 5: The Algorithms Layer
Negative Selection Algorithms
– Define Self as a normal pattern of activity or stable behaviour of a system/process
  – A collection of logically split (equal-size) segments of the pattern sequence
  – Represent the collection as a multiset S of strings of length l over a finite alphabet
– Generate a set R of detectors, each of which fails to match any string in S
– Monitor new observations (of S) for changes by continually matching the detectors against representatives of S
– If any detector ever matches, a change (or deviation) must have occurred in system behaviour

Illustration of NS Algorithm
(figure: strings compared against Self with r = 2 and labelled Match or Don't Match; regions labelled Self and Non_Self)

Negative Selection
– Cross-reactivity threshold = 1
– Here M[1,1], M[1,4] and M[2,2] are above the threshold
– Add these to the available repertoire
– Eliminate the rest
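A sketch of the censoring and monitoring phases of a negative selection algorithm on binary strings. It assumes the r-contiguous bits rule as the matching function and random generation of candidate detectors; the function names, the `max_tries` cap and the example self set are hypothetical.

```python
import random


def matches(a, b, r):
    """r-contiguous bits rule: a and b agree in at least r consecutive positions."""
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False


def generate_detectors(self_set, n_detectors, length, r, seed=0, max_tries=10000):
    """Censoring: keep random candidates that fail to match every self string."""
    rng = random.Random(seed)
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        candidate = "".join(rng.choice("01") for _ in range(length))
        if not any(matches(candidate, s, r) for s in self_set):
            detectors.append(candidate)
    return detectors


def is_anomalous(sample, detectors, r):
    """Monitoring: a match with any detector signals a deviation from self."""
    return any(matches(d, sample, r) for d in detectors)


self_set = ["00000000", "11110000"]
detectors = generate_detectors(self_set, n_detectors=5, length=8, r=4)
print(detectors)
print([is_anomalous(s, detectors, 4) for s in self_set])  # self is never flagged
```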

QR Motivations
– Problems with RBS
  – Reasoning from first principles
  – Dangers with nearest approximation
– Second generation expert systems
  – Use deep knowledge
  – Provide explanations of the reasoning process
– Commonsense reasoning
  – Capture how humans reason
  – Enable use of appropriate causality
– Model reuse
  – Improved ease of ES maintenance

Arithmetic Operations: Sign Algebra
MULT  +  0  -
 +    +  0  -
 0    0  0  0
 -    -  0  +
DIV (row divided by column; X = undefined)
      +  0  -
 +    +  X  -
 0    0  X  0
 -    -  X  +

Arithmetic Operations (2)
ADD   +  0  -
 +    +  +  ?
 0    +  0  -
 -    ?  -  -
SUB (row minus column)
      +  0  -
 +    ?  +  +
 0    -  0  +
 -    -  -  ?

Arithmetic Operations (3)
– A = B - C: where B and C both have value [+], A will be undefined
– Disambiguation may be possible from other information
  – A = [+] if B > C
  – A = [0] if B = C
  – A = [-] if B < C
– Functional relations
  – Y = M+(X)
  – Y = M-(X)
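The sign algebra is small enough to write out as functions. This sketch uses the strings "+", "0", "-" for the three signs, "?" for an ambiguous result and "X" for an undefined division; these encodings and the function names are choices made for the illustration.

```python
POS, ZERO, NEG, AMBIGUOUS, UNDEFINED = "+", "0", "-", "?", "X"


def q_neg(a):
    """Flip the sign; zero and ambiguous values are unchanged."""
    return {POS: NEG, NEG: POS}.get(a, a)


def q_add(a, b):
    if AMBIGUOUS in (a, b):
        return AMBIGUOUS
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return a if a == b else AMBIGUOUS   # [+] + [-] cannot be resolved


def q_sub(a, b):
    return q_add(a, q_neg(b))


def q_mul(a, b):
    if ZERO in (a, b):
        return ZERO
    if AMBIGUOUS in (a, b):
        return AMBIGUOUS
    return POS if a == b else NEG


def q_div(a, b):
    if b == ZERO:
        return UNDEFINED                # division by zero
    return q_mul(a, b)                  # same sign pattern as multiplication


def q_sub_ordered(b_value, c_value):
    """Disambiguation when the magnitudes of B and C are known."""
    diff = b_value - c_value
    return POS if diff > 0 else NEG if diff < 0 else ZERO


print(q_sub(POS, POS))          # '?': A = B - C with B, C both [+]
print(q_sub_ordered(3.0, 1.0))  # '+': resolved because B > C
```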

Curve Shapes
(figure: curve shapes; axes d1 and d2)

Transition Rules
– Intermediate Value Theorem (IVT): states that for a continuous system, a function joining two points of opposite sign must pass through zero
– Mean Value Theorem (MVT): defines the direction of change of a variable between two points
[++] [+o] [+-]
[o+] [oo] [o-]
[-+] [-o] [--]

Single Compartment System
(diagram: compartment 1 with input u and outflow k10.x1)
plane 0: f10 = k10.x1; x1' = u - f10
plane 1: f10' = k10.x1'; x1'' = u' - f10'
plane 2: f10'' = k10.x1''; x1''' = u'' - f10''
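For comparison with the qualitative treatment, the plane-0 relations can be integrated numerically. The sketch below assumes they are read as f10 = k10·x1 and dx1/dt = u - f10, and uses a plain forward Euler step; the parameter values (u = 1, k10 = 0.5) are arbitrary illustrations, not values from the lecture.

```python
def simulate_compartment(u=1.0, k10=0.5, x1=0.0, dt=0.01, steps=5000):
    """Forward Euler integration of dx1/dt = u - k10 * x1 (assumed plane-0 model)."""
    trace = [x1]
    for _ in range(steps):
        f10 = k10 * x1            # outflow is proportional to the amount held
        x1 = x1 + dt * (u - f10)  # net rate of change = inflow - outflow
        trace.append(x1)
    return trace


trace = simulate_compartment()
# Step response: x1 rises monotonically towards the steady state u / k10 = 2.0
print(round(trace[-1], 6))
```

Qualitatively, the derivative stays positive until the steady state is reached, which is the [+ +] to [+ o] behaviour a sign-based simulation would report.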

Models in Morven (define-fuzzy-model (short-name ) (variables ) (auxiliary-variables ) (input ) (constraints (print ) )

A JMorven Model
model-name: single-tank
short-name: fst
NumSystemVariables: 2
variable: qo range: zero p-max NumDerivatives: 1 qspaces: tanks-quantity-space
variable: V range: zero p-max NumDerivatives: 2 qspaces: tanks-quantity-space tanks-quantity-space2
NumExogenousVariables: 1
variable: qi range: zero p-max NumDerivatives: 1 qspaces: tanks-quantity-space
Constraints:
NumDiffPlanes: 2
Plane: 0 NumConstraints: 2
Constraint: func (dt 0 qo) (dt 0 V)
NumMappings: 9
Mappings: n-max n-large n-medium n-small zero p-small p-medium p-large p-max
Constraint: sub (dt 1 V) (dt 0 qi) (dt 0 qo)
NumVarsToPrint: 3 VarsToPrint: V qi qo

A JMorven Quantity Space
NumQSpaces: 2
QSpaceName: tanks-quantity-space
NumQuantities: 9
n-max n-large n-medium n-small zero p-small p-medium p-large p-max
QSpaceName: tanks-quantity-space2
NumQuantities: 5
nl-dash ns-dash zero ps-dash pl-dash

Possible States
(table: state number against qualitative state vector, with entries drawn from +, o and -)

Step Response
(plot of V against t)

Solution Space
(plot; axes V and qi)

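Cascaded Systems
(diagram: Tank A, level h1, inflow qi, drains through qx into Tank B, level h2, outflow qo; equivalent compartments 1 and 2 with input u and flows k12.x1 and k20.x2)
plane 0: qx = k1.h1; qo = k2.h2; h1' = qi - qx; h2' = qx - qo
plane 1: qx' = k1.h1'; qo' = k2.h2'; h1'' = qi' - qx'; h2'' = qx' - qo'
plane 2: qx'' = k1.h1''; qo'' = k2.h2''; h1''' = qi'' - qx''; h2''' = qx'' - qo''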

Cascaded Systems Envisionment

Cascaded Systems Solution Space
(plot; axes h1 and h2) h1 =