Proteomics Informatics – Signal processing I: analysis of mass spectra (Week 3)


Example data – MALDI-TOF Peptide intensity vs m/z

Example data – ESI-LC-MS/MS
MS: peptide intensity vs m/z vs time
MS/MS: fragment intensity vs m/z

Sine wave: amplitude and wavelength (figure labels: a, b, c)

Sine and cosine (figure labels: a, b, c)

Two Frequencies

Fourier Transform

Fourier Transform
from numpy import *
x=2.0*pi*arange(1000.0)/
sin1 = sin(1000.0*x)
sin2 = 0.2*sin( *x)
sin12=sin1+sin2
fft12=fft.rfft(sin12)
Plot axis: frequency

Inverse Fourier Transform Frequency

Inverse Fourier Transform
from numpy import *
x=2.0*pi*arange(1000.0)/
sin1 = sin(1000.0*x)
sin2 = 0.2*sin( *x)
sin12=sin1+sin2
fft12=fft.rfft(sin12)
sin12_= fft.irfft(fft12,len(sin12))
Plot axis: frequency
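The round trip on the slide can be sketched in runnable form. The sampling grid and the two frequencies (50 and 120 cycles per unit time) are assumptions chosen for illustration; they are not the values from the original slide, which were lost in the transcript.

```python
import numpy as np

n = 1000                       # number of samples (assumed)
t = np.arange(n) / n           # one unit of time, sampled n times
f1, f2 = 50, 120               # assumed frequencies, in cycles per unit time

# Sum of a strong and a weak sine wave
signal = np.sin(2 * np.pi * f1 * t) + 0.2 * np.sin(2 * np.pi * f2 * t)

# Forward transform: the real-input FFT gives n//2 + 1 complex coefficients
spectrum = np.fft.rfft(signal)
amplitude = 2 * np.abs(spectrum) / n   # scaled so a unit sine has amplitude 1

# The two largest amplitudes sit at the two input frequencies
top_two = sorted(int(k) for k in np.argsort(amplitude)[-2:])

# Inverse transform: recovers the original samples
recovered = np.fft.irfft(spectrum, n)
```

Because the record spans exactly one unit of time, bin k of the spectrum corresponds to k cycles per unit time, so the peaks can be read off directly.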

Inverse Fourier Transform Frequency

A Peak
Properties: centroid, full width at half maximum (FWHM), area, height, maximum, mean, variance, skewness, kurtosis
Plot axis: intensity

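Mean and variance
A peak is defined by its positions $x_i$ and intensities $I_i$.
Mean: $\mu = \sum_i x_i I_i / \sum_i I_i$
Variance: $\sigma^2 = \sum_i (x_i - \mu)^2 I_i / \sum_i I_i$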

Skewness and kurtosis Skewness Kurtosis
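A minimal sketch of how these four measures could be computed from a sampled peak, treating the intensities as weights on the positions. The function name `peak_moments` is hypothetical, and kurtosis is taken as excess kurtosis (standardized fourth moment minus 3), which is an assumption here.

```python
import numpy as np

def peak_moments(x, y):
    """Return mean, variance, skewness and excess kurtosis of a peak,
    treating the intensities y as weights on the positions x."""
    w = y / y.sum()                        # normalise intensities to weights
    mean = np.sum(w * x)                   # centroid
    var = np.sum(w * (x - mean) ** 2)      # second central moment
    sd = np.sqrt(var)
    skewness = np.sum(w * (x - mean) ** 3) / sd ** 3
    kurtosis = np.sum(w * (x - mean) ** 4) / sd ** 4 - 3.0   # excess kurtosis
    return mean, var, skewness, kurtosis

# Example: a Gaussian peak centred at 0 with sigma = 0.1
x = np.linspace(-1, 1, 1000)
y = np.exp(-(x - 0.0) ** 2 / (2 * 0.1 ** 2))
mean, var, skewness, kurtosis = peak_moments(x, y)
```

For this symmetric Gaussian the skewness and the excess kurtosis both come out close to zero, and the variance is close to 0.01.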

A Gaussian Peak
def gaussian(x,x0,s):
    return exp(-(x-x0)**2/(2*s**2))
x = linspace(-1,1,1000)
y=gaussian(x,0,0.1)
ffty=fft.rfft(y)
Plot axis: frequency

A Gaussian Peak
Skewness = 0, excess kurtosis = 0

Peak with a longer tail Frequency

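A skewed peak
from numpy import *
from scipy.special import erf  # vectorised error function

def pdf(x):
    # standard normal probability density
    return 1/sqrt(2*pi) * exp(-x**2/2)

def cdf(x):
    # standard normal cumulative distribution
    return (1 + erf(x/sqrt(2))) / 2

def skew(x,e=0,w=1,a=0):
    # skew-normal density with location e, scale w and shape a
    t = (x-e) / w
    return 2 / w * pdf(t) * cdf(a*t)
Plot axis: frequency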

Normal noise
x = linspace(-1,1,1000)
y=0.2*random.normal(size=len(x))
If the noise is not normally distributed, try to find a transform that makes it normal.

Lognormal noise
x = linspace(-1,1,1000)
y=0.2*random.lognormal(size=len(x))
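As an illustration of finding a transform that makes the noise normal: taking the logarithm of lognormal noise yields normally distributed values. The sketch below checks this through the sample skewness; the sample size, the seed and the helper `sample_skewness` are assumptions for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
y = 0.2 * rng.lognormal(size=100_000)   # lognormal noise, scaled as on the slide

def sample_skewness(v):
    """Third standardized moment of a sample (hypothetical helper)."""
    d = v - v.mean()
    return np.mean(d ** 3) / np.mean(d ** 2) ** 1.5

skew_raw = sample_skewness(y)           # strongly right-skewed
skew_log = sample_skewness(np.log(y))   # close to zero after the transform
```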

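Skewed noise
# rejection sampling: draw uniform candidates and keep those that fall under the skewed curve
x_test=random.uniform(-1.0,1.0,size=10*len(x))
y=random.uniform(0.0,1.0,size=10*len(x))
yskew=skew(x_test,-0.1,0.2,10)
yskew=yskew/max(yskew)
yn_skew=x_test[y<yskew][:len(x)]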

Gaussian peak with normal noise Frequency

Removing High Frequencies
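One way to remove high frequencies is to zero the upper Fourier coefficients and transform back. This is a sketch under assumed values (cutoff index 20, noise level 0.2, Gaussian peak with sigma 0.1), not necessarily the filter used for the original figure; `lowpass` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 1000)
clean = np.exp(-x ** 2 / (2 * 0.1 ** 2))           # Gaussian peak, sigma = 0.1
noisy = clean + 0.2 * rng.normal(size=len(x))      # add normal noise

def lowpass(y, keep):
    """Zero all Fourier coefficients above index `keep` and transform back."""
    coeffs = np.fft.rfft(y)
    coeffs[keep + 1:] = 0.0
    return np.fft.irfft(coeffs, len(y))

smoothed = lowpass(noisy, keep=20)                 # cutoff is an assumed value

# Root-mean-square error against the noise-free peak, before and after
err_noisy = np.sqrt(np.mean((noisy - clean) ** 2))
err_smoothed = np.sqrt(np.mean((smoothed - clean) ** 2))
```

The broad peak lives in the lowest few coefficients, while white noise is spread evenly over all of them, so discarding the upper coefficients removes most of the noise and little of the peak.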

Convolution
Describes the response of a linear and time-invariant system to an input signal.
Equals the inverse Fourier transform of the pointwise product in frequency space.
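The second statement (the convolution theorem) can be checked numerically. In this sketch the array sizes and random inputs are arbitrary; both inputs are zero-padded to the full output length so that the circular convolution implied by the FFT matches the linear convolution.

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.normal(size=64)
b = rng.normal(size=16)

# Direct linear convolution, length 64 + 16 - 1
direct = np.convolve(a, b)

# Same result via the frequency domain: pad, multiply pointwise, invert
n = len(a) + len(b) - 1
via_fft = np.fft.irfft(np.fft.rfft(a, n) * np.fft.rfft(b, n), n)
```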

Smoothing by convolution

Smoothing
w=ones(2*width+1,'d')
convolve(w/w.sum(),y,'valid')
Plot axes: intensity, frequency

Smoothing

Adaptive Background Correction (unsharp masking)
Panels: original, unsharp masking
wi = linspace(1,window_len,window_len)
w = 1 / ( 2*r_[wi[::-1],0,wi] + 1 )
# mode 'same' keeps the smoothed signal aligned with x
x_ = x - d*convolve(w/w.sum(),x,'same')

Adaptive Background Correction

Smoothing and Adaptive Background Correction

Savitzky-Golay smoothing
Panels: polynomial order = 3, 5, 7; bin size = 25, 75, 150
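A Savitzky-Golay filter fits a low-order polynomial to each window by least squares and takes the fitted value at the window centre. The NumPy-only sketch below derives the smoothing weights from that definition; the function names are hypothetical, and the defaults (window 25, order 3) mirror one of the panel settings.

```python
import numpy as np

def savgol_coeffs(window, order):
    """Smoothing weights of a Savitzky-Golay filter (window must be odd)."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Columns 1, k, k**2, ...: design matrix of the local polynomial fit
    design = np.vander(offsets, order + 1, increasing=True)
    # Row 0 of the pseudo-inverse gives the fitted value at the window centre
    return np.linalg.pinv(design)[0]

def savgol_smooth(y, window=25, order=3):
    c = savgol_coeffs(window, order)
    # The weights are symmetric, so convolution and correlation coincide
    return np.convolve(y, c, mode="same")

# Example: smooth a noisy Gaussian peak
rng = np.random.default_rng(8)
x = np.linspace(-1, 1, 1000)
noisy = np.exp(-x ** 2 / (2 * 0.1 ** 2)) + 0.2 * rng.normal(size=len(x))
smoothed = savgol_smooth(noisy, window=25, order=3)
```

Points within half a window of either end are affected by the implicit zero padding of `mode="same"` and should be treated with caution.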

Background Frequency

Background Subtraction Using Smoothing
Panels: smoothing and background subtraction for bin size = 100, 200, 300

Root Mean Square Deviation (RMSD)
The RMSD is roughly constant in noise-only regions and larger across a peak, provided the window size is approximately the width of the peak.

Background Subtraction using RMSD
Panels: RMSD and intensity for bin size = 100, 200, 300
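A sliding-window RMSD can be sketched with two moving averages, using the identity that the variance is the mean of the squares minus the square of the mean. The helper `moving_rmsd`, the noise level and the peak width are assumptions for illustration.

```python
import numpy as np

def moving_rmsd(y, width):
    """RMSD from the local mean in a sliding window of 2*width+1 points."""
    w = np.ones(2 * width + 1) / (2 * width + 1)
    local_mean = np.convolve(y, w, mode="same")
    local_mean_sq = np.convolve(y ** 2, w, mode="same")
    # variance = E[y^2] - E[y]^2; clip tiny negative rounding errors
    return np.sqrt(np.clip(local_mean_sq - local_mean ** 2, 0.0, None))

rng = np.random.default_rng(3)
x = np.linspace(-1, 1, 1000)
# Normal noise plus one Gaussian peak (sigma = 0.02, about 10 points)
y = 0.05 * rng.normal(size=len(x)) + np.exp(-x ** 2 / (2 * 0.02 ** 2))
rmsd = moving_rmsd(y, width=20)
```

Away from the peak the RMSD stays near the noise level of 0.05, while windows that overlap the peak give clearly larger values.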

Convolution, Cross-correlation, and Autocorrelation
Convolution describes the response of a linear and time-invariant system to an input signal. It equals the inverse Fourier transform of the pointwise product in frequency space.
Cross-correlation is a measure of the similarity of two signals. It can be used for finding a shift between two signals.
Autocorrelation is the cross-correlation of a signal with itself. It can be used for finding periodic signals obscured by noise.
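A short sketch of using cross-correlation to find a shift between two signals. The signal (ten unit spikes at random positions), the shift of 17 points and the noise level are made-up demonstration values.

```python
import numpy as np

rng = np.random.default_rng(4)
n, true_shift = 500, 17                      # assumed values for the demo
signal = np.zeros(n)
signal[rng.choice(n - 50, size=10, replace=False)] = 1.0   # ten random spikes

# Shifted copy of the signal with a little normal noise added
shifted = np.roll(signal, true_shift) + 0.05 * rng.normal(size=n)

# Full cross-correlation; output index j corresponds to lag j - (n - 1)
xcorr = np.correlate(shifted, signal, mode="full")
lags = np.arange(-(n - 1), n)
estimated_shift = int(lags[np.argmax(xcorr)])
```

The cross-correlation is largest at the lag where all spikes line up, which is the shift between the two signals.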

Cross-correlation and autocorrelation

Autocorrelation Signal Same signal

Cross-correlation Signal Shifted signal

Cross-correlation Signal Half of the peaks shifted

How similar are two signals?
Dot product
Identical vectors:
Perpendicular vectors:
The dot product is the same as the cross-correlation at zero lag:
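The two limiting cases can be illustrated with a normalized dot product, which is 1 for identical vectors and 0 for perpendicular ones. The normalization and the helper name `similarity` are assumptions; the slide's own formulas are not preserved in the transcript.

```python
import numpy as np

def similarity(a, b):
    """Normalised dot product (cosine of the angle between two signals)."""
    return np.dot(a, b) / np.sqrt(np.dot(a, a) * np.dot(b, b))

a = np.array([1.0, 0.0, 2.0, 0.0])
b = np.array([0.0, 3.0, 0.0, 1.0])    # no overlap with a

same = similarity(a, a)                # identical vectors
perp = similarity(a, b)                # perpendicular vectors

# The plain dot product equals the cross-correlation at zero lag
other = 2 * a + b
zero_lag = np.correlate(a, other, mode="valid")[0]
```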

What are the characteristics of the dot product? S/N Dimensions Signal+Noise Noise

Autocorrelation Signal Shifted signal Sum of signal and shifted signal

Coincidence – enhances the signal
The signal-to-noise ratio can be dramatically increased by measuring several independent signals of the same phenomenon and combining these signals.
Panels: ideal signal, four measurements, product of the four measurements

Coincidence – suppresses and transforms the noise
Panels: original noise, noise in product

Coincidence – suppresses interference
Panels: ideal signal, four measurements with interference, product of the four measurements
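A rough simulation of the coincidence idea: four measurements share the same peak but carry independent noise, and their pointwise product is compared with a single measurement. The non-negative noise model, the noise level and the helper `peak_to_background` are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(5)
x = np.linspace(-1, 1, 1000)
ideal = np.exp(-x ** 2 / (2 * 0.02 ** 2))          # assumed ideal signal

# Four independent measurements: same signal, independent non-negative noise
measurements = [ideal + 0.3 * np.abs(rng.normal(size=len(x))) for _ in range(4)]

# Combine by pointwise multiplication
product = np.prod(measurements, axis=0)

def peak_to_background(y):
    """Largest value near the peak divided by the largest value far from it."""
    return y[np.abs(x) < 0.02].max() / y[np.abs(x) > 0.2].max()

ratio_single = peak_to_background(measurements[0])
ratio_product = peak_to_background(product)
```

Noise is large in all four measurements at the same point only rarely, so the product pushes the background down much more than the peak.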

Peak Finding
The derivative of a function is zero at its minima and maxima. The second derivative is negative at maxima and positive at minima.

Peak Finding
1. Characterize the signal and the noise
2. Make a model of the data
3. Select detection method
4. Select parameters using simulations

Peak Finding: Characterizing the noise
Let’s first try without removing the peaks.
Plot axis: intensity

Peak Finding: Characterizing the noise
Removing the peaks by looking for outliers in the root mean square deviation (RMSD).
Plot axes: intensity, RMSD
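On sampled data the derivative test becomes a sign change in the first difference, and the second-derivative test becomes a negative second difference. A minimal sketch on a noise-free signal with two assumed Gaussian peaks; on noisy data the signal would need smoothing first. The name `local_maxima` is hypothetical.

```python
import numpy as np

def local_maxima(y):
    """Indices where the first difference changes sign from + to -."""
    dy = np.diff(y)
    # dy[i-1] > 0 (rising into point i) and dy[i] < 0 (falling after it)
    return np.where((dy[:-1] > 0) & (dy[1:] < 0))[0] + 1

x = np.linspace(-1, 1, 1001)
y = (np.exp(-(x + 0.5) ** 2 / (2 * 0.05 ** 2))
     + 0.5 * np.exp(-(x - 0.3) ** 2 / (2 * 0.05 ** 2)))

peaks = local_maxima(y)

# Second difference at each detected maximum: negative, as expected
curvature = np.diff(y, n=2)[peaks - 1]
```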

Peak Finding: Characterizing the peaks Intensity

Peak Finding: Model of data
points=1000
x = linspace(-1,1,points)
y=noise*random.normal(size=len(x))
y+=signal*gaussian(x,0,0.01)
Panels: S/N = 1, 2, 4

Peak Finding: Detection method
Peaks can be detected by finding maxima in the moving average with a window size similar to the peak width.
Panels: S/N = 1, 2, 4

Peak Finding: Detection method – moving average
Panels: signal and bin size = 5, 20, 80 for S/N = 1, 2, 4

Peak Finding: Detection method – RMSD
Panels: signal and bin size = 5, 20, 80 for S/N = 1, 2, 4
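Putting the data model and the moving-average detection together gives the sketch below. The S/N of 4, the half-window of 5 points (roughly the peak sigma on this grid) and the random seed are assumed values.

```python
import numpy as np

rng = np.random.default_rng(6)
points = 1000
x = np.linspace(-1, 1, points)
noise, signal = 1.0, 4.0                 # S/N = 4 (assumed)
sigma = 0.01                             # peak width used in the data model

# Data model: normal noise plus one Gaussian peak at x = 0
y = noise * rng.normal(size=points)
y += signal * np.exp(-x ** 2 / (2 * sigma ** 2))

# Moving average with a window comparable to the peak width
width = 5                                # half-window, in points
w = np.ones(2 * width + 1)
avg = np.convolve(w / w.sum(), y, mode="same")

# The maximum of the moving average marks the detected peak
detected_x = x[np.argmax(avg)]
```

Rerunning this with different S/N values and window sizes is one way to select parameters by simulation, as in step 4 of the peak-finding outline.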

Peak Finding: Information about the Peak
Properties: centroid (mean), full width at half maximum (FWHM), area, height, maximum, mean, variance, skewness, kurtosis
Plot axis: intensity

Information about a Peak
Centroid or mean
A peak is defined by its positions and intensities.
To calculate any of these measures we need to know where the peak starts and ends.

Where does a peak start and end?

Estimating peptide quantity
Methods: peak height, peak area, curve fitting
Plot axes: intensity vs m/z

Time dimension
Plot axes: intensity vs m/z; m/z vs time

Sampling
Plot axes: intensity vs retention time

5% Acquisition time = 0.05  5% Sampling

What is the best way to estimate quantity?
Peak height
- resistant to interference
- poor statistics
Peak area
- better statistics
- more sensitive to interference
Curve fitting
- better statistics
- needs to know the peak shape
- slow
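The three estimates can be compared on a simulated Gaussian peak. The curve fit here is a simple log-parabola fit (a quadratic fitted to the logarithm of the intensities near the apex), chosen for this sketch because it needs only NumPy; it is not necessarily the fitting method the lecture has in mind. The peak parameters and noise level are assumed.

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(-1, 1, 1000)
true_height, true_sigma = 3.0, 0.1       # assumed peak parameters
y = true_height * np.exp(-x ** 2 / (2 * true_sigma ** 2))
y += 0.02 * rng.normal(size=len(x))      # small normal noise

# 1. Peak height: a single data point, hence poor statistics
peak_height = y.max()

# 2. Peak area: trapezoidal rule over the whole trace
dx = x[1] - x[0]
peak_area = np.sum(0.5 * (y[1:] + y[:-1])) * dx

# 3. Curve fitting: log of a Gaussian is a parabola, so fit a quadratic
mask = y > 0.2 * y.max()                 # keep points well above the noise
a2, a1, a0 = np.polyfit(x[mask], np.log(y[mask]), 2)
fit_sigma = np.sqrt(-1.0 / (2.0 * a2))
fit_center = -a1 / (2.0 * a2)
fit_height = np.exp(a0 - a1 ** 2 / (4.0 * a2))
fit_area = fit_height * fit_sigma * np.sqrt(2.0 * np.pi)
```

For a Gaussian the exact area is height times sigma times the square root of 2π, which gives a reference value for both the summed and the fitted area.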

Homework: Background Subtraction Using Smoothing

Summary
Fourier transform - transformation to frequency space and back
Signal – how do we detect and characterize signals?
Noise – how do we characterize noise?
Modeling signal and noise
Simulation to select thresholds and parameters
Filters – filtering by low-pass (i.e. smoothing) and high-pass filters (e.g. adaptive background correction)
Detection methods based on moving average and RMSD
Convolution - describes the response of a linear and time-invariant system to an input signal
Cross-correlation is a measure of similarity of two signals
Autocorrelation can be used for finding periodic signals obscured by noise
The dot product can be used to determine how similar two signals are
Coincidence measurements enhance the signal and suppress noise
The quantity associated with a peak – height and area
Sampling – how often do we need to sample a peak to get a good estimate of its area?