Computational Approaches for Biomarker Discovery SubbaLakshmiswetha Patchamatla.


1 Computational Approaches for Biomarker Discovery SubbaLakshmiswetha Patchamatla

2 Introduction Biomarkers are used to measure the progress of a disease or the physiological effects of a therapeutic intervention. They mainly serve as early warning signs for diseases such as cancer and inflammatory diseases.

3 The selection and design of the features used to represent each example in the learning process are very important and influence classifier performance. Each instance in a data set used by machine learning methods is represented by a sequence of features, and each feature has a type. E.g.: age, size

4 Two major learning schemes in machine learning are unsupervised learning and supervised learning. Unsupervised learning: no prior information is given to the learner regarding the data or the output. Clustering is the simple, classical method of unsupervised learning.

5 Clustering methods: exclusive clustering (k-means algorithm), overlapping clustering (fuzzy C-means algorithm), hierarchical clustering, probabilistic clustering
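As an illustration of exclusive clustering, here is a minimal k-means sketch in plain Python. The function name `kmeans`, its parameters, and the toy two-dimensional samples are assumptions made for this example and do not come from the presentation.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Exclusive clustering: every point ends up in exactly one cluster."""
    rng = random.Random(seed)
    # Start from k distinct samples chosen at random (hypothetical init choice).
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignment = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the algorithm has converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centroids

# Invented toy data: two well-separated groups of samples.
samples = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels, centers = kmeans(samples, k=2)
```

The first three samples should share one cluster label and the last three the other. Fuzzy C-means would replace the hard assignment step with per-cluster membership weights.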

6 Supervised learning The instances are given with known labels. The main goal is to build a classifier that predicts the class labels of future instances.

7 Biomarker – Biological Background A biomarker is a gene, protein/peptide or metabolite in a biological system that indicates a physiological or pathological state that can be recognized or monitored. Gene expression studies bridge the gap between DNA information and trait information by dissecting biochemical pathways into intermediate components between genotype and phenotype.

8

9 Genomics is divided into two basic areas: structural genomics and functional genomics. Structural genomics is related to genetics. Functional genomics allows the detection of the genes that are turned on or off at any given time depending on environmental factors.

10 One particularly powerful application of gene expression analysis is biomarker identification, which can be used for disease risk assessment, early detection, prognosis, prediction of response to therapy, and preventive measures. It is a challenging task for cancer prevention and the improvement of treatment outcomes.

11 Computational Biomarker (Feature) Selection Classification of samples from gene expression datasets usually involves small numbers of samples and tens of thousands of genes. Feature selection methods fall into two main categories: filtering methods and wrapper approaches.

12 Filtering method: each gene is examined individually. Wrapper method: correlations among the genes are taken into account, and a ranking is established among the significant genes.
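A filtering method can be sketched as scoring every gene on its own and sorting by that score. The example below uses a Welch-style t statistic as the per-gene score. That choice, the names `t_score` and `filter_rank`, and the expression values are assumptions for illustration only.

```python
import math
from statistics import mean, stdev

def t_score(values_a, values_b):
    """Welch-style t statistic for one gene measured in two sample groups."""
    var_a = stdev(values_a) ** 2 / len(values_a)
    var_b = stdev(values_b) ** 2 / len(values_b)
    pooled = math.sqrt(var_a + var_b)
    diff = mean(values_a) - mean(values_b)
    if pooled == 0:
        # Degenerate case: no variance in either group.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / pooled

def filter_rank(expression, labels):
    """Rank genes by |t|, examining each gene individually (filter method)."""
    scores = {}
    for gene, values in expression.items():
        group_a = [v for v, y in zip(values, labels) if y == 0]
        group_b = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = abs(t_score(group_a, group_b))
    return sorted(scores, key=scores.get, reverse=True)

# Invented expression values: one list per gene, one entry per sample.
expression = {
    "gene_A": [2.1, 2.0, 2.3, 5.9, 6.1, 6.0],
    "gene_B": [3.0, 3.2, 2.9, 3.1, 3.0, 3.1],
    "gene_C": [1.0, 1.4, 0.9, 2.0, 1.8, 2.2],
}
labels = [0, 0, 0, 1, 1, 1]
ranking = filter_rank(expression, labels)
```

Here `ranking` places gene_A first, then gene_C, then gene_B. Because each gene is scored in isolation, two redundant genes would both rank highly. A wrapper method addresses this by scoring genes through the classifier itself.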

13 Support vector machine (SVM) algorithms and ridge regression (RR) are used for classifying gene expression datasets and are compared on classification accuracy. RR performs best in the comparison, further demonstrating the advantages of wrapper methods over filtering methods.

14 Recursive feature elimination (RFE) for SVM starts from the “naïve” ranking on the subset of genes. The naïve ranking is the first iteration of RFE, which yields a rank for each gene. SVM-RFE, which is superior to SVM without RFE, also uses multivariate linear discriminant methods such as LDA and MSD.
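The elimination loop behind RFE can be sketched as: fit a linear model, drop the gene with the smallest squared weight, and refit on the remainder. To stay dependency-free, the sketch below swaps in a ridge-regularized least-squares classifier trained by gradient descent where the slide uses an SVM. The names `fit_linear` and `rfe`, the hyperparameters, and the data are all assumptions.

```python
def fit_linear(X, y, lam=0.1, lr=0.1, epochs=500):
    """Ridge-regularized least squares by gradient descent (stand-in for an SVM)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the ridge penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def rfe(X, y, gene_names, keep=1):
    """Return gene names ordered from most to least important."""
    active = list(range(len(gene_names)))
    eliminated = []
    while len(active) > keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = fit_linear(X_sub, y)
        # Ranking criterion: squared weight; the smallest one is removed.
        worst = min(range(len(active)), key=lambda i: w[i] ** 2)
        eliminated.append(gene_names[active.pop(worst)])
    # Survivors come first, then eliminated genes from last removed to first.
    return [gene_names[j] for j in active] + eliminated[::-1]

# Invented data: rows are samples, columns are genes on comparable scales.
X = [
    [-1.0,  0.1, -0.5],
    [-1.0, -0.1,  0.2],
    [-1.2,  0.0, -0.3],
    [ 1.0,  0.1,  0.4],
    [ 1.0, -0.1, -0.1],
    [ 0.8,  0.0,  0.5],
]
y = [-1, -1, -1, 1, 1, 1]
ranking = rfe(X, y, ["gene_A", "gene_B", "gene_C"])
```

The weights from the first fit correspond to the naïve ranking. Later iterations can reorder genes because the model is refit after each removal.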

15 A wrapper method that combines gene selection and classification is SVM-RCE, which uses the K-means algorithm for gene clustering and the machine learning algorithm SVM for classification and gene cluster ranking. It evaluates the contribution of each of those clusters to the classification task by SVM.
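The cluster-level elimination idea can be sketched as: score each gene cluster by how well a classifier performs using only that cluster's genes, drop the weakest cluster, and repeat. This sketch is heavily simplified. It takes the gene clusters as given instead of running K-means, uses the training accuracy of a nearest-centroid rule in place of an SVM, and does not re-cluster between rounds. All names and data are assumptions.

```python
def cluster_score(X, y, genes):
    """Training accuracy of a nearest-centroid rule restricted to `genes`."""
    centroids = {}
    for label in set(y):
        rows = [r for r, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(r[g] for r in rows) / len(rows) for g in genes]
    hits = 0
    for row, label in zip(X, y):
        pred = min(
            centroids,
            key=lambda lab: sum((row[g] - c) ** 2
                                for g, c in zip(genes, centroids[lab])),
        )
        hits += pred == label
    return hits / len(X)

def recursive_cluster_elimination(X, y, clusters, keep=1):
    """Repeatedly drop the gene cluster that contributes least to classification."""
    clusters = dict(clusters)  # copy so the caller's mapping is untouched
    while len(clusters) > keep:
        scores = {name: cluster_score(X, y, genes)
                  for name, genes in clusters.items()}
        worst = min(scores, key=scores.get)
        del clusters[worst]
    return clusters

# Invented data: genes 0 and 1 separate the classes, genes 2 and 3 are noise.
X = [
    [1.0, 1.2, 0.5, 0.4],
    [0.9, 1.0, 0.9, 0.7],
    [1.1, 0.8, 0.1, 0.5],
    [3.0, 3.1, 0.8, 0.6],
    [2.9, 3.2, 0.2, 0.2],
    [3.1, 2.8, 0.6, 0.9],
]
y = [0, 0, 0, 1, 1, 1]
gene_clusters = {"cluster_1": [0, 1], "cluster_2": [2], "cluster_3": [3]}
surviving = recursive_cluster_elimination(X, y, gene_clusters)
```

On this toy data only `cluster_1` survives. A faithful implementation would score clusters with cross-validated SVM accuracy, not training accuracy.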

16

17 Recently Grate has described a technique for discovering small sets of genes. The technique is mainly based on a brute-force approach of exhaustive search through all genes, gene pairs and, in some cases, triples of genes. The classification uses two methods: error-correcting output coding (ECOC) and pairwise coupling (PWC).
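A brute-force search of this kind can be sketched with `itertools.combinations`, scoring every single gene and every gene pair. The scoring rule used here, leave-one-out accuracy of a nearest-centroid classifier, is an assumption chosen for brevity and is not taken from Grate's work. The function names and data are likewise hypothetical.

```python
from itertools import combinations

def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid rule using only `genes`."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroids[label] = [sum(r[g] for r in rows) / len(rows) for g in genes]
        predicted = min(
            centroids,
            key=lambda lab: sum((X[i][g] - c) ** 2
                                for g, c in zip(genes, centroids[lab])),
        )
        correct += predicted == y[i]
    return correct / len(X)

def exhaustive_search(X, y, n_genes, max_size=2):
    """Try every gene subset up to `max_size` and keep the first best scorer."""
    best_subset, best_score = None, -1.0
    for size in range(1, max_size + 1):
        for subset in combinations(range(n_genes), size):
            score = loo_accuracy(X, y, subset)
            if score > best_score:  # strict, so smaller subsets win ties
                best_subset, best_score = subset, score
    return best_subset, best_score

# Invented data: neither gene 0 nor gene 1 separates the classes alone,
# but together they do; gene 2 is noise.
X = [
    [1.0, 0.0, 0.3],
    [1.5, 0.5, 0.9],
    [2.0, 1.0, 0.1],
    [0.0, 1.0, 0.8],
    [0.5, 1.5, 0.2],
    [1.0, 2.0, 0.7],
]
y = [0, 0, 0, 1, 1, 1]
best_subset, best_score = exhaustive_search(X, y, n_genes=3)
```

The number of subsets grows as the number of genes raised to the subset size. With tens of thousands of genes, pairs already mean hundreds of millions of evaluations, which explains why triples are searched only in some cases.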

18 The biomarker pattern for distinguishing each disease category from the others is obtained by developing an extended Markov blanket (EMB) feature selection method. The clusters with less information are removed, while the remainder are retained for the next classification step. This process is repeated until an optimal classification result is obtained.

19 Conclusion Many computational approaches are critical for mining high-dimensional data in order to discover biomarkers effectively. The best data mining approach would be to integrate different approaches to arrive at an effective algorithm, as most suggested methods ignore existing biological knowledge and treat all genes equally.

20 Thank You

