Microarray Analysis: Image Processing and Filter Design. Instructors: Dr. Ravi Sankar, Dr. Wei Qian. Student: Kun Li. Nov 2006.

1 Microarray Analysis: Image Processing and Filter Design. Instructors: Dr. Ravi Sankar, Dr. Wei Qian. Student: Kun Li. Nov 2006

2 Introduction to Microarray Analysis Microarray analysis is a relatively new technology in molecular biology research. It is an excellent tool for monitoring the transcription of thousands of genes at a time. The first step of the technique is spotting known sequences onto a substrate, in most cases a glass slide or nylon membrane. Next, mRNA isolated from the biological subjects under study is reverse-transcribed into cDNA. During reverse transcription, the control and experimental materials are differentially labeled; they are then pooled and hybridized to the array. cDNA strands in the pool compete to hybridize to complementary sequences on the array. The relative abundance of the corresponding mRNA from the two sources is assessed from the measured signal.

3 Continue … The objectives of microarray experiments are to reveal unknown genes and new gene functions as a result of experimental treatments, to find new gene expression patterns and use them as a basis for classification of physiological or pathological processes.

4 Continue …

5 Microarray Image Processing Microarray images from cancer patients differ in many ways from those of healthy people. Analyzing microarray images can help with cancer detection and diagnosis and, more importantly, can help identify cancer-related genes. Much research has already been done on recognizing and comparing gene expression patterns.

6 Literature Review Microarray analysis attracts a lot of interest from researchers, and the literature is large. I found 137 papers published only in […]. Twenty of them are listed below:
SE Ahnert, K Willbrand, FCS Brown, TMA Fink (2006), "Unbiased pattern detection in microarray data series", Bioinformatics, 22(12).
David B Allison, Xiangqin Cui, Grier P Page, Mahyar Sabripour (2006), "Microarray data analysis: from disarray to consolidation and consensus", Nature Reviews Genetics, 7.
Claes R Andersson, Anders Isaksson, Mats G Gustafsson (2006), "Bayesian detection of periodic mRNA time profiles without use of training examples", BMC Bioinformatics, 7:63.
Richard P Auburn, Roslin R Russell, Bettina Fischer, Lisa A Meadows, Santiago Sevillano Matilla, Steven Russell (2006), "SimArray: a user-friendly and user-configurable microarray design tool", BMC Bioinformatics, 7:102.
Simon Barkow, Stefan Bleuler, Amela Prelic, Philip Zimmermann, Eckart Zitzler (2006), "BicAT: a biclustering analysis toolbox", Bioinformatics, 22(10).

7 Continue …
Anders Bengtsson, Henrik Bengtsson (2006), "Microarray image analysis: background estimation using quantile and morphological filters", BMC Bioinformatics, 7:96.
Henrik Bengtsson, Ola Hossjer (2006), "Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method", BMC Bioinformatics, 7:100.
Daniel Berrar, Ian Bradbury, Werner Dubitzky (2006), "Avoiding model selection bias in small-sample genomic data sets", Bioinformatics, 22(10).
Daniel Berrar, Ian Bradbury, Werner Dubitzky (2006), "Instance-based concept learning from multiclass DNA microarray data", BMC Bioinformatics, 7:73.
Ghislain Bidaut, Karsten Suhre, Jean-Michel Claverie, Michael F Ochs (2006), "Determination of strongly overlapping signaling activity from microarray data", BMC Bioinformatics, 7:99.
Jonathon Blake, Christian Schwager, Misha Kapushesky, Alvis Brazma (2006), "ChroCoLoc: an application for calculating the probability of co-localization of microarray gene expression", Bioinformatics, 22.

8 Continue …
Marta Blangiardo, Simona Toti, Betti Giusti, Rosanna Abbate, Alberto Magi, Filippo Poggi, Luciana Rossi, Francesca Torricelli, Annibale Biggeri (2006), "Using a calibration experiment to assess gene-specific information: full Bayesian and empirical Bayesian models for two-channel microarray data", Bioinformatics, 22.
Philippe Broët, Vladimir A. Kuznetsov, Jonas Bergh, Edison T. Liu, Lance D. Miller (2006), "Identifying gene expression changes in breast cancer that distinguish early and late relapse among uncured patients", Bioinformatics, 22(12).
Ljubomir J. Buturovic (2006), "PCP: a program for supervised classification of gene expression profiles", Bioinformatics, 22.
Roger D Canales, Yuling Luo, James C Willey, Bradley Austermiller, Catalin C Barbacioru, Cecilie Boysen, Kathryn Hunkapiller, Roderick V Jensen, Charles R Knight, Kathleen Y Lee, Yunqing Ma, Botoul Maqsodi, Adam Papallo, Elizabeth Herness Peters, Karen Poulter, Patricia L Ruppel, Raymond R Samaha, Leming Shi, Wen Yang, Lu Zhang, Federico M Goodsaid (2006), "Evaluation of DNA microarray results with quantitative gene expression platforms", Nature Biotechnology, 24.

9 Continue …
Pedro Carmona-Saez, Monica Chagoyen, Andres Rodriguez, Oswaldo Trelles, Jose M Carazo, Alberto Pascual-Montano (2006), "Integrated analysis of gene expression by association rules discovery", BMC Bioinformatics, 7:54.
Pedro Carmona-Saez, Roberto D Pascual-Marqui, Francisco Tirado, Jose M Carazo, Alberto Pascual-Montano (2006), "Biclustering of gene expression data by non-smooth non-negative matrix factorization", BMC Bioinformatics, 7:78.
Yian A Chen, Cheng-Chung Chou, Xinghua Lu, Elizabeth H Slate, Konan Peck, Wenying Xu, Eberhard O Voit, Jonas S Almeida (2006), "A multivariate prediction model for microarray cross-hybridization", BMC Bioinformatics, 7:101.
H Chipman, R Tibshirani (2006), "Hybrid hierarchical clustering with applications to microarray data", Biostatistics, 7(2).
A Choudhary, M Brun, J Hua, J Lowey, E Suh, ER Dougherty (2006), "Genetic test bed for feature selection", Bioinformatics, 22(7).

10 Continue … Most of this research focuses on pattern recognition using neural networks and support vector machines, on gene identification, and on improving image processing methods, for example by optimizing normalization and noise reduction. Our research is different in that it combines image processing with signal processing and focuses on mapping the relations between genes associated with breast cancer.

11 Thinking About Our New Method The most important goal in cancer detection, diagnosis and treatment is to detect the cancer and identify its type at an early stage, before it has developed symptoms that traditional methods can detect. Given a new microarray image, how can we detect its cancer development “potential”?

12 Continue We believe the image pattern will give us some “hints” for cancer detection. Cancer development involves many genes, which means that before a cancer gene is expressed, the expression levels of many other genes have already changed. So if we can find the “implicit” relations between cancer-related genes, the problem is solved. We plan to design filters that can be applied to a microarray image to generate specific “signatures” for cancer and normal samples.

13 It’s important to emphasize the early stage here. Why? Because detecting cancer after people have it is not as meaningful as predicting the probability of developing cancer “before” they get it. The figures below are small parts of normal and cancer microarray images. [Figure panels: Normal; Cancer]

14 The mid-stage (developing/early) is critical: if we knew the gene expression patterns of the mid-stage, we could accurately predict cancer development. However, we cannot obtain these patterns, because we have to rely on other methods to detect cancer before we can label a pattern as cancer or normal. Once a cancer has been detected by traditional methods, it is no longer at the mid-stage we want. [Figure labels: Normal; ? (In developing process); Cancer]

15 How to Resolve the Mid-Stage Problem We can assume there is a cycle as below. [Figure labels: Normal; ? (In developing process); Cancer; ? (After treatment)]

16 In the cycle on the previous slide, we can assume that the two stages marked with question marks have some similarities. Therefore, we can use the gene expression patterns of “after treatment” as a kind of control for the gene expression pattern of “developing”. (Although they are similar, they won’t be exactly the same, so we can only use the “after treatment” pattern as a reference.)

17 Assumption and Hypothesis Assumption: the gene expression patterns of “after treatment” and “developing” have some similarities, and the “after treatment” pattern can be used as a reference. Hypothesis I: There are differences between the gene expression patterns of the “normal”, “developing”, “cancer” and “after treatment” stages. These differences can be distinguished using computational methods. The gene expression pattern of the “developing” stage can be derived from the other three stages with relatively high reliability.

18 Continue … Hypothesis II: We can design filters and apply them to microarray images of the four stages to generate a “signature” for each stage. Hypothesis III: The “signatures” from different stages can be used to predict the probability of developing cancer.

19 Material All images processed in this project come from aCGH tumor data provided by Jonathan Pollack at Stanford University. Thanks to him!

20 Method (we use an example to explain our method) This is a small portion of a microarray image containing 4800 spots. Red = Cancer, Green = Control, Yellow = Mixed. The first thing to do is to separate the red and green layers.
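
The channel separation described on this slide can be sketched in a few lines. This is only an illustration, assuming the image is held as rows of (R, G, B) tuples; the function name `split_red_green` and the sample pixel values are hypothetical and not taken from the slides.

```python
def split_red_green(rgb_image):
    """Return (red_layer, green_layer) as 2-D lists of intensities."""
    red = [[pixel[0] for pixel in row] for row in rgb_image]
    green = [[pixel[1] for pixel in row] for row in rgb_image]
    return red, green

# Hypothetical 1x3 image: a red (cancer) spot, a green (control) spot,
# and a yellow (mixed) spot that is bright in both channels.
image = [[(200, 10, 0), (15, 180, 0), (190, 185, 0)]]
sample_layer, control_layer = split_red_green(image)
```

A yellow spot shows up in both layers, which is why the two channels are analyzed separately.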

21 Sample and Control Layer [Figure panels: Sample Layer; Control Layer]

22 Convert RGB Image to Grayscale Image To find the spots, we need to convert the RGB image to a grayscale image.
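
A minimal sketch of the grayscale conversion follows. The slides do not say which weighting was used; the weights below are the common ITU-R BT.601 luma coefficients and are an assumption, as is the function name `to_grayscale`.

```python
def to_grayscale(rgb_image, weights=(0.299, 0.587, 0.114)):
    """Weighted sum of the R, G, B channels for every pixel.

    The default weights are the BT.601 luma coefficients (an assumption;
    the slides do not state the conversion that was actually used).
    """
    r_w, g_w, b_w = weights
    return [[r_w * r + g_w * g + b_w * b for (r, g, b) in row]
            for row in rgb_image]

gray = to_grayscale([[(255, 255, 255), (0, 0, 0), (100, 0, 0)]])
```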

23 Compute the Mean Intensity of the Image To set up a regular grid, we compute the mean intensity of each column of the image. This helps us identify the centers of the spots and the gaps between them. [Figure: Mean Intensity Profile]
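
The column-wise mean profile can be sketched as below, assuming the grayscale image is a 2-D list of numbers; `column_mean_profile` is a hypothetical name used only for illustration.

```python
def column_mean_profile(gray):
    """Mean intensity of each column of a 2-D grayscale image."""
    n_rows = len(gray)
    n_cols = len(gray[0])
    return [sum(row[c] for row in gray) / n_rows for c in range(n_cols)]

# Toy image: one bright column (a spot) between two dark columns (gaps).
profile = column_mean_profile([[0, 10, 0],
                               [0, 30, 0]])
```

Columns that pass through spots give high values in the profile, and the gaps between spots give low values.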

24 Use Autocorrelation to Enhance the Result Ideally the spots would be periodically spaced, but in practice they differ in shape, size and intensity, so the mean profile looks irregular. We can use autocorrelation to enhance the result.
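
One way to see why autocorrelation helps: the autocorrelation of a roughly periodic profile peaks at multiples of the spot spacing, even when individual peaks are uneven. The sketch below is an assumption about how this step could be done, not the authors' implementation; `autocorrelation` and `estimate_spacing` are hypothetical names.

```python
def autocorrelation(profile):
    """Autocorrelation of the mean-removed profile for lags 0..n-1."""
    n = len(profile)
    mean = sum(profile) / n
    centered = [v - mean for v in profile]
    return [sum(centered[i] * centered[i + lag] for i in range(n - lag))
            for lag in range(n)]

def estimate_spacing(profile):
    """Lag of the first local maximum after lag 0, or None if there is none."""
    ac = autocorrelation(profile)
    for lag in range(1, len(ac) - 1):
        if ac[lag] > ac[lag - 1] and ac[lag] >= ac[lag + 1]:
            return lag
    return None

# Synthetic profile with one spot every 4 pixels.
spacing = estimate_spacing([0, 1, 5, 1] * 5)
```

The estimated spacing can then guide how the profile is cleaned up before the peaks are segmented.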

25 Peak Segmentation Remove the background noise, then set a threshold to segment the peaks. [Figures: Enhanced Mean Intensity Profile; Peak Segmentation]
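
A minimal sketch of threshold-based peak segmentation on a 1-D profile. The slides do not give the threshold; here it is placed at a fraction of the range between the profile minimum (treated as background) and maximum, which is an assumption, as is the name `segment_peaks`.

```python
def segment_peaks(profile, fraction=0.5):
    """Return (start, end) index pairs of runs that rise above the threshold.

    The threshold sits at `fraction` of the way from the minimum (taken as
    the background level) to the maximum of the profile.
    """
    low, high = min(profile), max(profile)
    threshold = low + fraction * (high - low)
    regions, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:
        regions.append((start, len(profile) - 1))
    return regions

peaks = segment_peaks([2, 2, 9, 10, 2, 3, 8, 9, 9, 2])
```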

26 Grid Point Locating Each grid point should lie at the midpoint between two adjacent peaks. [Figure: red lines show the grid locations]
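
The midpoint rule can be sketched as follows, assuming each segmented peak is given as a (start, end) index pair; `grid_points` is a hypothetical helper name.

```python
def grid_points(peak_regions):
    """Midpoints between the centers of adjacent peak regions."""
    centers = [(start + end) / 2 for start, end in peak_regions]
    return [(left + right) / 2 for left, right in zip(centers, centers[1:])]

# Peaks centered at 2, 8 and 14 give grid lines at 5 and 11.
lines = grid_points([(1, 3), (7, 9), (13, 15)])
```

Note that n peaks yield n - 1 interior grid lines; the outer borders of the first and last spot have to be added separately, for example by extrapolating with the spot spacing.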

27 Transpose and Repeat That gives us the vertical grid. To get the horizontal grid, simply transpose the image and repeat the process described above.

28 Set Up Bounding Boxes Now we can form bounding-box regions that address each spot individually, using pairs of neighboring grid points.
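
Pairing neighboring grid points into boxes can be sketched like this. The (left, top, right, bottom) tuple layout and the name `bounding_boxes` are assumptions made for the illustration.

```python
def bounding_boxes(x_grid, y_grid):
    """One (left, top, right, bottom) box per pair of neighboring grid lines."""
    boxes = []
    for top, bottom in zip(y_grid, y_grid[1:]):
        for left, right in zip(x_grid, x_grid[1:]):
            boxes.append((left, top, right, bottom))
    return boxes

# Three vertical and two horizontal grid lines enclose two spots.
boxes = bounding_boxes([0, 10, 20], [0, 5])
```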

29 Segment Spots From Background Apply a logarithmic transformation, then a global threshold. [Figure: Global Threshold]
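
A sketch of the log transform followed by a single global threshold. The slides do not state how the threshold was chosen; here it is placed at a fraction of the log-intensity range, which is an assumption, and `log_global_mask` is a hypothetical name.

```python
import math

def log_global_mask(gray, fraction=0.5):
    """Log-transform the image, then apply one threshold to every pixel.

    log(1 + v) compresses the dynamic range; the threshold is set at
    `fraction` of the way between the smallest and largest log value.
    """
    logs = [[math.log(1 + v) for v in row] for row in gray]
    flat = [v for row in logs for v in row]
    low, high = min(flat), max(flat)
    threshold = low + fraction * (high - low)
    return [[v > threshold for v in row] for row in logs]

# Background pixel, weak spot pixel, strong spot pixel.
mask = log_global_mask([[0, 10, 1000]])
```

With these toy values the weak pixel (10) falls below the global threshold, which is the kind of miss the local threshold is meant to address.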

30 Continue Since we already have the bounding boxes, we can try a local threshold. [Figure: Local Threshold]
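
A local threshold computes a separate cutoff inside each bounding box. The sketch below uses the mean intensity of the box as the cutoff, which is an assumption (the slides do not name the rule), and treats boxes as half-open integer ranges; `local_mask` is a hypothetical name.

```python
def local_mask(gray, boxes):
    """Threshold each box at its own mean intensity.

    Boxes are (left, top, right, bottom) with right and bottom exclusive.
    """
    mask = [[False] * len(gray[0]) for _ in gray]
    for left, top, right, bottom in boxes:
        cells = [(r, c) for r in range(top, bottom) for c in range(left, right)]
        threshold = sum(gray[r][c] for r, c in cells) / len(cells)
        for r, c in cells:
            mask[r][c] = gray[r][c] > threshold
    return mask

# Two boxes on one row: a weak spot (1, 9) and a strong spot (100, 900).
mask = local_mask([[1, 9, 100, 900]], [(0, 0, 2, 1), (2, 0, 4, 1)])
```

Because each box is judged against its own level, the weak spot is kept even though it is far dimmer than the strong one.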

31 Continue Advantages and disadvantages of the two methods above: The log (global) threshold works well, but it misses some weak spots. The local threshold recovers those weak spots, but it handles high-intensity spots poorly.

32 Combine Logarithmic and Local Threshold It is reasonable to combine these two methods. The result is better. [Figure: Combined Threshold]

33 What We Get Now … [Figure panels: Sample; Control]

34 Spot Segmentation and Intensity Computation Final results: [Figure panels: Cancer; Control] The number in each bounding box shows the intensity of that spot.

35 Breast Cancer Analysis Now we are ready to inspect the “implicit” relations between genes “hiding” in a microarray image. Our idea is to design filters that can be applied to a microarray image to generate breast cancer “signatures”.

36 Here’s an Example … We use a 6×6 matrix from a breast cancer microarray image as an example. We use each row of the normal-control intensity matrix as a filter on the control and cancer microarray images. Here are some results (red = cancer, green = control). [Figure: Row Filter 1]

37 Continue … [Figures: Row Filter 2; Row Filter 3]

38 Continue … [Figures: Row Filter 4; Row Filter 5]

39 Continue … Randomly choose spots to design the filter: [Figure: Random Filter]

40 Continue … Choose specific (cancer-related) spots to design the filter: [Figure: Specific Filter]

41 Continue … The good news: in the processed results, the cancer sample and the normal control differ significantly, so it is easy to detect breast cancer. The bad news: we already know that the image comes from a breast cancer patient. How can we reliably detect early-stage breast cancer?

42 Continue … More bad news: on our small data set, the processed results do not appear to converge to a common standard. [Figure legend: Red = Breast Cancer Sample 1; Blue = Breast Cancer Sample 2; Green = Control]

43 Optimization Taking the first or second derivative of the processed results might improve them (or might not; I think it depends on the results themselves). [Figure: An Example]
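
For a sampled curve, the first and second derivatives can be approximated by finite differences. This is a generic sketch, assuming unit spacing between samples; `diff` is a hypothetical helper and the numbers are made up.

```python
def diff(values):
    """First finite difference: values[k+1] - values[k]."""
    return [b - a for a, b in zip(values, values[1:])]

curve = [1, 4, 9, 16]
first = diff(curve)    # approximates the first derivative
second = diff(first)   # approximates the second derivative
```

Differencing removes any constant offset between two curves, which is why it highlights shape differences but discards absolute expression level.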

44 First/Second Derivatives

45 Another Example

46 First/Second Derivative

47 Discussion Taking the first or second derivative makes us focus on the essential differences between normal and cancer. However, some expression-level information is lost.

48 Why don’t the results converge? There are many reasons why the results do not converge to a standard: the “gene map” depends strongly on the individual person; different researchers have different habits; researchers use different equipment and reagents; and so on.

49 Let’s try another method to design the filter … We’ll identify genes that are strongly related to cancer and design the filter around them. From the cancer (red) and normal (green) images shown before, we can construct two intensity matrices; call them Ic and In. Subtracting In from Ic gives another matrix, which shows the differences between Ic and In; call it Id.

50 For each image we get a specific Id, but for the filter design we need a standard Id, so we average the individual Ids: Id = (Id1 + Id2 + Id3 + … + Idn)/n. Now we can use the large values in Id to design the filter. Note: Id may contain negative values, because the expression level of some genes may be higher in normal cells than in cancer cells.
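
The construction of Id and of the averaged, standard Id can be sketched as follows. The matrices are tiny made-up examples and the function names are hypothetical.

```python
def difference_matrix(ic, i_n):
    """Element-wise Ic - In for two equally sized intensity matrices."""
    return [[c - n for c, n in zip(row_c, row_n)]
            for row_c, row_n in zip(ic, i_n)]

def standard_difference(id_list):
    """Element-wise mean of Id1..Idn: Id = (Id1 + ... + Idn) / n."""
    n = len(id_list)
    rows, cols = len(id_list[0]), len(id_list[0][0])
    return [[sum(m[r][c] for m in id_list) / n for c in range(cols)]
            for r in range(rows)]

# Two hypothetical cancer images compared against the same normal image.
id1 = difference_matrix([[200, 50]], [[50, 100]])
id2 = difference_matrix([[100, 70]], [[50, 100]])
id_std = standard_difference([id1, id2])
```

The second entry is negative in both images: that gene is expressed more strongly in the normal sample than in the cancer sample.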

51 Here’s an example. The largest positive value in Id is the 61st value, 150. We keep this value and set all other values to 0; we call the new matrix F1. Convolving F1 with Id1 gives a result that shows the relation between the 61st gene and all other genes. The most negative value in Id is the 32nd value, -50. We keep this value and set all other values to 0; we call the new matrix F2. Convolving F2 with Id1 gives a result that shows the relation between the 32nd gene and all other genes.
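
A small sketch of the single-value filter and the convolution. It assumes the matrices are flattened to 1-D sequences (the slide's "61st value" suggests linear indexing, but that is an inference) and uses a full discrete convolution; all names and numbers are hypothetical.

```python
def single_value_filter(id_std, pick=max):
    """Keep only the picked value of the flattened Id; zero everything else."""
    keep = id_std.index(pick(id_std))
    return [v if i == keep else 0 for i, v in enumerate(id_std)]

def convolve(a, b):
    """Full discrete convolution of two 1-D sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

id_std = [3, -5, 10, 1]                       # toy standard Id, flattened
f1 = single_value_filter(id_std)              # largest positive value
f2 = single_value_filter(id_std, pick=min)    # most negative value
r1 = convolve([1, 2, 3], f1)                  # toy Id1 convolved with F1
```

With only one non-zero entry, the filter turns the convolution into a copy of Id1 that is scaled by the kept value and shifted by its position, so the output depends directly on where the selected gene sits in the matrix.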

52

53 Take square root …

54 Take log …

55 Another example: linearly add the results from different filters … This time, design the filters from the 35th and 42nd values in the Id matrix, and call them F1 and F2 respectively.
R1 = conv(Id1, F1)
R2 = conv(Id1, F2)
R3 = a*R1 + b*R2 (a, b are constants)

56

57 Not good enough, try another idea … The filter design depends on the location of the selected value in the standard Id matrix, which is tedious and inconvenient. Each spot in the microarray image represents a specific gene; how can we encode this identity? Our idea is to bind a specific frequency to each gene. For example, bind Gene1 to sin(wt), Gene2 to sin(2wt), and so on.

58 The elements of Id then look like this: [Value1*sin(wt) Value2*sin(2wt) Value3*sin(3wt) …] In this way we convert the intensity matrix to the frequency domain.

59 Why do we do this? Advantage 1: sin(iwt) is an orthogonal series: $\int_0^{T} \sin(iwt)\,\sin(jwt)\,dt = 0$ for $i \neq j$, where $T = 2\pi/w$ is one period. So we can design a feature extraction array that contains all genes associated with cancer. For example, the array may look like this: E = [a*sin(3wt) b*sin(17wt) c*sin(45wt) …]. Because the frequency information is bound to the intensity, the design no longer depends on the location of the values.
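
The orthogonality argument can be checked numerically. The sketch below assumes w = 1, one period sampled at 1000 points, and gene k bound to sin(k·t); `encode` and `extract` are hypothetical names, and the values 150, -50 and 20 are made up.

```python
import math

def encode(values, samples=1000):
    """Sum of values[k-1] * sin(k*t), sampled over one period (w = 1)."""
    dt = 2 * math.pi / samples
    return [sum(v * math.sin((k + 1) * n * dt) for k, v in enumerate(values))
            for n in range(samples)]

def extract(signal, gene):
    """Project the signal onto sin(gene*t) to recover that gene's value."""
    dt = 2 * math.pi / len(signal)
    return sum(s * math.sin(gene * n * dt)
               for n, s in enumerate(signal)) * dt / math.pi

signal = encode([150, -50, 20])   # genes 1, 2, 3
value_gene_1 = extract(signal, 1)
value_gene_2 = extract(signal, 2)
value_gene_7 = extract(signal, 7)  # frequency not present in the signal
```

Projecting onto one frequency returns only that gene's value and contributes nothing from the others, so the selection no longer depends on position in the matrix, but it also carries no information about the other genes.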

60 We can use this feature extraction matrix E to “scan” Id and select the critical genes. Advantage 2: we can apply an inverse Fourier transform to bring the intensity matrix into the “time” domain. I think the physical meaning is the expression level of all genes at a specific time. Advantage 3: maybe we can design band-pass or band-stop filters based on this.

61 Disadvantage: Since sin(iwt) is an orthogonal series, the process described above selects only the specific frequencies and wipes out all others, so we can’t see the relations between a specific gene and other genes.

62 Future Work and Challenges Although the processed results don’t converge to a standard, we can build a database that stores as many breast cancer “signatures” as possible. Then, when we get the signature of a new microarray image, we can first try to match it against the database, or compute the “similarities” between the sample and cancer and between the sample and control, to predict the probability of developing cancer.
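
The database lookup could be sketched as a nearest-signature search. The slides do not specify a similarity measure; cosine similarity is used here purely as an assumption, and the signatures, labels and function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equally long signatures."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def closest_label(signature, database):
    """Label of the stored signature most similar to the new one."""
    best_label, best_score = None, -2.0
    for label, stored in database.items():
        for reference in stored:
            score = cosine_similarity(signature, reference)
            if score > best_score:
                best_label, best_score = label, score
    return best_label

# Hypothetical two-element signatures for each class.
database = {"cancer": [[1.0, 0.0]], "control": [[0.0, 1.0]]}
label = closest_label([0.9, 0.1], database)
```

The per-class similarity scores themselves, rather than only the winning label, would be the natural input for estimating a probability of developing cancer.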

63 Finish the frequency and “time” domain analysis. Optimize the filter design.

64 Continue … The big problem is that, since the gene map depends strongly on the individual, it might not be a good idea to use one normal person to “measure” another person. We need microarray images from the same person before and after he or she got cancer, and after he or she received treatment. Such images are very hard to obtain. We could instead use images of normal and abnormal tissue from the same person, but we lack those images too.

65 Items Needed We need 50 or more microarray images of the “normal”, “cancer” and “after treatment” stages from the same person (or from normal and cancer tissues).

66 Appendix: Software We Can Use Although there are many software packages we can use, the ones listed below are free:
F-scan
P-scan
ScanAlyze 2
TIGR Spotfinder
UCSF Spot

67 Thank you!

