Presentation is loading. Please wait.

Presentation is loading. Please wait.

Combining Data Mining and Network Analysis to find Prostate Cancer Biomarkers Gabriela Jurca 1.

Similar presentations


Presentation on theme: "Combining Data Mining and Network Analysis to find Prostate Cancer Biomarkers Gabriela Jurca 1."— Presentation transcript:

1 Combining Data Mining and Network Analysis to find Prostate Cancer Biomarkers Gabriela Jurca 1

2 Today: 1)Background 2)Research results 3)Prostate cancer survey 2

3 1 in 7 men will develop prostate cancer Worldwide: #2 most diagnosed cancer in males Developed countries : #1 Why? 3

4 Cancer biomarkers Damaged DNA UV, X-rays, etc Normal BRCA1 gene Parents Mutated BRCA1 gene Repaired Healthy Cancer Death 4

5 Prostate-specific antigen (PSA) Biomarker for prostate cancer A protein detected in blood Routinely used for: Early detection of prostate cancer Treatment monitoring.. But often not specific or sensitive enough. We could use more cancer biomarkers! 5

6 Finding cancer biomarkers… There are 25 000 – 30 000 protein coding genes Best way to narrow list down is through text mining Finding patterns and trends from existing literature Various data mining techniques can also be used 6

7 Existing work out there Other people have also researched cancer text mining: Build gene-gene or gene-disease co-occurrence networks Integrated databases Rank genes related to disease ABC…ABC… 7

8 Our Approach 8 Retrieve research abstracts on prostate cancer ~125 000 of them Find the genes mentioned in the abstracts Analyze

9 What are top countries in prostate cancer research? 9 All abstracts under “prostate cancer”:Abstracts that also mention genes:

10 Collaborations between the top 10 countries 10

11 What are the top genes? 11 KLK3 is the gene responsible for Prostate-Specific Antigen (PSA)

12 Genes most frequently studied together? 12 Gene-Year Gene-Country

13 Genes most frequently studied together? 13 Gene-YearGene-Country Soft clustering on 52 (years) dimensions Soft clustering on 132 (countries) dimensions Association Rule Mining Association Rule Mining All in the fuzzy area

14 What are the top 10 diseases? 14 Used the web tool DisGeNet to find diseases for each gene

15 What communities of genes exist? 15 Highest betweenness centrality (key genes): 1.AR 2.KLK3 3.ACAD9 4.CDKN2A 5.INS

16 Diseases among the communities 16

17 Contribution Presented a new methodology to explore prostate cancer genetic biomarkers Combined data mining and network analysis to find some interesting trends in prostate cancer research Explored new relationships: gene-year, gene-country, abstract-country 17

18 Future Work Compare results between prostate and breast cancer Validate with experimental results, such as expression data 18 Compare the gene locations on the chromosomes Prostate cancer survey

19 Another data source that we can use to enhance our results 19 Research abstracts Experimental dataSurvey results Strong results

20 Prostate cancer survey A survey regarding your body, lifestyle, and environment Example: 20

21 Why is the survey useful? We can find interesting correlations. For example: 21 2 cups of green tea a day correlates with lower prostate cancer incidence Lactose intolerance correlates with higher prostate cancer incidence

22 Type of questions on the survey: 22

23 What will we do with the data? 1.The anonymous data will be used in aggregate form to find trends 2.We may use data mining and social network analysis to analyze results 3.The results may then be validated by wet-lab researchers! 23

24 Please take the survey! Online (prostatecancer.alhajj.ca) Hardcopy (Today) 24

25 Thank you! Questions? 25


Download ppt "Combining Data Mining and Network Analysis to find Prostate Cancer Biomarkers Gabriela Jurca 1."

Similar presentations


Ads by Google