Session I Database & Data Mining Speaker: Mehmet M. Dalkilic

1 Session I Database & Data Mining Speaker: Mehmet M. Dalkilic
Content of Talk & Notes: Bioinformatics Retreat © M.M. Dalkilic

2 Bioinformatics Retreat @ Bradford Woods © Indiana University 2007
“Systems biology is the science of discovering, modeling, understanding and ultimately engineering at the molecular level the dynamic relationships between the biological molecules that define living organisms”
Leroy Hood, Institute for Systems Biology
Saturday, December 29, 2018 Bioinformatics Bradford Woods © Indiana University 2007

3 Outline
(I) A cursory overview of Database and Data Mining
(II) Examples (a few)
(III) Sundry important research questions
(IV) Summary & Prelude to Discussion

4 Perspectives
"There's millions and millions of unsolved problems. Biology is so digital, and incredibly complicated, but incredibly useful. …. It is hard for me to say confidently that, after fifty more years of explosive growth of computer science, there will still be a lot of fascinating unsolved problems at peoples' fingertips, that it won't be pretty much working on refinements of well-explored things. Maybe all of the simple stuff and the really great stuff has been discovered. It may not be true, but I can't predict an unending growth. I can't be as confident about computer science as I can about biology. Biology easily has 500 years of exciting problems to work on, it's at that level."
Donald Knuth

5 Perspectives
Computer science is no more about computers than astronomy is about telescopes.
Edsger Dijkstra

6 Database

7 Database

8 Database

9 Database

10 Database
SQL → Algebra → Optimized Algebra → Process → Table
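
The pipeline on this slide can be illustrated with a small sketch. It is not taken from the talk: the tables `genes` and `expression`, their columns, and the helper functions `select`, `project`, and `join` are hypothetical, and the "optimization" shown (pushing a selection below a join) is only one common rewrite among many.

```python
# Hypothetical tables; names and values are illustrative only.
genes = [
    {"gene_id": 1, "name": "g1", "organism": "fly"},
    {"gene_id": 2, "name": "g2", "organism": "yeast"},
    {"gene_id": 3, "name": "g3", "organism": "fly"},
]
expression = [
    {"gene_id": 1, "level": 0.9},
    {"gene_id": 2, "level": 0.4},
    {"gene_id": 3, "level": 0.1},
]

# SQL (the declarative starting point):
#   SELECT name, level
#   FROM genes JOIN expression USING (gene_id)
#   WHERE organism = 'fly'


def select(rows, pred):
    """Relational selection: keep the rows that satisfy pred."""
    return [r for r in rows if pred(r)]


def project(rows, cols):
    """Relational projection: keep only the named columns."""
    return [{c: r[c] for c in cols} for r in rows]


def join(left, right, key):
    """Equi-join on a shared column, as a plain nested loop."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def is_fly(row):
    return row["organism"] == "fly"


# Algebra: a direct translation that joins first and filters afterwards.
naive = project(select(join(genes, expression, "gene_id"), is_fly),
                ["name", "level"])

# Optimized algebra: the selection is pushed below the join, so the
# join only sees the rows that can contribute to the answer.
optimized = project(join(select(genes, is_fly), expression, "gene_id"),
                    ["name", "level"])

# Process -> Table: both plans produce the same result table.
print(optimized)
```

Both plans return the same table; the second simply does less work in the join, which is the kind of equivalence an optimizer relies on.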

11 Database
SQL is essentially a form of first-order predicate calculus, but it differs from the general field of mathematical logic:
* We don’t focus on the use of functions (they are omitted in SQL)
* We focus on finitary models
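
As a rough illustration of the point above (not from the talk), a query can be written as a first-order formula and evaluated directly, because the relations are finite and the quantifiers only need to range over the values that actually occur in them. The relations `interacts` and `kinase` and their contents are hypothetical.

```python
# Hypothetical finite relations.
interacts = {("a", "k1"), ("a", "k2"), ("b", "k1"), ("b", "x")}
kinase = {"k1", "k2"}

# Active domain: every value that appears in some relation. It is finite,
# so "exists" and "for all" become any() and all() over this set.
domain = {v for pair in interacts for v in pair} | kinase

# { g | exists p. Interacts(g, p)
#       and for all q. (Interacts(g, q) -> Kinase(q)) }
# Only relation symbols and logical connectives are used, no function symbols.
answer = {
    g
    for g in domain
    if any((g, p) in interacts for p in domain)
    and all((g, q) not in interacts or q in kinase for q in domain)
}

print(answer)
```

Here `answer` contains the elements that interact with something and whose every interaction partner is in `kinase`.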

12 Database
Why can’t I ask any question I’d like in a relational database?
Dirk Van Gucht, DSI

13 Database

14 Database
Why can’t I ask any question I’d like in a relational database?

15 Database
Why can’t I ask any question I’d like in a relational database?

16 Database
Why can’t I ask any question I’d like in a relational database?

17 Database
Why can’t I ask any question I’d like in a relational database?
Dirk Van Gucht, DSI
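
A standard illustration of this limit, offered here as an assumption about the kind of question meant rather than as the talk's own example, is reachability in a graph. Relational algebra without recursion cannot express the transitive closure of a relation for paths of arbitrary length; a fixed query can only follow a fixed number of joins. The sketch below uses a hypothetical `edges` relation and repeats a join step until nothing new appears.

```python
# Hypothetical binary relation: (source, target) pairs.
edges = {("a", "b"), ("b", "c"), ("c", "d")}


def transitive_closure(edges):
    """Repeat a join of the current paths with edges until a fixpoint."""
    closure = set(edges)
    while True:
        # One join step: a path (x, y) and an edge (y, z) give a path (x, z).
        new = {(x, z) for (x, y) in closure for (y2, z) in edges if y == y2}
        if new <= closure:  # nothing new was derived, so we are done
            return closure
        closure |= new


reachable = transitive_closure(edges)
print(sorted(reachable))
```

The loop is the part that a single non-recursive algebra expression cannot capture, since the number of iterations depends on the data.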

18 Data Mining
Dirk Van Gucht, DSI

19 Data Mining

20 Biological processes can be modeled as complex networks of interconnected components.
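
A minimal sketch of what such a network might look like as a data structure, assuming an undirected graph stored as an adjacency map. The component names (`geneA`, `proteinA`, and so on) and the interactions between them are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical pairwise interactions between components.
interactions = [
    ("geneA", "proteinA"),
    ("proteinA", "geneB"),
    ("proteinA", "geneC"),
]

# Adjacency map: each component points to the set of its neighbours.
network = defaultdict(set)
for u, v in interactions:
    network[u].add(v)
    network[v].add(u)

# Degree (number of neighbours) is one simple measure of connectedness.
degree = {node: len(neighbours) for node, neighbours in network.items()}
hub = max(degree, key=degree.get)

print(hub, degree[hub])
```

In this toy network `proteinA` is the most connected component; real analyses would use richer models, but the representation is the same idea.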

21 Data Integration Problem
How is data meaningfully integrated?

22 Messy issues of database & data mining
* How are the data related?
* What kind of model?
* What kind of inferencing?
* Are the data validated?
* Is there sufficient reason to use the network?

23 Significant Challenges
* Relational databases currently ignore domains.
* The relational model is poor at modeling biological data and their uncertain nature: there are no probabilistic means of querying.
* There has been no advance in querying. Incorporate other successes in dealing with large repositories.
* Databases have no casual user in mind; they are designed by experts.
* Data mining has focused almost exclusively on relationally modeled data. It has ignored actionable results.
* Viewing and search are still in their infancy.

24 Acknowledgements
Thanks to organizers. Contact me if you’d like to discuss anything.
Acknowledgements (no special order): Justen Andrews, Haixu Tang, Sun Kim, Jim Costello, Rupali Patwardhan, Junguk Hur, Sumit Middha, Brian Ead, Esfandiar Haghverdi, John Colbourne, Scott Beason, Pedja Radivojac

