Computational biology and computational biologists
Tandy Warnow, UT-Austin
Department of Computer Sciences
Institute for Cellular and Molecular Biology
Program in Evolution, Ecology, and Behavior
Center for Computational Biology and Bioinformatics

Two computational biologists
- One computational biologist needs to know a lot of biology
- Another needs to know a lot of mathematics

Another two computational biologists
- Craig Benham: mathematics of stressed DNA (understanding regulation)
- Gene Myers: whole genome sequencing and BLAST

Two different types of computational biologists
- One works on mathematical or computational problems (derived from biology) that are well posed and hard to solve; these need significant computer science, mathematics, or statistics
- One works on biological problems that are not well posed, and where the computer science, mathematics, or statistics needed may be “easier”
- Both kinds of problems can be important to biologists, who cannot solve them without computational biologists’ involvement

My view of Pasteur’s Quadrant
[Figure: quadrant diagram with one axis running from hard math to easy math and the other from easily applicable to not applicable. Successive builds of the slide mark three regions: what computational scientists want, what computational scientists do, and what biologists want.]

Phylogeny
[Figure: phylogenetic tree of Orangutan, Gorilla, Chimpanzee, and Human. From the Tree of Life website, University of Arizona.]

DNA Sequence Evolution
[Figure: a tree of short DNA sequences evolving along a time axis marked -3 mil yrs, -2 mil yrs, -1 mil yrs, and today. Sequences shown in the figure: AAGACTT, TGGACTT, AAGGCCT, AGGGCAT, TAGCCCT, AGCACTT, TAGCCCA, TAGACTT, AGCGCTT, AGCACAA.]

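Molecular Systematics
[Figure: five observed sequences (TAGCCCA, TAGACTT, TGCACAA, TGCGCTT, AGGGCAT) at leaves labeled U, V, W, X, Y, alongside a tree on the leaves U, V, W, X, Y.]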

Computational challenges for Assembling the Tree of Life
- 8 million species for the Tree of Life; we cannot currently analyze more than a few hundred (and even this can take years)
- We need new methods for inferring large phylogenies: hard optimization problems!
- We need new software for visualizing large trees
- We need new database technology
- Not all phylogenies are trees, so we need methods for inferring phylogenetic networks
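One way to see why inferring large phylogenies leads to hard optimization problems is to count the candidate trees. The number of unrooted binary tree topologies on n labeled leaves is (2n-5)!!, the product of the odd numbers from 1 to 2n-5. The sketch below only illustrates that growth; the function name is an assumption made here and does not come from the talk.

```python
def num_unrooted_binary_trees(n_leaves: int) -> int:
    """Count unrooted binary tree topologies on n labeled leaves: (2n-5)!!.

    Hypothetical helper for illustration only.
    """
    if n_leaves < 3:
        return 1  # with fewer than three leaves there is a single topology
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5.
    for k in range(3, 2 * n_leaves - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (4, 5, 10, 20, 50):
        print(n, num_unrooted_binary_trees(n))
```

Even for ten leaves there are already more than two million topologies, so exhaustive search is ruled out long before a few hundred species.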

Time is a bottleneck for MP and ML
[Figure: MP score plotted over the space of phylogenetic trees, with a global optimum and a local optimum marked.]
- Systematists tend to prefer trees with the optimal maximum parsimony (MP) score or optimal maximum likelihood (ML) score; however, both problems are hard to solve
- (Our experimental studies show that polynomial time methods do not do as well as MP or ML heuristics when trees are big and have high rates of evolution)
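To make the MP score concrete, here is a minimal sketch that scores one fixed tree with Fitch's algorithm, a standard way to compute the parsimony score (the minimum number of character changes) of a given binary tree. It is not taken from the talk: the nested-tuple tree encoding, the function names, and the toy sequences are all assumptions for illustration.

```python
def fitch_site(tree, states):
    """Return (candidate_state_set, change_count) for one alignment column.

    `tree` is either a leaf name (str) or a pair (left_subtree, right_subtree).
    `states` maps each leaf name to its character in this column.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        # The children can agree on a state: no extra change is needed here.
        return common, left_cost + right_cost
    # The children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, seqs):
    """Sum the Fitch change counts over all columns of equal-length sequences."""
    length = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {name: seq[i] for name, seq in seqs.items()})[1]
        for i in range(length)
    )


if __name__ == "__main__":
    # Toy data (hypothetical): four short aligned sequences.
    seqs = {"A": "AAG", "B": "AAG", "C": "ACT", "D": "ACT"}
    print(parsimony_score((("A", "B"), ("C", "D")), seqs))
    print(parsimony_score((("A", "C"), ("B", "D")), seqs))
```

Scoring a single tree this way takes time linear in the number of leaves and sites. The difficulty described on the slide lies in searching over the enormous set of candidate trees for one with the best score.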

MP/ML heuristics
[Figure (fake study): MP score of best trees plotted against time, showing the performance of a hill-climbing heuristic.]

DCM-boosting
Speeding up MP/ML heuristics
[Figure (fake study): MP score of best trees plotted against time, comparing the performance of a hill-climbing heuristic with the desired performance.]

Characteristics
- The research can be published in mathematics, statistics, and computer science journals and conferences, and evaluated by the standards of those fields
- These people can be faculty in mathematics, statistics, or computer science departments, and *maybe* in some biology departments
- Substantive improvements are hard, but if achieved will have enormous impact on many biologists
- Why? These are old problems of a computational nature, endorsed by biologists.

The “other” type
- Deals with problems like: protein fold prediction, inferring metabolic or regulatory networks, finding genes within genomes, or even computing a good multiple sequence alignment
- Needs to know a lot of biology to pose appropriate computational problems
- Resultant algorithms may not (in some cases) make for interesting or publishable mathematics
- Note: generally new problems because of new data

What’s needed (for all types)
- Ability to collaborate with a variety of people, and learn what they want to achieve
- Ability to be flexible in terms of how one evaluates research results (e.g., real vs. simulated data, theory versus experiment)
- Ability to communicate research results to different types of researchers
- Ability to use a variety of techniques to solve biological problems
- Ability to model and pose appropriate computational approaches for biological problems

Difficult questions
- What departments should have computational biologists (especially of the second type)?
- Should there be departments of computational biology?
- Should there be PhD programs in computational biology?
- How to evaluate a computational biologist of either type?

Some issues for academic computational biologists
- Journal versus conference papers, and number of each
- Experimental/empirical versus theoretical work
- Software versus papers
- Authorship order within publications
- Promotion and tenure in two departments?
- Biggest issue: How to predict future success???
