Introduction 1.0


1 1Introduction 1.0 http://creativecommons.org/licenses/by-sa/2.0/

2 2Introduction 1.0 Welcome to the Canadian Bioinformatics Workshops. Bioinformatics, 9th Edition. Toronto, ON, Feb 13 – 25, 2006. Francis Ouellette: Director, CGDN Bioinformatics Core Facility; Director, UBC Bioinformatics Centre

3 3Introduction 1.0 Outline
–Instructors, schedule, other things...
–Why does bioinformatics exist? (data, data, data …)
–Will it exist forever? (some experts say “no”!)
–What is bioinformatics? (get 100 scientists in a room, get 100 answers …)
–What are the big challenges in bioinformatics? (Research & discipline differences between life sciences and computer sciences)
–Resources available to bioinformaticians
–The importance of Open Access and Open Source

4 4Introduction 1.0 David Boris Francis Fiona Chris Stephen Francois “CBW core”

5 5Introduction 1.0 David Boris Francis Fiona John Gabor Chris CBW – Bioinformatics – Toronto 2006 Mike Shum Quaid

6 6Introduction 1.0 Joanne Jenn Will CBW – Bioinformatics – Toronto 2006 Stephen Gerald Jennifer Philip Tong Bhooma David Philip

7 7Introduction 1.0 Acknowledgement A great part of this talk is adapted from what Fiona Brinkman developed two years ago (2004). Whenever you see the “Fiona” tag, that slide is from Fiona’s presentation. Other slides are simply acknowledged with names. Fiona

8 8Introduction 1.0 Today
09:00 Introduction
10:00 Break
10:15 Biology
11:15 UNIX
12:30 Lunch (on your own)
13:45 Biological Databases
15:30 Break
15:45 Entrez lab
17:15 Break
17:30 Public Lecture – Tony Pawson

9 9Introduction 1.0 François Major: closing keynote presentation on Saturday, Feb 25th. Title, Time, Etc.

10 10Introduction 1.0 Administrative stuff Accounts on Linux machines –login: guest –password: cbw2005 Security, badges, fire exits, food, code name, marks

11 11Introduction 1.0 Canadian Bioinformatics Workshops: Bioinformatics, Genomics, Proteomics, Developing the Tools. You are here. bioinformatics.ca Vancouver May 1-6; Calgary June 12-17; Montreal July 22-27. March 17, March 24

12 12Introduction 1.0 CBW Sponsors http://bioinformatics.ca/sponsors.php UOttawa

13 13Introduction 1.0 Questions?

14 14Introduction 1.0 Introduction - Objectives Why does bioinformatics exist What is bioinformatics What are the big challenges in bioinformatics –Research –Discipline differences between Bio and CS

15 15Introduction 1.0 Why is there Bioinformatics?
–Lots of new sequences being added
–Automated sequencers
–Genome Projects
–EST sequencing
–Microarray studies
–Proteomics
–Metagenomics (“Metagenomics” describes the functional and sequence-based analysis of the collective microbial genomes contained in an environmental sample)
–Patterns in datasets that can only be analyzed using computers
Sequencing technology!

16 16Introduction 1.0 Need for informatics in biology: origins
–Early databases: Dayhoff, 1972; Erdmann, 1978. The Atlas of Protein Sequences was available on digital tape and, in 1980, by modem (300 baud).
–Early programs: restriction enzyme sites, pattern finding, promoters, etc., circa 1978.
–1978 – 1993: Nucleic Acids Research published supplemental information.
–1982: DDBJ/EMBL/GenBank are created as a public repository of genetic sequence information.
–1983: NIH funds the PIR (Protein Information Resource) database.
–1988: Pearson and Lipman create FASTA.

17 17Introduction 1.0 Genomes
Year  Genome                          Number of base pairs
1971  First published DNA sequence               12
1977  PhiX174                                 5,375
1982  Lambda                                 48,502
1992  Yeast Chromosome III                  316,613
1995  Haemophilus influenzae              1,830,138
1996  Saccharomyces                      12,068,000
1998  C. elegans                         97,000,000
2000  D. melanogaster                   120,000,000
2001  H. sapiens (draft)              2,600,000,000
2003  H. sapiens                      2,850,000,000

18 18Introduction 1.0 History of DNA Sequencing (timeline figure, 1870 to 2005; adapted from Messing & Llaca, PNAS (1998); from Eric Green)
Milestones shown, in order: Miescher discovers DNA; Avery proposes DNA as ‘Genetic Material’; Watson & Crick: double helix structure of DNA; Holley sequences yeast tRNA-Ala; Wu sequences cohesive end DNA; Sanger: dideoxy chain termination; Gilbert: chemical degradation; Messing: M13 cloning; Hood et al.: partial automation; cycle sequencing; improved sequencing enzymes; improved fluorescent detection schemes.
Second axis: Efficiency (bp/person/year), rising from 1 to >100,000,000.

19 19Introduction 1.0 GenBank doubles every 14 months (from the National Center for Biotechnology Information). That is a shorter doubling time than Moore’s law (computer power doubling every 20 months!)
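To make the gap between the two doubling times concrete, here is a small arithmetic sketch. It assumes idealized exponential growth with exactly the doubling times quoted on the slide (14 and 20 months); the function name `growth_factor` is made up for this illustration.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Fold increase after `years` for a quantity that doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)


if __name__ == "__main__":
    for years in (5, 10):
        genbank = growth_factor(years, 14)   # slide's figure for GenBank
        compute = growth_factor(years, 20)   # slide's figure for Moore's law
        print(
            f"{years} years: GenBank x{genbank:,.0f}, "
            f"computing power x{compute:,.0f}, "
            f"data outpaces hardware by x{genbank / compute:.1f}"
        )
```

Under these assumptions, ten years gives roughly a 380-fold increase in data against a 64-fold increase in computing power, which is why better algorithms, and not only faster machines, are needed.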

20 20Introduction 1.0 The next step is obviously to locate all of the genes and regulatory regions, describe their functions, and identify how they differ between different groups (e.g. “disease” vs “healthy”)… …bioinformatics plays a critical role. Fiona


23 23Introduction 1.0 Bioinformatics will help with… Similarity Searching Sequence Databases
–What is similar to my sequence?
–Searching gets harder as the databases get bigger, and quality changes
–Tools: BLAST and FASTA = time-saving heuristics (approximate methods)
–Statistics + informed judgement of the biologist
Fiona
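The idea of a time-saving heuristic can be sketched in a few lines. The example below is a toy illustration of word seeding followed by ungapped extension, the general strategy behind tools such as BLAST and FASTA; it is not the actual implementation of either program. The function names, the word size `k=4`, and the exact-match extension rule are all assumptions made for this sketch.

```python
from collections import defaultdict


def index_words(sequence: str, k: int) -> dict:
    """Map every length-k word in `sequence` to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        index[sequence[i:i + k]].append(i)
    return index


def find_seeds(query: str, subject: str, k: int = 4) -> list:
    """Return (query_pos, subject_pos) pairs where an exact k-letter word is shared."""
    subject_index = index_words(subject, k)
    seeds = []
    for q in range(len(query) - k + 1):
        for s in subject_index.get(query[q:q + k], []):
            seeds.append((q, s))
    return seeds


def extend_seed(query: str, subject: str, q: int, s: int, k: int) -> str:
    """Grow a seed left and right while letters match exactly (no gaps, no scoring)."""
    start_q, start_s = q, s
    while start_q > 0 and start_s > 0 and query[start_q - 1] == subject[start_s - 1]:
        start_q -= 1
        start_s -= 1
    end_q, end_s = q + k, s + k
    while end_q < len(query) and end_s < len(subject) and query[end_q] == subject[end_s]:
        end_q += 1
        end_s += 1
    return query[start_q:end_q]


if __name__ == "__main__":
    query, subject = "ACGTTGCA", "TTACGTTGGA"
    for q, s in find_seeds(query, subject):
        print(q, s, extend_seed(query, subject, q, s, 4))
```

Only positions that share a short word are examined further, so most of the database is never aligned in full. That is where the speed comes from, and also why such methods are approximate: a true similarity with no shared word is missed.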

24 24Introduction 1.0 Bioinformatics will help with… Structure-Function Relationships
–Can we predict the function of protein molecules from their sequence? sequence > structure > function
–Prediction of some simple 3-D structures (α-helix, β-sheet, membrane spanning, etc.)
Fiona
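One classic way to flag possible membrane-spanning regions from sequence alone is a sliding-window hydropathy profile. The sketch below uses the Kyte-Doolittle hydropathy scale; the window of 19 residues and the threshold of 1.6 are commonly quoted starting values and are treated here as assumptions, as are the function names. It is an illustration of the idea, not a method taken from the slides.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list:
    """Average hydropathy of every window of `window` consecutive residues."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_membrane_windows(protein: str, window: int = 19,
                               threshold: float = 1.6) -> list:
    """Start positions of windows whose average hydropathy reaches the threshold."""
    profile = hydropathy_profile(protein, window)
    return [i for i, avg in enumerate(profile) if avg >= threshold]


if __name__ == "__main__":
    # Made-up sequence: a hydrophobic stretch flanked by charged residues.
    toy = "DDDD" + "L" * 19 + "KKKK"
    print(candidate_membrane_windows(toy))
```

A hit from this kind of profile is only a hint that a segment might cross a membrane; as with similarity searching, the biologist's judgement is still needed to interpret it.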

25 25Introduction 1.0 Top 10 Future Challenges for Bioinformatics
1. Precise, predictive model of transcription initiation and termination: ability to predict where and when transcription will occur in a genome
2. Precise, predictive model of RNA splicing/alternative splicing: ability to predict the splicing pattern of any primary transcript in any tissue
3. Precise, quantitative models of signal transduction pathways: ability to predict cellular responses to external stimuli
4. Determining effective protein:DNA, protein:RNA and protein:protein recognition codes
5. Accurate ab initio protein structure prediction
6. Rational design of small molecule inhibitors of proteins
7. Mechanistic understanding of protein evolution: understanding exactly how new protein functions evolve
8. Mechanistic understanding of speciation: molecular details of how speciation occurs
9. Continued development of effective gene ontologies: systematic ways to describe the functions of any gene or protein
10. Education: development of appropriate bioinformatics curricula for secondary, undergraduate and graduate education
Chris Burge, Ewan Birney, Jim Fickett. Genome Technology, issue No. 17, January, 2002

26 26Introduction 1.0 What is Bioinformatics? Think – Pair – Share! Fiona

27 27Introduction 1.0 Bioinformatics is about understanding how life works. It is a hypothesis-driven science.

28 28Introduction 1.0 Bioinformatics is about integrating biological themes together with the help of computer tools and biological databases, and gaining new knowledge from this.

29 29Introduction 1.0 BLAST (Basic Local Alignment Search Tool) result

30 30Introduction 1.0 PubMed Text Neighboring. Example titles: “Genetic Analysis of Cancer in Families”; “The Genetic Predisposition to Cancer”.
–Common terms could indicate similar subject matter
–Statistical method
–Weights based on term frequencies within document and within the database as a whole
–Some terms are better than others
From Mark Boguski
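Weighting terms by their frequency within a document and across the whole collection is the idea behind TF-IDF scoring. The sketch below is a generic TF-IDF plus cosine-similarity illustration, not PubMed's actual neighboring algorithm; the whitespace tokenizer, the particular weighting formula, and the function names are assumptions made for this example.

```python
import math
from collections import Counter


def tf_idf_vectors(documents: list) -> list:
    """One {term: weight} dict per document, using tf * log(N / document frequency)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Number of documents each term appears in.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    titles = [
        "genetic analysis of cancer in families",
        "the genetic predisposition to cancer",
        "the transcriptional program of human fibroblasts",
    ]
    vectors = tf_idf_vectors(titles)
    print("cancer titles:", cosine(vectors[0], vectors[1]))
    print("cancer vs fibroblasts:", cosine(vectors[0], vectors[2]))
```

A term that occurs in every document gets a weight of zero under this formula, which is one way of expressing that some terms are better than others: rare, specific terms contribute more to similarity than ubiquitous ones.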

31 31Introduction 1.0 Microarray analysis: Figure 1 and Figure 4 from “The Transcriptional Program in the Response of Human Fibroblasts to Serum”, Science, Jan 1 1999: 83-87. Vishwanath R. Iyer, Michael B. Eisen, Douglas T. Ross, Greg Schuler, Troy Moore, Jeffrey C. F. Lee, Jeffrey M. Trent, Louis M. Staudt, James Hudson Jr., Mark S. Boguski, Deval Lashkari, Dari Shalon, David Botstein, Patrick O. Brown

32 32Introduction 1.0 VAST (Vector Alignment Search Tool) result: ferredoxin, Halobacterium marismortui and Chlorella fusca

33 33Introduction 1.0 Computational Biology Analysis. Side chains shown: Q (Gln): NH2-C(=O)-CH2-CH2- ; R (Arg): NH2-C(=NH2+)-NH-CH2-CH2-CH2-

34 34Introduction 1.0 UBiC: Links Directory Curated list of links to bioinformatics tools available worldwide SCIENCE 16 April 2004

35 35Introduction 1.0 UBiC: Links Directory

36 36Introduction 1.0 bioinformatics.ubc.ca/resources/links_directory


38 38Introduction 1.0 Open Source and Open Access Making It Work with Open Source and Open Access What does it mean? –Will it cause the economy to go bust? –Is it too good to be true? –What if I get scooped?

39 39Introduction 1.0 Open Source: what does it mean? Open source - Any software whose code is available for users to look at and modify freely. Linux is the best-known example; others include Apache, the dominant software for servers that provide web pages worldwide.

40 40Introduction 1.0 Open Source in the life sciences Present in all areas of bioinformatics Some very well known examples of tools used in industry and academic circles include: –BLAST –EMBOSS –EnsEMBL –MLagan –GenScan –Bioconductor

41 41Introduction 1.0 Open Access Unrestricted access to data Allows all to work and make discoveries Discoveries are not necessarily open access Open access is applicable to any kind of data you want to apply it to: –Sequence data (DNA, RNA or protein) –Gene expression data –Protein-protein interaction data –Publication

42 42Introduction 1.0 An Open Access Publication is one that meets the following two conditions: –The author(s) and copyright holder(s) grant(s) to all users a free, irrevocable, worldwide, perpetual right of access to, and a license to copy, use, distribute, transmit and display the work publicly and to make and distribute derivative works, in any digital medium for any responsible purpose, subject to proper attribution of authorship, as well as the right to make small numbers of printed copies for their personal use. –A complete version of the work and all supplemental materials, including a copy of the permission as stated above, in a suitable standard electronic format is deposited immediately upon initial publication in at least one online repository that is supported by an academic institution, scholarly society, government agency, or other well-established organization that seeks to enable open access, unrestricted distribution, interoperability, and long-term archiving (for the biomedical sciences, PubMed Central is such a repository). From: http://www.pubmedcentral.gov/about/openaccess.html

43 43Introduction 1.0 Open access is critical to progress in science. Without GenBank and other public sequence databases: –There would be no BLAST –There would be no diagnostic DNA testing –There would be no understanding of the human genome (there probably would not have been a human genome to work on in the first place).

44 44Introduction 1.0 Open Access of Publications. We are way overdue to break down the ivory towers that surround a few journals that are allowed to hide data from everybody who does not pay them! Who paid for the work? Taxpayers’ dollars! There are enough good (read “well reviewed”) journals out there now that we need not publish in closed journals. We will need to get rid of the “old guard” that only wants to publish in Science, Nature and Cell. I think these journals will change. Physicists have been doing this for decades; biologists will figure it out soon. We need the reagents to make discoveries on the text data: we have diseases to understand, cures to find!

45 45Introduction 1.0 http://creativecommons.org/licenses/by-sa/2.0/


49 49Introduction 1.0 So what are we doing here? We will figure out how to think like bioinformaticians, and how to plan, execute and interpret bioinformatics experiments. For this we will need to know and fully understand the available (understandable) tools and databases: how they work, and how to interpret their output. LET US DO IT!!!

50 50Introduction 1.0 One other tool to help in this process: the wiki. Every year that we have taught this workshop, we have added and changed things. This year, amongst many things, we are adding the wiki …

51 51Introduction 1.0 Questions? Open access? Open source? Where the washrooms are? CBW – Bioinformatics? CBW – Proteomics? Genomics? Tools? Graduate program in Bioinformatics at UBC? When do we start?

