INTRODUCTION TO BIOINFORMATICS

1 INTRODUCTION TO BIOINFORMATICS
Dilvan Moreira (based on Prof. André Carvalho's presentation)

2 Objectives
Learning of:
Basic concepts
Main techniques
Future prospects and applications in Bioinformatics
André de Carvalho - ICMC/USP 11/05/2018

3 Content
Molecular Biology
Sequence analysis
Gene recognition
Sequence alignment
Use of Markov chains in biological data
Variation among and within species

4 Content
Natural selection at the molecular level
Phylogenetic analysis
Genome comparison
Gene expression analysis
Identification of regulatory sequences

5 At the end of the course, you will be able to
Understand statistical and algorithmic approaches to sequence and gene expression analysis
Understand the role of computation in modern biology
Read and comprehend recent articles about genomes, in particular the aspects related to data analysis
Become familiar with standard problems and tools, as well as the area's current and future objectives
Access and handle real genomic data

6 About the course
Problem-oriented course
Reading the book is indispensable!
Experimental
Discussion classes based on the course material
Weekly homework

7 Bibliography
Nello Cristianini and Matthew W. Hahn: Introduction to Computational Genomics: A Case Studies Approach, Cambridge, 2007
Neil C. Jones and Pavel A. Pevzner: An Introduction to Bioinformatics Algorithms, MIT Press, 2004

8 Bibliography Site: www.computational-genomics.net

11 Bioinformatics
Research and development of computational, mathematical and statistical tools to solve problems in Biology
Molecular Biology
"Computing is to Biology what Mathematics is to Physics." Harold Morowitz

12 Bioinformatics
Areas of Bioinformatics
Development of new techniques and algorithms for research on biological databases
Development and implementation of tools that allow efficient management of, and access to, several kinds of information

13 Bioinformatics
History
1960: first amino acid sequence databases
Development of algorithms to analyze those data
1980: launch of GenBank and other public databases, with analysis tools
1990: huge growth of databases (GenBank and PDB)

14 Bioinformatics
Research benefits several areas: Medicine, Pharmacy, Agriculture
Medicine:
Improve disease diagnosis
Detect genetic predisposition to diseases
Use gene therapy as a treatment
Allow the development of "personalized medicine", based on an individual's genetic profile

15 Bioinformatics
First step toward understanding how the cell works:
Know its nucleotide sequence (DNA)
It defines the organism's entire genome
Genome: the set of all its DNA

16 Bioinformatics
Genomics ground zero
Publication of the complete genome sequence of a free-living organism (Science, 1995)
The bacterium Haemophilus influenzae
The Institute for Genomic Research (TIGR)
Craig Venter
Until then, only small viral genomes or small parts of other genomes had been sequenced
First genomic sequence (phage virus phiX174) in 1977

17 Bioinformatics
Haemophilus influenzae
Causes several clinical diseases
Meningitis and septicemia (both usually occur in children), middle ear infection, etc.
Until 1933, it was incorrectly considered the cause of the flu (influenza)
Chosen because one of the project leaders had been working with it for decades
They could build high-quality DNA libraries
DNA base pairs in a single circular chromosome

18 Bioinformatics
A few months later, TIGR published another bacterial genome
Mycoplasma genitalium
Causes pelvic inflammation
TIGR created revolutionary computational methods to obtain and assemble genome sequences
Soon after, another group published the first eukaryote sequence
Saccharomyces cerevisiae, a fungus (yeast)

19 Bioinformatics
January 2008: TIGR creates a synthetic bacterial genome
Uses parts of Mycoplasma genitalium
Plan: insert it into a living cell and create the first artificial organism

21 Bioinformatics
In recent years, several genomes have been sequenced
Generating a large amount of data
As of January 2008:
More than 3500 sequencing projects
Around 700 organisms had been completely sequenced

22 Bioinformatics
[Figure; source not given]

23 Bioinformatics
[Figure; source not given]

24 Growth of Nucleotide Sequence Databases
[Figure; series labels: Moore's Law, DNA]

25 Bioinformatics
[Figure; source not given]

26 Ongoing Genome Projects
[Chart, 2007; labels: Total, USA, EU, Japan, Others, Industry; Archaea, Prokaryotes, Eukaryotes]
Archaebacteria: unicellular prokaryotic organisms
Considered an intermediate group between eukaryotes and prokaryotes

27 Bioinformatics
Pace of genome projects
Examples of complete published genomes:
Human
Mouse
Drosophila melanogaster
Arabidopsis thaliana
Yeast
Domestic animal genomes

28 Bioinformatics
Best-known sequencing project
Humans: trillions of cells
Yeast: 1 cell
Carried out by two groups:
Human Genome Project (public, international consortium)
Celera Genomics (private)

29 The Human Genome Project
Officially started in 1990
Originally planned to last 15 years
Technological advances brought its completion forward to 2003
Coordinated by the U.S. Department of Energy and the National Institutes of Health

30 G16 Human Genome Sequencing Consortium
1. Baylor College of Medicine, Houston, Texas, USA
2. Beijing Human Genome Center, Institute of Genetics, Chinese Academy of Sciences, Beijing, China
3. Gesellschaft für Biotechnologische Forschung mbH, Braunschweig, Germany
4. Genoscope, Evry, France
5. Genome Therapeutics Corporation, Waltham, MA, USA
6. Institute for Molecular Biotechnology, Jena, Germany
7. Joint Genome Institute, U.S. Department of Energy, Walnut Creek, CA, USA
8. Keio University, Tokyo, Japan
9. Max Planck Institute for Molecular Genetics, Berlin, Germany
10. RIKEN Genomic Sciences Center, Saitama, Japan
11. The Sanger Centre, Hinxton, U.K.
12. Stanford DNA Sequencing and Technology Development Center, Palo Alto, CA, USA
13. University of Washington Genome Center, Seattle, WA, USA
14. University of Washington Multimegabase Sequencing Center, Seattle, WA, USA
15. Whitehead Institute for Biomedical Research, MIT, Cambridge, MA, USA
16. Washington University Genome Sequencing Center, St. Louis, MO, USA
Funding: National Institutes of Health (US), Department of Energy (US), Medical Research Council of Great Britain and Northern Ireland (UK), Wellcome Trust (UK)

31 The Human Genome Project
Main objectives:
To identify all the genes in human DNA
To determine the base pair sequence that makes up human DNA
To store this information in a database
To improve tools for data analysis
To transfer the acquired technology to the private sector
To address the ethical, legal and social issues that would arise from the project

32 The Human Genome Project
Some results
Number of bases: 3 billion A's, C's, G's and T's
The exact number of genes is still unknown
Currently between … and … genes

33 Human Genome Chromosomes

34 The Human Genome Project
More recent numbers
Released in October 2004
… genes identified, plus … possible genes
Fewer than rice: … to … (shorter genes)
A little more than the nematode: … genes

35 Bioinformatics
Nematode (Caenorhabditis elegans)
Worm
More than … genes
Has already given clues about diabetes, the aging process and cancer development
Has a gene that regulates organ formation
May advance the development of artificial organs

36 Bioinformatics
Some results of the Human Genome Project
A gene has around … bases
Sizes vary widely
The largest known human gene has 2.4 million bases
Genes are concentrated in random areas across the genome
The function of more than 50% of the genes is unknown

37 Bioinformatics
Some results of the Human Genome Project
Around 2% of the genome codes for instructions for protein synthesis
More than 40% of the predicted human proteins show similarities with worm and fruit fly proteins
The human and chimpanzee genomes are 98.5% identical
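
To put the 2% figure in absolute terms, here is a rough back-of-the-envelope sketch. It assumes the rounded genome size of 3 billion bases quoted on slide 32; the function name `coding_bases` is only illustrative and does not come from the slides.

```python
# Rough arithmetic only: both inputs are the rounded figures quoted in the slides.
GENOME_BASES = 3_000_000_000   # ~3 billion bases (slide 32)
CODING_PERCENT = 2             # ~2% of the genome codes for proteins (slide 37)


def coding_bases(total_bases: int, percent: int) -> int:
    """Return how many bases a whole-number percentage of the genome covers."""
    return total_bases * percent // 100


protein_coding = coding_bases(GENOME_BASES, CODING_PERCENT)
print(f"{protein_coding:,} bases code for proteins")
print(f"{GENOME_BASES - protein_coding:,} bases do not")
```

With these rounded inputs, roughly 60 million bases carry protein-coding instructions, and the remainder of the genome does not.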

38 Bioinformatics
Objective of the genetics community:
The equivalent of sequencing an entire human genome for US$1,000
News from the "Terra" website, January 22, 2008:
The American genetic research company 23andMe, sponsored by Google, offers in Europe a personalized copy of one's DNA, ordered over the internet, for around US$999
EI4802,00.html

39 Genomes
Genomes of several species, and of individuals of the same species, are available
Scientists are comparing them

40 Global Sales in Bioinformatics
[Chart: global market, global sales in Bioinformatics; y-axis: millions of dollars ($0 to $6,000); x-axis: year]

41 Bioinformatics
Generated data need to be analyzed
Gradual shift of emphasis: accumulation → interpretation
Laboratory analysis: expensive and labor-intensive
Need for sophisticated computational tools

42 Bioinformatics
[Diagram relating Bioinformatics to: Artificial Intelligence, Theory of Computation, Bio-inspired Computing, Data Structures, Optimization, Computer Networks, Internet, Computer Graphics, Parallel Computing, Image Processing, Software Engineering, Databases]

43 Questions?

