Bioinformatics for Research


Presentation on theme: "Bioinformatics for Research"— Presentation transcript:

1 Bioinformatics for Research
2/9/2018. Bioinformatics for Research, Module 1: Introduction to Genomics and Bioinformatics. January 12, 2017. Mainlab Bioinformatics, Washington State University.

2 Introduction to Genomics
Learning Outcomes: refresh your knowledge of basic genomic concepts and terminology; understand conceptually the different areas of genomic research; know the basic tools of genomics.

3 Prokaryotes vs. Eukaryotes
Prokaryotes: no nucleus; circular or linear chromosomes, plus plasmids; polycistronic operons (multiple genes controlled by a single promoter); generally no introns.
Eukaryotes: nucleus; chromosomes in the nucleus, and mitochondria and chloroplasts also have genomes; monocistronic genes (one gene, one promoter); introns.
Created By Marwa Naguib Mohamed

4 The central dogma of genetics
DNA → (transcription) → mRNA → (translation) → protein → (expression) → trait

5 DNA nucleotides
Standard Bases
Abbreviation   Base
A              Adenine
C              Cytosine
G              Guanine
T              Thymine
U              Uracil
Degenerate Bases
Abbreviation   Base
W              A, T
S              C, G
M              A, C
K              G, T
R              A, G
Y              C, T
B              C, G, T
D              A, G, T
H              A, C, T
V              A, C, G
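The degenerate codes above can be read as "any one of these bases". A minimal Python sketch, assuming only the codes listed on this slide (the function name `expand_degenerate` is made up for illustration), that lists every concrete sequence a degenerate sequence can stand for:

```python
from itertools import product

# Standard bases plus the degenerate codes listed on the slide.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "M": "AC", "K": "GT", "R": "AG", "Y": "CT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
}

def expand_degenerate(seq):
    """Return every concrete DNA sequence matched by a degenerate sequence."""
    choices = [IUPAC[base] for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]

print(expand_degenerate("ARY"))  # ['AAC', 'AAT', 'AGC', 'AGT']
```

The number of expansions grows multiplicatively with each degenerate position, so this is only practical for short sequences such as primers.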

6 Genes and ORFs
Gene: a DNA segment that encodes a specific protein that contributes to the expression of a trait.
Open Reading Frame (ORF): a section of mRNA with no in-frame stop codons that is translated; it runs from a start codon to the first in-frame stop codon.
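To make the ORF definition concrete, here is a small sketch that scans the three forward reading frames of a DNA string for stretches that start with ATG and end at the first in-frame stop codon. The function name `find_orfs` and the `min_codons` threshold are illustrative assumptions; real gene finders also scan the reverse strand and use far more evidence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each ATG...stop ORF in the forward frames.

    min_codons counts the codons before the stop codon, including the ATG.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None                   # close it at the first stop
    return orfs

print(find_orfs("CCATGAAATGGTAACC"))  # [(2, 14, 'ATGAAATGGTAA')]
```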

7 Structure of a Gene
Regulatory regions: up to 50 kb upstream of the +1 site.
Exons: protein-coding and untranslated regions (UTR); 1 to 178 exons per gene (mean 8.8); 8 bp to 17 kb per exon (mean 145 bp).
Introns: splice acceptor and donor sites plus other DNA; average 1 kb to 50 kb per intron.

8 DNA to RNA to Protein

9 Amino Acids
mRNA is translated into protein, which is a series of amino acids. Each amino acid is coded for by a 3-nucleotide codon. Each amino acid has a unique structure and chemical properties.
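Translation by codon can be sketched in a few lines of Python. The table below is the standard genetic code written in the compact TCAG ordering (the same layout NCBI uses for translation table 1); the function name `translate` is just for illustration, and `*` marks a stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(seq):
    """Translate an mRNA or coding DNA sequence, codon by codon, from position 0."""
    seq = seq.upper().replace("U", "T")   # accept RNA or DNA letters
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3)
    )

print(translate("AUGGCCUAA"))  # MA*
```

Because there are 64 codons and only 20 amino acids, most amino acids are encoded by more than one codon, which is visible as repeated letters in the table string.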

10 Amino Acid Codon Table

11 Substitution Matrices

12 Protein structure
Properties of amino acids determine the structure of the protein. Structure is important for protein function. Mutations that alter structure can destabilize or inactivate the protein.

13 What is a Genome?
The DNA content of an organism. It contains all the biological information needed to construct and maintain an organism. In eukaryotic organisms, it is measured in haploid equivalents. Size is most commonly measured in base pairs (e.g. Mb). Genome sizes vary widely and do not correspond to the complexity of an organism.

14 Basic Genome Statistics
Chromosome number and ploidy; GC content; genome size; codon bias; gene content and order.
What is the chromosome number of your favorite organism and how many genes does it have?
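Of the statistics above, GC content is the simplest to compute: the fraction of bases that are G or C. A minimal sketch (the function name `gc_content` is illustrative; ambiguous characters such as N are ignored here by assumption):

```python
def gc_content(seq):
    """Return the fraction of A/C/G/T bases in seq that are G or C."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else 0.0

print(gc_content("ATGC"))  # 0.5
```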

15 Genomics vs Genetics
Genomics is the study of ….
Genetics is the study of ….

16 Genomics Comprises
Structural Genomics: the study of genome structure and organization on a large scale.
Functional Genomics: the study of gene (and protein) function on a large scale.
Translational Genomics: the adaptation of information derived from genome technologies for organism improvement.
What about Comparative Genomics?

17 Structural Genomics The study of genome structure and organization on a large scale Tools of Structural Genomics

18 Functional Genomics
The study of gene function and expression on a large scale.
Tools of Functional Genomics: EST libraries (cDNAs), RNA-Seq technology, next-generation sequencing technology, real-time PCR.

19 Translational Genomics
Transferring the knowledge gained from one species to another or translating basic knowledge to applied knowledge. As we identify gene(s) associated with interesting traits, markers can be identified for marker-assisted selection.

20 Traditional Breeding
Wild species (undesirable fruit, low yield, disease resistant) × Cultivar (desirable fruit, high yield, disease susceptible) → successive backcrosses → improved cultivar.
Selecting trees means waiting years.

21 Molecular Breeding
Wild species (undesirable fruit, low yield, disease resistant) × Cultivar (desirable fruit, high yield, disease susceptible) → improved cultivar.
Select desired progeny long before any fruit is grown, using molecular markers for the trait.

22 Assignment – Extra Credit (10 Pts)
There are many different types of “-omics” that have emerged in the last few years. Please define each of the types below in one paragraph (do not copy from a website). Due to Jodi by Friday, Sept 4.
Transcriptomics is the study of ….
Proteomics is the study of ….
Metabolomics is the study of ….
Phenomics is the study of ….

23 Overview of Bioinformatics
Learning Outcomes Understand the broad concept and approaches used in bioinformatics

24 What is Bioinformatics?
Bioinformatics Working Definition: the application of information technology, computer science, mathematics and statistics to the organization, processing, storage, analysis, visualization and dissemination of genomic, genetic and breeding data.
What is the Range of Bioinformatics? Mathematical modeling of biological systems; developing algorithms for sequence and network analysis; building databases and web tools.

25 Bioinformatics Approach
Mathematical Modeling: abstraction of biological systems - DNA is a “String”.
Developing Algorithms for Sequence Analysis - analysis of “Strings”: sequence alignment, sequence composition.
Building databases and web tools - dissemination and data mining of “Strings”.

26 Bioinformatics Approach
Mathematical Modeling: Abstraction of biological systems DNA is a “String” TAAGTTATTATTTAGTTAATACTTTTAACAATATTATTAAGGTATTTAAAAAATACTATTATAGTATTTAACATAGTTAAATACCTTCCTTAATACTGTTAAATTATATTCAATCAATACATATATAATATTATTAAAATACTTGATAAGTATTATTTAGATATTAGACAAATACTAATTTTATATTGCTTTAATACTTAATAAATACTACTTATGTATTAAGTAAATATTACTGTAATACTAATAACAATATTATTACAATATGCTAGAATAATATTGCTAGTATCAATAATTACTAATATAGTATTAGGAAAATACCATAATAATATTTCTACATAATACTAAGTTAATACTATGTGTAGAATAATAAATAATCAGATTAAAAAAATTTTATTTATCTGAAACATATTTAATCAATTGAACTGATTATTTTCAGCAGTAATAATTACATATGTACATAGTACATATGTAAAATATCATTAATTTCTGTTATATATAATAGTATCTATTTTAGAGAGTATTAATTATTACTATAATTAAGCATTTATGCTTAATTATAAGCTTTTTATGAACAAAATTA
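Once DNA is treated as a string, sequence composition becomes ordinary string processing. A minimal sketch that counts overlapping k-mers (substrings of length k); the function name `kmer_counts` is illustrative, and the input is the first few bases of the sequence above:

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

counts = kmer_counts("TAAGTTATTATTTAG", 2)
for kmer, n in counts.most_common(3):
    print(kmer, n)
```

With k = 1 this gives base composition; larger k exposes repeats and compositional bias, such as the AT-richness visible in the example sequence.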

27 Bioinformatics Approach
Mathematical Modeling: abstraction of biological systems - DNA is a “String”.
Developing Algorithms for Sequence Analysis - analysis of “Strings”: sequence alignment, sequence composition.
Building databases and web tools - dissemination and data mining of “Strings”.

28 Sequence Alignment
Pairwise sequence comparison is the cornerstone of bioinformatics. It is used to:
Infer function (homology): orthologs (genes in separate species that descend from a common ancestral gene) and paralogs (genes that arise by duplication, independent of speciation)
Build evolutionary trees
Do whole-genome comparisons
Infer structure
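The slides do not name a specific alignment method, so as an illustration here is a minimal sketch of one classic approach to pairwise comparison, Needleman-Wunsch global alignment by dynamic programming. The scoring values (match +1, mismatch -1, gap -2) are arbitrary assumptions; tools used in practice rely on substitution matrices and more refined gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))

print(needleman_wunsch("ACGT", "AGT"))  # (1, 'ACGT', 'A-GT')
```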

29 Assembly Algorithms: Newbler, Velvet, Mira, Celera, CAP3, PHRAP, etc.
e.g. GDR Unigenes

30 Multiple Sequence Alignment

31 Phylogenetic Analysis
Fragaria: member of the distantly related Rosoideae (x = 7).
Malus and Prunus: members of the Spiraeoideae (x = 17 and x = 8, respectively).

32 Domain Prediction

33 Genome Mapping/Comparison
[Figure labels: ancestral Rosaceae (N=9); Fragaria (FC, 7); Prunus (PC, 8); Malus (MC, 17)]
The innermost circle represents the nine ancestral chromosomes of Rosaceae. The eight chromosomes of peach are repeated, each section showing the regions that are orthologous to each ancestral chromosome. The concentric circles enable us to identify the ancestral relationships and origins (breakage and fusion).

34 Genome Annotation

35 Visualization Tools Comparative Mapping: CMap
Genome Browsers: GBrowse/JBrowse, etc

36 Structural Bioinformatics

37 Statistical Analysis of Functional Genomics Data
What statistical measures can be used to quantify up- and down-regulation of genes?
Technical and biological error
Arrays
RNA-Seq
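One commonly used measure of up- and down-regulation is the log2 fold change between two conditions. The sketch below is an assumption-laden illustration, not the method from the course: the function name, the use of simple means, and the pseudocount of 1 (added to avoid dividing by zero) are all choices made for this example. It does not address technical or biological error, which requires replicates and a statistical test.

```python
import math

def log2_fold_change(treated, control, pseudocount=1.0):
    """Return log2 of mean(treated) / mean(control), with a pseudocount added to each mean."""
    mean_t = sum(treated) / len(treated)
    mean_c = sum(control) / len(control)
    return math.log2((mean_t + pseudocount) / (mean_c + pseudocount))

print(log2_fold_change([7, 7], [1, 1]))  # 2.0, i.e. four-fold up after the pseudocount
```

Positive values indicate up-regulation in the treated samples, negative values down-regulation, and 0 means no change.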

38 Bioinformatics Approach
Mathematical Modeling: abstraction of biological systems - DNA is a “String”.
Developing Algorithms for Sequence Analysis - analysis of “Strings”: sequence alignment, sequence composition.
Building databases and web tools - dissemination and data mining of “Strings”.

39 Database Resources

40 Database Similarity Searching
Primary databases and community databases

41 NCBI: Primary Database for Genomics Data

42 Querying Databases

43 Querying Databases

