What is Bioinformatics?


1 Welcome to BIO 345 Introduction to Bioinformatics Professor Andrew Michaelson

2 What is Bioinformatics?
The Sequence and Structure of Genes and Proteins Life Sciences Mathematics Informatics Adapted from:

3 The Human Genome displayed as a Karyotype

4 Each chromosome contains a single DNA molecule
© Mayo Foundation for Medical Education and Research (MFMER)

5 Adenine Cytosine Guanine Thymine
Four bases are the primary source of information responsible for all animal and plant life: adenine, cytosine, guanine, and thymine. © Mayo Foundation for Medical Education and Research (MFMER)
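
Because a DNA sequence is written with only these four letters, it can be handled as an ordinary string. The following is a minimal sketch that counts each base; the function name `count_bases` and the example sequence are made up for illustration and do not come from the slides.

```python
from collections import Counter

BASES = "ACGT"  # Adenine, Cytosine, Guanine, Thymine


def count_bases(sequence):
    """Return a dict with the number of A, C, G and T in a DNA string."""
    counts = Counter(sequence.upper())
    return {base: counts.get(base, 0) for base in BASES}


example = "ATGCGATACG"  # hypothetical sequence, for illustration only
print(count_bases(example))
```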

6

7 A genetic map of Human Chromosome 7
Location of the gene responsible for Cystic Fibrosis in 90% of people of Northern European descent

8 Sequence of the Cystic Fibrosis Gene: CFTR

9 The deletion of these three base pairs is responsible
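
A deletion of exactly three bases removes one codon's worth of sequence, so the codons that follow stay in the same reading frame. The sketch below shows this on a made-up sequence; it is not the actual CFTR sequence, and the helper names `codons` and `delete_bases` are hypothetical.

```python
def codons(seq):
    """Split a sequence into complete three-base codons, dropping any remainder."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def delete_bases(seq, start, length):
    """Return seq with `length` bases removed, beginning at 0-based index `start`."""
    return seq[:start] + seq[start + length:]


original = "ATGGCATTTGGAGTT"          # hypothetical sequence, not real CFTR
mutant = delete_bases(original, 6, 3)  # remove the three bases at positions 6-8

print(codons(original))
print(codons(mutant))
```

Only the deleted codon disappears; the codons before and after it are unchanged.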

10 Bioinformatics is an interdisciplinary field that uses computer programs to answer biological questions The Sequence and Structure of Genes and Proteins Biology, Chemistry, Physics Mathematics Statistics/Computer Science Adapted from:

11 Learning Objectives for the Course

12 Learning Objectives for the Course
Be able to search and retrieve sequence information from sequence databases such as NCBI and Ensembl.

13 Learning Objectives for the Course
Be able to search and retrieve sequence information from sequence databases such as NCBI and Ensembl. Be able to annotate simple nucleic acid sequences.

14 Learning Objectives for the Course
Be able to search and retrieve sequence information from sequence databases such as NCBI and Ensembl. Be able to annotate simple nucleic acid sequences. Be able to use fundamental programs such as BLAST, BLAT and Clustal Omega.

15 Learning Objectives for the Course
Be able to search and retrieve sequence information from sequence databases such as NCBI and Ensembl. Be able to annotate simple nucleic acid sequences. Be able to use fundamental programs such as BLAST, BLAT and Clustal Omega. Be able to establish evolutionary relationships of species through sequence comparisons.

16

17

18 Course Design

19 Course Design

20 Course Design

21 Course Design

22 Course Design

23

24

25

26

27 1st assignment (you should do this today):
Send an email from your Farmingdale account to the address: Let me know which Biology and Computer classes you have taken, and which areas of Biology you are most interested in or which classes have been your favorites so far.

28 Learning Objectives
Be aware of class policies and procedures. Establish familiarity with the NCBI website. Establish familiarity with the Ensembl website. Learn to obtain and read a FASTA sequence file. Use PubMed to generate literature searches.

29 Nucleic acid sequence databases are housed at NCBI

30 National Center for Biotechnology Information (NCBI): http://www.ncbi.nlm.nih.gov/

31 Pulldown menu Displayed items will change

32 NCBI Bookshelf Offers Free Textbooks

33 Some Recommended Textbooks

34 PubMed is an excellent way to find health-related journal articles

35 PubMed search is similar to other NCBI search options

36 Nucleic acid sequence databases are housed at NCBI

37 Searching for nucleic acid sequences using NCBI Nucleotide

38 Using Advanced Search Builder

39 Using Advanced Search Builder

40 Results

41 Sequence databases use accession numbers
An accession number is a label used to identify a sequence. It is a string of letters and/or numbers that corresponds to a molecular sequence.
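
Accession numbers are often written with a version suffix after a dot. The sketch below splits such a string into its accession and version parts. The regular expression is a simplified assumption that covers common nucleotide accessions (letters, an optional underscore, digits), not every format NCBI uses, and `NM_000492.4` appears only as an example input string.

```python
import re

# Simplified pattern: letters, optional underscore, digits, optional ".version".
ACCESSION_RE = re.compile(r"^([A-Z]+_?\d+)(?:\.(\d+))?$")


def split_accession(text):
    """Split 'ACCESSION.VERSION' into (accession, version); version is None if absent."""
    match = ACCESSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognized accession: {text!r}")
    accession, version = match.group(1), match.group(2)
    return accession, int(version) if version is not None else None


print(split_accession("NM_000492.4"))  # example input, assumed for illustration
```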

42

43 Filters can help restrict results
RefSeq = Reference Sequences. These are the gold standard of sequences: they are the most likely to be complete and correct.
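
RefSeq accessions can be recognized by their prefix: two letters followed by an underscore. The sketch below maps some common prefixes to the kind of molecule they denote. The table is deliberately partial, and the function name `refseq_type` is hypothetical.

```python
# Partial table of RefSeq accession prefixes and the molecule type each denotes.
REFSEQ_PREFIXES = {
    "NC_": "complete genomic molecule (e.g. a chromosome)",
    "NG_": "genomic region",
    "NM_": "mRNA",
    "NR_": "non-coding RNA",
    "NP_": "protein",
    "XM_": "predicted mRNA (model)",
    "XR_": "predicted non-coding RNA (model)",
    "XP_": "predicted protein (model)",
}


def refseq_type(accession):
    """Return the molecule type for a RefSeq accession, or None if the prefix is not listed."""
    return REFSEQ_PREFIXES.get(accession[:3].upper())


print(refseq_type("NM_000492.4"))  # example input, assumed for illustration
```

A return value of `None` means the accession is either not a RefSeq record or uses a prefix missing from this partial table.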

44 For any assignment or project you should always use the RefSeq sequence

45 Filters can help restrict results

46 For human gene information Ensembl.org has a better interface

47 For human gene information Ensembl.org has a better interface
The current version of the human genome sequence is GRCh38 = Genome Reference Consortium (GRC) human build 38.

48 Searching for genes in Ensembl

49 Search Results

50 Ensembl Gene Record

51 Location of the gene on the chromosome

52 Location of the gene on the chromosome
The molecular location: the location of the gene in base pairs on chromosome 12. The cytogenetic location: the location of the gene using the banding pattern of chromosome 12. In this example: chromosome 12, short arm, region 1, band 2, sub-band 1 (p is the short arm; q is the long arm).

53 Location of the gene on the chromosome
The molecular location: the location of the gene in base pairs on chromosome 12. The cytogenetic location: the location of the gene using the banding pattern of chromosome 12. In this example: chromosome 12, short arm, region 1, band 1, sub-band 2, sub-sub-band 2.
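
The cytogenetic locations described on these two slides would be written 12p12.1 and 12p11.22. The following is a rough sketch of how such a string breaks down into chromosome, arm, region, band and sub-band. The regular expression and the name `parse_cytoband` are assumptions made for illustration, and the pattern handles only simple single-band locations.

```python
import re

# chromosome (1-22, X or Y), arm (p or q), region digit, band digit, optional sub-band digits
BAND_RE = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d)(\d)(?:\.(\d+))?$")


def parse_cytoband(location):
    """Break a cytogenetic location such as '12p12.1' into its named parts."""
    match = BAND_RE.match(location.strip())
    if match is None:
        raise ValueError(f"unrecognized cytogenetic location: {location!r}")
    chromosome, arm, region, band, sub_band = match.groups()
    return {
        "chromosome": chromosome,
        "arm": "short (p)" if arm == "p" else "long (q)",
        "region": int(region),
        "band": int(band),
        # Digits after the dot, kept as a string: "1" is sub-band 1,
        # "22" is sub-band 2 followed by sub-sub-band 2. None if absent.
        "sub_band": sub_band,
    }


print(parse_cytoband("12p12.1"))
print(parse_cytoband("12p11.22"))
```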

54 Location of the gene on the chromosome

55 You will need to capture image information for this course
The Snipping Tool allows you to take a snapshot of anything you want.

56 Location of the gene on the chromosome

57 Downloading images from ensembl
Move your cursor over this icon. This will allow you to download an image in various file types. The images can then be copied and pasted into your document.

58 The transcript table provides links to sequences

59 The RefSeq link will direct you to the NCBI nucleotide record for that gene

60 The RefSeq link will direct you to the NCBI nucleotide record for that gene
Follow this hyperlink

61 NCBI nucleotide record
Title for the record

62 NCBI nucleotide record
Size of the sequence

63 NCBI nucleotide record
This is the accession number for this specific sequence (there are other sequence files associated with this gene).
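
Given an accession number, the same record can also be requested in FASTA format from NCBI's E-utilities `efetch` service instead of through the web page. The sketch below builds the request URL. The accession `NM_000492.4` is only an example input, the function names are hypothetical, and `fetch_fasta` is defined but not called because it needs network access.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession):
    """Build an efetch URL asking the Nucleotide database for a FASTA record."""
    params = {
        "db": "nuccore",      # the NCBI Nucleotide database
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def fetch_fasta(accession):
    """Download the FASTA text for an accession (requires an internet connection)."""
    with urlopen(build_efetch_url(accession)) as response:
        return response.read().decode("utf-8")


print(build_efetch_url("NM_000492.4"))  # example accession, assumed for illustration
```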

64 NCBI nucleotide record

65 FASTA is the universal sequence file type

66 FASTA is the universal sequence file type
The > on the top line indicates the definition line

67 The definition line can be composed of an identifier and a description
This is a compound identifier that contains the GI (GenInfo) number and the RefSeq accession number. The information found after the last | is the description.
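
A definition line of this kind can be taken apart by splitting on the | character. The sketch below assumes the compound layout described on the slide (`gi|number|ref|accession|description`). The example line is invented, and the fallback for a definition line without | characters is an added assumption.

```python
def parse_defline(line):
    """Split a FASTA definition line into its identifiers and its description."""
    if not line.startswith(">"):
        raise ValueError("a definition line must start with '>'")
    body = line[1:].strip()
    if "|" not in body:
        # Simple form: the first word is the identifier, the rest is the description.
        identifier, _, description = body.partition(" ")
        return {"ids": {"accession": identifier}, "description": description.strip()}
    fields = body.split("|")
    description = fields[-1].strip()   # text after the last |
    tags = fields[:-1]                 # alternating tag, value, tag, value ...
    ids = dict(zip(tags[0::2], tags[1::2]))
    return {"ids": ids, "description": description}


# Hypothetical definition line, for illustration only.
example = ">gi|12345|ref|NM_000000.1| Homo sapiens hypothetical gene, mRNA"
print(parse_defline(example))
```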

68 The sequence lines follow the Identifier
These are the sequence lines for the file. For a nucleic acid sequence they will be a string of A, C, G and T. Each line has up to 70 characters.
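
The layout described above (a definition line starting with >, followed by sequence lines) is simple enough to read with a few lines of code. The following is a minimal sketch, not a full FASTA parser; the function names and the example record are made up for illustration.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (definition_line, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are joined into one string
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def format_fasta(header, sequence, width=70):
    """Write one record with the sequence wrapped at `width` characters per line."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([">" + header] + lines) + "\n"


# Hypothetical record, for illustration only.
example = ">seq1 demo record\nACGTACGT\nTTGACA\n"
print(read_fasta(example))
print(format_fasta("seq1 demo record", "ACGT" * 40))
```

With the default width, only the last sequence line of a record can be shorter than 70 characters.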

69 To access the worksheet:
Download, complete, print out, and email it to me. It must be handed in by the end of class. All assignments and papers in this course must be emailed to me. Go to my website to download the worksheet. Make sure to save the file to your USB disk and enable editing. Use your USB/thumb drive to save your work as you go. Don't save in the bioinformatics folder!

