SNP Resources: Finding SNPs Databases and Data Extraction Mark J. Rieder, PhD SeattleSNPs Variation Workshop March 20-21, 2006.


1 SNP Resources: Finding SNPs Databases and Data Extraction Mark J. Rieder, PhD SeattleSNPs Variation Workshop March 20-21, 2006

2 Genotype - Phenotype Studies
Typical Approach: “I have candidate gene/region and samples ready to study. Tell me what SNPs to genotype.”
Other questions:
  How do I know I have *all* the SNPs?
  What is the validation/quality of the SNPs that are known?
  Are these SNPs informative in my population/sample?
  What do I need to know for selecting the “best” SNPs?
  How do I pick the “best” SNPs?
What information do I need to characterize a SNP for genotyping?

3 Minimal SNP information for genotyping/characterization
What is the SNP? Flanking sequence and alleles.
  FASTA format:
    >snp_name
    ACCGAGTAGCCAG [A/G] ACTGGGATAGAAC
  dbSNP reference SNP # (rs #)
Where is the SNP mapped? Exon, promoter, UTR, etc. Picture of the gene with SNPs mapped to the gene structure.
How was it discovered? Method.
What assurances do you have that it is real? Validated how?
What population? African, European, etc.
What is the allele frequency of each SNP? Common (>10%) or rare.
Are other SNPs associated (redundant)? Genotyping data!
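The FASTA-style record on slide 3 can be split into its parts with a few lines of Python. This is only a minimal sketch: the function name `parse_snp_fasta` and the returned dictionary keys are illustrative choices, not part of any dbSNP tool, and the pattern assumes alleles are written as bases separated by `/` inside square brackets, as in the slide's example.

```python
import re

# Matches a bracketed allele list such as [A/G] inside a flanking sequence.
# Assumption: alleles are made of A, C, G, T, N or '-' (for a deletion).
ALLELE_PATTERN = re.compile(r"\[([ACGTN-]+(?:/[ACGTN-]+)+)\]")


def parse_snp_fasta(record):
    """Split one '>name' + sequence record into name, flanks, and alleles."""
    lines = [line.strip() for line in record.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("record must start with a '>' header line")
    name = lines[0][1:].strip()
    # Join the sequence lines and drop any internal whitespace.
    sequence = "".join("".join(line.split()) for line in lines[1:]).upper()
    match = ALLELE_PATTERN.search(sequence)
    if match is None:
        raise ValueError("no [allele/allele] marker found in sequence")
    return {
        "name": name,
        "left_flank": sequence[:match.start()],
        "alleles": match.group(1).split("/"),
        "right_flank": sequence[match.end():],
    }


# The example record from the slide.
record = """>snp_name
ACCGAGTAGCCAG [A/G] ACTGGGATAGAAC"""
snp = parse_snp_fasta(record)
print(snp["name"], snp["left_flank"], snp["alleles"], snp["right_flank"])
```

Keeping the two flanks separate from the allele list is convenient when the sequence is later handed to an assay-design step, which typically needs the flanks on their own.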

4 Finding SNPs: Databases and Extraction
How do I find and download SNP data for analysis/genotyping?
1. Entrez Gene - dbSNP - Entrez SNP
2. HapMap Genome Browser
3. SeattleSNPs PGA Candidate gene website
4. Web applications and other tools: NIEHS, PolyPhen, ECR Browser

5 NCBI - Database Resource www.ncbi.nlm.nih.gov IL1B

6 Finding SNPs: Where do I start?

7 NCBI - Entrez Gene (LocusLink replacement)

8 Finding SNPs: Entrez Gene

9 dbSNP Geneview

10

11 HapMap Verified Finding SNPs: dbSNP validation (by 2hit-2allele)

12

13 Finding SNPs: dbSNP database

14 Entrez SNP - dbSNP genotype retrieval

15 Finding SNPs - Gene Genotype Report

16 Graphic display of genotype data - Visual Genotype

17 Finding SNPs - Gene Genotype Report

18

19 Minimal SNP information for genotyping/characterization (the checklist from slide 3, shown again)
dbSNP - data is there

20 Entrez Gene Entry - Entrez SNP

21 Entrez SNP - direct dbSNP querying

22

23 Entrez SNP - Parseable Multi-SNP reports

24

25 Entrez SNP - Search Limiting Capabilities IL1B

26 Entrez SNP - Search Limits

27 Entrez SNP - Search Limiting Capabilities

28

29 Entrez SNP - More Limit Searching

30

31 Entrez SNP - Query Term Capabilities

32 Entrez SNP - Search Terms Fields

33 More advanced queries: 2[CHR] AND "coding nonsynon"[FUNC]

34

35 Entrez SNP - Search Terms Fields
More advanced queries:
2[CHR] AND "coding nonsynonymous"[FUNC] AND "PGA-UW-FHCRC"[HANDLE]
Note: Can also use wildcard (*) characters and the AND, OR, and NOT operators.
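Queries like the one on slide 35 can also be assembled programmatically. The sketch below only builds the query string and a request URL; it does not contact NCBI. The helper names `field_term` and `build_query` are hypothetical. The field tags (CHR, FUNC, HANDLE) are taken from the slides as they stood in 2006 and may not match the tags the current dbSNP index accepts. The URL uses the NCBI E-utilities ESearch endpoint with `db=snp`, which is an assumption about how one might submit the query outside the web form shown in the talk.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (assumed route for scripted access).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def field_term(value, tag):
    """Format one term as value[TAG], quoting values that are not plain words."""
    text = value if value.isalnum() else f'"{value}"'
    return f"{text}[{tag}]"


def build_query(terms, operator="AND"):
    """Join (value, tag) pairs with a boolean operator (AND, OR, or NOT)."""
    return f" {operator} ".join(field_term(value, tag) for value, tag in terms)


# Field tags as written on the slides; they may differ in current dbSNP.
query = build_query([
    ("2", "CHR"),
    ("coding nonsynonymous", "FUNC"),
    ("PGA-UW-FHCRC", "HANDLE"),
])
url = ESEARCH_URL + "?" + urlencode({"db": "snp", "term": query, "retmax": 20})

print(query)
print(url)
```

Building the string from (value, tag) pairs keeps the quoting rule in one place, so adding or dropping a limit is a one-line change to the list.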

36 Entrez SNP - Advanced Queries

37 Minimal SNP information for genotyping/characterization (the checklist from slide 3, shown again)
EntrezSNP - better!

38 Finding SNPs - Entrez SNP Summary
1. dbSNP is useful for investigating detailed information on a small number of SNPs, and it's good for a picture of the gene.
2. Entrez SNP is a direct, fast database for querying SNP data.
3. Data from Entrez SNP can be retrieved in batches for many SNPs.
4. Entrez SNP data can be “limited” to specific subsets of SNPs and formatted in plain text for easy parsing and manipulation.
5. More detailed queries can be formed using specific “field tags” for retrieving SNP data.

39 Finding SNPs: Databases and Extraction
How do I find and download SNP data for analysis/genotyping?
1. Entrez Gene - dbSNP - Entrez SNP
2. HapMap Genome Browser
3. SeattleSNPs PGA Candidate gene website
4. Web applications and other tools: NIEHS, PolyPhen, ECR Browser

40 www.hapmap.org

41 Finding SNPs: HapMap Browser

42

43 Finding SNPs: HapMap Genotypes

44 Finding SNPs: HapMap Browser

45 Minimal SNP information for genotyping/characterization (the checklist from slide 3, shown again)

46 Finding SNPs: HapMap Browser
1. HapMap data sets are useful because individual genotype data can be used to determine optimal genotyping strategies (tagSNPs) or perform population genetic analyses (linkage disequilibrium).
2. Data are limited to those produced by the project itself (not all of dbSNP). HapMap data is available in dbSNP.
3. HapMap data (Phase II) are prereleased and can be accessed before they appear in dbSNP.
4. Easier visualization of data and direct access to SNP data, individual genotypes, and LD analysis.
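As an illustration of the linkage disequilibrium analysis mentioned in point 1, pairwise r² between two biallelic SNPs can be computed from haplotype frequencies. This is a minimal sketch, not the HapMap browser's own method: it assumes phased haplotypes are already in hand (downloaded genotype data are unphased and would need a phasing step first), and the four three-SNP haplotypes below are made-up toy data. The function name `r_squared` is illustrative.

```python
def r_squared(haplotypes, site_a, site_b):
    """Return r^2 between two biallelic sites from a list of phased haplotypes.

    Each haplotype is a string with one allele character per SNP.
    """
    n = len(haplotypes)
    # Use the alleles on the first haplotype as the reference alleles.
    allele_a = haplotypes[0][site_a]
    allele_b = haplotypes[0][site_b]
    p_a = sum(h[site_a] == allele_a for h in haplotypes) / n
    p_b = sum(h[site_b] == allele_b for h in haplotypes) / n
    p_ab = sum(h[site_a] == allele_a and h[site_b] == allele_b
               for h in haplotypes) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both sites must be polymorphic in the sample")
    d = p_ab - p_a * p_b  # the disequilibrium coefficient D
    return d * d / denominator


# Hypothetical phased haplotypes over three SNPs (toy data, not HapMap output).
toy_haplotypes = ["AGC", "AGT", "CTC", "CTT"]

r2_linked = r_squared(toy_haplotypes, 0, 1)    # sites 0 and 1 always co-occur
r2_unlinked = r_squared(toy_haplotypes, 0, 2)  # sites 0 and 2 are independent
print(r2_linked, r2_unlinked)
```

A pair of SNPs with r² close to 1 carries nearly the same information, which is the idea behind picking one of them as a tagSNP and leaving the other ungenotyped.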

47 Finding SNPs: Databases and Extraction
How do I find and download SNP data for analysis/genotyping?
1. Entrez Gene - dbSNP - Entrez SNP
2. HapMap Genome Browser
3. SeattleSNPs PGA Candidate gene website
4. Web applications and other tools: NIEHS, PolyPhen, ECR Browser

48 Finding SNPs: SeattleSNPs Candidate Genes pga.gs.washington.edu

49

50

51 HapMap Compatible

52 Finding SNPs: SeattleSNPs Candidate Genes

53

54

55

56

57

58 SNP_pos Ind_ID allele1 allele2
(repeat for all individuals, then repeat for the next SNP)
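A file laid out this way (one line per individual per SNP, four whitespace-separated columns) is easy to parse and summarize. The sketch below groups lines by SNP position and computes the minor allele frequency, then applies the 10% "common" cutoff from slide 3. The sample rows, the individual IDs, and the use of `N` as a missing-data code are all assumptions made for illustration; the function names are hypothetical.

```python
from collections import Counter, defaultdict


def parse_genotypes(text):
    """Map each SNP position to a list of (individual, allele1, allele2)."""
    by_snp = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, malformed lines, and any non-numeric header row.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        position, individual, allele1, allele2 = fields
        by_snp[int(position)].append((individual, allele1, allele2))
    return dict(by_snp)


def minor_allele_frequency(genotypes, missing=("N",)):
    """Frequency of the rarer allele at a biallelic SNP (None if no data)."""
    counts = Counter()
    for _, allele1, allele2 in genotypes:
        for allele in (allele1, allele2):
            if allele not in missing:  # assumption: 'N' marks a missing call
                counts[allele] += 1
    total = sum(counts.values())
    if total == 0:
        return None
    if len(counts) < 2:
        return 0.0  # monomorphic in this sample
    return min(counts.values()) / total


# Hypothetical data in the four-column layout shown on the slide.
sample = """1024 D001 A A
1024 D002 A G
1024 D003 G G
1024 D004 A A
2210 D001 C C
2210 D002 C C
2210 D003 C T
2210 D004 C C"""

genotypes = parse_genotypes(sample)
frequencies = {pos: minor_allele_frequency(rows) for pos, rows in genotypes.items()}
for position, maf in sorted(frequencies.items()):
    label = "common" if maf is not None and maf > 0.10 else "rare"
    print(position, maf, label)
```

Because every line carries its own SNP position and individual ID, the file can be processed in a single pass without knowing in advance how many individuals or SNPs it contains.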

59

60

61 SIFT = Sorting Intolerant From Tolerant: evolutionary comparison of non-synonymous SNPs
PolyPhen = Polymorphism Phenotyping: structural protein characteristics and evolutionary comparison

62 PolyPhen: Polymorphism Phenotyping - prediction of functional effect of human nsSNPs
http://tux.embl-heidelberg.de/ramensky/
Physical and comparative analyses used to make predictions
Uses SwissProt annotations to identify known domains
Calculates a substitution probability from BLAST alignments of homologous and orthologous sequences
Ranks substitutions on a scale of predicted functional effects from “benign” to “probably damaging”

63 PolyPhen: Polymorphism Phenotyping - prediction of functional effect of human nsSNPs tux.embl-heidelberg.de/ramensky/

64 Finding SNPs: SeattleSNPs Candidate Genes

65 pga.gs.washington.edu

66

67 Finding SNPs: NIEHS SNPs Candidate Genes egp.gs.washington.edu

68

69 ECR Browser: Evolutionary Conserved Regions
http://ecrbrowser.dcode.org
Aligns sequences to Mouse, Rat, Dog, Opossum, Chicken, Fugu and Drosophila
Gene annotations from UCSC Genome Browser
Easy retrieval of ECR sequences and alignments
Pre-computed transcription factor binding sites

70

71 Human-mouse alignment Fasta sequences

72 ECR Browser: Evolutionary Conserved Regions Transcription Factor Binding Sites from Transfac

73 Finding SNPs: Databases and Extraction
Entrez SNP (www.ncbi.nlm.nih.gov/entrez)
  Direct access to dbSNP data - versatile and flexible querying
HapMap Browser (hapmap.org)
  Access to large scale genotype data
  Rapid/early access on HapMap website
  Browsers provide visualization and other analysis tools
SeattleSNPs (pga.gs.washington.edu)
  Candidate gene focused - inflammation - HLBS phenotypes
  Comprehensive SNP data from resequencing
  Early access - prior to dbSNP release
Other Resources: NIEHS SNPs (egp.gs.washington.edu), PolyPhen, ECR (with TransFac)

