Xinbin Dai, Ph. D. Affymetrix Probeset Mapping and Medicago Genome Annotation (Mt4.0 RC1)


1 Xinbin Dai, Ph. D. Affymetrix Probeset Mapping and Medicago Genome Annotation (Mt4.0 RC1)

2 Agenda
– About Affymetrix Medicago GeneChip
– Mapping Algorithm and Tool
– Bioinformatics Resources for Medicago truncatula

3 Affymetrix GeneChip Probes
[Figure: a target transcript (mRNA: 5’ UTR, EXON-I, EXON-II, EXON-III, 3’ UTR) covered by a probe set of 11 probes. Each probe is a 25-mer (positions 1 to 25), shown as a perfect match (PM) and a mismatch (MM).]

4 Probeset Types
– id_at: Designates probe sets that uniquely recognize target transcripts.
– id_a_at: Designates probe sets that recognize alternative transcripts from the same gene.
– id_s_at: Designates probe sets with common probes among multiple transcripts from different genes.
– id_x_at: Designates probe sets where it was not possible to select either a unique probe set or a probe set with identical probes among multiple transcripts. Rules for cross-hybridization were dropped in order to design the _x probe sets. These probe sets share some probes identically with two or more sequences and, therefore, may cross-hybridize in an unpredictable manner.
Source: GeneChip® Expression Analysis: Data Analysis Fundamentals.
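The suffix rules above can be turned into a small classifier. This is a minimal sketch, not part of the original slides: the function name and the returned labels are hypothetical, and the example IDs are the ones shown on the next slide.

```python
def probe_set_type(probe_set_id: str) -> str:
    """Classify an Affymetrix probe set ID by its suffix.

    The labels returned here are illustrative names, not official terms.
    """
    # Check the longer suffixes first, because every ID also ends in "_at".
    if probe_set_id.endswith("_a_at"):
        return "alternative"
    if probe_set_id.endswith("_s_at"):
        return "shared"
    if probe_set_id.endswith("_x_at"):
        return "cross-hybridizing"
    if probe_set_id.endswith("_at"):
        return "unique"
    return "unknown"


for example_id in ("Mtr.10097.1.S1_at", "Mtr.10267.1.S1_a_at",
                   "Mtr.10146.1.S1_s_at", "Mtr.10093.1.S1_x_at"):
    print(example_id, probe_set_type(example_id))
```

The order of the checks matters: testing for "_at" first would label every probe set as unique.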

5 About Medicago GeneChip
Type                 Example                Num of probe sets   Percent in the Mtr. set   Notes
Unique               Mtr.10097.1.S1_at      44,182              86.80                     Unique to one gene
Alternative (_a_)    Mtr.10267.1.S1_a_at    116                 0.23                      Alternative probe sets to one gene
Shared (_s_)         Mtr.10146.1.S1_s_at    4,793               9.42                      Common to multiple genes
Others (_x_)         Mtr.10093.1.S1_x_at    1,809               3.55                      Other probe sets with complicated mapping
Total                                       50,900              100
Reference sequences: early version of IMGAG, DFCI GeneIndex and alfalfa EST

6 Mapping Algorithm and Tool
Gene transcripts were matched to corresponding Affymetrix probe sets using a position-weighted scoring index in which mismatches near the middle of a probe were most heavily penalized, as follows:
Weights for probe positions 1 to 25: [1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,2,2,2,2,2,1,1,1,1,1]
– A perfect match for a single probe yields a score of 45, the sum of the weights.
– Matches were declared when at least 8 of 11 probes had scores of 43 or higher.
– Cutoff for matching: 43 × 8 = 344
Originated from Affymetrix, Inc.
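The scoring rule on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the AffyProbeMapping implementation: it assumes ungapped comparison of each probe against every 25-nt window of the transcript, probes given in the same orientation as the transcript, and a probe score equal to the sum of the weights at matching positions. All function and constant names are hypothetical. With these weights, a score of 43 or higher tolerates at most one mismatch at a weight-2 position or two mismatches at weight-1 positions.

```python
# Position weights from the slide: the middle of the 25-mer counts most.
WEIGHTS = [1] * 5 + [2] * 5 + [3] * 5 + [2] * 5 + [1] * 5
PROBE_CUTOFF = 43   # minimum score for a single probe to count as matching
MIN_PROBES = 8      # minimum number of matching probes per probe set


def probe_score(probe: str, window: str) -> int:
    """Sum the weights of the positions where probe and window agree."""
    return sum(w for w, p, t in zip(WEIGHTS, probe, window) if p == t)


def best_probe_score(probe: str, transcript: str) -> int:
    """Best ungapped score of a probe over all windows of the transcript."""
    n = len(probe)
    return max(
        (probe_score(probe, transcript[i:i + n])
         for i in range(len(transcript) - n + 1)),
        default=0,  # transcript shorter than the probe
    )


def probe_set_matches(probes: list[str], transcript: str) -> bool:
    """Apply the slide's rule: enough probes must reach the cutoff."""
    hits = sum(best_probe_score(p, transcript) >= PROBE_CUTOFF for p in probes)
    return hits >= MIN_PROBES
```

The sketch applies the "at least 8 of 11 probes at 43 or higher" rule directly. The figure 344 on the slide is the product of those two thresholds; whether the tool also compares a summed score against it is not stated, so it is left out here.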

7 AffyProbeMapping: An Online Affymetrix Probeset Mapping Tool
http://bioinfo3.noble.org/affymap/
Input sequence: Transcript cDNA EST/Unigene CDS

8 Output of AffyProbeMapping
AffyProbeMapping also supports Affymetrix chips for other species: Lotus japonicus, Arabidopsis thaliana, rice, soybean, maize, Populus, cotton and tomato.

9 Bioinformatics & Data Resources for Medicago truncatula
Data Sources:
– Mt3.5v4 (2011, version for Nature paper): optical mapping; 44,124 BAC-based gene loci + 18,264 Illumina (nr) gene models
– Mt3.5v5 (2012, minor changes): 45,859 BAC-based gene loci + 18,264 Illumina gene models
– Mt4 RC1 (2013, PAG 2013 conference): Illumina contigs anchored onto pseudochromosomes; 84,993 gene loci (BAC + Illumina). Chromosome sequences are frozen; some gene models might be removed.
– DFCI Gene Index Release 11: 294k ESTs/ETs; 68,814 Unigenes

10 Statistics on Mt3.5v4 vs. Probesets: Mapping Results using AffyProbeMapping
Num of cDNA   Matching probe_set   Percent
37,385        0                    59.92
18,354        1                    29.42
6,649         >=2                  10.66
62,388        Total                100

11 Statistics on Mt4RC1 vs. Probesets: Mapping Results using AffyProbeMapping
Num of cDNA   Matching probe_set   Percent
58,660        0                    69.02
20,257        1                    23.83
6,076         >=2                  7.15
84,993        Total                100

12 Statistics on GeneIndex R11 vs. Probesets: Mapping Results using AffyProbeMapping
Num of cDNA   Matching probe_set   Percent
29,722        0                    43.2
32,848        1                    47.7
6,244         >=2                  9.1
68,814        Total                100
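The percent columns on slides 10 to 12 follow directly from the counts, and the same arithmetic gives the share of sequences with at least one matching probe set. A minimal sketch; the counts are copied from the slides, while the function and variable names are hypothetical.

```python
def percent_breakdown(counts: dict[str, int], digits: int = 2) -> dict[str, float]:
    """Convert per-category counts into percentages of the total."""
    total = sum(counts.values())
    return {key: round(100 * value / total, digits) for key, value in counts.items()}


# Number of sequences by number of matching probe sets (slides 10 to 12).
datasets = {
    "Mt3.5v4": {"0": 37385, "1": 18354, ">=2": 6649},
    "Mt4RC1": {"0": 58660, "1": 20257, ">=2": 6076},
    "GeneIndex R11": {"0": 29722, "1": 32848, ">=2": 6244},
}

for name, counts in datasets.items():
    mapped = sum(counts.values()) - counts["0"]  # at least one probe set
    print(name, sum(counts.values()), percent_breakdown(counts), mapped)
```

Summing each row set also reproduces the totals on the slides (62,388, 84,993 and 68,814).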

13 Mapping between Medicago genome and AffyMedicago Chip: http://bioinfo3.noble.org/affymap/Dataset.gy

14 Bioinformatics Tools For Medicago: Sequence Search and Annotation
– DOBLAST (http://bioinfo3.noble.org/doblast/), a BLAST search tool accelerated by parallel computing
Features:
o Preloaded with many Medicago data resources
o Capable of handling big datasets
o The “Tab-delimited bioparser output format” works well with Excel

15 Bioinformatics Tools For Medicago: Sequence Download and Cut by Coordinates
– “Sequence Download” page of DOBLAST: batch download sequences or cut sequences by coordinates
o Preloaded with many Medicago data resources
o Batch download
o Get a fragment of a sequence by coordinates
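Cutting a sequence by coordinates can be illustrated with a short function. This is a sketch only and does not describe how DOBLAST works: it assumes 1-based, inclusive coordinates (as in GFF-style annotation) and an optional strand argument, and the function name is hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def cut_sequence(seq: str, start: int, end: int, strand: str = "+") -> str:
    """Return seq[start..end] using 1-based, inclusive coordinates.

    For strand "-", the fragment is reverse-complemented.
    """
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"coordinates {start}..{end} are outside 1..{len(seq)}")
    fragment = seq[start - 1:end]  # convert to Python's 0-based, half-open slice
    if strand == "-":
        fragment = fragment.translate(_COMPLEMENT)[::-1]
    return fragment


print(cut_sequence("ACGTTGCA", 2, 5))       # CGTT
print(cut_sequence("ACGTTGCA", 2, 5, "-"))  # AACG
```

Coordinate conventions differ between tools (0-based versus 1-based, half-open versus inclusive), so the convention used by a given download page should be checked before comparing results.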

16 DOBLAST sequence download page

17 Bioinformatics Tools For Medicago
LegumeIP: An Integrative Platform to Study Gene Function and Genome Evolution in Legumes.
Features:
– Synteny analysis among model legumes
– Phylogenetic analysis for gene family
– Gene to gene association analysis
– Gbrowser
o http://plantgrn.noble.org/LegumeIP/
o We are updating to Version 2

18 LegumeIP: Synteny analysis for Medicago genome

19 LegumeIP: Phylogenetic analysis for Medicago gene family

20 LegumeIP: Gene association network analysis for Medicago gene

