CSIR-National Botanical Research Institute


Presentation on theme: "CSIR-National Botanical Research Institute"— Presentation transcript:

1 CSIR-National Botanical Research Institute
Genomics of Gossypium spp. for Development of Genetic Markers and Discovery of Genes Related to Fiber and Drought Traits
Samir V. Sawant, Principal Scientist
CSIR-National Botanical Research Institute, Rana Pratap Marg, Lucknow

2 Synopsis of Presentation:
Large Scale Genomic Resource Development in Cotton. Genes Underlying Drought Tolerance & Fiber Quality Traits.

3 Large Scale Genomic Resource Development in Cotton through Sequencing of HMPR libraries

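4 Selection of Six Diverse Germplasms of G. hirsutum (based on AFLP genetic diversity)
Genotype    Species       Source of collection
JKC 703     G. hirsutum   JK Agri., Hyderabad, Andhra Pradesh
JKC 725     G. hirsutum   JK Agri., Hyderabad, Andhra Pradesh
JKC 737     G. hirsutum   JK Agri., Hyderabad, Andhra Pradesh
JKC 770     G. hirsutum   JK Agri., Hyderabad, Andhra Pradesh
LRA 5166    G. hirsutum   TNAU, Coimbatore, Tamil Nadu
MCU5        G. hirsutum   UASD, Dharwad, Karnataka
Jena et al. (2011) Crop & Pasture Science 62:859-75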

5 Genic-Enrichment by Methylation Sensitive Restriction Digestion
In-sensitive M BstBI HpaII ClaI HindIII EcoRV Uncut DNA Enriched DNA Digestion of genomic DNA with different enzymes for methylation pattern Rai et al. (2013) Plant Biotechnology J.

6 Reads Generated for Various Genic-enriched Cotton Genotypes
Germplasm   Enzyme   Total reads (millions)   Total bases (Mb)
JKC 703     HpaII    1.50                     429.1
JKC 703     ClaI     1.64                     474.6
JKC 725     HpaII    1.36                     372.9
JKC 725     ClaI     1.87                     533.5
JKC 737     HpaII    1.45                     407.7
JKC 737     ClaI     1.69                     542.9
JKC 770     HpaII    1.30                     376.1
JKC 770     ClaI     1.76                     481.1
MCU5        HpaII    1.15                     316.8
MCU5        ClaI     1.47                     416.6
LRA 5166    HpaII    (missing)                433.9
LRA 5166    ClaI     1.70                     513.4
Total                18.47                    5298.8

7 Genotype wise Comparison of Genic-enriched reads using gsMapper v2.5.3
Germplasms compared: JKC 703 against JKC 725, JKC 737, JKC 770, MCU5 and LRA 5166, reported as % of mapped reads and % of mapped bases.
Values as extracted (cell positions lost): 89 70 87 68 66 75 60 90 69 76 61 77 88 67 59 79 63

8 Enzyme wise comparison of Genic-enriched reads using gsMapper v2.5.3
Enzyme wise comparison of HMPR enriched reads
Germplasm   % reads mapped   % bases mapped
JKC 703     78.1             61.1
JKC 725     83.2             62.3
JKC 737     83.8             66.4
JKC 770     82.3             63.8
MCU5        80.5             64.4
LRA 5166    77.3             56.5

9 De novo Assembly using Newbler v2.5.3
Parameter                   JKC 703   JKC 725   JKC 737   JKC 770   MCU5     LRA 5166   Super assembly
All contigs (>100 bp)       58,142    61,862    54,731    53,419    27,952   63,002     533,271
Singletons (millions)       1.23      1.43      1.33      1.29      0.80     1.24       3.56
Total bases (Mb)            377.9     428.1     427.3     378.4     233.4    398.9      1272.6
Large contigs (>500 bp)     21,920    20,255    17,960    17,657    8,663    25,084     215,504
Largest contig size (kb)    29.7      30.9      30.4      31.0      36.0     29.6       24.3
Q40 plus bases              92.78     92.6      92.25     92.63     94.25    92.99      93.8
Avg. contig size (bases): 826, 808, 809, 815, 868, 900 (six of seven values survive; column positions uncertain)
N50 contig size (bases): 785, 787, 802, 771, 861, 894 (six of seven values survive; column positions uncertain)
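The N50 contig size listed for each assembly can be reproduced from a list of contig lengths. The following is a minimal Python sketch, assuming the common definition of N50: the contig length at which the cumulative length, taken from longest to shortest, first reaches half of the total assembled bases. Newbler's own computation is not shown in the slides and may differ in detail (for example, in which contigs are counted). The function name n50 and the toy lengths are illustrative and not taken from the presentation.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bases)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    # Walk from the longest contig down, accumulating bases.
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            # This contig is the one that crosses the halfway mark.
            return length


# Hypothetical toy contig lengths in bp (not data from the slides).
toy_lengths = [900, 800, 700, 400, 200]
print(n50(toy_lengths))
```

Because N50 depends on the total assembled length, it is only comparable between assemblies built with the same contig length cut-off.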

10 Gene Prediction and Annotations
Gene models predicted: AUGUSTUS 90294; GenScan 125422; GlimmerHMM 97533
Reciprocal BLAST: common gene models (present in any two or more prediction tools) 93363; full length genes 21399
Annotation (total and unique hits against NCBI nr, TAIR 10 and Cotton EST): only one value survives in the transcript, 38645 unique hits, listed under NCBI nr

11 Similarity of Predicted Gene Models with Other Plant Genomes
V. vinifera R. communis G. raimondii A. thaliana

12 qRT PCR Validation of 12 Randomly Selected Predicted Gene Models in G. hirsutum
Y-axis: fold expression in fiber and root compared with leaf tissues

13 Identification of Transcription Factor Encoding Genes

14 qRT PCR Validation of 12 Randomly Selected Predicted Transcription Factor Encoding Gene Models in G. hirsutum
Y-axis: fold expression in fiber and root compared with leaf tissues

15 Genome-wide SNP Discovery in G. hirsutum
SNP discovery using Newbler v2.5.3 Assembler (workflow diagram):
1. Assembly of individual germplasms (Newbler v2.5.3)
2. Pooling contigs from each germplasm
3. Assembly of the pooled contigs (Newbler v2.5.3) into super contigs
4. AutoSNP run on the super contigs
5. Output of AutoSNP filtered using a customized program to remove false SNPs
6. Detected SNP contigs

16 Strategy for SNP Discovery in G. hirsutum
G. hirsutum genotype-1 × G. hirsutum genotype-2
Assembly using gsAssembler v (40 bp overlap with 97% identity)
autoSNP v2.0 for contigs with a minimum of six reads (≥3 reads from each genotype)
Allelic SNP (for example G in one genotype, C in the other): taken
Non-allelic SNP: discarded
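The read-depth rule on this slide (contigs with a minimum of six reads, at least three from each genotype) and the allelic versus non-allelic distinction can be illustrated with a small Python sketch. It is not the autoSNP v2.0 algorithm or the authors' customized program. It assumes that an allelic SNP is a site where each genotype is internally uniform but the two genotypes differ, and that a non-allelic SNP is a site that varies within a genotype (for example between homeologous copies in the tetraploid). The function classify_site and its return labels are hypothetical.

```python
def classify_site(calls_g1, calls_g2, min_total=6, min_per_genotype=3):
    """Classify one aligned position from the base calls of two genotypes.

    calls_g1 / calls_g2: iterable of bases (e.g. "GGG") seen in the reads
    of genotype 1 and genotype 2 at this position.
    """
    depth_g1, depth_g2 = len(calls_g1), len(calls_g2)
    # Depth rule from the slide: >= 6 reads overall, >= 3 per genotype.
    if depth_g1 < min_per_genotype or depth_g2 < min_per_genotype:
        return "low_coverage"
    if depth_g1 + depth_g2 < min_total:
        return "low_coverage"

    alleles_g1, alleles_g2 = set(calls_g1), set(calls_g2)
    if len(alleles_g1) == 1 and len(alleles_g2) == 1:
        # Each genotype is uniform; keep the site only if they differ.
        return "allelic" if alleles_g1 != alleles_g2 else "monomorphic"
    # Assumption: variation inside a genotype is treated as non-allelic.
    return "non_allelic"


# Hypothetical read columns (not data from the slides).
print(classify_site("GGG", "CCC"))   # genotypes differ cleanly
print(classify_site("GGC", "GGC"))   # same mixture inside both genotypes
```

Only sites classified as allelic would be carried forward as candidate markers in this reading of the slide.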

17 Genome-wide SNP Discovery in G. hirsutum…
Figure labels: SNP summary and sequence alignment for cultivars JKC 703, LRA 5166, JKC 725 and JKC 770

18 Genome-wide SNP Discovery in G. hirsutum…
Details of SNP discovery
Total identified SNPs: 4,22,617
True SNPs called: 75,714
Non-redundant SNPs: 66,444
Novel SNPs: 66,364
Distribution of identified SNPs
UTRs: 4446
Intronic: 4518
Annotated exonic SNPs: synonymous 2604; non-synonymous 6506

19 Validation of Identified SNPs in G. hirsutum
SNPs used for validation: 30
Germplasms used: 6 (JKC 703, JKC 725, JKC 737, JKC 770, LRA 5166, MCU5)
SNPs detected: 30
Figure labels: alleles T and C

20 SSRs distribution on the basis of Motifs
SSRs identification and novelty comparison against Cotton Marker Database
Total SSRs identified: 1,48,930
SSRs successful in designing primers: 56,142
Novel SSRs developed: 47,093
SSR novelty analysis: unmatched whole sequence wise, unmatched flanking sequence wise, unmatched primer sequence wise
Chart: SSRs distribution on the basis of motifs (number of SSRs against unit size of different repeat types)
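The slides do not name the SSR mining tool or its thresholds. As a rough illustration of how perfect SSRs can be located and grouped by motif unit size, here is a Python sketch using a regular expression with a backreference. The function find_ssrs, the unit-size range (2 to 6 bp) and the minimum of four repeats are assumptions chosen for the example, not the presentation's settings.

```python
import re


def find_ssrs(sequence, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats."""
    # Group 1 is the shortest candidate motif; \1{n,} requires it to recur
    # at least (min_repeats - 1) more times immediately afterwards.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip single-base runs reported as a longer motif (e.g. "AA").
            continue
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeat_count))
    return hits


# Hypothetical toy sequence containing an (AG)6 repeat.
for start, motif, count in find_ssrs("GGTTAGAGAGAGAGAGCCTA"):
    print(start, motif, count, "unit size:", len(motif))
```

Grouping the hits by len(motif) gives the kind of per-unit-size tally that a motif distribution chart summarizes. Imperfect or compound repeats are outside the scope of this sketch.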

21 Validation of Identified SSRs in G. hirsutum
SSRs used for validation: 40
Germplasms used: 12
Polymorphic SSRs: 6
% polymorphism: 15
Figure labels: allele sizes 291/297/300 bp, 291/300 bp and 291/297 bp across JKC 703, JKC 770, JKC 725, JKC 737, LRA 5166 and MCU 5

22 Distribution of G. hirsutum SSRs and SNPs containing sequences on G. raimondii reference genome
Figure panels: G. raimondii (JGI) and G. raimondii (Chinese draft), each with SSRs and SNPs

23 miRNA Novel to Gossypium (on the basis of miRBase)
miRNAs in Gossypium (on the basis of miRBase)
Total miRNAs identified: 78
miRNA families identified: 42
miRNA novel to G. hirsutum: 17
Novel miRNAs: miR-1713, miR-2112, miR-2675, miR-3522, miR-3696, miR-165, miR-437, miR-477, miR-536, miR-950, miR-1070, miR-4343, miR-4371, miR-5023, miR-5065, miR-5555, miR-3963

24 Promoters and Cis Regulatory Elements
Promoters identified: 24839 (≥ 1000 bp: 826; ≥ 500 bp: 3135)
Fiber developmental stage specific promoters: initiation 184; elongation 28; secondary cell wall synthesis 110
Chart: size distribution of identified promoters (number of promoters against size in bases) for initiation (184), elongation (28) and secondary cell wall synthesis (110)

25 Genomic Resources Developed for G. hirsutum An Overview
Assembled sequences
Assembled bases: 1272 Mb
GC content: 37.76 %
Repetitive content: 12.16 %
Gene models: 93363
Full length genes: 21399
TFs: 1093
Promoters: 3135
Novel SNPs: 66364
Novel SSRs: 47093
Rai et al. (2013) Plant Biotechnology J.

26 Development & Characterization of gSSRs and eSSRs in Diploid Cotton (G. herbaceum)
Total SSRs from G. herbaceum: 263 gSSRs (repeat enriched genomic libraries) and 1970 eSSRs (drought transcriptome sequencing)
Figures: UPGMA tree of 15 genotypes of G. herbaceum based on Nei’s genetic distance using 200 SSRs; SSR “NBRI_gB010” among four species of cotton; cross-species transferability of G. herbaceum derived gSSRs and eSSRs
Jena et al. (2012) Theoretical Applied Genetics 124 (3):565-76

27 Development of molecular markers from Indian genotypes of two Gossypium L. species
G. hirsutum: JKC 703 (superior in fibre quality), JKC 777 (inferior in fibre quality)
G. herbaceum: Vagad (drought tolerant), RAHS-14 (drought sensitive), GujCot (drought tolerant), RAHSIPS-187 (drought sensitive)
Markers identified: 1440 SSRs, 2608 SNPs, 10,947 SNPs
Novel markers after comparison with public domain data: 334 novel SSRs, 1,847 novel SNPs, 10,947 novel SNPs
Diagram values as extracted (positions lost; labels: SSR sequence, 50 bp flanking, NBRI SNPs, public domain SNPs): 111 20 2 1,847 2 38,780 334 307 15 10,947 206 Primers
Srivastava et al., Journal Plant Breeding doi: /pbr (In Press)

28 Microarray Based Single Feature Polymorphisms (SFPs) in Gossypium hirsutum
Genotypes: JKC703, JKC737, JKC783, JKC725 and JKC770, grouped as superior or inferior fiber quality, with biological replicates 1, 2 and 3 for each
Workflow: RNA extraction and microarray hybridization; CEL files “703-1,2,3.CEL”, “725-1,2,3.CEL”, “737-1,2,3.CEL”, “770-1,2,3.CEL” and “783-1,2,3.CEL”; RMA background correction; further analysis for SFPs
In silico analysis of 37,473 SFPs in six crosses
Validation of SFPs in two germplasms (JKC 703 x JKC 770): table columns are number of selected SFPs, number of SNPs found and number of indels found (values missing from the transcript)
Srivastava et al. (2012) Communicated

29 SSRs/SNPs/SFPs development from Cotton at NBRI
NBRI COTTON MARKERS
A-genome derived SSRs (genomic & expressed): 2,233
AD-genome derived SSRs (genic enrichment): 56,142
AD-genome derived SSRs (transcriptome sequencing): 1440
Total SSRs: 59,805
A-genome derived SNPs (transcriptome sequencing): 592
AD-genome derived SNPs (genic enrichment): 66,444
AD-genome derived SNPs (transcriptome sequencing): 2,600
AD-genome derived SFPs (microarray based): 132
Total SNPs: 69,768
Total novel markers: 1,29,573

30 Axiom™ myDesign™ Array: Targeted genotyping, tailored for our study
COTTON SNP CHIP (Affymetrix’s Axiom® myDesign Cotton Array) (CSIR-NBRI)
Axiom myDesign TG Array Plates enable us to easily select relevant SNPs from our SNP database and to create panels of 500,000 markers per sample.
A streamlined assay: total genomic DNA (200 ng) is amplified and randomly fragmented into 25 to 125 base pair (bp) fragments. These fragments are purified, re-suspended, and hybridized to Axiom Genome-Wide and myDesign Array Plates. Following hybridization, the bound target is washed under stringent conditions to remove non-specific background and minimize noise caused by random ligation events. Each polymorphic nucleotide is queried via a multi-color ligation event carried out on the array surface. After ligation, the arrays are stained and imaged on the GeneTitan MC Instrument.
Targeting 50,000 SNPs for genotyping with mapping populations

31 Deployment of COTTON SNP CHIP on Mapping Populations
1. CICR, Nagpur: a. H X H RIL population (Fiber Traits); b. A X He RIL population (Mapping and Fiber Traits)
2. UAS, Dharwad: a. H X B RIL population (Fiber Traits); b. Core Collection (Association Mapping)
3. TNAU, Coimbatore: a. H X H RIL population (Fiber Traits); b. H X H RIL population (Sap sucking pests)

32 NBRI’s Cotton Database A Webpage for Cotton Resources

33 Drought Tolerance & Fiber Quality Traits
II. Genes Underlying Drought Tolerance & Fiber Quality Traits

34 Screening of Cotton Genotypes for Drought Tolerance and Sensitivity
Screening of G. herbaceum genotypes on different concentrations of mannitol (control, 2%, 4%, 6%, 8%)
Accessions screened: Vagad, Guj cot-21, RAHS-14, RAHS-IPS-187, H-17, AH-7GP, AH-127, AH-41, RAS-45, DB-3-12, RAHS-131, JYLEHAR, GH-18-2LC, RAHS-132
Table values as extracted (cell positions lost): Vagad 100 86 Guj cot-21 82 66 RAHS-14 76 12 RAHS-IPS-187 16 H-17 84 62 14 AH-7GP AH-127 22 4 AH-41 18 RAS-45 DB-3-12 64 30 RAHS-131 JYLEHAR 2 GH-18-2LC RAHS-132 34 10
Effect of drought on the tolerant genotype (Vagad) and the sensitive genotype (RAHS-14): continuous watering compared with 1 week alternate watering
Ranjan et al., BMC Genomics (2012) 13:94

35 Physiological Parameters in Response to Drought in Vagad and RAHS- 14
Properties of Vagad:
Reduced stomatal conductance (gs)
Decreased transpiration rate (E)
Reduced water potential (WP)
Higher relative water content (RWC)
Together these lead to better water use efficiency (WUE).
Vagad has an inherent ability to sense drought at a much earlier stage and to respond to it much more efficiently.
Ranjan A et al., BMC Genomics, March 2012

36 Differentially up regulated genes (Fold change ≥ 2)
Transcriptional profiling under drought and watered conditions in leaf tissue of Vagad and RAHS-14
Pyrosequencing data
Parameter                                                        Vagad library   RAHS-14 library
Total reads (overlap size of 100 bp and 96% identity)            85368           56354
Total contigs (100 bp or greater)                                11439           6313
Singletons                                                       24087           20780
Exemplar                                                         31244           23155
Average length of contigs                                        350 bp          180 bp
Number of contigs greater than 500 bp                            946             705
Number of genes with significant hit in NCBI NR database         10772           10408
Number of genes with significant hit in cotton EST database      16301           13822
Microarray data: differentially up regulated genes (fold change ≥ 2)
Vagad water: 656
RAHS-14 water: 535
Vagad drought: 430
RAHS-14 drought: 411
Ranjan A et al., BMC Genomics, March 2012
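The fold change ≥ 2 criterion behind the microarray counts can be expressed as a simple filter. The sketch below is a Python illustration under stated assumptions: expression values are already normalized, are on a linear (not log) scale, and are keyed by gene identifier. The function name upregulated and the toy values are hypothetical. A real analysis would normally add a statistical test, which the slide does not describe.

```python
import math


def upregulated(expr_test, expr_ref, min_fold=2.0):
    """Return {gene: log2 fold change} for genes with test/ref >= min_fold."""
    threshold = math.log2(min_fold)
    result = {}
    for gene, test_value in expr_test.items():
        ref_value = expr_ref.get(gene)
        # Skip genes missing from the reference or with non-positive values,
        # for which a ratio is undefined.
        if ref_value is None or ref_value <= 0 or test_value <= 0:
            continue
        log2_fc = math.log2(test_value / ref_value)
        if log2_fc >= threshold:
            result[gene] = log2_fc
    return result


# Hypothetical normalized intensities (not data from the slides).
drought = {"gene_a": 8.0, "gene_b": 3.0, "gene_c": 5.0}
watered = {"gene_a": 2.0, "gene_b": 2.0}
print(upregulated(drought, watered))
```

Working in log2 units keeps up and down regulation symmetric: a two-fold increase is +1 and a two-fold decrease is -1.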

37 Genome wide gene expression profiling of leaf tissue of Vagad and RAHS-14
Propanoid pathway Pigment biosynthesis Polyketide biosynthesis Responses to various abiotic stresses Secondary metabolite pathways RAHS-14 Ethylene responsive factor WRKY Programmed cell death Senescence Lipid metabolism Ranjan A et.al., BMC genomics 2012 March

38 Differentially up regulated genes (Fold change ≥ 2)
Comparative root transcriptome analysis of drought tolerant and sensitive genotypes of G. herbaceum
Figure: root architectures of the drought tolerant and drought sensitive genotypes
Pyrosequencing data
Library         Reads      Bases        Contigs   Singletons   Av. contig length   Av. S. length
GujCot-21       55,620     13,020,140   1,281     30,501       481.7 bp            237.8 bp
RAHS-IPS 187    49,308     11,199,207   858       30,776       532.9 bp            228.6 bp
Supercontigs    1,04,928   24,219,347   2,664     50,531       508.7 bp            231.7 bp
Microarray data: differentially up regulated genes (fold change ≥ 2)
Vagad water: 165
RAHS-14 water: 156
Vagad drought: 256
RAHS-14 drought: 538
Ranjan A et al., BMC Genomics, November 2012

39 Functional enrichment of genes of root tissue in drought tolerant and sensitive genotypes
Figure panels: tolerant genotype; sensitive genotype; regulation of transcription factors (TFs) under drought stress
Ranjan A et al., BMC Genomics, November 2012

40 Differentially expressed genes analyzed by Genevestigator in mapping the specific expression of genes in different root zones
Ranjan et al., BMC Genomics (2012) 13:680

41 Selection of Candidate Gene for Studying the Abiotic Response
Identification of Transcription Activator (TA) from cotton transcriptome of root tissue
Expression of TA: number of transcripts (tpm) per library TL, TR, SL and SR (TL: tolerant leaf, TR: tolerant root, SL: sensitive leaf, SR: sensitive root). Only one value, 72, survives in the transcript, and its library cannot be identified.

42 Increased tolerance to drought and salt stress and better root development in tobacco over expressing GheTA
Figure panels compare WT and GheTA plants on control, 50 mM, 150 mM, 200 mM and 250 mM mannitol, and on control, 50 mM, 75 mM, 100 mM and 150 mM NaCl

43 Abiotic stress tolerance of the GheTA over-expressing tobacco transgenic plants by leaf disk assay
Treatments (WT compared with GheTA): control, 5% PEG, 10% PEG, 100 mM NaCl, 150 mM NaCl, 250 mM mannitol, 500 mM mannitol

44 Over-Expression of GheTA leads to increased root biomass and better WUE in cotton transgenic plants
Figure panels: wild type and transgenic cotton
Carbon isotope discrimination ratio shows higher water use efficiency (WUE) of cotton TA transgenic plants

45 Expressional Reprogramming During Fiber Development In Contrasting Genotypes of G. hirsutum
Fiber quality of genotypes
Genotypes compared (superior and inferior groups): JKC 725, JKC 777, JKC 703, JKC 737, JKC 783
Fiber quality parameters: 2.5% span length (mm), fiber strength (g/tex), fineness (micronaire), fiber lignin content, fiber cellulose content (values missing from the transcript)
Nigam et al. (2013) Communicated

46 Microarray data analysis
Method used for analyzing microarray data from contrasting cotton genotypes: two way ANOVA study

47 Singular Enrichment Analysis (SEA)
Genotype significant genes DPA significant genes Interaction significant genes

48 MapMan Bins Cluster Analysis
Figure: panels A and B show Cluster-4, Cluster-5 and Cluster-6 for superior and inferior genotypes across 0, 6, 9, 12, 19 and 25 DPA

49 Enrichment of Transcription factors

50 25 DPA Fiber Transcriptome: Assembly and Annotations
de novo and merged assembly
Parameter                              JKC 703   JKC 777   Merged assembly of both genotypes
Total reads generated                  547939    488128    (missing)
Total bases generated (Mb)             168.4     160.9     329.3
Average read size (bp)                 307       330       318
High quality reads used in assembly    529496    473666    972907
All contigs (>100 bp)                  17900     16457     21308
Singletons                             53983     45023     81120
Total bases after assembly (Mb)        24.7      22.3      37.2
Large contigs (>500 bp)                8752      8153      12947
Largest contig size (bp)               3560      4991      5008
Average contig size (bp)               860       823       936
N50 contig size (bp)                   893       838       987
Aligned reads (%)                      86.8      88.0      86.9
Aligned bases (%)                      86.4      87.4      85.6
Q40 plus bases (%)                     94.0      94.1      95.5
Inferred read error (%): 2.0 and 1.8 (two of three values survive; column positions uncertain)

Annotation of unigenes
Parameter                     JKC 703   JKC 777   Merged assembly of both genotypes
Total unigenes                71883     61480     102428
Hits in 'NCBI nr' database    40134     36844     56390
Hits in 'tair9' database      33,359    37,017    51120
Hits in ESTScan               44592     40570     20,852
Differential unigenes         2156      2076      -

51 Correlation between Cotton Fiber Microarray and Transcriptome Sequencing
R² = 0.68, p-value = 0.001

52 Hypothetical regulatory model showing over-represented genes and pathways in Superior and Inferior genotypes
Figure: the model contrasts the SUPERIOR GENOTYPE with the INFERIOR GENOTYPE across fiber development (stage labels: Initiation, 6, 9, 12, 19, 25, Elongation, SCW).
Process annotations in the figure: increased stress tolerance to facilitate the elongation process up to its completion; fibre cells continuative elongation; increased stress environment within fibre cell and complete end of cell elongation process; continued brassinosteroid signalling; brassinosteroid targeted gene expression; efficient energy source for fast elongating fiber (oxidative phosphorylation, e.g. TCA, glycolysis); continued energy providing machinery; energy exhaust; transporter machinery; cell wall enzymes; pectin modification; cell wall loosening; SCW formation starts; mRNA and protein degradation (ubiquitin ligase, proteasome, spliceosome); large number of ribosomal subunits (better protein synthesizing machinery); decreased high elongation rate of fibre cells during elongation period; end of cell elongation; cell death; calcium signalling; induction of stress hormone signals like ethylene and ABA; lipid peroxidation; oxidative stress.
Gene, hormone and pathway labels in the figure: SA, ABA, BR, auxin, GA, JA, H2O2, ROS, POX, ascorbate peroxidase, glutathione-S-transferase, transcription factors, C2C2-GATA, NAC, WRKY, ABI3/VP1, SPL3, SPL5, AP2-EREBP, HSF, LIM domain, DET2, BES1, BIN2, ABC transporter, phospholipases, Pat5, Pat6, AGPs, flavonoid biosynthesis, oxylipins, CSEA, Asp family, PG, PAE, PME, PMEI, ALDH, S3, MEE 59, PRF5, EXL5, APY2, HSPs, Ca++/CAM1.

53 Positions of mapped differentially expressed genes on QTL
Columns: QTL location; QTL size (Mb); chr size (Mb); % of chr covered by QTL; total genes in the QTL region; completely mapped genes in QTL region (our data); total mapped genes on chromosome (our data). QTL start, QTL end and sizes in bp are missing from the transcript; "-" marks a cell that did not survive extraction.
Location   QTL (Mb)   Chr (Mb)   % covered   Genes in QTL   Mapped in QTL   Mapped on chr
Chr 1      27.91      55.86      49.9        2757           50              138
Chr 2      19.31      62.76      30.7        2148           49              130
Chr 3.1    32.24      45.76      70.4        2986           29              92
Chr 3.2    3.707      45.76      8.1         884            18              -
Chr 4      8.37       62.17      13.47       1008           32              133
Chr 5      58.37      64.14      91.0        11474          77              109
Chr 6      42.01      51.07      82.2        1334           45              114
Chr 7.1    2.11       60.98      3.4         295            13              172
Chr 7.2    43.42      60.98      71.2        4441           68              -
Chr 08     18.69      57.12      32.7        4592           79              147
Chr 09     24.78      70.71      35.0        3146           -               245
Chr 11     0.64       62.68      1.0         95             2               140
Total                                                       600             1684
Percentage of genes mapped in QTL: 600/1684 = ~35%
Figure: Circos plot of all the differential genes mapped on the cotton JGI genome with all the QTL.
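The percentages on this slide are simple ratios: "% covered by QTL in a chr" is QTL size (MB) divided by chr size (MB), and the final figure is 600 mapped genes in QTL regions out of 1684 mapped genes. The Python sketch below recomputes a few of them. It reads the first three numbers of each row as QTL size (Mb), chromosome size (Mb) and % covered, which is an interpretation of the flattened table rather than something the transcript states explicitly; the helper name percent is hypothetical.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# (QTL size in Mb, chromosome size in Mb) as read from three of the rows;
# the column assignment is an assumption about the flattened table.
qtl_and_chr_mb = {
    "Chr 1": (27.91, 55.86),
    "Chr 2": (19.31, 62.76),
    "Chr 5": (58.37, 64.14),
}

for location, (qtl_mb, chr_mb) in qtl_and_chr_mb.items():
    print(f"{location}: {percent(qtl_mb, chr_mb):.2f}% of chromosome covered by QTL")

# Share of mapped differential genes that fall inside QTL regions.
print(f"Genes mapped in QTL: {percent(600, 1684):.2f}%")
```

The recomputed ratios agree with the slide's 49.9, 30.7 and 91.0 to within rounding or truncation, and 600/1684 comes to about 35.6%, which the slide gives as ~35%.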

54 microRNA156 Targeted Transcription Factor (TA5) Governs the Boll Number, Size and Lint Yield in G. hirsutum
Heat map of 9 TA5 transcription factors
q-RT PCR measurement: Ghi S1_s_at (GhSPL5), Gra A1_at (GhSPL13), Gra A1_at (GhSPL14)

55 GhTA5 produce transcripts that are targeted by miR156
Northern blot of miR156 & miR172 in cotton fiber

56 Phenotypic evaluation of overexpression and knockdown lines
Figure panels: overexpression lines, wild type and knockdown lines (plants numbered 1 to 6)
Traits compared between overexpression line, knockdown line and wild type: number of cotton bolls (average per plant), seed cotton weight (g/plant), lint cotton weight (g/plant)

57 Identification and characterization of fiber specific promoter in G. hirsutum (NBRI_2800)
Figure: expression pattern of NBRI 2800 gene in different fiber developmental stages (fold change against fiber developmental stage)

58 Histochemical localization of GUS expression in NBRI_2800 transgenic cotton plant

59 Acknowledgments
HMPR Sequencing & Epigenetic regulation: Sunil K. Singh, Krishan M. Rai, Verandra Kumar
Functional Genomics: Neha Pandey, Rajiv Tripathi, Vrijesh Yadav, Anshulika Sable
Bioinformatics: Dr. Mehar Asif, Dr. Sumit K. Bag, Dipti Nigam, Archana Bhardwaj, Ridhi Goel, Pooman Pant
Cotton Marker: Dr. S N Jena, Anukool Srivastava, Ravi P. Shukla
Collaborators: J K Agrigenetics; Tierra Seed Science; TNAU, Coimbatore; CICR, Nagpur; UAS, Dharwad

60 Thank You

