CSIR-National Botanical Research Institute

Presentation transcript:

CSIR-National Botanical Research Institute. Genomics of Gossypium spp. for Development of Genetic Markers and Discovery of Genes Related to Fiber and Drought Traits. Samir V. Sawant, Principal Scientist, CSIR-National Botanical Research Institute, Rana Pratap Marg, Lucknow-226001

Synopsis of Presentation: I. Large Scale Genomic Resource Development in Cotton. II. Genes Underlying Drought Tolerance & Fiber Quality Traits.

Large Scale Genomic Resource Development in Cotton through Sequencing of HMPR libraries

Selection of Six Diverse Germplasms of G. hirsutum (based on AFLP genetic diversity)
Genotypes (species): source of collection
JKC 703, JKC 725, JKC 737, JKC 770 (G. hirsutum): JK Agri., Hyderabad, Andhra Pradesh
LRA 5166 (G. hirsutum): TNAU, Coimbatore, Tamilnadu
MCU5 (G. hirsutum): UASD, Dharwad, Karnataka
Jena et al. (2011) Crop & Pasture Science 62:859-75

Genic-Enrichment by Methylation Sensitive Restriction Digestion
Figure: digestion of genomic DNA with different enzymes for methylation pattern (gel lanes: M, BstBI, HpaII, ClaI, HindIII, EcoRV; methylation-sensitive and insensitive enzymes; uncut DNA and enriched DNA)
Rai et al. (2013) Plant Biotechnology J.

Reads Generated for Various Genic-enriched Cotton Genotypes
Germplasm, enzyme used: total reads (in millions); total bases (in Mb)
JKC 703, HpaII: 1.50; 429.1
JKC 703, ClaI: 1.64; 474.6
JKC 725, HpaII: 1.36; 372.9
JKC 725, ClaI: 1.87; 533.5
JKC 737, HpaII: 1.45; 407.7
JKC 737, ClaI: 1.69; 542.9
JKC 770, HpaII: 1.30; 376.1
JKC 770, ClaI: 1.76; 481.1
MCU5, HpaII: 1.15; 316.8
MCU5, ClaI: 1.47; 416.6
LRA 5166, HpaII: reads not given; 433.9
LRA 5166, ClaI: 1.70; 513.4
Total: 18.47; 5298.8

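Genotype-wise Comparison of Genic-enriched Reads using gsMapper v2.5.3
Pairwise table of % mapped reads and % mapped bases among the germplasms JKC 703, JKC 725, JKC 737, JKC 770, MCU5 and LRA 5166. Cell values as extracted (row and column positions not recoverable): 89, 70, 87, 68, 66, 75, 60, 90, 69, 76, 61, 77, 88, 67, 59, 79, 63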

Enzyme-wise Comparison of Genic-enriched (HMPR enriched) Reads using gsMapper v2.5.3
Germplasm: % reads mapped; % bases mapped
JKC 703: 78.1; 61.1
JKC 725: 83.2; 62.3
JKC 737: 83.8; 66.4
JKC 770: 82.3; 63.8
MCU5: 80.5; 64.4
LRA 5166: 77.3; 56.5

De novo Assembly using Newbler v2.5.3 Assembler
Columns (cotton genotypes): JKC 703; JKC 725; JKC 737; JKC 770; MCU5; LRA 5166; Super assembly
All contigs (>100 bp): 58,142; 61,862; 54,731; 53,419; 27,952; 63,002; 533,271
Singletons (millions): 1.23; 1.43; 1.33; 1.29; 0.80; 1.24; 3.56
Total bases (Mb): 377.9; 428.1; 427.3; 378.4; 233.4; 398.9; 1272.6
Large contigs (>500 bp): 21,920; 20,255; 17,960; 17,657; 8,663; 25,084; 215,504
Largest contig size (Kb): 29.7; 30.9; 30.4; 31.0; 36.0; 29.6; 24.3
Avg. contig size (bases), only six values in the transcript, column alignment uncertain: 826; 808; 809; 815; 868; 900
N50 contig size (bases), only six values in the transcript, column alignment uncertain: 785; 787; 802; 771; 861; 894
Q40 plus bases (%): 92.78; 92.6; 92.25; 92.63; 94.25; 92.99; 93.8
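The N50 contig size in the assembly table is the contig length at which the cumulative length of contigs, sorted longest first, reaches half of the total assembled bases. A minimal sketch of that calculation follows; the function name and the example lengths are hypothetical and are not taken from the cotton assemblies.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bases)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(contig_lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths, for illustration only.
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))
```

Because N50 weights contigs by length, it is less sensitive to many very short contigs than the average contig size is.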

Gene Prediction and Annotations
Gene prediction: AUGUSTUS 90294; GenScan 125422; GlimmerHMM 97533
Reciprocal BLAST: common gene models (present in any two or more prediction tools) 93363; full length genes 21399
Annotation: NCBI nr, total hits 63950, unique 38645; TAIR 10, total hits 52838, unique 16956; Cotton EST, total hits 45054, unique 19513

Similarity of Predicted Gene Models with Other Plant Genomes: V. vinifera, R. communis, G. raimondii, A. thaliana

qRT PCR Validation of 12 Randomly Selected Predicted Gene Models in G. hirsutum. Y-axis: Fold Expression in Fiber and Root as compared to Leaf tissues

Identification of Transcription Factor Encoding Genes

qRT PCR Validation of 12 Randomly Selected Predicted Transcription Factor Encoding Gene Models in G. hirsutum. Y-axis: Fold Expression in Fiber and Root as compared to Leaf tissues

Genome-wide SNP Discovery in G. hirsutum (SNP discovery using Newbler v2.5.3 Assembler)
1. Assembly of individual germplasms (Newbler v2.5.3)
2. Pooling contigs from each germplasm
3. Assembly (Newbler v2.5.3) into super contigs
4. AutoSNP on the super contigs
5. False SNPs filtered out of the AutoSNP output using a customized program
6. Detected SNP contigs

Strategy for SNP Discovery in G. hirsutum
Reads of G. hirsutum genotype-1 and genotype-2 are assembled using gsAssembler v2.5.3 (40 bp overlap with 97% identity). autoSNP v2.0 is run on contigs with a minimum of six reads (≥3 reads from each genotype). Allelic SNPs (for example G in one genotype, C in the other) are taken; non-allelic SNPs are discarded.
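The read-support rule on this slide (at least six reads, at least three from each genotype) can be sketched as a small filter. The slide does not define "allelic" in detail, so the rule below is an assumption: a site counts as allelic only when each genotype's reads agree on one base and the two genotypes differ. The function and label names are hypothetical, and this is not the autoSNP implementation.

```python
from collections import Counter

MIN_TOTAL_READS = 6         # from the slide: contigs with minimum six reads
MIN_READS_PER_GENOTYPE = 3  # from the slide: >= 3 reads from each genotype


def classify_site(bases_by_genotype):
    """Classify one aligned position.

    bases_by_genotype maps a genotype name to the list of base calls from
    its reads at this position. Exactly two genotypes are expected.
    """
    if len(bases_by_genotype) != 2:
        raise ValueError("expected base calls for exactly two genotypes")
    depths = [len(calls) for calls in bases_by_genotype.values()]
    if sum(depths) < MIN_TOTAL_READS or min(depths) < MIN_READS_PER_GENOTYPE:
        return "low-coverage"
    consensus = []
    for calls in bases_by_genotype.values():
        counts = Counter(calls)
        if len(counts) != 1:
            # Assumed rule: mixed bases within one genotype are non-allelic.
            return "non-allelic"
        consensus.append(next(iter(counts)))
    if consensus[0] == consensus[1]:
        return "no-snp"
    return "allelic"


print(classify_site({"genotype-1": list("GGG"), "genotype-2": list("CCC")}))
```

Requiring reads from both genotypes at every site is what lets mixed calls within a single genotype be told apart from a true difference between genotypes.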

Genome-wide SNP Discovery in G. hirsutum (continued)
Figure: SNP summary and sequence alignment for cultivars JKC 703, LRA 5166, JKC 725 and JKC 770

Genome-wide SNP Discovery in G. hirsutum (continued): Details of SNP Discovery
Total identified SNPs: 4,22,617
True SNPs called: 75,714
Non-redundant SNPs: 66,444
Novel SNPs: 66,364
Distribution of identified SNPs: UTRs, 4446; intronic, 4518; annotated exonic SNPs: synonymous 2604, non-synonymous 6506

Validation of Identified SNPs in G. hirsutum
Figure: T/C alleles across JKC 703, JKC 725, JKC 770, LRA 5166, JKC 737 and MCU5
SNPs used for validation: 30; germplasms used: 6; SNPs detected: 30

SSRs Identification and Novelty Comparison against Cotton Marker Database
Total SSRs identified: 1,48,930
SSRs successful in designing primers: 56,142
Novel SSRs developed: 47,093
SSR novelty analysis: unmatched whole sequence wise; unmatched primer sequence wise; unmatched flanking sequence wise
Figure: SSRs distribution on the basis of motifs (number of SSRs against unit size of different repeat types)
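SSRs are short motifs repeated in tandem, so a simple way to illustrate their identification is a regular expression with a back-reference. The slide does not name the SSR detection tool or its thresholds; the unit sizes (2 to 6 bases) and the minimum of four repeats below are assumptions chosen for illustration, and the function name is hypothetical.

```python
import re


def find_ssrs(sequence, min_unit=2, max_unit=6, min_repeats=4):
    """Find perfect tandem repeats in a DNA sequence.

    Returns a list of (start, motif, repeat_count) tuples. Homopolymer
    motifs such as "AA" are skipped, since they are mononucleotide runs
    rather than repeats of the stated unit size.
    """
    # ([ACGT]{a,b}?) captures the shortest candidate motif; \1{n,} requires
    # at least n further consecutive copies of that same motif.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical sequence, for illustration only.
print(find_ssrs("GGGAGAGAGAGATTT"))
```

Dedicated SSR tools also handle imperfect and compound repeats, which this sketch ignores.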

Validation of Identified SSRs in G. hirsutum
Figure: alleles of 291/297/300 bp, 291/300 bp and 291/297 bp across JKC 703, JKC 770, JKC 725, JKC 737, LRA 5166 and MCU 5
SSRs used for validation: 40; germplasms used: 12; polymorphic SSRs: 6; % polymorphism: 15

Distribution of G. hirsutum SSRs and SNPs containing sequences on G. raimondii reference genome
Figure panels: G. raimondii (JGI) and G. raimondii (Chinese draft); tracks: SSRs, SNPs

miRNAs in Gossypium (on the basis of miRBase)
Total miRNAs identified: 78
miRNA families identified: 42
miRNAs novel to G. hirsutum: 17
Novel miRNAs: miR-1713, miR-2112, miR-2675, miR-3522, miR-3696, miR-165, miR-437, miR-477, miR-536, miR-950, miR-1070, miR-4343, miR-4371, miR-5023, miR-5065, miR-5555, miR-3963

Promoters and Cis Regulatory Elements
Promoters identified: 24839 (≥ 1000 bp: 826; ≥ 500 bp: 3135)
Fiber developmental stage specific promoters: initiation 184; elongation 28; secondary cell wall synthesis 110
Figure: size distribution of identified promoters (number of promoters against size in bases) for initiation (184), elongation (28) and secondary cell wall synthesis (110)

Genomic Resources Developed for G. hirsutum: An Overview
Assembled sequences: 4095128
Assembled bases: 1272 Mb
GC content: 37.76 %
Repetitive content: 12.16 %
Gene models: 93363
Full length genes: 21399
TFs: 1093
Promoters: 3135
Novel SNPs: 66364
Novel SSRs: 47093
Rai et al. (2013) Plant Biotechnology J.

Development & Characterization of gSSRs and eSSRs in Diploid Cotton (G. herbaceum)
Total SSRs from G. herbaceum: 263 gSSRs (repeat enriched genomic libraries) and 1970 eSSRs (drought transcriptome sequencing)
Figures: UPGMA tree of 15 genotypes of G. herbaceum based on Nei’s genetic distance using 200 SSRs; SSR “NBRI_gB010” among four species of cotton; cross-species transferability of G. herbaceum derived gSSRs and eSSRs
Jena et al. (2012) Theoretical and Applied Genetics 124 (3):565-76

Development of molecular markers from Indian genotypes of two Gossypium L. species
G. hirsutum: JKC 703 (superior in fibre quality), JKC 777 (inferior in fibre quality)
G. herbaceum: Vagad (drought tolerant), RAHS-14 (drought sensitive), GujCot (drought tolerant), RAHSIPS-187 (drought sensitive)
Markers identified: 1440 SSRs; 2608 SNPs; 10,947 SNPs
Comparison of NBRI markers with public domain markers (SSR sequence, 50 bp flanking, primers; NBRI SNPs against public domain SNPs). Diagram values as extracted: 111, 20, 2, 1,847, 2, 38,780, 334, 307, 15, 10,947, 206
Result: 334 novel SSRs; 1,847 novel SNPs; 10,947 novel SNPs
Srivastava et al., Journal Plant Breeding doi:10.1111/pbr.12087 (In Press)

Microarray Based Single Feature Polymorphisms (SFPs) in Gossypium hirsutum
Workflow: genotypes of superior and inferior fiber quality (JKC703, JKC737, JKC783, JKC725, JKC770), biological replicates 1, 2, 3 each; RNA extraction and microarray hybridization; CEL files (“770-1,2,3.CEL”, “703-1,2,3.CEL”, “725-1,2,3.CEL”, “737-1,2,3.CEL”, “783-1,2,3.CEL”); RMA background correction; further analysis for SFPs
In silico analysis of 37,473 SFPs in six crosses
Validation of SFPs in two germplasms (JKC 703 x JKC 770): selected SFPs 224; SNPs found 122; indels found 10
Srivastava et al. (2012) Communicated

SSRs/SNPs/SFPs Development from Cotton at NBRI (NBRI Cotton Markers)
A-genome derived SSRs (genomic & expressed): 2,233
AD-genome derived SSRs (genic enrichment): 56,142
AD-genome derived SSRs (transcriptome sequencing): 1440
Total SSRs: 59,805
A-genome derived SNPs (transcriptome sequencing): 592
AD-genome derived SNPs (genic enrichment): 66,444
AD-genome derived SNPs (transcriptome sequencing): 2,600
Total SNPs: 69,768
AD-genome derived SFPs (microarray based): 132
Total novel markers: 1,29,573

COTTON SNP CHIP (Affymetrix’s Axiom® myDesign Cotton Array) (CSIR-NBRI)
Axiom™ myDesign™ Array: targeted genotyping, tailored for our study
Axiom myDesign TG Array Plates enable us to easily select relevant SNPs from our SNP database and to create panels of 500,000 markers per sample.
A streamlined assay: Total genomic DNA (200 ng) is amplified and randomly fragmented into 25 to 125 base pair (bp) fragments. These fragments are purified, re-suspended, and hybridized to Axiom Genome-Wide and myDesign Array Plates. Following hybridization, the bound target is washed under stringent conditions to remove non-specific background and so minimize noise caused by random ligation events. Each polymorphic nucleotide is queried via a multi-color ligation event carried out on the array surface. After ligation, the arrays are stained and imaged on the GeneTitan MC Instrument.
Targeting 50,000 SNPs for genotyping with mapping populations

Deployment of COTTON SNP CHIP on Mapping Populations
1. CICR, Nagpur: a. H X H RIL population (Fiber Traits); b. A X He RIL population (Mapping and Fiber Traits)
2. UAS, Dharwad: a. H X B RIL population (Fiber Traits); b. Core Collection (Association Mapping)
3. TNAU, Coimbatore: a. H X H RIL population (Fiber Traits); b. H X H RIL population (Sap sucking pests)

NBRI’s Cotton Database: A Webpage for Cotton Resources

II. Genes Underlying Drought Tolerance & Fiber Quality Traits

Screening of Cotton Genotypes for Drought Tolerance and Sensitivity
Screening of G. herbaceum genotypes on different concentrations of mannitol
Table columns: accession; mannitol percentage (Control, 2%, 4%, 6%, 8%). Values as extracted (cells incomplete): Vagad 100, 86; Guj cot-21 82, 66; RAHS-14 76, 12; RAHS-IPS-187 16; H-17 84, 62, 14; AH-7GP; AH-127 22, 4; AH-41 18; RAS-45; DB-3-12 64, 30; RAHS-131; JYLEHAR 2; GH-18-2LC; RAHS-132 34, 10
Figure: effect of drought on the tolerant genotype (Vagad) and the sensitive genotype (RAHS-14) under continuous watering and 1 week alternate watering
Ranjan et al., BMC Genomics (2012) 13:94

Physiological Parameters in Response to Drought in Vagad and RAHS-14
Properties of Vagad: reduced stomatal conductance (gs); decreased transpiration rate (E); reduced water potential (WP); higher relative water content (RWC); leading to better water use efficiency (WUE).
Vagad has an inherent ability to sense drought at a much earlier stage and to respond to it much more efficiently.
Ranjan A et al., BMC Genomics 2012 March

Transcriptional profiling under drought and watered conditions in leaf tissue of Vagad and RAHS-14
Pyrosequencing data (parameter: Vagad library; RAHS-14 library)
Total reads (overlap size of 100 bp and 96% identity): 85368; 56354
Total contigs (100 bp or greater): 11439; 6313
Singletons: 24087; 20780
Exemplar: 31244; 23155
Average length of contigs: 350 bp; 180 bp
Number of contigs greater than 500 bp: 946; 705
Number of genes with significant hit in NCBI NR database: 10772; 10408
Number of genes with significant hit in cotton EST database: 16301; 13822
Microarray data, differentially up regulated genes (fold change ≥ 2): Vagad water 656; RAHS-14 water 535; Vagad drought 430; RAHS-14 drought 411
Ranjan A et al., BMC Genomics 2012 March
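The "fold change ≥ 2" criterion used for the microarray counts can be illustrated with a small filter. This is a sketch under stated assumptions: it takes linear-scale expression values keyed by gene, compares one condition against another, and applies only the fold threshold. The normalization and statistical testing used in the original analysis are not described on the slide, and the gene names and values below are hypothetical.

```python
def upregulated(expr_a, expr_b, min_fold=2.0):
    """Return {gene: fold} for genes whose expression in condition A is at
    least min_fold times their expression in condition B.

    Both inputs map gene names to linear-scale (not log) expression values.
    Genes missing from B, or with a non-positive value in B, are skipped.
    """
    hits = {}
    for gene, value_a in expr_a.items():
        value_b = expr_b.get(gene)
        if value_b is None or value_b <= 0:
            continue
        fold = value_a / value_b
        if fold >= min_fold:
            hits[gene] = fold
    return hits


# Hypothetical expression values, for illustration only.
drought = {"geneA": 8.0, "geneB": 3.0, "geneC": 5.0}
watered = {"geneA": 2.0, "geneB": 2.0, "geneC": 0.0}
print(upregulated(drought, watered))
```

With log2-transformed intensities, the same threshold becomes a difference of at least 1 rather than a ratio of at least 2.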

Genome-wide gene expression profiling of leaf tissue of Vagad and RAHS-14
First group of categories (presumably Vagad; the label is absent from the transcript): phenylpropanoid pathway; pigment biosynthesis; polyketide biosynthesis; responses to various abiotic stresses; secondary metabolite pathways
RAHS-14: ethylene responsive factor; WRKY; programmed cell death; senescence; lipid metabolism
Ranjan A et al., BMC Genomics 2012 March

Comparative Root Transcriptome Analysis of Drought Tolerant and Sensitive Genotypes of G. herbaceum
Figure: root architectures of the drought tolerant and drought sensitive genotypes
Pyrosequencing data (library: reads; bases; contigs; singletons; average contig length; average singleton length)
GujCot-21: 55,620; 13,020,140; 1,281; 30,501; 481.7 bp; 237.8 bp
RAHS-IPS 187: 49,308; 11,199,207; 858; 30,776; 532.9 bp; 228.6 bp
Supercontigs: 1,04,928; 24,219,347; 2,664; 50,531; 508.7 bp; 231.7 bp
Microarray data, differentially up regulated genes (fold change ≥ 2): Vagad water 165; RAHS-14 water 156; Vagad drought 256; RAHS-14 drought 538
Ranjan A et al., BMC Genomics 2012 November

Functional enrichment of genes of root tissue in drought tolerant and sensitive genotypes
Figure panels: tolerant genotype; sensitive genotype; regulation of transcription factors (TFs) under drought stress
Ranjan A et al., BMC Genomics 2012 November

Differentially expressed genes analyzed by Genevestigator to map the specific expression of genes in different root zones
Ranjan et al., BMC Genomics (2012) 13:680

Selection of Candidate Gene for Studying the Abiotic Response
Identification of Transcription Activator (TA) from cotton transcriptome of root tissue
Expression of TA, number of transcripts (tpm) by library (TL, TR, SL, SR): 72 (the only value present in the transcript)
(TL: tolerant leaf; TR: tolerant root; SL: sensitive leaf; SR: sensitive root)

Increased tolerance to drought and salt stress and better root development in tobacco overexpressing GheTA
Figure panels (WT and GheTA): control and 50, 150, 200, 250 mM mannitol; control and 50, 75, 100, 150 mM NaCl

Abiotic stress tolerance of GheTA-overexpressing transgenic tobacco plants by leaf disk assay
Figure panels (WT and GheTA): control; 5% and 10% PEG; 100 and 150 mM NaCl; 250 and 500 mM mannitol

Over-expression of GheTA leads to increased root biomass and better WUE in cotton transgenic plants
Figure: wild type and transgenic cotton. Carbon isotope discrimination ratio shows higher water use efficiency (WUE) of cotton TA transgenic plants

Expressional Reprogramming During Fiber Development in Contrasting Genotypes of G. hirsutum
Fiber quality of genotypes (superior and inferior genotypes; columns in source order: JKC 725, JKC 777, JKC 703, JKC 737, JKC 783). Values as extracted; fewer values than columns, so alignment is not recoverable.
2.5% span length (mm): 30.5-32.5; 21.5-23.5; 25.5-26.5; 23.5-25.5
Fiber strength (g/tex): 24.5-26.0; 21.5-22.5; 22.5-23.5
Fineness (micronaire): 3.7-4.0; 4.0-4.3; 3.2-3.5; 4.0-4.2
Fiber lignin content and fiber cellulose content: no values in the transcript
Nigam et al. (2013) Communicated

Method used for Analyzing Microarray Data from Contrasting Cotton Genotypes
Microarray data analysis: two-way ANOVA study

Singular Enrichment Analysis (SEA): genotype significant genes; DPA significant genes; interaction significant genes

MapMan Bins; Cluster Analysis
Figure panels A and B: superior and inferior genotypes at 0, 6, 9, 12, 19 and 25 DPA; Cluster-4, Cluster-5, Cluster-6

Enrichment of Transcription factors

25 DPA Fiber Transcriptome: Assembly and Annotations
De novo and merged assembly (parameter: JKC 703; JKC 777; merged assembly of both genotypes)
Total reads generated: 547939; 488128; 1036067
Total bases generated (Mb): 168.4; 160.9; 329.3
Average read size (bp): 307; 330; 318
High quality reads used in assembly: 529496; 473666; 972907
All contigs (>100 bp): 17900; 16457; 21308
Singletons: 53983; 45023; 81120
Total bases after assembly (Mb): 24.7; 22.3; 37.2
Large contigs (>500 bp): 8752; 8153; 12947
Largest contig size (bp): 3560; 4991; 5008
Average contig size (bp): 860; 823; 936
N50 contig size (bp): 893; 838; 987
Aligned reads (%): 86.8; 88.0; 86.9
Aligned bases (%): 86.4; 87.4; 85.6
Inferred read error (%), only two values in the transcript: 2.0; 1.8
Q40 plus bases (%): 94.0; 94.1; 95.5
Annotation of unigenes (parameter: JKC 703; JKC 777; merged assembly of both genotypes)
Total unigenes: 71883; 61480; 102428
Hits in 'NCBI nr' database: 40134; 36844; 56390
Hits in 'tair9' database: 33,359; 37,017; 51120
Hits in ESTScan: 44592; 40570; 20,852
Differential unigenes: 2156; 2076; -

Correlation between Cotton Fiber Microarray and Transcriptome Sequencing: R² = 0.68, p-value = 0.001
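The slide does not say how R² was obtained. A common choice, assumed here, is the square of the Pearson correlation between per-gene fold changes measured on the two platforms. The sketch below uses hypothetical paired values and does not reproduce the reported 0.68.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2 fold changes for the same genes on the two platforms.
microarray_lfc = [1.2, -0.5, 2.1, 0.3, -1.4]
sequencing_lfc = [1.0, -0.2, 2.5, 0.1, -1.0]
r = pearson_r(microarray_lfc, sequencing_lfc)
print(f"R^2 = {r * r:.2f}")
```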

Hypothetical regulatory model showing over-represented genes and pathways in Superior and Inferior genotypes
Figure: model diagram for the SUPERIOR GENOTYPE and the INFERIOR GENOTYPE across fiber stages (initiation, 6, 9, 12, 19 and 25 DPA).
Hormone and signalling labels: SA, ABA, BR, auxin, GA, JA; brassinosteroid signalling and continued brassinosteroid signalling (DET2, BES1, BIN2); brassinosteroid targeted gene expression; calcium signalling (Ca++/CAM1); induction of stress hormone signals such as ethylene and ABA.
Transcription factor labels: C2C2-GATA, NAC, WRKY, ABI3/VP1, SPL3, SPL5, AP2-EREBP, HSF, LIM domain.
Process labels: increased stress tolerance to facilitate the elongation process up to its completion; fibre cells continuative elongation; efficient energy source for fast elongating fiber (oxidative phosphorylation, e.g. TCA, glycolysis); continued energy providing machinery; transporter machinery (ABC transporter); cell wall enzymes and pectin modification (PG, PAE, PME, PMEI); cell wall loosening; SCW formation starts; flavonoid biosynthesis; large number of ribosomal subunits (better protein synthesizing machinery); mRNA and protein degradation (ubiquitin ligase, proteasome, spliceosome); increased stress environment within the fibre cell and complete end of the cell elongation process; decreased high elongation rate of fibre cells during the elongation period; oxidative stress (ROS, H2O2, POX, ascorbate peroxidase, glutathione-S-transferase); lipid peroxidation; HSPs; energy exhaust; end of cell elongation; cell death.
Other labels: oxylipins, phospholipases, Pat5, Pat6, AGPs, CSEA, Asp family, ALDH, S3, MEE 59, PRF5, EXL5, APY2.

Positions of mapped differentially expressed genes on QTL
Fields per QTL location: QTL start to QTL end; QTL size in bases (and Mb); chromosome size in bases (and Mb); % of chromosome covered by QTL; total no. of genes in the QTL region; completely mapped genes in QTL region (our data); total mapped genes on chromosome (our data)
Chr 1: 17773028 to 45685292; 27912264 (27.91); 55868233 (55.86); 49.9; 2757; 50; 138
Chr 2: 1390547 to 20704735; 19314188 (19.31); 62769430 (62.76); 30.7; 2148; 49; 130
Chr 3.1: 3185870 to 35433985; 32248115 (32.24); 45765648 (45.76); 70.4; 2986; 29; 92
Chr 3.2: 42011485 to 45719191; 3707706 (3.707); chromosome size shared with Chr 3.1; 8.1; 884; 18; shared with Chr 3.1
Chr 4: 50443620 to 58823186; 8379566 (8.37); 62178258 (62.17); 13.47; 1008; 32; 133
Chr 5: 4002919 to 62380583; 58377664 (58.37); 64140413 (64.14); 91.0; 11474; 77; 109
Chr 6: 14491832 to 56506108; 42014276 (42.01); 51074515 (51.07); 82.2; 1334; 45; 114
Chr 7.1: 3099961 to 5211929; 2111968 (2.11); 60982465 (60.98); 3.4; 295; 13; 172
Chr 7.2: 13773747 to 57194167; 43420420 (43.42); chromosome size shared with Chr 7.1; 71.2; 4441; 68; shared with Chr 7.1
Chr 08: 37014370 to 55710457; 18696087 (18.69); 57128820 (57.12); 32.7; 4592; 79; 147
Chr 09: 4942098 to 29728227; 24786129 (24.78); 70713020 (70.71); 35.0; 3146; mapped genes in QTL not given; 245
Chr 11: 2519220 to 3161683; 642463 (0.64); 62681010 (62.68); 1.0; 95; 2; 140
Total: 600 completely mapped genes in QTL regions; 1684 mapped genes on chromosomes
Percentage of genes mapped in QTL: 600/1684 = ~35%
Figure: Circos plot of all the differential genes mapped on the cotton JGI genome with all the QTL
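The "% covered by QTL" column is the QTL size divided by the chromosome size, and the final percentage is the share of mapped genes that fall inside QTL regions. A minimal sketch using the Chr 1 coordinates and the totals from the table follows; the function name is hypothetical. The table's 49.9 looks truncated rather than rounded, so the sketch prints two decimals instead of trying to match it exactly.

```python
def qtl_coverage_percent(qtl_start, qtl_end, chr_size):
    """Percentage of a chromosome spanned by a QTL interval (coordinates in bases)."""
    if not (0 <= qtl_start <= qtl_end <= chr_size):
        raise ValueError("expected 0 <= qtl_start <= qtl_end <= chr_size")
    return 100.0 * (qtl_end - qtl_start) / chr_size


# Chr 1 values from the table above.
chr1 = qtl_coverage_percent(17773028, 45685292, 55868233)
print(f"Chr 1 covered by QTL: {chr1:.2f}%")

# Share of mapped differential genes that lie in QTL regions (600 of 1684).
mapped_fraction = 100.0 * 600 / 1684
print(f"Genes mapped in QTL: {mapped_fraction:.2f}%")
```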

microRNA156 Targeted Transcription Factor (TA5) Governs the Boll Number, Size and Lint Yield in G. hirsutum
Figures: heat map of 9 TA5 transcription factors; q-RT PCR measurement of Ghi8127.1.S1_s_at (GhSPL5), Gra1201.1.A1_at (GhSPL13) and Gra1300.1.A1_at (GhSPL14)

GhTA5 produces transcripts that are targeted by miR156
Figure: Northern blot of miR156 & miR172 in cotton fiber

Phenotypic evaluation of overexpression and knockdown lines
Figure panels: overexpression lines, wild type and knockdown lines (lines 1 to 6); number of cotton bolls (average per plant); lint cotton weight (g/plant); seed cotton weight (g/plant)

Identification and characterization of fiber specific promoter in G. hirsutum (NBRI_2800)
Figure: expression pattern of the NBRI_2800 gene in different fiber developmental stages (fold change against fiber developmental stage)

Histochemical localization of GUS expression in NBRI_2800 transgenic cotton plant

Acknowledgments
HMPR Sequencing & Epigenetic regulation: Sunil K. Singh, Krishan M. Rai, Verandra Kumar
Functional Genomics: Neha Pandey, Rajiv Tripathi, Vrijesh Yadav, Anshulika Sable
Bioinformatics: Dr. Mehar Asif, Dr. Sumit K. Bag, Dipti Nigam, Archana Bhardwaj, Ridhi Goel, Pooman Pant
Cotton Marker: Dr. S N Jena, Anukool Srivastava, Ravi P. Shukla
Collaborators: J K Agrigenetics; Tierra Seed Science; TNAU, Coimbatore; CICR, Nagpur; UAS, Dharwad

Thank You