Centralizing Bioinformatics Services: Analysis Pipelines, Opportunities, and Challenges with Large-scale -Omics, and other Big Data High-Performance Computing.

Bioinformatics Facility Tutorials and Workshops
http://facility.bioinformatics.ucr.edu/home/workshops

The facility also focuses on educating and training its researchers and users, in particular biologists, in the analysis and interpretation of their experimental data. We provide custom advice and training in bioinformatics as well as in the acquisition of basic and advanced programming skills. The following training events are scheduled:

i) Quarterly workshops: an introduction to R / Bioconductor, with hands-on training in Next-Generation Sequencing (NGS) data analysis using R.
ii) 5-day NGS workshop: organized every year, this comprehensive one-week workshop is aimed at users who want to acquire the skills required to analyze NGS and other large-scale datasets independently and proficiently.
Microarrays

Gene expression arrays

Workflow:
➢ Raw data
➢ Preprocessing
➢ Normalization
➢ Data inspection & quality control
➢ Statistical testing
➢ Filtering for differentially expressed genes
➢ Functional analysis

Statistical testing between compared sample groups:
➢ Fold-change for the size of the change
➢ P-values and false discovery rates for the reliability of the change

Functional analysis:
➢ GO, KEGG, other ontologies
➢ Ingenuity
➢ GSEA
➢ Motif analysis (known and de novo)
➢ Specific requests

NGS-to-Spreadsheet Analysis Pipelines

Other Services

Other NGS pipelines: The facility is developing workflows for analyses in other areas, including:
i) Small RNA analysis
ii) Variant annotation using VAAST
iii) Other custom projects

Web resources:
➢ Many project-specific web services and databases: http://facility.bioinformatics.ucr.edu/resources/software
➢ Galaxy server: http://galaxy.bioinfo.ucr.edu
➢ BLAST / EMBOSS servers: http://facility.bioinformatics.ucr.edu/resources/web-tools/www-blast
➢ RStudio server: https://rstudio.bioinfo.ucr.edu
➢ Web container support

Visit http://bioinfo.ucr.edu/ for more details about any of our services. We can also work with you to develop custom bioinformatics solutions for problems not listed here or on the website.

Facility Members

Contact: Rakesh Kaundal, Ph.D., Director, Bioinformatics Facility (rkaundal@ucr.edu)
Systems Administrator: Jordan Hayes
Bioinformatics Programmer: Neerja Katiyar

➢ Collaborative: please contact us to discuss your research projects.
➢ Cost of analysis: the price varies for the UC system, other academic institutes, and commercial organizations; please see details at:
http://illumina.bioinfo.ucr.edu/ht/documentation/analysis
http://biocluster.ucr.edu/~rkaundal/Documents/Recharge_Rates.pdf

About the Facility
http://bioinfo.ucr.edu

Mission

UCR's High-Performance Computing / Bioinformatics Facility, part of the Institute for Integrative Genome Biology (IIGB) and located in the Genomics building, provides access to high-performance compute resources, data analysis, and programming expertise. These resources help scientists at UC Riverside, as well as external institutes and industry, meet the informatics needs of their research proficiently and cost-effectively. The following services are offered:
➢ Development and maintenance of a high-performance informatics hardware and software infrastructure for the research community
➢ Hands-on tutorials and workshops on a wide variety of informatics topics
➢ Custom data analysis and consultation services for bioinformatics and cheminformatics projects
➢ Establishment of research collaborations with experimental scientists from different departments
Goals

i) Service: provide solutions for managing, visualizing, analyzing, and interpreting genomic data, including studies of gene expression (RNA-seq, microarrays), pathway analysis, protein-DNA binding (ChIP-seq), DNA methylation, and DNA variation, using high-throughput platforms.
ii) Collaborations: identify opportunities and implement research collaborations, make findings available through presentations, contribute to scientific publications, and participate in the preparation of joint grant applications and reports.
iii) Training: provide outreach and education in the form of seminars, workshops, and training series. Learn more at http://facility.bioinformatics.ucr.edu/home/workshops. Extensive manuals and tutorials are available on this website.

NGS Data Analysis

Working model (figure): chromosomal DNA; RNA, DNA; whole genome & transcriptome; exome-enriched sequences; billions of short reads (e.g. AATGCGTACATGCACCANTTCAG).

Quality Control
Filtering / trimming:
➢ Removing low-quality reads
➢ Trimming low-quality ends
➢ Filtering adapters, primers, etc.

Read mapping (BWA, Bowtie, TopHat, Bismark)

De novo genome & transcriptome assembly
➢ Check assembly quality (QUAST)
➢ Repeat masking
➢ Gene prediction (GeneMark, FGENESH)

Functional annotation
➢ Annotation of coding and non-coding genes
➢ Functional domain prediction
➢ Molecular function assignment, Gene Ontology (GO) terms
➢ Mapping of proteins into metabolic pathways and reactions

SNP, indels, structural variation (Var-Seq)
➢ Variant calling, SNPs and short indels using GATK
➢ Annotation of variants, synonymous/non-synonymous SNPs, map to genomic regions

Detection of enriched regions (ChIP-Seq)
➢ Peak calling (e.g. BayesPeak)
➢ Annotation of peaks with genomic information
➢ Differential peak analysis with edgeR or DESeq
➢ Detection of enriched motifs in binding sites
➢ Analysis of gene enrichment
➢ Visualization in genome browser / IGV

Detection of methylated regions (Methyl-Seq)
➢ Sample correlation and clustering
➢ Differential methylation analysis
➢ Annotate differentially methylated regions
➢ Visualization in genome browser / IGV

Detection of alternative splicing, differential expression (RNA-Seq)
➢ Generate read counts, convert to RPKM values
➢ Statistically identify differentially expressed genes (edgeR, DESeq2, DEXSeq)
➢ Detection of novel splicing sites
➢ Visualization of read mapping / analysis results
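The RNA-Seq step "generate read counts, convert to RPKM values" can be illustrated with a minimal Python sketch. It is not the facility's pipeline (the workshops above use R / Bioconductor); the function name, gene names, counts, and lengths are hypothetical, and the library size is assumed to be the sum of the counts passed in, whereas in practice the total number of mapped reads for the sample would normally be used. RPKM is taken here as reads per kilobase of gene length per million mapped reads.

```python
def rpkm(counts, lengths_bp, total_mapped_reads=None):
    """Convert raw read counts to RPKM values.

    counts: dict mapping gene name -> raw read count
    lengths_bp: dict mapping gene name -> gene length in base pairs
    total_mapped_reads: library size; assumed to be the sum of the
        supplied counts when not given (a simplifying assumption)
    """
    if total_mapped_reads is None:
        total_mapped_reads = sum(counts.values())
    if total_mapped_reads <= 0:
        raise ValueError("library size must be positive")

    result = {}
    for gene, count in counts.items():
        length = lengths_bp[gene]
        if length <= 0:
            raise ValueError(f"non-positive length for gene {gene}")
        # 1e9 = 1e3 (per kilobase) * 1e6 (per million reads)
        result[gene] = count * 1e9 / (length * total_mapped_reads)
    return result


if __name__ == "__main__":
    # Hypothetical toy data: geneB has three times the reads of geneA
    # but is also three times as long, so both get the same RPKM.
    toy_counts = {"geneA": 500, "geneB": 1500}
    toy_lengths = {"geneA": 1000, "geneB": 3000}
    for gene, value in rpkm(toy_counts, toy_lengths).items():
        print(gene, value)
```

The toy data shows why length normalization matters: a longer gene collects more reads at the same expression level, so raw counts alone would overstate its expression. Tools such as edgeR and DESeq2, listed above, work from raw counts for differential expression testing, so RPKM values are mainly useful for reporting and visualization.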
