Session 1: WELCOME AND INTRODUCTIONS



Instructors and Teaching Assistants
Main Instructors:
  James C. Fleet, PhD (Nutrition Science)
  Wanqing Liu, PhD (Medicinal Chemistry and Molecular Pharmacology)
  Pete Pascuzzi, PhD (Libraries)
  Min Zhang, PhD (Statistics)
Teaching Assistants (Statistics):
  Chen Chen
  Min Ren
  Kirsen Sullivan
  Will Eagan
  Harley Schawadron
Fleet 2017

Introductions
  Who are you?
  Where are you from?
  What is your research interest?
  Why are you interested in "big data"?

Workshop Overview
Fleet and Pascuzzi:
  Unit 1: Microarray
  Unit 2: Next Generation Sequencing
Liu and Zhang:
  Unit 3: Biomarker Discovery
  Unit 4: Genetic Variation
Technical Goals:
  Analysis pipelines
  Statistical issues
  Visualization
  Functional annotation
  Databases
  Project management
  Computation and programming

Course Materials http://www.stat.purdue.edu/bigtap/index.html

Guest Lecturers
  Doug Crabill (Purdue University)
  Bruce Craig (Purdue University)
  Xiang Zhang (University of Louisville)
  Sean Davis (National Cancer Institute)
  Dan Raftery (University of Washington)
  Yonglan Zheng (University of Chicago)
  Nancy Cox (Vanderbilt University)
  Nadia Atallah (Purdue University)

Session 2: Working with the Purdue Computer Infrastructure
Doug Crabill
Department of Statistics, Purdue University

Sites to Understand Computing
UNIX operating system
  Learn UNIX: http://www.tutorialspoint.com/unix/index.htm
Linux operating system
  http://www.tutorialspoint.com//operating_system/os_linux.htm
R coding
  http://bioinformatics.knowledgeblog.org/2011/06/21/using-r-a-guide-for-complete-beginners/
  https://www.r-project.org/about.html

Session 3: Data Repositories and Pre-processed Data Sites
James C. Fleet, PhD
Distinguished Professor, Department of Nutrition Science

Data Archives | Web link | Description
NIH Data Sharing Repositories | https://www.nlm.nih.gov/NIHbmic/nih_data_sharing_repositories.html | Trans-NIH BioMedical Informatics Coordinating Committee (BMIC) sites
Gene Expression Omnibus (GEO) | http://www.ncbi.nlm.nih.gov/geo/ | NCBI; transcriptome and ChIP-seq datasets
ArrayExpress | http://www.ebi.ac.uk/arrayexpress/ | EMBL-EBI repository to archive functional genomics data
European Nucleotide Archive (ENA) | http://www.ebi.ac.uk/ena | Comprehensive record of the world's nucleotide sequencing information
The Cancer Genome Atlas (TCGA) | http://cancergenome.nih.gov/ | Multi-"omic" phenotype characterization of tumors
Proteomics IDEntifications (PRIDE) | http://www.ebi.ac.uk/pride/archive/ | European proteomics datasets
Metabolomics Workbench | http://metabolomicsworkbench.org/standards/nominatecompounds.php | Metabolomic datasets

Data Archives | Web link | Description
Oncomine | https://www.oncomine.org/resource/login.html | 715 microarray datasets from 19 cancers
Gene Expression across Normal and Tumor tissue (GENT) | http://medical-genome.kribb.re.kr/GENT/ | Gene expression patterns in human cancer from Affy chips (+ 1000 cell lines)
cBioPortal | http://www.cbioportal.org/ | TCGA cancer genomics
Genotype-Tissue Expression project (GTEx) | http://www.gtexportal.org/home/ | Human, multi-tissue gene expression and gene variation for eQTL
Immunological Genome Project (ImmGen) | http://www.immgen.org/ | Transcriptome data from cultured mouse immune cells
Human Brain Transcriptome | http://hbatlas.org/ | Transcriptome and associated metadata for developing and adult human brain
NHLBI Kidney Transcriptome database | https://hpcwebapps.cit.nih.gov/ESBL/Database/Transcriptomic/index.html | Segment-specific expression in rat kidney
Kidney Systems Biology Project | https://hpcwebapps.cit.nih.gov/ESBL/Database/ | Multi-omic database from rat and mouse studies
Saccharomyces Genome Database | http://www.yeastgenome.org/transcriptome-data-in-yeastmine | Integrated biological information on budding yeast
miRBase | http://mirbase.org/ | Published miRNA sequences and annotation; expression dataset links available


Training GEO Datasets: Unit 1 and 2
  GSE15947: Time course of 1,25(OH)2 D treated RWPE1 cells (Unit 1)
  GSE80182: A TGFb-PRMT5-MEP50 Axis regulates cancer cell invasion through histone H3 and H4 arginine methylation coupled to transcriptional activation and repression. (Unit 2)
Accession number types:
  GSE #: Accession number for an original, submitter-supplied record that summarizes a study
  GDS #: GSE data that is reassembled by GEO staff into a curated data set
  GSM #: Accession number for a specific sample within a dataset
  GPL #: The platform used to generate a dataset
  SRX #: Accession number for a sample generated by NGS that is deposited in the Sequence Read Archive (SRA)
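The accession prefixes on this slide can be told apart mechanically, which may help when collecting identifiers for a dataset. The following is a minimal Python sketch, not part of the workshop materials: the function name `classify_accession` and the lookup table are hypothetical, and the format check (a known prefix followed only by digits) is an assumption based on the two training accessions shown above.

```python
import re

# Prefix meanings paraphrased from the slide; the table itself is illustrative.
ACCESSION_TYPES = {
    "GSE": "series: original, submitter-supplied record that summarizes a study",
    "GDS": "curated dataset: GSE data reassembled by GEO staff",
    "GSM": "sample: a specific sample within a dataset",
    "GPL": "platform: the platform used to generate a dataset",
    "SRX": "NGS sample deposited in the Sequence Read Archive (SRA)",
}

# Assumption: an accession is one of the prefixes above followed by digits only.
_ACCESSION_PATTERN = re.compile(r"^(GSE|GDS|GSM|GPL|SRX)(\d+)$")


def classify_accession(accession):
    """Return (prefix, number, description) for an accession string.

    Raises ValueError when the string does not match the assumed format.
    """
    match = _ACCESSION_PATTERN.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"not a recognized accession: {accession!r}")
    prefix, number = match.groups()
    return prefix, int(number), ACCESSION_TYPES[prefix]


if __name__ == "__main__":
    # The two training datasets named on the slide.
    for acc in ("GSE15947", "GSE80182"):
        print(acc, "->", classify_accession(acc)[2])
```

Anything that does not match the assumed pattern is rejected rather than guessed at, so a mistyped identifier fails early.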

Assignment 1 (Individual)
  Search GEO for datasets that relate to your research
  Select one dataset
  Identify important information about your dataset:
    Description/design
    GSE and GDS #
    Sample information
    Platform
  Analyze your dataset using:
    GEO2R
    Tools in the dataset browser
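For the search and lookup steps of this assignment, the relevant web addresses can be assembled in a few lines. This is a hedged Python sketch that only builds URLs and makes no network requests. None of the URL patterns come from the slides: the GEO accession viewer, GEO2R, and NCBI E-utilities `esearch` endpoints (with `db=gds` for GEO DataSets) are given as they are commonly used, and the function names and the search term in the usage lines are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoints (not given in the slides).
GEO_ACCESSION_URL = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"
GEO2R_URL = "https://www.ncbi.nlm.nih.gov/geo/geo2r/"
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def geo_search_url(term, max_results=20):
    """Build an E-utilities esearch URL that queries GEO DataSets (db=gds)."""
    params = {"db": "gds", "term": term, "retmax": max_results}
    return f"{ESEARCH_URL}?{urlencode(params)}"


def geo_record_url(accession):
    """Build the GEO accession viewer URL for a GSE, GDS, GSM, or GPL record."""
    return f"{GEO_ACCESSION_URL}?{urlencode({'acc': accession})}"


def geo2r_url(series_accession):
    """Build the GEO2R URL for a series (GSE) accession."""
    if not series_accession.upper().startswith("GSE"):
        raise ValueError("this sketch only builds GEO2R links for GSE accessions")
    return f"{GEO2R_URL}?{urlencode({'acc': series_accession})}"


if __name__ == "__main__":
    # Hypothetical search term; GSE15947 is the Unit 1 training dataset.
    print(geo_search_url("vitamin D prostate"))
    print(geo_record_url("GSE15947"))
    print(geo2r_url("GSE15947"))
```

Opening the printed links in a browser gives the record page, where the description, samples, and platform can be read off, and the GEO2R page for the analysis step.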