Session 1: WELCOME AND INTRODUCTIONS

1 Session 1: WELCOME AND INTRODUCTIONS
2017

2 Instructors and Teaching Assistants
Main Instructors:
  James C. Fleet, PhD (Nutrition Science)
  Wanqing Liu, PhD (Medicinal Chemistry and Molecular Pharmacology)
  Pete Pascuzzi, PhD (Libraries)
  Min Zhang, PhD (Statistics)
Teaching Assistants (Statistics):
  Chen Chen
  Min Ren
  Kirsen Sullivan
  Will Eagan
  Harley Schawadron
Fleet 2017

3 Introductions
Who are you?
Where are you from?
What is your research interest?
Why are you interested in “big data”?

4 Workshop Overview
Fleet and Pascuzzi:
  Unit 1: Microarray
  Unit 2: Next Generation Sequencing
Liu and Zhang:
  Unit 3: Biomarker Discovery
  Unit 4: Genetic Variation
Technical Goals:
  Analysis pipelines
  Statistical issues
  Visualization
  Functional annotation
  Databases
  Project management
  Computation and programming

5 Course Materials http://www.stat.purdue.edu/bigtap/index.html

6 Guest Lecturers
Doug Crabill (Purdue University)
Bruce Craig (Purdue University)
Xiang Zhang (University of Louisville)
Sean Davis (National Cancer Institute)
Dan Raftery (University of Washington)
Yonglan Zheng (University of Chicago)
Nancy Cox (Vanderbilt University)
Nadia Atallah (Purdue University)

7 Session 2: Working with the Purdue Computer Infrastructure
Doug Crabill Department of Statistics Purdue University

8 Sites to Understand Computing
UNIX operating system
Learn UNIX
Linux operating system
R coding

9 Session 3: Data Repositories and Pre-processed Data Sites
James C. Fleet, PhD Distinguished Professor Department of Nutrition Science

10 Data Archives
Web link: Description
NIH Data Sharing Repositories: Trans-NIH BioMedical Informatics Coordinating Committee (BMIC) sites
Gene Expression Omnibus (GEO): NCBI; transcriptome and ChIP-seq datasets
ArrayExpress: EMBL-EBI repository to archive functional genomics data
European Nucleotide Archive (ENA): Comprehensive record of the world's nucleotide sequencing information
The Cancer Genome Atlas (TCGA): Multi-"omic" phenotype characterization of tumors
Proteomics IDEntifications (PRIDE): European proteomics datasets
Metabolomics Workbench: Metabolomic datasets

11 Data Archives
Web link: Description
Oncomine: 715 microarray datasets from 19 cancers
Gene Expression across Normal and Tumor tissue (GENT): Gene expression patterns in human cancer from Affy Chips ( cell lines)
cBioPortal: TCGA cancer genomics
Genotype-Tissue Expression project (GTEx): Human, multi-tissue gene expression and gene variation for eQTL
Immunological Genome Project (ImmGen): Transcriptome data from cultured mouse immune cells
Human Brain Transcriptome: Transcriptome and associated metadata for developing and adult human brain
NHLBI Kidney Transcriptome database: Segment-specific expression in rat kidney
Kidney Systems Biology Project: Multi-omic database from rat and mouse studies
Saccharomyces Genome Database: Integrated biological information on budding yeast
miRBase: Published miRNA sequences and annotation; expression dataset links available

13 Training GEO Datasets for Units 1 and 2
GSE15947: Time course of 1,25(OH)2D-treated RWPE1 cells (Unit 1)
GSE80182: A TGFb-PRMT5-MEP50 Axis regulates cancer cell invasion through histone H3 and H4 arginine methylation coupled to transcriptional activation and repression. (Unit 2)
GSE #: Accession number for an original, submitter-supplied record that summarizes a study
GDS #: GSE data that are reassembled by GEO staff into a curated dataset
GSM #: Accession number for a specific sample within a dataset
GPL #: Accession number for the platform used to generate a dataset
SRX #: Accession number for a sample generated by NGS that is deposited in the Sequence Read Archive (SRA)
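
The accession prefixes on this slide can be told apart mechanically. The following is a minimal Python sketch, assuming an accession is simply one of the five prefixes followed by digits; that pattern and the names `ACCESSION_TYPES` and `describe_accession` are illustrative assumptions, not GEO's own validation rules or API.

```python
import re

# Prefix meanings paraphrased from the slide. The "prefix + digits" pattern
# is an assumption for illustration, not an official GEO/SRA rule.
ACCESSION_TYPES = {
    "GSE": "series: original, submitter-supplied record summarizing a study",
    "GDS": "curated dataset: GSE data reassembled by GEO staff",
    "GSM": "sample: a specific sample within a dataset",
    "GPL": "platform: the platform used to generate a dataset",
    "SRX": "SRA record: an NGS sample deposited in the Sequence Read Archive",
}

_ACCESSION_PATTERN = re.compile(r"^(GSE|GDS|GSM|GPL|SRX)(\d+)$")


def describe_accession(accession: str) -> str:
    """Return the record type described by an accession's prefix."""
    match = _ACCESSION_PATTERN.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"Unrecognized accession: {accession!r}")
    return ACCESSION_TYPES[match.group(1)]


if __name__ == "__main__":
    # The two training datasets named on the slide.
    for acc in ("GSE15947", "GSE80182"):
        print(acc, "->", describe_accession(acc))
```

Both training datasets carry the GSE prefix, so the sketch reports them as series records.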

14 Assignment 1 (Individual)
Search GEO for datasets that relate to your research
Select one dataset
Identify important information about your dataset:
  Description/design
  GSE and GDS #
  Sample information
  Platform
Analyze your dataset using:
  GEO2R
  Tools in the dataset browser
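
One way to keep track of the information the assignment asks for is a small record with a completeness check. The Python sketch below is only an illustration: the `DatasetSummary` class and its field names are hypothetical, and the two URL patterns are assumptions about how the public GEO record page and the GEO2R page are addressed, so verify them in a browser before relying on them.

```python
from dataclasses import dataclass, fields
from urllib.parse import urlencode

# Assumed URL patterns for the GEO record page and the GEO2R tool.
GEO_RECORD_BASE = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"
GEO2R_BASE = "https://www.ncbi.nlm.nih.gov/geo/geo2r/"


@dataclass
class DatasetSummary:
    """Hypothetical checklist mirroring the items listed in Assignment 1."""
    description: str = ""
    gse: str = ""
    gds: str = ""  # may stay empty: not every series has a curated GDS
    sample_information: str = ""
    platform: str = ""

    def missing_items(self) -> list:
        """Names of required checklist items that are still empty."""
        optional = {"gds"}
        return [f.name for f in fields(self)
                if f.name not in optional and not getattr(self, f.name)]


def geo_record_url(accession: str) -> str:
    """Build the (assumed) record-page URL for an accession."""
    return f"{GEO_RECORD_BASE}?{urlencode({'acc': accession})}"


def geo2r_url(gse: str) -> str:
    """Build the (assumed) GEO2R URL for a GSE series accession."""
    if not gse.upper().startswith("GSE"):
        raise ValueError("GEO2R is opened here with a GSE series accession")
    return f"{GEO2R_BASE}?{urlencode({'acc': gse})}"


if __name__ == "__main__":
    summary = DatasetSummary(gse="GSE15947")
    print("Still to fill in:", summary.missing_items())
    print(geo_record_url(summary.gse))
    print(geo2r_url(summary.gse))
```

The GDS field is treated as optional in this sketch on the assumption that a submitted series need not have been curated into a GDS record.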

