1 School of SOCIAL AND COMMUNITY MEDICINE University of BRISTOL ARIES Methylation Pre-processing and Clean up Geoff Woodward

2 Overview
- Initial QC
- Normalisation
- Batch Correction
- Data
- MWAS (Methylome-Wide Association Study)
- Results

3 Initial QC
- Probe p-value
  - Confidence in detection relative to background (-ve controls)
  - Overall QC indicator of:
    - High background
    - Low signal
    - Poor stringency
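
As an illustration of the idea on this slide, here is a minimal Python sketch of a detection p-value: the probability that a background (negative-control) intensity would be at least as large as the probe's observed intensity. The normal model of the background, the 0.01 threshold, the function names and the intensities are all assumptions made for the example; the pipeline itself is not shown in the slides and may compute this differently.

```python
from statistics import NormalDist, mean, stdev


def detection_p_value(total_intensity, negative_controls):
    """P(background >= observed intensity) under a normal model of the negative controls."""
    background = NormalDist(mean(negative_controls), stdev(negative_controls))
    return 1.0 - background.cdf(total_intensity)


def failed_fraction(intensities, negative_controls, threshold=0.01):
    """Fraction of probes on one array whose detection p-value exceeds the threshold."""
    p_values = [detection_p_value(x, negative_controls) for x in intensities]
    return sum(p > threshold for p in p_values) / len(p_values)


# Made-up numbers, for illustration only.
negative_controls = [180, 220, 200, 210, 190, 205, 195, 215]
probe_intensities = [5000, 3200, 240, 8000, 210]
print(failed_fraction(probe_intensities, negative_controls))
```

A high failed fraction for an array would point to the problems listed above (high background, low signal or poor stringency).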

4 Initial QC: Control Probes
- Mixture of sample-dependent and sample-independent controls
- Sample independent
  - Staining (Biotin/DNP)
  - Hybridisation (synthetic target)
  - Extension (hairpin)
- Sample dependent
  - Bisulfite conversion (HindIII site)
  - G/T mismatch (non-specific)
  - Specificity & Non-polymorphic
  - Negative

5 Initial QC: LIMS

6 LIMS Control Dashboard
- Real time
- Jscript/JSON
- Zoom & scroll
- All Illumina control probes, +ve & -ve
- Area / Max / Median / Min

7 Initial QC: MDS
- Start pre-processing
- What's affecting the data?
  - Failures
  - Controls

8 Initial QC: MDS
- Remove controls/failures
- Remove sex chromosomes

9 Sample Confirmation
- Genotyping
  - 65 SNP probes
  - K-means clustering
  - Call genotype
  - Cross-reference with SNP data
  - Calculate % match
- Fully automated in pipeline
- Stored in LIMS
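
The genotype-calling steps above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: a simple one-dimensional k-means with fixed starting centres (0.1, 0.5, 0.9), genotypes coded 0/1/2, and made-up probe names and values. The real pipeline's clustering and matching code is not given in the slides.

```python
def kmeans_1d(values, centres=(0.1, 0.5, 0.9), iterations=20):
    """Cluster beta values for one SNP probe into three groups; returns 0, 1 or 2 per value."""
    centres = list(centres)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to its nearest centre.
        labels = [min(range(len(centres)), key=lambda k: abs(v - centres[k])) for v in values]
        # Move each centre to the mean of its members (unchanged if the cluster is empty).
        for k in range(len(centres)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centres[k] = sum(members) / len(members)
    return labels


def percent_match(called, reference):
    """Percentage of SNPs where the array-derived call equals the reference genotype."""
    return 100.0 * sum(c == r for c, r in zip(called, reference)) / len(called)


# Made-up data: probe -> one beta value per sample (three samples).
snp_betas = {"rs_a": [0.05, 0.52, 0.95], "rs_b": [0.48, 0.93, 0.07]}
# Made-up reference genotypes from SNP data, coded 0/1/2.
reference = {"rs_a": [0, 1, 2], "rs_b": [1, 2, 1]}

calls = {probe: kmeans_1d(betas) for probe, betas in snp_betas.items()}
for sample in range(3):
    called = [calls[p][sample] for p in snp_betas]
    expected = [reference[p][sample] for p in snp_betas]
    print(sample, percent_match(called, expected))
```

A sample with a low % match would be a candidate for a sample mix-up.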

10 Normalisation
- Why?
  - Cancer vs. control: not required
  - More sensitive differences...
- Quantile?
  - Rank & scale according to a reference distribution (average)
  - Not appropriate: Type I & II assays differ
    - Medians at opposite ends of the β scale
    - SD (across replicates) smaller in Type I probes
    - Interrogate different subsets of the genome
      - Type II: greater proportion in open-sea
      - Type I: greater proportion in gene promoters
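
For reference, the "rank & scale to an average reference distribution" step can be sketched in a few lines of Python. This is a generic quantile normalisation with no tie handling, written for illustration; the function name and the toy values are assumptions.

```python
def quantile_normalise(arrays):
    """Quantile-normalise equal-length arrays to the mean of their sorted values."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: average of the k-th smallest value across arrays.
    reference = [sum(s[i] for s in sorted_arrays) / len(arrays) for i in range(n)]
    normalised = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # indices from smallest to largest
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]  # replace each value by the reference value of its rank
        normalised.append(out)
    return normalised


print(quantile_normalise([[5, 2, 3], [4, 1, 6]]))
```

Because every array is forced onto the same distribution, applying this to Type I and Type II probes pooled together would ignore the differences between the two assays listed above.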

11 Normalisation: Method 1
Subset Within Array Normalisation (minfi)
- To address differences in distribution:
  - Number of CpGs in the probe body indicates density/location
  - Distributions are more similar within these groups
- Approach
  - Reference quantiles:
    - N random Type I & II probes selected for each group
    - Split methylated/unmethylated channels
  - Linear interpolation to fit probes to the reference
- Doesn't treat Type I & II separately BUT does decrease the difference

12 Normalisation: Method 2
Touleimat & Tost
- To address differences: CpG region
  - Shore / Shelf / Island / Open-sea
- Treat Type I & II separately
- Approach: reference quantiles
  - Type I used as anchors for each region
  - More reliable / lower SD
  - Estimate target quantiles
  - Fit Type II to target

13 Normalisation: Method 3
Dasen (wateRmelon), under review
- Separate QN of:
  - methylated Type I
  - unmethylated Type I
  - methylated Type II
  - unmethylated Type II intensities
- Both directions

14 Normalisation: Comparison
wateRmelon metrics: Imprinted DMRs
- 237 probes within iDMRs
- iDMR expected = 50% methylated
- SE = SD / √N
  - SD of all 237 probes
  - N = number of samples
Method  iDMRs
Raw     0.00431
Dasen   0.00241
Tost    0.00214
Swan    0.00428
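
A minimal Python sketch of this SE-like metric, assuming the usual standard-error form (pooled SD of the iDMR betas divided by the square root of the number of samples). The function name and the toy values are assumptions for illustration; this is not wateRmelon's own code.

```python
from math import sqrt
from statistics import stdev


def dmr_standard_error(idmr_betas):
    """SD of all iDMR betas (rows = probes, columns = samples) over sqrt(number of samples)."""
    n_samples = len(idmr_betas[0])
    pooled = [b for probe in idmr_betas for b in probe]
    return stdev(pooled) / sqrt(n_samples)


# Made-up betas for two iDMR probes measured in two samples.
print(dmr_standard_error([[0.4, 0.6], [0.5, 0.5]]))
```

Since imprinted DMRs are expected to sit near 50% methylation, a smaller value suggests that a normalisation method has removed more technical variation.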

15 Normalisation: Comparison
SNP probes
- 63 highly polymorphic SNP probes
- K-means clustering into 3 genotypes
- SE-like measure for each group
Method  AA          AB          BB
Raw     9.025e-05   1.910e-04   5.145e-05
Dasen   1.669e-04   2.047e-04   2.321e-05
Tost    8.253e-05   5.242e-04   1.541e-04
Swan    NA          NA

16 Normalisation: Comparison
wateRmelon metrics: X-Chromosome Inactivation
- 11,232 probes
- T-test all probes for sex differences
- ROC analysis
  - using p-value for sex difference
- 1 - AUC
  - 0 being the perfect predictor & best sex separation
Method  X-Inact.
Raw     0.0947
Dasen   0.0889
Tost    0.0892
Swan    0.4952
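
The 1 - AUC figure can be illustrated with a short Python sketch. It assumes the per-probe t-test p-values for sex differences have already been computed, and asks how well they separate X-chromosome probes from the rest. The function name, the pairwise AUC calculation and the toy values are assumptions; wateRmelon's implementation is not reproduced here.

```python
def one_minus_auc(p_values, on_x):
    """1 - AUC for predicting X-chromosome membership from sex-difference p-values.

    Smaller p-values are treated as stronger evidence that a probe is on X.
    """
    x_p = [p for p, x in zip(p_values, on_x) if x]
    other_p = [p for p, x in zip(p_values, on_x) if not x]
    # AUC = probability that an X probe has the smaller p-value; ties count one half.
    wins = sum((px < po) + 0.5 * (px == po) for px in x_p for po in other_p)
    return 1.0 - wins / (len(x_p) * len(other_p))


# Made-up p-values: two X-chromosome probes and two autosomal probes.
print(one_minus_auc([0.01, 0.5, 0.2, 0.7], [True, True, False, False]))
```

The pairwise comparison is quadratic in the number of probes, which is fine for a demonstration but would be replaced by a rank-based calculation for a full array.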

17 Comparison: Density Plots
- Metrics are great, but how do they really affect the data?
[Figure: density plots; panels: All, Type I, Type II]

18 Comparison: Density Plots
- Normalised distributions
[Figure: density plots; panels: All, Type I, Type II]

19 Comparison: Scatter Plot
- Pepsi Plot: you'll see why!
- Raw (x) vs. Normalised (y)
[Figure: scatter plots for Type I and Type II probes; panels: SWAN, Tost, dasen]

20 Comparison: Scatter Plot

21 Batch Correction: Exp. Design
- Bisulphite Conversion
  - Excess of samples > 48
  - Redundant controls
  - QC and PCR
- MSA4 Plate
  - Well dictates chip position (Robot)
  - Randomised
  - Min. 4 of each time point
  - Max 1 control
  - Mix of gender
- Infinium 450k Chips
  - 12 arrays per chip
  - Throughput doubled

22 Batch Correction: Metadata
- LIMS tracking
  - Every process
  - All consumables
    - ~20, from Formamide to hyb. buffers
    - > 1000 used so far!
  - All equipment
    - Fridge/centrifuge/PCR block

23 Batch Correction
- What are we seeing?
  - Bisulphite batch
- Correction
  - Many algorithms available
    - SVD/SVA/DWD
    - Gene expression
- ComBat
  - Chen C, Grennan K, Badner J, Zhang D, Gershon E, et al. (2011) Removing Batch Effects in Analysis of Expression Microarray Data: An Evaluation of Six Batch Adjustment Methods. PLoS ONE 6(2): e17238. doi:10.1371/journal.pone.0017238
  - Empirical Bayesian framework
  - Create a model matrix
  - Supply batch variable
  - Standardise gene-wise
    - Least squares approach
  - Fits L/S model, find priors
  - Adjust to empirical parametric priors
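
To make the location/scale (L/S) idea concrete, here is a deliberately simplified Python sketch for a single probe. It aligns each batch's mean and SD to pooled values and stops there: it has no covariates and none of the empirical Bayes shrinkage towards priors that distinguishes ComBat, so it should be read as an illustration of the L/S step only. The function name and the toy values are assumptions.

```python
from math import sqrt
from statistics import mean, pstdev


def location_scale_adjust(values, batches):
    """Align each batch's mean and SD for one probe to the pooled mean and within-batch SD."""
    grand_mean = mean(values)
    batch_ids = sorted(set(batches))
    stats = {}
    for batch in batch_ids:
        vals = [v for v, b in zip(values, batches) if b == batch]
        stats[batch] = (mean(vals), pstdev(vals))
    # Pooled within-batch SD, estimated from residuals around each batch mean.
    residuals = [v - stats[b][0] for v, b in zip(values, batches)]
    pooled_sd = sqrt(mean(r * r for r in residuals))
    adjusted = []
    for v, b in zip(values, batches):
        b_mean, b_sd = stats[b]
        standardised = (v - b_mean) / b_sd if b_sd > 0 else 0.0
        adjusted.append(grand_mean + pooled_sd * standardised)
    return adjusted


# Made-up M values for one probe measured in two bisulphite batches.
print(location_scale_adjust([1.0, 3.0, 11.0, 13.0], ["A", "A", "B", "B"]))
```

With only a handful of samples per batch the per-batch estimates are noisy, which is the motivation for borrowing information across probes through priors in the empirical Bayesian framework.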

24 Batch Correction
- Example data
- Batch correct Tost-normalised data
  - Use M values
  - Convert back to β
- Values can escape the 0-1 limit
  - Scale 0.02% of probes
  - Distribution unaffected

25 Batch Correction: BEFORE

26 Batch Correction: AFTER

27 Datasets
ARIES pre-release:
- Filtered probes
- SNP probes
Age group   n
Cord        584
F7          598
TF3 (15)    64
F17         280
Antenatal   394
FOM         329

28 MWAS
- Choice of servers:
  - Epi-garrod
  - BlueCrystal

29 Epi-garrod
- Request account via IT-services for: epi-garrod.bris.ac.uk
- Relatively quiet server in the dept.
- No queuing system
  - Check htop before running jobs
  - Cord data requires ~15% RAM

30 Epi-garrod
- Data: SAN
  - Accessible from multiple servers
  - /mnt/sscm3/ARIES_DATA/…
- Permissions for this folder
  - You must be a member of the aries group

31 Blue Crystal
- Request an account via: https://www.acrc.bris.ac.uk/login-area/apply.cgi
- Queuing handled
- Data: /gpfs/cluster/smed/alspac-shared/aries/…
- Again, permissions required: member of aries group

32 Files
- ALN_dasen_ >_betas.Rdata
- ALN_tost_ >_betas.Rdata
- >_manifest.Rdata
- fdata.Rdata
- MWAS.r

33 ALN_dasen_ >_betas.Rdata

34 >_manifest.Rdata

35 Fdata_new.RData

36 CpGassoc
- CRAN: http://cran.r-project.org/web/packages/CpGassoc/index.html
- Tests for association between an independent variable and methylation
- Option to include additional covariates
- Assesses significance with:
  - Holm (step-down Bonferroni)
  - FDR methods
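
The two kinds of multiple-testing correction named above can be sketched in Python as follows. These are the textbook Holm step-down and Benjamini-Hochberg procedures written out for illustration; the function names are assumptions, and this is not CpGassoc's own code or interface.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # smallest p-value first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiply by the number of hypotheses still in play, then enforce monotonicity.
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted


def bh_adjust(p_values):
    """Benjamini-Hochberg FDR-adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)  # largest first
    adjusted = [0.0] * m
    running_min = 1.0
    for k, i in enumerate(order):
        rank = m - k  # 1-based rank of this p-value in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


p = [0.01, 0.04, 0.03, 0.005]  # made-up p-values
print(holm_adjust(p))
print(bh_adjust(p))
```

Holm controls the family-wise error rate and is the stricter of the two; the FDR adjustment is usually the more permissive choice when hundreds of thousands of CpG sites are tested.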

37 MWAS.r

38 MWAS.r continued...

39 MWAS.r continued...

40 Manhattan / QQ
- Replicated the results of the following study:
  - "450K Epigenome-Wide Scan Identifies Differential DNA Methylation in Newborns Related to Maternal Smoking during Pregnancy." Bonnie R. Joubert et al.
- Gene hits: GFI1, AHRR, MYO1G, CYP1A1
- "CYP1A1 plays a key role in the aryl hydrocarbon receptor signaling pathway, which mediates the detoxification of the components of tobacco smoke." - Joubert et al.

41 Results file

42 BlueCrystal.bashrc

43 Any Questions?