Biopop, Seoul National University


1 Han Kyung University GWAS (Practice)
Hyeon soo Jeong, December 2014
Biopop, Seoul National University

2 Contents
  Basic statistics
  Linux system
  PLINK
  Quality control
  Linear regression
  Manhattan plot
  Kinship analysis
  IBS matrix using EMMAX
  Mixed linear model for dealing with inbreeding
  QQ-plot comparison

3 Contents Basic Statistics

4 Basic Statistics
Regression analysis: y = Xβ + ε
(y: phenotype, X: genotype, β: effect size, ε: residual error)
Conall M. O'Seaghdha & Caroline S. Fox, 2012, Nat Rev Nephrol

5 Basic Statistics
Hypothesis
Null hypothesis: β = 0
Alternative hypothesis: β ≠ 0 (two-tailed)
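
To make the test of β = 0 concrete, here is a minimal sketch of a single-SNP regression in plain Python. It assumes the genotype is coded as the minor-allele count (0, 1, 2); the data values and the function name `simple_linear_regression` are made up for illustration and are not from the slides. It returns the slope, its standard error, and the t statistic (β divided by its standard error) that a two-tailed test would compare against a t distribution with n - 2 degrees of freedom.

```python
import math


def simple_linear_regression(x, y):
    """Fit y = b0 + b1 * x by ordinary least squares.

    Returns (b0, b1, se_b1, t), where t = b1 / se_b1 is the statistic
    used to test the null hypothesis b1 = 0.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Residual sum of squares, then the standard error of the slope
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(rss / (n - 2) / sxx)
    return b0, b1, se_b1, b1 / se_b1


# Hypothetical data: minor-allele counts and a made-up phenotype
genotype = [0, 0, 1, 1, 2, 2]
phenotype = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]

b0, b1, se_b1, t = simple_linear_regression(genotype, phenotype)
print("beta =", b1, "SE =", se_b1, "t =", t)
```

A large absolute t value is evidence against the null hypothesis; turning it into a p-value requires the t distribution, which is left out here to keep the sketch dependency-free.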

6 Basic Statistics Type I and Type II error

7 Contents Linux system

8 Linux system Ubuntu LTS Password :

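9 Linux system
Linux commands (Ubuntu)
Command     Function
cd <dir>    Change to the directory <dir>
ls          List files and folders
ll          List files and folders in more detail than ls
mkdir       Create a directory
cp          Copy a file (use cp -r to copy a folder)
mv          Move a file or a directory (mv has no -r option)
rm          Delete a file
rm -r       Delete a directory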

10 Data Structure
Pedigree file (PED file)
Column:  1    2    3       4       5    6          7 ~~
Field:   FID  IID  father  mother  sex  phenotype  genotype ~~
Fam1 Ind G T
Fam2 Ind G G
Fam2 Ind G T
Fam2 Ind T T
Fam3 Ind T T
Fam4 Ind G T

11 Data management
MAP file
Column:  1    2       3                         4
Field:   chr  SNP_ID  recom (genetic distance)  position
1 SNP 1 SNP 1 SNP 2 SNP4 0 13 2 SNP 2 SNP

12 Data management
PED & MAP
MAP:
1 SNP1 0 13435
1 SNP2 0 25256
1 SNP3 0 28242
PED:
Fam1 Ind G T A A G G
Fam2 Ind G G A A G T
Fam2 Ind G T T A G G
Fam2 Ind T T A A T T
Fam3 Ind T T T T T T
Fam4 Ind G T A A T T
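
How the two files line up can be sketched with a small parser: the n-th allele pair in a PED row belongs to the n-th row of the MAP file. The MAP rows below are the ones shown on the slide. The individual IDs, parents, sex, and phenotype fields in the PED rows are placeholders invented for this example, since those values are not shown in the transcript, and the helper names `parse_map` and `parse_ped` are hypothetical.

```python
def parse_map(lines):
    """Return a list of (chromosome, snp_id, genetic_distance, position)."""
    snps = []
    for line in lines:
        chrom, snp_id, dist, pos = line.split()
        snps.append((chrom, snp_id, float(dist), int(pos)))
    return snps


def parse_ped(lines, n_snps):
    """Return one dict per individual; genotypes are (allele1, allele2) pairs."""
    individuals = []
    for line in lines:
        fields = line.split()
        alleles = fields[6:]  # columns 1-6 are FID, IID, father, mother, sex, phenotype
        if len(alleles) != 2 * n_snps:
            raise ValueError("expected %d alleles, got %d" % (2 * n_snps, len(alleles)))
        genotypes = [(alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)]
        individuals.append({
            "fid": fields[0], "iid": fields[1],
            "father": fields[2], "mother": fields[3],
            "sex": fields[4], "phenotype": fields[5],
            "genotypes": genotypes,
        })
    return individuals


map_lines = ["1 SNP1 0 13435", "1 SNP2 0 25256", "1 SNP3 0 28242"]
# Columns 2-6 below are placeholder values, not data from the slides
ped_lines = [
    "Fam1 Ind1 0 0 1 -9 G T A A G G",
    "Fam2 Ind2 0 0 2 -9 G G A A G T",
]

snps = parse_map(map_lines)
people = parse_ped(ped_lines, len(snps))
for person in people:
    for (chrom, snp_id, _, pos), gt in zip(snps, person["genotypes"]):
        print(person["fid"], person["iid"], snp_id, pos, "".join(gt))
```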

13 Contents PLINK : Whole Genome Data Analysis Toolkit

14 PLINK ( Download & install )
Program download

15 PLINK ( Download & install )
Program download $ wget

16 PLINK ( Download & install )
installation $ unzip plink-1.07-x86_64.zip

17 PLINK ( Download & install )
$ wget
$ unzip plink-1.07-x86_64.zip
$ cd plink-1.07-x86_64
$ sudo cp plink /usr/local/bin/

18 Data management
How to convert a PED file to a BED file?
$ plink --cow --file Testfile --make-bed --out TestData --noweb
.bed : genotype information (binary; cannot be opened as text)
.fam : family information (columns 1~6 of the PED file)
.bim : SNP information (MAP file + allele types)

19 Data management Various commands for data transformation

20 Contents Quality Control

21 Quality control
Hardy-Weinberg Equilibrium
$ plink --noweb --cow --bfile TestData --hardy2 --out TestData_hardy
$ vi TestData_hardy.hwe
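
What a Hardy-Weinberg check compares can be sketched as follows: observed genotype counts against the counts expected from the allele frequencies (p², 2pq, q²). This example uses a plain chi-square statistic and made-up counts; PLINK's reported p-values may come from a different (exact) test, so treat this as an illustration of the idea rather than a reproduction of the `.hwe` output. The function name `hwe_chi_square` is hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (expected_counts, chi_square) for one biallelic SNP.

    n_aa, n_ab, n_bb are the observed counts of the three genotypes.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2.0 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return expected, chi2


# Hypothetical counts: fewer heterozygotes than expected
expected, chi2 = hwe_chi_square(30, 40, 30)
print("expected:", expected)
print("chi-square (1 df):", chi2)
```

With one degree of freedom, a statistic above about 3.84 corresponds to p < 0.05.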

22 Quality control
Hardy-Weinberg Equilibrium filtering
$ plink --noweb --cow --bfile TestData --hwe <threshold> --make-bed --out TestData_hwe

23 Quality control Minor allele frequency
$ plink --cow --bfile TestData --freq --out TestData_Freq --noweb
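
The quantity reported by `--freq` can be computed by hand for a single SNP, as in this minimal sketch. The genotypes are the SNP1 column of the PED example earlier in the slides; the function name `minor_allele_frequency` is hypothetical, and the missing-allele code "0" is assumed to follow the PLINK default.

```python
from collections import Counter


def minor_allele_frequency(genotypes, missing="0"):
    """Return (minor_allele, frequency) for one SNP.

    genotypes is a list of (allele1, allele2) pairs; alleles equal to
    `missing` are skipped. Assumes a biallelic SNP.
    """
    counts = Counter()
    for pair in genotypes:
        for allele in pair:
            if allele != missing:
                counts[allele] += 1
    total = sum(counts.values())
    if total == 0:
        return None, None  # every genotype is missing
    if len(counts) == 1:
        return None, 0.0  # monomorphic: no minor allele observed
    minor, minor_count = min(counts.items(), key=lambda kv: kv[1])
    return minor, minor_count / float(total)


# SNP1 genotypes of the six individuals in the PED example
snp1 = [("G", "T"), ("G", "G"), ("G", "T"), ("T", "T"), ("T", "T"), ("G", "T")]
minor, maf = minor_allele_frequency(snp1)
print(minor, maf)
```

Here G appears 5 times out of 12 alleles, so G is the minor allele with a frequency of 5/12.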

24 Quality control
Minor allele frequency filtering
$ plink --cow --bfile TestData --maf <threshold> --make-bed --out TestData_maf --noweb

25 Quality control
Missing data filtering
$ plink --cow --bfile TestData --missing --out TestData_missing --noweb
Output files: .imiss (missing rate per individual), .lmiss (missing rate per SNP)
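
The two missing-rate summaries differ only in the direction of averaging: per individual (across SNPs) and per SNP (across individuals). A minimal sketch, assuming missing alleles are coded "0" as in PLINK's default PED convention; the genotype values and the function name `missing_rates` are made up for illustration.

```python
def missing_rates(genotype_rows, missing="0"):
    """Return (per_individual, per_snp) missing-genotype rates.

    genotype_rows holds one list of (allele1, allele2) pairs per individual;
    every individual must have the same number of SNPs.
    """
    n_ind = len(genotype_rows)
    n_snp = len(genotype_rows[0])
    is_missing = [
        [a1 == missing or a2 == missing for a1, a2 in row]
        for row in genotype_rows
    ]
    per_individual = [sum(row) / float(n_snp) for row in is_missing]
    per_snp = [
        sum(is_missing[i][j] for i in range(n_ind)) / float(n_ind)
        for j in range(n_snp)
    ]
    return per_individual, per_snp


# Hypothetical data: 3 individuals x 2 SNPs
rows = [
    [("G", "T"), ("0", "0")],
    [("G", "G"), ("A", "A")],
    [("0", "0"), ("0", "0")],
]
per_individual, per_snp = missing_rates(rows)
print("per individual:", per_individual)
print("per SNP:", per_snp)
```

An individual filter (`--mind`) acts on the first list and a genotype filter (`--geno`) on the second.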

26 Quality control
--geno : genotype (per-SNP) filter (missing > 0.9)
--mind : individual filter (missing > 0.9)
$ plink --cow --bfile TestData --geno <threshold> --make-bed --out TestData_geno --noweb
$ plink --cow --bfile TestData --mind <threshold> --make-bed --out TestData_mind --noweb
$ plink --cow --bfile TestData --mind <threshold> --geno <threshold> --maf <threshold> --hwe <threshold> --make-bed --out 1_QC_TestData --noweb

27 Quality control

28 Contents Linear Regression Analysis

29 Linear Regression Analysis
##Linear model
$ plink --bfile 1_QC_TestData --pheno Data/nBFzH.pheno --linear --ci <level> --out 2_LM_QC_TestData --noweb --cow
$ plink --bfile 1_QC_TestData --pheno Data/nBFzH.pheno --linear --ci <level> --dominant --out 2_LM_QC_dom_TestData --noweb --cow
$ plink --bfile 1_QC_TestData --pheno Data/nBFzH.pheno --linear --ci <level> --recessive --out 2_LM_QC_rec_TestData --noweb --cow
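
The three runs differ in how a genotype is turned into the predictor of the regression. A minimal sketch of the usual coding, expressed in terms of the minor allele: additive counts copies (0, 1, 2), dominant asks "at least one copy?", recessive asks "two copies?". The function names and the choice of T as the minor allele are assumptions made for this example.

```python
def minor_allele_count(genotype, minor):
    """Count copies of the minor allele in an (allele1, allele2) pair."""
    return sum(1 for allele in genotype if allele == minor)


def encode(count, model="additive"):
    """Map a minor-allele count (0, 1, 2) to the regression predictor."""
    if model == "additive":
        return count
    if model == "dominant":
        return 1 if count >= 1 else 0
    if model == "recessive":
        return 1 if count == 2 else 0
    raise ValueError("unknown model: %s" % model)


# Assume T is the minor allele at this hypothetical SNP
for gt in [("G", "G"), ("G", "T"), ("T", "T")]:
    c = minor_allele_count(gt, "T")
    print("".join(gt), encode(c, "additive"), encode(c, "dominant"), encode(c, "recessive"))
```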

30 Linear Regression Analysis

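31 Linear Regression Analysis
Convert "space" delimiters to "tab" delimiters
#1.LM_Convert.py
#!/usr/bin/python
# Convert the space-padded PLINK output into a tab-delimited file (QQinput.txt).
try:
    input = raw_input  # Python 2: use raw_input so the file name is read as text
except NameError:
    pass               # Python 3: input() already returns text

inputs = input("input file : ")
with open(inputs, 'r') as data, open('QQinput.txt', 'w') as outf:
    for line in data:
        # split() with no argument drops the empty fields produced by runs of spaces
        fields = line.split()
        outf.write('\t'.join(fields) + '\n')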

32 Linear Regression Analysis
Convert "space" delimiters to "tab" delimiters
#1.LM_Convert.py
$ python 1.LM_Convert.py

33 Manhattan plot
#2.Manhattan.R
Data <- read.table("QQinput.txt", sep = '\t', header = T)     # file import
library(gap)                                                  # import gap library
png("Manhattan.png", height = 6000, width = 7000, res = 900)  # Save as png format
mhtdata <- with(Data, cbind(CHR, BP, P))                      # Select columns
color <- rep(c("blue", "red"), 15)                            # Manhattan plot color
par(cex = 0.5)
ops <- mht.control(colors = color, yline = 1.5, xline = 1)
mhtplot(mhtdata, ops, pch = 25)
axis(2, pos = 1, at = 1:25)
bon <- -log10(1.0E-05)
abline(h = bon, col = "black")
abline(h = 0)
title("Main title", cex.main = 2.5)
dev.off()

34 Manhattan plot
#2.Manhattan.R
$ Rscript 2.Manhattan.R
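
The y-axis of a Manhattan plot is -log10(P), so the horizontal line drawn by the R script at P = 1.0E-05 sits at a height of 5. The sketch below reproduces that arithmetic and lists which SNPs would fall above the line. The slides do not say how 1.0E-05 was chosen; the Bonferroni helper is shown only as one common way to pick a threshold, and the SNP names, p-values, and the test count of 50000 are hypothetical.

```python
import math


def neg_log10(p):
    """Height of a SNP on the Manhattan plot for p-value p (p must be > 0)."""
    return -math.log10(p)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / float(n_tests)


# Height of the horizontal line in the R script (P = 1.0E-05)
line_height = neg_log10(1.0e-05)

# Hypothetical (SNP, P) pairs
results = [("SNP1", 0.2), ("SNP2", 3.0e-07), ("SNP3", 4.0e-04)]
above_line = [snp for snp, p in results if neg_log10(p) > line_height]

print("line height:", line_height)
print("SNPs above the line:", above_line)
print("Bonferroni threshold for 50000 tests:", bonferroni_threshold(0.05, 50000))
```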

35 Manhattan plot

36 Contents Mixed Linear Model

37 Mixed Linear Model QQ-plot #3.QQplot.R Input: QQinput.txt
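
A QQ-plot pairs each observed -log10(P) with the value expected if all p-values were uniformly distributed (no association anywhere). A minimal sketch of how those pairs can be built, assuming the common plotting position (i - 0.5) / n for the i-th smallest p-value; the actual `3.QQplot.R` script is not shown in the slides and may use a different convention. The function name `qq_points` and the p-values are made up.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10(P) pairs, sorted ascending.

    All p-values must be greater than 0.
    """
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    # Expected quantiles of uniform p-values, on the same -log10 scale
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


# Hypothetical p-values
p_values = [0.8, 0.03, 0.45, 0.0002, 0.6]
for exp, obs in qq_points(p_values):
    print(round(exp, 3), round(obs, 3))
```

Points that rise above the diagonal across most of the range suggest inflation from population structure or relatedness rather than true association.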

38 Mixed Linear Model QQ-plot
Are the significant loci real?

39 Mixed Linear Model Kinship Analysis
$ wget

40 Mixed Linear Model
Kinship Analysis and its problem
Identical twins
Family
$ king -b 1_QC_TestData.bed --kinship

41 Mixed Linear Model EMMAX

42 Mixed Linear Model IBS Matrix using EMMAX
$ emmax-kin-intel64 -v -s -M 1 -d 10 3_Trans_QC_TestData
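
The idea behind an identity-by-state (IBS) matrix can be sketched in a few lines: for each pair of individuals, average over SNPs how many alleles they share. This example assumes genotypes coded as minor-allele counts (0, 1, 2) with `None` for missing, and scores each SNP as 1 - |g1 - g2| / 2. It illustrates the concept only; EMMAX's exact computation and file format may differ, and the function names and data are hypothetical.

```python
def ibs_similarity(g1, g2):
    """Mean IBS sharing between two individuals (1.0 = identical genotypes)."""
    total, used = 0.0, 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip SNPs where either genotype is missing
        total += 1.0 - abs(a - b) / 2.0
        used += 1
    return total / used if used else float("nan")


def ibs_matrix(genotypes):
    """Symmetric matrix of pairwise IBS similarities."""
    n = len(genotypes)
    return [[ibs_similarity(genotypes[i], genotypes[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical genotypes: 3 individuals x 4 SNPs
genotypes = [
    [0, 1, 2, 1],
    [0, 1, 2, None],
    [2, 1, 0, 1],
]
matrix = ibs_matrix(genotypes)
for row in matrix:
    print([round(v, 3) for v in row])
```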

43 Mixed Linear Model Mixed Linear Model using EMMAX
$ emmax-intel64 -v -d 10 -t 3_Trans_QC_TestData -p Data/nBFzH.pheno -k 3_Trans_QC_TestData.aIBS.kinf -o 4_MLL_QC_TestData

44 Contents QQ-plot Comparison (PLINK vs EMMAX)

45 QQ-plot Comparison PLINK vs. EMMAX

46 Contents Functional Analysis

47 Functional Analysis Coremine

48 Functional Analysis DAVID

49 THANK YOU

