Biopop, Seoul National University

1 Biopop, Seoul National University
Han Kyung University GWAS (Practice)
Hyeon soo Jeong, December 2014, Biopop, Seoul National University

2 Contents
Basic statistics
Linux system
PLINK
Quality control
Linear regression
Manhattan plot
Kinship analysis
IBS matrix using EMMAX
Mixed linear model for dealing with inbreeding
QQ-plot comparison

3 Contents Basic Statistics

4 Basic Statistics Regression analysis: y = Xβ + ε
Conall M. O'Seaghdha & Caroline S. Fox, 2012, Nat Rev Nephrol

5 Basic Statistics Hypothesis
Null hypothesis: β = 0
Alternative hypothesis: β ≠ 0 (two-tailed)
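The regression model and the two-sided test of β = 0 can be sketched for a single SNP as follows. This is a minimal illustration, not PLINK's implementation: the genotype and phenotype values are made up, and the p-value uses a normal approximation to the t distribution (an assumption that is only reasonable for large samples).

```python
import math
from statistics import NormalDist


def simple_regression(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b1, se, t, p)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    # Residual sum of squares, then the standard error of the slope
    rss = sum((yi - intercept - beta * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    t_stat = beta / se
    # Two-sided p-value for H0: beta = 0 (normal approximation, assumption)
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return beta, se, t_stat, p_value


# Hypothetical data: minor-allele counts (0/1/2) and a quantitative trait
genotypes = [0, 0, 1, 1, 2, 2, 1, 0]
phenotypes = [1.0, 1.2, 1.9, 2.1, 3.2, 2.8, 2.0, 0.9]
beta, se, t_stat, p_value = simple_regression(genotypes, phenotypes)
print("beta=%.3f se=%.3f t=%.2f p=%.3g" % (beta, se, t_stat, p_value))
```

A small p-value leads to rejecting the null hypothesis β = 0 for that SNP.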

6 Basic Statistics Type I and Type II error
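Because a GWAS tests many SNPs, the Type I error rate of one test understates the chance of at least one false positive across the whole scan. A minimal sketch, assuming independent tests (linked SNPs violate this) and illustrative values for alpha and the number of SNPs:

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one Type I error among n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise rate near alpha."""
    return alpha / n_tests


alpha = 0.05       # illustrative per-test level
n_snps = 50000     # hypothetical number of SNPs tested
print("family-wise error at alpha:", familywise_error(alpha, n_snps))
print("Bonferroni per-test level:", bonferroni_threshold(alpha, n_snps))
```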

7 Contents Linux system

8 Linux system Ubuntu LTS Password :

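9 Linux system Linux commands (Ubuntu)
Command     Function
cd <dir>    Change to directory <dir>
ls          List files or folders
ll          List files or folders with more detail than ls
mkdir       Create a directory
cp          Copy a file (use cp -r to copy a folder)
mv          Move a file or directory (mv has no -r option)
rm          Delete a file
rm -r       Delete a directory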

10 Data Structure Pedigree file (PED file)
Columns: 1 FID, 2 IID, 3 father, 4 mother, 5 sex, 6 phenotype, 7~ genotype
Fam1 Ind G T
Fam2 Ind G G
Fam2 Ind G T
Fam2 Ind T T
Fam3 Ind T T
Fam4 Ind G T
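The PED layout (six leading columns, then two alleles per SNP) can be read with a few lines of Python. This is a sketch only: the function name is hypothetical, and the example line uses made-up values (`Ind1`, `0 0 1 -9`) for the fields that are not legible in the slide.

```python
def parse_ped_line(line):
    """Split one PED line into its six fixed fields and a list of allele pairs."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError("a PED line needs at least 6 columns")
    fid, iid, father, mother, sex, phenotype = fields[:6]
    alleles = fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("genotype columns must come in pairs")
    # Group the remaining columns two by two: one pair per SNP
    genotypes = [(alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)]
    return {
        "fid": fid, "iid": iid, "father": father, "mother": mother,
        "sex": sex, "phenotype": phenotype, "genotypes": genotypes,
    }


# Hypothetical line: family ID, individual ID, parents, sex, phenotype, then 3 SNPs
record = parse_ped_line("Fam1 Ind1 0 0 1 -9 G T A A G G")
print(record["fid"], record["iid"], record["genotypes"])
```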

11 Data management MAP file
Columns: 1 chr, 2 SNP_ID, 3 recom (genetic distance), 4 position (base pair)
1 SNP 1 SNP 1 SNP 2 SNP4 0 13 2 SNP 2 SNP

12 Data management PED & MAP
MAP:
1 SNP1 0 13435
1 SNP2 0 25256
1 SNP3 0 28242
PED:
Fam1 Ind G T A A G G
Fam2 Ind G G A A G T
Fam2 Ind G T T A G G
Fam2 Ind T T A A T T
Fam3 Ind T T T T T T
Fam4 Ind G T A A T T
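The MAP and PED files are linked by SNP order: the n-th MAP row describes the n-th allele pair in every PED row. A small sketch using the three MAP rows from this slide (the function names are hypothetical):

```python
MAP_TEXT = """1 SNP1 0 13435
1 SNP2 0 25256
1 SNP3 0 28242"""


def parse_map(text):
    """Parse MAP rows: chromosome, SNP ID, genetic distance, base-pair position."""
    snps = []
    for line in text.splitlines():
        if not line.strip():
            continue
        chrom, snp_id, distance, position = line.split()
        snps.append({"chr": chrom, "snp": snp_id,
                     "distance": float(distance), "bp": int(position)})
    return snps


def expected_ped_columns(n_snps):
    """A PED row has 6 fixed columns plus 2 allele columns per SNP."""
    return 6 + 2 * n_snps


snps = parse_map(MAP_TEXT)
print([s["snp"] for s in snps], "-> PED columns:", expected_ped_columns(len(snps)))
```

Checking the column count this way is a quick test that a PED/MAP pair belongs together before running PLINK.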

13 Contents PLINK : Whole Genome Data Analysis Toolkit

14 PLINK ( Download & install )
Program download

15 PLINK ( Download & install )
Program download $ wget

16 PLINK ( Download & install )
installation $ unzip

17 PLINK ( Download & install )
$ wget
$ unzip
$ cd plink-1.07-x86_64
$ sudo cp plink /usr/local/bin/

18 Data management How to convert a ped file to a bed file?
$ plink --cow --file Testfile --make-bed --out TestData --noweb
.bed : genotype information (binary; cannot be opened as text)
.fam : family information (ped file columns 1~6)
.bim : SNP information (map file + allele types)

19 Data management Various commands for transforming data

20 Contents Quality Control

21 Quality control Hardy-Weinberg Equilibrium
$ plink --noweb --cow --bfile TestData --hardy2 --out TestData_hardy
$ vi TestData_hardy.hwe
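To see what a Hardy-Weinberg test measures, the sketch below compares observed genotype counts with the counts expected from the allele frequency. It uses a 1-df chi-square test for simplicity, which is an assumption made for illustration; PLINK's own test may differ (for example, an exact test), so the numbers need not match the `.hwe` file. The genotype counts are made up.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions (both alleles must be present)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a 1-df chi-square: P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


print(hwe_chi_square(25, 50, 25))   # counts in exact HWE proportions
print(hwe_chi_square(50, 0, 50))    # no heterozygotes at all
```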

22 Quality control Hardy-Weinberg Equilibrium filtering
$ plink --noweb --cow --bfile TestData --hwe <threshold> --make-bed --out TestData_hwe

23 Quality control Minor allele frequency
$ plink --cow --bfile TestData --freq --out TestData_Freq --noweb
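The minor allele frequency is the frequency of the less common allele at a SNP. A minimal sketch, using the six SNP1 genotypes from slide 12 and assuming `0` marks a missing allele (the function name is hypothetical):

```python
from collections import Counter


def minor_allele_frequency(genotypes, missing="0"):
    """Return (minor allele, frequency) from a list of allele pairs."""
    counts = Counter(a for pair in genotypes for a in pair if a != missing)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no called alleles")
    if len(counts) < 2:
        return None, 0.0          # monomorphic SNP: no minor allele
    allele, n = min(counts.items(), key=lambda item: item[1])
    return allele, n / total


# SNP1 genotypes of the six individuals on slide 12
snp1 = [("G", "T"), ("G", "G"), ("G", "T"), ("T", "T"), ("T", "T"), ("G", "T")]
allele, maf = minor_allele_frequency(snp1)
print(allele, round(maf, 4))
```

Here G appears 5 times out of 12 called alleles, so it is the minor allele.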

24 Quality control Minor allele frequency filtering
$ plink --cow --bfile TestData --maf <threshold> --make-bed --out TestData_maf --noweb

25 Quality control Missing data filtering
$ plink --cow --bfile TestData --missing --out TestData_missing --noweb
Output files: .imiss (missing rate per individual), .lmiss (missing rate per SNP)

26 Quality control
--geno : genotype filter (missing > 0.9)
--mind : individual filter (missing > 0.9)
$ plink --cow --bfile TestData --geno <threshold> --make-bed --out TestData_geno --noweb
$ plink --cow --bfile TestData --mind <threshold> --make-bed --out TestData_mind --noweb
$ plink --cow --bfile TestData --mind <threshold> --geno <threshold> --maf <threshold> --hwe <threshold> --make-bed --out 1_QC_TestData --noweb
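The two missingness filters work on different axes of the genotype matrix: one drops SNPs (columns), the other drops individuals (rows). The sketch below computes both rates on the full matrix and keeps entries whose missing rate does not exceed a cutoff. The matrix, the cutoff values and the names `geno_max` and `mind_max` are hypothetical, and the order in which PLINK applies its filters is not modelled.

```python
def missing_rates(matrix, missing="0 0"):
    """Return (per-individual, per-SNP) missing rates for a rows-by-SNPs matrix."""
    n_ind = len(matrix)
    n_snp = len(matrix[0])
    per_ind = [sum(g == missing for g in row) / n_snp for row in matrix]
    per_snp = [sum(row[j] == missing for row in matrix) / n_ind
               for j in range(n_snp)]
    return per_ind, per_snp


# Hypothetical genotypes: 3 individuals x 3 SNPs, "0 0" marks a missing call
matrix = [
    ["G T", "0 0", "G G"],
    ["G G", "A A", "0 0"],
    ["0 0", "0 0", "T T"],
]
per_ind, per_snp = missing_rates(matrix)

geno_max = 0.5   # illustrative SNP cutoff
mind_max = 0.5   # illustrative individual cutoff
kept_snps = [j for j, rate in enumerate(per_snp) if rate <= geno_max]
kept_inds = [i for i, rate in enumerate(per_ind) if rate <= mind_max]
print("kept SNP columns:", kept_snps, "kept individuals:", kept_inds)
```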

27 Quality control

28 Contents Linear Regression Analysis

29 Linear Regression Analysis
## Linear model
plink --bfile 1_QC_TestData --pheno Data/nBFzH.pheno --linear --ci <level> --out 2_LM_QC_TestData --noweb --cow
plink --bfile 1_QC_TestData --pheno Data/nBFzH.pheno --linear --ci <level> --dominant --out 2_LM_QC_dom_TestData --noweb --cow
plink --bfile 1_QC_TestData --pheno Data/nBFzH.pheno --linear --ci <level> --recessive --out 2_LM_QC_rec_TestData --noweb --cow
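The three runs differ only in how a genotype is turned into the regression predictor. A sketch of the usual coding, assuming it is expressed in copies of the minor allele (the function name and example genotypes are hypothetical):

```python
def encode_genotype(genotype, minor_allele, model="additive"):
    """Code one genotype (a pair of alleles) as a regression predictor."""
    copies = sum(a == minor_allele for a in genotype)
    if model == "additive":
        return copies                      # 0, 1 or 2 copies
    if model == "dominant":
        return 1 if copies >= 1 else 0     # one copy is enough
    if model == "recessive":
        return 1 if copies == 2 else 0     # both copies required
    raise ValueError("unknown model: %s" % model)


for genotype in [("T", "T"), ("G", "T"), ("G", "G")]:
    codes = [encode_genotype(genotype, "G", m)
             for m in ("additive", "dominant", "recessive")]
    print(genotype, codes)
```

Comparing the three result files then shows which mode of inheritance fits a locus best.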

30 Linear Regression Analysis

31 Linear Regression Analysis
Convert space-delimited output to tab-delimited
#!/usr/bin/env python3
# PLINK pads its output columns with spaces; rewrite each row as tab-separated fields.
inputs = input("input file : ")
with open(inputs, 'r') as data, open('QQinput.txt', 'w') as outf:
    for line in data:
        # split() with no argument collapses runs of whitespace and drops empty fields
        outf.write('\t'.join(line.split()) + '\n')

32 Linear Regression Analysis
Convert space-delimited output to tab-delimited
$ python

33 Manhattan plot #2.Manhattan.R
Data <- read.table("QQinput.txt", sep = '\t', header = TRUE)   # file import
library(gap)                                                   # import gap library
png("Manhattan.png", height = 6000, width = 7000, res = 900)   # save as png format
mhtdata <- with(Data, cbind(CHR, BP, P))                       # select columns
color <- rep(c("blue", "red"), 15)                             # Manhattan plot colors
par(cex = 0.5)
ops <- mht.control(colors = color, yline = 1.5, xline = 1)
mhtplot(mhtdata, ops, pch = 25)
axis(2, pos = 1, at = 1:25)
bon <- -log10(1.0E-05)                                         # significance threshold line
abline(h = bon, col = "black")
abline(h = 0)
title("Main title", cex.main = 2.5)
dev.off()                                                      # close the device so the png is written
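The idea behind a Manhattan plot is independent of the plotting library: chromosomes are laid end to end on the x axis and each SNP is drawn at -log10(P). A minimal sketch of that coordinate transform, with made-up records and the same 1.0E-05 threshold as the script (the function name is hypothetical):

```python
import math


def manhattan_coordinates(records):
    """records: (chromosome, base-pair position, p-value) -> (x, -log10 p)."""
    # The largest position seen on each chromosome stands in for its length
    max_bp = {}
    for chrom, bp, _ in records:
        max_bp[chrom] = max(max_bp.get(chrom, 0), bp)
    # Offset of a chromosome = summed lengths of the chromosomes before it
    offsets, running = {}, 0
    for chrom in sorted(max_bp):
        offsets[chrom] = running
        running += max_bp[chrom]
    return [(offsets[chrom] + bp, -math.log10(p)) for chrom, bp, p in records]


# Hypothetical association results
records = [(1, 100, 0.1), (1, 200, 0.01), (2, 50, 1e-6)]
coords = manhattan_coordinates(records)
threshold = -math.log10(1.0e-05)
hits = [xy for xy in coords if xy[1] > threshold]
print(coords)
print("above threshold:", hits)
```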

34 Manhattan plot #2.Manhattan.R
$ Rscript 2.Manhattan.R

35 Manhattan plot

36 Contents Mixed Linear Model

37 Mixed Linear Model QQ-plot #3.QQplot.R Input: QQinput.txt

38 Mixed Linear Model QQ-plot Are the significant loci real?
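One way to judge whether apparent signals are real is to compare the observed p-values with what a null (uniform) distribution would give, and to summarize the departure with the genomic inflation factor lambda. The sketch below is an illustration under stated assumptions: p-values are treated as coming from 1-df tests, and the function names are hypothetical; it is not the content of `3.QQplot.R`.

```python
import math
from statistics import NormalDist, median


def qq_points(p_values):
    """Pairs of (expected, observed) -log10 p-values, both sorted ascending."""
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


def genomic_inflation(p_values):
    """Median observed 1-df chi-square divided by the null median."""
    nd = NormalDist()
    # A two-sided p-value maps back to a 1-df chi-square via z = inv_cdf(p / 2)
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in p_values]
    null_median = nd.inv_cdf(0.25) ** 2
    return median(chi2) / null_median


# Perfectly uniform p-values: points fall on the diagonal, lambda is close to 1
uniform_p = [(i - 0.5) / 1000 for i in range(1, 1001)]
print("lambda:", round(genomic_inflation(uniform_p), 3))
```

A lambda well above 1 suggests inflation from population structure or relatedness, which motivates the mixed model on the following slides.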

39 Mixed Linear Model Kinship Analysis
$ wget

40 Mixed Linear Model Kinship Analysis and its problem
Identical twins, Family
$ king -b 1_QC_TestData.bed --kinship

41 Mixed Linear Model EMMAX

42 Mixed Linear Model IBS Matrix using EMMAX
$ emmax-kin-intel64 -v -s -M 1 -d 10 3_Trans_QC_TestData
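An identity-by-state (IBS) matrix scores each pair of individuals by the share of alleles they have in common across SNPs. The sketch below shows the basic calculation on made-up allele counts; it is a simplified illustration, and the matrix EMMAX writes may differ in details such as missing-data handling.

```python
def ibs_matrix(dosages):
    """dosages[i][j] = allele count (0, 1 or 2) of individual i at SNP j."""
    n = len(dosages)
    m = len(dosages[0])
    kin = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a, n):
            # Two genotypes share 2 - |difference| alleles at each SNP
            shared = sum(2 - abs(x - y) for x, y in zip(dosages[a], dosages[b]))
            kin[a][b] = kin[b][a] = shared / (2 * m)
    return kin


# Hypothetical allele counts: individuals 0 and 1 are identical, 2 is distant
dosages = [
    [0, 1, 2],
    [0, 1, 2],
    [2, 1, 0],
]
for row in ibs_matrix(dosages):
    print([round(value, 3) for value in row])
```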

43 Mixed Linear Model Mixed Linear Model using EMMAX
$ emmax-intel64 -v -d 10 -t 3_Trans_QC_TestData -p Data/nBFzH.pheno -k 3_Trans_QC_TestData.aIBS.kinf -o 4_MLL_QC_TestData
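For reference, the kind of mixed model fitted at this step is commonly written as below. The notation is a standard textbook form assumed here, not taken from the slides: K is the kinship matrix passed with `-k`, and the two variance components are labeled with subscripts g (genetic) and e (residual).

```latex
\begin{aligned}
y &= X\beta + u + \varepsilon, \\
u &\sim N\!\left(0,\ \sigma_g^2 K\right), \qquad
\varepsilon \sim N\!\left(0,\ \sigma_e^2 I\right)
\end{aligned}
```

The random effect u absorbs similarity between related animals, so the test of β = 0 for each SNP is less inflated than in the plain linear regression.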

44 Contents QQ-plot Comparison (PLINK vs EMMAX)

45 QQ-plot Comparison PLINK vs. EMMAX

46 Contents Functional Analysis

47 Functional Analysis Coremine

48 Functional Analysis DAVID

