# Variable selection method for Boolean networks 2005. 08. 11 Ha Seong, Kim Bioinformatics & Biostatistics Lab., SNU.



Table of contents

- Introduction: Objective, Boolean networks, Drawback
- Method: GLM for binary data
- Result: Inference of gene regulatory networks, Computing time
- Discussion

INTRODUCTION

Objective: Introduce a variable selection method to reduce the computing time of Boolean network construction.

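Boolean networks

Time series microarray data:

| | Gene 1 | Gene 2 | … | Gene n |
|---|---|---|---|---|
| time 1 | 0.677 | 0.319 | … | 1.143 |
| time 2 | 0.542 | 1.321 | … | 0.648 |
| … | … | … | … | … |
| time m | 2.652 | 0.184 | … | 0.532 |

Binary data:

| | Gene 1 | Gene 2 | … | Gene n |
|---|---|---|---|---|
| time 1 | 0 | 0 | … | 1 |
| time 2 | 0 | 1 | … | 0 |
| … | … | … | … | … |
| time m | 1 | 0 | … | 0 |

Find Boolean functions:

- REVEAL algorithm
- Identification problem
- Consistency problem
- Best-Fit Extension problem

Boolean functions:

- f3 = G1 and G2
- f5 = G6
- f4 = G3 and not G5

Network structure: [Figure: network diagram with nodes G1 to G6]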
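The step from expression values to binary data can be sketched as follows. This is only an illustration: the slide does not say how the data were binarized, so the cut-off `THRESHOLD = 1.0` is an assumption, chosen because it reproduces the binary values shown for the excerpt.

```python
# Excerpt of the expression values shown on the slide
# (columns: Gene 1, Gene 2, Gene n; rows: time 1, time 2, time m).
expression = [
    [0.677, 0.319, 1.143],
    [0.542, 1.321, 0.648],
    [2.652, 0.184, 0.532],
]

# Assumed cut-off; the slide does not state the binarization rule.
THRESHOLD = 1.0


def binarize(rows, threshold=THRESHOLD):
    """Map each expression value to 1 if it exceeds the threshold, else 0."""
    return [[1 if value > threshold else 0 for value in row] for row in rows]


binary = binarize(expression)
for row in binary:
    print(row)
```

In practice the cut-off would usually be chosen per gene or per array; a single global threshold is used here only to keep the sketch short.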

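Drawback of Boolean networks

Binary data:

| | G1 | G2 | G3 | G4 | G5 | G6 |
|---|---|---|---|---|---|---|
| time 1 | 1 | 0 | 1 | 0 | 0 | 1 |
| time 2 | 1 | 0 | 0 | 1 | 1 | 0 |
| time 3 | 1 | 1 | 0 | 0 | 0 | 1 |
| time 4 | 0 | 0 | 1 | 0 | 1 | 0 |
| time 5 | 1 | 0 | 0 | 1 | 0 | 0 |

Find Boolean functions (indegree k=2). Boolean functions for G4:

- f4,1 = G1 and G2
- f4,2 = G1 and not G2
- f4,3 = G1 or G2
- f4,4 = G1 or not G2
- f4,5 = not G1 and G2
- f4,6 = not G1 and not G2
- f4,7 = not G1 or G2
- f4,8 = not G1 or not G2

Truth table for f4,1:

| G1 | G2 | f4,1 | G4 | Error |
|---|---|---|---|---|
| 1 | 0 | 0 | 1 | 1 |
| 1 | 0 | 0 | 0 | 0 |
| 1 | 1 | 1 | 0 | 1 |
| 0 | 0 | 0 | 1 | 1 |
| Total | | | | 3 |

Three Boolean operators (AND, OR, NOT).

Time complexity (LAHDESMAKI H. 2003), where k: indegree, n: total genes, m: total time points.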
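The cost of the exhaustive search can be made concrete with a small sketch. It assumes that inputs are read at time t and the target at time t+1, which is the pairing that reproduces the error of 3 shown for f4,1. The helper names (`candidate_functions`, `errors`, `best_fit`) are hypothetical and not taken from the presentation.

```python
from itertools import combinations, product

# Binary data from the slide: rows are time 1..5, columns are G1..G6.
data = [
    [1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 0, 1],
    [0, 0, 1, 0, 1, 0],
    [1, 0, 0, 1, 0, 0],
]


def candidate_functions():
    """Yield (label, function) for the 8 two-input forms listed on the slide."""
    for op, neg_a, neg_b in product(("and", "or"), (False, True), (False, True)):
        def f(a, b, op=op, neg_a=neg_a, neg_b=neg_b):
            a = 1 - a if neg_a else a
            b = 1 - b if neg_b else b
            return (a & b) if op == "and" else (a | b)
        label = f"{'not ' if neg_a else ''}A {op} {'not ' if neg_b else ''}B"
        yield label, f


def errors(data, target, inputs, f):
    """Count time points where f(inputs at t) differs from the target at t+1."""
    a, b = inputs
    return sum(
        f(data[t][a], data[t][b]) != data[t + 1][target]
        for t in range(len(data) - 1)
    )


def best_fit(data, target):
    """Try every pair of input genes with every candidate form for one target."""
    n_genes = len(data[0])
    results = []
    for inputs in combinations(range(n_genes), 2):
        for label, f in candidate_functions():
            results.append((errors(data, target, inputs, f), inputs, label))
    return sorted(results)


G1, G2, G4 = 0, 1, 3
f41 = dict(candidate_functions())["A and B"]
print(errors(data, G4, (G1, G2), f41))  # error of f4,1 = G1 and G2
ranked = best_fit(data, G4)
print(len(ranked), ranked[0])
```

Even for 6 genes and a single target, the search evaluates every input pair against every candidate form, and the number of input sets grows combinatorially with n and k. This is the cost that variable selection aims to cut.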

METHOD

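GLM for binary data

Binary data:

| | G1 | G2 | G3 | G4 | G5 | G6 |
|---|---|---|---|---|---|---|
| time 1 | 1 | 0 | 1 | 0 | 0 | 1 |
| time 2 | 1 | 0 | 0 | 1 | 1 | 0 |
| time 3 | 1 | 1 | 0 | 0 | 0 | 1 |
| time 4 | 0 | 0 | 1 | 0 | 1 | 0 |
| time 5 | 1 | 0 | 0 | 1 | 0 | 0 |

1. Simple regression
2. GLM

Y, beta ~ normal; t-test, p-value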

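2x2 Contingency Table

3. Independence test

Binary data:

| | G1 | G2 | G3 | G4 | G5 | G6 |
|---|---|---|---|---|---|---|
| time 1 | 1 | 0 | 1 | 0 | 0 | 1 |
| time 2 | 1 | 0 | 0 | 1 | 1 | 0 |
| time 3 | 1 | 1 | 0 | 0 | 0 | 1 |
| time 4 | 0 | 0 | 0 | 1 | 1 | 1 |
| time 5 | 1 | 0 | 0 | 0 | 1 | 1 |
| time 6 | 1 | 1 | 1 | 0 | 1 | 1 |
| time 7 | 1 | 0 | 0 | 1 | 1 | 1 |
| time 8 | 1 | 0 | 0 | 0 | 1 | 0 |
| time 9 | 0 | 0 | 1 | 0 | 0 | 1 |
| time 10 | 1 | 0 | 0 | 1 | 0 | 0 |

G1 against G4:

| | G4 = 0 | G4 = 1 | Total |
|---|---|---|---|
| G1 = 0 | 1 | 1 | 2 |
| G1 = 1 | 4 | 3 | 7 |
| Total | 5 | 4 | 9 |

G3 against G4:

| | G4 = 0 | G4 = 1 | Total |
|---|---|---|---|
| G3 = 0 | 5 | 1 | 6 |
| G3 = 1 | 0 | 3 | 3 |
| Total | 5 | 4 | 9 |

G5 against G4:

| | G4 = 0 | G4 = 1 | Total |
|---|---|---|---|
| G5 = 0 | 0 | 3 | 3 |
| G5 = 1 | 5 | 1 | 6 |
| Total | 5 | 4 | 9 |

f4 = G3 and not G5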
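The contingency tables can be rebuilt from the binary data by pairing each candidate gene at time t with G4 at time t+1, which gives the 9 observations counted on the slide. The sketch below does this and scores each table with a Pearson chi-square statistic. The slide only says "Independence test", so the choice of the Pearson statistic (without continuity correction) is an assumption, and the function names are hypothetical.

```python
import math

# Binary data from the slide: rows are time 1..10, columns are G1..G6.
data = [
    [1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 1],
    [1, 0, 0, 0, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1, 1],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0],
]


def contingency(data, predictor, target):
    """2x2 counts: rows = predictor at time t, columns = target at time t+1."""
    table = [[0, 0], [0, 0]]
    for t in range(len(data) - 1):
        table[data[t][predictor]][data[t + 1][target]] += 1
    return table


def chi_square(table):
    """Pearson chi-square statistic and 1-df p-value for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # a constant variable carries no information
    stat = n * (a * d - b * c) ** 2 / denom
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-square survival function, 1 df
    return stat, p_value


G4 = 3
for gene in (0, 2, 4):  # G1, G3, G5, the three tables shown on the slide
    table = contingency(data, gene, G4)
    stat, p = chi_square(table)
    print(f"G{gene + 1}: table={table} chi2={stat:.3f} p={p:.4f}")
```

With only 9 observations the chi-square approximation is rough, so an exact test may be preferable in practice. The point of the sketch is the ranking: G3 and G5 stand out against G1, in line with f4 = G3 and not G5.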

RESULT

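Simulated network

[Figure: network diagram with nodes G1 to G8]

4 experiments with different initial states, 8 genes, 10 time points, no noise. Rows are time points, columns are genes.

Experiment 1:

| Time | G1 | G2 | G3 | G4 | G5 | G6 | G7 | G8 |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| 2 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 |
| 3 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 1 |
| 4 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 6 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 7 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 9 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 10 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |

Experiment 2:

| Time | G1 | G2 | G3 | G4 | G5 | G6 | G7 | G8 |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| 2 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 |
| 3 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
| 4 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 |
| 5 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 |
| 6 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 |
| 7 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| 8 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 |
| 9 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| 10 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 |

Experiment 3:

| Time | G1 | G2 | G3 | G4 | G5 | G6 | G7 | G8 |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 |
| 2 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| 3 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 |
| 4 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 6 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 7 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 9 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 10 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |

Experiment 4:

| Time | G1 | G2 | G3 | G4 | G5 | G6 | G7 | G8 |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 |
| 3 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 |
| 4 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 |
| 5 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| 6 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| 7 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 |
| 8 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 |
| 9 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 |
| 10 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 |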

Result of original Boolean networks

| Boolean function | Number of Bf | Error |
|---|---|---|
| f1 = not G8 | 1 | 0.000000 |
| f2 = G1 | 1 | 0.000000 |
| f3 = not G1 | 1 | 0.000000 |
| f4 = G3 | 1 | 0.000000 |
| f5 = G3 and G4 | 1 | 0.000000 |
| f6 = not G2 and G5 | 1 | 0.000000 |
| f7 = G6 | 1 | 0.000000 |
| f8 = G7 | 1 | 0.000000 |

Elapsed time is 1.002264 sec.

[Figure: inferred network with nodes G1 to G8]
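The eight recovered functions can be checked by running them forward as a network. The sketch below assumes synchronous updating, where every gene reads the state at time t to produce its value at time t+1. That assumption is not stated on the slides, but it reproduces the first experiment of the simulated network row by row. The names `RULES`, `step` and `simulate` are hypothetical.

```python
# Update rules as reported on the slide; a state is a tuple (G1, ..., G8).
RULES = [
    lambda s: 1 - s[7],           # f1 = not G8
    lambda s: s[0],               # f2 = G1
    lambda s: 1 - s[0],           # f3 = not G1
    lambda s: s[2],               # f4 = G3
    lambda s: s[2] & s[3],        # f5 = G3 and G4
    lambda s: (1 - s[1]) & s[4],  # f6 = not G2 and G5
    lambda s: s[5],               # f7 = G6
    lambda s: s[6],               # f8 = G7
]


def step(state):
    """Synchronous update: every gene reads the state at time t."""
    return tuple(rule(state) for rule in RULES)


def simulate(initial, n_points=10):
    """Return n_points states, starting from the initial state."""
    states = [tuple(initial)]
    while len(states) < n_points:
        states.append(step(states[-1]))
    return states


# Initial state of the first experiment on the simulated-network slide.
trajectory = simulate((0, 1, 1, 0, 0, 1, 0, 0))
for t, state in enumerate(trajectory, start=1):
    print(t, *state)
```

Starting from another experiment's first row gives a way to check that experiment's table in the same manner.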

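Result of variable selection method

Parameter estimates, listed in eight blocks in the order they appear on the slide. In each block, the parameters with DF 0 match the inputs of the corresponding function found by the original Boolean network method (for example x8 in block 1 and f1 = not G8), which suggests that block i models gene Gi.

| Block | Par | DF | Estimate | Standard Error | Chi-Square | Pr > ChiSq |
|---|---|---|---|---|---|---|
| 1 | x1 | 1 | -0.3963 | 0.1511 | 6.88 | 0.0087 |
| 1 | x2 | 1 | -0.1734 | 0.1630 | 1.13 | 0.2874 |
| 1 | x3 | 1 | 0.1734 | 0.1630 | 1.13 | 0.2874 |
| 1 | x4 | 1 | 0.2848 | 0.1582 | 3.24 | 0.0719 |
| 1 | x5 | 1 | 0.3880 | 0.1600 | 5.88 | 0.0153 |
| 1 | x6 | 1 | 0.2083 | 0.1733 | 1.45 | 0.229 |
| 1 | x7 | 1 | 0.4416 | 0.1536 | 8.26 | 0.0040 |
| 1 | x8 | 0 | 1.0000 | 0.0000 | . | . |
| 2 | x1 | 0 | -1.0000 | 0.0000 | . | . |
| 2 | x2 | 1 | -0.3313 | 0.1575 | 4.42 | 0.0354 |
| 2 | x3 | 1 | 0.3313 | 0.1575 | 4.42 | 0.0354 |
| 2 | x4 | 1 | 0.1084 | 0.1658 | 0.43 | 0.5134 |
| 2 | x5 | 1 | 0.2575 | 0.1645 | 2.45 | 0.1175 |
| 2 | x6 | 1 | 0.3333 | 0.1605 | 4.31 | 0.0378 |
| 2 | x7 | 1 | 0.1883 | 0.1663 | 1.28 | 0.2576 |
| 2 | x8 | 1 | 0.4000 | 0.1520 | 6.93 | 0.0085 |
| 3 | x1 | 0 | 1.0000 | 0.0000 | . | . |
| 3 | x2 | 1 | 0.3313 | 0.1575 | 4.42 | 0.0354 |
| 3 | x3 | 1 | -0.3313 | 0.1575 | 4.42 | 0.0354 |
| 3 | x4 | 1 | -0.1084 | 0.1658 | 0.43 | 0.5134 |
| 3 | x5 | 1 | -0.2575 | 0.1645 | 2.45 | 0.1175 |
| 3 | x6 | 1 | -0.3333 | 0.1605 | 4.31 | 0.0378 |
| 3 | x7 | 1 | -0.1883 | 0.1663 | 1.28 | 0.2576 |
| 3 | x8 | 1 | -0.4000 | 0.1520 | 6.93 | 0.0085 |
| 4 | x1 | 1 | 0.3313 | 0.1575 | 4.42 | 0.0354 |
| 4 | x2 | 1 | 0.7771 | 0.1052 | 54.58 | <.0001 |
| 4 | x3 | 0 | -1.0000 | 0.0000 | . | . |
| 4 | x4 | 1 | -0.2198 | 0.1628 | 1.82 | 0.1769 |
| 4 | x5 | 1 | -0.2575 | 0.1645 | 2.45 | 0.1175 |
| 4 | x6 | 1 | -0.2083 | 0.1699 | 1.50 | 0.2201 |
| 4 | x7 | 1 | -0.3052 | 0.1599 | 3.64 | 0.0563 |
| 4 | x8 | 1 | -0.1750 | 0.1644 | 1.13 | 0.2871 |
| 5 | x1 | 1 | 0.2972 | 0.1472 | 4.08 | 0.0435 |
| 5 | x2 | 1 | 0.5201 | 0.1268 | 16.82 | <.0001 |
| 5 | x3 | 0 | -0.6316 | 0.0000 | . | . |
| 5 | x4 | 0 | -0.6316 | 0.0000 | . | . |
| 5 | x5 | 1 | -0.5619 | 0.1460 | 14.81 | 0.0001 |
| 5 | x6 | 1 | -0.2500 | 0.1693 | 2.18 | 0.1396 |
| 5 | x7 | 1 | -0.2727 | 0.1607 | 2.88 | 0.0898 |
| 5 | x8 | 1 | -0.1875 | 0.1573 | 1.42 | 0.2334 |
| 6 | x1 | 1 | 0.2508 | 0.1355 | 3.43 | 0.0642 |
| 6 | x2 | 0 | 0.4737 | 0.0000 | . | . |
| 6 | x3 | 1 | -0.3622 | 0.1268 | 8.16 | 0.0043 |
| 6 | x4 | 1 | -0.3622 | 0.1268 | 8.16 | 0.0043 |
| 6 | x5 | 0 | -0.6923 | 0.0000 | . | . |
| 6 | x6 | 1 | -0.3750 | 0.1593 | 5.54 | 0.0186 |
| 6 | x7 | 1 | -0.1753 | 0.1522 | 1.33 | 0.2493 |
| 6 | x8 | 1 | -0.1125 | 0.1464 | 0.59 | 0.4422 |
| 7 | x1 | 1 | 0.2972 | 0.1472 | 4.08 | 0.0435 |
| 7 | x2 | 1 | 0.1858 | 0.1530 | 1.47 | 0.2248 |
| 7 | x3 | 1 | -0.1858 | 0.1530 | 1.47 | 0.2248 |
| 7 | x4 | 1 | -0.2972 | 0.1472 | 4.08 | 0.0435 |
| 7 | x5 | 1 | -0.5619 | 0.1460 | 14.81 | 0.0001 |
| 7 | x6 | 0 | -1.0000 | 0.0000 | . | . |
| 7 | x7 | 1 | -0.3896 | 0.1557 | 6.26 | 0.0124 |
| 7 | x8 | 1 | -0.1875 | 0.1573 | 1.42 | 0.2334 |
| 8 | x1 | 1 | 0.1796 | 0.1592 | 1.27 | 0.2592 |
| 8 | x2 | 1 | 0.2910 | 0.1540 | 3.57 | 0.0587 |
| 8 | x3 | 1 | -0.2910 | 0.1540 | 3.57 | 0.0587 |
| 8 | x4 | 1 | -0.1796 | 0.1592 | 1.27 | 0.2592 |
| 8 | x5 | 1 | -0.3545 | 0.1631 | 4.73 | 0.0297 |
| 8 | x6 | 1 | -0.4167 | 0.1623 | 6.59 | 0.0102 |
| 8 | x7 | 0 | -1.0000 | 0.0000 | . | . |
| 8 | x8 | 1 | -0.4250 | 0.1505 | 7.98 | 0.0047 |

Note printed after the x8 row of block 1: (Log Likelihood -1.79769E308)

[Figure: inferred network with nodes G1 to G8]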

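Computing time

Yeast cell cycle data (Spellman 1998):

| Method | Computing time |
|---|---|
| Original Boolean network method | 49613 sec |
| Boolean network with variable selection method | 548 sec |

The variable selection method is about 90 times faster than the original Boolean network method.

Indegree values shown on the slide: k=4, k=3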

DISCUSSION

- Modify the method
- Add a real data analysis
