ChemBank Building a Public Web Resource Using Daycart Erik Brauner Head of Chemical and Biological Computing Harvard Institute of Chemistry and Cell Biology.


1 ChemBank: Building a Public Web Resource Using Daycart
Erik Brauner, Head of Chemical and Biological Computing
Harvard Institute of Chemistry and Cell Biology
Eli and Edith L. Broad Institute
June 10, 2004

2 The Institute of Chemistry and Cell Biology (ICCB)
The ICCB is an academic small-molecule screening facility located at Harvard Medical School, with the goals of:
– Enabling academic labs to perform high-throughput chemical screens
– Creating small-molecule libraries for screening
– Advancing the field of chemical genetics
– Creating a public database: ChemBank

3 The Dual Roles of ChemBank
To handle internal needs:
– Support tools for high-throughput screeners
– Support for chemists and library synthesis
As a public web resource:
– Freely available assay data and information on compounds relevant to chemical genetics
ChemBank had to satisfy the needs of both chemists and biologists.
http://ChemBank.med.harvard.edu

5 ChemBank Status
Publicly available at: chembank.med.harvard.edu
>900,000 structures in the database
>5,000 known bioactives with annotation in the database
Selected assay data also available
Supports similarity queries and substructure searching

13 Structure Sources
Commercial libraries
Outside databases
Curated structures
DOS libraries
Virtual library files with undecoded structures (no plate mapping), decoding tag pattern, building blocks, reagents

15 SD File Handling
SD file → TDT file → XML file
Conversion tools: mol2tdt.sh (contrib code), custom Perl script
This becomes the official load record
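The first step of a pipeline like this is reading the SD file itself. The sketch below is not the mol2tdt.sh contrib code or the custom Perl script from the slide; it is a minimal, standard-library-only illustration of how SD records and their data items can be pulled apart before conversion. It assumes well-formed input in which each record ends with a `$$$$` line and each data item starts with a `> <FIELD>` header followed by value lines and a blank line. The function names and the `VENDOR_ID` field are hypothetical.

```python
from typing import Dict, List


def split_sd_records(text: str) -> List[str]:
    """Split SD file text into records; each record ends with a '$$$$' line."""
    records: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    return records


def read_data_items(record: str) -> Dict[str, str]:
    """Collect the '> <FIELD>' data items of one record into a dict."""
    items: Dict[str, str] = {}
    field = None
    for line in record.splitlines():
        if line.startswith(">") and "<" in line and ">" in line[1:]:
            # Header line such as '> <VENDOR_ID>': take the text between < and >.
            field = line[line.index("<") + 1 : line.rindex(">")]
            items[field] = ""
        elif field is not None:
            if line.strip() == "":
                field = None  # a blank line closes the current data item
            else:
                items[field] = (items[field] + "\n" + line).strip()
    return items


# Two toy records; the mol blocks are left empty for brevity (hypothetical data).
SAMPLE = """record-1
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <VENDOR_ID>
ABC-001

$$$$
record-2
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <VENDOR_ID>
ABC-002

$$$$
"""

for rec in split_sd_records(SAMPLE):
    print(read_data_items(rec))
```

Keeping the data items as a plain dictionary per record makes it straightforward to emit them in whatever downstream format (TDT, XML) the load process needs.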

16 A Simple Structure Table
create table COMPOUND (
  id number not null,
  smiles varchar2(4000),
  molecular_weight float,
  molecular_formula varchar2(20),
  primary key (id)
);
-- needed for exact smiles matching
create index COMPOUND_idx1 on COMPOUND(smiles) indextype is c$dcischem.ddexact;
-- needed for fingerprints to support similarity
create index COMPOUND_idx2 on COMPOUND(smiles) indextype is c$dcischem.ddblob;

17 Basic Manipulations
Inserting compounds:
insert into COMPOUND (id, smiles, molecular_weight, molecular_formula)
  values (1, smi2cansmi('CCC', 1), smi2amw('CCC'), smi2mf('CCC'));
Similarity search:
select id, tanimoto(smiles, 'O=C=O') as similarity
  from COMPOUND
  where tanimoto(smiles, 'O=C=O') >= 0.8
  order by similarity desc;
Substructure search:
select id from COMPOUND where contains(smiles, 'O=C=O') = 1;
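To make the similarity query concrete, here is a small sketch of what a Tanimoto search computes. It is not Daycart's implementation: fingerprints are modeled as sets of on-bit positions, the toy library is invented, and returning 0.0 for two empty fingerprints is a convention chosen for this example. The 0.8 threshold and the descending sort mirror the query on the slide.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # positions of the bits that are set


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared on-bits divided by on-bits in either one."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_search(
    library: Dict[int, Fingerprint], query: Fingerprint, threshold: float = 0.8
) -> List[Tuple[int, float]]:
    """Return (id, similarity) pairs at or above the threshold, best first."""
    scored = [(cid, tanimoto(fp, query)) for cid, fp in library.items()]
    hits = [(cid, score) for cid, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints keyed by compound id.
LIBRARY: Dict[int, Fingerprint] = {
    1: frozenset({1, 2, 3, 4, 5}),
    2: frozenset({1, 2, 3, 4}),
    3: frozenset({7, 8}),
}
QUERY: Fingerprint = frozenset({1, 2, 3, 4, 5})

print(similarity_search(LIBRARY, QUERY))
```

Compound 3 shares no bits with the query and is filtered out, which is the role the `where tanimoto(...) >= 0.8` clause plays in the SQL version.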

19 Structure Loading
Are these structures of the same compound?
[Figure: structure drawings compared; not reproduced in the transcript]

21 Salt Stripping in Daycart
Daycart 4.8.2 supports salt stripping via the function vcs_desalt(smiles, iso, class), which works in conjunction with a built-in salt table.
Ex:
insert into salt values ('Sodium', '[Na+]', 0, NULL);
VCS_DESALT('[Na+].c1ccccc1', 0, 0)
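The idea behind table-driven salt stripping can be sketched in a few lines. This is not how vcs_desalt is implemented, and it ignores the iso and class arguments; it simply splits a SMILES string at the `.` disconnection symbol and drops components that appear in a salt table. It assumes salts are written exactly as in the table (for example, as canonical SMILES) and that no ring-closure digits span a dot. The function name and the fallback behavior are choices made for this example.

```python
from typing import Iterable


def strip_salts(smiles: str, salt_table: Iterable[str]) -> str:
    """Drop dot-separated components that exactly match a salt table entry."""
    salts = set(salt_table)
    kept = [part for part in smiles.split(".") if part not in salts]
    # If every component is a listed salt, return the input unchanged
    # rather than an empty structure.
    return ".".join(kept) if kept else smiles


SALT_TABLE = ["[Na+]", "[Cl-]"]  # hypothetical table contents

print(strip_salts("[Na+].c1ccccc1", SALT_TABLE))
print(strip_salts("[Na+].[Cl-]", SALT_TABLE))
```

Because the match is purely textual, canonicalizing both the table entries and the input components first matters; otherwise equivalent spellings of the same counterion would slip through.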

25 Normalization in Daycart
Nitro and azide normalization can be achieved easily using reaction SMIRKS.
Ex: [*:1][N:2](=[O:3])=[O:4]>>[*:1][N+:2](=[O:3])[O-:4]

26 Normalization in Daycart
Daycart 4.8.2 supports normalization via the function vcs_normalize(smiles, iso, class), which works in conjunction with a built-in transform table.
Ex:
insert into transform values ('Nitros', '[*:1][N:2](=[O:3])=[O:4]>>[*:1][N+:2](=[O:3])[O-:4]', 'FORWARD', 0, NULL);
VCS_NORMALIZE('CCN(=O)(=O)', 0, 0)

31 Acknowledgements
ICCB Informatics Group:
– Jeremy Muhlich
– Jason McIntosh
– Carol Chang
– Andrew Lach
– Justin Klekota
ICCB Chemistry Group:
– John Tallarico
– Jared Shaw
ICCB Screening Group:
– Caroline Shamu
– Nicky Tolliday
National Cancer Institute
Tudor Oprea
Daylight
The University of New Mexico School of Medicine

