FLAIRS '99: Applying the SUBDUE Substructure Discovery System to the Chemical Toxicity Domain. Ravindra N. Chittimoori, Diane J. Cook, Lawrence B. Holder.


1 Applying the SUBDUE Substructure Discovery System to the Chemical Toxicity Domain
Ravindra N. Chittimoori, Diane J. Cook, Lawrence B. Holder
Department of Computer Science and Engineering, University of Texas at Arlington
http://cygnus.uta.edu/subdue/

2 Motivation and Goal
• Ever-increasing number of chemical compounds in use today (~100,000).
• Need to identify relationships between the molecular structure and the toxicity of a chemical compound.
• Apply knowledge discovery to U.S. National Toxicology Program (NTP) data to identify such relationships.

3 Knowledge Discovery in SUBDUE
• Structural discovery system
• Graph-based input representation
• Beam search through substructure (subgraph) space
• Graph compression heuristic based on minimum description length
• Inexact, polynomial graph match
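The compression heuristic listed on slide 3 can be illustrated with a deliberately simplified sketch. It assumes a size-based stand-in for description length (vertices plus edges rather than bits) and non-overlapping instances, each collapsed to a single vertex. The function names and toy numbers are hypothetical and are not taken from the SUBDUE implementation.

```python
def graph_size(num_vertices: int, num_edges: int) -> int:
    """Crude size measure: vertices plus edges (an assumed stand-in for bits)."""
    return num_vertices + num_edges


def compression_value(graph_v: int, graph_e: int,
                      sub_v: int, sub_e: int, instances: int) -> float:
    """Ratio of original size to (substructure size + compressed graph size).

    Values above 1.0 mean that describing the substructure once and
    replacing each instance with a single vertex makes the data smaller.
    """
    original = graph_size(graph_v, graph_e)
    sub = graph_size(sub_v, sub_e)
    if instances < 0 or instances * sub > original:
        raise ValueError("instances do not fit inside the graph")
    # Each instance loses its internal vertices and edges and gains one
    # placeholder vertex; edges to the rest of the graph are kept.
    compressed = original - instances * sub + instances
    return original / (sub + compressed)


# Toy numbers (made up): a 20-vertex, 25-edge graph and a substructure
# with 4 vertices and 3 edges.
value_many = compression_value(20, 25, 4, 3, instances=4)
value_single = compression_value(20, 25, 4, 3, instances=1)
print(value_many, value_single)
```

Under these assumptions a substructure that occurs only once never pays for its own description, which is why frequently repeated substructures rank higher in the search.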

4 SUBDUE Example
[Figure: three panels titled "Input Database", "Substructure S1 (graph form)", and "Compressed Database". Labels recovered from the slide: R1, C1, S1 to S4, T1 to T4; vertex labels object, triangle, square; edge labels on, shape.]

5 Chemical Toxicity Domain
• Database of 367 chemicals
• Levels of evidence assigned by NTP
  - CE: clear evidence of cancerous activity
  - SE: some evidence
  - E: equivocal evidence
  - NE: no evidence

6 Predictive Toxicology Evaluation
• Predictive Toxicology Evaluation (PTE) challenge
• PTE-2 ended November 1998
  - http://dir.niehs.nih.gov/dirlecm/pte2.htm
• PTE-3 scheduled for July 1999 to July 2000

7 Chemical Toxicity Data
• Atoms (name, type, partial charge)
• Bonds (type)
• Chemical groups
  - Alcohol, amine, amino, benzene, ester, ether, ketone, methanol, methyl, nitro, phenol, and sulfide

8 Chemical Toxicity Data
• Carcinogenicity-related tests
  - Ames
  - Chromex
  - Chromaberr
  - Drosophila
  - Mouse-Lymph
  - Salmonella Assay

9 Chemical Compound Representation

10 Input Representation
• Sample atomic structure: a C atom and an H atom joined by an edge labeled 1
• SUBDUE graph input:
    v 1 atom
    v 2 C
    v 3 atom
    v 4 H
    d 1 2 name
    d 3 4 name
    u 1 3 1
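The graph input on slide 10 can be read with a few lines of code. The sketch below assumes one record per line, with `v <id> <label>` declaring a vertex and `d` or `u` followed by `<source> <target> <label>` declaring a directed or undirected edge. That reading is inferred from the slide alone; `Edge` and `parse_graph` are hypothetical names, not part of SUBDUE.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    source: int
    target: int
    label: str
    directed: bool


def parse_graph(text: str) -> tuple[dict[int, str], list[Edge]]:
    """Parse 'v', 'd', and 'u' records into a vertex map and an edge list."""
    vertices: dict[int, str] = {}
    edges: list[Edge] = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        kind = fields[0]
        if kind == "v" and len(fields) == 3:
            vertices[int(fields[1])] = fields[2]
        elif kind in ("d", "u") and len(fields) == 4:
            edges.append(Edge(int(fields[1]), int(fields[2]),
                              fields[3], directed=(kind == "d")))
        else:
            raise ValueError(f"unrecognized record: {line!r}")
    return vertices, edges


# The C-H fragment shown on the slide.
SAMPLE = """\
v 1 atom
v 2 C
v 3 atom
v 4 H
d 1 2 name
d 3 4 name
u 1 3 1
"""

vertices, edges = parse_graph(SAMPLE)
print(vertices)
print(edges)
```

In this encoding each atom is a generic `atom` vertex whose element is attached through a directed `name` edge, and the undirected edge between vertices 1 and 3 carries the bond label.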

11 Methodology
• Training set further divided into learning and testing sets
• Find the best substructures in the learning-set positives that are not prevalent in the negatives
• Find occurrences of those substructures in the testing set
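The selection step on slide 11 can be sketched as a ranking by coverage. This is a simplified stand-in, not the authors' procedure: each compound is reduced to a set of group labels so that set membership replaces subgraph matching, and the compounds listed are made-up toy data. `coverage` and `rank_patterns` are hypothetical names.

```python
def coverage(pattern: str, compounds: list[set[str]]) -> float:
    """Fraction of compounds that contain the pattern."""
    if not compounds:
        return 0.0
    return sum(pattern in compound for compound in compounds) / len(compounds)


def rank_patterns(positives: list[set[str]],
                  negatives: list[set[str]]) -> list[tuple[float, str]]:
    """Rank patterns seen in positives by positive minus negative coverage."""
    candidates = set().union(*positives)
    scored = [(coverage(p, positives) - coverage(p, negatives), p)
              for p in candidates]
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scored, key=lambda item: (-item[0], item[1]))


# Toy learning set (made up for illustration only).
positives = [{"amino", "benzene"}, {"amino", "methyl"}, {"nitro", "benzene"}]
negatives = [{"benzene", "methyl"}, {"ether"}]

ranked = rank_patterns(positives, negatives)
for score, pattern in ranked:
    print(f"{pattern}: {score:+.3f}")
```

The top-ranked patterns would then be looked up in the held-out testing set, as the last bullet on the slide describes.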

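12 Results
• Learning set: 268
  - Positive compounds: 134/143
  - Negative compounds: 24/125
• Testing set: 30
  - Positive compounds: 15/19
  - Negative compounds: 4/11
[Figure: discovered substructure. Labels recovered from the slide: atom vertices with values (10, c, 0.062) and (br, 0.057); edge labels n, t, p; further labels 1 and 3.]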

13 Results
• Learning set: 268
  - Positive compounds: 60/143
  - Negative compounds: 0/125
• Testing set: 30
  - Positive compounds: 8/19
  - Negative compounds: 0/11
[Figure: discovered substructure. Labels recovered from the slide: atom vertices with values (10, c, 0.211), (1, h, 0.34), (32, n, 0.778), and (h, 0.36); edge labels n, t, p; further labels 1, 1, 1, 1.]
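The counts on the two results slides are easier to compare as rates. The sketch below assumes that a figure such as 134/143 means the substructure occurs in 134 of the 143 positive compounds. The precision column (positives among all compounds containing the substructure) is a derived quantity that does not appear on the slides, and `summarize` is a hypothetical helper.

```python
# (matched, total) pairs copied from slides 12 and 13.
RESULTS = {
    "slide 12": {"learning": ((134, 143), (24, 125)),
                 "testing": ((15, 19), (4, 11))},
    "slide 13": {"learning": ((60, 143), (0, 125)),
                 "testing": ((8, 19), (0, 11))},
}


def summarize(pos: tuple[int, int], neg: tuple[int, int]) -> dict[str, float]:
    """Turn matched/total counts into coverage rates and precision."""
    pos_hit, pos_total = pos
    neg_hit, neg_total = neg
    matched = pos_hit + neg_hit
    return {
        "positive_coverage": pos_hit / pos_total,
        "negative_coverage": neg_hit / neg_total,
        # Share of matched compounds that are positive; NaN if none matched.
        "precision": pos_hit / matched if matched else float("nan"),
    }


for slide, splits in RESULTS.items():
    for split, (pos, neg) in splits.items():
        s = summarize(pos, neg)
        print(f"{slide} {split}: "
              f"pos {s['positive_coverage']:.3f}, "
              f"neg {s['negative_coverage']:.3f}, "
              f"precision {s['precision']:.3f}")
```

Read this way, the first substructure covers most positives but also some negatives, while the second covers fewer positives and no negatives in either set.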

14 Discussion
• Consistent with results obtained by the ILP system PROGOL (Srinivasan et al., ILP-97).
• Groups discovered by SUBDUE (e.g., amino) are unique substructures found only in compounds that test positive for carcinogenicity.

15 Conclusion
• SUBDUE has the ability to discover interesting patterns (substructures) that might be helpful in predicting carcinogenicity.
• SUBDUE is suitable for knowledge discovery in the chemical toxicity domain.

16 Future Research
• Applying concept-learning SUBDUE to the chemical toxicity database
  - Find substructures compressing positive graph, but not negative graph
• Incorporate more domain knowledge
• PTE-3 challenge (July 1999)

