1 PharmID: A New Algorithm for Pharmacophore Identification Stan Young Jun Feng and Ashish Sanil NISSMPDM 3 June 2005.

2 X-ray Structure [Figure: protein surface with bound drug and zinc ion; H2O's hiding around. Note the zinc ion.]

3 Outline Background. Computational Procedure and Algorithm. Examples. Conclusions.

4 Conformation Generation OMEGA® generates thousands of conformers in a few seconds. It is able to reproduce bioactive conformations. Boström, Greenwood, and Gottfries. J. Mol. Graph. Mod., 2003, 21, 449-462

5 Many feature combinations Exhaustive enumeration of pharmacophore hypotheses. No. of Features Possible combinations 45 516 642 799

6 Pharmacophore Identification Active molecules are known, receptor unknown. Assume that all molecules bind in a common manner to the biological target. Difficulties: conformational flexibility; many different combinations of pharmacophoric groups. Two very large search spaces: conformations and feature combinations.

7 Work Flow for Pharmacophore Identification Single conformer (SDF or SMILES) → External Conformation Generation Program → PharmID → Different Pharmacophore Hypotheses

8 Our Strategy To superimpose the molecules in 3D, we first align the bit string for each conformer in 1D. Ideally, the important features and best conformers will be picked out at the same time. Our search is many to one, not many to many!

9 Computation Procedure 1. Pharmacophore bit string generation 2. Bit string alignment/assessment 3. Hypothesis generation 4. Refinement

10 Feature Definition Predefined pharmacophore features: HD: Hydrogen Bond Donor; HA: Hydrogen Bond Acceptor; POS: Positive Charge Center; NEG: Negative Charge Center; ARC: Aromatic Center; HYP: Hydrophobic Center. User-defined groups: any functional group can be defined using Daylight® SMARTS strings.

11 Bit String Generation [Figure: each conformer of a molecule (Conf. 1, Conf. 2, Conf. 3) is encoded as a bit string over features F1 ... Fm.] 3D Atom (group) – Distance – Atom (group) features.
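The atom (group) – distance – atom (group) encoding can be sketched in a few lines. This is a minimal illustration, not PharmID's actual code: it assumes one bit per (feature type, feature type, distance bin) triplet, contiguous 1 Å bins with a final bin for 12 Å and above, and a hypothetical input format of (type, xyz) tuples.

```python
from itertools import combinations
from math import dist, floor

FEATURE_TYPES = ["HD", "HA", "POS", "NEG", "ARC", "HYP"]
N_BINS = 13  # assumption: 1 Å bins 0-1, ..., 11-12, plus one bin for 12 Å and above

# Every unordered pair of feature types, including same-type pairs (21 pairs).
PAIRS = [(a, b) for i, a in enumerate(FEATURE_TYPES) for b in FEATURE_TYPES[i:]]
PAIR_INDEX = {pair: k for k, pair in enumerate(PAIRS)}


def bit_index(type_a, type_b, distance):
    """Position of the bit for one feature-distance-feature triplet."""
    a, b = sorted((type_a, type_b), key=FEATURE_TYPES.index)
    bin_no = min(floor(distance), N_BINS - 1)
    return PAIR_INDEX[(a, b)] * N_BINS + bin_no


def conformer_bits(features):
    """features: list of (type, (x, y, z)) for one conformer (hypothetical format)."""
    bits = [0] * (len(PAIRS) * N_BINS)
    for (ta, pa), (tb, pb) in combinations(features, 2):
        bits[bit_index(ta, tb, dist(pa, pb))] = 1
    return bits


conformer = [("HD", (0.0, 0.0, 0.0)), ("HA", (3.5, 0.0, 0.0)), ("ARC", (0.0, 6.2, 0.0))]
bits = conformer_bits(conformer)
```

Three features give three pairwise distances, so this conformer sets three bits out of 273.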

12 Definition of Distance Bins Homogeneous non-overlapped: 0-1, 1-2, 2-3, 4-5, 5-6, 6-7, 7-8, 8-9, 9-10, 11-12, 12 Å and above. Heterogeneous non-overlapped: 1-2, 2-5, 5-8, 8-12, 13 Å and above. Overlapped: 1-3, 2-4, 3-5, 4-6, 5-7, 6-8, 7-9, 8-10, 9-11, 10-12 Å.
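A short sketch of the overlapped scheme (1-3, 2-4, ..., 10-12 Å) shows why it may be attractive: most distances fall into two bins, so a small change in distance near a bin edge does not flip every bit. The half-open interval convention and the function name are assumptions made for illustration.

```python
def overlapped_bins(distance, width=2.0, start=1.0, stop=12.0, step=1.0):
    """Return every (low, high) bin containing `distance`.

    Bins are low <= d < high (assumed convention) and run 1-3, 2-4, ..., 10-12 Å.
    """
    bins = []
    low = start
    while low + width <= stop:
        if low <= distance < low + width:
            bins.append((low, low + width))
        low += step
    return bins


two_bins = overlapped_bins(3.4)   # falls in 2-4 and 3-5
one_bin = overlapped_bins(1.5)    # only the first bin, 1-3, covers this distance
no_bin = overlapped_bins(0.5)     # below the first bin
```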

13 Data Structure for Input [Figure: matrix of bit strings, one row per conformer; rows labelled M1C1, M1C2, M1C3, ..., M2C1, M2C2, M2C3, ...]

14 The Trick If you know the correct conformation for each molecule, then it is relatively easy to identify the key features. If you know the correct features and distances, then it is easy to identify the correct conformation. Guess one, predict the other, iterate.

15 Given the features, easy to find the conformations. [Figure: bit-string matrix with rows M1C1, M1C2, M1C3, ..., M2C1, M2C2, M2C3, ...]

16 Given the conformations, easy to find the features. [Figure: bit-string matrix with rows M1C1, M1C2, M1C3, ..., M2C1, M2C2, M2C3, ...]
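The two "easy" directions can each be written as a small function. This is a sketch under simple assumptions (bit strings as lists of 0/1, "key features" chosen by a frequency threshold); the function names are illustrative and not taken from PharmID.

```python
def feature_frequencies(chosen):
    """Given one chosen conformer per molecule, return how often each bit is set."""
    n = len(chosen)
    return [sum(bits[i] for bits in chosen) / n for i in range(len(chosen[0]))]


def best_conformer(conformers, key_bits):
    """Given the key feature bits, return the index of the conformer setting most of them."""
    scores = [sum(bits[i] for i in key_bits) for bits in conformers]
    return max(range(len(conformers)), key=scores.__getitem__)


# Conformations known -> features: bits set in most chosen conformers stand out.
chosen = [[1, 0, 1, 0], [1, 0, 0, 0], [1, 1, 1, 0]]
freqs = feature_frequencies(chosen)
key_bits = [i for i, f in enumerate(freqs) if f >= 0.6]  # threshold is an assumption

# Features known -> conformation: pick the conformer that matches them best.
candidates = [[0, 1, 0, 0], [1, 0, 1, 1], [1, 0, 0, 0]]
pick = best_conformer(candidates, key_bits)
```

Alternating these two steps is the "guess one, predict the other, iterate" idea; the Gibbs sampler replaces the hard choices here with probabilistic ones.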

17 Bioinformatics Motif Finding using Gibbs Sampling. 1. Remove one sequence. 2. Randomly select one position for each sequence. 3. Calculate probabilities for all positions for the motif “window”. 4. Using the “window” compute probabilities for removed sequence motif position. 5. Repeat the above steps for all sequences until converged. This will be easier to see with pictures.

18 Objective Function W: bit string length. c_{i,j}: count of residue j in position i. q_{i,j}: residue frequencies, position i, residue j. p_j: residue background frequencies. J: residue types, 20 for protein, 4 for DNA, RNA. [Figure: W x 20 window]
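The formula itself did not survive in this transcript. The definitions match the log-likelihood ratio used in the Gibbs motif sampler of Lawrence et al. (1993), F = sum over positions i = 1..W and residue types j = 1..J of c_{i,j} * log(q_{i,j} / p_j). The sketch below evaluates that form for bit strings (J = 2); the pseudocount used to estimate q is an assumption added to avoid log(0).

```python
from math import log


def objective(aligned, background, pseudo=0.5):
    """F = sum_i sum_j c[i][j] * log(q[i][j] / p[j]) for aligned bit strings.

    aligned: list of equal-length 0/1 lists (one selected string per molecule)
    background: (p_0, p_1) background frequencies of 0 and 1
    pseudo: pseudocount for estimating q (assumption, not from the slides)
    """
    n = len(aligned)
    width = len(aligned[0])
    total = 0.0
    for i in range(width):
        ones = sum(row[i] for row in aligned)
        counts = (n - ones, ones)  # c[i][0], c[i][1]
        for j in (0, 1):
            q = (counts[j] + pseudo) / (n + 2 * pseudo)
            total += counts[j] * log(q / background[j])
    return total


agreeing = [[1, 0, 1, 0]] * 4
mixed = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
f_agreeing = objective(agreeing, (0.5, 0.5))
f_mixed = objective(mixed, (0.5, 0.5))
```

Strings that agree position by position score higher than strings whose bits look like the background, which is the property the sampler exploits.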

19 Alignment Algorithm Mostly used in sequence alignment to find the common motif. ………….. TCAGAACCAGTTATAAATTTATCATTTCCTTCTCCACTCCT GCCTCAGGATCCAGCACACATTATCACAAACTTAGTGTCCA CATTATCACAAACTTAGTGTCCATCCATCACTGCTGACCCT ………….. Fast and sensitive, less likely to fall into a local minimum. Lawrence, et al. (1993) Science, 262, 208-214. [Figure label: W x 20]

20 PharmID Algorithm using Gibbs. 1. Remove one compound. 2. Start with a random conformer for each of the other compounds. 3. Calculate probabilities for feature importance. 4. Compute conformation probabilities for the omitted compound. 5. Repeat steps 1-4 until convergence. Again, pictures will make this clear.
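The five steps can be sketched as a small sampler over conformer choices. This is an illustration under stated assumptions, not the PharmID implementation: bits are treated as independent, frequencies are smoothed with a pseudocount, the background is the overall fraction of set bits, and a fixed number of sweeps stands in for a convergence test.

```python
import random
from math import exp, log


def gibbs_select(molecules, n_sweeps=50, pseudo=0.5, seed=0):
    """molecules: list of molecules, each a list of conformer bit lists (same length).

    Returns the index of the selected conformer for every molecule.
    """
    rng = random.Random(seed)
    width = len(molecules[0][0])
    all_confs = [c for m in molecules for c in m]
    # Background probability that a bit is set (assumption: pooled over all conformers).
    p1 = (sum(map(sum, all_confs)) + pseudo) / (len(all_confs) * width + 2 * pseudo)
    choice = [rng.randrange(len(m)) for m in molecules]  # step 2: random start
    for _ in range(n_sweeps):  # step 5: fixed sweeps instead of a convergence check
        for k, confs in enumerate(molecules):  # step 1: leave molecule k out
            others = [molecules[m][choice[m]] for m in range(len(molecules)) if m != k]
            n = len(others)
            # Step 3: smoothed probability that each bit is set among the others.
            q = [(sum(o[i] for o in others) + pseudo) / (n + 2 * pseudo) for i in range(width)]
            # Step 4: log-odds score of every conformer of the omitted molecule.
            scores = [
                sum(log(q[i] / p1) if b else log((1 - q[i]) / (1 - p1)) for i, b in enumerate(bits))
                for bits in confs
            ]
            top = max(scores)
            weights = [exp(s - top) for s in scores]  # shift to avoid underflow
            choice[k] = rng.choices(range(len(confs)), weights=weights)[0]
    return choice


toy = [
    [[1, 0, 1, 0], [0, 1, 0, 1]],
    [[0, 0, 1, 1], [1, 0, 1, 0]],
    [[1, 0, 1, 0], [1, 1, 1, 1]],
]
selected = gibbs_select(toy, n_sweeps=20, seed=1)
```

Because the new conformer is sampled rather than chosen greedily, the search can leave a poor early guess, which is the usual argument for Gibbs sampling being less prone to local optima.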

21 Gibbs Sampling: Fingerprints Movement
       Conf_1     Conf_2     Conf_3     possible
Mol_1  010000000  100010001  010100100  0, 9, 18
Mol_2  000100000  010000100  100010001  0, 9, 18
Mol_3  101010001  000100100             0, 9
Mol_4  001000100  100110001  010101001  0, 9, 18

1_2    010000000  100010001  010100100
2_3    000100000  010000100  100010001
3_1    101010001  000100100
4_2    001000100  100110001  010101001
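One way to read the "possible" column is that each molecule's conformer fingerprints are concatenated into one string, and the motif window may only start where a conformer begins (0, 9, 18 for three 9-bit conformers). The sketch below assumes that reading; the helper names are illustrative.

```python
def allowed_starts(n_conformers, width):
    """Window start offsets: one per conformer in the concatenated fingerprint."""
    return [c * width for c in range(n_conformers)]


def window(concatenated, start, width):
    """Slice one conformer's fingerprint out of the concatenated string."""
    return concatenated[start:start + width]


mol_1 = "010000000" + "100010001" + "010100100"
mol_3 = "101010001" + "000100100"

starts_1 = allowed_starts(3, 9)
starts_3 = allowed_starts(2, 9)
second_conf_of_mol_3 = window(mol_3, starts_3[1], 9)
```

Restricting the window to these offsets is what makes the sequence-motif sampler select whole conformers rather than arbitrary substrings.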

22 Bit String Alignment Only 2 residue types (0, 1). Rigid molecules that have only 1 or a few conformers can speed up the alignment and help to determine the best set of features.

23 Hypothesis Generation Why? Features may not be part of the same pharmacophore. How? Clique Detection (Bron-Kerbosch Algorithm). A clique is a set of points that are ALL connected to one another.
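Clique detection with the Bron-Kerbosch algorithm can be sketched as follows. This is the basic version without pivoting. Treating feature labels as nodes and two-point pharmacophores as edges is an assumption made for the example.

```python
def bron_kerbosch(r, p, x, adj, out):
    """Collect all maximal cliques: r = current clique, p = candidates, x = excluded."""
    if not p and not x:
        out.append(sorted(r))
        return
    for v in list(p):
        bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj, out)
        p = p - {v}
        x = x | {v}


def maximal_cliques(edges):
    """edges: iterable of (node, node) pairs; returns sorted list of maximal cliques."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    out = []
    bron_kerbosch(set(), set(adj), set(), adj, out)
    return sorted(out)


# Hypothetical two-point pharmacophores kept by the sampler.
edges = [("HD", "HA"), ("HA", "ARC"), ("HD", "ARC"), ("ARC", "HYP")]
cliques = maximal_cliques(edges)
```

Here HD, HA and ARC are mutually connected and form one candidate hypothesis, while the ARC-HYP pair stands alone because HYP is not linked to HD or HA.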

24 Hypothesis Generation in Selected Conformers: Clique Detection [Figure: graph with labels "Pharmacophore features", "Two-point pharmacophores identified by Gibbs sampling", and "Discarded two-point pharmacophores".] A pharmacophore hypothesis should be an all-connected graph.

25 Hypothesis Generation: Output Pharmacophore 1 Members: 1 2 3 5 … (Mol. ID) Features: Hydrogen Bond Donor, Hydrogen Bond Acceptor, … Pharmacophore 2 Members: 4 6 8 … Features: … …

26 Refinement
For all molecules
  For all conformers
    For all hypotheses generated
      Test each qualified conformer against each hypothesis
    End For
    If new hypothesis found
      Insert the new hypothesis into the list
  End For
End For
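The slide does not say what makes a conformer "qualified" or a hypothesis "new", so the following is only one possible reading. It assumes a hypothesis is a set of bit positions, a conformer matches when it sets all of them, and a partial overlap of at least `min_size` bits is inserted as a new hypothesis. All names and the `min_size` parameter are hypothetical.

```python
def refine(molecules, hypotheses, min_size=3):
    """molecules: {mol_id: [set of set-bit positions per conformer]}
    hypotheses: iterable of sets of bit positions

    Returns {hypothesis: sorted list of molecule ids with a matching conformer}.
    """
    hyps = list(dict.fromkeys(frozenset(h) for h in hypotheses))
    for conformers in molecules.values():          # for all molecules
        for bits in conformers:                    # for all conformers
            for h in list(hyps):                   # snapshot: the list may grow
                common = h & bits
                # Assumed rule: a large partial match becomes a new hypothesis.
                if common != h and len(common) >= min_size and common not in hyps:
                    hyps.append(common)
    return {
        h: sorted(m for m, confs in molecules.items() if any(h <= b for b in confs))
        for h in hyps
    }


molecules = {1: [{0, 1, 2, 3}], 2: [{0, 1, 2, 9}], 3: [{5, 6}]}
members = refine(molecules, [{0, 1, 2, 3}])
```

In this toy run molecule 2 misses one bit of the starting hypothesis, so the three shared bits are added as a second hypothesis that covers molecules 1 and 2.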

27 Benchmarking: Test Datasets 1. Bit string alignment: 20 20-bit strings. 2. Single binding mode: Angiotensin-Converting Enzyme (ACE) inhibitors. 3. Multiple binding modes/mechanisms: Dopamine receptor inhibitors (D2/D4).

28 Example 1: A Toy Dataset (Gibbs Sampling Only) 20 x 20 bit strings, mimicking 20 molecules, each with 20 conformers. Each bit string is 20 bits long. Computation time: <1 sec. Result:
1_14  10001000010000000100
2_14  10001000010000000100
3_15  10001000010000000100
4_12  10001000010000000000
5_15  10001000010010000100
6_7   10001000010000000000
7_8   10001000010000000000
8_19  10001000010000000100
…

29 Example 2: ACE Inhibitors 78 active compounds. OMEGA® from OpenEye® is used to generate multiple conformers. Two RMSD cutoffs used: 2.0 Å: 4,613 conformers generated. 1.0 Å: 46,268 conformers generated.

30 ACE inhibitors Results Using 4,613 conformers, 55/78 molecules contain the expected pharmacophore. Using 46,268 conformers, 65/78 molecules contain the expected pharmacophore.

31 Example 2: ACE inhibitors: Best Identified Pharmacophore [Figure: pharmacophore with inter-feature distance ranges 2.84 ~ 4.50 Å, 4.51 ~ 5.70 Å, 4.99 ~ 6.77 Å.]

32 Example 2: ACE inhibitors Other possible pharmacophore

33 Example 3: Testing on Multiple Binding Modes (D2, D4 ligands)

34 Example 3: Dopamine antagonists Two pharmacophores were extracted from one data set!

35 Conclusion Traditional Methods: Exhaustive enumeration of pharmacophores, limited coverage of conformational space. “Many to many” limits search. PharmID: Selective enumeration of pharmacophores, better coverage of conformational space. Each search is “many to one”.

36 Acknowledgements Coworkers: Stan Young, Jun Feng, Ashish Sanil. OMEGA is a product from OpenEye Scientific Software Inc. Support from Hereditary Disease Foundation. Become a NISS affiliate!
