
Efficient And Accurate Ranking of Multidimensional Drug Profiling Data by Graph-Based Algorithm Dorit S. Hochbaum Chun-nan Hsu Yan T. Yang.


1 Efficient And Accurate Ranking of Multidimensional Drug Profiling Data by Graph-Based Algorithm Dorit S. Hochbaum Chun-nan Hsu Yan T. Yang

2 Agenda: Drug Ranking Problem; Contributions; Background and Problem Formulation; Method; Experiments; Results; Conclusions


4 Drug Ranking Problem

5 Difficulties: Ranking of the intermediate ones

6 Contributions: Fractional Adjusted Bi-partitional Score (FABS); graph theory; high-throughput screening; combinatorial solution. Photo credit: Oregon State University

7 Background: High-throughput screening. Recent innovation: robotics, software. Speed & quantity. Design & testing.


14 Problem Formulation


16 Two approaches. Generative modeling: model everything (X: input, Y: output). For each drug, model the distributions of chemical components, reactions, distribution processes in the cell, …; obtain an effectiveness distribution for each drug; compare parameters (e.g. mean) to obtain a ranking.

17 Discriminative learning: focus only on the output and treat the rest as a “black box” (X: input, Y: output). Input is mapped to output through criteria and/or training. Little domain knowledge required; task-related criteria; direct optimization. Drug ranking problem.

18 Method: Graph Formulation V

19 E

20 l Extreme Case 1 Extreme Case 2 Perfect Effectiveness Zero Effectiveness

21 Method: Graph Formulation. Edge weight w: how similar data point 1 is to data point 2; inversely related to the distance between the two points. Distance options: Euclidean distance; Minkowski distance (generalized Euclidean); Mahalanobis distance (scaled Euclidean); city block distance (absolute value version of Euclidean).
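The distance options listed on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the diagonal-variance simplification of the Mahalanobis distance, and the `1 / (1 + d)` conversion from distance to similarity are assumptions, since the slides say only that the weight is inverse to the distance.

```python
import math

def minkowski(a, b, p=2.0):
    """Minkowski distance; p=2 is Euclidean, p=1 is city block."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def scaled_euclidean(a, b, variances):
    """Mahalanobis distance restricted to a diagonal covariance (assumption):
    each squared difference is divided by that feature's variance."""
    return math.sqrt(sum((x - y) ** 2 / v for x, y, v in zip(a, b, variances)))

def similarity(distance):
    """Hypothetical edge weight: decreases as the distance grows."""
    return 1.0 / (1.0 + distance)

point_1 = (0.0, 0.0)
point_2 = (3.0, 4.0)
euclidean = minkowski(point_1, point_2, p=2)   # Euclidean distance
city_block = minkowski(point_1, point_2, p=1)  # city block distance
w_12 = similarity(euclidean)                   # edge weight between the two points
```

Any monotonically decreasing function of the distance would fit the slide's description equally well; `1 / (1 + d)` is used here only because it stays finite when two points coincide.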

22 Method: Graph Formulation. The graph encodes all possible pairwise relationships. It is compact and discrete, and is digitized as a matrix: the adjacency matrix w.
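As a rough illustration of the adjacency matrix, the sketch below builds a symmetric weight matrix over all pairs of points. The Euclidean distance and the `1 / (1 + d)` weight are assumptions chosen for the example, and `adjacency_matrix` is a hypothetical helper name.

```python
import math

def adjacency_matrix(points):
    """Symmetric matrix w with w[i][j] = similarity of points i and j.
    The diagonal is left at zero (no self-loops)."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])   # Euclidean distance
            w[i][j] = w[j][i] = 1.0 / (1.0 + d)   # assumed similarity
    return w

points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
W = adjacency_matrix(points)
```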

23 Method: Graph Cut w

24 Method: Graph Cut w Cut

25 Method: Graph Cut. Graph cut is a well-studied problem. Criteria: Normalized Cut’ (Normalized Cut Prime) [Hochbaum, 2010]; Normalized Cut [Shi, Malik 2000]; Minimum Cut; Ratio Region [Cox, Rao, Zhong 1996].

26 Method: Graph Cut. Graph cut is a well-studied problem. Complexity by criterion: Normalized Cut’: P; Normalized Cut: NP-complete; Minimum Cut: P; Ratio Region: P.
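To make the criteria concrete, the sketch below evaluates a given bi-partition under three of them. It assumes the commonly cited forms: the minimum cut objective is the total weight crossing the partition, Normalized Cut adds the cut divided by each side's total association, and Normalized Cut’ divides the cut by the similarity inside the chosen side. Counting each within-side edge once is an assumption, and the function names are hypothetical. The code only scores a partition; it does not search for the optimal one.

```python
def cut_weight(w, s):
    """Total weight of edges crossing from set s to its complement."""
    n = len(w)
    return sum(w[i][j] for i in s for j in range(n) if j not in s)

def within_weight(w, s):
    """Total weight of edges with both ends in s, each edge counted once."""
    members = sorted(s)
    return sum(w[i][j] for a, i in enumerate(members) for j in members[a + 1:])

def normalized_cut_prime(w, s):
    """Cut weight divided by the similarity inside s (assumed form of NC')."""
    return cut_weight(w, s) / within_weight(w, s)

def normalized_cut(w, s):
    """Cut weight divided by each side's association with the whole graph."""
    n = len(w)
    complement = set(range(n)) - set(s)
    cut = cut_weight(w, s)
    assoc_s = sum(w[i][j] for i in s for j in range(n))
    assoc_c = sum(w[i][j] for i in complement for j in range(n))
    return cut / assoc_s + cut / assoc_c

# Two tight pairs (0-1 and 2-3) joined by one weak edge (1-2).
W = [
    [0, 4, 0, 0],
    [4, 0, 1, 0],
    [0, 1, 0, 4],
    [0, 0, 4, 0],
]
S = {0, 1}
cut = cut_weight(W, S)
nc_prime = normalized_cut_prime(W, S)
nc = normalized_cut(W, S)
```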

27 Method: Graph Cut. Graph cut is a well-studied problem whose goal is to find a bi-partition, whereas drug ranking must rank drugs. Question: how to use a partition algorithm?

28 Method: FABS. Question: how to use a partition algorithm? Extreme 1, Extreme 2; Zero Effectiveness, Perfect Effectiveness. (Edges and edge weights are omitted.)

29 Method: FABS. Seeds: force the extreme cases into two separate partitions. Extreme 1, Extreme 2; Perfect Effectiveness, Zero Effectiveness. (Edges and edge weights are omitted.)

30 Method: FABS. Seeds: force the extreme cases into two separate partitions. Bipartition: perform any partition algorithm. Extreme 1, Extreme 2; Zero Effectiveness, Perfect Effectiveness. (Edges and edge weights are omitted.)
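Since the slide allows any partition algorithm, the seeding step can be illustrated with a plain minimum s-t cut, which keeps the two seeds on opposite sides by construction. This is a substitute chosen for brevity, not the Normalized Cut’ solver the talk uses. The sketch assumes a single seed node per extreme case and a dense weight matrix, and `seeded_min_cut` is a hypothetical name.

```python
from collections import deque

def seeded_min_cut(w, source, sink):
    """Nodes on the source side of a minimum s-t cut (Edmonds-Karp max-flow)."""
    n = len(w)
    residual = [row[:] for row in w]
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            break  # no path left: visited nodes form the source side
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push flow along the path.
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
    return {v for v in range(n) if parent[v] != -1}

# Node 0 is the seed for one extreme case, node 4 for the other.
W = [
    [0, 5, 0, 0, 0],
    [5, 0, 1, 0, 0],
    [0, 1, 0, 5, 0],
    [0, 0, 5, 0, 5],
    [0, 0, 0, 5, 0],
]
side = seeded_min_cut(W, source=0, sink=4)
```

With several seed points per extreme case, one common approach is to merge each seed set into a single super-node before running the cut.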

31 Method: FABS. Seeds: force the extreme cases into two separate partitions. Bipartition: perform any partition algorithm. Proportions per drug: 1/3, 3/3, 0/3. Extreme 1, Extreme 2; Zero Effectiveness, Perfect Effectiveness. (Edges and edge weights are omitted.)

32 Method: FABS. Seeds: force the extreme cases into two separate partitions. Bipartition: perform any partition algorithm. Proportions per drug: 1/3, 3/3, 0/3. Ranking: 1 > 1/3 > 0. (Edges and edge weights are omitted.)
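The proportion step can be sketched as follows: for each drug, count the fraction of its data points that fall on one side of the bi-partition, then sort drugs by that fraction. Which side is counted is an assumption here (the side containing the favourable extreme case), and the drug labels and helper names are hypothetical.

```python
from collections import defaultdict

def fabs_scores(drug_of_point, favourable_side):
    """Fraction of each drug's points that lie in favourable_side."""
    total = defaultdict(int)
    inside = defaultdict(int)
    for point, drug in drug_of_point.items():
        total[drug] += 1
        if point in favourable_side:
            inside[drug] += 1
    return {drug: inside[drug] / total[drug] for drug in total}

def rank_drugs(scores):
    """Drugs ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)

# Nine data points, three per drug; points 0-3 landed on the favourable side.
drug_of_point = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B", 6: "C", 7: "C", 8: "C"}
favourable_side = {0, 1, 2, 3}
scores = fabs_scores(drug_of_point, favourable_side)  # A: 3/3, B: 1/3, C: 0/3
ranking = rank_drugs(scores)
```

Because the score is a single number per drug, the ordering is unambiguous except for ties.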

33 Method: FABS Edges and edge weights are omitted Algorithm RankDrugs

34 Method: FABS Edges and edge weights are omitted FABS

35 Method: FABS – NC’. FABS uses Normalized Cut’ as a black box that solves the Normalized Cut’ criterion. Normalized Cut’ criterion [Hochbaum 2010]: combinatorial and efficient solution; good track record in various other fields. (Edges and edge weights are omitted.)

36 Method: FABS – NC’. Advantages of FABS – NC’: FABS is one-dimensional, so it ranks unambiguously; FABS is based on counts, which diminishes the effects of outliers and noise; FABS-NC’ is obtained by a combinatorial algorithm; FABS uses extreme cases for seeds, which minimizes expert intervention; FABS uses individual points, which avoids aggregating for each drug. (Edges and edge weights are omitted.)

37 Experiment: Mitochondria. 937 mitochondria images; unknown drug rankings; high resolution; components. Effectiveness criterion: toxicity (degree of fragmentation).

38 Extreme cases [Lin et al. 2010]: intact; completely fragmented. Intermediate cases.


40 Experiment: Mitochondria. 937 mitochondria images; unknown drug rankings; high resolution; components. Effectiveness criterion: toxicity (degree of fragmentation). Extreme case 1 (Good); Extreme case 2 (Bad). The ground truth, in order of increasing fragmentation from intact to complete fragmentation: Group 1, Group 2, Group 3.

41 Experiment: Evaluation Procedure. Inputs: extreme cases and groups. Subsampling: sample extreme points; sample group points. FABS-NC’ calculation. Ranking groups. Compare to the ground truth. 1000 runs. Calculate prediction accuracy.
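The evaluation loop can be sketched roughly as below. The sketch assumes that accuracy is the fraction of runs whose predicted group order matches the ground truth exactly. It uses a stand-in scoring function (the mean of one-dimensional toy values) in place of the FABS-NC’ calculation, and it omits the sampling of extreme points. All names and data are hypothetical.

```python
import random
import statistics

def evaluate(groups, truth, score_group, runs=1000, sample_size=5, seed=0):
    """Fraction of runs in which the predicted group order equals truth."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        # Subsample points from every group.
        sample = {name: rng.sample(pts, sample_size) for name, pts in groups.items()}
        # Score each group and order the groups by increasing score.
        scores = {name: score_group(pts) for name, pts in sample.items()}
        predicted = sorted(scores, key=scores.get)
        hits += predicted == truth
    return hits / runs

# Toy one-dimensional "fragmentation" values with well-separated groups.
groups = {
    "Group 1": [float(v) for v in range(0, 10)],
    "Group 2": [float(v) for v in range(20, 30)],
    "Group 3": [float(v) for v in range(40, 50)],
}
truth = ["Group 1", "Group 2", "Group 3"]
accuracy = evaluate(groups, truth, score_group=statistics.mean)
```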

42 Experiment: Other Methods. Used in practice: Center ranking: find the centers for all groups and extreme cases; PCA ranking: project onto the first principal component; Z-factor ranking: calculate z-score for each group [Zhang et al. 1999].
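For reference, the Z-factor of Zhang et al. 1999 compares two sets of measurements as 1 − 3(σ₁ + σ₂) / |μ₁ − μ₂|. The slides do not say how this is turned into a ranking. The sketch below assumes one plausible reading, in which each group is compared against an extreme case, and the data and the `z_factor` name are made up for illustration.

```python
import statistics

def z_factor(sample, control):
    """Z-factor: 1 - 3 * (sum of standard deviations) / |difference of means|."""
    spread = statistics.stdev(sample) + statistics.stdev(control)
    gap = abs(statistics.mean(sample) - statistics.mean(control))
    return 1.0 - 3.0 * spread / gap

extreme_case = [0.0, 1.0, 2.0]     # hypothetical measurements for an extreme case
group = [10.0, 11.0, 12.0]         # hypothetical measurements for one group
z = z_factor(group, extreme_case)  # closer to 1 means better separated
```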

43 Results:

44 Artificial Noise/Outliers: Robustness. Add noise/outliers to the ground truth: calculate the mean and the standard deviation for a group; randomly generate a data point; if it is 3 standard deviations from the mean of the group, accept it as an outlier; otherwise, reject it; repeat. Robustness: a more robust method is less affected by the noise.
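The accept/reject procedure on this slide can be sketched for one-dimensional data as follows. The sketch reads "3 standard deviations from the mean" as "at least 3", and draws candidates uniformly within 6 standard deviations of the mean; both are assumptions, as is the `make_outliers` name.

```python
import random
import statistics

def make_outliers(group, count, k=3.0, seed=0):
    """Generate `count` points at least k standard deviations from the group mean."""
    rng = random.Random(seed)
    mu = statistics.mean(group)
    sigma = statistics.stdev(group)
    outliers = []
    while len(outliers) < count:
        # Candidate drawn from an assumed proposal range of +/- 6 sigma.
        candidate = rng.uniform(mu - 6.0 * sigma, mu + 6.0 * sigma)
        if abs(candidate - mu) >= k * sigma:
            outliers.append(candidate)   # accept as an outlier
        # otherwise reject and draw again
    return outliers

group = [9.0, 10.0, 11.0, 10.5, 9.5]
outliers = make_outliers(group, count=4)
```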

45 Result with Noise

46 FABS-SVM & Group Distance Measure. Table columns: Group 1, Group 2, Group 3, Accuracy; rows: FABS-SVM (%), FABS-NC’ (%). How to measure the distance sensitivity of groups: Group Distance Measure (GDM). Algorithm calGDM.

47 Conclusions. A new drug ranking framework, FABS: graph-based, producing a single scalar score; sidesteps many pitfalls of other traditional methods. On the mitochondria database, FABS-NC′: performs better than three other methods; is robust when noise is introduced; outperforms FABS-SVM. In addition: group distance measure (GDM).

48 Future Directions: 1. Assess other FABS implementations by GDM; 2. Expand our FABS application; 3. Change edge weight; 4. Add node weight. Thanks: DHS Grant CBET; UC Berkeley; Information Sciences Institute.

