1
DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network
July 18th, 2018
2
Overview
Goals: Show that applying deep learning to survival analysis performs well at predicting risk. Demonstrate that a deep neural network can be used as a personalized treatment recommender system for further medical research.
Contributions: The model performs better than other survival analysis methods. The model can be used for treatment recommendations, which could increase the median survival time.
3
Survival analysis
The study of the relationship between elapsed time and the survival probability of an individual, e.g., death in biological organisms or failure in mechanical systems. It generalizes to methods for studying the correlation between events and the times at which they occur.
4
Survival analysis
Cox proportional hazards model (CPH): a survival analysis model for analyzing how multiple variables affect survival time. Its advantage over logistic regression is its ability to deal with censored data.
Censored data: arises when the study terminates, or when a person withdraws from the study or is lost to follow-up during the study period.
5
Survival analysis: survival data
x: a patient's baseline data
T: the failure event time
E: event indicator (1: the event was observed; 0: the response was censored)
Survival function and hazard function: see the standard definitions below.
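The slide's original formula images are not reproduced here; the following LaTeX gives the standard definitions that the two labels refer to (a reconstruction, not copied from the slides):

```latex
% Survival function: probability that the failure time exceeds t.
S(t) = \Pr(T > t)
% Hazard function: instantaneous failure rate at time t, given survival up to t.
\lambda(t) = \lim_{\delta \to 0} \frac{\Pr(t \le T < t + \delta \mid T \ge t)}{\delta}
```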
6
Linear survival model---CPH
Hazard function: λ(t | x) = λ₀(t) · exp(h(x)), where λ₀(t) is the baseline hazard function and h(x) = βᵀx is the risk score.
Estimation of the regression parameters: maximize the log of the Cox partial likelihood (written out below) to obtain the estimator; only non-censored individuals (E = 1) contribute terms to the partial likelihood.
Summary: The CPH is a semiparametric model that calculates the effects of observed covariates on the risk of an event occurring. The model assumes that a patient's log-risk of failure is a linear combination of the patient's covariates.
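For reference, the Cox partial likelihood mentioned above can be written as follows (the risk-set notation R(T_i) is our own labeling; the slide's original formula image is not available):

```latex
% Partial likelihood over non-censored individuals (E_i = 1);
% R(T_i) is the risk set: patients still at risk at time T_i.
L(\beta) = \prod_{i:\,E_i = 1}
  \frac{\exp\big(h_\beta(x_i)\big)}{\sum_{j \in R(T_i)} \exp\big(h_\beta(x_j)\big)},
\qquad
\ell(\beta) = \sum_{i:\,E_i = 1}
  \Big( h_\beta(x_i) - \log \sum_{j \in R(T_i)} \exp\big(h_\beta(x_j)\big) \Big)
```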
7
Non-linear survival model---DeepSurv
DeepSurv: a combination of CPH and a neural network. The neural network learns the complex non-linear relationship between prognostic features and an individual's risk of failure.
Input: a patient's baseline data x
Output: a single node with a linear activation, which estimates the log-risk function in CPH
Objective function: the average negative log partial likelihood
Training method: gradient descent
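A minimal PyTorch-style sketch of this setup (the layer sizes, names, and the sort-based risk-set computation are illustrative assumptions, not the authors' exact implementation):

```python
import torch
import torch.nn as nn

class DeepSurvNet(nn.Module):
    """MLP mapping baseline covariates x to a scalar log-risk h(x)."""
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),          # single output node, linear activation
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)     # estimated log-risk h_theta(x)

def average_neg_log_partial_likelihood(log_risk, time, event):
    """Average negative Cox partial log-likelihood over observed events.
    Sorts by descending survival time so that the risk set of patient i is
    simply patients 0..i (ties are ignored in this sketch)."""
    order = torch.argsort(time, descending=True)
    log_risk, event = log_risk[order], event[order]
    log_risk_set = torch.logcumsumexp(log_risk, dim=0)   # log of sum over risk set
    observed = event == 1
    return -(log_risk[observed] - log_risk_set[observed]).mean()
```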
8
Treatment recommender system
Recommender function: the log of the hazard ratio, used to calculate the personal risk ratio of prescribing one treatment option over another. E.g., a positive value means treatment i leads to a higher risk of death than treatment j (see the sketch below).
Advantage over CPH: DeepSurv calculates the recommender function without an a priori specification of treatment interaction terms, whereas the CPH model computes a constant recommender function unless treatment interaction terms are added to the model.
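A small sketch of how this recommender function could be evaluated with a trained network such as DeepSurvNet above (the convention of appending a treatment indicator to the covariate vector is an assumption of this sketch):

```python
import torch

def recommender_function(model, x, treatment_i, treatment_j):
    """Personal log hazard ratio rec_{i,j}(x) = h(x, i) - h(x, j).
    A positive value means treatment i implies a higher risk of death than j,
    so treatment j would be recommended for this patient.
    Assumes the treatment indicator is appended to the covariate vector x."""
    x_i = torch.cat([x, torch.tensor([float(treatment_i)])])
    x_j = torch.cat([x, torch.tensor([float(treatment_j)])])
    return (model(x_i.unsqueeze(0)) - model(x_j.unsqueeze(0))).item()
```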
9
Treatment recommender system
10
Dynamic strategy for personalized medicine: An application to metastatic breast cancer
Journal of Biomedical Informatics 68 (2017) 50–57
Xi Chen (a), Ross D. Shachter (a), Allison W. Kurian (b), Daniel L. Rubin (c)
(a) Department of Management Science & Engineering, Stanford University, Stanford, CA, USA
(b) Department of Medicine, Stanford University, Stanford, CA, USA
(c) Department of Radiology, Stanford University, Stanford, CA, USA
July 18th, 2018
11
Challenging question: Is the current therapy effective?
Tumor size does not change significantly between visits, so effectiveness is hardly observable, and tumors develop resistance to therapies. When should we switch to another therapy? The effectiveness of the current therapy is not directly observable, and the traditional signal based on progression might not be the best signal. Instead, use the past treatment and observation history to update our belief about the current therapy's effectiveness.
12
POMDPs introduction
POMDPs describe uncertainty about action outcomes.
13
Belief state: a probability distribution over the states of the underlying MDP. The agent keeps an internal belief state, b, that summarizes its experience.
14
Belief state: b(s) is the probability that the environment is in state s.
The agent uses a state estimator, SE, to update the belief state to b'. Since b' contains all the information of the observation and the model, the belief space B of a POMDP can be regarded as the state space S of an MDP. Specifically, b' contains: (1) all information from the previous time step, b; (2) the model information, the transition function T; (3) the observation information O(s', a, o), s' ∈ S. A sketch of this update follows below.
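A minimal NumPy sketch of the state estimator SE described above (the array layout and variable names are illustrative assumptions):

```python
import numpy as np

def state_estimator(b, a, o, T, O):
    """Compute the next belief b' after taking action a and observing o.
    b: current belief over states, shape (S,)
    T: transition probabilities, T[s, a, s'] = Pr(s' | s, a)
    O: observation probabilities, O[s', a, o] = Pr(o | s', a)
    """
    b_next = O[:, a, o] * (T[:, a, :].T @ b)   # O(s', a, o) * sum_s T(s, a, s') b(s)
    return b_next / b_next.sum()               # normalize by Pr(o | a, b)
```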
15
POMDP as Belief-MDP
The policy of a POMDP maps the current belief state into an action. As the belief state holds all relevant information about the past, the optimal policy of the POMDP is the solution of the corresponding belief MDP. The Bellman equation for this belief MDP is written out below. In the general case, continuous-space MDPs are very hard to solve. Unfortunately, exact DP updates cannot be carried out, because there are uncountably many belief states: one cannot enumerate the value function over every belief state. To conduct the DP steps implicitly, we need to discuss the properties of the value functions.
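The slide's equation image is not available; the standard Bellman equation for a belief MDP, which the text refers to, has the following form (a reconstruction using the state estimator SE from the previous slide):

```latex
% Value of a belief state b; Omega is the observation set, gamma the discount factor.
V(b) = \max_{a \in A} \Big[ r(b, a)
     + \gamma \sum_{o \in \Omega} \Pr(o \mid b, a)\, V\big(SE(b, a, o)\big) \Big],
\qquad
r(b, a) = \sum_{s \in S} b(s)\, R(s, a)
```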
16
POMDP-Belief MDP
Action: one of four therapies (A, B, C, D) chosen each period: 2 hormone therapies (Tamoxifen and Fulvestrant) and 2 chemotherapies (Docetaxel and Capecitabine).
Reward: the tumor response (negative log of the tumor size) minus the disutility due to chemotherapy side effects (a sketch follows below).
State: the current tumor response and the joint probability of effectiveness for all therapies. The policy decides what action to take given the current probability distribution, rather than given the (unobserved) current state.
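A tiny sketch of the per-period reward described above (the disutility constant is a placeholder, not a value from the paper):

```python
import math

def period_reward(tumor_size, used_chemotherapy, chemo_disutility=0.5):
    """Reward = tumor response (negative log of tumor size) minus a
    disutility incurred when a chemotherapy was applied this period.
    chemo_disutility = 0.5 is a placeholder value, not taken from the paper."""
    tumor_response = -math.log(tumor_size)
    return tumor_response - (chemo_disutility if used_chemotherapy else 0.0)
```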
17
Inputs
Tumor response observation probability matrix
Initial probability of effectiveness
Within-therapy decline rate α
Between-therapy decline rate β
18
Compute belief states
Kalman Filter Q-Learning (KFQL): the Q-value approximates the optimal net present value of the future rewards; a Kalman filter is used to update the weight vector r of the approximation (see the sketch below).
Approximate Kalman Filter Q-Learning (AKFQL): ignores the dependence among basis functions, which reduces the computational complexity.
Successive Approximations of the Reachable Space under Optimal Policies (SARSOP).
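A rough sketch of the Kalman-filter Q-learning idea mentioned above, with Q(b, a) approximated as a linear function of basis features and the weight vector maintained as a Gaussian refined by a scalar Kalman measurement step on each temporal-difference target (a generic illustration under these assumptions, not the paper's exact formulation):

```python
import numpy as np

class KalmanFilterQLearning:
    """Q(b, a) ~= w . phi(b, a); the weight vector w is kept as a Gaussian
    N(w, P) and updated with a scalar Kalman measurement step per TD target.
    The AKFQL variant would keep only the diagonal of P, ignoring the
    dependence among basis functions."""

    def __init__(self, n_features, gamma=0.95, obs_noise=1.0):
        self.w = np.zeros(n_features)        # mean of the weight estimate
        self.P = np.eye(n_features)          # covariance of the weight estimate
        self.gamma = gamma
        self.obs_noise = obs_noise

    def q_value(self, phi):
        return self.w @ phi

    def update(self, phi, reward, next_phis, terminal=False):
        # TD target: reward + gamma * max over next actions of Q(b', a')
        target = reward
        if not terminal:
            target += self.gamma * max(self.w @ p for p in next_phis)
        # Scalar Kalman measurement update for y = phi . w + noise
        innovation = target - self.w @ phi
        s = phi @ self.P @ phi + self.obs_noise
        gain = (self.P @ phi) / s
        self.w = self.w + gain * innovation
        self.P = self.P - np.outer(gain, phi) @ self.P
        # AKFQL variant: self.P = np.diag(np.diag(self.P))
```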
19
Results Comparison of policies
The policies from KFQL and AKFQL are the most effective at limiting tumor growth. All three computed policies apply chemotherapy in fewer periods than the standard therapy strategy. KFQL and AKFQL converge to policies with similar average reward, and AKFQL converges faster.
20
Results Comparison of policies