
1 Introduction to Neural Networks in Medical Diagnosis Włodzisław Duch PhD, DSc Dept. of Informatics, Nicholas Copernicus University, Poland

2 What is it about? Data is precious! But also overwhelming... Statistical methods are important, but new techniques may frequently be more accurate and give more insight into the data. Data analysis requires intelligence. Inspirations come from many sources, including biology: artificial neural networks, evolutionary computing, immune systems...

3 Computational Intelligence = Data + Knowledge. Related fields: Artificial Intelligence (expert systems, fuzzy logic), pattern recognition, machine learning, probabilistic methods, multivariate statistics, visualization, evolutionary algorithms, neural networks.

4 What do these methods do? Provide non-parametric models of data. Classify new data into pre-defined categories, supporting diagnosis & prognosis. Discover new categories. Help to understand the data by creating fuzzy or crisp logical rules. Help to visualize multi-dimensional relationships among data samples. Help to model real neural networks!

5 Neural networks. Inspired by neurobiology: simple elements cooperate, changing internal parameters. A large field: dozens of different models, over 500 papers on NN in medicine each year. Supervised networks: heteroassociative mapping X => Y, symptoms => diseases; universal approximators. Unsupervised networks: clustering, competitive learning, autoassociation. Reinforcement learning: modeling behavior, playing games, sequential data.

6 Supervised learning Compare the desired with the achieved outputs … you can’t always get what you want.

7 Unsupervised learning Find interesting structures in data.

8 Reinforcement learning Reward comes after the sequence of actions.

9 Real and artificial neurons (figure): dendrites, axon and synapses of the biological neuron correspond to input signals, nodes and weights of the artificial network.

10 Neural network for MI diagnosis (figure): inputs (sex, age, smoking, pain intensity, pain duration, ECG: ST elevation) feed through input weights to hidden nodes and output weights, producing the estimate Myocardial Infarction ~ p(MI|X).

11 MI network function. Training: setting the values of weights and thresholds; efficient algorithms exist. Effect: a non-linear regression function. Such networks are universal approximators: they may learn any mapping X => Y.
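The mapping above can be sketched as a tiny feed-forward network in Python (the architecture, weights and input encoding here are illustrative assumptions, not the original MI network):

```python
import math

def sigmoid(z):
    # smooth squashing function used by classic MLP nodes
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(x, W1, b1, w2, b2):
    """One hidden layer: symptom inputs -> hidden nodes -> p(MI|X)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)

# Toy weights for 3 rescaled inputs (e.g. ST elevation, pain intensity, age)
W1 = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
b1 = [0.0, -0.1]
w2 = [1.2, -0.7]
b2 = 0.05
p_mi = mlp_forward([0.7, 0.5, 0.65], W1, b1, w2, b2)  # a value in (0, 1)
```

Training would adjust W1, b1, w2, b2 to make p_mi match the known diagnoses.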

12 Knowledge from networks. Simplify networks: force most weights to 0, quantize the remaining parameters, be constructive! Regularization: a mathematical technique improving the predictive abilities of the network. Result: MLP2LN neural networks that are equivalent to logical rules.
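A minimal sketch of the weight-pruning idea (soft-thresholding with an L1-style penalty; the actual MLP2LN procedure is more elaborate): small weights are forced exactly to zero, while large ones merely shrink.

```python
def l1_regularize_step(weights, grad_error, lr=0.1, lam=0.5):
    """One gradient step with an L1 penalty: subtract the error gradient,
    then shrink each weight toward zero, clipping tiny weights to 0."""
    shrink = lr * lam
    new_w = []
    for w, g in zip(weights, grad_error):
        w2 = w - lr * g                # error-driven update
        if abs(w2) <= shrink:          # prune weights smaller than the penalty
            new_w.append(0.0)
        elif w2 > 0:
            new_w.append(w2 - shrink)
        else:
            new_w.append(w2 + shrink)
    return new_w

# With zero error gradients, only the penalty acts:
w_new = l1_regularize_step([0.8, -0.02, 0.03, -0.5], [0.0, 0.0, 0.0, 0.0])
```

Repeated over many epochs, this drives most weights to exactly 0, leaving a sparse network that can be read as logical rules.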

13 Recurrence of breast cancer. Data from: Institute of Oncology, University Medical Center, Ljubljana, Yugoslavia. 286 cases: 201 no recurrence (70.3%), 85 recurrence (29.7%). 9 nominal features: age (9 bins), menopause, tumor-size (12 bins), nodes involved (13 bins), node-caps, degree-malignant (1, 2, 3), breast, breast quad, radiation. Example record: no-recurrence-events, 40-49, premeno, 25-29, 0-2, ?, 2, left, right_low, yes.

14 Recurrence of breast cancer. Data from: Institute of Oncology, University Medical Center, Ljubljana, Yugoslavia. Many systems used, 65-78% accuracy reported. Single rule: IF nodes-involved ∉ [0,2] ∧ degree-malignant = 3 THEN recurrence, ELSE no-recurrence. 76.2% accuracy; only trivial knowledge in the data: highly malignant breast cancer involving many nodes is likely to strike back.
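The single extracted rule reads directly as code (feature names follow the dataset description on the previous slide):

```python
def recurrence_rule(nodes_involved, degree_malignant):
    """Single extracted rule: predict recurrence iff more than 2 nodes
    are involved AND the tumor is highly malignant (grade 3)."""
    if nodes_involved > 2 and degree_malignant == 3:
        return "recurrence-events"
    return "no-recurrence-events"
```

Because the majority class ("no recurrence") covers 70.3% of cases, even this one rule only lifts accuracy to 76.2%, which is why it counts as trivial knowledge.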

15 Recurrence - comparison (10xCV accuracy, %):
MLP2LN, 1 rule: 76.2
SSV DT, stable rules: 75.7 ± 1.0
k-NN, k=10, Canberra: 74.1 ± 1.2
MLP+backprop: 73.5 ± 9.4 (Zarndt)
CART DT: 71.4 ± 5.0 (Zarndt)
FSM, Gaussian nodes: 71.7 ± 6.8
Naive Bayes: 69.3 ± 10.0 (Zarndt)
Other decision trees: < 70.0

16 Breast cancer diagnosis. Data from University of Wisconsin Hospital, Madison, collected by Dr. W. H. Wolberg. 699 cases, 9 features quantized from 1 to 10: clump thickness, uniformity of cell size, uniformity of cell shape, marginal adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli, mitoses. Task: distinguish benign from malignant cases.

17 Breast cancer rules. Data from University of Wisconsin Hospital, Madison, collected by Dr. W. H. Wolberg. Simplest rule from MLP2LN, large regularization: IF uniformity of cell size < 3 THEN benign, ELSE malignant. Sensitivity = 0.97, Specificity = 0.85. More complex NN solutions, from 10CV estimate: Sensitivity = 0.98, Specificity = 0.94.
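Sensitivity and specificity are simple functions of the confusion counts. The counts below are invented purely to illustrate values close to the reported 0.97 / 0.85; they are not from the Wisconsin data:

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN): fraction of malignant cases caught.
    Specificity = TN / (TN + FP): fraction of benign cases correctly cleared."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical confusion counts chosen to mimic the single-rule figures
sens, spec = sensitivity_specificity(tp=234, fn=7, tn=389, fp=69)
```

For screening, high sensitivity matters most: a missed malignancy (false negative) is far costlier than a false alarm.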

18 Breast cancer comparison (10xCV accuracy, %):
k-NN, k=3, Manhattan: 97.0 ± 2.1 (GM)
FSM, neurofuzzy: 96.9 ± 1.4 (GM)
Fisher LDA: 96.8
MLP+backprop: 96.7 (Ster, Dobnikar)
LVQ: 96.6 (Ster, Dobnikar)
IncNet (neural): 96.4 ± 2.1 (GM)
Naive Bayes: 96.4
SSV DT, 3 crisp rules: 96.0 ± 2.9 (GM)
LDA (linear discriminant): 96.0
Various decision trees: 93.5-95.6

19 Melanoma skin cancer. Collected in the Outpatient Center of Dermatology in Rzeszów, Poland. Four types of melanoma: benign, blue, suspicious, or malignant. 250 cases, with almost equal class distribution. Each record in the database has 13 attributes: asymmetry, border, color (6), diversity (5). TDS (Total Dermatoscopy Score): a single index. Goal: hardware scanner for preliminary diagnosis.

20 Melanoma results (Rules / Training % / Test %):
MLP2LN, crisp rules: 4 / 98.0 / 100
SSV Tree, crisp rules: 4 / 97.5 ± 0.3 / 100
FSM, rectangular f.: 7 / 95.5 ± 1.0 / 100
knn + prototype selection: 13 / 97.5 ± 0.0 / 100
FSM, Gaussian f.: 15 / 93.7 ± 1.0 / 95 ± 3.6
knn k=1, Manh, 2 features: -- / 97.4 ± 0.3 / 100
LERS, rough rules: 21 / -- / 96.2

21 Antibiotic activity of pyrimidine compounds. Pyrimidines: which compound has stronger antibiotic activity? Common template, substitutions added at 3 positions: R3, R4 and R5. 27 features taken into account: polarity, size, hydrogen-bond donor or acceptor, pi-donor or acceptor, polarizability, sigma effect. Pairs of chemicals (54 features) are compared: which one has higher activity? 2788 cases, 5-fold crossvalidation tests.

22 Antibiotic activity - results. Pyrimidines: which compound has stronger antibiotic activity? Mean Spearman's rank correlation coefficient rs used:
FSM, 41 Gaussian rules: 0.77 ± 0.03
Golem (ILP): 0.68
Linear regression: 0.65
CART (decision tree): 0.50
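Spearman's rank correlation, the evaluation measure used above, can be computed from scratch: rank both sequences (ties share their average rank), then take the Pearson correlation of the ranks. A straightforward sketch:

```python
def rank(values):
    """1-based average ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                      # extend the block of tied values
        avg = (i + j) / 2 + 1           # mean position, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Pearson correlation of the rank vectors."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

Perfectly concordant rankings give +1, perfectly reversed ones give -1, so the 0.77 reported for FSM means its predicted activity ordering agrees strongly with the measured one.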

23 Thyroid screening. Data from the Garavan Institute, Sydney, Australia. 15 binary and 6 continuous features. Training: 93 + 191 + 3488 cases; validation: 73 + 177 + 3178. Goals: determine important clinical factors; calculate the probability of each diagnosis. (Figure: clinical findings - age, sex, TSH, T3, TT4, T4U, TBG, ... - feed hidden units producing the final diagnoses: normal, hyperthyroid, hypothyroid.)

24 Thyroid - some results. Accuracy of diagnoses obtained with different systems (Rules/Features, Training %, Test %):
MLP2LN optimized: 4/6, 99.9, 99.36
CART/SSV Decision Trees: 3/5, 99.8, 99.33
Best Backprop MLP: -/21, 100, 98.5
Naive Bayes: -/-, 97.0, 96.1
k-nearest neighbors: -/-, -, 93.8

25 Psychometry. MMPI (Minnesota Multiphasic Personality Inventory) psychometric test. Printed forms are scanned, or a computerized version of the test is used. Raw data: 550 questions, e.g.: "I am getting tired quickly: Yes - Don't know - No". Results are combined into 10 clinical scales and 4 validity scales using fixed coefficients. Each scale measures tendencies towards hypochondria, schizophrenia, psychopathic deviations, depression, hysteria, paranoia, etc.

26 Scanned form

27 Computer input

28 Scales

29 Psychometry. There is no simple correlation between single values and the final diagnosis. Results are displayed in the form of a histogram, called a "psychogram". Interpretation depends on the experience and skill of an expert and takes into account correlations between peaks. Goal: an expert system providing evaluation and interpretation of MMPI tests at an expert level. Problem: experts agree only about 70% of the time; alternative diagnoses and personality changes over time are important.

30 Psychogram

31 Psychometric data. 1600 cases for women, the same number for men. 27 classes: norm, psychopathic, schizophrenia, paranoia, neurosis, mania, simulation, alcoholism, drug addiction, criminal tendencies, abnormal behavior due to... Extraction of logical rules: 14 scales = features. Define linguistic variables and use FSM, MLP2LN, SSV - giving about 2-3 rules/class.

32 Psychometric data. 10-CV accuracy for FSM is 82-85%, for C4.5 79-84%. Input uncertainty +Gx of around 1.5% (best ROC) improves FSM results to 90-92%.
Method / Data / N. rules / Accuracy % / +Gx %:
C4.5: ♀, 55 rules, 93.0, 93.7
C4.5: ♂, 61 rules, 92.5, 93.1
FSM: ♀, 69 rules, 95.4, 97.6
FSM: ♂, 98 rules, 95.9, 96.9
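A hedged sketch of the input-uncertainty idea: instead of classifying the raw scale values once, classify many Gaussian perturbations of the input and average the class votes. Here rule_classifier is a stand-in for the extracted rule set, and the scales and threshold are invented for illustration:

```python
import random

def classify_with_uncertainty(x, rule_classifier, sigma=0.015,
                              n_samples=200, seed=0):
    """Average class votes over Gaussian perturbations of the input vector,
    yielding class probabilities instead of a single crisp answer."""
    rng = random.Random(seed)
    votes = {}
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        c = rule_classifier(noisy)
        votes[c] = votes.get(c, 0) + 1
    return {c: v / n_samples for c, v in votes.items()}

# Toy stand-in rule: class depends on whether the first scale exceeds 0.5
toy_rule = lambda x: "norm" if x[0] < 0.5 else "neurosis"
probs = classify_with_uncertainty([0.49], toy_rule)
```

For a case near a rule boundary (0.49 vs the 0.5 threshold), both classes receive nonzero probability, which matches the slide's point that larger uncertainties make more classes plausible.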

33 Psychometric Expert. Probabilities for different classes; for greater uncertainties, more classes are predicted. Fitting the rules to the conditions: typically 3-5 conditions per rule; Gaussian distributions around measured values that fall into the rule interval are shown in green. Verbal interpretation of each case, rule- and scale-dependent.

34 MMPI probabilities

35 MMPI rules

36 MMPI verbal comments

37 Visualization. Probability of classes versus input uncertainty. Detailed input probabilities around the measured values vs. changes in a single scale; changes over time define the "patient's trajectory". Interactive multidimensional scaling: zooming in on the new case to inspect its similarity to other cases.

38 Class probability/uncertainty

39 Class probability/feature

40 MDS visualization

41 Summary. Neural networks and other computational intelligence methods are useful additions to the multivariate statistical tools. They support diagnosis, predictions, and data understanding: extracting rules, prototypes. The FDA has approved many devices that use ANNs: the Oxford Instruments Ltd EEG analyzer, the Cardionetics (UK) ECG analyzer, PAPNET (NSI) analysis of Pap smears...

42 Challenges. Discovery of theories rather than data models. Integration with image/signal analysis. Integration with reasoning in complex domains. Combining expert systems with neural networks... Fully automatic universal data analysis systems: press the button and wait for the truth... We are slowly getting there. More and more computational intelligence tools (including our own) are available.

