CLASSIFICATION OF TUMOR HISTOPATHOLOGY VIA SPARSE FEATURE LEARNING
Nandita M. Nayak (1), Hang Chang (1), Alexander Borowsky (2), Paul Spellman (3) and Bahram Parvin (1)


(1) Life Sciences Division, Lawrence Berkeley National Laboratory; (2) Center for Comparative Medicine, University of California, Davis; (3) Center for Spatial Systems Biomedicine, Oregon Health & Science University, Portland

Introduction

Objective: To evaluate tumor composition in terms of multiparametric morphometric indices and link those indices to the clinical data. Histology sections are decomposed into different components (e.g., stroma, tumor), and nuclear-compartment-specific morphometric indices are tested against outcomes.

Major Challenges and Approach

Fig: Example images of tumor samples in GBM showing diversity in the sample preparation.

Challenges: The analysis requires a large cohort of histology sections, which may be generated at different labs with a significant amount of technical variation. It is also expensive to generate a large amount of annotated training data.

Approach: Learn a set of automated features from unlabeled data, then train the learned features against an annotated dataset to classify a collection of small patches in each image.

Proposed Model

a) Learn a dictionary D using a generative model based on an extended version of the restricted Boltzmann machine (RBM). The model has two stages: feedforward (encoding) and feedback (decoding). A second layer of pooling makes the system robust to translation.

Diagram of the proposed method.

Fig: (a) Architecture of the restricted Boltzmann machine (RBM); (b) illustration of the 2-layer recognition framework, including the encoder, decoder and pooling.

Unsupervised Feature Learning

The input is a set of vectorized image patches X, from which the method generates an overcomplete set of k basis functions D and a sparse representation Z for each input. An encoder W is also learnt.
The optimization function is

$$F(X) = \|WX - Z\|_2^2 + \lambda \|Z\|_1 + \|DZ - X\|_2^2,$$

where $X \in \mathbb{R}^{n}$, $D \in \mathbb{R}^{n \times k}$, $W \in \mathbb{R}^{k \times n}$ and $Z \in \mathbb{R}^{k}$. The parameter $\lambda$ controls the sparsity of the solution and is chosen by cross validation.

The optimal D, Z and W for a given X are computed as follows:
1. Randomly initialize D and W.
2. Fix D and W, then minimize F(X) with respect to Z via gradient descent.
3. Fix Z, then estimate D and W via stochastic gradient descent.

Experimental Design

Experiments were conducted on two datasets derived from (i) Glioblastoma Multiforme (GBM) and (ii) kidney renal clear cell carcinoma (KIRC). Each image is 1K-by-1K pixels and is cropped from whole slide images (WSI). 1000 bases were constructed for each dataset.

GBM: Necrosis has been shown to be predictive of outcome. We curated three classes that correspond to necrosis, transition to necrosis (an intermediate step), and tumor. The dataset contains 1,400 images of samples scanned with a 20X objective. Feature learning was performed using 50 randomly selected patches of size 25 x 25 from each image. Max pooling was performed on 100 x 100 patches in a 4 x 4 neighborhood. The patches were downsampled by a factor of 2 and normalized to the range 0-1 in the color space.

KIRC: Tumor type is the best prognostic indicator of outcome, and most sections show mixed grading of clear cell carcinoma (CCC) and granular tumors. The histology is typically complex since it contains components of stroma, blood, and cystic space. We opted to label each image patch as normal, granular tumor type, CCC, stroma, or others. The dataset contains 2,500 images of samples scanned with a 40X objective. The patches were downsampled by a factor of 4 and normalized to the range 0-1 in the color space.

Classification and Reconstruction

Fig: (a) A heterogeneous tissue section with "necrosis transition" on the left and tumor on the right, and (b) its reconstruction after encoding and decoding.

Classification: The labeled training data is divided into non-overlapping image patches.
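The patch preparation described above (non-overlapping patches, downsampling, scaling to 0-1) can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes a single-channel 8-bit image held as a list of rows, downsamples by simple striding, and scales by 255; the poster does not state which downsampling or normalization method was used, and all function names are hypothetical.

```python
def split_patches(image, size):
    """Yield non-overlapping size-by-size patches; edge remainders are dropped."""
    rows, cols = len(image), len(image[0])
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            yield [row[c:c + size] for row in image[r:r + size]]


def downsample(patch, factor):
    """Keep every factor-th pixel in both directions (assumed method)."""
    return [row[::factor] for row in patch[::factor]]


def normalize(patch, max_value=255.0):
    """Scale pixel values to the range 0-1, assuming 8-bit input."""
    return [[v / max_value for v in row] for row in patch]


def prepare(image, size=100, factor=2):
    """Split, downsample and normalize, using the GBM settings by default."""
    return [normalize(downsample(p, factor)) for p in split_patches(image, size)]


# Synthetic 200 x 300 image standing in for a tissue image.
image = [[(r + c) % 256 for c in range(300)] for r in range(200)]
patches = prepare(image)
```

With the GBM settings (100 x 100 patches, factor 2), each prepared patch is 50 x 50; passing `factor=4` would correspond to the KIRC setting.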
Codes for these patches are computed as Z = WX and are pooled over a local neighborhood. The pooled codes are modeled for the different tissue types using a multiclass regularized support vector machine (SVM) implemented with LIBSVM.

Reconstruction: The original image can be reconstructed from the codes using the decoder, X = DZ. The reconstruction error is a measure of how well the computed bases represent the data.

Fig: Representative set of computed basis functions, D, for a) the KIRC dataset and b) the GBM dataset.

Classification results for GBM and KIRC

For GBM, a total of 12,000, 8,000 and 16,000 patches were obtained for necrosis, transition to necrosis and tumor, respectively. From each class, 4,000 patches were randomly selected for training. An overall accuracy of 84.3% was obtained. For KIRC, from a total of 10,000 patches for CCC, 16,000 patches for normal and stromal tissues, and 6,500 patches for tumor and others, we used 3,250 patches from each class for training. The overall classification accuracy was 80.9%.

Conclusion

A method for automated feature learning from unlabeled images has been proposed for classifying distinct morphometric regions. Automated feature learning provides a rich representation when a cohort of WSI has to be processed in the context of batch effects. Automated feature learning is a generative model that reconstructs the original image from a sparse representation of an auto encoder. The system has been tested on two tumor types from the TCGA archive. The proposed approach will enable identifying morphometric indices that are predictive of the outcome.

Classification of Heterogeneous Tissue Sections

The preliminary performance of the computational protocol for labeling tumor composition was tested on several GBM sections. Whole slide sections of size 20,000 × 20,000 pixels were selected, and each 100-by-100 pixel patch was classified against the learned model. Classification has been consistent with pathological evaluation.
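The patch-by-patch labeling of a whole section can be sketched as below. This is only an illustration under stated assumptions: the slide is a small synthetic 2-D list, and `placeholder_classifier` is a hypothetical stand-in with no biological meaning for the learned model (encoder, pooling and SVM). The names `label_map` and `composition` are likewise invented for this example.

```python
from collections import Counter


def label_map(slide, classify, patch=100):
    """Classify each non-overlapping patch-by-patch tile of a 2-D slide."""
    rows, cols = len(slide), len(slide[0])
    labels = []
    for r in range(0, rows - patch + 1, patch):
        row_labels = []
        for c in range(0, cols - patch + 1, patch):
            tile = [line[c:c + patch] for line in slide[r:r + patch]]
            row_labels.append(classify(tile))
        labels.append(row_labels)
    return labels


def composition(labels):
    """Fraction of tiles assigned to each class."""
    counts = Counter(label for row in labels for label in row)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def placeholder_classifier(tile):
    """Hypothetical stand-in for the learned model; thresholds mean intensity."""
    mean = sum(map(sum, tile)) / (len(tile) * len(tile[0]))
    return "tumor" if mean < 0.5 else "necrosis"


# Synthetic 200 x 300 "slide": dark on the left third, bright elsewhere.
slide = [[0.0 if c < 100 else 1.0 for c in range(300)] for _ in range(200)]
labels = label_map(slide, placeholder_classifier)
fractions = composition(labels)
```

The resulting grid of labels is what a color-coded overlay would be drawn from, and the per-class fractions give a simple summary of tissue composition.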
Fig: Two examples of classification results of heterogeneous GBM tissue sections. The left and right images correspond to the original and classification results, respectively. Color coding is black (tumor), pink (necrosis), and green (transition to necrosis).
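For reference, the alternating optimization described under Unsupervised Feature Learning can be sketched in plain Python. This is a minimal sketch under several assumptions, not the authors' implementation: it processes one vectorized patch at a time, uses a subgradient for the L1 term, starts each Z from the encoder prediction WX, and uses arbitrary step sizes and iteration counts. All function names are hypothetical.

```python
import random


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def sign(v):
    return (v > 0) - (v < 0)


def objective(X, D, W, Z, lam):
    """F(X) = ||WX - Z||^2 + lam * ||Z||_1 + ||DZ - X||^2 for one patch X."""
    enc = sum((a - z) ** 2 for a, z in zip(matvec(W, X), Z))
    rec = sum((a - x) ** 2 for a, x in zip(matvec(D, Z), X))
    return enc + lam * sum(abs(z) for z in Z) + rec


def z_step(X, D, W, Z, lam, lr):
    """One (sub)gradient step on Z with D and W held fixed."""
    enc_res = [z - a for a, z in zip(matvec(W, X), Z)]  # Z - WX
    rec_res = [a - x for a, x in zip(matvec(D, Z), X)]  # DZ - X
    # D^T (DZ - X): column j of D dotted with the reconstruction residual
    back = [sum(D[i][j] * rec_res[i] for i in range(len(X)))
            for j in range(len(Z))]
    return [z - lr * (2 * e + lam * sign(z) + 2 * b)
            for z, e, b in zip(Z, enc_res, back)]


def dw_step(X, D, W, Z, lr):
    """One stochastic gradient step on D and W with Z held fixed."""
    rec_res = [a - x for a, x in zip(matvec(D, Z), X)]  # DZ - X, length n
    enc_res = [a - z for a, z in zip(matvec(W, X), Z)]  # WX - Z, length k
    D_new = [[D[i][j] - lr * 2 * rec_res[i] * Z[j] for j in range(len(Z))]
             for i in range(len(X))]
    W_new = [[W[j][i] - lr * 2 * enc_res[j] * X[i] for i in range(len(X))]
             for j in range(len(Z))]
    return D_new, W_new


def fit(patches, k, lam=0.1, lr=0.05, z_iters=20, epochs=5, seed=0):
    """Alternate between inferring Z and updating D (n x k) and W (k x n)."""
    rng = random.Random(seed)
    n = len(patches[0])
    D = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n)]
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(k)]
    for _ in range(epochs):
        for X in patches:
            Z = matvec(W, X)  # assumed starting point: encoder prediction
            for _ in range(z_iters):
                Z = z_step(X, D, W, Z, lam, lr)
            D, W = dw_step(X, D, W, Z, lr)
    return D, W


# Two tiny 3-pixel "patches" and k = 4 bases, purely for illustration.
D_fit, W_fit = fit([[0.2, 0.4, 0.6], [0.9, 0.1, 0.5]], k=4)
```

After training, a code for a new patch is obtained with `matvec(W_fit, X)` and a reconstruction with `matvec(D_fit, Z)`, mirroring Z = WX and X = DZ in the text.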

