Analysis of Trained CNN (Receptive Field & Weights of Network)

1 Analysis of Trained CNN (Receptive Field & Weights of Network)
Bukweon Kim

2 Basic Example: MNIST Data
The MNIST dataset consists of 28 × 28 images of handwritten digits, each labeled with the digit it depicts. Let us observe the structural characteristics of a simple CNN trained on this data.

3 Example CNN structure
Input image (28 × 28) → first convolution (5 × 5 × 4 weights) → 24 × 24 × 4, ReLU → 2 × 2 max pooling → 12 × 12 × 4 → second convolution (5 × 5 × 4 × 8 weights) → 8 × 8 × 8, ReLU → pooling → 4 × 4 × 8 → fully connected layer (4 × 4 × 8 × 16 weights) → 16 × 1, ReLU → classification dictionary (16 × 10 weights) → softmax over the 10 digit classes.
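The slide's size arithmetic can be checked with a short sketch (the kernel sizes and strides are inferred from the 28 → 24 → 12 → 8 → 4 progression, not stated explicitly on the slide):

```python
# Sketch of the example CNN's spatial sizes; kernel/stride values inferred.

def conv_out(size, kernel):
    """'Valid' convolution with stride 1: output shrinks by kernel - 1."""
    return size - kernel + 1

def pool_out(size, kernel=2, stride=2):
    """Non-overlapping 2x2 max pooling halves the spatial size."""
    return (size - kernel) // stride + 1

size = 28                  # input image
size = conv_out(size, 5)   # first 5x5 convolution -> 24
size = pool_out(size)      # 2x2 max pool, stride 2 -> 12
size = conv_out(size, 5)   # second 5x5 convolution -> 8
size = pool_out(size)      # 2x2 max pool, stride 2 -> 4
print(size)                # 4: the 4x4 maps fed to the fully connected layer
```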

4 Receptive Field
The receptive field of a value is the region of the input from which that value is determined. In the slide's figure, the green box is determined by the red boxes, and the green box in turn affects the cyan box's values; any value outside the red boxes does not affect the value of the green box.
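The growth of the receptive field through the layers can be computed with the standard recursion (the field grows by (kernel − 1) × jump at each layer, and the jump multiplies by the stride); the kernel/stride list below mirrors the example CNN and is my assumption:

```python
# Receptive field of one value after each layer of the example CNN.
layers = [(5, 1), (2, 2), (5, 1), (2, 2)]  # (kernel, stride) per layer

rf, jump = 1, 1
for kernel, stride in layers:
    rf += (kernel - 1) * jump  # each layer widens the field by (k-1)*jump
    jump *= stride             # strides compound the step size in the input
print(rf)  # 16: a value after the second pooling sees a 16x16 input patch
```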

5 Classification
The 16 features extracted by the previous steps of the CNN (the signal) are compared, by inner product, against a library of 16 features for each of the digits 0–9; softmax is then applied to the inner-product scores. A digit whose library features are similar to the signal yields a high inner product. In this example the result strongly suggests the input is 0, 4, 5, 6, or 8.
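The inner-product-plus-softmax step can be sketched as follows; the 3-dimensional vectors here are made-up toys standing in for the actual 16-feature signal and library:

```python
import math

def softmax(scores):
    e = [math.exp(s - max(scores)) for s in scores]  # shift for stability
    total = sum(e)
    return [x / total for x in e]

signal = [1.0, 0.2, 0.8]               # toy stand-in for the 16 features
library = {0: [1.0, 0.0, 1.0],         # similar to signal -> high score
           1: [0.0, 1.0, 0.0]}         # dissimilar -> low score

scores = [sum(s * w for s, w in zip(signal, library[d])) for d in (0, 1)]
probs = softmax(scores)                # probs[0] dominates probs[1]
```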

6 Outputs of several 6s
We will focus on the 13th signal and analyze what it means. (Shown: the library of 16 features for 6, and the outputs for several 6s.)

7 6-like image examples for explanation
We will focus on the 13th signal and analyze what it means. (Shown: the library of 16 features for 6, and four examples: the 14th MNIST image, which the network confuses between 6 and 0; two images classified as 6; and one classified as 1.)

8 What is the meaning of this 13th feature?
The 13th signal is the ReLU of the inner product between the output of the previous pooling layer and the weights for the 13th signal of the next layer (in the slide's example, contributions such as 0.2, 0.1, 0.4, … sum to 1.7). For convenience of understanding, I will focus on the strongest signal.
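The computation of a single feature value can be sketched like this; the pooled outputs and weights are illustrative numbers chosen so that the sum matches the slide's 1.7, not the network's actual values:

```python
# One feature value = ReLU(inner product of pooled outputs with weights).

def relu(x):
    return max(0.0, x)

pooled = [0.5, 1.0, 2.0, 0.0]    # hypothetical outputs of the pooling layer
weights = [0.4, 0.1, 0.7, -0.3]  # hypothetical weights for the 13th signal

# contributions: 0.2 + 0.1 + 1.4 + 0.0 -> approximately 1.7 after ReLU
signal = relu(sum(p * w for p, w in zip(pooled, weights)))
```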

9 Weight and signal analysis: weight value
(Figure: the receptive field of each value, the positive and negative weight entries, and the main signal given from the previous layer.) The inner product of the signal with the weights is 0.7; for the same image moved 8 pixels down, the inner product becomes −0.7.

10 Weight and signal analysis: pooling
The 2 × 2 max pooling with stride 2 makes the signal somewhat invariant to local translation: even though we moved the image by 2 pixels, the signal at the selected pixel did not change.
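The local translation invariance from max pooling can be demonstrated in 1-D with toy values (a 2-pixel input shift becomes a 1-pixel shift after the earlier stride-2 pooling, and a 1-pixel shift often leaves the pooled output unchanged when each window's maximum stays inside its window):

```python
# 1-D max pooling, kernel 2, stride 2, on toy values.

def max_pool_1d(x, kernel=2, stride=2):
    return [max(x[i:i + kernel]) for i in range(0, len(x) - kernel + 1, stride)]

values = [0, 3, 0, 0, 0, 1, 0, 0]
shifted = values[1:] + [0]        # everything moved one pixel to the left

print(max_pool_1d(values))   # [3, 0, 1, 0]
print(max_pool_1d(shifted))  # [3, 0, 1, 0] -- same pooled signal
```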

11 Weight and signal analysis: weight value, deeper understanding 1
Increasing an input value that corresponds to a positive weight enhances the signal; increasing an input value that corresponds to a negative weight suppresses it. Changing an input value whose weight is near 0 does not affect the signal much.
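The sign argument can be checked numerically; the three weights below are toy values, not the slide's actual weight map:

```python
# Effect of bumping one input on the signal, by weight sign.

def signal_of(inputs, weights):
    return sum(i * w for i, w in zip(inputs, weights))

weights = [0.8, -0.6, 0.01]              # positive, negative, near-zero
base = signal_of([1.0, 1.0, 1.0], weights)

up_pos  = signal_of([2.0, 1.0, 1.0], weights)  # bump input under +0.8
up_neg  = signal_of([1.0, 2.0, 1.0], weights)  # bump input under -0.6
up_zero = signal_of([1.0, 1.0, 2.0], weights)  # bump input under 0.01
# up_pos > base (enhanced), up_neg < base (suppressed), up_zero ~ base
```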

12 Weight and signal analysis: pooling, deeper understanding
After max pooling, the strongest signal is the output of one of these inner products: the pooling selects the maximum among them.

13 Weight and signal analysis: weight value, deeper understanding 2
Combining the weights of the first convolution layer (through convolution and pooling) with the 8th weight of the second convolutional layer (convolution, then ReLU) yields a map of the pattern determined by the previous filters: a map of the signal for combinations of any of the 4 first-layer patterns we looked for. The outputs of the first pooling layer are the inputs to this step.
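The "patterns of patterns" idea can be illustrated under a linearity assumption (ignoring ReLU and pooling): two stacked convolutions act like one convolution with a composed filter, so a deeper filter describes a pattern built from the previous layer's patterns. Toy 1-D integer filters; true (flipped) convolution is used because it is associative:

```python
# Two stacked linear convolutions equal one convolution with a composed filter.

def conv_full(x, h):
    """Full 1-D convolution of x with h."""
    y = [0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for m, hm in enumerate(h):
            y[n + m] += xn * hm
    return y

def conv_valid(x, h):
    """Keep only positions where h fully overlaps x."""
    return conv_full(x, h)[len(h) - 1 : len(x)]

x = [1, 2, 3, 4, 5]   # toy input signal
a = [1, 1]            # first-layer filter (averaging-like pattern)
b = [1, -1]           # second-layer filter (edge-like pattern)

two_steps = conv_valid(conv_valid(x, a), b)
composed  = conv_valid(x, conv_full(a, b))   # single equivalent filter
print(two_steps == composed)  # True
```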

14 Weight and signal analysis: weight value, deeper understanding 3
Each map looks for a pattern somewhat similar to the ones shown (not exactly, because the network is not linear). The final output may be considered a value derived by taking many combinations of patterns into account, and the accounted patterns may not only enhance the value but also suppress it: ReLU(enhancing contributions + suppressing contributions) is the input for the fully connected layer.

15 Why was the CNN fooled / not fooled by the examples?
The strong 13th signal usually tells whether the input is a 6, because of the pattern it looks for. For the 14th MNIST image (confused between 6 and 0) and the two images classified as 6, that pattern is present; in the remaining example, the existence of a /-shaped pattern in the middle fooled the network into classifying the image as 1, while the 6-like evidence was ignored.

16 Conclusion
A CNN with ReLU looks for combinations of patterns as it gets deeper. The pooling layers tell the CNN to look for locally translation-invariant features. Deeper layers allow the network to look for more complex combinations of patterns, and also allow wider invariance for local patterns. Knowing exactly what the CNN looks for gives us a deeper understanding of how it works and of what it can or cannot do.

17 Semantic Segmentation Using Image Classification
Pixelwise classification: for every pixel, extract a patch centered at that pixel, classify the patch with the classification CNN, and assign the predicted class (amniotic fluid, umbilical vein, stomach bubble, shadowing artifact, bone, or other white region) to the pixel; repeat for every pixel.
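The patch-wise classification loop can be sketched as follows; `classify_patch` is a hypothetical stand-in for the classification CNN (here a simple brightness threshold on a toy image), and the border handling by clamping is my assumption:

```python
# Patch-wise ("sliding window") segmentation: classify a patch per pixel.

def classify_patch(patch):
    # Hypothetical stand-in classifier: mean brightness above 0.5 -> class 1.
    flat = [v for row in patch for v in row]
    return 1 if sum(flat) / len(flat) > 0.5 else 0

def segment(image, patch_radius=1):
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Crop a patch centered at (y, x), clamping at the borders.
            patch = [[image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dx in range(-patch_radius, patch_radius + 1)]
                     for dy in range(-patch_radius, patch_radius + 1)]
            labels[y][x] = classify_patch(patch)
    return labels

image = [[0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0]]
labels = segment(image)  # per-pixel class map, same size as the image
```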

18 Comparison of segmentation results with and without spine position
With some changes to the CNN structure, we could feed the spine-position information into the CNN at the places where we wanted it to be applied.

