Chapter 11 Practical Methodology

1 Chapter 11 Practical Methodology
11.1 Performance Metrics
11.2 Default Baseline Models
11.3 Determining Whether to Gather More Data
11.4 Selecting Hyperparameters (Manual Hyperparameter Tuning)
School of Electronic, Electrical, and Computer Engineering, 공민영

2 11.1 Performance Metrics Determine your goals. You will be limited by having a finite amount of training data, and data collection can require time, money, or even human suffering. How can one determine a reasonable level of performance to expect? Decide on a desired error rate in advance; your design decisions will then be guided by reaching this error rate.

3 11.1 Performance Metrics The accuracy, or equivalently the error rate, of a system is the most common metric, but many applications require more advanced metrics. Ex) A spam detection system can make two kinds of mistakes: incorrectly classifying a legitimate message as spam, and incorrectly allowing a spam message to appear in the inbox.

4 11.1 Performance Metrics Sometimes we wish to train a binary classifier that is intended to detect some rare event, such as a rare disease. If the disease occurs in only one in a million people, we can easily achieve 99.9999% accuracy on the detection task by simply hard-coding the classifier to always report that the disease is absent. Clearly, accuracy is a poor way to characterize the performance of such a system.

5 11.1 Performance Metrics Precision and recall. Precision is the fraction of detections reported by the model that were correct; recall is the fraction of true events that were detected.
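To make the definitions concrete, here is a minimal NumPy sketch; the labels and predictions are toy values invented for illustration:

```python
import numpy as np

# Toy ground truth and predictions for a rare-event detector (1 = event).
y_true = np.array([0, 0, 0, 1, 0, 1, 0, 0, 1, 0])
y_pred = np.array([0, 1, 0, 1, 0, 0, 0, 0, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))  # correct detections
fp = np.sum((y_pred == 1) & (y_true == 0))  # false alarms
fn = np.sum((y_pred == 0) & (y_true == 1))  # missed events

precision = tp / (tp + fp)  # fraction of reported detections that were correct
recall = tp / (tp + fn)     # fraction of true events that were detected
print(f"precision={precision:.2f}, recall={recall:.2f}")  # 0.67, 0.67
```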

6 11.1 Performance Metrics PR curve (precision-recall curve): precision on the y-axis and recall on the x-axis. Ex) A detector for a disease outputs a score estimating the probability that a person has the disease. We choose to report a detection whenever this score exceeds some threshold; by varying the threshold, we can trade precision for recall. The two can be summarized by the F-score: with precision p and recall r, F = 2pr / (p + r).
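The threshold sweep can be sketched in a few lines; the scores and labels below are hypothetical, not from any real medical system:

```python
import numpy as np

# Hypothetical detector scores (estimated probability of disease) and labels.
scores = np.array([0.95, 0.80, 0.70, 0.55, 0.40, 0.30, 0.20, 0.10])
y_true = np.array([1,    1,    0,    1,    0,    1,    0,    0])

for threshold in (0.25, 0.50, 0.75):
    y_pred = (scores >= threshold).astype(int)   # detect when score exceeds threshold
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0    # F-score
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}  F={f:.2f}")
```

Raising the threshold increases precision at the expense of recall, tracing out the PR curve.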

7 11.2 Default Baseline Models
First, establish a reasonable end-to-end system.
- Supervised learning with fixed-size input vectors -> feedforward network.
- Input with known topological structure (e.g., images) -> convolutional network.
- Batch normalization can have a dramatic effect on optimization performance.
- Early stopping should be used almost universally.
- Dropout is an excellent regularizer that is easy to implement and compatible with many models and training algorithms.
- Batch normalization also sometimes reduces generalization error and allows dropout to be omitted, due to the noise in the estimate of the statistics used to normalize each variable.

8 11.2 Default Baseline Models
It is common to use the features from a convolutional network trained on ImageNet to solve other computer vision tasks. The value of unsupervised learning is somewhat domain specific: natural language processing is known to benefit tremendously from unsupervised learning, while in other domains, such as computer vision, current unsupervised learning techniques do not bring a benefit. If your application is in a context where unsupervised learning is known to be important, include it in your first end-to-end baseline. Otherwise, you can always try adding unsupervised learning later if you observe that your initial baseline overfits.
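A hedged sketch of reusing ImageNet features, assuming torchvision (version 0.13 or later for the weights API); the 5-class head is an invented example task:

```python
import torch
import torchvision.models as models

# Load a ResNet-18 pretrained on ImageNet and freeze its features.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in backbone.parameters():
    param.requires_grad = False

# Replace the final classification layer for the new task.
backbone.fc = torch.nn.Linear(backbone.fc.in_features, 5)

x = torch.randn(2, 3, 224, 224)  # dummy ImageNet-sized batch
print(backbone(x).shape)         # torch.Size([2, 5])
```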

9 11.3 Determining Whether to Gather More Data
(Reconstructed decision flowchart, part 1)
Measure the performance on the training set.
-> Not acceptable? Do not gather more data; increase the model (more capacity, better optimization) and measure again.
-> Acceptable? Measure the performance on the test set (next slide).

10 11.3 Determining Whether to Gather More Data
(Reconstructed decision flowchart, part 2)
Measure the performance on the test set.
-> Not acceptable? Gather more data and measure again.
-> Acceptable? Done.
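The two flowchart slides can be condensed into a few lines of Python; the function below is only a sketch of the decision logic, with the error values as placeholders you would measure on your own task:

```python
def next_step(train_error: float, test_error: float, target: float) -> str:
    """Decision flow: check training performance before gathering data."""
    if train_error > target:
        # The model underfits; more data will not help.
        return "do not gather more data; increase capacity or improve optimization"
    if test_error > target:
        # Training is fine but generalization is not.
        return "gather more data (or regularize) and measure again"
    return "done"

print(next_step(train_error=0.12, test_error=0.15, target=0.05))
```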

11 11.3 Determining Whether to Gather More Data
When deciding whether to gather more data, consider:
- the cost and feasibility of gathering more data,
- the cost and feasibility of reducing the test error by other means,
- the amount of data expected to be necessary to improve test set performance.
In practice, when test performance falls short of training performance, the answer is almost always to gather more training data.

12 11.3 Determining Whether to Gather More Data
To decide how much data to gather, it is recommended to experiment with training set sizes on a logarithmic scale, for example by doubling the number of examples between consecutive experiments.
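A sketch of such an experiment, using scikit-learn and synthetic data purely for illustration (neither is implied by the chapter):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(16384, 20))             # synthetic features
y = (X[:, :5].sum(axis=1) > 0).astype(int)   # synthetic labels
X_test, y_test = X[-4096:], y[-4096:]        # held-out test set

# Double the training set size between consecutive experiments.
for n in (256, 512, 1024, 2048, 4096):
    model = LogisticRegression(max_iter=1000).fit(X[:n], y[:n])
    err = 1.0 - accuracy_score(y_test, model.predict(X_test))
    print(f"n={n:5d}  test error={err:.3f}")
```

Plotting test error against n on a log scale then shows how much additional data is likely to help.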

13 11.4 Selecting Hyperparameters
Manual Hyperparameter Tuning. Hyperparameters control the effective capacity of the model, which is determined by three factors:
- the representational capacity of the model,
- the ability of the learning algorithm to successfully minimize the cost function used to train the model,
- the degree to which the cost function and training procedure regularize the model.

14 11.4 Selecting Hyperparameters
Manual Hyperparameter Tuning

15 11.4 Selecting Hyperparameters
Manual Hyperparameter Tuning. The learning rate is perhaps the most important hyperparameter. If you have time to tune only one hyperparameter, tune the learning rate: it controls the effective capacity of the model in a more complicated way than other hyperparameters.
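A toy sweep over a logarithmic grid of learning rates, assuming PyTorch and a made-up regression problem (everything here is illustrative):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 10)
y = X @ torch.randn(10, 1) + 0.1 * torch.randn(512, 1)

for lr in (1e-4, 1e-3, 1e-2, 1e-1, 1.0):
    model = nn.Linear(10, 1)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(200):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(X), y)
        loss.backward()
        opt.step()
    print(f"lr={lr:.0e}  final training loss={loss.item():.4f}")
```

Too small a learning rate converges slowly; too large a rate makes training unstable or divergent, giving the characteristic U-shaped curve of training error versus learning rate.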

16 11.4 Selecting Hyperparameters
Manual Hyperparameter Tuning

17 11.4 Selecting Hyperparameters
Manual Hyperparameter Tuning.
- If training error > target error: increase capacity, for example with more layers or more hidden units (at increased computational cost).
- If test error > target error: note that test error = training error + the gap between train and test error. When training error is very low, as for the best-performing neural networks, test error ≈ the gap, so the focus shifts to reducing that gap. Usually the best performance comes from a large model that is regularized well, for example by using dropout.

18 11.4 Selecting Hyperparameters

