Machine Learning Project


1 Machine Learning Project
PM2.5 Prediction
Disclaimer: this slide deck is adapted from material by Dr. Hung-yi Lee.

2 Outline
Project introduction: train/test data
Objective of this project


4 Task - Predict PM2.5
Real-world data, already downloaded.
To do: use linear regression to predict the PM2.5 value.

5 Data Introduction
The original data were split into a train set and a test set. The train set is the first 20 days of each month; the test set is drawn from the remaining days.
train.csv: the first 20 days of each month in one year (240 days in total).
test_X.csv: from the remaining 125 days, every 10 consecutive hours are extracted as one data batch. The features observed in the first 9 hours are used to predict the PM2.5 value at the 10th hour. A total of 240 non-overlapping batches were extracted as test data.
To do: predict the PM2.5 value for the 240 batches in the test set.

6 Training Data

7 Testing Data

8 Submission format
Predict the PM2.5 value for the 240 data batches in the test set.
Format: CSV. The first line must be: id,value. Every following row contains an id and the predicted PM2.5 value.
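A minimal sketch of producing this file with Python's standard csv module; the id strings (id_0, id_1, ...) and the sample values are assumptions for illustration, not the official id format:

```python
import csv

# Hypothetical predictions for the test batches (240 values in practice).
predictions = [5.2, 13.9, 20.1]

with open('submission.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['id', 'value'])          # required header line
    for i, value in enumerate(predictions):
        writer.writerow(['id_%d' % i, value])  # one row per test batch
```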

9 Instruction
Python. Implement linear regression with Adagrad gradient descent. You may NOT use a linear regression function from any Python package, but numpy, scipy, and pandas are allowed (the standard library is allowed; numpy.linalg.lstsq is NOT allowed!). Use pandas 0.20.

10 Three Steps for Machine Learning
Step 1. Define your function set (model). Implement models for two different feature sets:
a. All features from the first 9 hours (plus a bias term)
b. Only the PM2.5 feature from the first 9 hours (plus a bias term)
Note: a. Set NR to 0; leave the remaining features unchanged. b. Use gradient descent. c. Use MSE as your loss.

11 [Diagram: data split. The first 20 days of each month form the training data; the remaining ~10 days of each month form the test data.]


13 Project Implementation Instruction

14 Outline
Simple linear regression using gradient descent (with Adagrad):
1. How to extract features
2. Implement linear regression
3. Apply the model from step 2 to predict PM2.5

15 How to extract feature (train.csv)
[Figure: train.csv layout. Each date (2014/1/1, 2014/1/2, 2014/1/3, ...) contributes a block of 18 feature rows with 24 hourly columns.]

16 How to extract feature (train.csv)
(Pseudo code)
1. Make a container of 18 feature lists (Data)
2. for the i-th row in the training data:
3.     Data[i % 18].append(every element in the i-th row)
4.     (replace NR in RAINFALL with 0)
Data then holds 18 continuous feature sequences covering 2014/1/1, 2014/1/2, 2014/1/3, ...
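The pseudo code above can be sketched in real Python. The function name and the tiny synthetic rows below are illustrative assumptions (the real data would come from train.csv with 18 features and 24 hourly columns):

```python
import numpy as np

def build_feature_matrix(rows, n_features=18):
    """Group raw rows into per-feature sequences.

    rows: list of hourly-value lists; row i holds feature i % n_features.
    'NR' (no-rain) entries are replaced with 0, as the slides require.
    Returns an array of shape (n_features, total_hours).
    """
    data = [[] for _ in range(n_features)]
    for i, row in enumerate(rows):
        cleaned = [0.0 if v == 'NR' else float(v) for v in row]
        data[i % n_features].extend(cleaned)
    return np.array(data)

# Tiny synthetic example: 2 days x 3 features, 4 hours per day.
rows = [
    [1, 2, 3, 4],       # day 1, feature 0
    ['NR', 0, 1, 0],    # day 1, feature 1 (RAINFALL with an NR entry)
    [5, 6, 7, 8],       # day 1, feature 2
    [9, 10, 11, 12],    # day 2, feature 0
    [0, 0, 0, 0],       # day 2, feature 1
    [13, 14, 15, 16],   # day 2, feature 2
]
data = build_feature_matrix(rows, n_features=3)
print(data.shape)  # (3, 8): each feature is one continuous 8-hour series
```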

17 How to extract feature (train.csv)
[Figure: after grouping, each month is an 18 x 480 block (18 features x 20 days x 24 hours): January data, February data, March data, ... Every consecutive 10 hours form one data batch.]

18 How to extract feature (train.csv)
(Pseudo code)
1. Make train_x to store the first 9 hours' data and train_y the PM2.5 value at the 10th hour
2. for each month (1st, 2nd, ...):
3.     for every consecutive 10-hour window in that month:
4.         train_x.append(first 9 hours' data)
5.         train_y.append(10th hour's PM2.5 value)
6. add a bias term to train_x
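A sketch of the sliding-window step for one month's block; the function name and the random synthetic month are assumptions for illustration (PM2.5 is taken to be feature row 9, which may differ in the real file):

```python
import numpy as np

def make_training_pairs(month_data, pm25_row=9, window=9):
    """Slide a 10-hour window over one month's (18, 480) block.

    The first `window` hours of all features, flattened, form one sample;
    the PM2.5 value at the following hour is its target.
    A bias term of 1 is prepended to every sample.
    """
    n_features, n_hours = month_data.shape
    xs, ys = [], []
    for start in range(n_hours - window):            # 480 - 9 = 471 windows
        x = month_data[:, start:start + window].flatten()
        xs.append(np.concatenate(([1.0], x)))        # prepend the bias term
        ys.append(month_data[pm25_row, start + window])
    return np.array(xs), np.array(ys)

# Synthetic month: 18 features x 480 hours.
rng = np.random.default_rng(0)
month = rng.random((18, 480))
train_x, train_y = make_training_pairs(month)
print(train_x.shape, train_y.shape)  # (471, 163) (471,)  [163 = 18*9 + bias]
```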

19 Implement linear regression - gradient descent
(Pseudo code)
1. Make a weight vector; initialize the learning rate and the number of iterations
2. for the i-th iteration:
3.     y' = np.dot(train_x, weight vector)
4.     L = y' - train_y
5.     gra = 2 * np.dot(train_x.T, L)
6.     weight vector -= learning rate * gra
Note: y' is a vector of predictions and train_x is a matrix with one row per sample.
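A runnable sketch of plain batch gradient descent on the squared-error loss; the function name and the toy y = 3 + 2x sanity check are illustrative, not part of the assignment data:

```python
import numpy as np

def gradient_descent(train_x, train_y, lr=1e-6, n_iter=1000):
    """Plain batch gradient descent on the sum of squared errors."""
    w = np.zeros(train_x.shape[1])
    for _ in range(n_iter):
        y_hat = train_x @ w          # predictions, one per sample
        err = y_hat - train_y        # residual vector
        grad = 2 * train_x.T @ err   # gradient of the squared-error loss
        w -= lr * grad
    return w

# Sanity check on a known linear relation: y = 3 + 2x.
x = np.linspace(0, 1, 50)
X = np.column_stack([np.ones_like(x), x])  # bias column + one feature
y = 3 + 2 * x
w = gradient_descent(X, y, lr=0.01, n_iter=5000)
print(np.round(w, 2))  # close to [3., 2.]
```

With the real (471 x 163) training matrix, a fixed learning rate like this needs careful tuning, which is exactly what Adagrad on the next slides addresses.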

20 Implement linear regression - gradient descent
2. for the i-th iteration:
3.     y' = np.dot(train_x, weight vector)
4.     L = y' - train_y        (an n-dim residual vector)
5.     gra = 2 * np.dot(train_x.T, L)        (a p-dim vector, like the weights)
6.     weight vector -= learning rate * gra

21 Adagrad

22 Implement linear regression - Adagrad gradient descent
(Pseudo code)
1. Make a weight vector; initialize the learning rate and the number of iterations;
   make prev_gra to accumulate the squared gradients over iterations
2. for the i-th iteration:
3.     y' = np.dot(train_x, weight vector)
4.     L = y' - train_y
5.     gra = 2 * np.dot(train_x.T, L)
6.     prev_gra += gra**2
7.     ada = np.sqrt(prev_gra)
8.     weight vector -= learning rate * gra / ada
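The Adagrad variant as a runnable sketch; the function name, the eps guard against division by zero, and the toy y = 3 + 2x check are illustrative assumptions:

```python
import numpy as np

def adagrad(train_x, train_y, lr=1.0, n_iter=5000, eps=1e-8):
    """Gradient descent with Adagrad: per-weight step sizes that
    shrink as squared gradients accumulate."""
    w = np.zeros(train_x.shape[1])
    grad_sq_sum = np.zeros_like(w)        # running sum of squared gradients
    for _ in range(n_iter):
        err = train_x @ w - train_y
        grad = 2 * train_x.T @ err
        grad_sq_sum += grad ** 2
        w -= lr * grad / (np.sqrt(grad_sq_sum) + eps)
    return w

# Same sanity check as before: recover y = 3 + 2x.
x = np.linspace(0, 1, 50)
X = np.column_stack([np.ones_like(x), x])
y = 3 + 2 * x
w = adagrad(X, y, lr=1.0, n_iter=5000)
print(np.round(w, 2))
```

Note the practical advantage: the raw learning rate (1.0 here) needs far less tuning than in plain gradient descent, because each weight's step is rescaled by its own gradient history.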

23 Predict PM2.5
(Pseudo code)
1. read test_X.csv
2. for every 18 rows (one test batch):
3.     test_x.append([1] + the batch's 9 hours of data)   (prepend the bias term)
4. test_y = np.dot(test_x, weight vector)
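A sketch of the prediction step; the function name and the synthetic test matrix are assumptions, standing in for the parsed contents of test_X.csv:

```python
import numpy as np

def predict(test_matrix, w, n_features=18, window=9):
    """Turn the raw test matrix into samples and apply the weights.

    test_matrix: shape (n_batches * n_features, window); every n_features
    rows are one batch's first 9 hours of observations.
    """
    n_batches = test_matrix.shape[0] // n_features
    test_x = []
    for b in range(n_batches):
        block = test_matrix[b * n_features:(b + 1) * n_features]
        test_x.append(np.concatenate(([1.0], block.flatten())))  # bias first
    test_x = np.array(test_x)
    return test_x @ w   # one predicted PM2.5 value per batch

# Shape check with synthetic numbers: 2 batches, 18 features, 9 hours.
rng = np.random.default_rng(1)
fake_test = rng.random((2 * 18, 9))
w = np.zeros(18 * 9 + 1)                 # must match the training layout
print(predict(fake_test, w).shape)       # (2,)
```

The feature order inside each sample must match the order used when building train_x, or the learned weights will be applied to the wrong columns.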

24 STT 512, Fall 2017, Chapter 2, Dr. Yishi Wang

25 Prediction with matrix notation
Let $\hat{Y}_i$ be the estimated response for the $i$th subject with $x = X_i$. We then have $\hat{Y}_i = b_0 + b_1 X_i$, which can be rewritten as $\hat{Y}_i = (1, X_i)\begin{pmatrix} b_0 \\ b_1 \end{pmatrix}$. If we define $\hat{Y} = (\hat{Y}_1, \hat{Y}_2, \ldots, \hat{Y}_n)^T$, then
$$\hat{Y} = \begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix} \begin{pmatrix} b_0 \\ b_1 \end{pmatrix} = X \begin{pmatrix} b_0 \\ b_1 \end{pmatrix} = X (X^T X)^{-1} X^T Y.$$
Recall from section 1.7 that the residual is $e_i = Y_i - \hat{Y}_i$. We define $e = (e_1, e_2, \ldots, e_n)^T$. Then
$$e = Y - \hat{Y} = Y - X(X^T X)^{-1} X^T Y = \left(I - X(X^T X)^{-1} X^T\right) Y.$$
Exercise: show that $e^T \hat{Y} = 0$.
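The exercise follows from standard properties of the hat matrix $H = X(X^T X)^{-1} X^T$; a sketch of the argument:

```latex
% H = X(X^T X)^{-1} X^T, so \hat{Y} = HY and e = (I - H)Y.
% H is symmetric (H^T = H) and idempotent (H^2 = H), hence
e^T \hat{Y} = \big((I - H)Y\big)^T H Y
            = Y^T (I - H)^T H Y
            = Y^T (H - H^2) Y
            = Y^T (H - H) Y
            = 0.
```

Geometrically, $\hat{Y}$ is the projection of $Y$ onto the column space of $X$, and the residual $e$ is the component orthogonal to that space.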

