Artificial Intelligence for Data Mining in the Context of Enterprise Systems Thesis Presentation by Real Carbonneau.


1 Artificial Intelligence for Data Mining in the Context of Enterprise Systems Thesis Presentation by Real Carbonneau

2 Overview
Background
Research Question
Data Sources
Methodology
Implementation
Results
Conclusion

3 Background
Information distortion in the supply chain
Makes demand difficult for manufacturers to forecast

4 Current solutions
Exponential Smoothing
Moving Average
Trend
Etc.
Wide range of software forecasting solutions
The M3 Competition tested most forecasting methods and found that the simplest work best
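To make the classical baselines concrete, here is a minimal sketch of one-step-ahead moving-average and exponential-smoothing forecasts. It is written in Python rather than the MATLAB used in the thesis, and the demand series, window size, and smoothing constant are invented for illustration.

```python
# Minimal one-step-ahead forecasts for two classical baselines.
# The demand series, window size and smoothing constant are illustrative.

def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = series[-window:]
    return sum(recent) / len(recent)

def exponential_smoothing_forecast(series, alpha=0.3):
    """Single exponential smoothing: level = alpha*obs + (1-alpha)*level."""
    level = series[0]                       # initialize with the first observation
    for obs in series[1:]:
        level = alpha * obs + (1 - alpha) * level
    return level                            # the final smoothed level is the forecast

demand = [12.0, 15.0, 14.0, 16.0, 13.0, 15.0]
print(moving_average_forecast(demand, window=3))        # mean of the last 3 values
print(exponential_smoothing_forecast(demand, alpha=0.3))
```

Both methods produce a single number as the next-period forecast; their simplicity is exactly why the M3 Competition found them hard to beat.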

5 Artificial Intelligence
Universal approximators:
Artificial Neural Networks (ANN)
Recurrent Neural Networks (RNN)
Support Vector Machines (SVM)
Theoretically, these should be able to match or outperform any traditional forecasting approach.

6 Neural Networks
Learn by adjusting the weights of connections
Based on empirical risk minimization
Generalization can be improved by:
Cross-validation-based early stopping
Levenberg-Marquardt with Bayesian regularization
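The early-stopping idea can be sketched as follows. To keep the example self-contained, a deliberately tiny one-parameter model (y = w·x, fitted by gradient descent) stands in for the multilayer networks used in the thesis; only the stop-when-validation-error-stops-improving logic is the point. All data and hyperparameters are made up.

```python
# Sketch of validation-based early stopping with a deliberately tiny model
# (a single weight w in y = w*x); the thesis used full multilayer ANNs.

def train_with_early_stopping(train, val, lr=0.01, patience=5, max_epochs=1000):
    w = 0.0
    best_w, best_err, bad_epochs = w, float("inf"), 0
    for _ in range(max_epochs):
        # one gradient step on the training squared error
        grad = sum(2 * (w * x - y) * x for x, y in train) / len(train)
        w -= lr * grad
        # monitor error on the held-out validation set
        val_err = sum(abs(w * x - y) for x, y in val) / len(val)
        if val_err < best_err:
            best_w, best_err, bad_epochs = w, val_err, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:      # stop once validation stops improving
                break
    return best_w                           # return the best-validated weights

train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
val = [(4.0, 8.0)]
w = train_with_early_stopping(train, val)
```

Training keeps lowering the training error, but the weights that are actually returned are the ones that did best on data the optimizer never fit, which is what improves generalization.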

7 Support Vector Machine
Learns by separating the data in a different feature space using support vectors
The feature space can have a higher or lower dimensionality than the input space
Based on structural risk minimization
Optimality is guaranteed
A complexity constant controls the power of the machine

8 Support Vector Machine CV
10-fold cross-validation-based optimization of the complexity constant
More effective than NN tuning because of the guaranteed optimality

9 SVM Complexity Example
SVM complexity constant optimization based on 10-fold cross-validation
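The selection procedure on this slide can be sketched as a grid search over the complexity constant scored by 10-fold cross-validation. Since the thesis called mySVM from MATLAB, a closed-form one-dimensional ridge regressor (regularization strength 1/C) stands in for the SVM here; only the CV grid-search logic is meant to mirror the procedure, and the data and grid are invented.

```python
# Sketch of picking the complexity constant C by 10-fold cross-validation.
# A 1-D ridge regressor (penalty weight 1/C) stands in for the SVM call;
# only the CV grid-search structure mirrors the thesis procedure.

def fit(pairs, C):
    """Regularized least squares for y = w*x with penalty (1/C)*w^2."""
    sxy = sum(x * y for x, y in pairs)
    sxx = sum(x * x for x, y in pairs)
    return sxy / (sxx + 1.0 / C)            # larger C -> weaker regularization

def mae(pairs, w):
    return sum(abs(w * x - y) for x, y in pairs) / len(pairs)

def pick_C_by_cv(pairs, grid, folds=10):
    best_C, best_err = None, float("inf")
    for C in grid:
        fold_errs = []
        for k in range(folds):
            val = pairs[k::folds]           # every folds-th example held out
            train = [p for i, p in enumerate(pairs) if i % folds != k]
            fold_errs.append(mae(val, fit(train, C)))
        cv_err = sum(fold_errs) / len(fold_errs)
        if cv_err < best_err:               # keep the C with lowest CV error
            best_C, best_err = C, cv_err
    return best_C

data = [(float(x), 2.0 * x + (-1) ** x * 0.5) for x in range(1, 31)]
best = pick_C_by_cv(data, grid=[0.01, 0.1, 1.0, 10.0, 100.0])
```

The same wrapper works unchanged whatever learner sits inside `fit`, which is why the thesis could reuse it around the external mySVM solver.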

10 Research Question
For a manufacturer at the end of the supply chain who is subject to demand distortion:
H1: Are AI approaches better on average than traditional approaches (error)?
H2: Are AI approaches better than traditional approaches (rank)?
H3: Is the best AI approach better than the best traditional approach?

11 Data Sources
1. Chocolate manufacturer (ERP)
2. Toner cartridge manufacturer (ERP)
3. Statistics Canada Manufacturing Survey

12 Methodology
Experiment using the top 100 products from the 2 manufacturers and a random 100 series from Statistics Canada
Comparison based on an out-of-sample testing set
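The out-of-sample comparison can be sketched as a walk-forward evaluation: hold out the end of each series, forecast it one step at a time using only earlier data, and score with mean absolute error (MAE), the metric in the results tables. The naive last-value forecaster and the series here are illustrative, not from the thesis.

```python
# Sketch of the out-of-sample comparison: hold out the tail of a series,
# forecast it one step at a time, and score with mean absolute error (MAE).
# The naive forecaster and the data are illustrative.

def one_step_mae(series, n_test, forecaster):
    """Walk forward over the held-out tail, forecasting one step each time."""
    errors = []
    for i in range(len(series) - n_test, len(series)):
        history = series[:i]                # only data available before step i
        errors.append(abs(forecaster(history) - series[i]))
    return sum(errors) / len(errors)

naive = lambda history: history[-1]         # last observed value as the forecast

series = [10.0, 11.0, 10.5, 11.5, 11.0, 12.0]
print(one_step_mae(series, n_test=2, forecaster=naive))   # -> 0.75
```

Running every competing method through the same walk-forward harness is what makes the MAE ranks in the tables comparable across techniques.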

13 Implementation
Experiment programmed in MATLAB
Using existing toolboxes where possible (e.g., NN, ARMA)
Programming the missing ones
SVM implemented using mySVM, called from MATLAB

14 Experimental Groups
CONTROL GROUP (Traditional Techniques):
Moving Average
Trend
Exponential Smoothing
Theta Model (Assimakopoulos & Nikolopoulos 1999)
Auto-Regressive Moving Average (ARMA) (Box et al. 1994)
Multiple Linear Regression (Auto-Regressive)
TREATMENT GROUP (Artificial Intelligence Techniques):
Neural Networks
Recurrent Neural Networks
Support Vector Machines

15 Super Wide model
Time series are short
Very noisy because of supply chain distortion
The Super Wide model combines data from many products
Much larger amount of data to learn from
Assumes similar patterns occur across the group of products
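The pooling step can be sketched as follows: each short series is turned into (lag window → next value) training examples, and the examples from every product are concatenated into one large training set. The product series and window length are invented for illustration.

```python
# Sketch of the "Super Wide" idea: pool lagged windows from many product
# series into one training set, instead of fitting each short series alone.
# Series values and the window length are illustrative.

def to_supervised(series, window=3):
    """Turn one series into (last `window` values -> next value) examples."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]

def super_wide(all_series, window=3):
    """Concatenate the supervised examples from every product's series."""
    examples = []
    for series in all_series:
        examples.extend(to_supervised(series, window))
    return examples

products = {
    "A": [10, 12, 11, 13, 12, 14],
    "B": [5, 6, 5, 7, 6, 8],
}
pooled = super_wide(list(products.values()), window=3)
print(len(pooled))   # 3 examples per 6-point series, 6 in total
```

One model fitted on `pooled` sees far more examples than any single product could provide, which is the premise behind the Super Wide results.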

16 Results Table (Chocolate)
Rank  Ctrl./Treat.  MAE         Method            Type
1     Treatment     0.76928454  SVM CV_Window     SuperWide
2     Treatment     0.77169699  SVM CV            SuperWide
3     Control       0.77757298  MLR               SuperWide
4     Treatment     0.79976471  ANNBPCV           SuperWide
5     Control       0.82702030  ES Init
6     Control       0.83291872  ES20
7     Control       0.83474625  Theta ES Init
8     Control       0.83814324  MA6
9     Control       0.85340016  MA
10    Control       0.86132238  ES Avg
11    Control       0.87751655  Theta ES Average
12    Control       0.90467127  MLR
13    Treatment     0.92085160  ANNLMBR           SuperWide
14    Treatment     0.93065086  RNNLMBR
15    Treatment     0.93314457  ANNLMBR
16    Treatment     0.93353440  SVM CV
17    Treatment     0.94270139  SVM CV_Window
18    Treatment     0.98104892  ANNBPCV
19    Treatment     0.99538663  RNNBPCV
20    Control       1.01512843  ARMA
21    Control       1.60425383  TR
22    Control       8.19780648  TR6

17 Results Table (Toner)
Rank  Ctrl./Treat.  MAE         Method            Type
1     Treatment     0.67771156  SVM CV            SuperWide
2     Treatment     0.67810404  SVM CV_Window     SuperWide
3     Control       0.69281237  ES20
4     Control       0.69929521  MA6
5     Control       0.69944606  ES Init
6     Treatment     0.70027399  SVM CV_Window
7     Control       0.70535163  MA
8     Control       0.70595237  MLR               SuperWide
9     Treatment     0.72214623  SVM CV
10    Control       0.72443731  Theta ES Init
11    Control       0.72587771  ES Avg
12    Control       0.73581062  Theta ES Average
13    Control       0.76767181  MLR
14    Treatment     0.77807766  ANNLMBR           SuperWide
15    Treatment     0.80899048  RNNBPCV
16    Treatment     0.81869933  RNNLMBR
17    Treatment     0.81888839  ANNLMBR
18    Treatment     0.84984560  ANNBPCV
19    Treatment     0.88175390  ANNBPCV           SuperWide
20    Control       0.93190430  ARMA
21    Control       1.60584233  TR
22    Control       8.61395034  TR6

18 Results Table (StatsCan)
Rank  Ctrl./Treat.  MAE         Method            Type
1     Treatment     0.44781737  SVM CV_Window     SuperWide
2     Treatment     0.45470378  SVM CV            SuperWide
3     Control       0.49098436  MLR
4     Treatment     0.49144177  SVM CV_Window
5     Treatment     0.49320980  SVM CV
6     Control       0.50517910  Theta ES Init
7     Control       0.50547172  ES Init
8     Control       0.50858447  ES Average
9     Control       0.51080625  MA
10    Control       0.51374179  Theta ES Average
11    Control       0.53272253  MLR               SuperWide
12    Control       0.53542068  MA6
13    Treatment     0.53553823  RNNLMBR
14    Treatment     0.53742495  ANNLMBR
15    Control       0.54834604  ES20
16    Treatment     0.58718750  ANNBPCV           SuperWide
17    Treatment     0.64527015  ANNLMBR           SuperWide
18    Treatment     0.80597984  RNNBPCV
19    Treatment     0.82375877  ANNBPCV
20    Control       1.36616951  ARMA
21    Control       1.99561045  TR
22    Control       20.89770108 TR6

19 Results Discussion
AI provides a lower forecasting error on average (H1 = Yes)
However, this is only because of the extremely poor performance of trend-based forecasting
Traditional approaches ranked better than AI (H2 = No)
The extreme trend error has no impact on rank
SVM Super Wide performed better than the best traditional approach, ES (H3 = Yes)
However, exponential smoothing was the best traditional method, and no non-super-wide AI technique reliably performed better

20 Results: SVM Super Wide details
SVM Super Wide performed better than all others
The effect is isolated to the SVM / Super Wide combination only:
Other Super Wide models did not reliably perform better than ES
Other SVM models did not perform better than ES
Dimensionality augmentation/reduction (non-linearity) is important:
Super Wide SVM performed better than Super Wide MLR

21 Conclusion
When unsure, use Exponential Smoothing: it is the simplest and second best.
Super Wide SVM provides the best performance.
A cost-benefit analysis by the manufacturer should decide whether the extra effort is justified.
If this technique proves useful in practice, it should eventually be built into ERP systems, since it may not be feasible for an SME to build it on its own.

22 Implications
Useful for forecasting models that should include more information sources / more variables (economic indicators, product group performance, marketing campaigns) because:
Super Wide = more observations
SVM + CV = better generalization
This is not possible with short and noisy time series on their own.

