Self-organizing Learning Array based Value System — SOLAR-V Yinyin Liu EE690 Ohio University Spring 2005.


2 Outline
- What is Value System
- Basics of SOLAR
- Least Square Method
- Value Learning in SOLAR
- Application
- How to use in SOLAR-V Project

3 Value System
- In a value system, terms such as 'good', 'bad', 'better', and 'worse' are quantified.
- Reinforcement learning
- Data analysis
- Model interpreting and decision making
- Value prediction
- With the right value system, one will make good short-term decisions and have good long-term results.

4 Value System
- Reinforcement Learning (RL): a computational approach to learning
- An agent tries to maximize the total reward when interacting with a complex, uncertain environment
- Future expected reward: value
[Figure: trajectory of states, actions, and rewards: ... s_t, a_t, r_{t+1}, s_{t+1}, a_{t+1}, r_{t+2}, s_{t+2}, a_{t+2}, r_{t+3}, s_{t+3}, a_{t+3}, ...]
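The idea of "future expected reward" can be made concrete with a small sketch. The code below is only an illustration, not part of SOLAR-V: the function names and the discount factor `gamma` are assumptions (the slide speaks of the total reward, which corresponds to `gamma = 1.0`).

```python
def discounted_return(rewards, gamma=1.0):
    """Sum of future rewards r_{t+1} + gamma * r_{t+2} + gamma^2 * r_{t+3} + ...

    `rewards` lists the rewards received after time t, in order.
    `gamma` is an assumed discount factor; 1.0 gives the plain total reward.
    """
    total = 0.0
    # Accumulate from the last reward backwards so each step is one multiply-add.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def returns_along_trajectory(rewards, gamma=1.0):
    """Return the discounted return seen from every time step of one trajectory."""
    out = []
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
        out.append(total)
    out.reverse()
    return out
```

For example, `returns_along_trajectory([0, 0, 1], gamma=0.5)` gives the return from each step of a three-step trajectory whose only reward arrives at the end.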

5 Value System
- Value functions in RL
- Functions of state-action pair: how good it is to perform a given action in a given state
- Value functions can be estimated from experience
[Figure: trial with angle θ, running from the initial X until X is over its limit; labels r=1, 1, 0; data columns x1, x2, x3, x4, a, value]
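One common way to estimate a state-action value function from experience is to average the returns observed after each state-action pair. The sketch below assumes that approach (every-visit Monte Carlo averaging); the slides do not say which estimator SOLAR-V uses, and the episode format and names here are hypothetical.

```python
from collections import defaultdict


def estimate_action_values(episodes, gamma=1.0):
    """Average observed returns for each (state, action) pair.

    `episodes` is a list of episodes; each episode is a list of
    (state, action, reward) tuples, where `reward` is the reward received
    after taking `action` in `state`. `gamma` is an assumed discount factor.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for episode in episodes:
        g = 0.0
        # Walk backwards so g always holds the return from the current step on.
        for state, action, reward in reversed(episode):
            g = reward + gamma * g
            sums[(state, action)] += g
            counts[(state, action)] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

States and actions only need to be hashable here; for continuous states such as x1..x4, a function approximator (the role SOLAR plays in these slides) replaces the lookup table.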

6 Basics of SOLAR
- Training data: [figure: array of input samples with dimensions labeled M and N]
- SOLAR has N inputs and reads the information in parallel from M feature vectors.
- Prewiring procedure.
- SOLAR is a feed-forward structure.
- Interconnections and neuron operations are determined dynamically from the data during interaction with the environment.

7 Basics of SOLAR
- Deviation-based Selection (DBS) determines a proper operation and inputs for each neuron.
- Each neuron is a value estimator. The final value approximation is the global voting from all the neurons.

8 Least Square Method
- Least Square Method: a least-squares fit obtains the least sum of squared errors between the data and the approximation
- Function: linear combination of k basis functions
- W is a set of weights that needs to be found
- Projection onto the space spanned by the basis functions
- Easy to implement and debug: quantify the importance of each basis feature, engineer the features for better performance.
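As a rough illustration of the fit described above, the approximation can be written as a(x) = w_1 f_1(x) + ... + w_k f_k(x), with the weights chosen to minimize the sum of squared errors. The sketch below solves the normal equations in plain Python; the function names and the polynomial basis are assumptions for illustration, and a production implementation would normally use a numerical library instead.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares_fit(xs, ys, basis):
    """Weights W minimizing sum((y - sum_j W[j] * basis[j](x))**2)."""
    phi = [[f(x) for f in basis] for x in xs]  # design matrix
    k = len(basis)
    # Normal equations: (Phi^T Phi) W = Phi^T y
    gram = [[sum(row[i] * row[j] for row in phi) for j in range(k)]
            for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(phi, ys)) for i in range(k)]
    return solve_linear_system(gram, rhs)


def polynomial_basis(order):
    """Basis functions 1, x, x^2, ..., x^order (an assumed choice of basis)."""
    return [lambda x, p=p: x ** p for p in range(order + 1)]
```

Calling `least_squares_fit(xs, ys, polynomial_basis(1))` fits a straight line; the size of each returned weight indicates how much its basis function contributes.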

9 Least Square Method
- Signal-to-noise Ratio (SNR)-controlled LSF
- How many basis functions should be considered?
- Polynomials as basis functions: up to which order?
- Information may be corrupted by noise, so over-fitting should be avoided
- We need to determine when the difference between the approximated and the measured data has the characteristics of a noise signal.

10 Least Square Method
- SNR-controlled LSF
- Approximation error signal: e(x) = s(x) - a(x)
- Determine the signal-to-noise ratio of e(x) by using signal correlation
- The S/N of Gaussian noise is a random variable, and its statistics can be directly estimated
- If [condition given as an equation on the slide, not preserved], this means that, most likely, not all the information was extracted from the sampled data. In such a case, increase the basis function order by 1.
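The slide's exact stopping criterion is not recoverable from the transcript, so the following is only a sketch of one plausible correlation-based check, under stated assumptions: the residual e(x) is sampled on an ordered grid, white noise has nearly zero correlation between neighboring samples, and a lag-1 autocorrelation above roughly 2/sqrt(n) is taken as evidence that signal remains. The function names and the threshold are hypothetical.

```python
import math


def lag1_autocorrelation(e):
    """Correlation between neighboring samples of the residual e."""
    n = len(e)
    mean = sum(e) / n
    centered = [v - mean for v in e]
    c0 = sum(v * v for v in centered)
    if c0 == 0.0:
        return 0.0  # a constant residual carries no structure to detect
    c1 = sum(centered[i] * centered[i + 1] for i in range(n - 1))
    return c1 / c0


def residual_looks_like_noise(e, threshold=None):
    """True if neighboring residual samples are not noticeably correlated.

    The default threshold 2 / sqrt(n) is an assumed rule of thumb for
    white noise, not the criterion used on the slides.
    """
    if threshold is None:
        threshold = 2.0 / math.sqrt(len(e))
    return lag1_autocorrelation(e) <= threshold
```

In an order-selection loop, one would fit with the current order, compute e(x) = s(x) - a(x), and increase the order by 1 while `residual_looks_like_noise(e)` is false.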

11 Least Square Method
- SNR-controlled LSF

12 Least Square Method
- Weighted LSF
- Knowledge accumulated through the learning process should relate more to recent information
- Recent data weigh more in the learning
- Apply weights to the data: exponentially declining, going back from the most recent data
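A minimal sketch of exponentially declining weights, applied here to a weighted straight-line fit so that the closed form stays short. The decay rate `decay`, the function names, and the choice of a line as the model are assumptions for illustration; the slides do not give the actual weighting constant.

```python
def exponential_weights(n, decay=0.9):
    """Weight 1 for the most recent sample, decay**k for a sample k steps older.

    Samples are assumed to be ordered oldest first.
    """
    return [decay ** (n - 1 - i) for i in range(n)]


def weighted_line_fit(xs, ys, weights):
    """Intercept and slope minimizing sum(w * (y - intercept - slope * x)**2)."""
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw  # weighted mean of x
    my = sum(w * y for w, y in zip(weights, ys)) / sw  # weighted mean of y
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope
```

With a smaller `decay`, old samples fade faster and the fit tracks recent data more closely; `decay = 1.0` reduces to the ordinary unweighted fit.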

13 Least Square Method
- Weighted LSF

14 Least Square Method
- Weighted LSF
- The error signal has to be weighted as well.
- The SNR of Gaussian noise
- Comparison

15 Value Learning
- Training data:
- Neuron
[Figure: initial wiring; trial running from the initial X until X is over its limit; labels r=1, 1, 0; data columns x1, x2, x3, x4, a, value]

16 Value Learning
- Non-linear Scaling:

17 Value Learning
- Deviation-based Selection in neurons
- Inputs: 1 or 2 inputs
- Operation: ident, half, exp, log, add, sub
[Figure: self-organized structure]

18 Value Learning
- Value information is stored in a distributed manner across the neurons.
- Value approximation collects the approximation results from all the neurons.

19 Value Learning
- Information fusion: use information from many sources.
- Value approximation: global voting
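The slides do not spell out how the votes are combined. As a sketch only, one simple fusion rule is a weighted average in which each neuron's estimate counts in inverse proportion to its error variance. That rule, the function name, and the inputs below are all assumptions, not the documented SOLAR-V voting scheme.

```python
def fuse_estimates(estimates, error_variances):
    """Combine per-neuron value estimates into a single value.

    Each estimate is weighted by 1 / variance (an assumed rule), so
    neurons with smaller approximation error have a larger say.
    """
    if len(estimates) != len(error_variances):
        raise ValueError("one error variance is needed per estimate")
    weights = [1.0 / v for v in error_variances]
    total = sum(weights)
    return sum(w * e for w, e in zip(weights, estimates)) / total
```

With equal variances this reduces to a plain average of the neuron outputs.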

20 Value Learning
- Learning performance

21 Application
- On-line Learning control by Reinforcement and Association
[Figure: block diagram with Action Network, Value Network, and System; signals X(t), u(t), J(t), X(t+1)]

22 Application
- Financial data analysis
- Data: 103 features from 52 companies.
- Value: 52 gain values given by the one-year gain on investment.
- Prediction: predict future gain based on current features.

23 SOLAR-V Project
Prepare the data
- Data samples along the row: N samples
- Features along the column: M features
- Given values in a row vector: N values
- Save "features" and "values" in a training MAT file
- Save "features_test" and "values_test" in a testing MAT file
- M x N matrix: features; 1 x N vector: values
How to call the function
- Run "solar_v_main.m"
- Input the MAT file name and the number of layers in the command window.
- Indicate whether you think any feature is more significant and how many times you would like it repeated.
- In the function, data will be scaled to 0~255; values are kept unchanged
- The function will determine how many neurons per layer; you can decide how many layers
Several figures will be generated
- Prewired network structure
- Self-organized network structure
- Learning performance compared with Neural Network
- Testing results from SOLAR-V compared with Neural Network
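The data layout and the 0~255 scaling can be illustrated with a short sketch. This is Python used purely for illustration (the project itself runs as a MATLAB script), and it assumes linear min-max scaling applied to each feature row separately; the slides do not state which scaling `solar_v_main.m` actually applies, and both function names are hypothetical.

```python
def check_layout(features, values):
    """Check the M x N features / 1 x N values layout and return (M, N)."""
    n = len(values)
    if any(len(row) != n for row in features):
        raise ValueError("every feature row must hold one entry per sample")
    return len(features), n


def scale_features(features, low=0.0, high=255.0):
    """Linearly rescale each feature row to the range [low, high].

    Per-row min-max scaling is an assumption. A constant row is mapped
    to `low`, since it has no spread to scale.
    """
    scaled = []
    for row in features:
        lo, hi = min(row), max(row)
        span = hi - lo
        if span == 0:
            scaled.append([low for _ in row])
        else:
            scaled.append([low + (x - lo) * (high - low) / span for x in row])
    return scaled
```

Only the features are rescaled here; the values vector is left untouched, matching the note that values are kept unchanged.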

24 SOLAR-V Project
Example: data from on-line control model

Data (4 states, 1 action) and value:
state    0.0074   0.0082   0.0090   0.0115   0.0156   0.0213   0.0287   0.0377   0.0483  …
state   -0.1366  -0.1548  -0.1739  -0.2220  -0.2995  -0.4067  -0.5443  -0.7131  -0.9139  …
state    0.1312   0.0014   0.1318   0.2622   0.3928   0.5234   0.6542   0.7851   0.9160  …
state   -0.1582  -0.0232  -0.1679  -0.3137  -0.4613  -0.6116  -0.7653  -0.9230  -1.0854  …
action  -0.0074   0.0259   0.0143   0.0268   0.1006   0.2359   0.4482   0.6783   0.8436  …
value    0        0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8     …

5 neurons per layer, 3 layers

