# Exact reconstruction of finite memory automata with the GSPS, and a surprising application to the reconstruction of cellular automata (James Nutaro)


Exact reconstruction of finite memory automata with the GSPS

And a surprising application to the reconstruction of cellular automata

James Nutaro, nutarojj@ornl.gov

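Reconstruction with the GSPS

- Begin with one or more time series
- Hypothesize a relationship between the variables in these time series
  - Visualized as a mask with squares for output and circles for input
- Construct an input-output model from the mask

Time series (columns as on the slide):

| t | v1 | v2 | v1 | v2 |
|---|----|----|----|----|
| 7 | A | B | A | B |
| 6 | A | B | A | B |
| 5 | B | A | B | B |
| 4 | A | B | B | A |
| 3 | B | B | B | A |
| 2 | B | A | A | B |
| 1 | B | B | A | A |

Input-output model:

| v1(t) | v1(t-1) | v2(t) | % |
|-------|---------|-------|---|
| A | A | B | 100 (3/3) |
| A | B | B | |
| B | A | A | 100 (2/2) |
| B | B | A | 50 (2/4) |
| | | B | |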

The reconstruction procedure, step #1

Time series: unchanged from the slide above.

Mask: v2(t) = f(v1(t), v1(t-1))

Input observation and output observation: B = f(A, A)

| v1(t) | v1(t-1) | v2(t) | % |
|-------|---------|-------|---|
| A | A | B | 100 (1/1) |

The reconstruction procedure, step #2

Time series: unchanged from the slide above.

Mask: v2(t) = f(v1(t), v1(t-1))

Input observation and output observation: B = f(A, B)

| v1(t) | v1(t-1) | v2(t) | % |
|-------|---------|-------|---|
| A | A | B | 100 (1/1) |
| A | B | B | |

The reconstruction procedure, step #3

Time series: unchanged from the slide above.

Mask: v2(t) = f(v1(t), v1(t-1))

Input observation and output observation: A = f(B, A)

| v1(t) | v1(t-1) | v2(t) | % |
|-------|---------|-------|---|
| A | A | B | 100 (1/1) |
| A | B | B | |
| B | A | A | |

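f may not be deterministic

Time series: unchanged from the slide above.

Mask: v2(t) = f(v1(t), v1(t-1))

Input observations and output observations: B = f(B, B) and A = f(B, B)

| v1(t) | v1(t-1) | v2(t) | % |
|-------|---------|-------|---|
| A | A | B | 100 (1/1) |
| A | B | B | |
| B | A | A | |
| B | B | A | 50 (1/2) |
| | | B | |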
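The reconstruction walked through above can be sketched in a few lines of Python. This is a minimal illustration, not the author's software. It assumes the four data columns form two separate time series, each recording (v1, v2); that reading reproduces the counts in the slide's final table. The names `SERIES_1`, `SERIES_2`, and `reconstruct` are hypothetical.

```python
from collections import Counter, defaultdict

# Assumption: the slide's table holds two time series of (v1, v2),
# listed here from t=1 to t=7.
SERIES_1 = [("B", "B"), ("B", "A"), ("B", "B"), ("A", "B"),
            ("B", "A"), ("A", "B"), ("A", "B")]
SERIES_2 = [("A", "A"), ("A", "B"), ("B", "A"), ("B", "A"),
            ("B", "B"), ("A", "B"), ("A", "B")]


def reconstruct(series_list):
    """Count v2(t) for every input pattern (v1(t), v1(t-1))."""
    table = defaultdict(Counter)
    for series in series_list:
        # The mask needs v1(t-1), so the first usable time is the second row.
        for t in range(1, len(series)):
            pattern = (series[t][0], series[t - 1][0])
            table[pattern][series[t][1]] += 1
    return table


table = reconstruct([SERIES_1, SERIES_2])
for pattern, counts in sorted(table.items()):
    total = sum(counts.values())
    for symbol, n in sorted(counts.items()):
        print(pattern, "->", symbol, f"{100 * n // total}% ({n}/{total})")
```

The pattern (B, B) is followed by A twice and by B twice, which is the non-deterministic case the last slide points out.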

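Simulation with the GSPS

- Begin with first observation and observations of all data not generated by the model
- Generate subsequent observations with the model

| t | v1 | v2 |
|---|----|----|
| 7 | A | |
| 6 | B | |
| 5 | A | |
| 4 | A | |
| 3 | B | B |
| 2 | B | A |
| 1 | B | B |

| v1(t) | v1(t-1) | v2(t) | % |
|-------|---------|-------|---|
| A | A | B | 100 (3/3) |
| A | B | B | |
| B | A | A | 100 (2/2) |
| B | B | A | 50 (2/4) |
| | | B | |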

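A simulation with the GSPS, step #1

- First observation is v1(t) = v1(t-1) = B
- Outcome is A with 50% chance and B with 50% chance
  - A selected at random

| t | v1 | v2 |
|---|----|----|
| 7 | A | |
| 6 | B | |
| 5 | A | |
| 4 | A | |
| 3 | B | |
| 2 | B | A |
| 1 | B | B |

| v1(t) | v1(t-1) | v2(t) | % |
|-------|---------|-------|---|
| A | A | B | 100 (3/3) |
| A | B | B | |
| B | A | A | 100 (2/2) |
| B | B | A | 50 (2/4) |
| | | B | |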

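A simulation with the GSPS, step #2

- Second observation is v1(t) = v1(t-1) = B
- Outcome is A with 50% chance and B with 50% chance
  - B selected at random

| t | v1 | v2 |
|---|----|----|
| 7 | A | |
| 6 | B | |
| 5 | A | |
| 4 | A | |
| 3 | B | B |
| 2 | B | A |
| 1 | B | B |

| v1(t) | v1(t-1) | v2(t) | % |
|-------|---------|-------|---|
| A | A | B | 100 (3/3) |
| A | B | B | |
| B | A | A | 100 (2/2) |
| B | B | A | 50 (2/4) |
| | | B | |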
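The simulation steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the GSPS implementation: the `MODEL` dictionary and the `simulate` function are hypothetical names, and the pattern (A, B) is treated as always producing B because B is the only outcome the table lists for it.

```python
import random

# Reconstructed model: (v1(t), v1(t-1)) -> {v2(t): probability}.
# Assumption: (A, B) always yields B, the only outcome listed for it.
MODEL = {
    ("A", "A"): {"B": 1.0},
    ("A", "B"): {"B": 1.0},
    ("B", "A"): {"A": 1.0},
    ("B", "B"): {"A": 0.5, "B": 0.5},
}


def simulate(v1, v2_first, model, rng):
    """Generate v2 for t=2..T from the known v1 series and the first v2."""
    v2 = [v2_first]
    for t in range(1, len(v1)):
        outcomes = model[(v1[t], v1[t - 1])]
        symbols = list(outcomes)
        weights = [outcomes[s] for s in symbols]
        # Draw one outcome according to the observed frequencies.
        v2.append(rng.choices(symbols, weights=weights, k=1)[0])
    return v2


V1 = ["B", "B", "B", "A", "A", "B", "A"]  # v1 at t=1..7, from the slide
rng = random.Random(0)                     # seeded only for repeatability
v2 = simulate(V1, "B", MODEL, rng)
print(list(zip(V1, v2)))
```

Only the values at t=2 and t=3 depend on the random draw; from t=4 onward the input patterns are deterministic in the table, so the generated v2 is fixed there.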

Finite memory automata

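Examples of finite memory automata

[Figure: two state diagrams, each with states a and b and transitions labeled input/output. First diagram labels: 0/0, 1/1, 0/0. Second diagram labels: 1/0, 1/1, 0/1, 0/0.]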

Not a finite memory automaton

[Figure: state diagram with states a and b and transitions labeled input/output: 1/1, 0/1, 0/0.]

Consider the input string 1111111110. What is the outcome? We can’t know.

GSPS and finite memory automata

- Given a complete set of observations of a finite memory automaton, there is a mask that can exactly reconstruct its input/output behavior.
- This mask is the one corresponding to the function [equation not preserved in the transcript].
- The number of unique entries in a complete set of observations is at most [expression not preserved in the transcript].

[Figure: masks drawn over a table with columns t, x, y.]
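For orientation, a common textbook form of a finite memory automaton of order μ is given below. This is an assumption offered for context: the slide's own formula is not preserved, and its notation may differ. Here x is the input, y the output, and μ the memory length.

```latex
y(t) = f\bigl(x(t),\, x(t-1),\, \dots,\, x(t-\mu),\; y(t-1),\, \dots,\, y(t-\mu)\bigr)
```

Under this form, a mask covering the current input together with the previous μ inputs and outputs would contain everything f depends on.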

GSPS and stochastic, finite memory automata

[Figure: state diagram with a single state a, transitions labeled 1/1 and 1/0, and probabilities 0.9 and 0.1.]

Example of a stochastic automaton with single input, single state, and two outputs.
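A small sketch of how frequency counting might recover such a stochastic automaton from data. It assumes the probability 0.9 belongs to the transition 1/1 and 0.1 to 1/0; the flattened figure does not make the pairing explicit. The function names are hypothetical.

```python
import random
from collections import Counter


def run_stochastic_automaton(n_steps, p_one, rng):
    """Single state, constant input 1: emit 1 with probability p_one, else 0."""
    return [(1, 1 if rng.random() < p_one else 0) for _ in range(n_steps)]


def reconstruct_memoryless(observations):
    """Estimate P(output | input) from a list of (input, output) pairs."""
    counts = Counter(observations)
    totals = Counter(x for x, _ in observations)
    return {(x, y): n / totals[x] for (x, y), n in counts.items()}


rng = random.Random(42)  # seeded only for repeatability
observations = run_stochastic_automaton(10_000, 0.9, rng)
estimate = reconstruct_memoryless(observations)
print(estimate)
```

With enough observations the estimated frequencies approach the transition probabilities, which is the sense in which the percentage column of a GSPS table describes a stochastic automaton.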

Cellular automata

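Wolfram’s rule #24

[Figure: cellular automaton evolution, rows labeled t = 5 down to t = 1.]

| t | x | y | x | y |
|---|---|---|---|---|
| 5 | W | B | B | W |
| 4 | B | B | W | W |
| 3 | B | W | W | B |
| 2 | W | W | B | B |
| 1 | W | B | B | W |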

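Reconstruction of Wolfram’s rule #24

| t | x | y | x | y |
|---|---|---|---|---|
| 5 | W | B | B | W |
| 4 | B | B | W | W |
| 3 | B | W | W | B |
| 2 | W | W | B | B |
| 1 | W | B | B | W |

| x(t-1) | y(t-1) | y(t) |
|--------|--------|------|
| W | W | W |
| W | B | W |
| B | W | B |
| B | B | B |

[Figure: two panels, "Reconstruction" and "Simulation", with rows labeled t = 5 down to t = 1.]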
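The reconstruction and simulation on this slide can be reproduced with a short sketch. It assumes each adjacent (x, y) column pair in the table is treated as its own time series, and it uses the mask y(t) = f(x(t-1), y(t-1)) shown in the slide's table. `GRID`, `xy_series`, `reconstruct`, and `simulate_y` are hypothetical names.

```python
# Rows t=1..5 of the slide's table; columns are x, y, x, y.
GRID = ["WBBW", "WWBB", "BWWB", "BBWW", "WBBW"]


def xy_series(grid, x_col, y_col):
    """Extract one (x, y) time series from two columns of the grid."""
    return [(row[x_col], row[y_col]) for row in grid]


def reconstruct(series_list):
    """Map (x(t-1), y(t-1)) to the set of y(t) values observed after it."""
    table = {}
    for series in series_list:
        for prev, cur in zip(series, series[1:]):
            table.setdefault(prev, set()).add(cur[1])
    return table


def simulate_y(x_values, y_first, table):
    """Regenerate y from known x; requires a deterministic table."""
    y = [y_first]
    for t in range(1, len(x_values)):
        # Unpacking fails loudly if a pattern has more than one outcome.
        (next_y,) = table[(x_values[t - 1], y[t - 1])]
        y.append(next_y)
    return y


pairs = [xy_series(GRID, 0, 1), xy_series(GRID, 2, 3)]
table = reconstruct(pairs)
x_known = [x for x, _ in pairs[0]]
y_simulated = simulate_y(x_known, pairs[0][0][1], table)
print(table)
print(y_simulated)
```

Every pattern in the recovered table has exactly one outcome, so the simulated y column matches the observed one exactly.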

Activity in cellular automata

Activity and computational costs

Is exact reconstruction of highly active systems feasible?

Problem posed by highly active systems

- The necessary data grows exponentially with the variety of input and output
  - Exponential growth factor increases with the memory
- Can quickly reach peta- and exascale data

Taming activity: directions for research

- Simplification
  - Preserve essential behaviors while reducing the level of activity
- High performance computing
  - GSPS algorithms implemented for large-scale computing and storage systems
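One way to get a feel for this growth is to count the distinct input patterns a mask can see. The sketch below assumes a mask with `mask_cells` input cells, each taking one of `n_symbols` values, which gives `n_symbols ** mask_cells` possible patterns. The function name and the example sizes are illustrative, not taken from the talk.

```python
def max_input_patterns(n_symbols: int, mask_cells: int) -> int:
    """Upper bound on distinct input patterns for a mask of the given size."""
    return n_symbols ** mask_cells


# The two-symbol, two-cell mask from the earlier slides has 4 patterns.
print(max_input_patterns(2, 2))

# Hypothetical larger case: 16 symbols, 2 variables, growing memory.
for memory in range(1, 6):
    cells = 2 * (memory + 1)  # each variable seen at t, t-1, ..., t-memory
    print(memory, max_input_patterns(16, cells))
```

Each extra step of memory multiplies the bound by a constant factor, so a complete set of observations quickly becomes very large.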

In conclusion… a curious example of simplification and HPC

- Simulated tumor growth at day 90 beginning from 5 occupied pixels on day 1.
- Expected error in size of the tumor’s bounding box at 90 days is 3 pixels.
- Simplification: GSPS model has c. 190,000 possible observations at each cell; biological model has millions.
- Computing: Divide and conquer type parallel algorithm for constructing the GSPS table; required c. 2 days of computing on four cores to process c. 250,000,000 time series.
- Software for this example @ http://sourceforge.net/projects/gsps/

[Figure: (a) Biologically based simulation; (b) GSPS simulation based on data produced by (a).]
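The divide and conquer idea can be illustrated with a small sketch: split the time series into chunks, count observations in each chunk independently, then merge the partial counts. This is not the algorithm from the SourceForge project, whose details are not given here. It reuses the two-input mask from the earlier slides, and `count_chunk` and `build_table` are hypothetical names.

```python
from collections import Counter
from functools import reduce


def count_chunk(series_chunk):
    """Count (v1(t), v1(t-1), v2(t)) observations in one chunk of series."""
    counts = Counter()
    for series in series_chunk:
        for t in range(1, len(series)):
            counts[(series[t][0], series[t - 1][0], series[t][1])] += 1
    return counts


def build_table(all_series, n_chunks):
    """Divide the series into chunks, count each, and merge the results."""
    chunks = [all_series[i::n_chunks] for i in range(n_chunks)]
    # The built-in map runs serially; a process pool's map could be
    # substituted here because the chunks are independent.
    partials = map(count_chunk, chunks)
    return reduce(lambda a, b: a + b, partials, Counter())


series = [
    [("B", "B"), ("B", "A"), ("B", "B"), ("A", "B")],
    [("A", "A"), ("A", "B"), ("B", "A"), ("B", "A")],
] * 4
serial = build_table(series, 1)
chunked = build_table(series, 4)
print(serial == chunked)
```

Because merging is plain addition of counts, the result does not depend on how the series are divided, which is what makes the table construction easy to spread across cores.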
