Can Genetic Programming Do Manifold Learning Too?


1 Can Genetic Programming Do Manifold Learning Too?
Andrew Lensen, A/Prof. Bing Xue, and Prof. Mengjie Zhang. Victoria University of Wellington, New Zealand. EuroGP'19.

2 Manifold Learning
An approach to non-linear dimensionality reduction. Assumes the dimensionality of datasets is only artificially high, so we can represent the structure of the data with fewer dimensions with little or no loss of information.
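As a concrete illustration (not from the paper), scikit-learn's swiss roll is the textbook example: 3-D data whose intrinsic structure is a 2-D sheet.

```python
# Illustrative sketch: the swiss roll is 3-D data whose intrinsic
# structure is a 2-D sheet, so two dimensions suffice to describe it.
from sklearn.datasets import make_swiss_roll

X, t = make_swiss_roll(n_samples=1000, random_state=0)
print(X.shape)  # (1000, 3): the ambient dimensionality is 3
# 't' parameterises the position along the roll; together with the
# roll's height, it pins down every point with only 2 degrees of freedom.
```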

3 State-of-the-art Approaches (1)
t-Distributed Stochastic Neighbour Embedding (t-SNE), for visualisation. Autoencoders. Older methods such as: Multidimensional Scaling (MDS), Locally-Linear Embedding (LLE).

4 State-of-the-art Approaches (2)
Particular success in visualisation tasks.

5 But... one critical limitation
How do the learnt representations relate to the original features? The low-dim space is optimised according to the relationships in the high-dim space, but there is no mapping from high to low.  Hard to interpret the meaning of the low-dim space, and we have to re-train whenever new instances arrive.
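A minimal sketch of this limitation with scikit-learn (an illustration, not the paper's experiment): PCA learns an explicit mapping and exposes transform() for new instances, while TSNE only offers fit_transform, so unseen instances require re-running the whole optimisation.

```python
# PCA yields a re-usable mapping; t-SNE does not.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 10))
X_new = rng.normal(size=(5, 10))

pca = PCA(n_components=2).fit(X_train)
Z_new = pca.transform(X_new)            # fine: explicit high->low mapping

Z = TSNE(n_components=2).fit_transform(X_train)
# No TSNE.transform(X_new): the embedding is optimised directly, so new
# instances force a full re-run with the old and new data combined.
```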

6 What about GP? Tree-based GP can easily produce functions that take inputs as terminals and produce an output at the root. Intrinsically suited to interpretability: lots of existing research. A global learner: less prone to local minima than gradient descent? No need for a differentiable fitness function (unlike GD/AEs).
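A toy sketch (hypothetical, not the authors' implementation) of what makes trees attractive here: a tree is an explicit, inspectable function from feature terminals to a single value at the root.

```python
# Toy GP tree evaluator: terminals are feature indices or constants,
# internal nodes are (operator, child, child, ...) tuples.
import numpy as np

def evaluate(node, x):
    if isinstance(node, float):   # terminal: ephemeral constant
        return node
    if isinstance(node, int):     # terminal: index into the features
        return x[node]
    op, *children = node          # internal node: apply the operator
    return op(*(evaluate(c, x) for c in children))

# The function (x3 * 0.5) + max(x0, x1), written as nested tuples:
tree = (np.add, (np.multiply, 3, 0.5), (np.maximum, 0, 1))
x = np.array([2.0, 4.0, 1.0, 6.0])
print(evaluate(tree, x))  # 6.0*0.5 + max(2.0, 4.0) = 7.0
```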

7 Yet, no existing use of GP for manifold learning?
GP-MaL: GP for Manifold Learning

8 Goals of GP-MaL
Develop a new multi-tree GP approach to perform manifold learning. What GP representation should be used? Which fitness function best measures how well structure is preserved? How good is GP-MaL vs existing MaL algorithms? Can we interpret the learnt manifolds?

9 GP-MaL: Representation
Use t trees per GP individual, to give a t-dimensional low-dim space. Similar representation to using GP for feature construction.
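A sketch of the multi-tree idea under the same toy evaluator as above (an assumption, not the paper's code): an individual holds t trees, and applying each tree to an instance yields one coordinate of its t-dimensional embedding.

```python
import numpy as np

def evaluate(node, x):  # same toy tree evaluator as in the earlier sketch
    if isinstance(node, float):
        return node
    if isinstance(node, int):
        return x[node]
    op, *children = node
    return op(*(evaluate(c, x) for c in children))

def embed(individual, X):
    """One tree per output dimension: t trees give a t-dim embedding."""
    return np.array([[evaluate(tree, x) for tree in individual] for x in X])

tree_a = (np.add, 0, 1)        # first output dimension: x0 + x1
tree_b = (np.maximum, 2, 0.0)  # second output dimension: max(x2, 0)
X = np.array([[1.0, 2.0, -3.0],
              [0.5, 0.5, 4.0]])
print(embed([tree_a, tree_b], X))  # shape (2, 2): a 2-D embedding
```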

10 GP-MaL: Terminal & Function Sets
Terminals: the d scaled input features, plus random constants in U[-1, +1] for weighting. Functions: combining many sub-trees is necessary to model complex manifolds. We exclude a − b and a ÷ b as they are the complements of addition and multiplication, and they were found to negatively affect the learning process by easily producing constant values (e.g. X − X = 0, X ÷ X = 1). An if function, in addition to max and min, allows complex conditional behaviour and non-continuous functions to be generated.
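A hedged sketch of a function/terminal set in this spirit; the paper's exact operators may differ, but the exclusions and the conditional behaviour follow the slide.

```python
import numpy as np

def if_else(cond, a, b):
    """3-ary conditional: enables non-continuous, conditional behaviour."""
    return np.where(cond > 0, a, b)

# Subtraction and division are deliberately absent (X - X = 0 and
# X / X = 1 collapse sub-trees into constants too easily).
FUNCTIONS = {
    "add": (np.add, 2),
    "mul": (np.multiply, 2),
    "max": (np.maximum, 2),
    "min": (np.minimum, 2),
    "if":  (if_else, 3),
}

# Terminals: the d scaled input features plus random weighting
# constants drawn from U[-1, +1].
d = 10
rng = np.random.default_rng(0)
TERMINALS = [f"x{i}" for i in range(d)] + [float(rng.uniform(-1.0, 1.0))]
```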

11 Fitness Function (1) A common strategy in MaL is to encourage preserving the high-dim neighbourhood structure in the low-dim space. Attempting to maintain distances has issues, so we instead attempt to preserve the orderings of neighbours.

12 Fitness Function (2) Consider instance I with high-dim neighbours $N = \{N_1, N_2, \ldots, N_{n-1}\}$, and neighbours $N'$ in the low-dim space. In a perfect situation, $N = N'$, i.e. the neighbour ordering is preserved. We define the similarity between two orderings as: $\mathit{Similarity}(N, N') = \sum_{a \in N} \mathit{Agreement}\big(|\mathit{Pos}(a, N) - \mathit{Pos}(a, N')|\big)$, where Pos gives the position of a neighbour in an ordering.
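A sketch of the ordering comparison, assuming a Gaussian-style Agreement (the paper's exact weighting may differ):

```python
import numpy as np

def agreement(deviation, sigma=2.0):
    """Gaussian weighting: larger position deviations earn less credit."""
    return np.exp(-deviation**2 / (2 * sigma**2))

def similarity(N, N_prime):
    """Compare the high-dim neighbour ordering N with the low-dim N'."""
    pos_prime = {a: i for i, a in enumerate(N_prime)}
    return sum(agreement(abs(i - pos_prime[a])) for i, a in enumerate(N))

print(similarity([4, 2, 7, 9], [4, 2, 7, 9]))  # 4.0: ordering preserved
print(similarity([4, 2, 7, 9], [4, 2, 9, 7]))  # < 4.0: one swap costs credit
```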

13 Fitness Function (3) The fitness is then the normalised similarity across the dataset: $\mathit{Fitness} = \frac{1}{n^2} \sum_{I \in X} \mathit{Similarity}(N_I, N'_I)$. We use an Agreement function based on a Gaussian weighting: larger deviations in position are punished more harshly.
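Continuing the sketch above (it assumes the similarity function just defined): normalising by n² means a perfect embedding, where every instance's ordering is preserved, scores close to 1.

```python
def fitness(high_dim_orders, low_dim_orders):
    """Mean ordering similarity over all n instances, scaled by 1/n^2.

    Each instance contributes at most n-1 (its neighbour count), so a
    perfect embedding scores n(n-1)/n^2, i.e. just under 1.
    """
    n = len(high_dim_orders)
    return sum(similarity(N, N_p)
               for N, N_p in zip(high_dim_orders, low_dim_orders)) / n**2
```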

14 Fitness Function: an issue…
A robust fitness function, but expensive: $O(n^2 \log n)$ per individual. However, we can omit some neighbours from the function without overly affecting fitness correctness (as the model is a function!). We should consider nearer neighbours more carefully than distant ones: consider the first k neighbours, k of the next 2k, k of the next 4k, and so on. This reduces the complexity to a sublinear $O(\log n \cdot \log \log n)$.
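A sketch of the subsampling schedule as described on the slide; which k neighbours to keep within each block is not specified, so evenly spaced is assumed here.

```python
def sampled_neighbour_indices(n_neighbours, k):
    """All of the first k neighbours, then k of the next 2k, k of the
    next 4k, ... (evenly spaced within each doubling block)."""
    indices, start, block = [], 0, k
    while start < n_neighbours:
        step = max(block // k, 1)  # keep k evenly spaced out of `block`
        indices.extend(range(start, min(start + block, n_neighbours), step))
        start += block
        block *= 2                 # blocks double in size: k, 2k, 4k, ...
    return indices

print(sampled_neighbour_indices(100, 4))
# [0, 1, 2, 3, 4, 6, 8, 10, 12, 16, 20, 24, 28, 36, 44, 52, 60, 76, 92]
```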

15 Experiment Design Use classification accuracy as a proxy for the quality of a manifold. Avoids inadvertently biasing towards any one MaL algorithm. Random forest, 10-fold cross-validation. Compare to PCA, MDS, LLE, t-SNE. Test at different levels of low dimensionality. 40 repetitions for each stochastic method.
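A sketch of the evaluation protocol (scikit-learn defaults assumed where the slide is silent): embed the data, then score a random forest with 10-fold cross-validation on the low-dim representation.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def embedding_quality(Z, y, seed=0):
    """Proxy for manifold quality: accuracy of a random forest,
    10-fold cross-validated, on the embedded data Z."""
    clf = RandomForestClassifier(random_state=seed)
    return cross_val_score(clf, Z, y, cv=10, scoring="accuracy").mean()
```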

16 Datasets

17 Results (1) At 2 dimensions: GP beats PCA & MDS (too few dimensions for linear reductions). At 3: GP still beats PCA, roughly even with MDS. t-SNE is better, as it is designed for 2D/3D visualisation.

18 Results (2) At 5/CR dimensions: GP clearly outperforms LLE and t-SNE. It begins to do worse vs PCA (as there are now enough linear components to cover the manifold). Pretty good considering GP evolves a functional mapping!

19 GP-MaL for Visualisation: Dermatology

20 GP-MaL for Visualisation: COIL20
t-SNE is clearly the best. It is not clear which of GP-MaL, PCA, and MDS produces the next-best result: MDS tends to incorrectly separate some classes (as t-SNE does), whereas GP-MaL and PCA have poorer separation of the different classes overall.

21 GP-MaL: Interpretability
Madelon, 500 features. (c): either a linear or non-linear transformation of X475, depending on X138. (d): more complex, but again shows conditional behaviour not modellable by other methods. (e): clearly, X455 is a very important feature. This is a clear advantage over existing manifold learning techniques, which are black (or very grey) boxes, and bodes well for future work.

22 Summary GP has a number of qualities that make it uniquely suitable for interpretable manifold learning. GP-MaL shows the potential of this area: Competitive performance; Re-usable functional models; Evolved trees show promising interpretability.

23 Future Work More advanced function/terminal sets. Broader investigation into fitness functions. Improvements to interpretability via parsimony pressure, etc.

24 Thank You!

25 GP-MaL Parameters

