
1 A Computational Model of Accelerated Future Learning through Feature Recognition. Nan Li, Computer Science Department, Carnegie Mellon University. Building an intelligent agent that simulates human-level learning using machine learning techniques.

2 Accelerated Future Learning: learning more effectively because of prior learning. The effect has been widely observed, but how does it arise? Compare experts and novices: an expert attends to a deep functional feature (e.g., -3x → -3), while a novice attends to a shallow perceptual feature (e.g., -3x → 3).
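A minimal sketch of this contrast, using regular expressions rather than anything from the model itself: the "deep" extractor treats the signed coefficient as one functional unit, while the "shallow" extractor grabs only the visually salient digits.

```python
import re

def deep_feature(term: str) -> str:
    """Expert-like: the signed coefficient is one functional unit (-3x -> -3)."""
    m = re.match(r"([+-]?\d+)[a-z]", term)
    return m.group(1) if m else ""

def shallow_feature(term: str) -> str:
    """Novice-like: only the visually salient digits (-3x -> 3)."""
    m = re.search(r"\d+", term)
    return m.group(0) if m else ""

print(deep_feature("-3x"))     # -3
print(shallow_feature("-3x"))  # 3
```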

3 A Computational Model: model accelerated future learning using machine learning techniques; acquire the deep feature; integrate it into a machine-learning agent.

4 An Example in Algebra

5 Feature Recognition as PCFG Induction: the underlying structure of the problem corresponds to a grammar; a feature corresponds to an intermediate symbol in a grammar rule; the feature learning task becomes grammar induction; and an error corresponds to an incorrect parse.
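To make the correspondence concrete, here is one way a tiny PCFG for -3x could be written down (the symbol names are hypothetical, not taken from the slides); the intermediate symbol SignedNum plays the role of the deep feature.

```python
# Each nonterminal maps to a list of (right-hand side, probability) pairs.
pcfg = {
    "Expr":      [(("SignedNum", "Var"), 1.0)],
    "SignedNum": [(("Sign", "Num"), 0.5), (("Num",), 0.5)],
    "Sign":      [(("-",), 0.5), (("+",), 0.5)],
    "Num":       [(("3",), 0.5), (("5",), 0.5)],
    "Var":       [(("x",), 0.5), (("y",), 0.5)],
}
```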

6 Problem Statement. Input: a set of feature recognition records, each consisting of an original problem (e.g., -3x) and the feature to be recognized in it (e.g., -3 in -3x). Output: a PCFG together with an intermediate symbol in a grammar rule that encodes the feature.
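The slides do not show the exact record encoding, so the following is an assumed one: each record pairs the problem string with the character span of the annotated feature.

```python
from dataclasses import dataclass

@dataclass
class FeatureRecord:
    problem: str                   # e.g. "-3x"
    feature_span: tuple[int, int]  # e.g. (0, 2), marking "-3" inside "-3x"

records = [FeatureRecord("-3x", (0, 2)), FeatureRecord("-5y", (0, 2))]
```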

7 Accelerated Future Learning through Feature Recognition. Extended a PCFG learning algorithm (Li et al., 2009) with: feature learning; stronger prior knowledge (transfer learning using prior knowledge); and a better learning strategy (effective learning using a bracketing constraint).

8 A Two-Step Algorithm. Greedy Structure Hypothesizer (GSH): hypothesizes the schema structure. Viterbi training phase: refines schema probabilities and removes redundant schemas; generalizes the inside-outside algorithm (Lari & Young, 1990).

9 Greedy Structure Hypothesizer. Learns structure bottom-up, preferring recursive rules to non-recursive ones; a simplified sketch follows.
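The slides do not spell out the procedure, so this is a much-simplified, BPE-style stand-in for bottom-up structure hypothesizing: repeatedly merge the most frequent adjacent symbol pair into a fresh nonterminal. The real GSH also handles recursion and rule probabilities.

```python
from collections import Counter

def hypothesize_structure(sequences):
    """Merge the most frequent adjacent symbol pair into a new nonterminal,
    bottom-up, until no pair repeats."""
    rules = {}                      # new nonterminal -> (left, right)
    seqs = [list(s) for s in sequences]
    next_id = 0
    while True:
        pairs = Counter()
        for s in seqs:
            pairs.update(zip(s, s[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:               # stop once no pair occurs twice
            break
        sym = f"N{next_id}"
        next_id += 1
        rules[sym] = (a, b)
        for s in seqs:              # rewrite each occurrence of the pair
            i = 0
            while i < len(s) - 1:
                if s[i] == a and s[i + 1] == b:
                    s[i:i + 2] = [sym]
                i += 1
    return rules

print(hypothesize_structure(["-3x", "-3y", "-5x"]))  # {'N0': ('-', '3')}
```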

10 EM Phase. Step one: plan parse tree computation (find the most probable parse tree). Step two: selection probability update for each rule s: a_i → a_j a_k with probability p.
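One standard way to realize the update in step two (a sketch, not necessarily the slides' exact formula): re-estimate each rule's probability as its count in the most probable parse trees divided by the total count of its left-hand side.

```python
from collections import Counter

def update_probabilities(viterbi_rule_counts):
    """p(a_i -> a_j a_k) = count(rule in Viterbi parses) / count(a_i)."""
    lhs_totals = Counter()
    for (lhs, _rhs), n in viterbi_rule_counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]]
            for rule, n in viterbi_rule_counts.items()}

# Hypothetical counts gathered from the most probable parse trees:
counts = {("Expr", ("SignedNum", "Var")): 6,
          ("SignedNum", ("Sign", "Num")): 4,
          ("SignedNum", ("Num",)): 2}
print(update_probabilities(counts))  # SignedNum rules get 2/3 and 1/3
```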

11 Feature Learning. Build the most probable parse trees for all observation sequences, then select the intermediate symbol that matches the most training records as the target feature.
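A sketch of that selection step, assuming each parse tree is given as (symbol, span) pairs and each record carries the annotated feature span (both encodings are assumptions):

```python
from collections import Counter

def select_feature_symbol(parse_trees, feature_spans):
    """Vote for the intermediate symbol whose span matches the annotated
    feature in the most training records."""
    votes = Counter()
    for tree, span in zip(parse_trees, feature_spans):
        for symbol, node_span in tree:
            if node_span == span:
                votes[symbol] += 1
    return votes.most_common(1)[0][0]

trees = [[("Expr", (0, 3)), ("SignedNum", (0, 2))],
         [("Expr", (0, 3)), ("SignedNum", (0, 2))]]
print(select_feature_symbol(trees, [(0, 2), (0, 2)]))  # SignedNum
```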

12 Transfer Learning Using Prior Knowledge. GSH phase: build parse trees based on the previously acquired grammar, then call the original GSH. Viterbi training: add each rule's frequency from the previous task to its frequency in the current task.
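A sketch of that frequency transfer (the rule encoding, as before, is an assumption): counts learned on the previous task are added to the current task's counts before probabilities are re-estimated.

```python
def transfer_counts(previous, current):
    """Add rule frequencies from the previous task to the current task."""
    merged = dict(current)
    for rule, n in previous.items():
        merged[rule] = merged.get(rule, 0) + n
    return merged

prev = {("SignedNum", ("Sign", "Num")): 6}
curr = {("SignedNum", ("Sign", "Num")): 3, ("SignedNum", ("Num",)): 3}
print(transfer_counts(prev, curr))  # {('SignedNum', ('Sign', 'Num')): 9, ...}
```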

13 Effective Learning Using Bracketing Constraint. Force the grammar to generate a feature symbol: learn a subgrammar for the feature, learn a grammar for the whole trace, and combine the two grammars.
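A minimal sketch of the combination step, assuming grammars are dictionaries from nonterminals to (right-hand side, probability) lists and that the whole-trace grammar already refers to the feature symbol; the slides leave the actual combination mechanics unspecified.

```python
def combine_grammars(feature_grammar, trace_grammar):
    """Union of the two rule sets; the feature subgrammar's start symbol is
    the feature symbol referenced by the trace grammar."""
    combined = dict(trace_grammar)
    combined.update(feature_grammar)
    return combined

feature = {"Feature": [(("Sign", "Num"), 1.0)]}
trace = {"Expr": [(("Feature", "Var"), 1.0)]}
print(combine_grammars(feature, trace))
```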

14 Experiment Design in Algebra

15 Experiment Results in Algebra. Fig. 2: curriculum one. Fig. 3: curriculum two. Fig. 4: curriculum three. Both stronger prior knowledge and a better learning strategy can yield accelerated future learning. Strong prior knowledge produces faster learning outcomes. L00 generated human-like errors.

16 Learning Speed in Synthetic Domains. Both stronger prior knowledge and a better learning strategy yield faster learning. Strong prior knowledge produces faster learning outcomes with small amounts of training data, but not with large amounts. Learning with subtask transfer shows a larger difference in 1) the training process and 2) low-level symbols.

17 Score with Increasing Domain Sizes. The base learner, L00, shows the fastest drop. Average time spent per training record: less than 1 millisecond for all learners except L10 (266 milliseconds). L10 needs to maintain previous knowledge and does not separate the trace into smaller traces. Conciseness: transfer learning doubled the size of the schema.

18 Integrating Accelerated Future Learning in SimStudent. SimStudent is a machine-learning agent that acquires production rules from examples and problem-solving experience. The acquired grammar is integrated into the production rules; this requires only weak operators (non-domain-specific knowledge) and fewer operators overall.
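A hypothetical illustration (the names and rule format are mine, not SimStudent's) of how a grammar-derived deep feature could feed a production rule:

```python
import re

def coefficient_feature(term: str):
    """Stand-in for the grammar-derived deep feature (-3x -> -3)."""
    m = re.match(r"([+-]?\d+)[a-z]", term)
    return m.group(1) if m else None

def divide_rule(equation: str):
    """IF the left-hand side has a signed coefficient
    THEN divide both sides by it."""
    lhs, _rhs = equation.split("=")
    coef = coefficient_feature(lhs.strip())
    return f"divide both sides by {coef}" if coef else None

print(divide_rule("-3x = 6"))  # divide both sides by -3
```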

19 Concluding Remarks. Presented a computational model of human learning that yields accelerated future learning. Showed that both stronger prior knowledge and a better learning strategy improve learning efficiency, and that stronger prior knowledge produced faster learning outcomes than a better learning strategy. Some models generated human-like errors, while others did not make any mistakes.
