Using Relational Structure for Learning and Modeling in Biomedical and Social Domains
  Mark Goadrich
  Computer Science and Mathematics
  Centenary College of Louisiana
  Natural Science Colloquium
  November 6th, 2007
Overview
  First-Order Logic and Machine Learning
    – The world is full of Objects
    – Model these Objects to understand the world
  Inductive Logic Programming
    – Objects and Relations/Properties
  Agent-Based Modeling
    – Objects and Interactions/Behaviors
Bongard Problems
  6 positive examples of a concept on the left
  6 negative examples on the right
  How can a computer learn this concept?
First-Order Logic using PROLOG
  Positive Example 3
  Objects
    – e3, t1, t2, c1
  Types
    – example(e3)
    – triangle(t1)
    – triangle(t2)
    – circle(c1)
  Relations
    – has_shape(e3, t1)
    – has_shape(e3, t2)
    – has_shape(e3, c1)
    – inside(t2, c1)
    – left(t2, t1)
    – size(c1, 2.5)
    – above(t2, t1)
    – …
  Repeat this process for each example in the dataset
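As a rough illustration of the same encoding outside Prolog, the facts for example e3 can be held as a set of tuples and queried with wildcards. This is only a sketch: the tuple layout and the `query` helper are assumptions made for illustration, not part of the talk.

```python
# Facts for positive example e3, copied from the slide.
# Each fact is (predicate, arg1, arg2, ...); this encoding is an assumption.
facts = {
    ("example", "e3"),
    ("triangle", "t1"),
    ("triangle", "t2"),
    ("circle", "c1"),
    ("has_shape", "e3", "t1"),
    ("has_shape", "e3", "t2"),
    ("has_shape", "e3", "c1"),
    ("inside", "t2", "c1"),
    ("left", "t2", "t1"),
    ("size", "c1", 2.5),
    ("above", "t2", "t1"),
}


def query(facts, predicate, *pattern):
    """Return argument tuples of `predicate` matching `pattern` (None is a wildcard)."""
    results = []
    for fact in facts:
        name, args = fact[0], fact[1:]
        if name != predicate or len(args) != len(pattern):
            continue
        if all(p is None or p == a for p, a in zip(pattern, args)):
            results.append(args)
    return sorted(results, key=str)


# Which shapes belong to example e3?
shapes_in_e3 = query(facts, "has_shape", "e3", None)
```

A call such as `query(facts, "inside", None, "c1")` plays the role of the Prolog goal `inside(X, c1)`.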
Inductive Logic Programming (ILP)
  Search the space of possible rules of the form “positive(E) :- …”
  Judge rule quality by positive minus negative coverage
  Candidate rules, each a refinement of the one before:
    positive(E).
    positive(E) :- has_shape(E, A).
    positive(E) :- has_shape(E, A), triangle(A).
    positive(E) :- has_shape(E, A), has_shape(E, B), triangle(A), circle(B), inside(A, B).
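The coverage score on this slide can be sketched in a few lines. The four candidate rules below correspond to the four rules listed on the slide, written as plain Python predicates rather than learned clauses. The toy examples are made up for illustration, and a real ILP system searches the rule space automatically instead of scoring a fixed list.

```python
# Toy data (hypothetical): each example lists its shapes and its "inside" pairs.
positives = [
    {"triangle": {"t1", "t2"}, "circle": {"c1"}, "inside": {("t2", "c1")}},
    {"triangle": {"t3"}, "circle": {"c2"}, "inside": {("t3", "c2")}},
]
negatives = [
    {"triangle": {"t4"}, "circle": {"c3"}, "inside": {("c3", "t4")}},
    {"triangle": set(), "circle": {"c4"}, "inside": set()},
]


def any_example(ex):
    # positive(E).
    return True


def has_shape(ex):
    # positive(E) :- has_shape(E, A).
    return bool(ex["triangle"] or ex["circle"])


def has_triangle(ex):
    # positive(E) :- has_shape(E, A), triangle(A).
    return bool(ex["triangle"])


def triangle_in_circle(ex):
    # positive(E) :- ..., triangle(A), circle(B), inside(A, B).
    return any(a in ex["triangle"] and b in ex["circle"] for a, b in ex["inside"])


def score(rule):
    """Positive coverage minus negative coverage."""
    pos = sum(rule(ex) for ex in positives)
    neg = sum(rule(ex) for ex in negatives)
    return pos - neg


candidates = [any_example, has_shape, has_triangle, triangle_in_circle]
scores = {rule.__name__: score(rule) for rule in candidates}
best = max(candidates, key=score)
```

On this toy data the most specific rule scores highest because it still covers both positives while excluding both negatives.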
Research Issues in ILP
  Enormous space to search for rules
  Enormous number of examples
  Incorporation of continuous features
  Learning of probabilistic rules
  Evaluation of rule quality
  Survey of ILP domains and future interests
Mutagenesis
  Designing effective and selective drugs
  Represent chemicals as atoms and bonds between them
    atm(127, 127_1, c, 22, )
    bond(127, 127_1, 127_6, 7)
  Learned mutagenic rule:
    mutagenic(A) :- atm(A, B, c, 27, C), bond(A, D, E, 1), bond(A, B, E, 7).
Breast Cancer Detection
  Large dataset of abnormalities found in mammograms
  Not enough radiologists
  Relational features
    – More than one abnormality per mammogram
    – More than one mammogram per person over time
  malignant(A) :- not asymmetric(A), in_same_mammogram(A, A2), spiculated_margin(A2), not distorted(A2).
Robot Scientist
  Represent Metabolic Pathways as a Regulatory Network Graph
  Knock out genes, and then systematically deduce the unknown function
  Try to learn the network from time-series microarray data
Social Networks
  People are connected by friendships into networks
  Each person has likes/dislikes, possibly influenced by their network
  Can we learn your interests based on who you know and what they like?
  Targeted advertisements?
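One very simple baseline for the question on this slide is to predict that a person likes whatever a large enough share of their friends like. The sketch below assumes that baseline; the names, the data, and the `min_share` threshold are all hypothetical, and it is not the relational learner the talk has in mind.

```python
from collections import Counter

# Hypothetical friendship network and known likes.
friends = {"ann": ["bob", "cat", "dan"]}
likes = {
    "bob": {"jazz", "chess"},
    "cat": {"jazz"},
    "dan": {"hiking", "jazz"},
}


def predict_interests(person, friends, likes, min_share=0.5):
    """Predict interests shared by at least `min_share` of the person's friends."""
    circle = friends.get(person, [])
    if not circle:
        return []
    counts = Counter()
    for friend in circle:
        counts.update(likes.get(friend, set()))
    return sorted(
        interest
        for interest, count in counts.items()
        if count / len(circle) >= min_share
    )


predicted = predict_interests("ann", friends, likes)
```

Lowering `min_share` trades precision for coverage, which is the kind of knob a targeted-advertising system would tune.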
Netflix Prize
  What movies should Netflix recommend you watch next?
  Large relational dataset
    – Movies
    – Titles
    – Ratings
    – Friends
    – Friends’ ratings
    – Genre
  $1 million if you achieve a 10% improvement over their algorithm, Cinematch
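The Netflix Prize scored entries by root mean squared error (RMSE) on held-out ratings, so the 10% target is a relative reduction in RMSE. A minimal sketch of that arithmetic follows; the function names are illustrative, and the numbers in the usage lines are made up, not actual Cinematch results.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_rmse, new_rmse):
    """Fractional RMSE reduction relative to the baseline (0.10 means 10%)."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# Hypothetical ratings on a 1-5 scale.
actual = [4, 3, 5, 2]
baseline = rmse([3, 3, 4, 3], actual)
candidate = rmse([4, 3, 4, 3], actual)
gain = relative_improvement(baseline, candidate)
```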
Zendo
  Board game about inductive logic
  Master creates a rule which some 3-D pyramid structures fit and others do not
  Players build structures and try to guess the Master’s rule
  Easier to design a computer Master that decides if a structure fits the rule
  Harder to design a computer Player that must efficiently guess the rule
Crab Claws
  What physical characteristics distinguish between two species?
  Within the same species, what changes due to growth, diet and their relation to predation?
  Find the “shock graph” of each image
  Use ILP to learn differences based on these graphs
Agent-Based Modeling
  Objects have interactions with each other
    – Flocks of Birds, Schools of Fish
    – Separation, Alignment, Cohesion
  Objects interact with their environment
    – Ant Foraging, Pheromones, Traffic Laws
  Agent-Based Modeling (ABM)
    – Create discrete-time computational simulation
    – Align models with known behavior
    – Vary parameters to test new hypotheses
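The three flocking rules named on this slide can be sketched as one discrete-time update. This is a minimal illustration under assumed parameters: the weights, the separation radius, and the tuple layout are all hypothetical, and every boid treats all other boids as neighbors.

```python
def step(boids, sep_radius=1.0, w_sep=0.05, w_ali=0.05, w_coh=0.01):
    """Advance all boids one discrete time step; each boid is (x, y, vx, vy)."""
    updated = []
    for i, (x, y, vx, vy) in enumerate(boids):
        others = [b for j, b in enumerate(boids) if j != i]
        ax = ay = 0.0  # accumulated velocity change for this boid
        if others:
            n = len(others)
            # Cohesion: steer toward the average position of the other boids.
            ax += w_coh * (sum(b[0] for b in others) / n - x)
            ay += w_coh * (sum(b[1] for b in others) / n - y)
            # Alignment: nudge velocity toward the others' average velocity.
            ax += w_ali * (sum(b[2] for b in others) / n - vx)
            ay += w_ali * (sum(b[3] for b in others) / n - vy)
            # Separation: move away from any boid closer than sep_radius.
            for ox, oy, _, _ in others:
                if (x - ox) ** 2 + (y - oy) ** 2 < sep_radius ** 2:
                    ax += w_sep * (x - ox)
                    ay += w_sep * (y - oy)
        vx, vy = vx + ax, vy + ay
        updated.append((x + vx, y + vy, vx, vy))
    return updated


# Two distant boids drift together; two crowded boids push apart.
far = step([(0.0, 0.0, 0.0, 0.0), (10.0, 0.0, 0.0, 0.0)])
near = step([(0.0, 0.0, 0.0, 0.0), (0.5, 0.0, 0.0, 0.0)])
```

Changing `w_sep`, `w_ali`, or `w_coh` and rerunning the loop is the "vary parameters to test new hypotheses" step in miniature.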
Cellular Process Social Science
Conclusions
  First-Order Logic combines with ILP and ABM to create a powerful representation of the world
  Research Opportunities
    – Social Networks
    – Zendo Player
    – Claws and Shock Graphs
    – Cellular Simulation
    – Social Simulation
    – [Insert your favorite dataset here]