ERA: Architectures for Inference
Dan Hammerstrom, Electrical and Computer Engineering
Maseeh College of Engineering and Computer Science
5/11/2015

Intelligent Computing
- In spite of the transistor bounty of Moore's law, there is a large class of problems that computers still do not solve well
- These problems involve the transformation of data across the boundary between the real world and the digital world
- They occur wherever a computer is sampling and acting on real-world data, which includes almost all embedded computing applications
- Our lack of general solutions to these problems, outside of specialized niches, constitutes a significant barrier to computer usage and to huge potential markets

- These are difficult problems that require computers to find complex structures and relationships through space and time in massive quantities of low-precision, ambiguous, noisy data
- AI pursued solutions in this area, but ran into scaling problems, among other things
- Artificial Neural Networks (ANNs) extended computational intelligence in a number of important ways, primarily by adding the ability to incrementally learn and adapt
- However, ANNs also had trouble scaling, and they were often difficult to apply to many problems
- Traditional rule-based knowledge systems are now evolving into probabilistic structures where inference, generally based on Bayes' rule, becomes the key computation

Bayesian Networks
- We now have Bayesian networks
  - A major contribution to this effort was the work of Judea Pearl: Pearl, J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, 1988 & 1997
  - These systems are far less brittle, and they also more faithfully model aspects of animal behavior
  - Animals learn from their surroundings and appear to perform a kind of probabilistic inference over learned knowledge as they interact with their environment
- Bayesian nets are structured, graphical representations of probabilistic relationships among several random variables
  - The graph structure is an explicit representation of conditional dependence (encoded by network edges)
- And the fundamental computation has become probabilistic inference

A Simple Bayesian Network
- Each node has a CPT, or "Conditional Probability Table"
- The CPT for node D, P(d|b,c), has one row for each combination of parent values (b1,c1), (b2,c1), (b1,c2), (b2,c2) and one column for each value d1, d2; there are similar tables for A, B, and C
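As an illustration (not from the slides), a CPT-based network can be sketched in a few lines of Python. The four-node structure below (A as parent of B and C; B and C as parents of D) and all of its probabilities are assumptions chosen only to show how CPTs compose into a joint distribution:

```python
from itertools import product

# Hypothetical network: A -> B, A -> C, and B, C -> D.
# Each CPT maps parent values to a distribution over the node's values.
P_a = {True: 0.3, False: 0.7}
P_b = {True: {True: 0.8, False: 0.2},      # P(b | a)
       False: {True: 0.1, False: 0.9}}
P_c = {True: {True: 0.4, False: 0.6},      # P(c | a)
       False: {True: 0.5, False: 0.5}}
P_d = {                                    # P(d | b, c): one row per (b, c)
    (True, True):   {True: 0.95, False: 0.05},
    (True, False):  {True: 0.60, False: 0.40},
    (False, True):  {True: 0.70, False: 0.30},
    (False, False): {True: 0.01, False: 0.99},
}

def joint(a, b, c, d):
    """The joint probability factorizes along the graph edges."""
    return P_a[a] * P_b[a][b] * P_c[a][c] * P_d[(b, c)][d]

# Sanity check: the full joint sums to 1 over all 16 assignments.
total = sum(joint(a, b, c, d)
            for a, b, c, d in product([True, False], repeat=4))
```

The point of the factorization is that each node's table only grows with the number of its parents, not with the size of the whole network.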

Inference, A Simple Example
- We need to "infer" the most likely original message given the data we received and our knowledge of the statistics of channel errors and of the messages being generated
- The inference problem: choose the most likely y, based on P[y|x']
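The "choose the most likely y" step can be sketched concretely. The example below (an assumption, not taken from the slides) decodes a 3-bit message sent over a binary symmetric channel with bit-flip probability `eps`, picking the y that maximizes P[x'|y] * P[y]:

```python
from itertools import product

eps = 0.1                                            # assumed channel error rate
prior = {y: 1 / 8 for y in product([0, 1], repeat=3)}  # uniform message prior

def likelihood(x, y):
    """P[x | y]: each bit flips independently with probability eps."""
    p = 1.0
    for xi, yi in zip(x, y):
        p *= eps if xi != yi else 1 - eps
    return p

def map_decode(x):
    """Choose the most likely original message y given the received x."""
    return max(prior, key=lambda y: likelihood(x, y) * prior[y])

decoded = map_decode((1, 0, 1))
```

With a uniform prior and eps < 0.5 the received word itself is the MAP estimate; a non-uniform prior over messages (or an error-correcting code constraining the legal y values) is what makes the inference non-trivial.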

Assumption: Bayesian Inference As A Fundamental Computation In Future Systems
- In mapping inference computations to hardware there are a number of issues to be considered, including:
  - The type and degree of parallelism (multiple, independent threads versus data parallelism)
  - Arithmetic precision
  - Inter-thread communication
  - Local storage requirements, etc.
- There are several variations on basic Bayesian techniques across a number of different fields, from communication theory and pattern recognition to computer vision, robotics, and speech recognition
- However, for this review of inference as a computational model, three general families of algorithms are considered: 1) Inference by Analysis, 2) Inference by Random Sampling, and 3) Inference Using Distributed Representations

Analytic Inference
- Analytic techniques constitute the most widely used approach for doing inference over Bayesian networks
- Most Bayesian networks are evaluated using Bayesian Belief Propagation (BBP), developed by Pearl
  - Typically, data are input to the network by setting certain variables to known or observed values ("evidence")
  - Bayesian Belief Propagation is then performed to find the probability distributions of the free variables
- Analytic techniques require significant precision and dynamic range, generally in the form of floating-point representations
- This, plus the limited parallelism, makes them good candidates for multi-core architectures, but not necessarily for more advanced nano-scale computation
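The evidence-then-query pattern can be sketched on a two-node network. For brevity this uses exact inference by direct application of Bayes' rule rather than Pearl's message-passing algorithm; the rain/wet-grass network and its numbers are illustrative assumptions:

```python
# Assumed two-node network: Rain -> WetGrass.
P_rain = {True: 0.2, False: 0.8}
P_wet_given_rain = {True:  {True: 0.9, False: 0.1},
                    False: {True: 0.2, False: 0.8}}

def posterior_rain(wet_observed):
    """P(rain | wet): normalize P(rain) * P(wet | rain) over both rain values."""
    unnorm = {r: P_rain[r] * P_wet_given_rain[r][wet_observed]
              for r in (True, False)}
    z = sum(unnorm.values())          # P(wet = wet_observed)
    return {r: p / z for r, p in unnorm.items()}

# Set the evidence node to its observed value, query the free variable.
post = posterior_rain(True)
```

On larger networks, belief propagation performs essentially this normalize-and-multiply computation locally at each node, passing the results as messages along the edges; note that the division by `z` is where the floating-point precision and dynamic-range demands mentioned above come from.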

Random Sampling
- Another approach to inference is random sampling
- There are a number of techniques, most of which fall under the general category of Monte Carlo simulation
  - Again, evidence is input by setting some nodes to known values
  - Then random samples of the free variables are generated
- Two techniques that are commonly used are Adaptive Importance Sampling and Markov Chain Monte Carlo simulation
- These techniques basically use adaptive sampling to perform an algorithmic search of the model's vector space
- For large, complex Bayesian structures, such random sampling can often be the best way to evaluate the network
- However, random sampling suffers from the fact that as the size of the Bayesian network increases, increasingly larger sample sets are required to obtain sufficiently accurate statistics
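As an illustration, here is one of the simplest Monte Carlo variants, rejection sampling, applied to an assumed two-node rain/wet-grass network (the slides discuss the more sophisticated adaptive variants; the numbers here are made up):

```python
import random
random.seed(0)

# Assumed network: Rain -> WetGrass.
P_rain = 0.2
P_wet_given = {True: 0.9, False: 0.2}

def sample_posterior(wet_evidence, n=100_000):
    """Estimate P(rain | wet) by keeping only samples that match the evidence."""
    kept = hits = 0
    for _ in range(n):
        rain = random.random() < P_rain          # sample the free variable
        wet = random.random() < P_wet_given[rain]
        if wet == wet_evidence:                  # reject inconsistent samples
            kept += 1
            hits += rain
    return hits / kept

estimate = sample_posterior(True)   # should approach 0.18 / 0.34, about 0.53
```

This toy also exhibits the scaling weakness noted above: the rarer the evidence, the more samples are rejected, so larger networks with more evidence nodes need disproportionately more samples for the same accuracy.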

- An example of hardware/device support for such a system is the work of Prof. Krishna Palem (Rice University) on probabilistic CMOS (pCMOS)
  - Prof. Palem has shown that pCMOS can provide significant performance benefits in implementing Monte Carlo random sampling
  - pCMOS logic is used to accelerate the generation of random numbers, thereby accelerating the sampling process
- Monte Carlo techniques are computationally intensive and so still tend to have scaling limitations
- However, on the plus side, massively parallel evaluations are possible and arithmetic precision requirements are less constrained
  - So such techniques map cleanly to simpler, massively parallel, low-precision computing structures
- These techniques may also benefit from morphic cores with hardware-accelerated random number generation

Distributed Data Representation Networks
- Bayesian networks based on distributed data representations (DDR) are actually a different way to structure Bayesian networks; although analytic and sampling techniques can be used on these structures, they also allow different kinds of massively parallel execution
- The use of DDR is very promising, but it is also the most limited in terms of demonstrated real applications
- Computing with DDRs can be thought of as the computational equivalent of spread-spectrum communication
  - In a distributed representation, meaning is not represented by single symbolic units but results from the interaction of a group of units, typically configured in a network structure, and often each unit can participate in several representations
  - Representing data in this manner more easily allows incremental, integrative, decentralized adaptation
  - The computational and communication loads are spread more evenly across the system
  - Distributed representation also appears to be an important computational principle in neural systems
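A toy sketch can make the "group of units, each participating in several representations" idea concrete. The style below (random high-dimensional bipolar vectors, superposition, recovery by similarity) is borrowed from hyperdimensional computing and is an assumption, not the slides' specific model:

```python
import random
random.seed(1)
DIM = 10_000   # many low-precision units per representation

def rand_vec():
    """Each symbol is a random pattern spread across all DIM units."""
    return [random.choice((-1, 1)) for _ in range(DIM)]

symbols = {name: rand_vec() for name in ("cat", "dog", "bird", "fish")}

# Superpose two symbols into one distributed pattern: every unit now
# participates in both representations at once.
memory = [a + b for a, b in zip(symbols["cat"], symbols["dog"])]

def similarity(u, v):
    """Normalized dot product; near 1 for stored symbols, near 0 otherwise."""
    return sum(ui * vi for ui, vi in zip(u, v)) / DIM

scores = {name: similarity(memory, vec) for name, vec in symbols.items()}
```

Because random high-dimensional vectors are nearly orthogonal, "cat" and "dog" score near 1.0 against the superposition while "bird" and "fish" score near 0.0; the spread-spectrum analogy is that each stored item is smeared across all units, so no single unit is critical and the load is naturally balanced.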

- Biological neural circuits perform inference over huge knowledge structures in fractions of a second
- It is not clear exactly how they manage this incredible trick, especially since Bayesian inference is so computationally intense
- There is some speculation that DDR plays an important role; however, more theory and algorithm development is needed
  - This is an active area of research, and no doubt much progress will be made by the time many Emerging Research Devices become commercially available
- One hypothesis is that hierarchical networks lead naturally to the distribution of representation, and subsequently to significant "factorization" of the inference process; this, coupled with massively parallel hardware, may enable entirely new levels of inference capabilities

- One such model was developed by Lee and Mumford, who proposed a hierarchical Bayesian inference model of the primate visual cortex

[Figure: Lee and Mumford visual cortex model]

- Another example is the work of Jeff Hawkins and Dileep George at Numenta, Inc.
- Their model starts with an approximation to a general Bayesian module, which can then be combined into a hierarchy to form what they call a Hierarchical Temporal Memory (HTM)
- Issues related to hardware architectures for Bayesian inference, and how they may be implemented with emerging devices, are now being studied

Architecture
- Mapping Bayesian networks to a multi-core implementation is straightforward: just implement each node as a task; in a simple SMP-based multi-core machine, that would most certainly provide good performance
- However, this approach breaks down as we scale to very large networks
- Bayesian networks tend to be storage intensive, so implementation issues such as data structure organization, memory management, and cache utilization also become important
  - In fact, a potentially serious performance constraint may be access to primary memory, and it is not yet clear how effective caching will be in ameliorating this delay
- However, as we scale to the very large networks required to solve complex problems, a variety of optimizations become possible and, in fact, necessary

- One promising massively parallel approach is associative processing, which has been shown to approximate Bayesian inference and which has the potential for huge levels of parallelism
- Using morphic cores for heterogeneous multi-core structures, such massively parallel implementations of Bayesian networks become relevant
- Another interesting variation is to eliminate synchrony: inter-module update messages arrive at random times, and computation within a module proceeds at its own pace, updating its internal estimates when it receives update messages and otherwise continuing without them
- More study is needed to explore radical new implementation technologies and how they may be used to do inference
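The asynchronous variation can be sketched as a module that drains whatever update messages have arrived, folds them into its estimate, and never blocks waiting for its neighbors. The message format and the exponential-smoothing update rule here are assumptions for illustration:

```python
import queue

class Module:
    """One node of the network, running at its own pace."""

    def __init__(self):
        self.inbox = queue.Queue()   # update messages from neighboring modules
        self.estimate = 0.5          # assumed initial internal estimate

    def step(self):
        # Fold in any messages that happen to have arrived; never block.
        while True:
            try:
                msg = self.inbox.get_nowait()
            except queue.Empty:
                break
            # Assumed update rule: blend the incoming value into the estimate.
            self.estimate = 0.9 * self.estimate + 0.1 * msg
        # ...continue local computation using the current estimate...
        return self.estimate

m = Module()
m.inbox.put(1.0)
first = m.step()    # incorporates the pending message
second = m.step()   # no new messages; proceeds with its current estimate
```

Because `step` never waits, slow or silent neighbors cannot stall a module; the cost is that each module computes with whatever (possibly stale) estimates it last received.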

Algorithm Family Summary

Technique                  Parallelism (Threads)  Inter-Thread Communication  Computational Precision  Storage/Node  State of the Art
Analytic                   Moderate               —                           High                     Moderate      Mature
Random Sampling            High                   Low                         Moderate                 —             Mature
Distributed / Hierarchies  High                   Low                         —                        —             Preliminary