Visiting Scholar, Harvard University Technical Advisor, AFRL

Slides:



Advertisements
Similar presentations
Approaches, Tools, and Applications Islam A. El-Shaarawy Shoubra Faculty of Eng.
Advertisements

© Negnevitsky, Pearson Education, Lecture 12 Hybrid intelligent systems: Evolutionary neural networks and fuzzy evolutionary systems Introduction.
1 Probability and the Web Ken Baclawski Northeastern University VIStology, Inc.
Artificial Intelligence 12. Two Layer ANNs
Formal Computational Skills
Pseudo-Relevance Feedback For Multimedia Retrieval By Rong Yan, Alexander G. and Rong Jin Mwangi S. Kariuki
RELIGION FROM SCIENTIFIC POINT OF VIEW Dr. Leonid Perlovsky Harvard University and AFRL Arizona State University 14 Oct., 2008.
Cognitive Systems, ICANN panel, Q1 What is machine intelligence, as beyond pattern matching, classification and prediction. What is machine intelligence,
Presentation on Artificial Intelligence
MAXIMUM LIKELIHOOD JOINT ASSOCIATION, TRACKING, AND FUSION IN STRONG CLUTTER Leonid Perlovsky Harvard University and the AF Research Lab Seminar Department.
The 20th International Conference on Software Engineering and Knowledge Engineering (SEKE2008) Department of Electrical and Computer Engineering
ARCHITECTURES FOR ARTIFICIAL INTELLIGENCE SYSTEMS
1 GPS, A Program That Simulates Human Thought 신 동 호신 동 호 Allen Newell & H.A. Simon.
Year 11 Psychology – UNIT 1 Area of Study 1 Revision!
Introduction to Neural Networks Computing
Chapter Thirteen Conclusion: Where We Go From Here.
An Introduction to Artificial Intelligence Presented by : M. Eftekhari.
Decision Making: An Introduction 1. 2 Decision Making Decision Making is a process of choosing among two or more alternative courses of action for the.
Content Based Image Clustering and Image Retrieval Using Multiple Instance Learning Using Multiple Instance Learning Xin Chen Advisor: Chengcui Zhang Department.
Report on Intrusion Detection and Data Fusion By Ganesh Godavari.
LEARNING FROM OBSERVATIONS Yılmaz KILIÇASLAN. Definition Learning takes place as the agent observes its interactions with the world and its own decision-making.
1 Learning Entity Specific Models Stefan Niculescu Carnegie Mellon University November, 2003.
Prénom Nom Document Analysis: Data Analysis and Clustering Prof. Rolf Ingold, University of Fribourg Master course, spring semester 2008.
Learning From Data Chichang Jou Tamkang University.
LEARNING FROM OBSERVATIONS Yılmaz KILIÇASLAN. Definition Learning takes place as the agent observes its interactions with the world and its own decision-making.
Maximum likelihood (ML)
Science and Engineering Practices
EE513 Audio Signals and Systems Statistical Pattern Classification Kevin D. Donohue Electrical and Computer Engineering University of Kentucky.
Modeling (Chap. 2) Modern Information Retrieval Spring 2000.
Northeastern University,
NEURODYNAMICS OF CONSCIOUSNESS AND CULTURES from a single mind to cultures and coalitions Leonid Perlovsky Visiting Scholar, Harvard University Technical.
Ch1 AI: History and Applications Dr. Bernard Chen Ph.D. University of Central Arkansas Spring 2011.
Using Bayesian Networks to Analyze Expression Data N. Friedman, M. Linial, I. Nachman, D. Hebrew University.
Robotica Lecture 3. 2 Robot Control Robot control is the mean by which the sensing and action of a robot are coordinated The infinitely many possible.
SLB /04/07 Thinking and Communicating “The Spiritual Life is Thinking!” (R.B. Thieme, Jr.)
IE 585 Introduction to Neural Networks. 2 Modeling Continuum Unarticulated Wisdom Articulated Qualitative Models Theoretic (First Principles) Models Empirical.
Outline What Neural Networks are and why they are desirable Historical background Applications Strengths neural networks and advantages Status N.N and.
Report on Intrusion Detection and Data Fusion By Ganesh Godavari.
Cognitive Psychology: Thinking, Intelligence, and Language
ECE 8443 – Pattern Recognition ECE 8423 – Adaptive Signal Processing Objectives: Deterministic vs. Random Maximum A Posteriori Maximum Likelihood Minimum.
Robotica Lecture 3. 2 Robot Control Robot control is the mean by which the sensing and action of a robot are coordinated The infinitely many possible.
1 Lesson 8: Basic Monte Carlo integration We begin the 2 nd phase of our course: Study of general mathematics of MC We begin the 2 nd phase of our course:
Monte Carlo Methods Versatile methods for analyzing the behavior of some activity, plan or process that involves uncertainty.
Artificial Intelligence By Michelle Witcofsky And Evan Flanagan.
How Solvable Is Intelligence? A brief introduction to AI Dr. Richard Fox Department of Computer Science Northern Kentucky University.
WHAT IS THE NATURE OF SCIENCE?. SCIENTIFIC WORLD VIEW 1.The Universe Is Understandable. 2.The Universe Is a Vast Single System In Which the Basic Rules.
1 CHAPTER 2 Decision Making, Systems, Modeling, and Support.
1 CS 385 Fall 2006 Chapter 1 AI: Early History and Applications.
Chapter 11 Statistical Techniques. Data Warehouse and Data Mining Chapter 11 2 Chapter Objectives  Understand when linear regression is an appropriate.
Ch. 1: Introduction: Physics and Measurement. Estimating.
Data Mining and Decision Support
Chapter 2-OPTIMIZATION G.Anuradha. Contents Derivative-based Optimization –Descent Methods –The Method of Steepest Descent –Classical Newton’s Method.
RULES Patty Nordstrom Hien Nguyen. "Cognitive Skills are Realized by Production Rules"
Artificial Intelligence: Research and Collaborative Possibilities a presentation by: Dr. Ernest L. McDuffie, Assistant Professor Department of Computer.
Chapter 1: Introduction. Physics The most basic of all sciences! Physics: The “Parent” of all sciences! Physics: The study of the behavior and the structure.
Computational Intelligence: Methods and Applications Lecture 26 Density estimation, Expectation Maximization. Włodzisław Duch Dept. of Informatics, UMK.
Computacion Inteligente Least-Square Methods for System Identification.
Artificial Intelligence Hossaini Winter Outline book : Artificial intelligence a modern Approach by Stuart Russell, Peter Norvig. A Practical Guide.
Chapter 8 Thinking and Language.
 Negnevitsky, Pearson Education, Lecture 12 Hybrid intelligent systems: Evolutionary neural networks and fuzzy evolutionary systems n Introduction.
PHYSICS OF THE MIND extension of neural networks and psychology to “hard” science Dr. Leonid Perlovsky Northeastern University, Tutorial,
Introduction to Machine Learning, its potential usage in network area,
WSEAS 26 March 2008 Leonid Perlovsky
Knowledge Representation
Introduction Artificial Intelligent.
Where did we stop? The Bayes decision rule guarantees an optimal classification… … But it requires the knowledge of P(ci|x) (or p(x|ci) and P(ci)) We.
CASE − Cognitive Agents for Social Environments
EE513 Audio Signals and Systems
CHAPTER I. of EVOLUTIONARY ROBOTICS Stefano Nolfi and Dario Floreano
Introduction to Artificial Intelligence Instructor: Dr. Eduardo Urbina
Presentation transcript:

Visiting Scholar, Harvard University Technical Advisor, AFRL COGNITIVE COMPUTATIONAL INTELLIGENCE for data mining, financial prediction, tracking, fusion, language, cognition, and cultural evolution IASTED CI 2009 Honolulu, HI 1:30 – 5:30 pm, Aug. 19 Leonid Perlovsky Visiting Scholar, Harvard University Technical Advisor, AFRL

OUTLINE 1. Cognition and Logic 2. The Knowledge Instinct -Dynamic Logic 3. Language 4. Integration of cognition and language 5. High Cognitive Functions 6. Evolution of cultures 7. Future directions

INTRODUCTION The mind

PHYSICS AND MATHEMATICS OF THE MIND RANGE OF CONCEPTS Logic is sufficient to explain mind [Newell, “Artificial Intelligence”, 1980s] No new specific mathematical concepts are needed Mind is a collection of ad-hoc principles, [Minsky, 1990s] Specific mathematical constructs describe the multiplicity of mind phenomena “first physical principles of mind” [Grossberg, Zadeh, Perlovsky,…] Quantum computation [Hameroff, Penrose, Perlovsky,…] New unknown yet physical phenomena [Josephson, Penrose]

GENETIC ARGUMENTS FOR THE “FIRST PRINCIPLES” Only 30,000 genes in human genome Only about 2% difference between human and apes Say, 1% difference between human and ape minds Only about 300 proteins Therefore, the mind has to utilize few inborn principles If we count “a protein per concept” If we count combinations: 300300 ~ unlimited => all concepts and languages could have been genetically h/w-ed (!?!) Languages and concepts are not genetically hardwired Because they have to be flexible and adaptive

COGNITION Understanding the world Perception Simple objects Complex situations Integration of real-time signals and existing (a priori) knowledge From signals to concepts From less knowledge to more knowledge

Example: “this is a chair, it is for sitting” Identify objects signals -> concepts What in the mind help us do this? Representations, models, ontologies? What is the nature of representations in the mind? Wooden chairs in the world, but no wood in the brain

VISUAL PERCEPTION Neural mechanisms are well studied Difficulty Projection from retina to visual cortex (geometrically accurate) Projection of memories-models from memory to visual cortex Matching: sensory signals and models In visual nerve more feedback connections than feedforward matching involves complicated adaptation of models and signals Difficulty Associate signals with models A lot of models (expected objects and scences) Many more combinations: models<->pixels Association + adaptation To adapt, signals and models should be matched To match, they should be adapted

ALGORITHMIC DIFFICULTIES A FUNDAMENTAL PROBLEM? Cognition and language involve evaluating large numbers of combinations Pixels -> objects -> scenes Combinatorial Complexity (CC) A general problem (since the 1950s) Detection, recognition, tracking, fusion, situational awareness, language… Pattern recognition, neural networks, rule systems… Combinations of 100 elements are 100100 This number ~ the size of the Universe > all the events in the Universe during its entire life

COMBINATORIAL COMPLEXITY SINCE the 1950s CC was encountered for over 50 years Statistical pattern recognition and neural networks: CC of learning requirements Rule systems and AI, in the presence of variability : CC of rules Minsky 1960s: Artificial Intelligence Chomsky 1957: language mechanisms are rule systems Model-based systems, with adaptive models: CC of computations Chomsky 1981: language mechanisms are model-based (rules and parameters) Current ontologies, “semantic web” are rule-systems Evolvable ontologies : present challenge

CC AND TYPES OF LOGIC CC is related to formal logic Law of excluded middle (or excluded third) every logical statement is either true or false Gödel proved that logic is “illogical,” “inconsistent” (1930s) CC is Gödel's “incompleteness” in a finite system Multivalued logic eliminated the “law of excluded third” Still, the math. of formal logic Excluded 3rd -> excluded (n+1) Fuzzy logic eliminated the “law of excluded third” How to select “the right” degree of fuzziness The mind fits fuzziness for every statement at every step => CC Logic pervades all algorithms and neural networks rule systems, fuzzy systems (degree of fuzziness), pattern recognition, neural networks (training uses logical statements)

LOGIC VS. GRADIENT ASCENT Gradient ascent maximizes without CC Requires continuous parameters How to take gradients along “association”? Data Xn (or) to object m It is a logical statement, discrete, non-differentiable Models / ontologies require logic => CC Multivalued logic does not lead to gradient ascent Fuzzy logic uses continuous association variables, b A new principle is needed to specify gradient ascent along fuzzy associations: dynamic logic

DYNAMIC LOGIC Dynamic Logic unifies formal and fuzzy logic initial “vague or fuzzy concepts” dynamically evolve into “formal-logic or crisp concepts” Dynamic logic based on a similarity between models and signals Overcomes CC fast algorithms Proven in neuroimaging experiments (Bar, 2006) Initial representations-memories are vague “close-eyes” experiment

ARISTOTLE VS. GÖDEL logic, forms, and language Logic: a supreme way of argument Forms: representations in the mind Form-as-potentiality evolves into form-as-actuality Logic is valid for actualities, not for potentialities (Dynamic Logic) Thought language and thinking are closely linked Language contains the necessary uncertainty From Boole to Russell: formalization of logic Logicians eliminated from logic uncertainty of language Hilbert: formalize rules of mathematical proofs forever Gödel (the 1930s) Logic is not consistent Any statement can be proved true and false Aristotle and Alexander the Great

OUTLINE Cognition, complexity, and logic Language Logic does not work, but the mind does The Mind and Knowledge Instinct Neural Modeling Fields and Dynamic Logic Language Integration of cognition and language Higher Cognitive Functions Future directions

STRUCTURE OF THE MIND Concepts Instincts Emotions Behavior Hierarchy Models of objects, their relations, and situations Evolved to satisfy instincts Instincts Internal sensors (e.g. sugar level in blood) Emotions Neural signals connecting instincts and concepts e.g. a hungry person sees food all around Behavior Models of goals (desires) and muscle-movement… Hierarchy Concept-models and behavior-models are organized in a “loose” hierarchy

THE KNOWLEDGE INSTINCT Model-concepts always have to be adapted lighting, surrounding, new objects and situations even when there is no concrete “bodily” needs Instinct for knowledge and understanding Increase similarity between models and the world Emotions related to the knowledge instinct Satisfaction or dissatisfaction change in similarity between models and world Related not to bodily instincts harmony or disharmony (knowledge-world): aesthetic emotion

REASONS FOR PAST LIMITATIONS Human intelligence combines conceptual understanding with emotional evaluation A long-standing cultural belief that emotions are opposite to thinking and intellect “Stay cool to be smart” Socrates, Plato, Aristotle Reiterated by founders of Artificial Intelligence [Newell, Minsky]

Neural Modeling Fields (NMF) A mathematical construct modeling the mind Neural synaptic fields A loose hierarchy bottom-up signals, top-down signals At every level: concepts, emotions, models, behavior Concepts become input signals to the next level

Top-down signals (concept-models) NEURAL MODELING FIELDS basic two-layer mechanism: from signals to concepts Bottom-up signals Pixels or samples (from sensor or retina) x(n), n = 1,…,N Top-down signals (concept-models) Mm(Sm,n), parameters Sm, m = 1, …; Models predict expected signals from objects Goal: learn object-models and match to signals (knowledge instinct)

THE KNOWLEDGE INSTINCT The knowledge instinct = maximize similarity between signals and models Similarity between signals and models, L L = l ({x}) = l (x(n)) l (x(n)) = r(m) l (x(n) | Mm(Sm,n)) l (x(n) | Mm(Sm,n)) is a conditional similarity for x(n) given m {n} are not independent, M(n) may depend on n’ CC: L contains MN items: all associations of pixels and models (LOGIC)

SIMILARITY l (x(n) | Mm(Sm,n)) = pdf(x(n) | Mm(Sm,n)), Similarity as likelihood l (x(n) | Mm(Sm,n)) = pdf(x(n) | Mm(Sm,n)), a conditional pdf for x(n) given m e.g., Gaussian pdf(X(n)|m) = G(X(n)|Mm,Cm) = 2p-d/2 detCm-1/2 exp(-DmnTCm-1 Dmn/2); Dmn = X(n) – Mm(n) Note, this is NOT the usual “Gaussian assumption” deviations from models D are random, not the data X multiple models {m} can model any pdf, not one Gaussian model Use for sets of data points Similarity as information l (x(n) | Mm(Sm,n)) = abs(x(n))*pdf(x(n) | Mm(Sm,n)), a mutual information in model m on data x(n) L is a mutual information in all model about all data Use for continuous data (signals, images)

DYNAMIC LOGIC (DL) non-combinatorial solution Start with a set of signals and unknown object-models any parameter values Sm associate signals (n) and models (m) (1) f(m|n) = r(m) l (n|m) / r(m') l (n|m') Improve parameter estimation (2) Sm = Sm + a f(m|n) [ln l (n|m)/Mm]*[Mm/Sm] Continue iterations (1)-(2). Theorem: NMF is a converging system - similarity increases on each iteration - aesthetic emotion is positive during learning

OUTLINE Cognition, complexity, and logic Language Logic does not work, but the mind does The Mind and Knowledge Instinct Neural Modeling Fields and Dynamic Logic Application examples Language Integration of cognition and language Higher Cognitive Functions Future directions

APPLICATIONS Many applications have been developed Government Medical Commercial (about 25 companies use this technology) Sensor signals processing and object recognition Variety of sensors Financial market predictions Market crash on 9/11 predicted a week ahead Internet search engines Based on text understanding Evolving ontologies for Semantic Web Every application needs models Future self-evolving models: integrated cognition and language

APPLICATION 1 – CLUSTERING (data mining) Find “natural” groups or clusters in data Use Gaussian pdf and simple models l (n|m) = 2p-d/2 detCm-1/2 exp(-DmnTCm-1 Dmn/2); Dmn = X(n) – Mm(n) Mm(n) = Mm; each model has just 1 parameter, Sm = Mm This is clustering with Gaussian Mixture Model For complex l(n|m) derivatives can be taken numerically For simple l(n|m) derivatives can be taken manually Simplification, not essential Simplify parameter estimation equation for Gaussian pdf and simple models ln l (n|m)/Mm =  (-DmnTCm-1 Dmn) /Mm = Cm-1 Dmn + DmnT Cm-1 = 2 Cm-1 Dmn, (C is symmetric) Mm = Mm + a f(m|n) Cm-1 Dmn … In this case, even simpler equations can be derived samples in class m: Nm = f(m|n); N = Nm rates (priors): rm = Nm / N means: Mm = f(m|n) X(n) / Nm covariances: Cm = f(m|n) Dmn * DmnT / Nm - simple interpretation: Nm, Mm, Cm are weighted averages. The only difference from standard mean and covariance estimation is weights f(m|n), probabilities of class m These are iterative equations, f(m|n) depends on parameters; theorem: iterations converge

Example 2: GMTI Tracking and Detection Below Clutter DL starts with uncertain knowledge and converges rapidly on exact solution 1 km Cross-Range Range (a) True Tracks b Initial state of model 2 iterations 5 iterations 9 iterations 12 iterations Converged state Detection and tracking targets below clutter using GMTI radar: (a) true track positions in 0.5km x 0.5km data set; (b) actual data available for detection and tracking (signal is below clutter, signal-to-clutter ratio is about –2dB for amplitude and –3dB for Doppler; 6 scans are shown on top of each other). Dynamic logic operation: (c) an initial fuzzy model, the fuzziness corresponds to the uncertainty of knowledge; (d) to (h) show increasingly improved models at various iterations (total of 20 iterations). Between (c) and (d) the algorithm fits the data with one model, uncertainty is somewhat reduced. There are two types of models: one uniform model describing clutter (it is not shown), and linear track models with large uncertainty; the number of track models, locations, and velocities are estimated from the data. Between (d) and (e) the algorithm tried to fit the data with more than one track-model and decided, that it needs two models to ‘understand’ the content of the data. Fitting with 2 tracks continues till (f); between (f) and (g) a third track is added. Iterations stopped at (h), when similarity stopped increasing. Detected tracks closely correspond to the truth (a). Complexity of this solution is low, about 106 operations. Solving this problem by template matching (with evaluating combinations of various associations – this is a standard state-of-the-art) would take about 10200 operations, unsolvable. 18 dB improvement

TRACKING AND DETECTION BELOW CLUTTER (movie, same as above) DL starts with uncertain knowledge, and similar to human mind does not sort through all possibilities, but converges rapidly on exact solution 3 targets, 6 scans, signal-to-clutter, S/C ~ -3.0dB

complexity and improvement TRACKING EXAMPLE complexity and improvement Technical difficulty Signal/Clutter = - 3 dB, standard tracking requirements 15 dB Computations, standard hypothesis testing ~ 101700, unsolvable Solved by Dynamic Logic Computations: 2x107 Improvement 18 dB

CRAMER-RAO BOUND (CRB) Can a particular set of models be estimated from a particular (limited) set of data? The question is not trivial A simple rule-of-thumb: N(data points) > 10*S(parameters) In addition: use your mind: is there enough information in the data? CRB: minimal estimation error (best possible estimation) for any algorithm or neural neworks, or… When there are many data points, CRB is a good measure (=ML=NMF) When there are few data points (e.g. financial prediction) it might be difficult to access performance Actual errors >> CRB Simple well-known CRB for averaging several measurements st.dev(n) = st.dev(1)/√n Complex CRB for tracking: Perlovsky, L.I. (1997a). Cramer-Rao Bound for Tracking in Clutter and Tracking Multiple Objects. Pattern Recognition Letters, 18(3), pp.283-288.

FINDING PATTERNS IN IMAGES APPLICATION 3 FINDING PATTERNS IN IMAGES

IMAGE PATTERN BELOW NOISE Object Image Object Image + Clutter y y x x

PRIOR STATE-OF-THE-ART Computational complexity Multiple Hypothesis Testing (MHT) approach: try all possible ways of fitting model to the data For a 100 x 100 pixel image: Number of Objects Number of Computations 1 1010 2 1020 3 1030

NMF MODELS Information similarity measure Clutter concept-model (m=1) lnl (x(n) | Mm(Sm,n)) = abs(x(n))*ln pdf(x(n) | Mm(Sm, n)) n = (nx,ny) Clutter concept-model (m=1) pdf(X(n)|1) = r1 Object concept-model (m=2… ) pdf(x(n) | Mm(Sm, n)) = r2 G(X(n)|Mm (n,k),Cm) Mm (n,k) = n0 + a*(k2,k); (note: k, K require no estimation)

ONE PATTERN BELOW CLUTTER Y X SNR = -2.0 dB

DYNAMIC LOGIC WORKING DL starts with uncertain knowledge, and similar to human mind converges rapidly on exact solution Object invisible to human eye By integrating data with the knowledge-model DL finds an object below noise y (m) Range x (m) Cross-range

MULTIPLE PATTERNS BELOW CLUTTER Three objects in noise object 1 object 2 object 3 SCR - 0.70 dB -1.98 dB -0.73 dB 3 Object Image 3 Object Image + Clutter y y x x

IMAGE PATTERNS BELOW CLUTTER (dynamic logic iterations see note-text) f e h g Detection of slow-moving targets using SAR radar. Moving targets appear in SAR images as ‘smile’ and ‘frown’ patterns: (a) true ‘smile’ and ‘frown’ patterns shown without clutter; (b) actual image available for recognition (signal is below clutter, signal-to-clutter ratio is between –2dB and –0.7dB); (c) an initial fuzzy model, the fuzziness corresponds to the uncertainty of knowledge; (d) to (h) show increasingly improved models at various iterations (total of 22 iterations). Between (d) and (e) the algorithm tried to fit the data with more than one model and decided, that it needs three models to ‘understand’ the content of the data. There are several types of models: one uniform model describing the clutter (it is not shown), and a variable number of blob models and parabolic models, which number, locations, and curvatures are estimated from the data. Until about (g) the algorithm ‘thought’ in terms of simple blob models, at (g) and beyond, the algorithm decided that it needs more complex parabolic models to describe the data. Iterations stopped at (h), when similarity stopped increasing. Complexity of this solution is moderate, about 1010 operations. Solving this problem by template matching would take a prohibitive 1030 to 1040 operations. (This example is discussed in more details in [[i]].) [i] Linnehan, R., Mutz, Perlovsky, L.I., C., Weijers, B., Schindler, J., Brockett, R. (2003). Detection of Patterns Below Clutter in Images. Int. Conf. On Integration of Knowledge Intensive Multi-Agent Systems, Cambridge, MA Oct.1-3, 2003. Logical complexity = MN = 105000, unsolvable; DL complexity = 107 S/C improvement ~ 16 dB

MULTIPLE TARGET DETECTION DL WORKING EXAMPLE DL starts with uncertain knowledge, and similar to human mind does not sort through all possibilities like an MHT, but converges rapidly on exact solution y x

COMPUTATIONAL REQUIREMENTS Number of Computations COMPARED Dynamic Logic (DL) vs. Classical State-of-the-art Multiple Hypothesis Testing (MHT) Based on 100 x 100 pixel image Number of Objects Number of Computations DL vs. MHT 1 2 3 108 vs. 1010 2x108 vs. 1020 3x108 vs. 1030 Previously un-computable (1030), can now be computed (3x108 ) This pertains to many complex information-finding problems

Concurrent fusion, navigation, and detection APPLICATION 4 SENSOR FUSION Concurrent fusion, navigation, and detection below clutter

SENSOR FUSION The difficult part of sensor fusion is association of data among sensors Which sample in one sensor corresponds to which sample in another sensor? If objects can be detected in each sensor individually Still the problem of data association remains Sometimes it is solved through coordinate estimation If 3-d coordinates can be estimated reliably in each sensor Sometimes it is solved through tracking If objects could be reliably tracked in each sensor, => 3-d coordinates If objects cannot be detected in each sensor individually We have to find the best possible association among multiple samples This is most difficult: concurrent detection, tacking, and fusion

NMF/DL SENSOR FUSION NMF/DL for sensor fusion requires no new conceptual development Multiple sensor data require multiple sensor models Data: n -> (s,n); X(n) -> X(s,n) Models Mm(n) -> Mm(s,n) PDF(n|m) is a product over sensors This is a standard probabilistic procedure, another sensor is like another dimension pdf(m|n) -> pdf(m|s,n) Note: this solves the difficult problem of concurrent detection, tracking, and fusion

Source: UAS Roadmap 2005-2030 UNCLASSIFIED

CONCURRENT NAVIGATION, FUSION, AND DETECTION multiple target detection and localization based on data from multiple micro-UAVs A complex case detection requires fusion (cannot be done with one sensor) fusion requires exact target position estimation in 3-D target position can be estimated by triangulation from multiple views this requires exact UAV position GPS is not sufficient UAV position - by triangulation relative to known targets therefore target detection and localization is performed concurrently with UAV navigation and localization, and fusion of information from multiple UAVs Unsolvable using standard methods. Dynamic logic can solve because computational complexity scales linearly with number of sensors and targets

GEOMETRY: MULTIPLE TARGETS, MULTIPLE UAVS UAV m Xm = X0m + Vmt UAV 1 Xm=(Xm,Ym,Zm) X1=(X1,Y1,Z1) X1 = X01 + V1t The crux of the problem is “data association”, that is how to program a computer to decide which signatures from the total collection of images correspond to the same physical object, or “target”. This is the crux of the problem, and we need to accomplish it in a computationally feasible manner, without performing a combinatorial search. To illustrate the problem, I’m showing here two photos of the same group of trees, taken from different camera positions. If you consider a particular tree in the first image, how do you identify its corresponding image in the second photo? Since there are roughly 50 trees, there are around 50 factorial (i.e. 3x1064) different mappings between the trees in the 2 photos. Therefore “brute force” target association is out of the question. Furthermore, the problem becomes exponentially worse if you increase the number of images. Unfortunately, the standard methods for multi-target tracking utilize multiple hypothesis testing, which is subject to a combinatorial search.

CONDITIONAL SIMILARITIES (pdf) FOR TARGET k Data from UAV m, sample number n, where βnm = signature position and fnm = classification feature vector: Similarity for the data, given target k: signature position where Now we show some equations describing what I just talked about. Each data sample consists of two items, the signature position in the focal plane (\beta_nm), and the corresponding vector of classification features computed during preprocessing (f_nm). The pdf, conditional on target k, consists of the product of the conditional pdfs for beta and f, which we model as Gaussian functions. Note that the function describing the classification feature distribution is just a standard Gaussian function in which the parameters are the mean vector and covariance matrix. In contrast, the function describing the beta distribution incorporates or sensor model shown several slides back. Here the parameters include the target positions as well as the UAV positions and velocities. Hence by estimating the parameters we perform the target and UAV localization and, at the same time, we estimate mean and covariance values that are used for detection and classification. Note that in addition to the target pdfs, we also have a single clutter pdf in the mixture which is uniform in \beta and Gaussian in f. classification features Note: Also have a pdf for a single clutter component pdf(wnm| k=0) which is uniform in βnm, Gaussian in fnm.

Data Model and Likelihood Similarity Total pdf of data samples is the summation of conditional pdfs (summation over targets plus clutter) (mixture model) classification feature parameters UAV parameters target parameters Compute parameters that maximize the log-likelihood The total pdf of the data is the summation over conditional pdfs, that is, the sum over target and clutter sub-models. To evaluate how well the data fits the model we use the standard log-likelihood metric shown here. The best parameters are found by maximizing the log-likelihood function, and we do this in the usual manner by setting to zero the partial derivatives with respect to the parameters.

Concurrent Parameter Estimation / Signature Association (NMF iterations) FIND SOLUTION FOR SET OF “BEST” PARAMETERS BY ITERATING BETWEEN… Parameter Estimation and Association Probability Estimation (Bayes rule) Thus we end up with the system of equation shown on the left side. The mean and covariance parameters can be computed directly and, in fact are similar to the standard equations for sample mean and covariance, except that here the samples are weighted by their probabilities of being associated with a particular target. The other parameters are found by inverting a coupled set of linear equations. The problem is that the expressions for the parameters on the left depend on the association probabilities described by the equation on the right. Conversely, the expression for association probabilities depends upon the unknown parameters. This is the classic, and well-studied, problem of estimating the parameters of a mixture model from unlabeled samples, and the standard solution is to iterate between the parameter equations on the left and the association probability equation on the right. This method is guaranteed to converge, in the sense that the log-likelihood will never decrease from one iteration to the next. It’s actually a special case of expectation maximization, which has been well-studied. (probability that sample wnm was generated by target k) Note1: bracket notation Note2: proven to converge (e.g. EM algorithm) Note 3: Minimum MSE solution incorporates GPS measurements

Sensor 1 (of 3): Models Evolve to Locate Target Tracks in Image Data

Sensor 2 (of 3): Models Evolve to Locate Target Tracks in Image Data

Sensor 3 (of 3): Models Evolve to Locate Target Tracks in Image Data

NAVIGATION, FUSION, TRACKING, AND DETECTION (this is the basis for the previous 3 figures, all fused in x,y,z, coordinates; double-click on the blob to play movie) Now we have a movie showing the evolution of the system over 20 iterations, this time in the space of the target coordinates rather than the space of data as shown in the previous slide. Note that, as before, we start in a fuzzy and uncertain state, then gradually the system converges to precise estimates for the target locations.

Model Parameters Iteratively Adapt to Locate the Targets Estimated Target Position vs. iteration# (4 targets) Error vs. iteration# (4 targets)

Error falls off as ~ 1/√M, where M = # UAVs in the swarm Parameter Estimation Errors Decrease with Increasing Number of UAVs in the Swarm Error in Parameter Estimates vs. clutter level and # of UAVs in the swarm Target position UAV position These next plots demonstrate that there is a clear advantage for having a swarm of UAVs vs. a single UAV acting independently. On the left is a plot of the error standard deviation (based on Monte-Carlo simulations) for the target elevation estimate z_k, vs. # of UAVs in the swarm and # of targets. Similarly, the right-hand figure plots the error std for UAV position estimates vs. the # of UAVs. Also in the right-hand plot we show the error that would be obtained if the UAV tracks were estimated simply by performing linear regression on the set of GPS data, separately for each UAV. For a single UAV, the curves based on simple linear regression roughly intersect the curves from our fuzzy logic method, however as the # of UAVs increases, our curves fall off by roughly a factor of 1/\sqrt{M}, where M is the # of UAVs. In a sense, the information in the camera images provides a link between the GPS data from all UAVs. Thus, each additional UAV contributes additional GPS data that can be averaged along with the total bin of data to improve the position estimates. We speculate that detection and discrimination performance will improve in a similar fashion as additional UAVs are added to the swarm, and we will investigate this issue in the future. (Note: Results are based upon Monte Carlo simulations with synthetic data) Error falls off as ~ 1/√M, where M = # UAVs in the swarm

DETECTION IN SEQUENCES OF IMAGES APPLICATION 5 DETECTION IN SEQUENCES OF IMAGES

DETECTION IN A SEQUENCE OF IMAGES Signature + low noise level (SNR= 25dB) Signature + high noise level (SNR= -6dB) signature is present, but is obscured by noise

DETECTION IN IMAGE SEQUENCE TEN ROTATION FRAMES WERE USED Iteration 10 Iteration 100 Iteration 400 Upon convergence of the model, important parameters are estimated, including center of rotation, which will next be used for spectrum estimation. Four model components were used, including a uniform background component. Only one component became associated with point source. Compare with Measured Image (w/o noise) Iteration 600

Extracted from low noise image Extracted from high noise image TARGET SIGNATURE Extracted from low noise image Extracted from high noise image

APPLICATION 6 Radar Imaging through walls - Inverse scattering problem - Standard radar imaging algorithms (SAR) do not work because of multi-paths, refractions, clutter

SCENARIO

RADAR IMAGING THROUGH WALLS Standard SAR imaging does not work Because of refraction, multi-paths and clutter Estimated model, work in progress Remains: -increase convergence area -increase complexity of scenario -adaptive control of sensors

INTEGRATED INFORMATION: DYNAMIC LOGIC / NMF INTEGRATED INFORMATION: objects; relations; situations; behavior Data and Signals Dynamic Logic combining conceptual analysis with emotional evaluation MODELS - objects - relations - situations - behavioral A novel algorithm combines data and knowledge across multiple domains. The data come from multiple sensors and intelligent sources. The knowledge has diverse forms, and can be represented as phenomenological models: target signature models, track models, clutter models, but also models of human behavior, action models, both enemy models and models of our response. In the past attempts to combine all this data and extract information encountered unsolvable computational complexity (combinatorial complexity). New algorithm relying on dynamic logic has a potential for solving this problem using existing comuters (by avoiding combinatorial complexity). The integrated information results: targets are separated from clutter, their tracks and locations are identified. In a similar way the new algorithm has a potential for identifying enemy plans and warfighter actions, while selecting the best warfighter response.

CLASSICAL METHODOLGY no closure Result: Conceptual objects MODELS/templates objects, sensors physical models Recognition signals Sensors / Effectors Input: World/scene

NMF: closure basic two-layer hierarchy: signals and concepts Result: Conceptual objects Correspondence / Similarity measures Attention / Action signals Sim.signals MODELS objects, sensors physical models signals Sensors / Effectors Input: World/scene

APPLICATION 7 Prediction - Financial prediction

PREDICTION Simple: linear regression y(x) = Ax+b Multi-dimensional regression: y,x,b are vectors, A is a m-x Problem: given {y,x}, estimate A,b Solution to linear regression (well known) Estimate means <y>, <x>, and x-y covariance matrix C A = Cyx Cxx-1; b = <y> - A<x> Difficulties Non-linear y(x), unknown shape y(x) changes regime (from up to down) and this is the most important event (financial prediction) No sufficient data to estimate C required ~10*dx+y3 data points, or more

NMF/DL PREDICTION General non-linear regression (GNLR) y(x) = f(m|n) ym(x) = f(m|n) (Amx+bm) Amand bm are estimated similar to A,b in linear regression with the following change: all (…) are changed into f(m|n)(…) For prediction, we remember that f(m|n) = f(m|x) Interpretation m are “regimes” or “processes”, f(m|x) determines influence of regime m at point x (probability of process m being active) Applications Non-linear y(x), unknown shape Detection of y(x) regime change (e.g. financial prediction or control) Minimal number of parameters: 2 linear regressions; f(m|n) are functions of the same parameters Efficient estimation (ML) Potential for the fastest possible detection of a regime change

FINANCIAL PREDICTION Efficient Market Hypothesis Efficient market hypothesis, strong: no method for data processing or market analysis will bring advantage over average market performance (only illegal trading on nonpublic material information will get one ahead of the market) Reasoning: too many market participants will try the same tricks Efficient market hypothesis, week: to get ahead of average market performance one has to do something better than the rest of the world: better math. methods, or better analysis, or something else (it is possible to get ahead of the market legally)

FINANCIAL PREDICTION BASICS OF MATH. PREDICTION Basic idea: train from t1 to t2, predict and trade on t2+1; increment: t1->t1+1, t2->t2+1; … Number of data points between t1 to t2 should be >> number of parameters Decide on frequency of trading, it should correspond to your psychological makeup and practical situation E.g. day-trading has more potential for making (or losing) a lot of money fast, but requires full time commitment Get past data, split into 3 sets: (1)developing, (2)testing, (3)final test (best, in real time, paper trades) After much effort on (1), try on (2), if work, try on (3)

FINANCIAL MARKET PREDICITION

BIOINFORMATICS Many potential applications Drug design combinatorial complexity of existing algorithms Drug design Diagnostics: which gene / protein is responsible Pattern recognition Identify a pattern of genes responsible for the condition Relate sequence to function Protein folding (shape) Relate shape to conditions Many basic problems are solved sub-optimally (combinatorial complexity) Alignment Dynamic system of interacting genes / proteins Characterize Relate to conditions

NMF/DL FOR COGNITION SUMMARY Integrating knowledge and data / signals Knowledge = concepts = models Knowledge instinct = similarity(models, data) Aesthetic emotion = change in similarity Emotional intelligence combination of conceptual knowledge and emotional evaluation Applications Recognition, tracking, fusion, prediction…

OUTLINE Cognition, complexity, and logic The Mind and Knowledge Instinct Higher models and Language Integration of cognition and language Higher Cognitive Functions Future directions

LANGUAGE ACQUISITION AND COMPLEXITY Chomsky: linguistics should study the mind mechanisms of language (1957) Chomsky’s language mechanisms 1957: rule-based 1981: model-based (rules and parameters) Combinatorial complexity For the same reason as all rule-based and model-based methods

APPLICATION: SEARCH ENGINE BASED ON UNDERSTANDING Goal-instinct: Find conceptual similarity between a query and text Analyze query and text in terms of concepts “Simple” non-adaptive techniques By keywords By key-sentences = set of words Define a sequence of words (“bag of words”) Compute coincidences between the bag and the document Instead of the document use chunks of 7 or 10 words How to learn useful sentences?

NMF OF SET-MODELS Next level in the hierarchy above patterns Situations are sets of objects (and relations) Language - sets of words (and relations-grammar) Say, we know how to find objects and words oi, wi model-set: Mm({oi}) = (Leonid, chair, sit) – how to take derivatives? Models of sets l (x(n) | Mm(Sm,n)) = pmix(ni) (1 -pmi(1-x(ni)) ) Data { x(n,i) }, 0 or 1 for absent or present objects Parameters { p(m,i) } between 0 and 1, to be estimated Probabilities of object i present in situation m Vague models, p = 0.5; exact p = 0 or 1 Learning difficulty: most of objects are irrelevant

EXAMPLE Total number of objects = 1000 Total number of situations = 10 Number of objects in a situation = 50 Number of relevant objects in a situation = 10 Number of examples of each situation = 800 Number of examples of random sets = 8,000 (50%)

DATA data samples (horizontal axis) are sorted by situations hence the horizontal lines for repeated objects

data samples (horizontal axis) are random as in real life

DL LEARNING (in 3 iterations)

ERRORS

f(m|n) f(m’|n), m=true, m’=computed ASSOCIATIONS f(m|n)* f(m’|n), f(m|n) f(m’|n), m=true, m’=computed

OUTLINE Cognition, complexity, and logic The Mind and Knowledge Instinct Language Integration of cognition and language Higher Cognitive Functions Future directions

WHAT WAS FIRST COGNITION OR LANGUAGE? How language and thoughts come together? Language seems completely conscious A child at 5 knows about “good” and “bad” guys Philosophers and theologists discussed good and evil for millennia What are neural mechanisms? How do we learn correct associations between words and objects? Among zillions of incorrect ones Logic: Same mechanisms for L. & C. Does not work DL: sub-conceptual, sub-conscious integration

LANGUAGE vs. COGNITION “Nativists”, - since the 1950s - Language is a separate mind mechanism (Chomsky) - Pinker: language instinct “Cognitivists”, - since the 1970s Language depends on cognition Talmy, Elman, Tomasello… “Evolutionists”, - since the 1980s - Hurford, Kirby, Cangelosi… - Language transmission between generations Co-evolution of language and cognition

INTEGRATED LANGUAGE AND COGNITION Language and cognition: the dual model Every model m has linguistic and cognitive-sensory parts Mm = { Mmcognitive,Mmlanguage }; Language and cognition are fused at vague pre-conceptual level before concepts are learned Joint evolution of language and cognition Newborn mind: initial models are vague placeholders Language is acquired ready-made from culture Language guides cognition Language hides from us how vague are our thoughts With opened eyes it is difficult to recollect vague imaginations Language is like eyes for abstract concepts Usually we talk without full understanding (like kids)

INNER LINGUISTIC FORM HUMBOLDT, the 1830s In the 1830s Humboldt discussed two types of linguistic forms words’ outer linguistic form (dictionary) – a formal designation and inner linguistic form (???) – creative, full of potential This remained a mystery for rule-based AI, structural linguistics, Chomskyan linguistics rule-based approaches using the mathematics of logic make no difference between formal and creative In NMF / DL there is a difference static form of learned (converged) concept-models dynamic form of vague-fuzzy concepts, with creative learning potential, emotional content, and unconscious content

OUTLINE Cognition, complexity, and logic The Mind and Knowledge Instinct Language Integration of cognition and language Higher Cognitive Functions Future directions

HIGHER COGNITIVE FUNCTIONS Abstract models are at higher levels of hierarchy create higher meaning and purpose from lower models vague-fuzzy, less conscious Emotion of the beautiful when improve knowledge of the highest model = purpose of life objects situations meanings Similarity measures Models Action/Adaptation

BEAUTY Harmony is an elementary aesthetic emotion The highest forms of aesthetic emotion, beautiful related to the most general and most important models models of the meaning of our existence, of our purposiveness beautiful object stimulates improvement of the highest models of meaning Beautiful “reminds” us of our purposiveness Kant called beauty “aimless purposiveness”: not related to bodily purposes he was dissatisfied by not being able to give a positive definition: knowledge instinct absence of positive definition remained a major source of confusion in philosophical aesthetics till this very day Beauty is separate from sex, but sex makes use of all our abilities, including beauty

INTUITION Complex states of perception-feeling of unconscious fuzzy processes involves fuzzy unconscious concept-models in process of being learned and adapted toward crisp and conscious models, a theory conceptual and emotional content is undifferentiated such models satisfy or dissatisfy the knowledge instinct before they are accessible to consciousness, hence the complex emotional feel of an intuition Artistic intuition composer: sounds and their relationships to psyche painter: colors, shapes and their relationships to psyche writer: words and their relationships to psyche

INTUITION: Physics vs. Math. Mathematical intuition is about Structure and consistency within the theory Relationships to a priori content of psyche Physical intuition is about The real world, first principles of its organization, and mathematics describing it Beauty of a physical theory discussed by physicists Related to satisfying knowledge instinct the feeling of purpose in the world

OUTLINE Cognition, complexity, and logic The Mind and Knowledge Instinct Language Integration of cognition and language Higher Cognitive Functions Future directions

WHY ADAM WAS EXPELLED FROM PARADISE? God gave Adam the mind, but forbade to eat from the Tree of Knowledge All great philosophers and theologists from time immemorial pondered this Maimonides, 12th century God wants people to think for themselves (true or false) Adam wanted ready-made knowledge (good or bad) Thinking for oneself is difficult (this is our predicament) Today we can approach this scientifically Rarely we use the KI Often we use ready-made heuristics, rules-of-thumb Both are evolutionary adaptations Cognitive effort minimization (CEM) is opposite to the KI 2002 Nobel Prize in Economics (work of Kahneman and Tversky) People’s choices are often irrational Like Adam we use rules = cultural wisdom, not our own

GOD, SNAKE, and fMRI Majority often make irrational, heuristic choices (CEM-type) Stable minority is rational (KI-type) fMRI KI-type think with cortex (uniquely human) CEM-type think with amygdala (animals) God demands us being humans “Snake’s apple” pulled Adam back to animals

SYMBOL “A most misused word in our culture” (T. Deacon) Cultural and religious symbols Provoke wars and make piece Traffic Signs

SIGNS AND SYMBOLS mathematical semiotics Signs: stand for something else non-adaptive entities (mathematics, AI) brain signals insensitive to context (Pribram) Symbols Symbols=signs (mathematics, AI: mix up) general culture: deeply affect psyche psychological processes connecting conscious and unconscious (Jung) brain signals sensitive to context (Pribram) processes of sign interpretation DL: mathematics of symbol-processes Vague-unconscious -> crisp-conscious

SYMBOLS and “SYMBOLIC AI” Founders of “symbolic AI” believed that by using “symbolic” mathematical notations they would penetrate into the mystery of mind But mathematical symbols are just notations (signs) Not psychic processes This explains why “symbolic AI” was not successful This also illustrates the power of language over thinking Wittgenstein called it “bewitchment (of thinking) by language”

SYMBOLIC ABILITY language cognition Integrated hierarchies of Cognition and Language High level cognition is only possible due to language Much of cognition if vague and unconscious cognition language grounded in real-world objects grounded in language Similarity Action Similarity Action M M M M

OUTLINE Cognition, complexity, and logic The Mind and Knowledge Instinct Language Integration of cognition and language Higher Cognitive Functions Future directions - Evolution of languages and cultures

CULTURE AND LANGUAGE Animal consciousness Undifferentiated, few vague concepts No mental “space” between thought, emotion, and action Evolution of human consciousness and culture More differentiated concepts More mental “space” between thoughts, emotions, and actions Created by evolution of language Language, concepts, emotions Language creates concepts Still, colored by emotions 16-Sep-05 102

EVOLUTION OF CULTURES The knowledge instinct Two mechanisms: differentiation and synthesis Differentiation At every level of the hierarchy: more detailed concepts Separates concepts from emotions Synthesis Knowledge has to make meaning, otherwise it is useless Diverse knowledge is unified at the higher level in the hierarchy Connects concepts and emotions Connect language and cognition Connect high and low: concepts acquire meaning at the next level 16-Sep-05 103

DYNAMICS OF DIFFERENTIATION AND SYNTHESIS Differentiation, D New knowledge comes from differentiating old knowledge, Speed of change of D ~ D Differentiation continues if knowledge is useful (emotional) Speed of change of D ~ - S Differentiation stops if knowledge is “too” emotional Speed of change of D ~ 0, if S id “too large” Synthesis, S Emotional value of knowledge Emotions per concept diminish with more concepts Speed of change of S ~ -D Synthesis grows in the hierarchy (H) Speed of change of S ~ H 16-Sep-05 104

CULTURAL STATES CAN BE MEASURED Differentiation Number of words Synthesis Emotions per word Hierarchy Social, political, cultural, language “Material” measures Demographics, geopolitics, natural resources… Ignore for a moment

MODELING “spiritual aspects” of CULTURAL EVOLUTION Differentiation, synthesis, hierarchy dD/dt = a D G(S); G(S) = (S - S0) exp(-(S-S0) / S1) dS/dt = -bD + dH H = H0 + e*t

KNOWLEDGE-ACQUIRING CULTURE Average synthesis, high differentiation; oscillating solution Knowledge accumulates; no stability

TRADITIONAL CULTURE High synthesis, low differentiation; stable solution Stagnation, stability increases

TERRORIST’S CONSCIOUSNESS Ancient consciousness was “fused” Concepts, emotions, and actions were one Undifferentiated, fuzzy psychic structures Psychic conflicts were unconscious and projected outside Gods, other tribes, other people Complexity of today’s world is “too much” for many Evolution of culture and differentiation Internalization of conflicts: too difficult Reaction: relapse into fused consciousness Undifferentiated, fuzzy, but simple and synthetic The recent terrorist’s consciousness is “fused” European terrorists in the 19th century Fascists and communists in the 20th century Current Moslem terrorists

Knowledge accumulation + stability INTERACTING CULTURES Early: Dynamic culture affects traditional culture, no reciprocity Later: 2 dynamic cultures stabilize each other Knowledge accumulation + stability

FUTURE SIMULATIONS OF EVOLUTION Genetic evolution simulations (1980s - ) Used basic genetic mechanisms Artificial Life, evolution models: Bak and Sneppen, Tierra, Avida Evolution of cultural concepts Genes vs. memes (cultural concepts) Evolution of concepts vs. evolution of genes Culture evolves much faster than genetic evolution Human culture is <10,000 years, likely, no genetic evolution (?) Evolution of languages Concepts evolve from fuzzy to crisp and specific Concepts evolve into a hierarchy Concepts are propagated through language

MECHANISMS OF CONCEPT EVOLUTION Differentiation, synthesis, and language transmission Differentiation Fuzzy contents become detail and clear A priori models, archetypes are closely connected to unconscious needs, to emotions, to behavior Concepts have meanings Cultural and generational propagation of concepts through language Integration of language and cognition is not perfect Language instinct is separate from knowledge instinct Propagation of concepts through language A newborn child encounters highly-developed language Synthesis: cognitive and language models {MC, ML} are connected individually No guarantee that language model-concepts are properly integrated with the adequate cognitive model-concepts in every individual And we know this imperfection occurs in real life Meanings might be lost Some people speak well, but do not quite understand and v.v.

SPLIT BETWEEN CONCEPTUAL AND EMOTIONAL Dissociation between language and cognition Might prevail for the entire culture Words maintain their “formal” meanings Relationships to other words Words loose their “real” meanings Connection to cognition, to unconscious and emotions Conceptual and emotional dissociate Concepts are sophisticated but “un-emotional” Language is easy to use to say “smart” things but they are meaningless, unrelated to instinctual life

CREATIVITY At the border of conscious and unconscious Archetypes should be connected to consciousness To be useful for cognition Collective concepts–language should be connected to The wealth of conceptual knowledge (other concepts) Unconscious and emotions Creativity in everyday life and in high art Connects conscious and unconscious Conscious-Unconcious ≠ Emotional-Conceptual Different slicing of the psyche

DISINTEGRATION OF CULTURES Split between conceptual and emotional When important concepts are severed from emotions There is nothing to sacrifice one’s life for Split may dominate the entire culture Occurs periodically throughout history Was a mechanism of decay of old civilizations Old cultures grew sophisticated and refined but got severed from instinctual sources of life Ancient Acadians, Babylonians, Egyptians, Greeks, Romans… New cultures (“barbarians”) were not refined, but vigorous Their simple concepts were strongly linked to instincts, “fused”

EMOTIONS IN LANGUAGE Animal vocal tract Human vocal tract controlled by old (limbic) emotional system involuntary Human vocal tract controlled by two emotional centers: limbic and cortex Involuntary and voluntary Human voice determines emotional content of cultures Emotionality of language is in its sound: melody of speech 16-Sep-05 116

LANGUAGE: EMOTIONS AND CONCEPTS Conceptual content of culture: words, phrases Easily borrowed among cultures Emotional content of culture In voice sound (melody of speech) Determined by grammar Cannot be borrowed among cultures English language (Diff. > Synthesis) Weak connection between conceptual and emotional (since 15 c) Pragmatic, high culture, but may lead to identity crisis Arabic language (Synthesis > Diff.) Strong connection between conceptual and emotional Cultural immobility, but strong feel of identity (synthesis) 16-Sep-05 117

SYNTHESIS People cannot live without synthesis Feel of wholeness Meaning and purpose of life Creativity, life, and vigor requires synthesis Emotional and conceptual, conscious and unconscious In every individual Lost synthesis and meaning leads to drugs and personal disintegration In the entire culture Lost synthesis and meaning leads to cultural disintegration Historical evolution of consciousness From primitive, fuzzy, and fused to differentiated and refined Interrupted when synthesis is lost Differentiation and synthesis are in opposition, still both are required Example: religion vs. science Religious synthesis empowered human mind (15 c) and created conditions for development of science (17 c) Scientific differentiation destroyed religious synthesis Evolution of our culture requires overcoming this split, and it is up to us, scientists and engineers Individual consciousness Combining differentiation and synthesis Jung called individuation, “the highest purpose in every life”

MECHANISM OF SYNTHESIS Integrating the entire wealth of knowledge Undifferentiated knowledge instinct “likelihood maximization” Global similarity Differentiated knowledge instinct Highly-valued concepts Local similarity among concepts Highly valued concepts acquire properties of instincts Affect adaptation, differentiation, and cognition of other concepts Generate emotions, which relate concepts to each other An emergent hierarchy of concept-values Differentiated emotions connect diverse concepts We need huge diversity of emotions to integrate conceptual knowledge => synthesis

DIFFERENTIATION OF EMOTIONS Historical evolution of human consciousness Animal calls are undifferentiated concept-emotion-communication-action Ancient languages are highly emotional (Humboldt, Levy-Brule) Language evolved toward unemotional differentiation Nevertheless, most conversations have little conceptual content From villages to corporate board-rooms, people talk to establish emotional contact Human speech affects recent and ancient emotional centers Inflections and prosody of human voice appeals directly to ancient undifferentiated emotional mechanisms Accelerates differentiation, but endangers synthesis Music evolved toward differentiation of emotions At once: creates tensions and wholeness in human soul

ROLE OF MUSIC IN EVOLUTION OF THE MIND Melody of human voice contains vital information About people’s world views and mutual compatibility Exploits mechanical properties of human inner ear Consonances and dissonances Tonal system evolved (14th to 19th c.) for Differentiation of emotions Synthesis of conceptual and emotional Bach integrates personal concerns with “the highest” Pop-song is a mechanism of synthesis Integrates conceptual (lyric) and emotional (melody) Also, differentiates emotions Bach concerns are too complex for many everyday needs Human consciousness requires synthesis immediately Rap is a simplified, but powerful mechanism of synthesis Exactly like ancient Greek dithyrambs of Dionysian cult

EVOLUTION vs. INTELLIGENT DESIGN Science causal mechanisms Religion teleology (purpose) Wrong! In basic physics causality and teleology are equivalent The principle of minimal energy is teleological More general, min. action (min. Lagrangian) The knowledge instinct Teleological principle in evolution of the mind and culture Dynamic logic is a causal law equivalent to the KI Causality and teleology are equivalent 16-Sep-05 122

DL/NMF THEORY OF THE MIND CONFIRMED IN EXPERIMENTS Neural mechanisms of perception and cognition bottom-up and top-down signals matching Adaptive mechanisms synaptic connections Dynamics of vagueness-fuzziness From vague to crisp Unconscious - Conscious Corresponds to vague-crisp

FUTURE DIRECTIONS research, predictions and testing of NMF/DL Mathematical development DL in Hierarchy, mechanisms of Synthesis Add emotions to computer models of language evolution Neuro-imaging and psycholinguistic experiments Similarity-KI mechanisms, models Higher cognitive functions: beautiful, sublime Language-cognition interaction Emotionality of various languages Multi-agent simulations Joint evolution of language and cognition Historical linguistics and anthropology Concurrent evolution of languages, consciousness, and cultures Music Direct effect on emotions, mechanism of synthesis Concurrent evolution of music, consciousness, and cultures Improve human condition around the globe Diagnose cultural states (up, down, stagnation), measure D, S, H Develop predictive cultural models, integrate spiritual and material causes Identify language and music effects that can advance consciousness and reduce tensions Robotic systems, Semantic Web, Cyberspace, and Interactive environment Adaptive ontologies Learn from human users Acquire cultural knowledge and enable culturally-sensitive communication Help us understand ourselves Help us understand each other 124

THE END Can we describe mathematically and build a simulation model for evolution of all of these abilities? Can we build robotic systems understanding us, collaborating with us?

PUBLICATIONS 330publications OXFORD UNIVERSITY PRESS (2000; 3rd printing) 2007 Neurodynamics of High Cognitive Functions with Prof. Kozma, Springer Sapient Systems with Prof. Mayorga, Springer 2010: Dynamic Logic With Dr. Deming, Springer

BACK UP Why the mind and emotions? NMF vs. inverse problems NMF vs. biology of eye Cognition and understanding Intelligent agents The mind: Plato, Antisthenes, Aristotle, Occam, Locke, Kant, Jung, Chomsky, Grossberg DL, KI, and Buddhism Consciousness, aesthetic emotions