How Ideas Evolve into Speech - A Computer Animation
Derek J. SMITH, CEng, CITP
Centre for Psychology, University of Wales Institute, Cardiff

As presented to the 9th Annual Conference of the Consciousness and Experiential Psychology Section of the British Psychological Society, St Anne's College, Oxford, 18th September 2005

Copyright Notice: This material was written and published in Wales by Derek J. Smith (Chartered Engineer), Senior Lecturer in Cognitive Science and Informatics at University of Wales Institute, Cardiff. It forms part of a multifile e-learning resource, and subject only to acknowledging Derek J. Smith's rights under international copyright law to be identified as author may be freely downloaded and printed off in single complete copies solely for the purposes of private study and/or review. Commercial exploitation rights are reserved. The remote hyperlinks have been selected for the academic appropriacy of their contents; they were free of offensive and litigious content when selected, and will be periodically checked to have remained so. Copyright © 2005, Derek J. Smith (Chartered Engineer). Publication was by PowerPoint presentation on 18th September 2005, running offline with inactive hyperlinks. This online version, complete with activated hyperlinks, comes to you for follow-up private study. Paragraphs rendered faint form part of the fuller narrative but were not unduly emphasised at time of presentation.

ABOUT THE AUTHOR Derek Smith graduated as a psychologist in 1972, but is now bi-professional as both psychology lecturer and systems engineer. During the 1980s he was with British Telecom, Cardiff, where he specialised in the design and operation of very large "semantic network" databases. Since 1991 he has taught psycholinguistics and neuropsychology to Speech and Language Therapy undergraduates. The essence of a computer database under interrogation is the quasi-linguistic "linearisation" of fragmented conceptual memory. Since this is also the essence of human speech production, Derek is fascinated by the possibility that the mind is a biological database. For a gentle introduction to what goes on inside semantic network databases, see Smith (1998), Smith (2005), or click here.

PLAN OF ATTACK This paper looks at the positioning of conscious experience within the complex of processing modules involved in speech praxis, with a view to identifying the critical information flows and supporting memory types. The paper is organised into two sections, one long and introductory, and the other more focused but exploratory. In Section 1, we familiarise ourselves with the modules and stages of spoken language processing, firstly in static box-and-arrow diagram format, and secondly in computer animation. We shall be paying special attention to the role of the “speech act” in communication, and the consequent need for interaction between semantic and pragmatic command systems. In Section 2, we then take a closer look at the distribution of different memory types, encoding systems, and feedback circuits, and demonstrate how the act of animating those circuits can generate highly specific research questions. An appeal for a detailed cybernetic analysis of language production is made, for it is long overdue.

SECTION 1 THE MODULES AND STAGES OF SPOKEN LANGUAGE PROCESSING

SPEECH PRODUCTION STAGES (1) LORDAT’S (1843) FIVE-STAGE MODEL In 1843 the French neurologist Jacques Lordat identified five post-ideational processing stages within speech production, and described the first of these as “isolating” the idea to be expressed. The four subsequent processes then co-operate in shaping the final spoken output. Lordat's analysis was recently converted into box-and-arrow format by Lecours, Nespoulos, and Pioger (1987), as now shown …..

Here is Lecours et al's graphical representation of Lordat's analysis.
The initial ideational stage is shown as "THINKING IN GENERAL". Each stage receives a coded message from the one before, adds to it in some clever way, and then passes it on to the one after. BUT NOTE THE DIFFICULTY COUNTING STAGES WHEN ONE STAGE IS ALLOWED TO CONTAIN NESTED SUB-STAGES. MORE ON THIS AS WE GO.

AND NOTE ALSO THAT IF WE COUNT IDEATION AS A PROCESSING STAGE AS WELL, IT MAKES SIX STAGES IN TOTAL .....

SPEECH PRODUCTION STAGES (2) TWO "SHAPES" TO THE DIAGRAMS
Lordat’s explanatory schema was duly incorporated into a number of later 19th century aphasiological models, and his analysis is of distinct historical significance today because with surprisingly few alterations it is still with us. Unfortunately, there has never been a standard form of the supporting diagram. Different authors draw things different ways and at different levels of detail. Nevertheless, diagrams tend to come in only two basic shapes, namely “A-shaped” (with the “clever” bits – the mind’s “higher functions” - at the top) or “X-shaped” (with the clever bits at the central cross-over point). Here are two of the early A-shaped ones (we'll be seeing one of the X-shaped ones later on) …..

SPEECH PRODUCTION STAGES (3) LICHTHEIM’S (1885) TWO-LAYER MODEL Here, from the Golden Age of Aphasiology, is a three-module two-level model of the totality of language processing. Speech perception takes place in module A, speech production in module M, and understanding in module B [B = "Begriff", the German word for "understanding"]. Note that Lichtheim's B and M stages are doing exactly the same job as Lordat's six stages! AGAIN NOTE THE DIFFICULTY COUNTING STAGES WHEN ONE STAGE IS ALLOWED TO CONTAIN NESTED SUB-STAGES WITHOUT EVEN SHOWING THEM.

SPEECH PRODUCTION STAGES (4) LICHTHEIM’S (1885) TWO-LAYER MODEL
Note also that all three modules are major consumers of LTM resources. Thus Module B needs to be functionally located either within, or close to, the mind's "semantic network", whilst Modules A and M need instant access to non-semantic word stores. However ..... box-and-arrow diagrams like these often leave the actual memory stores implicit, and thus under-specified.
NOTE THE TWO MODALITY-SPECIFIC WORD STORES (OR "LEXICONS"), BOTH SUBORDINATED TO THE CENTRE FOR HIGHER FUNCTION.

SPEECH PRODUCTION STAGES (5) KUSSMAUL’S (1878) FOUR LEXICON MODEL NOTE THE FOUR SEPARATE WORD STORES (OR "LEXICONS") IN THIS CONTEMPORARY ALTERNATIVE TO LICHTHEIM'S MODEL. THESE, TOO, ARE ALL SUBORDINATED TO A HIGHER PROCESSING CENTRE.

SPEECH PRODUCTION STAGES (6)
Speech production was then largely ignored until UCLA's Victoria A. Fromkin reawakened interest in it as a study area in the early 1970s (Fromkin, 1971). The most popular modern models of speech production come from the Max Planck Institute's Willem Levelt and the University of Arizona's Merrill F. Garrett. Like Lordat, Fromkin identified ideation plus five post-ideational processing stages, as follows …..

SPEECH PRODUCTION STAGES (7) FROMKIN’S FIRST THREE STAGES
Stage 1 – Pre-Lexical Semantics: Decides the meaning to be conveyed. Code not known, but preverbal.
Stage 2 – Pre-Lexical Syntax: Decides the grammatical skeleton of the sentence. Code not known, but preverbal.
Stage 3 – Lexical: Selects the necessary “content words” (i.e. nouns and verbs) from the mental lexicon, thus making ideas verbal for the first time.
It’s worth remembering these three stages as a unit, because they give us our ability with LANGUAGE.

SPEECH PRODUCTION STAGES (8) FROMKIN’S LAST THREE STAGES
Stage 4 – Prosody: Adds in emotionality via intonation pattern. Code not known, but mediated by the hindbrain.
Stage 5 – Phonology: Decides the final syntax and word morphology. Phonemic code.
Stage 6 – Final Sound Production: Commits concrete sounds – "allophones" – to the motor nerves for respiration, phonation, and articulation.
We’re not really concerned with these three stages in this paper, because they are all post-semantic. They take units of language and convert them into SPEECH.
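
The six stages can be pictured as a simple cascade in which each stage receives a message, adds its own contribution, and passes the result on. The following is only a minimal sketch of that idea, not a model of the real codes involved (which, as noted, are largely unknown): the `Message` class, the `run_stage` function, and the placeholder "contributions" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    """A message travelling down the cascade (hypothetical structure)."""
    idea: str
    layers: List[str] = field(default_factory=list)


# Stage names follow the six stages listed above; what each stage really
# adds is not known, so each one simply records that it has run.
STAGES = [
    "pre-lexical semantics",
    "pre-lexical syntax",
    "lexical",
    "prosody",
    "phonology",
    "final sound production",
]


def run_stage(name: str, msg: Message) -> Message:
    # Receive the message from the stage before, add to it, pass it on.
    return Message(msg.idea, msg.layers + [name])


def produce(idea: str) -> Message:
    msg = Message(idea)
    for name in STAGES:
        msg = run_stage(name, msg)
    return msg


result = produce("The Redcoats are coming")
language_part = result.layers[:3]   # Stages 1-3: LANGUAGE
speech_part = result.layers[3:]     # Stages 4-6: SPEECH
```

Splitting `result.layers` after the third entry mirrors the language/speech division drawn above.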

SPEECH PRODUCTION STAGES (9) NORMAN’S (1990) THREE-LAYER MODEL
LICHTHEIM'S MODULE B, BUT NOW WITH VARIOUS SUBDIVISIONS OF FUNCTION
THE LANGUAGE LEVEL
LICHTHEIM'S MODULE A, BUT NOW WITH SUBSTAGES OF SPEECH PERCEPTION
LICHTHEIM'S MODULE M, BUT NOW WITH SUBSTAGES OF SPEECH PRODUCTION

Or to put it all a little more vividly .....

The result is a mental champagne-cascade .....
..... with ideas pouring down from the top .....
..... words being added on the way down .....
..... the ideas plus the words give you your language .....
..... sounds are then added below that .....
..... and “linear” speech emerges at the bottom.

Or more correctly .....

BECAUSE WE DON'T REALLY KNOW WHAT GOES ON UP HERE

And the big problem, of course, is .....

..... what is the true nature of our "champagne" - the ideation at the top of the cascade? And who (or what) the heck is pouring it?

STATE-OF-THE-ART PSYCHOLINGUISTIC MODELING
Norman (1990) is more or less state-of-the-art amongst the A-shaped diagrams. The PALPA (Kay, Lesser, and Coltheart, 1992) is a typical X-shaped psycholinguistic diagram. It derives from the 19th century four-lexicon models, via earlier modern modeling efforts by John Morton, Andrew Ellis (e.g. Ellis, 1982), Karalyn Patterson, John Marshall, etc. It is characterised by having all the higher functions located in the middle of the diagram, not at the top. Here it is ….. To see a brief history of psycholinguistic models of this genre, click here.

The PALPA (1) http://www.smithsrisca.co.uk/PSYkayetal1992.html
All input channels are now at the top, and all output channels at the bottom. Ideation is in the centre box, and speech praxis is the bottom left processing leg. THIS IS WHERE ALL THE PHILOSOPHICALLY MYSTERIOUS THINGS HAPPEN

NEXT WE'RE GOING TO SEE THIS SPEECH PRODUCTION LEG IN CLOSE UP .....

Here is the bottom left quadrant of the full PALPA diagram. The similarity with the speech output legs of the Lichtheim, Kussmaul, and Norman diagrams should now be apparent. The focus of the present paper is this - WHAT IS IDEATION, AND HOW DO IDEAS MAKE IT OUT OF THE CENTRE BOX AND DOWN THE FIRST ARROW?

IN FACT, ONE OF THE MAIN STRATEGIC GOALS OF COGNITIVE SCIENCE HAS ALWAYS BEEN TO OPEN UP THIS BLACK BOX. THIS MEANS MODELING ALL THE HIGHER FUNCTIONS SHOWN ON THE NORMAN (1990) DIAGRAM, PLUS SPECIFYING WHERE THE MIND'S SEMANTIC NETWORK DATABASE MIGHT BE SITUATED AND HOW IT MIGHT CONTRIBUTE TO SAID HIGHER FUNCTIONS. THIS, IN TURN, REQUIRES SEPARATING OUT THE SUBSYSTEMS FOR CONSCIOUSNESS, SEMANTICS, AND PRAGMATICS.

SPEECH ACTS AND IDEATION
One of the keys to unravelling higher functions is to include speech acts in our modeling. Unfortunately, speech praxis is so complex in this respect that it has recently spawned its own science – “pragmatics” - with its own very powerful theory - Speech Act Theory (Austin, 1962; Searle, 1969). Speech Act Theory studies not just the words people use, but the units of intention – the “speech acts” which preceded those words. Pragmatics is thus the science of the first down arrow on diagrams like the PALPA ….. Each speech act is (a) calculated to achieve some discrete behavioural "perlocutionary" effect, but (b) has not yet been fully formed lexically or grammatically. The code is preverbal - perhaps “sprites” or ideograms of some sort.

ANIMATED PALPA – SMITH (2000) http://www.smithsrisca.co.uk/PALPA.avi
So what might a speech act look like? Where do these all-important sprites come from, where do they go, and what happens to them when they get there? And are they (or the feedback generated during their processing) involved in consciousness? To get a better idea of the process, we need to see the static flow diagram “in motion”. So here, from Smith (2000), is sentence production at about one third natural speed, for the specimen sentence “The Redcoats are coming” ….. Technical NB: If accessing this presentation over the Internet you should note that the latest versions of PowerPoint no longer play this video from within the presentation. To get around this problem, simply click here to download the corresponding .avi file and view it using your MS MediaPlayer or equivalent.

ANIMATED PALPA – SMITH (2000) KEY POINTS
Watch out for .....
..... the central functional separation of awareness, understanding, and will, closely associated with affective processes.
..... the converging flow of semantic and pragmatic icons onto the primary sentence construction process, and the parallel movement of the affective icon onto the lower speech production process.
..... the need for constant signal acknowledgement and onward transmission.
..... the number of alternative feedback routes for said acknowledgements to take.
..... the need for interrupt-resend mechanisms.

SECTION 2 THE MEMORY TYPES, THE ENCODING SYSTEMS, AND THE FEEDBACK CIRCUITS IN SPOKEN LANGUAGE PROCESSING

THE BASIC PROBLEM (1)
There are five related problems with modeling the cognitive system .....
Firstly, the system being modeled just won't keep still.
Secondly, when it moves it moves very quickly.
Thirdly, when it moves quickly we can neither see, nor conceptually keep up with, what it's doing at a reductionist level.
Fourthly, when we slow it down or look closely at it we lose sight of what it's doing at a holistic level.
Finally, there is no single explanatory science .....

THE BASIC PROBLEM (2) ..... in fact, the following disciplines all have something to say about where the true secrets of cognition lie ..... Anatomical Neuroscience, Artificial Intelligence, Clinical Neurology, Clinical Neuropsychology, Clinical Psychology, Cognitive Palaeontology, Comparative Ethology, Consciousness Studies, Cybernetics, Epistemology, Linguistic Philosophy, Mental Philosophy, Neuroethology, Physical Anthropology, Physiological Neuroscience, and Psycholinguistics. We have highlighted the science whose voice has not yet matched its potential contribution, namely cybernetics, the study of control systems in the abstract. The following screens show where a little cybernetics might make a lot of difference .....

FEEDFORWARD CONTROL Let us look again at the PALPA's speech production leg as published. Note that all the arrows are "feedforward" information flows. They pass content, together with instructions on what to do with it, down to lower modules.

FEEDBACK CONTROL (1) There are no "feedback" information flows on this diagram. (We instantly know this because none of the arrows point up the screen.) So no module can communicate problems back to the module above it. This makes for an extremely inefficient real-time information processing architecture, so let's add some up arrows .....

FEEDBACK CONTROL (2) Now we have allowed for the "feedback" of the success or failure of any component of the speech production process. Note the multiple "concentric" feedback loops, both "antidromic" and indirect. [Antidromic = back up the down channel, and possibly even back up the down neuron.] BUT WHICH TYPE OF FEEDBACK IS BEST? OR DO WE, PERHAPS, JUST NEED AS MUCH OF IT AS WE CAN GET OUR HANDS ON?
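
The difference between the two architectures can be sketched in a few lines. This is a toy illustration under stated assumptions, not a reconstruction of the PALPA: the `Module` class and the four module names are hypothetical, and the "feedback" here is simply a status string returned back up the same route the content went down (loosely analogous to the antidromic case).

```python
from typing import Optional


class Module:
    """One processing box in a speech production leg (hypothetical)."""

    def __init__(self, name: str, works: bool = True):
        self.name = name
        self.works = works
        self.below: Optional["Module"] = None

    def feedforward(self, content: str) -> None:
        # Down arrows only: a failure lower down is invisible from above.
        if self.works and self.below is not None:
            self.below.feedforward(content)

    def with_feedback(self, content: str) -> str:
        # Down arrow plus up arrow: a status report travels back up.
        if not self.works:
            return f"FAIL at {self.name}"
        if self.below is None:
            return "OK"
        return self.below.with_feedback(content)


def chain(*modules: Module) -> Module:
    # Link each module to the one beneath it and return the top module.
    for upper, lower in zip(modules, modules[1:]):
        upper.below = lower
    return modules[0]


top = chain(
    Module("semantics"),
    Module("lexicon"),
    Module("phonology", works=False),   # simulated breakdown
    Module("articulation"),
)

silent = top.feedforward("The Redcoats are coming")
report = top.with_feedback("The Redcoats are coming")
```

With down arrows only, the top module learns nothing (`silent` is `None`); with an up arrow, it is told where the breakdown occurred (`report` is `"FAIL at phonology"`).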

FEEDBACK CONTROL (3) We may gain additional insight into what is involved by looking at feedback in the A-shaped diagram. It's everywhere! Even on the INPUT leg! Note especially the difference between KR (knowledge of results) and the aforementioned control interrupts.

FEEDBACK CONTROL (4)
Here is the pay-off ..... FEEDBACK MECHANISMS ARE MAJOR CONSUMERS OF SHORT-TERM MEMORY, SO, IN GETTING THE CONTROL LOOPS RIGHT, YOU GET THE BULK OF THE MEMORY REQUIREMENTS RIGHT AS WELL. Here are some of the memory stores required to support the arrows already specified ..... C = cacheing buffers; E = efference copy.
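
To see why feedback consumes short-term memory, consider what an up-module must hold while it waits. The sketch below is illustrative only, and its `Sender` class and method names are assumptions: a cacheing buffer (C) keeps each instruction until it has been acknowledged, so that it can be resent if need be, and an efference copy (E) keeps a record of what was commanded, so that it can be compared with what actually came back.

```python
class Sender:
    """An up-module with two short-term stores (hypothetical design)."""

    def __init__(self):
        self.cache = {}       # C: instructions held until acknowledged
        self.efference = {}   # E: copies of commands, kept for comparison
        self.next_id = 0

    def send(self, instruction: str) -> int:
        msg_id = self.next_id
        self.next_id += 1
        self.cache[msg_id] = instruction
        self.efference[msg_id] = instruction
        return msg_id

    def acknowledge(self, msg_id: int) -> None:
        # The down-module has confirmed receipt, so the cache can be cleared.
        self.cache.pop(msg_id, None)

    def compare(self, msg_id: int, observed: str) -> bool:
        # Check the reafferent signal against the stored efference copy.
        return self.efference.pop(msg_id) == observed


sender = Sender()

first = sender.send("coming")
sender.acknowledge(first)
matched = sender.compare(first, "coming")          # output as commanded

second = sender.send("Redcoats")
mismatched = sender.compare(second, "Redcoat")     # output differs
still_cached = second in sender.cache              # available for resend
```

Every arrow added to the diagram therefore implies at least one store of this kind at its upper end.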

FEEDBACK CONTROL (5)
CONFIRMATION THAT THE WORLD IS REACTING AS REQUESTED
..... which presents us with a number of opportunities for both conscious and unconscious experience.
HEARING YOUR OWN VOICE
HEARING YOUR INNER SPEECH
THE UNCONSCIOUS SENSE OF LUCIDITY WHICH COMES WHEN YOU FIND ALL THE WORDS YOUR IDEAS DEMAND, A SORT OF "MOT JUSTE EFFECT"
THE UNCONSCIOUS SENSE OF LUCIDITY WHICH COMES WHEN YOUR TONGUE DOES WHAT IT'S TOLD
NOTE THAT IF PHONO PROCESSING FAILS EVEN MOMENTARILY, THE INTERRUPT/RESEND MECHANISMS NEED TO BE INVOKED IN AN UPWARDS CASCADE.
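
One way to picture an upwards cascade of interrupts and resends is sketched below. It is a toy under stated assumptions rather than a claim about the real mechanism: the `Stage` class, the stage names, and the rule that each stage resends once before passing the interrupt upwards are all hypothetical.

```python
from typing import List, Optional


class Stage:
    """A processing stage that may resend to the stage below (hypothetical)."""

    def __init__(self, name: str, failures: int = 0,
                 lower: Optional["Stage"] = None, max_resends: int = 1):
        self.name = name
        self.failures_left = failures   # simulated momentary failures
        self.lower = lower
        self.max_resends = max_resends

    def process(self, block: str, log: List[str]) -> bool:
        if self.failures_left > 0:
            self.failures_left -= 1
            log.append(f"{self.name}: interrupt")
            return False
        if self.lower is None:
            log.append(f"{self.name}: done")
            return True
        for attempt in range(1 + self.max_resends):
            if attempt > 0:
                log.append(f"{self.name}: resend")
            if self.lower.process(block, log):
                return True
        # Resends exhausted: pass the interrupt up to the stage above.
        log.append(f"{self.name}: interrupt")
        return False


phonology = Stage("phonology", failures=2)
lexical = Stage("lexical", lower=phonology)
semantics = Stage("semantics", lower=lexical)

log: List[str] = []
succeeded = semantics.process("The Redcoats are coming", log)
```

In this run the phonology stage fails twice, so the lexical stage's single resend is not enough; the interrupt then climbs one level further and the resend from the semantics stage finally gets the block through.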

FEEDBACK CONTROL (6) SPECIFIC RESEARCH ISSUES
What is the down-module STM retention time when a block of instructions is received, and what is the nature of the code used?
What is the nature of the down-module processing carried out on those instructions?
What is the nature of the feedback loops in force?
Are there differences in the up-module processing of the antidromic and reafferent feedback types?
The cacheing and efference copy activities should already be visible in the functional neuroimaging literature, but, without an adequate reference model to go by, risk being misinterpreted as artifacts.

CONCLUSION THE ARGUMENT IN A NUTSHELL
We have been looking at the positioning of conscious experience within the complex of processing modules involved in speech praxis. We began by familiarising ourselves with the modules and stages of spoken language processing, firstly in static box-and-arrow diagram format, and secondly in computer animation. In the animation we recognised the “speech act” as a major feedforward instruction stream, and considered how and where this stream would interact with our semantic and awareness systems. A specific appeal for a cybernetic analysis of language production was then made, supported by a closer look at the distribution of different memory types in the different feedback circuits. Finally, we demonstrated how the act of animating those circuits can generate highly specific research questions, such as whether copies are taken and whether feedback is direct (antidromic) or indirect.

REFERENCES
Austin, J.L. (1962). How to do Things with Words. Oxford: Oxford University Press.
Fromkin, V.A. (1971). The non-anomalous nature of anomalous utterances. Language, Vol. 47, pp
Lecours, A.R., Nespoulos, J.L., and Pioger, D. (1987). Jacques Lordat or the Birth of Cognitive Neuropsychology. In Keller, E. and Gopnik, M. (Eds.), Motor and Sensory Processes in Language. Hillsdale, NJ: Erlbaum.
Lichtheim, L. (1885). On aphasia. Brain, 7:
Lordat, J. (1843). Leçons tirées du cours de physiologie de l'année scolaire Journal de la Société de médecine pratique de Montpellier, 7: ; 7: , and 8:1-17. [But reviewed in detail in Lecours, Nespoulos, and Pioger (1987).]
Searle, J.R. (1969). Speech Acts: An Essay in the Philosophy of Language. Cambridge: Cambridge University Press.
Smith, D.J. (1998). Commentary on "Cortical Activity and the Explanatory Gap" by J.G. Taylor. Consciousness and Cognition, 7:
Smith, D.J. (2000). A slow-motion video analysis of information feedback in a computer-animated psycholinguistic model. Computer-animated poster presented 10th April 2000 at the Tucson Towards a Science of Consciousness conference, University of Arizona, Tucson, AZ.
Smith, D.J. (2005). On database keys, with an application to the Praxisproblem. In Callaos, N., Lesso, W., and Palesi, M. (Eds.), The 9th World Multi-Conference on Systemics, Cybernetics, and Informatics, July 10-13, Orlando, Florida, USA (Volume IV). Orlando, FL: International Institute of Informatics and Systemics.
