COM1070: Introduction to Artificial Intelligence: week 2
Yorick Wilks, Computer Science Department, University of Sheffield
www.dcs.shef.ac.uk/-yorick
The Turing Test
There are at least two alternative positions which criticise AI with respect to the Turing Test:
i. ‘Too hard’ definition of Artificial Intelligence. Computers are not likely to be able to pass the test.
ii. Hollow shell criticism. A computer may pass the test, but computers still won’t be able to think.
As we shall see, on (i) computers aren’t doing badly and are getting better. On (ii), the criticism just begs the question as to what thinking is, which was Turing’s point in the first place!
‘I believe that in about fifty years’ time it will be possible to program computers ... to make them play the imitation game so well that an average interrogator will not have more than 70 percent chance of making the right identification after five minutes of questioning ... I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted.’ Turing 1950.
Was Turing right about English?
- My bank’s machine thinks I only have twenty pounds left.
- The computer says we’re a hundred miles off course.
- It’s telling us that it’s going to blow in ten minutes.
Would you be satisfied that something which passed the TT was intelligent? What other requirements would you put on something before you considered it to be intelligent?
Turing’s own objections:
- Turing considered, and dismissed, possible objections to the idea that computers can think. Some of these objections might still be raised today. Some objections are easier to refute than others.
Objections considered by Turing:
1. The theological objection
2. The ‘heads in the sand’ objection
3. The mathematical objection
4. The argument from consciousness
5. Arguments from various disabilities
6. Lady Lovelace’s objection
7. Argument from continuity in the nervous system
(8.) The argument from informality of behaviour
(9.) The argument from extra-sensory perception
The theological objection
- ‘…Thinking is a function of man’s immortal soul. God has given an immortal soul to every man and woman, but not to any other animal or to machines. Hence no animal or machine can think…’
- Why not believe that God could give a soul to a machine if He wished?
Heads in the sand objection
- i.e. the consequence of machines thinking would be too dreadful. Let us hope and believe that they cannot do so.
- Related to the theological argument: the idea that humans are superior to the rest of creation, and must stay so.
- ‘.. Those who believe in..(this and the previous objection).. would probably not be interested in any criteria..’
The mathematical objection
- Results of mathematical logic can be used to show that there are limitations to the powers of discrete-state machines.
  e.g. the halting problem: will the execution of a program P eventually halt, or will it run for ever? Turing (1936) proved that for any algorithm H that purports to solve halting problems there will always be a program Pi such that H will not be able to answer the halting problem correctly for Pi. i.e. certain questions cannot be answered correctly by any formal system.
- But similar limitations may also apply to the human intellect.
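The halting-problem argument above can be illustrated with a small Python sketch. It is only an illustration of the diagonal idea, not Turing's 1936 proof: the names `make_contrary` and `refute` are hypothetical, and the candidate decider is assumed to be a deterministic function that takes a program and returns True ("halts") or False ("runs for ever").

```python
def make_contrary(halts):
    """Build a program that does the opposite of what `halts` predicts for it."""
    def contrary():
        if halts(contrary):
            while True:      # predicted to halt, so run for ever
                pass
        return "halted"      # predicted to run for ever, so halt at once
    return contrary


def refute(halts):
    """Describe how the candidate decider `halts` goes wrong on its own contrary program.

    The looping branch is never executed here; it is only reasoned about.
    Assumes `halts` is deterministic (same answer every time it is asked).
    """
    contrary = make_contrary(halts)
    if halts(contrary):
        return "H predicted halting, but contrary would run for ever"
    assert contrary() == "halted"   # safe to run: this branch returns at once
    return "H predicted looping, but contrary halted"


# Two toy deciders (assumptions for illustration): one always says yes, one always says no.
print(refute(lambda program: True))
print(refute(lambda program: False))
```

Whatever answer the candidate H gives about its contrary program, the program is built to do the opposite, so H is wrong on at least that one input.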
Argument from consciousness
- ‘…This argument is very well expressed in Professor Jefferson’s Lister Oration for 1949, from which I quote. “Not until a machine can write a sonnet or compose a concerto because of thoughts and emotions felt, and not by the chance fall of symbols, could we agree that machine equals brain - that is, not only write it but know that it had written it. No mechanism could feel (and not merely artificially signal, an easy contrivance) pleasure at its successes, grief when its valves fuse, be warmed by flattery, be made miserable by its mistakes, be charmed by sex, be angry or depressed when it cannot get what it wants”..’
The only way one could be sure that a machine thinks is to be that machine and feel oneself thinking.
- Similarly, the only way to be sure someone else thinks is to be that person.
How do we know that anyone is conscious? This is solipsism. Instead, we assume that others can think and are conscious: it is a polite convention. Similarly, we could assume that a machine which passes the Turing test is so too.
Consciousness
- Thought and consciousness do not always go together.
Freud and unconscious thought. Thought we cannot introspect about (e.g. searching for a forgotten name).
Blindsight (Weiskrantz): removal of visual cortex, blind in certain areas, but can still locate a spot without consciousness of it.
Arguments from various disabilities
i.e. ‘I grant that you can make machines do all the things you have mentioned but you will never be able to make one do X’.
e.g. be kind, resourceful, beautiful, friendly, have initiative, have a sense of humour, tell right from wrong, make mistakes, fall in love, enjoy strawberries and cream, make someone fall in love with it, learn from experience, use words properly, be the subject of its own thought, have as much diversity of behaviour as a man, do something really new.
These criticisms are often disguised forms of the argument from consciousness.
Lady Lovelace’s objection:
- (memoir from Lady Lovelace about Babbage’s Analytical Engine)
Babbage and the Analytical Engine: general purpose calculator. Entirely mechanical. The entire contraption was never built: engineering not up to it and no electricity!
‘..The Analytical Engine has no pretensions to originate anything. It can do whatever we know how to order it to perform..’
A computer cannot be creative, it cannot originate anything, only carry out what was given to it by the programmer.
But computers can surprise their programmers, i.e. by producing answers that were not expected. The original data may have been given to the computer, but it may then be able to work out their consequences and implications (cf. level of chess programs and their programmers).
Argument from continuity in the nervous system
- The nervous system is continuous: the digital computer is a discrete state machine.
i.e. in the nervous system a small error in the information about the size of a nervous impulse impinging on a neuron may make a large difference to the size of the outgoing impulse.
Discrete state machines: move by sudden jumps and clicks from one state to another. For example, consider the ‘convenient fiction’ that switches are either definitely on, or definitely off.
However, a discrete state machine can still give answers that are indistinguishable from those of a continuous machine.
Other objections
- Copeland (1993) [see ‘Artificial Intelligence: a philosophical introduction’] discusses 4 further objections to the Turing Test. The first three of these he dismisses, and the fourth he incorporates into a modified version of the Turing Test.
1. Too conservative: Chimpanzee objection
Chimpanzees, dolphins, dogs, and pre-linguistic infants all can think (?) but could not pass the Turing Test.
But this only means that the Turing Test cannot be a litmus test (red = acid, not red = non acidic).
- Nothing definite follows if a computer/animal/baby fails the test, i.e. a negative outcome does not mean the computer cannot think. (In philosophical terms: the TT gives a sufficient, not a necessary, condition of thought.)
2. Too easy: Sense organs objection
- The Turing Test focuses only on verbal responses: the computer is able to use words without really knowing their meanings (like taking a driving test that consists only of answering questions). Equip it with sense organs, and you can then test knowledge of the world (remember turning the turtle over in ‘Blade Runner’ when spotting robots).
- But there are many words which can be investigated without the computer having sense organs, e.g. ‘teasing’.
- Think about getting to know people through webchat: can you?
3. Simulation objection
- The simulation of X is never an X. In the original Victorian parlour game, if a man misleads the interrogator into thinking he is a woman, that doesn’t mean he IS a woman.
But there are two kinds of simulation:
a) simulation1, which lacks essential features of what is being simulated. E.g. in simulated death the person is not dead.
b) simulation2, exactly like what is being simulated, but not produced in the standard way (e.g. aircraft in windtunnels).
One could claim that a program that passed the Turing Test was a simulation2.
4. Black box objection.
- This objection is presented by Copeland (1993), and not rejected by him.
- Black box: a device whose inner workings are unknown.
Turing Test: relying only on outward behaviour. But we might look inside the computer, to see how it ‘thinks’.
i.e. it might have a program which just has a list of sentences to be output in response to sentences it encounters. All possibilities covered, and the computer not thinking at all, just matching sentences.
Later you will see more of programs like ELIZA: Joseph Weizenbaum’s program, which plays a therapeutic interviewer. (The computer is in capital letters.)
Men are all alike.
IN WHAT WAY?
They’re always bugging us about something or other.
CAN YOU THINK OF A SPECIFIC EXAMPLE?
Well, my boyfriend made me come here.
YOUR BOYFRIEND MADE YOU COME HERE?
He says I’m depressed much of the time.
I AM SORRY TO HEAR THAT YOU ARE DEPRESSED.
It’s true that I am unhappy.
DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE UNHAPPY?
I need some help, that much seems certain.
WHAT WOULD IT MEAN TO YOU IF YOU GOT SOME HELP?
Perhaps I could learn to get along with my mother.
TELL ME MORE ABOUT YOUR FAMILY.
My mother takes care of me.
WHO ELSE IN YOUR FAMILY TAKES CARE OF YOU?
My father.
YOUR FATHER
You are like my father in some ways.
WHAT RESEMBLANCE DO YOU SEE?
You are not very aggressive, but I think you don’t want me to notice that.
WHAT MAKES YOU THINK I AM NOT VERY AGGRESSIVE?
You don’t argue with me.
WHY DO YOU THINK I DON’T ARGUE WITH YOU?
You are afraid of me.
DOES IT PLEASE YOU TO THINK I AM AFRAID OF YOU?
My father is afraid of everybody.
WHAT ELSE COMES TO MIND WHEN YOU THINK OF YOUR FATHER
Bullies
DOES THAT HAVE ANYTHING TO DO WITH THE FACT THAT YOUR BOYFRIEND MADE YOU COME HERE?
But, when we look at how Eliza works, this conversation seems less impressive.
1. Transform the input to change the point of view, e.g. change ‘you’ to ‘I’ etc.
2. If a pattern matches, then give the stored response.
3. Otherwise give a general purpose response.
Pattern
*need* [can you do without]
e.g. I need some help, that much seems certain.
WHAT WOULD IT MEAN TO YOU IF YOU GOT SOME HELP?
*Mother* [tell me more about your family]
e.g. Perhaps I could learn to get along with my mother.
TELL ME MORE ABOUT YOUR FAMILY.
Eliza lacks:
- a means of recognizing the grammatical structure of language
- a means of converting the user’s query into a representation of the underlying knowledge from which inferences can be drawn.
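The pattern-and-stored-response scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not Weizenbaum's actual script: the rule table `RULES`, the word-swap table `SWAPS`, the function names, and the fallback reply "PLEASE GO ON." are all assumptions made for the example.

```python
import re

# Hypothetical rules: (pattern, response template). Checked in order.
RULES = [
    (re.compile(r"\bmy (.+?) made me (.+?)(?:[.,!?]|$)", re.I), "YOUR {0} MADE YOU {1}?"),
    (re.compile(r"\bI need (.+?)(?:[.,!?]|$)", re.I), "WHAT WOULD IT MEAN TO YOU IF YOU GOT {0}?"),
    (re.compile(r"\b(mother|father)\b", re.I), "TELL ME MORE ABOUT YOUR FAMILY."),
]

# Hypothetical point-of-view swaps applied to the fragments copied from the input.
SWAPS = {"i": "you", "me": "you", "my": "your", "you": "I", "your": "my"}


def swap_viewpoint(text):
    """Change the point of view word by word, e.g. 'my mother' -> 'your mother'."""
    return " ".join(SWAPS.get(word.lower(), word) for word in text.split())


def respond(sentence):
    """Return the stored response of the first matching pattern, else a general reply."""
    for pattern, template in RULES:
        match = pattern.search(sentence)
        if match:
            fragments = [swap_viewpoint(g) for g in match.groups()]
            return template.format(*fragments).upper()
    return "PLEASE GO ON."   # general purpose response (assumed wording)


print(respond("I need some help, that much seems certain."))
print(respond("Well, my boyfriend made me come here."))
print(respond("Bullies"))
```

Nothing in the sketch parses grammar or represents knowledge; it only matches surface strings, which is the point the slide makes about Eliza.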
But notice here the AI theory being snuck in...
- We don’t have any INDEPENDENT evidence that people do assign grammatical structures to what they hear in dialogue,
- nor that they use a great deal of background knowledge when carrying on a superficial conversation.
Consider PARRY, a much better program:
- PARRY had far more to say than ELIZA and wanted to tell you things.
- It modelled a psychiatrically disordered person (PARanoid) and hence it was easy to excuse when it didn’t make sense.
- It remembered what you had said and got ‘annoyed’ and ‘frightened’.
- It was claimed it had passed the TT.
Parry, the paranoid program:
Colby, K.M., Weber, S., Hilf, F.D. (1971) Artificial Paranoia, Artificial Intelligence, 2.
The computer is in capital letters.
Why are you in the hospital?
I SHOULDN’T BE HERE.
Who brought you here?
THE POLICE.
What trouble did you have with the police?
COPS DON’T DO THEIR JOB.
What do they do instead?
COPS ARREST THE WRONG PEOPLE.
They arrest the right ones sometimes.
THE LAW SHOULD GET THOSE ITALIAN CROOKS.
Are you Italian?
MY BACKGROUND IS BRITISH BUT I WAS BORN IN THIS COUNTRY.
Do you know any Italian crooks?
I TRY TO AVOID THE UNDERWORLD.
Are you afraid of the underworld?
THE MAFIA ARE THE WORST GANGSTERS.
Are you connected with the Mafia?
YOU KNOW, THEY KNOW ME.
How do they know you?
THEY SPY ON ME.
Why do they?
THEY WANT TO GET FALSE EVIDENCE TO FRAME ME.
What do they have against you?
THEY ARE OUT TO GET ME.
What for?
I HAVE INCRIMINATING EVIDENCE AGAINST THE MAFIA.
Are you sure of all this?
YOU DON’T BELIEVE ME?
I am doing my best.
…silence
CAN’T YOU THINK OF A DECENT QUESTION TO ASK?
You seem hostile to me.
SOME OF YOUR REMARKS ANNOY ME.
I am sorry.
FORGET THE EXCUSES, YOU GUYS ARE ALL ALIKE.
Colby’s PARRY
- Perhaps best ever performance, many users, robust, but not a normal subject (i.e. a paranoid)
- primitive individual models, some control of dialogue process; but it had lots to say!
- primitive simulation of intentionality
- not syntax but fast pattern matching
- much better than ELIZA
How is this achieved in PARRY?
- WHAT IS YOUR MAIN PROBLEM
_______________________________
- WHAT BE YOU MAIN PROBLEM
- BE YOU MAIN PROBLEM
- WHAT BE MAIN PROBLEM
- WHAT BE YOU PROBLEM
- WHAT BE YOU MAIN
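One reading of the example above is that the input is first reduced to canonical words (IS becomes BE, YOUR becomes YOU) and then tried against stored patterns with one word left out at a time. The Python sketch below follows that reading only; it is not Colby's code. The table `CANON`, the function names, and the stored pattern in the usage lines are hypothetical, and the sketch generates every single-word deletion, which is one more variant than the slide lists.

```python
# Hypothetical canonical-form table: surface word -> canonical word.
CANON = {"IS": "BE", "ARE": "BE", "AM": "BE", "YOUR": "YOU"}


def canonicalise(sentence):
    """Upper-case the input, drop end punctuation, and map words to canonical forms."""
    words = sentence.upper().strip("?.! ").split()
    return [CANON.get(word, word) for word in words]


def one_word_deletions(words):
    """All variants of the word list with exactly one word removed."""
    return [words[:i] + words[i + 1:] for i in range(len(words))]


def match(sentence, known_patterns):
    """Try the full canonical form first, then each one-word deletion, against stored patterns."""
    words = canonicalise(sentence)
    for candidate in [words] + one_word_deletions(words):
        key = " ".join(candidate)
        if key in known_patterns:
            return known_patterns[key]
    return None


# Usage with a made-up pattern store: the input matches once MAIN is dropped.
patterns = {"WHAT BE YOU PROBLEM": "ask_about_problem"}
print(match("What is your main problem?", patterns))
```

Because only table lookups are involved, this kind of matching is fast and tolerant of extra words, which fits the slide's "not syntax but fast pattern matching".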
Loebner Test [Turing test + money] Competition started in 1991 by Hugh Loebner. Judges type questions in at terminal: replied to either by human or a computer contestant. Judges rate responses according to how human-like they appear: Prize of 2000 dollars to most human-like. Grand prize of 100,000 dollars if a program does better than one of the humans.
1997 winner: Intelligent Research Ltd, designed and largely programmed by members of Natural Language Processing in this Department.
Fragment of winning transcript
Program: hi, my name is Catherine, what’s yours?
Judge2: My name is Nathan.
Program: Well, its nice to meet you Nathan. Did you see that story on CNN last night about the lesbian couple who came out at a White House party on Sunday.
Judge2: No. I’m sorry I missed that. How did the Prez take it?
Program: Ellen Degeneres was one of them - she was kissing her lover in front of Bill Clinton. Do you think he was very embarassed?
Judge2: You tell me, I didn’t see it.
Program: Are you still a friend of Bill Clinton
Judge2: Yes. I would like to talk about something else. Do you know anything about sculpture.
Program: Well I suppose he’s doing his best like most of us, but he does seem to attract an UNUSUAL amount of bad publicity, wouldn’t you say -- can it
Judge2: Do you know what sculpture is?
Program: Well you’re more of a fan of his than I am I guess --- though I can’t SWEAR what I’d have done in that locked hotel room, I must say.
SOME DAYS IT’S BETTER THAN OTHERS AT CHANGING THE SUBJECT!
Maybe the idea of Turing Test should be abandoned. Reason 1: Unitary notion of ‘intelligence’ too simplistic. Too simplistic to think that it is useful to assess whether computers possess ‘intelligence’, or the ability to think. Better to break down this question into smaller questions. - similar to idea that unitary measure of intelligence (ie intelligence as measured by IQ tests) is not very useful - better to have tests that reveal the relative strengths and weaknesses of individuals. Could assess computers in terms of more specific abilities; eg ability of robot to navigate across a room, ability of computer to perform logical reasoning, metaknowledge (knowledge of own limitations).
Reason 2: Too anthropocentric. Too anthropocentric to insist that a program should work in the same way as humans. Dogs are capable of cognition, but would not pass the Turing Test. Still, producing a machine with the cognitive and communicative abilities of a dog would be (another) challenge for AI. But how can we NOT be anthropocentric about intelligence? We are the only really intelligent things we know, and language is closer to our intelligence than any other function we have…?
Perhaps for now (till opening heads helps) behaviour is all we have.
- Increasingly complex programs mean that looking inside machines doesn’t tell you why they are behaving the way they are.
- Those who don’t think the TT effective must show why machines are in a different position from our fellow humans (i.e. not from OURSELVES!). Solipsism again.
Turing Test (as now interpreted!) suggests that we base our decision about whether a machine can think on its outward behaviour, and whether we confuse it with humans.
Concept of Intelligence in humans
We talk about people being more or less intelligent. Perhaps examining the concept of intelligence in humans will provide an account of what it means to be intelligent.
What is intelligence? Intelligence is what is measured by intelligence tests.
Potted history of IQ tests
Early research into individual differences:
1796: an assistant at Greenwich Observatory was recording when stars crossed the field of the telescope. He consistently reported observations eight-tenths of a second later than the Astronomer Royal. Discharged! It was later realized that observers respond to stimuli at different speeds: the assistant wasn’t misbehaving, he just couldn’t do it as quickly as the Astronomer Royal.
Francis Galton, in the latter half of the 19th century: interested in individual differences. He developed measures of keenness of senses and of mental imagery: early precursors of intelligence tests. Found evidence of genius occurring often in certain families.
Stanford-Binet IQ test
Alfred Binet tried devising tests to find out how “bright” and “dull” children differ. His aim was educational: to provide appropriate education depending on the ability of the child. Emphasis on general intelligence. Idea of quantifying the amount of intelligence a person has.
The Stanford-Binet test makes use of the concept of mental age versus chronological age. Intelligence quotient (IQ) is produced as the ratio of mental age to chronological age. Items in the test are age-graded, and mental age corresponds to the level achieved in the test. A bright child’s mental age is above his or her chronological age; a slow child’s mental age is below his or her chronological age.
Move of emphasis from general to specific abilities
World War 1: US test ‘Army Alpha’. Tested simple reasoning, ability to follow directions, arithmetic and information. Used to screen thousands of recruits, sorting into high/low/intermediate responsibilities.
Beginning of measures of specialized abilities: realisation that a rating on a single dimension is not very informative, i.e. different jobs require different aptitudes.
e.g. 1919 Seashore: Measures of Musical Talent. Tested ability to discriminate pitch, timbre, rhythm etc.
1939: Wechsler-Bellevue scale: goes beyond composite performance to separate scores on different tasks, e.g. mazes, recall of information, memory for digits etc. Items divided into performance scale and verbal scale.
e.g. Performance item:
Block design: pictured designs must be copied with blocks; tests ability to perceive and analyse patterns. Verbal item: Arithmetic. Verbal problems testing arithmetic reasoning.
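The ratio IQ described for the Stanford-Binet test can be written as a one-line calculation. The sketch below assumes the conventional scaling in which the ratio of mental age to chronological age is multiplied by 100, so that a child performing exactly at age level scores 100; the slide itself only says "ratio". The function name `ratio_iq` is made up for the example.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age divided by chronological age, times 100 (conventional scaling)."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return 100 * mental_age / chronological_age


# A bright child (mental age above chronological age) scores above 100,
# a slow child (mental age below chronological age) scores below 100.
print(ratio_iq(10, 8))
print(ratio_iq(8, 8))
print(ratio_iq(6, 8))
```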
Nature of Intelligence
Binet and Wechsler: assumed that intelligence is a general capacity.
Spearman: also proposed that individuals possess a general intelligence factor g in varying amounts, together with specific abilities.
Thurstone (1938): believed intelligence could be broken down into a number of primary abilities. Used factor analysis to identify 7 factors:
- verbal comprehension
- word fluency
- number
- space
- memory
- perceptual speed
- reasoning
Thurstone devised a test based on these factors: the Test of primary mental abilities. But the predictive power of the Test of primary mental abilities was no greater than for the Wechsler and Binet tests, and several of these factors correlated with each other.
IQ tests: provide one view of what intelligence is. The history of intelligence testing shows that our conception of what intelligence is is subject to change. Change from assuming there is a general intelligence factor, to looking at specific abilities. But the emphasis is still on quantification, and measuring how much intelligence a person possesses: it doesn’t really say what intelligence is. Specific and general theories seem to have similar predictive abilities about individual outcomes.
Try this right now: PICK OUT THE ODD ONE
- Cello
- Harp
- Drum
- Violin
- Guitar
Limitations of ability tests:
1. IQ scores do not predict achievement very well, although they can make gross discriminations. The predictive value of tests is better at school (correlation between .4 and .6 between IQ scores on Stanford-Binet and Wechsler and school grades), but less good at university.
Possible reasons for poor prediction:
- Difficult to devise tests which are culturally fair, and independent of educational experience. E.g. pick one word that doesn’t belong with the others: cello, harp, drum, violin, guitar. Children from higher income families chose ‘drum’; those from lower income families picked ‘cello’.
- Tests do not assess motivation or creativity.
2. Human-centred: animals might possess an intelligence, in a way that a computer does not, but it is not something that will show up in an IQ test.
3. Tests are only designed to predict future performance; they do not help to define what intelligence is. But again, the search for definitions is rarely helpful.