[math-fun] Turing test passed? Another sucker born every minute
Stuart Anderson: http://www.smh.com.au/digital-life/computers/super-computer-first-to-pass-tu...

--software emulates a 13-year-old Ukrainian boy named Eugene Goostman and supposedly convinced 33% of judges he was human. The story did not say how many judges interviewed him, but at most 30. I tried clicking the "try Goostman yourself" link at the bottom of the page, leading to http://www.princetonai.com/bot/ and/or http://www.princetonai.com/bot/bot.jsp and/or http://default-environment-sdqm3mrmp4.elasticbeanstalk.com/bot/, and the result was nothing; all three just hung. It was unclear from the story whether the judges spoke Ukrainian, or whether Goostman supposedly semi-knew English, or what.

Somewhat more illuminating:
https://en.wikipedia.org/wiki/Eugene_Goostman
http://www.independent.co.uk/life-style/gadgets-and-tech/computer-becomes-fi...

I'm pretty sure a goodly number of the judges (maybe all) did not speak Ukrainian. Wikipedia also mentions some previous examples of "passing Turing tests," such as 2011, when Cleverbot tricked 59% of judges, and 1991, when "PC Therapist" tricked 5 out of 10 judges. https://en.wikipedia.org/wiki/Cleverbot

You can converse with Cleverbot here: http://www.cleverbot.com/
I typed: Which is more important, death or a yellow clay brick?
The answer came back: Truth.
For confirmation I tried: Which is larger, a star or the statue of liberty?
Answer: They are both about the same size.
Game over... total elapsed time, about 15 seconds.

I don't care what the newspapers say; I'm not going to believe any computer can fool me for 5 minutes until I actually experience it.

Also, the story says Google has technology that can solve "CAPTCHA" so-called Turing tests (image recognition for disguised alphanumeric characters), and that blind people are infuriated by CAPTCHAs anyhow, since they have to rely on automated screen-reading software, which CAPTCHAs block.
On the other hand, my computer fools me all the time!

R.

On Mon, 9 Jun 2014, Warren D Smith wrote:
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com https://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
http://www.theguardian.com/technology/2014/jun/09/eugene-person-human-comput... This story has some actual transcripts. Suppose you had to decide whether the judge was a "serious professional Turing contest judge" or an "idiot" just based on reading the transcript...
Reading the transcripts, it seems to me that the people running this test didn't understand how it was supposed to work, as described by Turing.

As described by Turing: The judge, in their half of the conversation, does their best to trip up the computer, saying things and asking questions that would be difficult for a computer to respond to, making it easier to distinguish the computer from the human. The human talking to the judge does their best to help the judge distinguish the human from the computer, by saying things that the judge would find it unlikely for a computer to say.

As practiced, judging from the transcripts: The judge makes conversation, following the lead of the person/computer they are talking to, making it as easy as possible for the computer to operate in its "comfort zone," where it can imitate a human well. The human talking to the judge is relatively passive, participating in the conversation but making no attempt to say things a computer wouldn't say in order to make it easy for the judge to decide.

Even given this much easier task, the transcripts I saw seem obviously computer-generated: despite the fact that the judges got it wrong a third of the time, I would happily take bets at 10-1 on my ability to distinguish a human from a chatbot. I don't know whether the judges were inept, or actually voted for what they knew was the computer when they thought it did an impressive job even if they weren't fooled, but the claim that they were experts seems like nonsense. And how is this judged, anyway? Is there a formal certification program for Human-Computer-Distinguishing Judges, or should "self-proclaimed" be inserted before the word "expert" wherever it occurs?
Watson didn't pass a Turing test, but I find its performance truly impressive: the questions it read and understood were not written to be understood by a computer, and were in fact written in the unusual playful and punny style that Jeopardy uses. That Watson was able to understand these questions well enough to answer them is real progress in language understanding. The performance on this rigged "Turing test" I find much less impressive.

Andy

On Mon, Jun 9, 2014 at 2:52 PM, Warren D Smith <warren.wds@gmail.com> wrote:
-- Andy.Latto@pobox.com
Although it's never mentioned anymore, the actual test that Turing proposed was that a man and a computer would each pretend to be a woman in a conversation with the judge. If the computer could fool the judges as well as the man could, that would be a mark of intelligence. The test was perhaps indicative of Turing's thoughts about sexual identity.

Brent

On 6/10/2014 7:37 AM, Andy Latto wrote:
On 9 Jun 2014 at 14:07, Warren D Smith wrote:
Stuart Anderson: http://www.smh.com.au/digital-life/computers/super-computer-first-to-pass-turing-test-convince-judges-its-alive-20140608-zs1bu.html
--software emulates a 13 year old Ukrainian boy named Eugene Goostman, supposedly convinced 33% of judges he was human. Did not say how many judges interviewed him, but at most 30.
And there was a sort-of-funny predecessor to this "Turing test passed" from a very long time ago. Danny never sent me a copy of the SIGART newsletter that he wrote it up in, but it did happen. There was this early incident I was involved with:

Date: Sat, 28 Jan 89 0:22:59 EST
From: Bernie Cosell <cosell@WILMA.BBN.COM>
Subject: Re: ELIZA and Joe Weizenbaum

} > Or, there's the story about the guy who falls asleep in front of his
} > terminal with an ELIZA program running and his boss logs on and thinks he's
} > talking to him but is actually talking to the program, and gets pissed off.
}
} This may have actually happened. Joseph Weizenbaum (MIT professor, author of
} _Computer Power and Human Reason_) told the anecdote in a class, with himself
} as one of the actors. It went something like this -- some of this is
} doubtless my own memory inventing things. The dialogue is partially courtesy
} of GNU Emacs' Eliza program, and the rest is made up.
}
} .... anecdote follows...

Is that for real, that Joe is telling that story? He has a lot of anecdotes, many of which appear in CP&HR, but I didn't know he was including one like that these days (although such a thing must have SURELY happened some time or other at MIT).

The REAL first round of that anecdote dates publicly to a small bit Danny Bobrow wrote in the first issue of some AI journal he started in something like 1968. The thing DID happen, although not quite as the word-of-mouth has transmitted it down to the present generation. The program in question was _DOCTOR_, **NOT** Eliza, and it happened at BBN, not at MIT. I know all of this, because (Ta DAAH!) **I** wrote the original Doctor!

Not _Eliza_ --- _doctor_: Weizenbaum's CACM article on Eliza had just appeared, and for a variety of reasons I was looking for a neat Lisp hack to play with. The CACM article mostly told me enough, and I went off and wrote the thing.
I can supply the details of the *real* "A Turing Test Passed" incident (the title of Danny Bobrow's article describing the event). It involved my version of doctor, which I had left running for people to play with to help me get it debugged; the "innocent third party," Danny Bobrow; and the Turing Testee, a random executive (whose name I will not reveal) who thought (for reasons that it is hard to figure out) that the Mod-33 was connected through to Danny at home early on a Saturday morning. I can supply more details if anyone really cares, including (if I can dig the thing out of my archives) a copy of Bobrow's article about the incident, which included the *real* typescript (Danny came in later that Saturday morning and ripped it off of the terminal).

Bernie Cosell, BBN Sys & Tech, Cambridge, MA 02238

--------------------------------------------------------------
Date: Mon, 30 Jan 89 02:26 EST
From: KROVETZ@cs.umass.EDU
Subject: ELIZA and Joe Weizenbaum

Bloom's story about Weizenbaum and ELIZA is close, but the person involved was Daniel Bobrow, not Weizenbaum. The account was described in the December 1968 issue of the SIGART Newsletter, and was reprinted in the book Machines Who Think, by Pamela McCorduck (p. 225). The following is the description of the account:

"One Saturday morning about 9 A.M., a BBN Vice-President in charge of selling our Telcomp commercial service arrived at our PDP-1 computer room hoping to be able to use the computer for a demonstration. He found a note on the PDP-1 teletype that read: `TYPE ALL MESSAGES INTO THIS TELETYPE, AND END THEM WITH A PERIOD'. This note was left there by one of the programming bums who had just succeeded in making run a Lisp version of the DOCTOR program (based on ELIZA as developed by Weizenbaum). However, the VP thought that this note indicated that this teletype was linked with the one in my [Bobrow's] home, since I often used the computer all night.
Therefore, he started typing to `me', but in fact was typing to this version of the DOCTOR program. The following conversation ensued:

VP: Do you think that I can use the computer this morning?
TELETYPE: Why do you ask?
VP: I might be able to make some additional telcomp sales.
TELETYPE: Why aren't you sure?
VP: My prospective customers have never seen a demonstration of the system.
TELETYPE: Does this mean very much to you?
VP: Of course it does.
TELETYPE: You seem very positive of that.
VP: This is ridiculous.
TELETYPE: What do you mean this is ridiculous?
VP: Is there any reason why I can't use the computer now?
TELETYPE: Tell me more about your prospective customers.
VP: Please dial me up on 491-1850

Note that after that remark the VP did not obey instructions and left out the period. Therefore, of course, the computer didn't answer him. This so infuriated the VP, who thought I was playing games with him, that he called me up, woke me from a deep sleep, and said:

VP: Why are you being so snotty with me?
BOBROW: What do you mean why am I being snotty to you?

The VP angrily read the dialog that `we' had been having, and couldn't get any response but laughter from me. It took me a while to convince him it really was the computer."

Bob Krovetz
krovetz@cs.umass.edu or krovetz@umass.bitnet

/Bernie\
--
Bernie Cosell, Fantasy Farm Fibers
mailto:bernie@fantasyfarm.com
Pearisburg, VA
--> Too many people, too few sheep <--
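The DOCTOR behavior in that transcript comes from simple keyword matching plus canned response templates, the technique Weizenbaum described in his 1966 CACM article. Here is a minimal sketch in Python; the specific rules are my own illustrative guesses, tuned to reproduce a few lines of the transcript, and are not Cosell's actual Lisp script or Weizenbaum's ELIZA script:

```python
import random
import re

# Keyword rules tried in order; each maps a pattern to candidate replies.
# {0} is filled from the pattern's first capture group, if any.
# NOTE: these rules are illustrative guesses, not the historical script.
RULES = [
    (re.compile(r"\bI might\b", re.I), ["Why aren't you sure?"]),
    (re.compile(r"\bthis is (.*?)[.!?]*\s*$", re.I),
     ["What do you mean this is {0}?"]),
    (re.compile(r"\bof course\b", re.I), ["You seem very positive of that."]),
    (re.compile(r"\?\s*$"), ["Why do you ask?"]),
]

# Content-free fallbacks when no keyword matches.
DEFAULTS = ["Tell me more.", "Please go on.", "Does this mean very much to you?"]

def respond(line):
    # Like the note on the BBN PDP-1 teletype, insist that every message
    # end with terminal punctuation; otherwise stay silent -- which is
    # exactly what infuriated the VP after "Please dial me up on 491-1850".
    if not line.rstrip().endswith((".", "!", "?")):
        return None
    for pattern, answers in RULES:
        match = pattern.search(line)
        if match:
            return random.choice(answers).format(*match.groups())
    return random.choice(DEFAULTS)
```

The point of the sketch is how little machinery is needed: a handful of keyword-triggered templates plus vague fallbacks ("Tell me more.") already produces the kind of exchange that fooled the VP, as long as the human keeps supplying the conversational topic.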
participants (5)
- Andy Latto
- Bernie Cosell
- meekerdb
- rkg
- Warren D Smith