Thursday, February 17, 2011

Watson on Jeopardy: Game 2 errors

The second Jeopardy game between IBM's Watson computer and two human competitors (Ken Jennings and Brad Rutter) was broadcast Wednesday night, Feb 16.

As I did in a previous post, I'm providing a list of the questions that stumped Watson in one way or another. Note that I include questions where Watson failed to buzz in because of a lack of confidence, but not those where it was simply late on the buzzer but otherwise had the correct answer.

Yesterday I also discussed two questions from the broadcast which provided no information on Watson's guesses, and I speculated on how well Watson might have done on those if given a chance.

In future posts, I plan to discuss some of the more interesting errors from these tables.



Watson's confidence level is given as a percentage after each potential answer.


Questions that Watson blew completely

complete confidence in the wrong answer,
and the right answer was nowhere in its top three choices:
| Category | Clue | Correct Answer | Watson's Answers |
| --- | --- | --- | --- |
| Also on Your Computer Keys (400) | A loose-fitting dress hanging straight from the shoulders to below the waist | shift | !*Chemise 96%; ?*Skirt 12%; ?*Blouse 10% |
| Also on Your Computer Keys (800) | It's an abbreviation for Grand Prix auto racing | F1 | !*gpc 57%; ?*NASCAR 11%; ?*QED 7% |

Questions that Watson screwed up
complete confidence in the wrong answer,
but the right answer was one of its top three choices:
| Category | Clue | Correct Answer | Watson's Answers |
| --- | --- | --- | --- |
| EU, The European Union (1000) | As of 2010, Croatia & Macedonia are candidates but this is the only former Yugoslav republic in the EU | Slovenia | !*Serbia 61%; ?*Montenegro 15%; ?Slovenia 12% |
| Magical Mouse-tery Tour (2000) | Maurice LaMarche found his inner Orson Welles to voice this rodent whose simple goal was to take over the world | the Brain | !?Pinky and the Brain 63%; ?*Ed Wood 10%; ?*capybara 10% |
| Legal "E"s (800) | One definition of this is entering a private place with the intent of listening secretly to private conversations | eavesdropping | !?eavesdropper 79%; ?eavesdropping 49%; ?*Private eye 10% |

I am counting these last two as wrong answers by Watson, since only "the Brain" is correct, not "Pinky", in the first case, and "eavesdropper" is the wrong form of the right answer in the second. Thus, both show confidence in an incorrect answer when the right answer was at Watson's fingertips.
Questions where Watson had no clue
no confidence in any choice, and no sense of the right answer
| Category | Clue | Correct Answer | Watson's Answers |
| --- | --- | --- | --- |
| Magical Mouse-tery Tour (1600) | The samplefest "The Grey Album" & the band Gnarls Barkley are 2 projects of Brian Burton, aka this | Danger Mouse | ?*Cee-Lo Green 20%; ?*producer 13%; ?*singer 11% |
| What to Wear (800) | A bit longer than a cocktail dress, one hemmed to end at the shins is this beverage "length" | tea | ?*Skirt 20%; ?*Gown 13%; ?*knee 10% |
| Breaking News (400) | It was 103 degrees in July 2010 & Con Ed's command center in this N.Y. borough showed 12,963 megawatts consumed at 1 time | Manhattan | ?*New York City 23%; ?*Brooklyn 23%; ?*Queens 10% |
| One Buck or Less (200) | On December 8, 2008 this national newspaper raised its newsstand price by 25 cents to $1 | USA Today | ?*The Boston Herald 23%; ?*Chicago Tribune 17%; ?*Los Angeles Times 12% |
| One Buck or Less (400) | The USPS cost for mailing this, a minimum of 3 1/2 X 5 inches, is 28 cents; Wish you were here! | postcard | ?*Envelope 18%; ?*extremum 13%; ?*Post office 7% |
| One Buck or Less (1000) | A 15-ounce VO5 Moisture Milks conditioner from this manufacturer averages a buck online | Alberto | ?*Butter 40%; ?*CCM 11%; ?*dry 7% |

Questions where Watson didn't know it had a clue
right answer not the top choice, no confidence in any choice
| Category | Clue | Correct Answer | Watson's Answers |
| --- | --- | --- | --- |
| What to Wear (1600) | If you're wearing Wellingtons at Wimbledon, you're wearing these | rainboots or galoshes | ?*white clothing 20%; ?*panties 14%; ?boots 10% |
| Also on Your Computer Keys (600) | Football position that can be split or tight | end | ?*Linebacker 20%; ?*Fullback 13%; ?End 12% |
| Nonfiction (1200) | The New Yorker's 1959 review of this said in its brevity & clarity it is "unlike most such manuals, a book, as well as a tool" | The Elements of Style | ?*Dorothy Parker 14%; ?*Monster 14%; ?The Elements of Style 10% |
| What to Wear (400) | This plain-weave, sheer fabric made with tightly twisted yarn is also used to describe a pie or cake | chiffon | ?*Twill 37%; ?Chiffon 31%; ?*Silk 10% |
| Also on Your Computer Keys (1000) | An additional section placed within the folds of a newspaper | insert | ?*Broadsheet 16%; ?insert 12%; ?*supplement 9% |
| EU, The European Union (400) | The Schengen agreement removes any controls at these between most EU neighbors | national borders | ?*passport 33%; ?Border 14%; ?*Austria 8% |
| Dialing for Dialects (400) | Dialects of this language include Wu, Yue & Hakka | Chinese | ?*Cantonese 41%; ?Chinese 20%; ?*Xiang 10% |

Questions where Watson almost had a clue
right answer was the top choice, but not enough confidence:
| Category | Clue | Correct Answer | Watson's Answers |
| --- | --- | --- | --- |
| One Buck or Less (600) | In 2002 Eminem signed this rapper to a 7-figure deal, obviously worth a lot more than his name implies | 50 Cent | ?50 Cent 39%; ?*Marshall Mathers 20%; ?*Dr. Dre 14% |
| One Buck or Less (800) | 99 cents got me a 4-pack of Ytterlig coasters from this Swedish chain | Ikea | ?Ikea 39%; ?*Blimpie 10%; ?*Ninety-nine 9% |
| U.S. Geographic Nicknames (1600) | It's known as both "The Steel City" & "The Iron City" | Pittsburgh | ?Pittsburgh 40% ~ 59%; ?*Port 20% ~ ?*Jamshedpur 27%; Bethlehem 18% ~ 11% |
| Also on Your Computer Keys (200) | Proverbially, it's "where the heart is" | home | ?Home is Where the Heart Is 20%; ?*delete key 11%; ?*encryption 8% |
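
The five groupings in this post come down to three yes/no questions about Watson's top three guesses: was it confident enough in its top guess to buzz, was that top guess right, and was the right answer anywhere in the top three? Below is a minimal Python sketch of that sorting rule. The names `Guess` and `classify` are my own, and the 50% `buzz_threshold` is purely an illustrative assumption; the tables don't say what threshold Watson actually used. Answers I counted as wrong for being in the wrong form (like "eavesdropper") are simply marked `correct=False`.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Guess:
    answer: str
    confidence: float  # percentage, 0-100, as shown in the tables
    correct: bool      # judged by hand; wrong-form answers count as False


def classify(guesses: Sequence[Guess], buzz_threshold: float = 50.0) -> str:
    """Sort a clue into one of the post's groupings.

    `guesses` must be ordered from most to least confident.
    `buzz_threshold` is a hypothetical cutoff, not Watson's real one.
    """
    if not guesses:
        raise ValueError("need at least one guess")

    top = guesses[0]
    confident = top.confidence >= buzz_threshold
    right_in_top_three = any(g.correct for g in guesses[:3])

    if confident:
        if top.correct:
            return "answered correctly"  # not an error, so not in the tables
        return "screwed up" if right_in_top_three else "blew completely"

    if top.correct:
        return "almost had a clue"
    return "didn't know it had a clue" if right_in_top_three else "had no clue"


# Example: the Slovenia clue, where the right answer was third on the list.
slovenia = [
    Guess("Serbia", 61, False),
    Guess("Montenegro", 15, False),
    Guess("Slovenia", 12, True),
]
print(classify(slovenia))
```

With a different threshold some clues would move between groups, so the cutoff is worth treating as a knob rather than a fact about Watson.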
