Team Human vs. Team Watson : Round III

Well, it seems the machines have won. Select your pod early; you’ll want to get a good view of the energy harvesting machines.

We were talking about Watson around the ol’ AI research lab today and someone pointed out that Watson is yet another highly tailored solution to a particular problem, just like Deep Blue (click it, it’s Arcade Fire, just CLICK IT) was for chess. It’s using a lot of brute force and some reasoning, but it’s still not solving the same problem humans are, and the domain is somewhat restricted.

Now the interesting difference is that whereas chess is a deterministic game where you can search for an optimal strategy, Jeopardy has layers of uncertainty hidden behind human language and the behaviour of other players. So while it’s not a very realistic setting for general AI, and doesn’t claim to be, it has stepped over an important threshold from deterministic, logic-based problems to ones that require statistics and reasoning under uncertainty.

This is very fitting, as the field of AI research itself has gone through the same change in focus in the past 20 years, as outlined very well by Peter Norvig recently. When I took my undergraduate AI classes in the 90s I fell in love with Prolog and logical planning. That’s why I went into AI research later.

When I got to grad school I found out that during my undergrad AI courses I had been missing a renaissance that had been occurring which led to modern machine learning and probabilistic AI. Watson’s achievement is only possible with these new methods and the raw computing power increases we have had over the same period.

But apparently it did also have one other advantage. As many people have speculated, the machine did seem to have a buzzer advantage. According to an op-ed by Ken Jennings himself, Watson’s speed with the buzzer was decisive in making up for questions it got wrong. Is this just sour grapes? Maybe just a little; you need some ego to be an intense competitor like Jennings, but I think he has a point. As I pointed out yesterday, the reaction time between making the decision to buzz and registering a button press is something a machine can clearly be faster at. Is this what winning at Jeopardy means?

It shouldn’t be.

Winning should mean the ability to answer complex questions, with ambiguous meanings, under time pressure while making the best strategic betting choices.  That is the task Watson performed admirably at. It could have had a buzzer delay and read the screens with computer vision rather than receiving a text file to parse and perhaps it still would have won.

But we’ll never know now.

So you won this round Watson. And you’re impressive (well, the engineering team that built ‘you’ is impressive actually). Hopefully everyone has learned a bit about AI and hopefully some young girls or boys will be inspired to consider computer science or engineering that otherwise wouldn’t have.

But next year…next year you should come back and put it all on the table. Play it our way, the human way; you have the capability to at least try. And may the best machine, be they biological or electronic, win.

Team Human vs. Team Watson : Round II

Since Watson is doing so well there has been some confusion about what is actually going on as we watch the game. There’s been some talk that the Jeopardy challenge is unfair, and it’s true, it is, but both sides have some unfair advantages. Here’s how it stands as far as I understand it.

Team Human Advantages (#teamhuman)

  • Using the most advanced computational system ever encountered, which has been under intense development for millions of years: the Human Brain. It has more raw computational power than Watson, can handle almost infinitely more parallelization and dynamic linking, is incredibly robust to new information and has pattern recognition heuristics which we are only barely beginning to comprehend. It’s hard to overestimate how big an advantage this is, and it’s hard to measure, which is why this seems like a good problem for AI research.
  • They understand language – Watson does not understand language at all. It knows some things about language patterns and has learned how to match words and phrases for answers to other words and phrases which are questions for Jeopardy, and only Jeopardy. If a topic shows up which it has not seen much then it does not know what to do. Since it doesn’t understand language it can’t make the kind of leaps of reasoning that the humans can. This is why the topics are restricted to types of topics that have shown up on Jeopardy before, nothing that requires Trebek to explain the meaning of the question.

Watson Advantages (#teamwatson)

  • A huge memory database of facts which are relevant to Jeopardy questions – it’s hard to say if this is more facts than Ken Jennings has in his head. It’s represented very differently, and humans have amazing heuristics for accessing data quickly and linking it together. But perhaps Watson has an edge here.
  • A totally focussed system designed and optimized for years just to play Jeopardy (against most of us this is an advantage, against Ken Jennings and … it’s questionable who has spent more time training for these games)
  • Question sent as a text file – this really could be a bit of an advantage, although the computer still needs to scan the text, parse it and analyze it. The humans need to analyse the visuals and simultaneously listen to Alex Trebek read the question; of course, humans are very good at that kind of thing and Watson is not.
  • No video questions – ya, that’s just not on, you want to see a machine fail at something? I’ll show you my toaster trying to cook lasagna, we’re just not there yet.
  • Button pressing – it’s not clear exactly how Watson’s button actuator works. And how does Watson know it is allowed to press the button if it’s not listening to Trebek? It’s possible it has an ‘unfair’ fast reaction time compared to the humans, I don’t know.

So while this is a fascinating challenge and should demonstrate to everyone how far Artificial Intelligence research has come, everyone should keep in mind that Watson is not Data or Skynet, in fact it’s not even Wall-E.

Watson has been trained to play Jeopardy. It has the ability to answer questions and find data in a much more natural manner than was possible even 5 years ago. But it is not playing the same game that Ken Jennings and Brad Rutter are. Maybe next year.

My advice for next year’s challenge

Oh you know there will be one, Jeopardy ratings are through the roof! The engineers at IBM should make efforts to remove these complaints of unfairness in the following ways:

  • (1) Let us see Watson’s button – Ahem, you know what I mean. Put a robot out there or something more visual to let Watson press the button. Also, do some work with people who understand the human body to make sure it isn’t unfairly fast. How long does it take for a human being to physically press a button from the moment they ‘decide’ to press it? It may seem like a handicap to add this delay to Watson but it really would seem more fair. After all, we want to know that Watson is winning on the question answering part of the game, not these physical details, so remove them as issues.
  • (2) Give Watson a camera – Watson really could visually parse the questions to know what they are. At the beginning of the round it would scan the topics visually and build a database to start planning.  This shouldn’t be hard as visual text analysis is quite advanced, it wasn’t added because it was a needless complication of an engineering problem. But the optics, excuse the pun, are not good in terms of fairness.
  • (3a) Get rid of all talking – Lock each player in a room where they can’t hear Alex Trebek and they all just read the questions. When another contestant answers, the answer should be sent via text file to all the other players. Then it would be fair…but boring and weird.
  • OR
  • (3b) Give Watson speech recognition – This will be a real problem as speech recognition is one of those areas that really has turned out to be harder than anyone imagined. Vision? No Problem. Text analysis? Give me enough text and I will move the Earth. Robotic control? Are you kidding? Easy. But understanding human speech transmitted through a vibrating air column, damn that’s hard.

    But they should do it. They could use the latest technology available for this, which IBM is generally recognized to lead anyways, and just let it be the machine’s Achilles heel. It could train on Alex Trebek’s Canadian accent (no French Alex!). It could train on its opponents, and it could nail the common and simple prompts that the host gives when it is someone’s turn to play or time to break for commercials. It could just ignore Trebek reading the question out except for the cue that it is time to press the button. It may still not do much better at taking advantage of wrong answers by opponents and it would likely make some entertaining mistakes, but it would make it more realistic. Paradoxically it might get more credit for losing in this way than it will for winning the way it is currently set up.
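
To make the button handicap in suggestion (1) concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the function name, the player labels and all the millisecond values are made up, not measurements of Watson or of any contestant. The only point is that the winner of the buzz depends on decision time plus physical press time, so an added delay can change who gets in first.

```python
from typing import Dict


def first_to_buzz(decision_ms: Dict[str, float], press_delay_ms: Dict[str, float]) -> str:
    """Return the player whose button press registers first.

    decision_ms: time at which each player decides to buzz (hypothetical).
    press_delay_ms: time from deciding to the press registering (hypothetical).
    Ties go to whichever player appears first in decision_ms.
    """
    press_times = {
        player: decision_ms[player] + press_delay_ms.get(player, 0.0)
        for player in decision_ms
    }
    return min(press_times, key=press_times.get)


# Made-up numbers: the human anticipates the cue and decides slightly earlier.
decisions = {"watson": 120.0, "human": 100.0}

# Without a handicap the machine's actuator is assumed to be much faster.
no_handicap = first_to_buzz(decisions, {"watson": 10.0, "human": 150.0})

# With a human-like press delay added to the machine.
with_handicap = first_to_buzz(decisions, {"watson": 150.0, "human": 150.0})

print(no_handicap, with_handicap)
```

With these invented numbers the machine wins the buzz only when its press delay is left unmatched, which is the kind of physical detail the handicap is meant to take off the table.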

So if humanity loses tonight, don’t fret. This machine is just a step along the way and sometimes things aren’t as smart as they first seem. Then again, you could also say it’s holding back by not even trying to do everything at once. Would you be more scared/impressed if Watson really did everything its human opponents did and still fared well?

On to round three.

Team Human vs Team Watson : Round I

Just in case watching a computer play humans on Jeopardy wasn’t your idea of a romantic evening last night, here’s my summary of what happened.

They spent a lot of time explaining the way Watson was built and even gave a high level discussion of the fact that the system maintains a belief distribution over possible answers.  When the computer answers a question we see its top three picks with the probability weight on each answer and the threshold for buzzing in.
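
As a rough illustration of what that on-screen display implies, here is a minimal Python sketch. It is not Watson’s actual method: the candidate scores, the threshold values and the function names are all hypothetical, and it simply assumes the candidate answers carry non-negative scores that can be normalized into a distribution.

```python
from typing import Dict, List, Tuple


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Turn non-negative candidate scores into a belief distribution."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return {answer: score / total for answer, score in scores.items()}


def top_picks(beliefs: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k most probable answers, best first."""
    return sorted(beliefs.items(), key=lambda item: item[1], reverse=True)[:k]


def should_buzz(beliefs: Dict[str, float], threshold: float) -> bool:
    """Buzz only if the best answer's probability reaches the threshold."""
    _, best_prob = top_picks(beliefs, k=1)[0]
    return best_prob >= threshold


# Hypothetical scores for one clue.
scores = {"event horizon": 8.0, "singularity": 1.5, "accretion disk": 0.5}
beliefs = normalize(scores)

for answer, prob in top_picks(beliefs):
    print(f"{answer}: {prob:.2f}")
print("buzz at 0.5 threshold?", should_buzz(beliefs, 0.5))
```

The threshold is the interesting knob: set it high and the machine stays quiet unless it is nearly sure, set it low and it gambles more.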

The first round was impressive with Watson dominating the first few minutes.  It answered quickly and flawlessly until the first commercial break. It seemed that the humans just weren’t fast enough.

But after the commercials it got interesting. Ken Jennings seemed to modify his strategy to simply press the button as early as possible.  It was clear he buzzed in several times having no idea what the answer was, then stalled for a moment and guessed, usually correctly.  This was a smart adaptation to Watson’s algorithm.  The computer won’t answer unless it is confident in its answer. But a human can keep thinking after they buzz in and gamble that they can come up with something.

Interestingly, when Ken buzzed in very early the answers showing on the screen for Watson seemed to be lower quality; it still hadn’t converged on a good answer and froze at the buzzer.

At one point Ken buzzed in and got it wrong, then Watson buzzed in. We could see that its top answer was the same wrong answer Ken just gave, but it repeated it anyways.  This seems to indicate the algorithm does not keep computing after the buzzer and can’t take into account the answers of other players. A minor thing to change really.
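
The change I have in mind could look something like the following Python sketch. It is only a guess at one way to do it, not a description of how Watson works: the function name and the example beliefs are hypothetical. The idea is to drop an answer that an opponent has just been told is wrong and renormalize what is left before deciding whether to buzz.

```python
from typing import Dict


def rule_out(beliefs: Dict[str, float], wrong_answer: str) -> Dict[str, float]:
    """Remove an answer already judged wrong and renormalize the rest.

    Matching is case-insensitive on the exact string; a real system would
    need much fuzzier matching of equivalent answers.
    """
    remaining = {
        answer: prob
        for answer, prob in beliefs.items()
        if answer.lower() != wrong_answer.lower()
    }
    total = sum(remaining.values())
    if total == 0:
        return {}  # nothing plausible is left, so do not buzz
    return {answer: prob / total for answer, prob in remaining.items()}


# Hypothetical beliefs just before an opponent answers.
beliefs = {"answer_a": 0.5, "answer_b": 0.3, "answer_c": 0.2}

# The opponent says "answer_a" and is ruled incorrect.
updated = rule_out(beliefs, "Answer_A")
best = max(updated, key=updated.get)
print(best, round(updated[best], 2))
```

After the update the old second choice becomes the top pick, and whether its new probability clears the buzz threshold is then a separate decision.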

The round ended with Watson answering the last question, correctly identifying the Event Horizon of a black hole. Fitting I think.

We’ll see what happens tonight.

What do you think about the challenge? Leave your observations in the comments. If you are on Twitter make sure to pick a side: Are you rooting for #teamhuman or #teamwatson?

The Robot and the Hound

This term I’m TAing for a fourth-year course on Artificial Intelligence at UBC so I’m going to have lots of links to news in AI coming in front of me. I might as well do something with it, so over the next few months I’ll post my thoughts about the current state of AI research and where it’s going, both in theory and in real world applications.

I’ll add interesting links to my AINews list on the side of this page but for a fantastic source of all things AI check out this list from AAAI which is actually maintained using AI algorithms to combine reader reviews and natural language searches of the web.

First up, Robots.

Robert Silverberg recently decided to pine for the good ol’ days with his science fiction writer brethren, looking back at what they thought the ‘robots of the future’ would look like and how it has really turned out. It’s an interesting read from a great SF writer and gives a wonderful overview of the history of robots in fiction.

His conclusion essentially comes down to reminding roboticists to be sure to instill Asimov’s Three Laws of Robotics into all their machines lest they suffer the dire threat of lawsuits in the future.

The notion of  responsibility is indeed an important one that AI researchers will need to deal with.  Right now all robots and machines are pretty dumb no matter how smart they seem.  There is no computer on Earth with anything remotely resembling free will or self-awareness. Thus there is no way any machine could be blamed for its actions at this time so any errors are either the result of improper usage, bad planning by engineers or the random outcome of rules that no one had a problem with.

You can already begin to see this in the way Google, for example, responds to criticism about what shows up in their search results. Their algorithms are complex and seek patterns amongst huge amounts of data that no human can see. They use a lot of AI algorithms, and the particular results that show up when you search for something really aren’t caused by what any one person at Google has done. Google put up the system which allows those results to be generated, but whether the results are offensive for one particular query isn’t something any human being made a decision about.

So I think Google would have a strong argument that some result which was offensive and somehow caused harm wasn’t their fault as long as they’ve taken reasonable precautions.

I think the next stage of moral responsibility for machine behaviour will be something similar to the responsibility of a dog owner for the actions of their dog. Say a dog owner is at the park walking their dog, the dog is running free and it hurts a child or another dog. The dog did the action and will likely be punished or put down, but you can’t sue the dog. However, depending on how the situation arose, the owner has some responsibility. If the dog is usually nice and friendly, it was a small dog, it was a leash-free park etc., then perhaps the owner would have no legal liability at all. At the other extreme, an abusive dog owner who raises a mean-spirited dog that often attacks others and then releases it in a leash-only park would face serious penalties, although still likely less than if they performed the attack themselves. I don’t know what the laws are, but there is a reasonable way to write such laws so that the owner can be held responsible while the independent will of the dog itself is considered.

Someday, Isaac willing, someone will create a computer that begins to approach the level of complexity that we could say the engineers who made it can only be held as responsible for its actions as they are for their dog. While the Three Laws of Robotics are a nice idea they may be no easier to hardwire than how you train your dog not to bark at others and to always come when you call.  

So let’s keep the intent of the three laws, but let’s make sure that when that day comes we’ve raised a nice, friendly dog rather than an uncontrollable pit bull.

I’ll Take Spectacle for $1000 Alex.

News now that the next big public spectacle in the battle between Man vs Machine will be….Jeopardy?

Update: more detail here

You may remember that computers have now defeated the greatest human players of chess, inspiring endless punditry and loose discussion about ‘thinking’ machines as well as inspiring awesome Arcade Fire songs. Computers are also now quite good at playing poker, have solved Checkers completely (no point playing that anymore…), provide us with frustrating ‘automated phone help’ bots and regularly vacuum the floors of geeks fairly adequately.

Sigh. Perhaps this is why the New York Times article, which is otherwise pretty clear and non-hyperbolic about the next spectacle, felt the need to throw this in:

Despite more than four decades of experimentation in artificial intelligence, scientists have made only modest progress until now toward building machines that can understand language and interact with humans.

Now, I’m an Artificial Intelligence researcher, so I’ll try to be rational about this sentence.

The first half of the sentence refers to the common observation that four decades of research into AI has not produced walking, talking androids trying to take over the world and consume us for power. Instead it has provided tremendous research gains and advances in technology that underlie many aspects of our modern world, from Google to space probes, from self-driving cars to face-detecting auto-focus cameras, from management of complex energy systems to medical diagnostic tools. The second half of the sentence points out that on the problem everyone on the street really cares about, walking, talking androids that can ‘think’ like us and understand what we’re saying…that progress has been below society’s ridiculously high expectations.

Granted. Voice recognition has got a lot better over the years but not up to, say, the 4 year old child level. But you know, we don’t even really understand how our own brains work, and that makes simulating one in a less complex computing machine than the one between our ears, you know, tricky. (A separate approach that may outflank current AI might in fact be just building an equally complex simulation of a brain and letting it go, but that’s another post.)

But I love these public spectacles, they provide a chance to explain the current level of AI and open up some of the ideas of computation in the problem that are used in more relevant applications all around us.  Having a computer up on t.v. with Alex Trebek and other contestants will be fun and we probably won’t even have the embarrassing situation Mr. Kasparov was in of the computer beating the human, not yet anyways.

It will be entertaining, some of it will be funny and hopefully some of it will be informative to viewers who live in an increasingly computational world. Playing Jeopardy well is a much harder problem than playing chess well. The challenges it poses in terms of understanding language and meaning, searching databases, forming sentences and making strategic decisions about bids and questions are all very rich domains that have more real world application than the way chess playing programs work, which is generally some kind of brute force search.

I just hope that when the computer loses, the show is over and they ship the computer back to IBM labs, we don’t hear another round of “why such modest progress?” This ain’t rocket science people, it’s a lot harder than that.
