Saturday, March 19, 2011

Watson

In 2005, Nico Schlaefer, a grad student at Carnegie Mellon University, built a statistical query system and wrote a thesis he called Statistical Thought Expansion, later named Ephyra. IBM was impressed, and Nico worked three summers on Watson. He is now a PhD candidate at Carnegie Mellon and an IBM PhD Fellow.
In the Tourette syndrome example given below, Watson was unable to answer until more of the signs and symptoms were included in its sources. Q & A as done with Watson seems analogous to clinical data and differential diagnosis. To make Watson's job easier, enter the clinical data in a relational database in simple, consistent terms. Likewise, list the sum total of medical diagnostic information in the same simple, consistent terms. The relational database can then correlate the two and list the matches as diagnostic possibilities. A statistical program -- and here is where Watson comes in -- can list the probability of each. Furthermore, a statistical program can keep adjusting the probable diagnosis in real time as subsequent information comes in.
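As a rough sketch of the matching step described above (not how Watson works), the following uses Python's built-in sqlite3 module to join a patient's findings against a table of criteria and rank the candidates. The table names, the `rank_diagnoses` function, the placeholder "Diagnosis B" and "Diagnosis C" rows, and the scoring rule (fraction of criteria matched, normalized so the shares sum to 1) are all assumptions made for illustration. The rows are toy data, not real diagnostic criteria.

```python
import sqlite3

# Toy data for illustration only; these rows are NOT real diagnostic criteria.
CRITERIA = [
    ("Tourette syndrome", "involuntary movements"),
    ("Tourette syndrome", "vocal tics"),
    ("Tourette syndrome", "swearing"),
    ("Diagnosis B", "involuntary movements"),  # hypothetical placeholder
    ("Diagnosis B", "finding x"),
    ("Diagnosis C", "finding y"),              # hypothetical placeholder
    ("Diagnosis C", "finding z"),
]
FINDINGS = ["involuntary movements", "vocal tics"]


def rank_diagnoses(criteria, findings):
    """Return [(diagnosis, matched, total, share)], best match first.

    Assumes each (diagnosis, finding) pair appears once in `criteria`.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE criteria (diagnosis TEXT, finding TEXT)")
    con.execute("CREATE TABLE patient (finding TEXT)")
    con.executemany("INSERT INTO criteria VALUES (?, ?)", criteria)
    # De-duplicate findings so the join cannot count a criterion twice.
    con.executemany("INSERT INTO patient VALUES (?)",
                    [(f,) for f in set(findings)])
    rows = con.execute(
        """
        SELECT c.diagnosis, COUNT(p.finding), COUNT(*)
        FROM criteria AS c
        LEFT JOIN patient AS p ON p.finding = c.finding
        GROUP BY c.diagnosis
        HAVING COUNT(p.finding) > 0
        """
    ).fetchall()
    con.close()

    # Score = fraction of a diagnosis's criteria present in the patient.
    fractions = {dx: matched / total for dx, matched, total in rows}
    norm = sum(fractions.values())
    if norm == 0:
        return []
    ranked = [(dx, matched, total, fractions[dx] / norm)
              for dx, matched, total in rows]
    return sorted(ranked, key=lambda r: (-r[3], r[0]))


RANKING = rank_diagnoses(CRITERIA, FINDINGS)
for dx, matched, total, share in RANKING:
    print(f"{dx}: {matched}/{total} criteria matched, share {share:.2f}")
```

The normalized share is only a relative ranking among the listed candidates, not a true probability; a proper statistical model would also need to know how common each diagnosis is.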
This is not to say that the computer makes the diagnosis, but it does give, at a glance, all of the possibilities. In fact, quite the reverse: the statistical program improves its selections and statistics based on the clinician’s evolving and final diagnosis.
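One simple way to picture that feedback loop is a running tally that is updated each time a clinician confirms a final diagnosis. The sketch below is an assumption for illustration, not Watson's mechanism: the `DiagnosisTally` class and its method names are hypothetical, and the estimate is a plain naive Bayes calculation with add-one smoothing that looks only at the findings that are present.

```python
import math
from collections import Counter, defaultdict


class DiagnosisTally:
    """Running counts of confirmed diagnoses and the findings seen with them."""

    def __init__(self):
        self.cases = Counter()                      # diagnosis -> confirmed cases
        self.finding_counts = defaultdict(Counter)  # diagnosis -> finding -> cases

    def record(self, findings, final_diagnosis):
        """Store one case once the clinician has settled on a diagnosis."""
        self.cases[final_diagnosis] += 1
        for finding in set(findings):
            self.finding_counts[final_diagnosis][finding] += 1

    def probabilities(self, findings):
        """Estimate P(diagnosis | findings) from the cases recorded so far."""
        total = sum(self.cases.values())
        if total == 0:
            return {}
        log_scores = {}
        for dx, n in self.cases.items():
            log_p = math.log(n / total)  # prior: how often dx was confirmed
            for finding in set(findings):
                seen = self.finding_counts[dx][finding]
                # Add-one smoothing keeps unseen findings from zeroing a score.
                log_p += math.log((seen + 1) / (n + 2))
            log_scores[dx] = log_p
        # Normalize in a numerically stable way so the values sum to 1.
        peak = max(log_scores.values())
        weights = {dx: math.exp(s - peak) for dx, s in log_scores.items()}
        norm = sum(weights.values())
        return {dx: w / norm for dx, w in weights.items()}


tally = DiagnosisTally()
tally.record(["finding x", "finding y"], "Diagnosis A")  # placeholder cases
tally.record(["finding z"], "Diagnosis B")
before = tally.probabilities(["finding x"])
tally.record(["finding x"], "Diagnosis A")               # one more confirmed case
after = tally.probabilities(["finding x"])
print(before, after)
```

Each confirmed case shifts the estimates, so the clinician's final diagnosis trains the program rather than the other way around.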
In practice, this computer-directed diagnostics can be done with an off-the-shelf database program. Watson may be too hard to move around, and I imagine that an off-the-shelf database on the clinician’s own computer will be a bit less expensive. The important aspect, however, is still the statistical application. I guess that the articles about the development of Watson do not divulge all of the statistical mechanism that makes up the AI behind Watson's phenomenal performance.
Simplicity, however, is the thing that works best with clinicians, and I would bet that there is already a simple statistical application that will function with a relational database. Schlaefer describes source expansion, and for us that source is medical information, all of it -- in simple database terms with criteria of diagnosis.
Found on Probably Irrelevant, from the interview “Information Retrieval in IBM’s Watson: An interview with Nico Schlaefer,” posted on March 17th, 2011 by Jon Elsas:
“Nico Schlaefer: Here is a question for which source expansion helped:
What is the name of the rare neurological disease with symptoms such as: involuntary movements (tics), swearing, and incoherent vocalizations (grunts, shouts, etc.)?
This is a question from the TREC 8 evaluation [pdf], but if written as a statement (“This rare neurological disease has symptoms such as …”) I think it could also pass as a Jeopardy! question. The answer is “Tourette syndrome”.
We first tried to answer this question using Wikipedia as a source, and there is indeed an article about “Tourette syndrome” in our copy of Wikipedia, but unfortunately it doesn’t mention most of the keywords in the question and Watson wasn’t able to get the answer. We then expanded Wikipedia, and “Tourette syndrome” was one of the topics that was automatically selected. The expanded article contains the following text passages which, by the way, all come from different websites:
· Rare neurological disease that causes repetitive motor and vocal tics
· The first symptoms usually are involuntary movements (tics) of the face, arms, limbs or trunk.
· Tourette’s syndrome (TS) is a neurological disorder characterized by repetitive, stereotyped, involuntary movements and vocalizations called tics.
· The person afflicted may also swear or shout strange words, grunt, bark or make other loud sounds.
These passages jointly almost perfectly cover the question keywords. I think the only content word that is not in there is “incoherent”. This made it very easy for Watson to find the answer.”
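The keyword coverage Schlaefer describes can be checked with a few lines of code. This is only a sketch under stated assumptions, not Watson's retrieval or scoring method: the keyword list is a hand-picked reading of the question's content words, and `normalize` is a crude suffix-stripper standing in for a real stemmer so that "swearing" can match "swear" and "grunts" can match "grunt".

```python
import re

QUESTION_KEYWORDS = [
    "rare", "neurological", "disease", "symptoms", "involuntary",
    "movements", "tics", "swearing", "incoherent", "vocalizations",
    "grunts", "shouts",
]

PASSAGES = [
    "Rare neurological disease that causes repetitive motor and vocal tics",
    "The first symptoms usually are involuntary movements (tics) of the "
    "face, arms, limbs or trunk.",
    "Tourette's syndrome (TS) is a neurological disorder characterized by "
    "repetitive, stereotyped, involuntary movements and vocalizations "
    "called tics.",
    "The person afflicted may also swear or shout strange words, grunt, "
    "bark or make other loud sounds.",
]


def normalize(word):
    """Lowercase and crudely strip an -ing or -s ending (not a real stemmer)."""
    word = word.lower()
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def keyword_coverage(keywords, passages):
    """Split keywords into those found in any passage and those found in none."""
    seen = set()
    for passage in passages:
        seen.update(normalize(tok) for tok in re.findall(r"[A-Za-z]+", passage))
    covered = [k for k in keywords if normalize(k) in seen]
    missing = [k for k in keywords if normalize(k) not in seen]
    return covered, missing


covered, missing = keyword_coverage(QUESTION_KEYWORDS, PASSAGES)
print("covered:", covered)
print("missing:", missing)
```

With this keyword list, the four passages together leave only "incoherent" uncovered, which agrees with the observation in the interview.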
