What Cognitive Computing can Teach us About How we Learn

Analysis and evaluation of hypotheses

The need for a more evidence-based approach to complex decisions has catapulted computing beyond the rigid logic tree, which breaks under the strains of high volumes of information. We're now squarely in the world of cognitive computing, heralded by IBM's Watson. 

    We need the power of cognitive computing to put into context the volumes of information we deal with daily in many professions. It's easy to see how doctors and financial analysts benefit from systems that help them derive value by enhancing their expertise, but we could argue that many other fields increasingly depend on knowing more, faster.

    To understand how these systems work, we first want to gain an appreciation of the steps we take as humans to make decisions:

  1. we observe visible phenomena and bodies of evidence
  2. we draw on what we know to interpret what we see and generate hypotheses about what it means
  3. we evaluate which hypotheses are right or wrong
  4. we decide, choosing the option that seems best, and act accordingly
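    The four steps above can be sketched as a simple loop. This is a toy illustration only: the observations, candidate hypotheses, and the keyword-overlap scoring rule are all invented stand-ins for real human (or machine) reasoning.

```python
# A toy sketch of the observe -> hypothesize -> evaluate -> decide loop.
# The evidence, hypotheses, and scoring rule here are all hypothetical.

def decide(observations, generate_hypotheses, score):
    """Pick the hypothesis best supported by the observations."""
    hypotheses = generate_hypotheses(observations)               # step 2
    scored = [(score(h, observations), h) for h in hypotheses]   # step 3
    best_score, best = max(scored)                               # step 4
    return best, best_score

# Example: naive overlap scoring over made-up symptom evidence.
observations = {"fever", "cough", "fatigue"}                     # step 1
candidates = {
    "flu": {"fever", "cough", "fatigue", "aches"},
    "allergy": {"cough", "sneezing"},
}

choice, confidence = decide(
    observations,
    generate_hypotheses=lambda obs: list(candidates),
    score=lambda h, obs: len(candidates[h] & obs) / len(candidates[h]),
)
print(choice)  # -> flu
```

The point of the sketch is the shape of the process, not the scoring: any of the three plugged-in functions could be arbitrarily sophisticated without changing the loop.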

    We go through this process to reason about information and make decisions; cognitive systems can do the same at massive speed and scale. To learn how they do it, we examine the process IBM's Watson uses.

    While conventional computers can analyze organized and structured information, Watson can analyze unstructured data, which makes up roughly 80% of available information: social media posts, chats, tweets, articles, research, blogs, and so on. Another way of thinking about unstructured data is all the information humans generate for other humans to consume.

    Structured data has well-defined fields to categorize information. To analyze unstructured data, Watson relies on natural language, which has rules of grammar, context, and culture. It's implicit, ambiguous, complex, and a challenge to process.

    The difficulty in parsing human language includes tone, use of irony and sarcasm, as well as idioms and expressions specific to certain industries and groups. For example, “we feel blue” when “it's raining cats and dogs,” and “we fill in a form” someone asked us to “fill out.”

     Watson doesn't just look for keyword matches and synonyms like a search engine; it reads and interprets text like a person. It does this by breaking a sentence down grammatically, relationally, and structurally, discerning meaning from the semantics of the written material. It understands context, which is very different from speech recognition, how computers translate human speech into a set of words.

    The difference also includes an attempt to understand the real intent behind what a person says, using a broad array of linguistic models and algorithms to extract logical responses and draw inferences toward potential answers.

    Much as we do when we go to work in a particular field, Watson learns the specific language, jargon, and mode of thought of that domain.

    Take cancer, for example. There are many types of cancer, and each has different symptoms and treatments. However, the symptoms can also be associated with diseases other than cancer. Treatments can have side effects, and affect people differently depending on many factors. Watson evaluates standards-of-care practices and thousands of pages of literature that capture the best science in the field, and from all that it identifies the therapies that offer the best choices for the doctor to consider in treating the patient.

    The system uses the guidance of human experts to collect the knowledge it needs to achieve literacy in a specific domain. In technical terms, this is called a corpus of knowledge. Collection starts by loading the body of relevant information into the system, plus human intervention to cull information that is out of date, poorly regarded, or immaterial to the problem domain. We call this process curation.

    Then the system pre-processes the information, building metadata about the content to make working with it easier, just as we would when preparing a presentation or article. The technical term for this phase is ingestion. Watson also creates a knowledge graph to assist in answering more precise questions.
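    Ingestion can be pictured with a minimal sketch: pre-processing a corpus into metadata (here, an inverted index from terms to documents) plus a tiny knowledge graph of subject-relation-object facts. The documents, the naive three-word "extraction," and the triples are all invented for illustration and bear no relation to Watson's actual pipeline.

```python
from collections import defaultdict

# A toy sketch of "ingestion": pre-processing a corpus into metadata
# (an inverted index) plus a tiny knowledge graph. The documents and
# the naive triple extraction are made up for illustration.

corpus = {
    "doc1": "aspirin treats headache",
    "doc2": "aspirin thins blood",
}

# Metadata: an inverted index mapping each term to the documents containing it.
index = defaultdict(set)
for doc_id, text in corpus.items():
    for term in text.lower().split():
        index[term].add(doc_id)

# Knowledge graph: (subject, relation, object) triples, extracted naively
# by treating each three-word sentence as subject-verb-object.
graph = [tuple(text.split()) for text in corpus.values()]

print(sorted(index["aspirin"]))  # -> ['doc1', 'doc2']
print(graph[0])                  # -> ('aspirin', 'treats', 'headache')
```

The index makes "which documents mention X" cheap to answer; the graph supports more precise questions such as "what treats a headache."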

    Once it has ingested the corpus of knowledge, the system needs training by a human expert on how to interpret the information. To learn the best possible responses and acquire the ability to find patterns, Watson partners with experts who train it using an approach called machine learning. 

    It works like this: an expert uploads training data in the form of question/answer pairs that serve as ground truth. This doesn't give the system explicit answers for everything it will be asked, but teaches it the linguistic patterns of meaning in the domain.
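    A minimal sketch of this idea: store the ground-truth pairs, and answer a new question by finding the most similar training question. The similarity measure here is plain word overlap (Jaccard), far simpler than the linguistic models Watson actually uses, and the pairs themselves are invented.

```python
# A minimal sketch of training on question/answer pairs as ground truth.
# Similarity is plain word overlap (Jaccard) -- far simpler than Watson's
# real linguistic models; the training pairs are invented.

training_pairs = [
    ("what relieves a headache", "aspirin"),
    ("what thins the blood", "aspirin"),
    ("what causes a fever", "infection"),
]

def similarity(a, b):
    """Jaccard similarity between the word sets of two strings."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def answer(question):
    """Return the answer paired with the most similar training question."""
    best_question, best_answer = max(
        training_pairs, key=lambda pair: similarity(question, pair[0])
    )
    return best_answer

print(answer("what helps relieve my headache"))  # -> aspirin
```

Note that the new question never appeared verbatim in the training data; the system generalizes from the patterns in the pairs, which is the point of the ground-truth approach.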

    After training on these pairs, it continues to learn through interactions with users. Experts conduct a periodic review of these interactions to feed better information into the system. They also update it with new information and publications to keep its knowledge of the field current.

    With this process, Watson can answer complex questions in a domain and provide a range of potential recommendations that are backed by evidence. It's also prepared to identify new insights and patterns locked away in the information.

    Human experts use Watson to uncover new possibilities in data and make better decisions based on evidence. The system's approach to identifying answers to an inquiry:

  • it identifies the parts of speech within the question
  • then generates hypotheses
  • and looks for evidence to support or refute each hypothesis
  • it scores each hypothesis based on statistical modeling of each piece of evidence
  • this produces what are known as weighted confidence scores
  • it then ranks answers by how highly their evidence is rated during evidence scoring and ranking
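    The scoring and ranking steps above can be sketched in a few lines: each hypothesis carries pieces of evidence, each piece has a score and a source weight, and a weighted average yields the confidence used to rank answers. The hypotheses, evidence scores, and weights are all invented; Watson's real statistical models are far more elaborate.

```python
# A toy sketch of evidence scoring and ranking. Each hypothesis has
# (evidence_score, source_weight) pairs, where source_weight reflects
# how highly the evidence source is rated. All numbers are invented.

hypotheses = {
    "answer A": [(0.9, 1.0), (0.4, 0.5)],
    "answer B": [(0.6, 1.0)],
}

def weighted_confidence(evidence):
    """Weighted average of evidence scores."""
    total_weight = sum(w for _, w in evidence)
    return sum(s * w for s, w in evidence) / total_weight

# Rank hypotheses by their weighted confidence scores, best first.
ranked = sorted(
    ((weighted_confidence(ev), h) for h, ev in hypotheses.items()),
    reverse=True,
)
for confidence, hypothesis in ranked:
    print(f"{hypothesis}: {confidence:.2f}")
# -> answer A: 0.73
# -> answer B: 0.60
```

Because the output is a ranked list with explicit confidences rather than a single verdict, the expert can see how strongly the evidence backs each candidate answer.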

    This means Watson is able to run analytics against a body of data to glean insights, which it can turn into inspirations, allowing experts to make more informed decisions. Across an organization, it scales and democratizes expertise by surfacing accurate responses to inquiries. It also accelerates expertise by surfacing a set of possibilities from a large body of data, saving valuable time.

    Current applications are in the legal, medical, and financial fields, and even cooking. Its speed and ability to deal with volumes of data are also helping us discover patterns we had not seen before. The system continues to learn, adapt, and get smarter, gaining value with age by learning through interactions with humans and from its successes and failures.

    Just like we do (or should). Ego, personal culture, and a vested interest in certain answers may prevent us from culling out-of-date, poorly regarded, and immaterial information. The way we process information is also different. As young as five years old, we decide from whom we should learn based on our experience.

    Kathleen Corriveau and Katelyn Kurkel, in the October 2014 issue of Child Development, examined whether children can use the quality of the explanations people give them to determine whose input they should trust. By the time we're five, we're able to make good judgments based on that quality.

    This is a crucial skill, because it helps us avoid bad input or knowledge. “Human memory does not allow us to erase facts that turn out to be false,” says Dr. Art Markman:

Instead, when we learn that something is false, we have to mark it as being untrue so that we explicitly ignore it later. That is one reason why we often continue to be influenced by information that we have been told in the past was not true.

    The better the quality of the information, the less energy we'll need to expend ignoring certain memories. Systems can help us navigate human biases in predictions by cross-referencing larger volumes of data and information than we would ever be able to tackle, and by learning from failures at scale.

    For a visual illustration of cognitive computing see IBM Watson: how it works.