Historical Data can make a big, hairy mess in Machine Learning


Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed.

Machine learning is the development of computer programs from historical data, called training data. Let us first consider some of the most successful examples of machine learning.

We have access to a large body of texts from governments, embassies, NGOs, and the United Nations, which publish documents in English and hundreds of other languages. Machine translation tools such as Google Translate use this library of translated documents as their training data. For example, English–German translation is trained on the English and German versions of the same documents. When a user wants to translate a specific text from English to German, the tool succeeds by drawing on the matching <English, German> sentence pairs it has learned from.

IBM and Memorial Sloan Kettering are training Watson in oncology using a massive amount of patient medical records from across the world. For each patient, the historical data contains detailed records of symptom parameter values, the diagnosis, and the treatments given. Watson learns from historical <Symptoms, Diagnosis> and <Treatment, Results> pairs.

The following story shows that learning from historical data carries the risk of misleading us, with potentially dangerous consequences.

Every morning, a priest used to go to the temple with fruits, flowers, and a jug of milk and open it for the morning prayer. He would leave the offerings on the steps of a pool inside the temple, take a bath in the pool, and then start the prayer. There were rats in the temple, and they started to disturb the fruits and the other offerings. After a few months of struggling with the rats' mischief, one of the devotees of the temple brought a cat to protect the offerings. The presence of the cat kept the rats under control, but after a few days the cat started its own mischief: it tried to drink the milk and created a mess.

In order to manage the issue, the following process was agreed upon.

  1. Priest enters the temple with fruits, flowers and milk.
  2. Priest ties the cat to a pillar using a rope.
  3. Priest takes a bath and then conducts the prayer.
  4. At the end of the prayer, the cat is untied.

Years rolled by, several priests came and went, and the popularity of the morning prayer increased, but the above process was strictly followed as a mandatory practice. Tying the cat to a pillar became a traditional custom that had to be completed before the priest could take a bath and pray.

One day, after several years, the cat died. The temple management committee and the priest were very sad and said: “We cannot do the prayers without a new temple cat.”

The story illustrates that learning and decision making based purely on patterns in historical data is not always successful, and can lead to grave mistakes. Data scientists therefore have to consider the context and analyze the data in detail to understand cause and effect if machine learning is to be complete and successful. I shall elaborate on this in my next blog post.
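To make the temple story concrete, here is a minimal Python sketch of a learner that looks only at historical patterns. Everything in it is a hypothetical illustration: the record fields (`bring_offerings`, `tie_cat`, `take_bath`, `prayer_ok`), the 1000 identical records, and the functions `learn_required_steps`, `can_pray`, and `can_pray_with_context` are assumptions made for this example, not a real machine learning method or library.

```python
# Hypothetical history: every recorded morning followed the same process,
# and every prayer went well. Field names are invented for illustration.
history = [
    {"bring_offerings": True, "tie_cat": True, "take_bath": True, "prayer_ok": True}
    for _ in range(1000)
]


def learn_required_steps(records):
    """Naive pattern learner: a step counts as 'required' if it
    appears in every successful record."""
    successes = [r for r in records if r["prayer_ok"]]
    steps = [name for name in successes[0] if name != "prayer_ok"]
    return {step for step in steps if all(r[step] for r in successes)}


def can_pray(required, today):
    """Decide purely from the learned pattern."""
    return all(today.get(step, False) for step in required)


def can_pray_with_context(required, today, cat_present):
    """Same decision, but with one piece of context added by a human:
    tying the cat only serves a purpose when there is a cat."""
    relevant = set(required)
    if not cat_present:
        relevant.discard("tie_cat")
    return can_pray(relevant, today)


required = learn_required_steps(history)

# The cat has died, so the step "tie_cat" cannot be performed today.
today = {"bring_offerings": True, "tie_cat": False, "take_bath": True}

naive_decision = can_pray(required, today)
context_decision = can_pray_with_context(required, today, cat_present=False)

print("Learned required steps:", sorted(required))
print("Pattern-only decision, prayer allowed?", naive_decision)
print("Context-aware decision, prayer allowed?", context_decision)
```

The history never contains a morning without the cat, so the pattern-only learner cannot tell a step that causes a good prayer from one that merely always accompanied it. The context-aware version differs only in the cause-and-effect knowledge supplied from outside the data.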
