Historical Data can make a big, hairy mess in Machine Learning

Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed.

Machine learning is the development of computer programs from historical data, called training data. Let us first consider some of the most successful examples of machine learning.

We have access to a large body of texts from governments, embassies, NGOs, and the United Nations, which publish in English and hundreds of other languages. Machine translation tools such as Google Translate use this library of translated documents as their training data. For example, English–German translation is trained on the English and German versions of the same documents. When a user wants to translate a specific text from English to German, the tool draws on the <English, German> sentence pairs it has learned from.

IBM and Memorial Sloan Kettering are training Watson in oncology using a massive amount of patient medical records from across the world. For each patient, the historical data contains detailed records of symptom parameter values, the diagnosis, and the treatments given. Watson learns from the historical data of <Symptoms, Diagnosis> and <Treatment, Results>.

The following story shows that learning from historical data carries the risk of misleading us, with potentially dangerous consequences.

Every morning, a priest used to bring fruits, flowers, and a jug of milk and open the temple for the morning prayer. He would place the offerings on the steps of a pool inside the temple, take a bath in the pool, and then start the prayer. There were rats in the temple, and they began to disturb the fruits and other offerings. After the temple had struggled with the rats for a few months, one of the devotees brought a cat to protect the offerings. The presence of the cat kept the rats under control, but after a few days the cat started its own mischief: it tried to drink the milk and created a mess.

In order to manage the issue, the following process was agreed upon.

  1. Priest enters the temple with fruits, flowers and milk.
  2. Priest ties the cat to a pillar using a rope.
  3. Priest takes a bath and then conducts the prayer.
  4. At the end of the prayer, the cat is untied.

Years rolled by and several priests came and went. The popularity of the morning prayer increased, but the above process was strictly followed as a mandatory practice. Tying the cat to a pillar before the priest took his bath and prayed became a traditional custom.

One day, after several years, the cat died. The temple management committee and the priest were very sad and said: “We cannot do the prayers without a new temple cat.”

The above illustrates that learning and decision making based purely on patterns in historical data is not always successful. It carries the threat of grave mistakes. Data scientists therefore have to consider the context and analyze the data in detail to understand its causes and effects, so that machine learning is complete and successful. I shall elaborate on this in my next blog.
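
To make the temple story concrete, here is a minimal Python sketch of a learner that only counts patterns in past records. The log, the field names `cat_tied` and `prayer_held`, and the functions `learn_rates` and `naive_can_pray` are all made up for illustration; they are assumptions, not a real dataset or library.

```python
from collections import Counter

# Hypothetical log of 1,000 temple mornings (illustrative data only).
# On every recorded morning the cat was tied and the prayer was held.
history = [{"cat_tied": True, "prayer_held": True} for _ in range(1000)]


def learn_rates(records, feature, outcome):
    """Return the share of records where `outcome` was true, once for
    records where `feature` was true and once where it was false.
    A condition that never occurs in the records yields None."""
    counts = Counter((r[feature], r[outcome]) for r in records)

    def rate(value):
        total = counts[(value, True)] + counts[(value, False)]
        return counts[(value, True)] / total if total else None

    return rate(True), rate(False)


with_cat, without_cat = learn_rates(history, "cat_tied", "prayer_held")


def naive_can_pray(cat_tied):
    """Pattern matching only: allow what was seen to work in the past.
    An unseen situation (rate is None) is treated as 'not possible'."""
    rate = with_cat if cat_tied else without_cat
    return bool(rate)


print(with_cat)               # share of cat-tied mornings with a prayer
print(without_cat)            # no such mornings exist in the log
print(naive_can_pray(False))  # the committee's conclusion
```

The log contains no morning without a tied cat, so the data alone cannot say whether the prayer depends on the cat. The naive rule fills that gap with "no", which is the same mistake the committee made. Knowing why the cat was tied in the first place is context that the records do not carry.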
