Testing of chatbots – the art of conversation, interpretation and validation


“Hello Siri, may I have a strategy to test you?”

Chatbots are essentially software programs that simulate human behavior and conduct conversations as a human would. They can be classified by industry: banking chatbots, medical chatbots, personal finance chatbots, and so on.

Depending on the industry, the mechanics of testing the functionality will differ, but certain principles remain the same. Below are important considerations for QA professionals to keep in mind as they build their testing strategy:

  1. Start with identifying use cases for the chatbot. List questions and potential responses for every scenario and prioritize them according to importance.
  2. Two aspects are important from a testing standpoint: the conversational capability of the chatbot, and the degree of intelligence the user expects from it. Most chatbots accept different types of input data, and these should be clearly identified and documented. For each use case, clearly define the testable requirement and the key performance indicator (KPI).
  3. From a technology standpoint, chatbot KPIs include the number of steps needed to complete a request and the average number of users. Business KPI examples include the self-service rate (i.e., the extent to which a chatbot can resolve a request without human interaction), the average customer rating, and the sales conversion rate (i.e., the extent to which a chatbot can convert an online conversation into business).
  4. Once the testable requirements are defined, understand the underlying architecture and technology that the chatbot will use for each use case. Essentially, chatbots are built on natural language processing (NLP), a way for computers to analyze and derive meaning from human language. For example, one architecture might integrate an AI engine (such as IBM Watson) with a custom NLP speech engine; another might leverage an existing API such as Google Cloud Natural Language. Understanding the architecture is the crux of designing test cases.
  5. Test scenarios should encompass conversation and voice testing (i.e., the ability to recognize speech patterns and interpret non-verbal cues), and should be designed with variations of the same input. Scenarios that cover multiple instructions in a single request, conversations with background noise, differences in diction, and localization needs are a must. Tests that validate the chatbot's ability to help with user navigation and to handle errors are also vital. If the chatbot is expected to be used across multiple channels, omnichannel compatibility tests are required to ensure a consistent look and feel and consistent responses.
  6. From a non-functional standpoint, performance testing (i.e., the speed at which the chatbot responds) and security testing (including authentication, authorization, encryption of conversations, and adherence to compliance requirements) are key.
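
To make the KPIs from item 3 concrete, here is a minimal Python sketch of how they might be computed from conversation logs. The `Conversation` record and its field names are hypothetical, invented for illustration; a real chatbot platform will expose its own log format, and the sample data is made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conversation:
    # Hypothetical log record; field names are illustrative, not a real schema.
    steps: int                 # turns the user needed to complete the request
    escalated_to_human: bool   # True if a human agent had to take over
    rating: Optional[int]      # customer rating (1 to 5), None if not given
    converted: bool            # True if the conversation ended in a sale


def compute_kpis(logs: list[Conversation]) -> dict[str, float]:
    """Compute the technology and business KPIs for a batch of conversations."""
    if not logs:
        raise ValueError("at least one conversation is required")
    total = len(logs)
    ratings = [c.rating for c in logs if c.rating is not None]
    return {
        "avg_steps": sum(c.steps for c in logs) / total,
        "self_service_rate": sum(not c.escalated_to_human for c in logs) / total,
        "avg_rating": sum(ratings) / len(ratings) if ratings else float("nan"),
        "sales_conversion_rate": sum(c.converted for c in logs) / total,
    }


# Made-up sample data, only to exercise the function.
sample = [
    Conversation(steps=3, escalated_to_human=False, rating=5, converted=True),
    Conversation(steps=5, escalated_to_human=True, rating=2, converted=False),
    Conversation(steps=4, escalated_to_human=False, rating=None, converted=False),
    Conversation(steps=4, escalated_to_human=False, rating=5, converted=True),
]
kpis = compute_kpis(sample)
print(kpis)
```

Unrated conversations are left out of the average rating rather than counted as zero, which is one reasonable choice among several; whichever rule is used should be written into the KPI definition.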

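The advice in item 5 to test variations of the same input, multiple instructions in one request, and error handling can be sketched as a small table-driven check. The keyword matcher below is a toy stand-in for the chatbot under test, and the intent names are assumptions; a real suite would call the bot's own API in place of `detect_intents`.

```python
import re

# Hypothetical stand-in for the chatbot under test; a real suite would call
# the bot's own API or SDK instead of this keyword matcher.
INTENT_KEYWORDS = {
    "check_balance": {"balance", "funds"},
    "transfer_money": {"transfer", "send"},
}
FALLBACK = "fallback"


def detect_intents(utterance: str) -> list[str]:
    """Return every intent whose keywords appear in the utterance."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    matched = [name for name, keys in INTENT_KEYWORDS.items() if words & keys]
    return matched or [FALLBACK]


# Each case pairs an expected result with several phrasings of the same request.
CASES = [
    (["check_balance"], ["What is my balance?", "BALANCE please", "how much funds do I have"]),
    (["transfer_money"], ["Send 50 dollars to Sam", "I want to transfer money"]),
    # Multiple instructions in a single request.
    (["check_balance", "transfer_money"], ["Show my balance and then transfer 20 to Ana"]),
    # Error handling: unintelligible or empty input should reach the fallback.
    ([FALLBACK], ["asdf qwerty", ""]),
]


def run_cases() -> list[tuple[str, list[str], list[str]]]:
    """Return (utterance, expected, actual) for every failing variation."""
    failures = []
    for expected, utterances in CASES:
        for text in utterances:
            actual = detect_intents(text)
            if actual != expected:
                failures.append((text, expected, actual))
    return failures


if __name__ == "__main__":
    failed = run_cases()
    print(f"{len(failed)} failing variation(s)")
    for text, expected, actual in failed:
        print(f"  {text!r}: expected {expected}, got {actual}")
```

Keeping the phrasings in a data table means new variations, such as localized wording or transcripts of noisy speech, can be added without touching the test logic.
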
In summary, chatbots are poised for rapid growth in use, and the technology used to build them will evolve quickly. Quality, from both a usability and a functionality standpoint, is a must and will be key to success.