Read our previous blogs in this series: Why Chatbots, Maturity levels and Intelligence

Chatbots are like any other application: they need integration. In its basic form the Chatbot integrates with the channels, the intelligence-providing systems, and backend systems, and it offloads usage information for later improvements and intelligence determination.

All four integration parts have their own usage patterns and (non-functional) requirements for use in the Chatbot solution.

Here we go through the details of each integration part and considerations that need to be made during the design of a Chatbot solution.

Channel integration
Integration with channels is firstly about acting as a funnel for the many devices all communicating with the bot engine. Because human interaction happens at a fraction of the speed at which the engine can process the dialogue, we need a more contemporary integration model that doesn't tie up resources such as a thread per stream of interaction, like the event-driven models that Kafka and Node.js support.
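To make the idea concrete, here is a minimal sketch of that non-blocking funnel using Python's asyncio. Names such as `engine_process` and `handle_conversation` are invented for illustration; the point is that many slow conversations share one event loop instead of each holding a thread.

```python
import asyncio

async def engine_process(message: str) -> str:
    # Stand-in for the bot engine call; engine latency is negligible
    # compared with human typing speed.
    await asyncio.sleep(0)
    return f"echo: {message}"

async def handle_conversation(user_id: str, messages: list[str]) -> list[str]:
    # Each slow human interaction awaits without blocking other conversations.
    return [await engine_process(m) for m in messages]

async def main() -> list[list[str]]:
    # Many conversations multiplexed over a single thread.
    return await asyncio.gather(
        handle_conversation("alice", ["hi", "balance?"]),
        handle_conversation("bob", ["hello"]),
    )

replies = asyncio.run(main())
```

The same shape scales to thousands of concurrent dialogues, which is why event-driven runtimes are a natural fit here.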

While funneling or pipelining these dialogues, the integration needs to at least ensure that the communication is contextual and identifiable, and it should also capture other contextual information that could help the dialogue, e.g. the geographic origin of the dialogue (GPS on a phone; for fixed connections, IP-based information may help). The basic context of an identifiable endpoint is critical for the engine to retrieve the prior context and understand the latest communication. As we are funneling the communication, the context is also what allows the middleware to take the engine's response and route it back to the correct recipient.
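As a rough illustration, the middleware could wrap every inbound message in an envelope carrying identity and contextual metadata, and keep a routing table so replies find their way back. The field names below are assumptions for this sketch, not a specific framework's schema.

```python
from dataclasses import dataclass, field

@dataclass
class MessageEnvelope:
    conversation_id: str      # identifies the dialogue for context retrieval
    channel: str              # e.g. "messenger", "whatsapp"
    reply_address: str        # where the middleware routes the response
    geo: dict = field(default_factory=dict)  # e.g. GPS or IP-derived location
    text: str = ""

routes: dict[str, str] = {}

def register(env: MessageEnvelope) -> None:
    # Remember where each conversation's replies must go.
    routes[env.conversation_id] = env.reply_address

def route_reply(conversation_id: str, reply: str) -> tuple[str, str]:
    # The engine only needs the conversation id; the middleware does the rest.
    return routes[conversation_id], reply

env = MessageEnvelope("c-1", "messenger", "user-42", {"lat": 52.0}, "hi")
register(env)
```

The engine itself stays channel-agnostic: it sees only conversation ids, never Messenger- or WhatsApp-specific addressing.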

Most of this functionality will be found in bot adapters and frameworks such as those provided by the likes of Facebook’s Messenger, but if you want to incorporate a bot into your own app then this is something that needs to be considered.

This brings us to the second aspect of Channel integration. As we progress through the maturity model, the integration will support multiple sources of dialogues, for example Facebook, Twitter, and WhatsApp; as far as the engine is concerned these sources are no different.

Bot technologies will progress through the hype cycle, and as we reach the stability end of the cycle we will probably see evolutions where bot dialogues need to continue across devices. For example, a dialogue may start via Alexa in the home, but then switch to continuing through Siri as someone moves from the home to a travel context. Such a capability is probably going to be a hybrid of middleware and a pre-processor for the bot engine.

Backend integration
Bots are only really of value if the dialogue is actionable; without that you only have something that can take on the Turing Test. To be actionable you need logic, and to achieve this you're best using a middleware tier to provide decoupling.

Ideally, to achieve the best decoupling, we work through an API layer with previously standardized APIs so that they can completely mask the implementation(s). In fact, it would be reasonable to suggest that the most successful bots are those that can work with standardized APIs across multiple services. Let's illustrate this point: in the consulting arena, it is pretty common to have to record time for both your customer and your employer using two different systems. If there was an industry-standard API for booking time, then imagine how easy it would be to translate "book seven hours to Acme Inc.'s bot project." With many solutions supporting the API, it would be easy to realize the actions through integration and to extend it for new clients.
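A sketch of that hypothetical standard: if every time-recording system exposed the same `book_time` call, one parsed utterance could be fanned out to all of them. The API shape and entity names here are invented for illustration.

```python
def parse_booking(utterance_entities: dict) -> dict:
    # Assume NLP has already extracted these entities from
    # "book seven hours to Acme Inc.'s bot project".
    return {
        "hours": utterance_entities["hours"],
        "client": utterance_entities["client"],
        "project": utterance_entities["project"],
    }

class TimeSystem:
    """One of many backends all implementing the imagined standard API."""

    def __init__(self, name: str):
        self.name = name
        self.bookings: list[dict] = []

    def book_time(self, hours: float, client: str, project: str) -> None:
        self.bookings.append(
            {"hours": hours, "client": client, "project": project}
        )

entities = {"hours": 7, "client": "Acme Inc.", "project": "bot project"}
booking = parse_booking(entities)

# One translation serves both the employer's and the customer's system.
systems = [TimeSystem("employer"), TimeSystem("customer")]
for system in systems:
    system.book_time(**booking)
```

The bot only needs one translation layer; adding a new client's system costs nothing on the conversational side.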

The number of events or transactions behind the engine will be far smaller than those in front of it, as you will see at least a couple of exchanges between the user and the engine for a single backend process. Even if the events come down to a single user instruction (e.g. "transfer $500 from savings to current account"), the Chatbot's response is to verify its interpretation of the request and send the user a confirmation of the correct interpretation. Integration behind the engine can look more like traditional integration or follow more contemporary microservices models. Which route you take depends on many factors such as volumes, elasticity, technology skills, and complexity. In an employee-to-business (e2b) scenario your loads are more predictable and you may wish to exploit existing integrations.

Perhaps the most important aspect of this backend integration is the ability to translate the identified intent, entities, and values to an API call. In the simplest model the engine will have some sort of configuration to guide the conversation into effectively attaining the values, which then feeds some code that calls an endpoint (possibly a managed API into an integration process).
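That simple configuration-driven model can be sketched as a lookup table mapping each intent to an endpoint plus the entity values it needs; the intent name and endpoint below are illustrative assumptions.

```python
INTENT_CONFIG = {
    "transfer_funds": {
        "endpoint": "/accounts/transfer",
        "required": ["amount", "from_account", "to_account"],
    },
}

def next_action(intent: str, entities: dict):
    cfg = INTENT_CONFIG[intent]
    missing = [e for e in cfg["required"] if e not in entities]
    if missing:
        # Guide the conversation toward the values still needed.
        return ("ask", f"Please provide: {', '.join(missing)}")
    # All values attained: ready to call the (managed) API endpoint.
    return ("call", cfg["endpoint"], entities)

step1 = next_action("transfer_funds",
                    {"amount": 500, "from_account": "savings"})
step2 = next_action("transfer_funds",
                    {"amount": 500, "from_account": "savings",
                     "to_account": "current"})
```

The conversation loops on the "ask" branch until every required entity is captured, and only then does the integration call fire.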

At the most sophisticated end, the engine will be able to translate the intent, including language similes, directly to an integration call using metadata, and use that metadata to determine the values needed from the user, so no structured dialogues/exchanges are required.

Integrating Intelligence-providing systems
As described in the blog "Chatting with the Chatbots? – How intelligence makes the conversation," we want the Chatbot to behave as closely to a human interaction as possible. In order to deliver a response to a customer's question, the Chatbot needs information from different sources, after which it determines, based upon statistical models, the best-suited answer for the customer. The picture below is described in more detail in the aforementioned blog on Intelligence. The Chatbot response system is the component where all information is gathered to create the answer.

Turning clockwise and starting with Events and Chats: this is where the question is raised towards the Response system.

The received sentence is sent to the Natural Language Processing (NLP) component, where the intent of the sentence and the important entities are sliced and diced out of it. This information is provided back to the Response system.
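A deliberately naive stand-in for that NLP component shows the contract: text in, intent plus entities out. Real engines use trained models; the keywords and labels here are assumptions for this sketch.

```python
def extract(sentence: str) -> dict:
    words = sentence.lower().split()
    intent = "transfer_funds" if "transfer" in words else "unknown"
    entities = {}
    for i, w in enumerate(words):
        if w.startswith("$"):
            entities["amount"] = float(w[1:])
        if w in ("savings", "current") and i > 0:
            # The preceding preposition hints at the entity's role.
            key = "from_account" if words[i - 1] == "from" else "to_account"
            entities[key] = w
    return {"intent": intent, "entities": entities}

result = extract("transfer $500 from savings to current")
```

Whatever sits behind this interface, the Response system only consumes the structured result, never the raw sentence.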

The Context is the area where all conversation information is stored; it is created at the start of a conversation. After every chat line, relevant conversation information such as account info is stored. At the same time, information from the Context can be used to enhance and improve the response back to the end user.

The Training model is used to help the NLP understand natural language and make an educated guess at the intent of a sentence. This is a batch interface and only runs when improved NLP information is provided.

CRM information is captured at the start of the conversation and helps in knowing the end user: what earlier conversations took place, did the end user order products, did he/she have complaints, etc. This all helps in determining how the end user should be approached.

Historical information from previous conversations helps determine the flow of the current one.

Some of these integrations are required for the conversation and need to be available. The NLP integration, for instance, is crucial; without it the conversation cannot continue. If for whatever reason this integration takes longer than expected, a message should be sent to the end user that the conversation will continue a bit later. Other integrations are not crucial for the conversation. For instance, the CRM integration is a good add-on for managing information about the mood of the end user in relation to previous calls. This kind of enriching integration should be set up using a circuit-breaker pattern, which ensures that a failure in this integration does not cascade into the calling application; in the event of the circuit breaker triggering, the AI proceeds based on the information that is already available.
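A minimal sketch of that circuit-breaker idea: after a few consecutive failures of the enriching integration (simulated here by a CRM lookup that times out), the breaker opens and the bot falls back to what it already knows instead of letting the failure cascade. The threshold and function names are illustrative.

```python
class CircuitBreaker:
    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0

    @property
    def open(self) -> bool:
        return self.failures >= self.threshold

    def call(self, fn, fallback):
        if self.open:
            return fallback()          # don't even attempt the integration
        try:
            result = fn()
            self.failures = 0          # success resets the breaker
            return result
        except Exception:
            self.failures += 1
            return fallback()          # degrade gracefully, keep talking

def crm_lookup():
    raise TimeoutError("CRM unavailable")  # simulated outage

breaker = CircuitBreaker(threshold=2)
answers = [breaker.call(crm_lookup, lambda: "default profile")
           for _ in range(3)]
```

Production implementations also re-close the breaker after a cool-down period, which is omitted here for brevity.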

The table below summarizes the (non-batch) interfaces and whether each is required for the continuation of the conversation.

Interface     Required for the conversation?
NLP           Yes – without it the conversation cannot continue
Context       Yes – the conversation state lives here
CRM           No – enriching only; use a circuit breaker
Historical    No – enriching only; use a circuit breaker

Storing the usage information
Implementing a chatbot requires continuous learning and improvement. The conversation is designed upfront on a best-effort basis, but as always the proof is in the pudding. How will the end user actually communicate with the chatbot?

  • Is the chatbot engine able to relate the entries to the intents?
  • Is it clear what the entities in the sentence are?
  • Is the flow of the conversation running as expected?
  • Are the end users referring to a trending topic that was not anticipated at the time the design of the conversation was done?

These questions, and many more, can only be answered if the conversation can be evaluated later, which means that information about the conversation needs to be stored for later usage. Depending on regulations and laws, the usage information may need to be anonymized. Under the General Data Protection Regulation (GDPR), user-related data may need to be removed when the end user asks for removal and after a specific amount of time. Since this information is only used in evaluations, as part of the continuous learning, no direct availability of the data is required. This implies that during peak conversation loads the usage data can be stored in temporary storage (such as queues) before being moved to the usage data storage.
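The buffering idea can be sketched with an in-memory queue; in production this would more likely be Kafka or a similar broker, and the record fields shown are assumptions. Note the record is already pseudonymous: a conversation id and intent, no name or message text.

```python
import queue

usage_buffer: "queue.Queue[dict]" = queue.Queue()
usage_store: list[dict] = []

def record_usage(conversation_id: str, intent: str) -> None:
    # Cheap and non-blocking from the conversation's point of view.
    usage_buffer.put({"conversation": conversation_id, "intent": intent})

def drain() -> int:
    # Runs off-peak; evaluation does not need the data immediately.
    moved = 0
    while not usage_buffer.empty():
        usage_store.append(usage_buffer.get())
        moved += 1
    return moved

record_usage("c-1", "transfer_funds")
record_usage("c-2", "unknown")
moved = drain()
```

Decoupling capture from storage this way keeps conversations fast during peak load while still feeding the later evaluation.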
Read more about Capgemini’s Oracle GDPR offering.

Chatbots are no different than any other application: multiple integrations support the application, all with different dynamics. As a chatbot is required to interact quickly with an end user, for every integration it needs to be clear whether the information it provides is crucial for the conversation or can be delayed until a later moment. Standardization of the integration via APIs helps not only with flexibility but also with building performant conversations.

This blog was written with Phil Wilkins (Twitter).