
AI-fuelled Character Recognition solutions pave the way for RPA

Robin den Boer

This blog article is part of our RPA (Robotic Process Automation) blog series and discusses the RPA opportunities arising from character recognition solutions, specifically the potential of Optical Character Recognition (OCR) and Intelligent Character Recognition (ICR) in an ERP ecosystem.

What is OCR and ICR?

Optical Character Recognition (OCR) is a well-known technology that extracts text from images and transforms it into an electronically usable format. Intelligent Character Recognition (ICR) adds artificial intelligence (AI), for example machine learning, to the OCR engine and thereby broadens the range of services that can be offered.

The following actions can be performed by using current ICR engines: reading the document, classifying the document, extracting the relevant data from the document, validating the data, and providing the data in a structured form for further processing.

Figure 1 – The components of an integrated Character Recognition solution
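The five steps listed above (read, classify, extract, validate, structure) can be sketched as a small pipeline. This is only an illustration under loud assumptions: all function names, the keyword-based classifier, and the regular expressions are hypothetical stand-ins, and a real ICR engine would use trained models for each step.

```python
import re


def read_document(raw_text):
    """Stand-in for the reading step: the text is assumed to be recognized already."""
    return raw_text.strip()


def classify_document(text):
    """Toy classifier (assumption): a real engine would use a trained model."""
    return "invoice" if "invoice" in text.lower() else "unknown"


def extract_data(text):
    """Pull out two example fields with simple patterns (illustrative only)."""
    number = re.search(r"Invoice\s+no\.?\s*:\s*(\S+)", text, re.IGNORECASE)
    total = re.search(r"Total\s*:\s*([0-9]+(?:\.[0-9]{2})?)", text, re.IGNORECASE)
    return {
        "invoice_number": number.group(1) if number else None,
        "total": total.group(1) if total else None,
    }


def validate_data(fields):
    """Return the names of the fields that could not be extracted."""
    return [name for name, value in fields.items() if value is None]


def process(raw_text):
    """Run the steps in order and return one structured record."""
    text = read_document(raw_text)
    doc_type = classify_document(text)
    fields = extract_data(text)
    return {"type": doc_type, "fields": fields, "missing": validate_data(fields)}


# Made-up sample text, not taken from a real invoice.
record = process("Invoice no: 2024-017\nTotal: 118.00")
print(record)
```

The structured record at the end is what a downstream system, such as an RPA bot, would consume.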

By adding AI functionalities to classic OCR engines, great improvements have been achieved. These built-in offerings show advances in the accuracy rate (the share of data that is correctly extracted into the database, e.g. recognizing that a character read as a B is actually an 8) and the passing rate (the percentage of documents that the engine is able to process without human intervention). They are also an enabler for end-to-end automation.
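To make the two metrics concrete, here is a minimal sketch of how they could be computed from a batch of processed documents. The `DocumentResult` record and the sample numbers are assumptions made for illustration, and vendors may define these rates differently (for example per character rather than per field).

```python
from dataclasses import dataclass


@dataclass
class DocumentResult:
    """Outcome of running one document through a (hypothetical) engine."""
    fields_total: int     # fields the engine tried to extract
    fields_correct: int   # fields that matched a manually verified value
    needed_human: bool    # True if an employee had to intervene


def accuracy_rate(results):
    """Share of extracted fields that were correct, across all documents."""
    total = sum(r.fields_total for r in results)
    correct = sum(r.fields_correct for r in results)
    return correct / total if total else 0.0


def passing_rate(results):
    """Share of documents processed without human intervention."""
    if not results:
        return 0.0
    passed = sum(1 for r in results if not r.needed_human)
    return passed / len(results)


# Made-up sample batch of four documents with ten fields each.
sample = [
    DocumentResult(fields_total=10, fields_correct=10, needed_human=False),
    DocumentResult(fields_total=10, fields_correct=9, needed_human=False),
    DocumentResult(fields_total=10, fields_correct=6, needed_human=True),
    DocumentResult(fields_total=10, fields_correct=10, needed_human=False),
]
print(f"accuracy rate: {accuracy_rate(sample):.1%}")  # 35 of 40 fields -> 87.5%
print(f"passing rate:  {passing_rate(sample):.1%}")   # 3 of 4 documents -> 75.0%
```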

Not all OCR and ICR solutions are alike

Capgemini performed an extensive vendor analysis. Our comparison of how vendors classify documents and extract data indicates that AI support is the main differentiator and drives increased accuracy and passing rates. The ability to handle a great variety of inputs, especially handprinted or handwritten ones, should also be one of the main considerations when selecting a tool.

Looking at document classification and data extraction, existing solutions differ in the exact methods they use. Two general approaches emerge. First, it is possible to use the organization's historical data as a training dataset. The engine learns from this data to improve classification and extraction, which keeps manual intervention to a minimum. Second, some vendors use artificial intelligence technologies to self-learn from operations. Generally, the engine calculates a certainty level for each document and checks whether it falls below a pre-defined threshold. If it does, the document is rerouted to an employee and handled as an exception, and the engine continuously learns from these interventions. In both approaches, or a combination of the two, vendors differentiate themselves using proprietary AI/machine learning algorithms.
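The threshold-based exception handling described above could look roughly like the sketch below. It is not any vendor's actual implementation: the `Extraction` and `Router` classes, the threshold of 0.90, and the sample invoices are all assumptions chosen for illustration. The actual learning step is proprietary per vendor, so the sketch only collects the employee's corrections as labelled examples.

```python
from dataclasses import dataclass, field

CONFIDENCE_THRESHOLD = 0.90  # assumed value; real deployments tune this


@dataclass
class Extraction:
    document_id: str
    fields: dict       # field name -> extracted value
    confidence: float  # engine's certainty level, between 0.0 and 1.0


@dataclass
class Router:
    threshold: float = CONFIDENCE_THRESHOLD
    straight_through: list = field(default_factory=list)
    exceptions: list = field(default_factory=list)
    training_feedback: list = field(default_factory=list)

    def route(self, extraction):
        """Send low-certainty documents to an employee, pass the rest on."""
        if extraction.confidence < self.threshold:
            self.exceptions.append(extraction)
            return "exception"
        self.straight_through.append(extraction)
        return "straight_through"

    def record_correction(self, extraction, corrected_fields):
        """Keep the employee's correction as a labelled example for retraining."""
        self.training_feedback.append((extraction.fields, corrected_fields))


router = Router()
clear = Extraction("INV-001", {"total": "118.00"}, confidence=0.97)
unclear = Extraction("INV-002", {"total": "1B.00"}, confidence=0.62)

print(router.route(clear))    # certainty at or above the threshold
print(router.route(unclear))  # certainty below the threshold
router.record_correction(unclear, {"total": "18.00"})
```

Keeping the corrections next to the original extraction is one simple way to build the training data that a self-learning engine would need.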

Another improvement area we encountered during our study is the accuracy of the extraction itself. Thanks to machine learning, vendors already show increased accuracy in extracting even handprinted information from images. AI has also allowed these engines to make great strides towards processing cursive handwriting, although not every vendor currently provides this ability. Moreover, the accuracy and passing rates for handprinted or handwritten information vary and are highly vendor dependent. Besides the accuracy of the extraction, we also see improvements in the number of languages that vendors support.

Use case example of OCR and ICR

These technological advancements have opened the door to new opportunities in the field of image recognition complemented with RPA. Let’s make it more tangible with a concrete client example from a finance process. Within the accounts payable (AP) process, a multitude of invoices from multiple suppliers, each with its own invoice layout, need to be processed in the business system. Some suppliers even invoice by physical post (yes, that still happens nowadays), while others send copies via e-mail. The ICR engine would be able to extract all relevant data, e.g. supplier name, amounts, and quantities ordered, and collect it in a structured database.

While most companies already have an OCR solution available, existing solutions might be unable to deliver the required inputs for further automation of the AP process. High error levels in the output of outdated OCR technologies might even result in manual, line-by-line checks to ensure that all the extracted data is correct. As the accuracy rate can be increased using AI functionalities, these kinds of manual checks may become redundant. Moreover, the higher accuracy creates opportunities to automate the end-to-end process, from invoice to the general ledger.

This is where we can reap the benefits of combining ICR and RPA. In our example, once the paper-based invoice has been transformed into an electronically readable format and there is a one-to-one match between the extracted data and the information required by the ERP system, the RPA solution can transfer the data to the business systems without human intervention.
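As a rough sketch of that hand-over, the check below only lets the bot post an invoice when the extracted fields match the required ones exactly. The field names are borrowed from the invoice example, but the required set, the function names, and the returned action labels are hypothetical. A real RPA tool would call the ERP system where this sketch returns a label.

```python
# Hypothetical set of fields the ERP system requires for an invoice posting.
REQUIRED_ERP_FIELDS = {"supplier_name", "invoice_amount", "quantity_ordered"}


def is_one_to_one_match(extracted, required=REQUIRED_ERP_FIELDS):
    """True when exactly the required fields are present and none is empty."""
    if set(extracted) != set(required):
        return False
    return all(
        value is not None and str(value).strip() != ""
        for value in extracted.values()
    )


def handle_invoice(extracted):
    """Decide whether the bot may post the invoice or a person must review it."""
    if is_one_to_one_match(extracted):
        return "post_to_erp"   # stand-in for the bot's data transfer step
    return "manual_review"


# Made-up sample extractions.
complete = {
    "supplier_name": "Example Supplier",
    "invoice_amount": "118.00",
    "quantity_ordered": "4",
}
incomplete = {"supplier_name": "Example Supplier", "invoice_amount": ""}

print(handle_invoice(complete))
print(handle_invoice(incomplete))
```

Requiring an exact match is a deliberately strict design choice here: anything the bot cannot map unambiguously goes to a person instead of into the ERP system.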

Long story short: AI is here to stay

An AI-driven solution that processes unstructured, image-based input and creates structured output is applicable well beyond our accounts payable example. Other candidates for combination with RPA include processing (written) complaint letters, entering or changing bank account numbers based on copies of official bank statements, entering or changing data from a copied driver's license, entering copies of receipts into expense systems, and so on. Surely you can think of a few more use cases in your own environment. As the technology advances, I expect more and more use cases for combining AI with RPA!