
OCR: what is it?

OCR (Optical Character Recognition) is a technology that transforms the text in an image into an editable file.

When you scan a document, it is like taking a picture of it. The result is an image file (JPEG or TIFF, for example) or a PDF. The text on the document is therefore static: it cannot be edited because, strictly speaking, it is not treated as text at all but as an image.

With OCR, it is possible to extract the alphanumeric characters from an image and obtain an editable document, such as a Word file or an Excel spreadsheet.

How does it work?

OCR is a complex task, but it can be summarized as a fairly simple process. The program analyzes the structure of your document and divides the page into distinct elements: images, tables, text, numbers, etc. It breaks the text into lines, the lines into words, and the words into characters. The system then recognizes each character and converts it to ASCII (American Standard Code for Information Interchange) text. OCR can recognize different fonts and character types, and in some cases even handwriting.
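To make the line, word and character breakdown more concrete, here is a minimal Python sketch. It assumes a tiny page already reduced to ink ('#') and blank ('.') pixels, and it uses simple projection profiles: rows with ink form lines, columns with ink form characters, and wide gaps separate words. The names PAGE, runs, segment and word_gap are invented for this illustration; real OCR engines rely on far more robust layout analysis, and the character recognition step itself is not shown.

```python
# Toy page: '#' is ink, '.' is blank paper. Two text lines separated by a blank row.
PAGE = [
    "##.#....##",
    "##.#....##",
    "..........",
    "#..##.....",
    "#..##.....",
]


def runs(flags):
    """Return (start, end) index pairs for each consecutive run of True values."""
    spans, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(flags)))
    return spans


def segment(page, word_gap=2):
    """Split a page into lines, words and character boxes (assumed toy method)."""
    result = []
    # Rows that contain ink form the text lines.
    for top, bottom in runs(["#" in row for row in page]):
        rows = page[top:bottom]
        # Columns that contain ink within this line form the characters.
        col_has_ink = [any(row[x] == "#" for row in rows) for x in range(len(rows[0]))]
        chars = runs(col_has_ink)
        # A gap of word_gap or more blank columns starts a new word.
        words, current = [], [chars[0]]
        for prev, cur in zip(chars, chars[1:]):
            if cur[0] - prev[1] >= word_gap:
                words.append(current)
                current = [cur]
            else:
                current.append(cur)
        words.append(current)
        result.append({"rows": (top, bottom), "words": words})
    return result


layout = segment(PAGE)
for line in layout:
    print(line["rows"], line["words"])
```

Each character box found this way would then be handed to a recognizer that decides which letter or digit it contains.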

This technology is very useful for automatically reading documents such as identity cards, certificates and forms, and many companies use it today. The system is able to “read” the content, extract structured data and reprocess that data for different purposes, such as validity checks. OCR is a real time saver and avoids hours of unnecessary paperwork.
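As an illustration of what extracting structured data can mean in practice, the sketch below turns recognized text into named fields using regular expressions. The sample text, the field labels and the extract_fields function are all assumptions made up for this example; a real identity card has its own layout, and production systems usually combine several techniques.

```python
import re
from datetime import date

# Hypothetical text, as an OCR engine might return it for an identity card.
OCR_TEXT = """\
IDENTITY CARD
Surname: DUPONT
Given names: Marie
Card No: 592-1234567-89
Valid until: 14.03.2031
"""

# Hypothetical field labels; a real document layout would define its own.
PATTERNS = {
    "surname": r"Surname:\s*(.+)",
    "given_names": r"Given names:\s*(.+)",
    "card_number": r"Card No:\s*([\d-]+)",
    "valid_until": r"Valid until:\s*(\d{2})\.(\d{2})\.(\d{4})",
}


def extract_fields(text):
    """Map each known label to its value, or None when the label is missing."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match is None:
            fields[name] = None
        elif name == "valid_until":
            # Convert the day.month.year string into a real date object.
            day, month, year = map(int, match.groups())
            fields[name] = date(year, month, day)
        else:
            fields[name] = match.group(1).strip()
    return fields


print(extract_fields(OCR_TEXT))
```

Once the values are available as typed fields rather than loose text, they can be compared, checked or passed on to other software.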

What are the benefits of OCR for my business?

OCR helps streamline processes and makes the content of a document usable.

This process not only lets you validate that a document submitted by a user is the right document from the right individual, but also lets you search for a word within a document and automatically reuse it in another document (a contract, for example). In addition, the data extracted from the document can be integrated into other software (accounting, CRM, ERP, document management system, etc.).

The benefits of OCR in the onboarding process

OCR is one of the many features of CheckHub. Our users can easily set up validation rules based on the data extracted from a document. Thanks to these rules, CheckHub automatically checks whether a document submitted by a customer is the correct one. If not, the system notifies the individual immediately and asks for another document. OCR also makes it possible to check the quality of document photos taken with a smartphone. All extracted data can also be used to prefill new documents such as forms, making it easier for your customers to complete them in a later step.
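The idea of validation rules and prefilling can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CheckHub's actual rule engine or API: the field names, the make_rules, validate and prefill functions, and the sample data are all hypothetical.

```python
from datetime import date

# Hypothetical data extracted from a submitted identity card.
extracted = {
    "document_type": "identity_card",
    "surname": "DUPONT",
    "valid_until": date(2031, 3, 14),
}


def make_rules(expected_type, expected_surname, today):
    """Build (error message, check) pairs for one expected document."""
    return [
        ("wrong document type",
         lambda d: d.get("document_type") == expected_type),
        ("name does not match the customer",
         lambda d: (d.get("surname") or "").casefold() == expected_surname.casefold()),
        ("document has expired",
         lambda d: d.get("valid_until") is not None and d["valid_until"] >= today),
    ]


def validate(data, rules):
    """Return the messages of all failed rules; an empty list means the document passes."""
    return [message for message, check in rules if not check(data)]


def prefill(form_fields, data):
    """Copy extracted values into the matching fields of a new form."""
    return {field: data.get(field, "") for field in form_fields}


rules = make_rules("identity_card", "Dupont", today=date(2025, 1, 1))
problems = validate(extracted, rules)
print(problems or "document accepted")
print(prefill(["surname", "address"], extracted))
```

If validate returns any messages, they can be shown to the customer along with a request for another document; fields the OCR could not supply, such as the address here, simply stay empty for the customer to fill in.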