Enhancing Your App Functionality with Custom OCR Models in Zoho Creator

27.06.23 02:29 PM - By CreatorScripts

Hello everyone! In this post, we're going to discuss how Optical Character Recognition (OCR) can extend what your app does by recognizing and extracting text from images.
The OCR model is a powerful text recognition tool capable of identifying text from various digital formats, such as documents, invoices, forms, ID cards, shipping container numbers, or vehicle number plates.
It operates primarily using a DIET classifier and a CRF entity extractor to gather the necessary information. Zoho Creator, the platform we are using, offers two OCR options: pre-built and custom.

Pre-built versus Custom OCR Models

The pre-built OCR model extracts all the text from an image, while the custom OCR model allows you to train the model with specific data to scan and extract particular text from an image.

Building a Custom OCR Model

For this tutorial, we'll focus on constructing a custom OCR model designed to extract specific information from an invoice: the amount, address, invoice date, and due date. The process involves six steps.

First, go to the Microservices section and click on the 'Create New' button. Select 'AI models' and, under custom models, choose OCR. Name your model and select the data type for training.

Zoho Creator provides a list of common data types used by businesses. If your desired data type isn't among these options, select 'Others' and define your own. Since we are extracting text from invoices, we choose 'Invoices' from the drop-down menu.

Next, specify the fields and the data type of the values to extract. In this case, we're focusing on specific fields, choosing the relevant data type for each from the drop-down menu. Note that the 'Date' field type should match the format in your training data.
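The note about date formats is easy to check before you upload anything. The sketch below is plain Python that runs outside Zoho Creator. The field names, the type labels, and the `%d-%m-%Y` format are our own assumptions for illustration, not identifiers from Zoho Creator.

```python
from datetime import datetime

# Hypothetical field specification mirroring the fields chosen in the model builder.
# The type labels and the date format are illustrative assumptions.
FIELDS = {
    "amount": "number",
    "address": "text",
    "invoice_date": "date",
    "due_date": "date",
}
DATE_FORMAT = "%d-%m-%Y"  # e.g. 27-06-2023


def matches_date_format(value, fmt=DATE_FORMAT):
    """Return True if value parses with the given strptime format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def mismatched_dates(sample):
    """List the date fields in one sample whose value does not match DATE_FORMAT."""
    return [
        name
        for name, kind in FIELDS.items()
        if kind == "date" and name in sample and not matches_date_format(sample[name])
    ]
```

Running `mismatched_dates` over the values you expect to tag in each sample invoice flags any date written in a different format from the one chosen for the field.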

Training the Model

Once your fields are set, it's time to train the model with sample invoices. Add at least five images with a similar layout, making sure each one is in a supported file format and does not exceed 5 MB in size.
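These upload rules can be checked on your own machine before you open the model builder. The following is a minimal sketch, not part of Zoho Creator. The five-image minimum and the 5 MB limit come from the steps above, while the list of allowed extensions is an assumption, so confirm it against the formats the upload dialog actually accepts.

```python
from pathlib import Path

MIN_IMAGES = 5                  # from the tutorial: at least five images
MAX_BYTES = 5 * 1024 * 1024     # from the tutorial: 5 MB, read here as 5 MiB
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumption, verify in the upload dialog


def check_files(files):
    """files: iterable of (name, size_in_bytes). Returns a list of problems found."""
    problems = []
    usable = 0
    for name, size in files:
        if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
            problems.append(f"{name}: unsupported format")
        elif size > MAX_BYTES:
            problems.append(f"{name}: larger than 5 MB")
        else:
            usable += 1
    if usable < MIN_IMAGES:
        problems.append(f"only {usable} usable images, need at least {MIN_IMAGES}")
    return problems


def scan_folder(folder):
    """Collect (name, size) pairs for every regular file in a folder."""
    return [(p.name, p.stat().st_size) for p in sorted(Path(folder).iterdir()) if p.is_file()]
```

Calling `check_files(scan_folder("invoices"))` on a hypothetical `invoices` folder returns an empty list when the set meets all three rules. The sketch cannot tell whether the layouts are similar, so that still needs a visual check.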

Tagging comes next. This involves linking the text from the images to the previously specified fields. You can create a bounding box or choose highlighted texts to tag. Repeat this process for all your images.
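To make the idea of bounding-box tagging concrete, here is a small conceptual sketch. It is not how Zoho Creator implements tagging. It only illustrates the general idea that drawing a box selects the recognized words that fall inside it. The `Box` and `Word` types, the pixel coordinates, and the "word centre inside the box" rule are all assumptions of ours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        """True if the point (x, y) lies inside or on the edge of the box."""
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass(frozen=True)
class Word:
    text: str   # a piece of text recognized in the image
    box: Box    # where that text sits, in pixels


def text_in_box(words, drawn):
    """Join the words whose centre lies inside the drawn box.

    Words are ordered by (top, left), a simplification that works
    when words on the same line share the same top coordinate.
    """
    picked = [
        w
        for w in words
        if drawn.contains((w.box.left + w.box.right) / 2, (w.box.top + w.box.bottom) / 2)
    ]
    picked.sort(key=lambda w: (w.box.top, w.box.left))
    return " ".join(w.text for w in picked)
```

In this picture, tagging the amount field means drawing a box around the total on the invoice, and the text returned for that box is what gets linked to the field.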

Upon completion, a model summary is displayed for review. Then train the model. How long training takes depends on the number of images and fields added.

Testing and Deploying the Model

After training, you can test the model to check its accuracy. Once satisfied with its performance, publish the model, making it available across all your apps. Note that a model, once published, cannot be unpublished; it can only be retrained or deleted.
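Because a published model cannot be unpublished, it helps to put a number on "satisfied" before you publish. One way, sketched below, is to note the values the model extracts for a few test invoices and compare them with the values you know to be correct. This is our own bookkeeping outside Zoho Creator. The field names and the whitespace- and case-insensitive comparison are assumptions.

```python
def normalise(value):
    """Collapse whitespace and ignore case so trivial differences do not count as errors."""
    return " ".join(str(value).split()).casefold()


def field_accuracy(expected, extracted):
    """Return {field: fraction of test invoices where the extracted value was correct}.

    expected and extracted are lists of dicts, one dict per test invoice,
    in the same order.
    """
    if len(expected) != len(extracted):
        raise ValueError("expected and extracted must describe the same invoices")
    fields = sorted({name for row in expected for name in row})
    scores = {}
    for name in fields:
        pairs = [(e[name], x.get(name, "")) for e, x in zip(expected, extracted) if name in e]
        hits = sum(normalise(a) == normalise(b) for a, b in pairs)
        scores[name] = hits / len(pairs)
    return scores
```

A field that scores noticeably lower than the others is a hint to add or re-tag training images for that field and retrain before publishing.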

There are two ways to deploy the model. You can click the 'Use Model' button and select the application and form to add the field to, or you can drag and drop the OCR field from the AI Field section into the form builder and select the model. Make sure there is an image field on the form to store the input image.

Finally, select the field types in which the extracted text should be displayed. If there are fields that are not relevant to your application, you can choose not to extract them.

Seeing the Custom OCR Model in Action

Now, let's see how this custom OCR model performs in practice. Upon accessing the app and uploading an invoice image, the tagged text is extracted and displayed in the specified fields.

Conclusion

This tutorial is one example of how the optical character recognition field can enhance your apps. You can customize the model to extract text from various kinds of images according to your business needs.

We hope this post has given you a comprehensive understanding of creating custom AI models for optical character recognition in Zoho Creator. Stay tuned for more tutorials on AI fields in Zoho Creator.

Don't forget to subscribe to our blog to stay updated with our latest posts.

Thanks for reading, and we look forward to connecting with you in our next post!

What is OCR technology used for?

OCR technology, or Optical Character Recognition, is used for converting different types of documents, such as scanned paper documents, PDF files or images captured by a digital camera, into editable and searchable data.

Is OCR an AI technology?

Yes, OCR is a part of Artificial Intelligence (AI) technology. It involves machine learning, intelligent character recognition, and a series of processes for the identification and conversion of printed or written text into machine-encoded text.

Where are OCR devices used?

OCR devices are commonly used in various industries such as banking for check processing, in law firms for legal document scanning, and in healthcare for patient record management.

What is better than OCR technology?

While OCR technology is highly efficient for many tasks, Intelligent Character Recognition (ICR) can be considered an advancement on it. ICR incorporates machine learning for improved accuracy with handwritten content.

What are examples of OCR technology?

Examples of OCR technology include scanning applications that convert printed documents into digital formats, software that converts PDF files into editable formats, and mobile apps that convert business cards into digital contact lists.

What is the difference between OCR and PDF?

OCR and PDF are distinct but can be related. OCR is a technology used to convert different types of documents into editable and searchable data. A PDF, on the other hand, is a type of file format. OCR technology can be used on a PDF to extract text and convert it into an editable format.