Coding Guide to Build an Optical Character Recognition (OCR) App in Google Colab Using OpenCV and Tesseract-OCR

nimda March 17, 2025

Optical Character Recognition (OCR) is a powerful technology that converts images of text into machine-readable content. With the growing demand for data extraction, OCR tools have become an integral part of many applications, from digitizing documents to extracting information from scanned images. In this tutorial, we will build an OCR app that runs in Google Colab, using OpenCV for image processing, Tesseract-OCR for text recognition, and Matplotlib for visualization. By the end of this guide, you will be able to upload an image, preprocess it, extract its text, and download the results, all within a Colab notebook.

!apt-get install -y tesseract-ocr
!pip install pytesseract opencv-python numpy matplotlib

To set up the OCR environment in Google Colab, we start by installing Tesseract-OCR, an open-source OCR engine, using apt-get. We also install the essential Python libraries: pytesseract (a Python wrapper for Tesseract), OpenCV (image processing), NumPy (numerical operations), and Matplotlib (visualization).
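Before moving on, it can help to confirm that the install step actually put the Tesseract binary on the PATH. The following is a minimal sketch using only the standard library; the helper name tesseract_available is hypothetical and not part of the original notebook.

```python
import shutil


def tesseract_available(binary_name="tesseract"):
    """Return True if the given executable can be found on the PATH."""
    return shutil.which(binary_name) is not None


if tesseract_available():
    print("Tesseract found at:", shutil.which("tesseract"))
else:
    print("Tesseract not found; rerun the apt-get install cell.")
```

If the check fails in Colab, rerunning the installation cell is usually the first thing to try.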

import cv2
import pytesseract
import numpy as np
import matplotlib.pyplot as plt
from google.colab import files
from PIL import Image

Next, we import the libraries needed for image processing and OCR. OpenCV (cv2) is used for reading and preprocessing images, and pytesseract provides a Python interface to the Tesseract OCR engine for text extraction. NumPy (np) helps with array manipulation, and Matplotlib (plt) displays the processed images. The files module from google.colab lets users upload images, and PIL (Image) converts images into the format passed to the OCR engine.

uploaded = files.upload()


filename = list(uploaded.keys())[0]

To run OCR on an image, we first need to upload it to Google Colab. The files.upload() function from the google.colab files module lets users select and upload an image file from their local system. The uploaded file is stored in a dictionary, with the file name as the key. We retrieve the file name using list(uploaded.keys())[0], which lets us access and process the uploaded image in the following steps.
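files.upload() only works inside Colab, but the dictionary it returns maps each file name to that file's raw bytes. The sketch below uses a hand-built stand-in dictionary (the file name and bytes are made up) to show how the first key is retrieved, with a guard for the case where nothing was uploaded. The helper first_uploaded_name is hypothetical.

```python
def first_uploaded_name(uploaded):
    """Return the first file name in an upload dictionary."""
    if not uploaded:
        raise ValueError("No file was uploaded.")
    # Dictionaries preserve insertion order, so this is the first uploaded file.
    return next(iter(uploaded))


# Stand-in for the result of files.upload(); name and contents are made up.
fake_uploaded = {"receipt.png": b"\x89PNG..."}
filename = first_uploaded_name(fake_uploaded)
print(filename)
```

The guard matters because list(uploaded.keys())[0] raises an IndexError with no explanation if the upload dialog is cancelled.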

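def preprocess_image(image_path):
    # Read the image from disk (OpenCV loads it in BGR channel order)
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")

    # Convert to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Otsu's method picks the threshold automatically, so the 150 here is ignored
    _, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    return thresh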


processed_image = preprocess_image(filename)


plt.imshow(processed_image, cmap='gray')
plt.axis('off')
plt.show()

To improve OCR accuracy, we apply a preprocessing function that enhances image quality for text extraction. The preprocess_image() function first reads the uploaded image using OpenCV (cv2.imread()) and converts it to grayscale using cv2.cvtColor(). Next, we apply binary thresholding with Otsu's method using cv2.threshold(), which helps separate the text from the background by converting the image into a high-contrast black-and-white format. Because the THRESH_OTSU flag is set, the threshold is chosen automatically and the value 150 passed in the call is ignored. Finally, the processed image is displayed using Matplotlib (plt.imshow()).
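To make the thresholding step more concrete, here is a rough pure-Python sketch of the idea behind Otsu's method: try every candidate threshold and keep the one that maximizes the between-class variance of the two resulting pixel groups. This is an illustration on a flat list of 8-bit values, not OpenCV's actual implementation, and the function names otsu_threshold and binarize are hypothetical. The sample pixel values are made up.

```python
def otsu_threshold(pixels):
    """Return the threshold (0-255) that maximizes between-class variance."""
    # Histogram of the 256 possible gray levels.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w_low = 0    # number of pixels with value <= t
    sum_low = 0  # sum of those pixel values
    for t in range(256):
        w_low += hist[t]
        sum_low += t * hist[t]
        w_high = total - w_low
        if w_low == 0:
            continue  # no pixels in the lower class yet
        if w_high == 0:
            break     # every pixel is already in the lower class
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        # Between-class variance, up to a constant factor.
        score = w_low * w_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def binarize(pixels, t):
    """Map pixels above t to 255 (white) and the rest to 0 (black)."""
    return [255 if p > t else 0 for p in pixels]


# Made-up sample: four dark "text" pixels and four light "paper" pixels.
sample = [20, 30, 40, 25, 210, 220, 230, 215]
t = otsu_threshold(sample)
print(t, binarize(sample, t))
```

On a real image the same search runs over the histogram of the whole grayscale array, which is why no hand-tuned threshold is needed.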

def extract_text(image):
    # Convert the NumPy array to a PIL image
    pil_image = Image.fromarray(image)

    # Run Tesseract OCR and return the recognized text
    text = pytesseract.image_to_string(pil_image)

    return text


extracted_text = extract_text(processed_image)


print("Extracted Text:")
print(extracted_text)

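The extract_text() function performs OCR on the preprocessed image. Because pytesseract works with PIL images, we first convert the NumPy array (the thresholded image) to a PIL image using Image.fromarray(). The image is then passed to pytesseract.image_to_string(), which returns the recognized text. Finally, we print the extracted text to review the OCR output for the uploaded image.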
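Raw OCR output often carries stray whitespace, blank lines, and sometimes a trailing form-feed character used as a page separator. As a small optional sketch (not part of the original notebook), the hypothetical helper clean_ocr_text below tidies a string such as extracted_text before it is printed or saved. The sample string is made up.

```python
def clean_ocr_text(text):
    """Strip page separators, surrounding spaces, and blank lines from OCR output."""
    # Remove form-feed characters that can appear as page separators.
    text = text.replace("\f", "")
    # Trim each line and drop the ones that end up empty.
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


sample_text = "  Invoice #123  \n\n Total: 42.00 \n\f"
print(clean_ocr_text(sample_text))
```

Whether to drop blank lines depends on the use case; for documents where paragraph breaks matter, it may be better to keep them.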

with open("extracted_text.txt", "w") as f:
    f.write(extracted_text)


files.download("extracted_text.txt")

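To make the extracted text easy to access, we save it as a text file using Python's built-in file handling. The call open("extracted_text.txt", "w") opens the file in write mode, and f.write() writes the OCR output to it. Once the file is written, files.download() lets us download it from Colab to the local machine.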
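OCR output can contain non-ASCII characters, so it can be worth passing an explicit encoding when writing the file. The sketch below is a variation on the save step, with a hypothetical helper save_text and a temporary directory standing in for the Colab working directory; files.download() is omitted because it only works inside Colab. The sample text is made up.

```python
import os
import tempfile


def save_text(text, path):
    """Write text to path as UTF-8 and return the number of characters written."""
    with open(path, "w", encoding="utf-8") as f:
        return f.write(text)


# Temporary directory used only for this illustration.
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "extracted_text.txt")
written = save_text("Café total: 12 €", out_path)

# Read the file back to confirm the round trip.
with open(out_path, encoding="utf-8") as f:
    print(f.read())
```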

In conclusion, by combining OpenCV, Tesseract-OCR, and Matplotlib, we have built an OCR application that can process images and extract text in Google Colab. This workflow provides a simple but effective way to convert scanned documents, printed text, or handwritten content into a digital text file. The preprocessing steps help improve accuracy, and the ability to save and download the results facilitates further analysis.

Here is the Colab Notebook.

Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of an artificial intelligence media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easy to understand. The platform receives more than two million monthly views, indicating its popularity among its audience.
