Optical Character Recognition (OCR) in Python

Price

Free

What you will learn

Use Tesseract, EAST and EasyOCR tools for text recognition in images and videos
Understand the differences between OCR in controlled and natural environments
Apply image pre-processing techniques to improve image quality, such as thresholding, inversion, resizing, morphological operations and noise reduction
Use EAST architecture and EasyOCR library for better performance in natural scenes
Train an OCR model from scratch using Deep Learning and Convolutional Neural Networks
Apply natural language processing techniques to the text extracted by OCR (word cloud and named entity recognition)
Read license plates
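
As a rough illustration of two of the pre-processing techniques listed above (Otsu thresholding and inversion), here is a minimal sketch in plain Python. It is not the course's code, which presumably relies on an image library; the function names and the tiny sample image are assumptions made for this example only.

```python
def otsu_threshold(pixels):
    """Pick the gray level (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_dark, sum_dark = 0, 0
    for t in range(256):
        # Pixels with value <= t form the "dark" class.
        weight_dark += hist[t]
        sum_dark += t * hist[t]
        weight_bright = total - weight_dark
        if weight_dark == 0:
            continue
        if weight_bright == 0:
            break
        mean_dark = sum_dark / weight_dark
        mean_bright = (sum_all - sum_dark) / weight_bright
        variance = weight_dark * weight_bright * (mean_dark - mean_bright) ** 2
        if variance > best_var:
            best_t, best_var = t, variance
    return best_t


def binarize(image, threshold):
    """Map pixels above the threshold to white (255), the rest to black (0)."""
    return [[255 if value > threshold else 0 for value in row] for row in image]


def invert(image):
    """Swap dark and light, e.g. to turn light text on dark into dark on light."""
    return [[255 - value for value in row] for row in image]


# Hypothetical 2x3 grayscale image: three dark pixels and three bright ones.
image = [[10, 12, 200], [11, 205, 210]]
flat = [value for row in image for value in row]

threshold = otsu_threshold(flat)
binary = binarize(image, threshold)
inverted = invert(binary)
print(threshold, binary, inverted)
```

Working on nested lists keeps the sketch dependency-free; on real images, a vectorized library implementation would be far faster.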

Requirements

Programming logic
Basic Python programming

Description

Optical Character Recognition (OCR) is a sub-area of Computer Vision that aims to transform images into text. It converts images containing typed, handwritten or printed text into characters that a machine can process, so scanned or photographed documents become text that can be edited in any tool, such as Microsoft Word. A common application is automatic form reading: you send a photo of your credit card or driver’s license, and the system reads your data without you having to type it manually. A self-driving car can use OCR to read traffic signs, and a parking lot can grant access by reading a car’s license plate!

To introduce you to this area, this course teaches you in practice how to use OCR libraries to recognize text in images and videos, with all the code implemented step by step in Python! We are going to use Google Colab, so you do not have to worry about installing libraries on your machine: everything is developed online using Google’s GPUs! You will also learn how to build your own OCR from scratch using Deep Learning and Convolutional Neural Networks! Below you can check the main topics of the course:

Recognition of texts in images and videos using Tesseract, EasyOCR and EAST
Search for specific terms in images using regular expressions
Techniques for improving image quality, such as thresholding, color inversion, grayscale conversion, resizing, noise removal, morphological operations and perspective transformation
EAST architecture and EasyOCR library for better performance in natural scenes
Training an OCR model from scratch using TensorFlow and modern Deep Learning techniques, such as Convolutional Neural Networks
Application of natural language processing techniques to the text extracted by OCR (word cloud and named entity recognition)
License plate reading
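
To make the "search for specific terms using regular expressions" topic concrete, here is a minimal sketch using Python's standard `re` module. The sample text stands in for the string an OCR engine would return; the text, the date pattern and the `find_matches` helper are assumptions made for illustration, not material taken from the course.

```python
import re

# Hypothetical text, standing in for the output of an OCR engine.
ocr_text = """Invoice No. 2041
Date: 12/03/2021
Contact: billing@example.com
Due date: 11/04/2021"""

# Assumed pattern: dates written as two digits / two digits / four digits.
DATE_PATTERN = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")


def find_matches(text, pattern):
    """Return (line_number, matched_text) pairs, with 1-indexed line numbers."""
    results = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            results.append((line_number, match.group()))
    return results


dates = find_matches(ocr_text, DATE_PATTERN)
for line_number, value in dates:
    print(f"line {line_number}: {value}")
```

Real OCR output often contains misread characters, so patterns used in practice usually need to be more tolerant than this one.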

These are just some of the main topics! By the end of the course, you will know everything you need to create your own text recognition projects using OCR!

Who this course is for

Anyone interested in OCR (Optical Character Recognition)
Undergraduate students who are studying subjects related to Artificial Intelligence, Digital Image Processing or Computer Vision
Data Scientists who want to increase their knowledge in Computer Vision
Professionals who want to develop optical character recognition solutions
People interested in creating their own custom OCR

Course Content

Lesson Content

Course content

Introduction to OCR

Course materials

Lesson Content

Introduction to Tesseract

Preparing the environment

First text recognition

Support for other languages

Page segmentation mode (PSM)

Selection of texts 1

Selection of texts 2

Selection of texts 3

Search using regular expressions

Detections in natural scenarios

Lesson Content

Grayscale

Thresholding – intuition

Simple thresholding

Thresholding with Otsu's method

Adaptive thresholding

Gaussian adaptive thresholding

Color inversion

Resizing – intuition

Resizing – implementation

Morphological operations – intuition

Morphological operations – implementation

Noise removal – intuition

Noise removal – implementation

Text recognition with OCR

Homework

Homework solution

Lesson Content

EAST – introduction

Preprocessing the image

Loading the neural network

Decoding the image 1

Decoding the image 2

Text recognition

Lesson Content

Importing the libraries

MNIST 0-9 dataset

Kaggle A-Z dataset

Joining the datasets

Preprocessing the data

Building the neural network

Training the neural network

Evaluating the neural network

Saving the neural network

Testing with images

Preparing the environment

Preprocessing the image

Contour detection

Processing the detections 1

Processing the detections 2

Character recognition

Problems with 0 and O, 1 and l, 5 and S

Problems with undetected texts

Lesson Content

Preparing the environment

Text recognition

Writing the results on the image

Other languages – French and Chinese

Text recognition (background)

Lesson Content

Preparing the environment

Video settings

Processing the video

OCR with EAST and Tesseract

OCR with EasyOCR

Lesson Content

Preparing the environment

Text recognition

Searching for texts

Word cloud

Named entity recognition

Search for texts in images

Saving the results

Lesson Content

Preparing the environment

Contour detection

Perspective transformation

OCR with Tesseract

Improving image quality

Putting it all together

Lesson Content

Preprocessing the image

Text recognition

Improving image quality

Lesson Content

Biological fundamentals

Single layer perceptron

Multilayer perceptron – sum and activation functions

Multilayer perceptron – error calculation

Gradient descent

Delta parameter

Updating weights with backpropagation

Bias, error, stochastic gradient descent, and more parameters

Lesson Content

Introduction to convolutional neural networks

Convolutional operator

Pooling

Flattening

Dense neural network

Lesson Content

Final remarks

Ratings and Reviews

4.8

Avg. Rating

59 Ratings

Review posted on Udemy

Posted 6 months ago

by Kaiweng Phoon

The course was easy to follow with a user-friendly UI, and it covered exactly what I wanted to learn, especially techniques for improving image quality.

Review posted on Udemy

Posted 8 months ago

by Aishwarya R

The course explains traditional OCR tools like Tesseract, EasyOCR, and EAST very well and is great for learning the fundamentals. One suggestion: it would really help learners if the course included a section on modern vision-LLM OCR methods (GPT-based OCR, Qwen2-VL, GOT-OCR2, LLaVA, etc.). These newer approaches are widely used now and give much better results for scanned or blurry documents. Adding a comparison between classical OCR and LLM-based OCR would make the course even more complete.

Review posted on Udemy

Posted 9 months ago

by Tiago Gonçalves Dias

Super clear to understand

Review posted on Udemy

Posted 9 months ago

by Roberto Rodriguez Apolinar

Really good explanation about the topics. Highly recommended course.

Review posted on Udemy

Posted 9 months ago

by Ahmed Abdullah Aafaq

The course is well structured and very well organized. It provides essential concepts that are required for individuals of various areas and expertise.

Review posted on Udemy

Posted 10 months ago

by 牟田賢

This is the course I had been hoping for.

Review posted on Udemy

Posted 10 months ago

by Abdurrahman Karadeniz

Nice course

Review posted on Udemy

Posted 12 months ago

by Josip Rajčanji

+++++

Review posted on Udemy

Posted 1 year ago

by Al Sheikh Aminul Islam

nice

Review posted on Udemy

Posted 1 year ago

by John Kent

Very well explained
