Real-time OCR and Text Detection with Tensorflow, OpenCV and Tesseract

Start Date: 01/24/2021

Course Type: Common Course

Course Link: https://www.coursera.org/learn/ocr-text-detection-tensorflow-opencv-tesseract

Article Example
Real-time text Real-time text is used in closed captioning, where captions are streamed continuously during live events. Transcription services including Communication Access Real-Time Translation and TypeWell frequently use real-time text, where text is streamed live to a remote display. This is used in court reporting, and is also used by deaf attendees at a conference. Real-time text also enhances text messaging on mobile phones, via real-time texting apps.
Real-time text Real-time text is frequently used by the deaf, including IP-Relay services, TDD/TTY devices, and Text over IP. Real-time text allows the other person to read immediately, without waiting for the sender to finish composing his or her sentence/message. This allows conversational use of text, much like a hearing person can listen to someone speaking in real-time.
Tesseract (software) In a July 2007 article on Tesseract, Anthony Kay of "Linux Journal" termed it "a quirky command-line tool that does an outstanding job". At that time he noted "Tesseract is a bare-bones OCR engine. The build process is a little quirky, and the engine needs some additional features (such as layout detection), but the core feature, text recognition, is drastically better than anything else I've tried from the Open Source community. It is reasonably easy to get excellent recognition rates using nothing more than a scanner and some image tools, such as The GIMP and Netpbm."
Real-time text Collaborative real-time editing is the utilization of real-time text for shared editing, rather than for conversation. Split screen chat, where conversational text appears continuously, is also considered real-time text. Some examples that provide this as a service are Apache Wave, Etherpad, and most notably Google Docs.
Real-time text While standard instant messaging is not real-time text (the message is only sent at the end of a thought, not while it is being composed), a real-time text option is found in some instant messaging software, including AOL Instant Messenger's "Real-Time IM" feature. Real-time text is also possible over any XMPP compatible chat networks, including those used by Apple iChat, Cisco WebEx, and Google Talk, by using appropriate software that has a real-time text feature. When present in IM programs, the real-time text feature can be turned on/off, just like other chat features such as audio. Real-time text programs date at least to the 1970s, with the talk program on the DEC PDP-11, which remains in use on Unix systems.
TensorFlow TensorFlow is Google Brain's second generation machine learning system, released as open source software on November 9, 2015. While the reference implementation runs on single devices, TensorFlow can run on multiple CPUs and GPUs (with optional CUDA extensions for general-purpose computing on graphics processing units). TensorFlow is available on 64-bit Linux, macOS, and mobile computing platforms including Android and iOS.
Real-time text Real-time text protocols include Text over IP (ToIP) designed around ITU-T T.140, IETF RFC4103, RFC5194, and XMPP Extension Protocol XEP-0301.
Real-time text During 2012, the Real-Time Text Taskforce (R3TF) designed a standard international symbol to represent real-time text, as well as the alternate name Fast Text to improve public education of the technology.
Real-time text Certain real-time text applications have a feature that allows the real-time text to be "turned off", for temporary purposes. This allows the sender to pre-compose the message as a standard IM or text message before transmitting.
Real-time text Real-time text is used for conversational text, in collaboration, and in live captioning. Technologies include TDD/TTY devices for the deaf, live captioning for TV, Text over IP (ToIP), some types of instant messaging, captioning for telephony/video teleconferencing, telecommunications relay services including IP-Relay, transcription services including Remote CART and TypeWell, collaborative text editing, streaming text applications, and next-generation 9-1-1/1-1-2 emergency services. Obsolete TDD/TTY devices are being replaced by more modern real-time text technologies, including Text over IP, IP-Relay, and instant messaging.
Real-time text Real-time text is also historically found in the old UNIX talk (software), BBS software such as Celerity BBS, and older versions of ICQ messaging software.
Viola–Jones object detection framework The Viola–Jones object detection framework, proposed in 2001 by Paul Viola and Michael Jones, was the first object detection framework to provide competitive object detection rates in real time. Although it can be trained to detect a variety of object classes, it was motivated primarily by the problem of face detection. An implementation of this algorithm is included in OpenCV.
Real-time text According to ITU-T Multimedia Recommendation F.703, Total Conversation defines the simultaneous use of audio, video and real-time text. An instant messaging program that can enable all three features simultaneously would be compliant. Real-time text is thus one of the three required components of Total Conversation.
Tesseract (software) Tesseract is suitable for use as a backend and can be used for more complicated OCR tasks including layout analysis by using a frontend such as OCRopus.
Tesseract (software) Tesseract is considered one of the most accurate open-source OCR engines currently available.
OpenCV OpenCV ("Open Source Computer Vision") is a library of programming functions mainly aimed at real-time computer vision. Originally developed by Intel's research center in Nizhny Novgorod (Russia), it was later supported by Willow Garage and is now maintained by Itseez. The library is cross-platform and free for use under the open-source BSD license.
Tesseract The regular tesseract, along with the 16-cell, exists in a set of 15 uniform 4-polytopes with the same symmetry. The tesseract {4,3,3} exists in a sequence of regular 4-polytopes and honeycombs, {p,3,3} with tetrahedral vertex figures, {3,3}. The tesseract is also in a sequence of regular 4-polytope and honeycombs, {4,3,p} with cubic cells.
OpenCV Officially launched in 1999, the OpenCV project was initially an Intel Research initiative to advance CPU-intensive applications, part of a series of projects including real-time ray tracing and 3D display walls. The main contributors to the project included a number of optimization experts in Intel Russia, as well as Intel’s Performance Library Team. In the early days of OpenCV, the goals of the project were described as:
Tesseract (software) Tesseract up to and including version 2 could only accept TIFF images of simple one-column text as inputs. These early versions did not include layout analysis, and so inputting multi-columned text, images, or equations produced garbled output. Since version 3.00 Tesseract has supported output text formatting, hOCR positional information and page-layout analysis. Support for a number of new image formats was added using the Leptonica library. Tesseract can detect whether text is monospaced or proportionally spaced.
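The hOCR positional information mentioned above is HTML in which each recognized word carries its bounding box in a `title` attribute. The following is a minimal sketch of reading those boxes using only the Python standard library. The `SAMPLE_HOCR` fragment is hand-written for illustration and is not actual Tesseract output, and the names `HocrWordParser`, `parse_bbox` and `extract_words` are hypothetical helpers, not part of Tesseract.

```python
from html.parser import HTMLParser

# Hand-written hOCR-style fragment (an assumption for illustration,
# not captured from a real Tesseract run).
SAMPLE_HOCR = """
<span class='ocr_line' title='bbox 36 92 210 116'>
  <span class='ocrx_word' title='bbox 36 92 96 116; x_wconf 95'>Hello</span>
  <span class='ocrx_word' title='bbox 110 92 210 116; x_wconf 91'>world</span>
</span>
"""


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from an hOCR title attribute, or None."""
    for prop in title.split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:])
    return None


class HocrWordParser(HTMLParser):
    """Collect (text, bbox) pairs from spans with class 'ocrx_word'."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None  # bbox of the word span currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            self._bbox = parse_bbox(attrs.get("title") or "")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open word span.
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._bbox is not None:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._bbox = None


def extract_words(hocr):
    """Parse an hOCR string and return a list of (text, bbox) tuples."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


if __name__ == "__main__":
    for text, bbox in extract_words(SAMPLE_HOCR):
        print(text, bbox)
```

In a real-time pipeline, boxes like these could be drawn onto the video frame around each recognized word. Real hOCR output may carry additional properties in the `title` attribute, which this sketch ignores.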
Tesseract (software) Tesseract is an optical character recognition engine for various operating systems. It is free software, released under the Apache License, Version 2.0, and development has been sponsored by Google since 2006.