OCR Operator

The OCR LOP extracts text from images using Optical Character Recognition. It leverages the SideCar server to run either the EasyOCR or PaddleOCR library for the actual text detection and recognition. This allows for offline OCR processing (after initial model setup in SideCar) and offloads the computational work from the main TouchDesigner process.

  • SideCar Server: The SideCar server application must be running. See the SideCar Guide for setup instructions.
  • SideCar Dependencies: The Python environment used by the SideCar server needs the relevant OCR library installed:
    • For EasyOCR: easyocr
    • For PaddleOCR: paddleocr, paddlepaddle-gpu (or paddlepaddle for CPU)
  • ChatTD Operator: Required for asynchronous communication with the SideCar server and logging. Ensure the ChatTD Operator parameter on the ‘About’ page points to your configured ChatTD instance.
  • Input TOP (Inputtop parameter): Connect the image (TOP) from which you want to extract text.
  • Results Table (results_table DAT): An internal table containing detailed results for each detected text block, including the text, confidence score, and bounding box information (u, v, width, height).
  • Output Text (Outputtext parameter): A string parameter displaying the combined extracted text (if Combine Text is On) or the text from the first detected block.
  • Viewer TOP / Output 2 (out2_top): Displays the input image with bounding boxes drawn around the detected text regions. The appearance (color, line width) can be customized.
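The bounding-box fields in the results table can be mapped back onto the input image. The sketch below assumes (this page does not confirm it) that u, v, width and height are fractions of the image size in the 0 to 1 range, with (u, v) marking the lower-left corner of the box. The helper name `bbox_to_pixels` is hypothetical.

```python
def bbox_to_pixels(u, v, width, height, image_width, image_height):
    """Convert a normalized box to integer pixel coordinates.

    Assumption: all four box values are fractions of the image size (0-1)
    and (u, v) is the lower-left corner of the box.
    """
    left = round(u * image_width)
    bottom = round(v * image_height)
    right = round((u + width) * image_width)
    top = round((v + height) * image_height)
    return left, bottom, right, top


# Example: a box covering the centre quarter of a 1280x720 image.
print(bbox_to_pixels(0.25, 0.25, 0.5, 0.5, 1280, 720))  # (320, 180, 960, 540)
```

If the operator turns out to report a different origin or pixel units, only the arithmetic inside the helper needs to change.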
  • Process (Process) op('ocr').par.Process Pulse. Default: None.
  • Input TOP (Inputtop) op('ocr').par.Inputtop TOP. Default: None.
  • Status (Status) op('ocr').par.Status String. Default: None.
  • Active (Active) op('ocr').par.Active Toggle. Default: Off.
  • Model Type (Modeltype) op('ocr').par.Modeltype Menu. Default: paddle. Options: paddle, easy.
  • Min Confidence (Minconfidence) op('ocr').par.Minconfidence Float. Default: 0. Range: 0 to 1. Slider Range: 0 to 1.
  • Image Scale (Imagescale) op('ocr').par.Imagescale Float. Default: 0. Range: 0.1 to 1. Slider Range: 0.1 to 1.
  • Combine Text (Combinetext) op('ocr').par.Combinetext Toggle. Default: Off.
  • Output Text (Outputtext) op('ocr').par.Outputtext String. Default: None.
Display / TOP Out2 Header
  • Line Width (Width) op('ocr').par.Width Float. Default: 0. Range: 0 to 1. Slider Range: 0 to 1.
  • Color R (Colorr) op('ocr').par.Colorr RGB. Default: 0.
  • Color G (Colorg) op('ocr').par.Colorg RGB. Default: 0.
  • Color B (Colorb) op('ocr').par.Colorb RGB. Default: 0.
Callbacks Header
  • Callback DAT (Callbackdat) op('ocr').par.Callbackdat DAT. Default: ChatTD_callbacks.
  • Edit Callbacks (Editcallbacksscript) op('ocr').par.Editcallbacksscript Pulse. Default: None.
  • Create Callbacks (Createpulse) op('ocr').par.Createpulse Pulse. Default: None.
  • onComplete (Oncomplete) op('ocr').par.Oncomplete Toggle. Default: Off.
  • Textport Debug Callbacks (Debugcallbacks) op('ocr').par.Debugcallbacks Menu. Default: Full Details. Options: None, Errors Only, Basic Info, Full Details.
Available Callbacks:
  • onComplete
Example Callback Structure:
def onComplete(info):
    # Called after OCR processing completes and results are received.
    # The info dictionary contains details like:
    # - op: The OCR operator instance
    # - text: List of detected text strings
    # - results: List of detailed result dictionaries (text, confidence, u, v, width, height)
    # - processing_time: Time taken by the SideCar server in seconds
    # - status: 'success' or 'error'
    # - error: Error message if status is 'error'

    if info.get('status') == 'success':
        num_results = len(info.get('results', []))
        combined = " ".join(info.get('text', []))
        print(f"OCR finished successfully: Found {num_results} text blocks.")
        # print(f"Combined text: {combined[:100]}...")  # Example: print the first 100 characters

        # Example: Trigger another operator based on results
        # if num_results > 0:
        #     op('downstream_logic').par.Process.pulse()
    else:
        print(f"OCR Error: {info.get('error')}")
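A callback will often want only the confident detections rather than every block. The following is a minimal sketch of such a filter, assuming each entry in `info['results']` is a dictionary with `text` and `confidence` keys as the comment list above suggests. The helper name `confident_texts` and the sample data are hypothetical.

```python
def confident_texts(info, min_confidence=0.5):
    """Return the text of every result at or above min_confidence."""
    results = info.get('results', [])
    return [
        r['text']
        for r in results
        if float(r.get('confidence', 0.0)) >= min_confidence
    ]


# Stand-in for the dictionary the operator would pass to onComplete.
sample_info = {
    'status': 'success',
    'results': [
        {'text': 'HELLO', 'confidence': 0.92},
        {'text': 'w0rld', 'confidence': 0.31},
    ],
}
print(confident_texts(sample_info))  # ['HELLO']
```

Calling `confident_texts(info)` from inside `onComplete` applies a stricter threshold in the callback than the one set on `Min Confidence`, without rerunning OCR.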
  • Bypass (Bypass) op('ocr').par.Bypass Toggle. Default: Off.
  • Show Built-in Parameters (Showbuiltin) op('ocr').par.Showbuiltin Toggle. Default: Off.
  • Version (Version) op('ocr').par.Version String. Default: "" (Empty String).
  • Last Updated (Lastupdated) op('ocr').par.Lastupdated String. Default: "" (Empty String).
  • Creator (Creator) op('ocr').par.Creator String. Default: "" (Empty String).
  • Website (Website) op('ocr').par.Website String. Default: "" (Empty String).
  • ChatTD Operator (Chattd) op('ocr').par.Chattd OP. Default: "" (Empty String).
1. Ensure the SideCar server is running and has the desired OCR library (`easyocr` or `paddleocr`) installed.
2. Connect an image TOP containing text to the `Input TOP` parameter.
3. Select the desired engine using the `Model Type` parameter (menu values: `paddle` or `easy`).
4. Adjust `Min Confidence` if needed (e.g., a low threshold such as 0.3 keeps more uncertain text).
5. Pulse the `Process` parameter.
6. Monitor the `Status` parameter. Extracted text will appear in the `Output Text` parameter and detailed results in the internal `results_table` DAT.
7. The second TOP output will show the input image with detected text areas highlighted.
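The steps above can also be driven from a script. This is a rough sketch that uses only the parameter names listed on this page (`Inputtop`, `Modeltype`, `Minconfidence`, `Process`). The function name `run_ocr` and the operator names in the usage comment are placeholders, and the sketch assumes the SideCar server and ChatTD are already configured.

```python
def run_ocr(ocr_op, input_top, model_type='easy', min_confidence=0.3):
    """Configure the OCR operator and trigger one processing pass."""
    ocr_op.par.Inputtop = input_top            # image to read text from
    ocr_op.par.Modeltype = model_type          # 'paddle' or 'easy'
    ocr_op.par.Minconfidence = min_confidence  # drop detections below this score
    ocr_op.par.Process.pulse()                 # results arrive asynchronously


# Inside TouchDesigner (operator names are placeholders):
# run_ocr(op('ocr'), op('moviefilein1'))
```

Because processing is asynchronous, `Output Text` will not be populated immediately after the call returns. Watch the `Status` parameter or use the `onComplete` callback to know when results are ready.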
1. Perform basic OCR setup as above.
2. Create a Text DAT (e.g., 'ocr_callbacks').
3. Paste the example `onComplete` function code into the DAT.
4. Set the `Callback DAT` parameter on the OCR operator to point to 'ocr_callbacks'.
5. Ensure the `onComplete` toggle is On.
6. Pulse `Process`.
7. When OCR finishes, the `onComplete` function in your DAT will execute, printing information to the Textport.
  • SideCar Dependency: This operator relies entirely on the SideCar server for OCR processing. Ensure the SideCar is running and the selected Model Type library is installed in its environment.
  • Asynchronous Operation: Image data is sent to the SideCar, processed, and results are returned asynchronously via ChatTD, preventing TouchDesigner from freezing during processing.
  • Performance: Processing time depends on image size/complexity, the chosen Model Type, the Image Scale, and the performance of the machine running the SideCar. PaddleOCR often requires more resources (especially GPU) than EasyOCR.
  • Visualization: The second TOP output provides a visual representation of the detected text regions using bounding boxes.
  • SideCar: The backend service required for this operator to function.
  • ChatTD: Provides core services like asynchronous task execution and logging.
  • Florence-2 Operator: A more comprehensive vision foundation model that can also perform OCR among other tasks.

Research & Licensing

EasyOCR & PaddleOCR Teams

EasyOCR is developed by JaidedAI, focusing on making OCR technology accessible to developers. PaddleOCR is developed by Baidu's PaddlePaddle team, providing a comprehensive OCR framework with state-of-the-art performance across multiple languages and scenarios.

CRAFT + CRNN (EasyOCR) and PaddleOCR Framework

This operator can run either EasyOCR or PaddleOCR. EasyOCR uses CRAFT for text detection and CRNN for recognition, while PaddleOCR provides a complete OCR pipeline with its own preprocessing and post-processing stages. Both frameworks offer multilingual text recognition.

Technical Details

  • Text Detection: CRAFT algorithm for accurate character region detection
  • Text Recognition: CRNN architecture for sequence-to-sequence text recognition
  • Multilingual Support: Extensive language coverage including Asian languages

Research Impact

  • Open Source OCR: Democratizing access to high-quality text recognition
  • Production Ready: Widely adopted in commercial and research applications
  • Multilingual Capabilities: Breakthrough support for diverse writing systems

Citation

@inproceedings{baek2019character,
  title={Character Region Awareness for Text Detection},
  author={Baek, Youngmin and Lee, Bado and Han, Dongyoon and Yun, Sangdoo and Lee, Hwalsuk},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9365--9374},
  year={2019}
}

@article{shi2016end,
  title={An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition},
  author={Shi, Baoguang and Bai, Xiang and Yao, Cong},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={39},
  number={11},
  pages={2298--2304},
  year={2016}
}

@misc{cui2025paddleocr,
  title={PaddleOCR 3.0 Technical Report},
  author={Cui, Cheng and Sun, Ting and Lin, Manhui and Gao, Tingquan and Zhang, Yubo and Liu, Jiaxuan and Wang, Xueqing and Zhang, Zelun and Zhou, Changda and Liu, Hongen and Zhang, Yue and Lv, Wenyu and Huang, Kui and Zhang, Yichao and Zhang, Jing and Zhang, Jun and Liu, Yi and Yu, Dianhai and Ma, Yanjun},
  journal={arXiv preprint arXiv:2507.05595},
  year={2025},
  url={https://arxiv.org/abs/2507.05595}
}

Key Research Contributions

  • CRAFT: Character-level text detection with region awareness
  • CRNN: End-to-end trainable network for sequence recognition
  • PaddleOCR: Production-ready OCR framework with multilingual support

License

EasyOCR: Apache 2.0 License. PaddleOCR: Apache 2.0 License. Both libraries are freely available for research and commercial use.