ExtractTextFromSlide

A Python tool for extracting text from PDFs with a hybrid approach that combines native text extraction and OCR. It is optimized for academic documents with mathematical notation: it supports batch processing, cleans the extracted text automatically, and preserves STEM symbols. Built with PyPDF2 and Tesseract OCR.
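
The hybrid native+OCR idea described above can be sketched roughly as follows. This is not the repository's code: the function names (`clean_text`, `extract_page`), the 40-character threshold, and the fallback rule are assumptions made for illustration. The two extractors are passed in as callables so the sketch runs with the standard library alone; in practice the native one might wrap PyPDF2's `page.extract_text()` and the OCR one might wrap `pytesseract.image_to_string()` on a rendered page image.

```python
import re
import unicodedata
from typing import Callable, Tuple

# Hypothetical threshold: below this many characters, the native text layer
# is treated as missing or unusable and OCR is tried instead.
MIN_NATIVE_CHARS = 40


def clean_text(text: str) -> str:
    """Normalize whitespace and drop control characters, keeping math symbols."""
    text = text.replace("\t", " ")
    # Remove control/format characters (Unicode categories starting with "C"),
    # except newlines. Symbols such as ∑, ≤, α are in other categories and stay.
    text = "".join(
        ch for ch in text
        if ch == "\n" or not unicodedata.category(ch).startswith("C")
    )
    text = re.sub(r" +", " ", text)         # collapse runs of spaces
    text = re.sub(r"\n{3,}", "\n\n", text)  # collapse runs of blank lines
    return text.strip()


def extract_page(
    native: Callable[[], str],
    ocr: Callable[[], str],
    min_chars: int = MIN_NATIVE_CHARS,
) -> Tuple[str, str]:
    """Return (source, text): native text if it looks usable, else OCR text."""
    text = clean_text(native() or "")
    if len(text) >= min_chars:
        return "native", text
    # OCR is only invoked when the native layer is too short.
    return "ocr", clean_text(ocr() or "")


if __name__ == "__main__":
    source, text = extract_page(
        native=lambda: "",               # e.g. a scanned slide with no text layer
        ocr=lambda: "E = mc²\x0c",       # OCR output with a trailing form feed
    )
    print(source, repr(text))
```

Passing the extractors as callables keeps the fallback decision separate from the PDF and OCR libraries, and means the slower OCR step only runs for pages whose native text is too short.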

Python · Emerging

GitHub

Last push: 5mo ago

Recent commits

fix stuff
1f10637 · DanyR2001 · 5mo ago
init
517ac05 · DanyR2001 · 5mo ago
Initial commit
be14b08 · Daniele Russo · 5mo ago

Top contributors

Builders behind this project.

DanyR2001

3 commits