Updated readme

Matt McWilliams 2022-07-26 16:53:44 -04:00
parent 625dfb4bba
commit ff6ac190e2
1 changed files with 7 additions and 0 deletions

@@ -1,6 +1,12 @@
# NOTES ON OPTICAL PRINTER TECHNIQUE
Reproduction of the guide written by Dennis Couzin.
It loses some of the charm of the photocopied original floating around the internet, but this reproduction exists to make the text readable and searchable.
Tesseract does the majority of the heavy lifting, producing a transcription that is about 85% accurate; the rest is minor spelling corrections and slightly more effort formatting the text into markdown for rendering.
Pre-processing with OpenCV and tuning Tesseract for the typewritten font may produce even better text.
Alternate spellings that were not introduced by the OCR process are preserved.
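
As a rough illustration of the kind of pre-processing mentioned above, the sketch below binarizes grayscale pixel values with Otsu's method, a common clean-up step before OCR. It is not taken from this repository: the function names `otsu_threshold` and `binarize` are hypothetical, and the code uses only the standard library so it runs without dependencies. In practice the same step would more likely be done with OpenCV, for example `cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)`.

```python
# Illustrative sketch only; names are hypothetical and not part of this repo.

def otsu_threshold(pixels):
    """Return the 0-255 threshold that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of those pixel values
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue  # no dark pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break     # every pixel is already in the dark class
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(pixels, threshold):
    """Map pixels above the threshold to white (255) and the rest to black (0)."""
    return [255 if p > threshold else 0 for p in pixels]


if __name__ == "__main__":
    # Synthetic "scan": dark typewriter ink on a light, uneven page.
    scan = [30] * 50 + [40] * 50 + [200] * 100 + [220] * 100
    t = otsu_threshold(scan)
    clean = binarize(scan, t)
    print("threshold:", t)
    print("black pixels:", clean.count(0), "white pixels:", clean.count(255))
```

Whether a global threshold like this actually helps depends on the scans; unevenly lit photocopies may do better with an adaptive threshold.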
### PDF Dependencies
@@ -16,6 +22,7 @@ bash compile.sh
* OpenCV 2
* Tesseract
* PIL
* PyMuPDF
```bash
cd extract