Updated readme
parent 625dfb4bba
commit ff6ac190e2
@@ -1,6 +1,12 @@
# NOTES ON OPTICAL PRINTER TECHNIQUE

A reproduction of the guide written by Dennis Couzin.
It loses some of the charm of the photocopied original floating around the internet, but the reproduction makes the text easier to read and search.

Tesseract does the majority of the heavy lifting, getting about 85% of the way to a finished transcription; what remains is minor spelling corrections and slightly more effort formatting the text into Markdown for rendering.
Pre-processing with OpenCV and tuning Tesseract for the typewritten font may produce even better text.
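
The 85% figure is not defined further in this README. One way such a number could be estimated, as a minimal sketch, is to compare the raw OCR output against the hand-corrected text with Python's standard `difflib`. The function name `transcription_ratio` and the sample strings are hypothetical, and the score is a character-level similarity, which may not be how the figure was actually obtained.

```python
from difflib import SequenceMatcher


def transcription_ratio(ocr_text: str, corrected_text: str) -> float:
    """Return a 0.0-1.0 similarity score between raw OCR output and corrected text."""
    # Collapse runs of whitespace so differences in line wrapping are not counted as errors.
    ocr_norm = " ".join(ocr_text.split())
    corrected_norm = " ".join(corrected_text.split())
    # autojunk=False turns off difflib's heuristic that ignores very frequent
    # characters in long inputs, which would skew the score on full pages.
    matcher = SequenceMatcher(None, ocr_norm, corrected_norm, autojunk=False)
    return matcher.ratio()


# Hypothetical sample strings for illustration, not taken from the guide itself.
raw = "The optica1 printer is a carnera\nfacing a projector."
fixed = "The optical printer is a camera facing a projector."
print(f"{transcription_ratio(raw, fixed):.0%}")
```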

Alternate spellings that were not introduced by the OCR process are preserved.

### PDF Dependencies
@@ -16,6 +22,7 @@ bash compile.sh
* OpenCV 2
* Tesseract
* PIL
* PyMuPDF

```bash
cd extract