Add extraction scripts and initial work on markdown
commit 90dff1d7a4
@@ -0,0 +1,52 @@
---
title: NOTES ON OPTICAL PRINTER TECHNIQUE
author: Dennis Couzin
date: "March 1983"
...
\pagenumbering{gobble}
\newpage
\pagenumbering{arabic}
::: {.indexTable}

| | | | |
|-----|-----|-----|-----|
| Magnification | 1 | Fades in Original | 14 |
| Blowup & Reduction | 2 | Chart C: Neutral Density | |
| Blowup Sharpness | 2 | and Equivalent Shutter | |
| Printer Lenses | 3 | Angle | 15 |
| Optical Zoom | 3 | Image Superposition | 16 |
| Lens Aperture | 3 | Gamma & Bipack | 16 |
| Focusing | 4 | Incidentally | 16 |
| Focusing Aperture | 4 | Exposure Compensation | 18 |
| Focusing Precision | 4 | Special Originals | 18 |
| Focusing Target | 4 | Texturing | 18 |
| Depth of Field | 4 | Multi-Exposure | 19 |
| Bolex Prism | 4 | Multi-Pack | 19 |
| Bolex Groundglass | 4 | Natural Superposition | 19 |
| Defocus | 4 | Flashing | 19 |
| X-Y Adjustment | 4 | Contrast Adjustment | 19 |
| Exact 1:1 | 5 | Color Image Superposition | 20 |
| Aimframe | 5 | Weighted Double Exposures | 20 |
| Framelines | 6 | Dissolves | 21 |
| Emulsion Position | 7 | Effects Dissolves | 21 |
| Time | 8 | Fades from Negative | 21 |
| Fancy Freeze | 8 | Color Exposure | 22 |
| Fancy Slow | 8 | Testing | 22 |
| Diffusers | 8 | CC Pack Reduction | 25 |
| UV Filter | 9 | High Contrast Prints | 25 |
| IR Filter | 9 | Hicon Exposure | 26 |
| Green Filter | 9 | Contrast Building Steps | 26 |
| Filter Location | 9 | Hicon Speckle | 26 |
| Exposure | 9 | Tone Isolation | 27 |
| Exposure Adjusters | 9 | Logic of Mask Combination | 27 |
| Specifying Exposure | 11 | Image Spread and Bloom | 27 |
| Film Speed | 11 | Mask and Countermask | 28 |
| Right Exposure | 11 | Reversal/Negative Fitting | 28 |
| Generations | 12 | Feathered Masks | 29 |
| Bellows Formula | 13 | Image Marriage | 29 |
| Fades | 13 | Mask Blackness | 30 |
| Log Fade | 14 | Hicons from Color Originals | 30 |
| Bolex Variable Shutter | 14 | Hicon Processing | 30 |
| Linear Fade | 14 | Optical Printed Release Prints | 31 |
| Other Fades | 14 | Ritual and Art | 31 |
:::
@@ -0,0 +1,24 @@
# NOTES ON OPTICAL PRINTER TECHNIQUE

Reproduction of the guide written by Dennis Couzin.

### PDF Dependencies

* pandoc

```bash
bash compile.sh
```

### Text extraction dependencies

* Python 3.7
* OpenCV (`opencv-python`)
* Tesseract
* PIL (Pillow)

```bash
cd extract
python3 pdf.py > ../ocr/pdf_output.txt
python3 ocr.py > ../ocr/tesseract_output.txt
```
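
A minimal sketch for checking the dependencies listed above before running the scripts. It assumes the executables are named `pandoc` and `tesseract` on `PATH`, and that the Python packages import as `fitz`, `cv2`, `pytesseract` and `PIL`. The helper name `missing_dependencies` is hypothetical and not part of this repository.

```python
import importlib.util
import shutil


def missing_dependencies(executables, modules):
    """Return the names from both lists that cannot be found."""
    # shutil.which returns None when an executable is not on PATH
    missing = [exe for exe in executables if shutil.which(exe) is None]
    # find_spec returns None when a top-level module is not importable
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing


if __name__ == "__main__":
    # The names below are assumptions based on the dependency lists above.
    missing = missing_dependencies(
        ["pandoc", "tesseract"],
        ["fitz", "cv2", "pytesseract", "PIL"],
    )
    print("Missing:", ", ".join(missing) if missing else "none")
```

An empty result only means the names were found. It does not check versions or that Tesseract has the `eng` language data installed.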
@@ -0,0 +1,5 @@
#!/bin/bash

mkdir -p pdf

pandoc --css=noopt.css -o pdf/NOTES_ON_OPTICAL_PRINTER_TECHNIQUE.pdf NOTES_ON_OPTICAL_PRINTER_TECHNIQUE.md
@@ -0,0 +1,20 @@
# pip install pytesseract PyMuPDF Pillow opencv-python

import os
import fitz  # PyMuPDF
import pytesseract
import cv2

pytesseract.pytesseract.tesseract_cmd = '/usr/bin/tesseract'
file = "../original/NOTES_ON_OPTICAL_PRINTER_TECHNIQUE.pdf"

os.makedirs("pages", exist_ok=True)  # rendered page images are written here
pdf_file = fitz.open(file)

for page in pdf_file:
    pix = page.get_pixmap(dpi=300)  # render the page at 300 dpi for OCR
    filePath = "pages/page-%i.png" % page.number
    pix.save(filePath)
    image = cv2.imread(filePath)
    text = pytesseract.image_to_string(image, lang='eng', config='--psm 6 --oem 3')
    print(text)
@@ -0,0 +1,12 @@
# pip install PyMuPDF

import fitz  # PyMuPDF

file = "../original/NOTES_ON_OPTICAL_PRINTER_TECHNIQUE.pdf"

pdf_file = fitz.open(file)

for page in pdf_file:
    # get_text() already returns str; an encode/decode("unicode_escape")
    # round trip would mangle non-ASCII characters
    print(page.get_text())
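
Text returned by PyMuPDF's `get_text()` is already a Python `str`. The sketch below uses made-up sample strings and a hypothetical helper, `escape_round_trip`. It shows what a UTF-8 encode followed by a `unicode_escape` decode does to such text: non-ASCII characters are reinterpreted byte by byte, and literal backslash sequences are treated as escapes.

```python
def escape_round_trip(text):
    """Encode to UTF-8, then decode the bytes with the unicode_escape codec."""
    return text.encode("utf8").decode("unicode_escape")


sample = "caf\u00e9"  # made-up sample containing one non-ASCII character
print(repr(sample))
# The single accented character comes back as two Latin-1 characters.
print(repr(escape_round_trip(sample)))

# A literal backslash followed by "t" in the text is turned into a tab.
print(repr(escape_round_trip("a\\tb")))
```

Plain ASCII text without backslashes passes through unchanged, which is why the problem is easy to miss on mostly English pages.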
File diff suppressed because it is too large
File diff suppressed because it is too large