This tool does the following:
scanned PDF file -> page images -> extracted text -> GPT-4o -> translated Word document
See test.ipynb for details.
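The middle of the pipeline above (images -> text -> translation) can be sketched as a loop over pages with pluggable stages. This is only an illustration, not the code in main.py or test.ipynb: the function name `translate_pages` and the `ocr` and `translate` callables are hypothetical, and this README does not say which OCR library or API client the project uses. Rendering the PDF to images and writing the Word document are left out.

```python
from typing import Any, Callable, Iterable, List


def translate_pages(
    page_images: Iterable[Any],
    ocr: Callable[[Any], str],
    translate: Callable[[str, str], str],
    language: str,
) -> List[str]:
    """Run OCR and translation page by page, preserving page order.

    `ocr` and `translate` are hypothetical stand-ins for the real stages
    (an OCR engine and a GPT-4o call, respectively).
    """
    translated: List[str] = []
    for image in page_images:
        text = ocr(image)
        if not text.strip():
            # Blank page or failed OCR: keep an empty placeholder so page
            # numbering stays aligned, and skip the translation call.
            translated.append("")
            continue
        translated.append(translate(text, language))
    return translated


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without any external service.
    fake_ocr = lambda image: image.upper()
    fake_translate = lambda text, lang: f"[{lang}] {text}"
    print(translate_pages(["page one", "   "], fake_ocr, fake_translate, "Chinese (Traditional)"))
```

Passing the stages in as functions keeps the loop testable without network access; the real stages can be swapped in without changing it.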
Install the requirements:
pip install -r requirements.txt
Process the entire PDF:
python main.py attention.pdf --language "Chinese (Traditional)"
Process a single page:
python main.py attention.pdf --language "Chinese (Traditional)" --single-page --page-number 1
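For readers wondering how the flags in the commands above fit together, here is a minimal argparse sketch that accepts the same command lines. It is an assumption about how main.py might parse its arguments, not a copy of it: the helper names `build_parser` and `pages_to_process`, the 1-based page numbering, and making `--language` required are all guesses made for illustration.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the command lines shown in this README."""
    parser = argparse.ArgumentParser(
        description="Translate a scanned PDF into a Word document."
    )
    parser.add_argument("pdf_path", help="path to the scanned PDF")
    # Assumption: the target language must always be given.
    parser.add_argument("--language", required=True, help="target language")
    parser.add_argument(
        "--single-page", action="store_true", help="process only one page"
    )
    # Assumption: page numbers are 1-based and default to the first page.
    parser.add_argument(
        "--page-number", type=int, default=1, help="page used with --single-page"
    )
    return parser


def pages_to_process(args: argparse.Namespace, total_pages: int) -> List[int]:
    """Return the 1-based page numbers selected by the parsed arguments."""
    if args.single_page:
        if not 1 <= args.page_number <= total_pages:
            raise ValueError(f"page {args.page_number} is outside 1..{total_pages}")
        return [args.page_number]
    return list(range(1, total_pages + 1))


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.pdf_path, args.language, args.single_page, args.page_number)
```

argparse turns `--single-page` and `--page-number` into the attributes `single_page` and `page_number`, so the rest of the program can read them as ordinary fields.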