Are there any tools I can use for translating a ~400-page scanned book?

morto@piefed.social · edit-2 3 months ago

Are there any tools I can use for translating a ~400-page scanned book?

andrew0@lemmy.dbzer0.com · 2 months ago

Be warned that their docs are so-so. Nanonets OCR, Mistral OCR and MinerU will also extract formulas and images.

One other tool I forgot to mention is Docling. It's quick to set up in a Docker container and comes with a web interface where you can upload documents. It roughly follows the PaddleOCR pipeline, but also lets you use VLMs.
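If you'd rather script it than click through the web interface, here is a rough sketch of how that could look with Docling's Python package. It assumes `pip install docling` and that your scan is a PDF. `ocr_to_markdown` and `split_into_chunks` are names I made up for illustration, and the 4000-character chunk size is an arbitrary guess to tune for whatever translator you end up using.

```python
def ocr_to_markdown(pdf_path: str) -> str:
    """Run Docling on a scanned PDF and return the text as Markdown."""
    # Imported lazily so the chunking helper below works without Docling installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(pdf_path)
    return result.document.export_to_markdown()


def split_into_chunks(markdown: str, max_chars: int = 4000) -> list[str]:
    """Group paragraphs into chunks of roughly max_chars each.

    Paragraphs are never cut in half, so a single paragraph longer
    than max_chars ends up as its own oversized chunk.
    """
    chunks: list[str] = []
    current = ""
    for para in markdown.split("\n\n"):
        para = para.strip()
        if not para:
            continue  # skip blank runs left over from the OCR output
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)  # current chunk is full, start a new one
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

You would call it as `split_into_chunks(ocr_to_markdown("book.pdf"))`, where `book.pdf` is a placeholder for your file, and then feed each chunk to the translation tool of your choice. Splitting on paragraph boundaries keeps sentences intact, which tends to matter for translation quality.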

Good luck!