I’ve had some trouble with PDFs that were just images of pages of text. (An easy way to tell, assuming you’re on Linux, is to run pdftotext on the file and see if you get any text out.) There’s a utility called pdfsandwich that uses Tesseract to OCR the images and add a text layer to the PDF. That might help too.
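In case it's useful, here's a rough Python sketch of that check-then-OCR workflow. It assumes pdftotext (from poppler-utils) and pdfsandwich are both installed and on your PATH. The function names and the 20-character threshold are made up for illustration, not anything the tools define.

```python
import subprocess


def looks_like_scan(extracted_text: str, min_chars: int = 20) -> bool:
    """Heuristic: (almost) no non-whitespace characters means image-only pages.

    The min_chars threshold is an arbitrary assumption; tune it to taste.
    """
    return len("".join(extracted_text.split())) < min_chars


def extract_text(pdf_path: str) -> str:
    """Run `pdftotext <pdf> -`, where `-` sends the text to stdout."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def ocr_if_needed(pdf_path: str) -> bool:
    """OCR the PDF with pdfsandwich only if pdftotext finds no text.

    Returns True if OCR was run, False if the PDF already had text.
    """
    if not looks_like_scan(extract_text(pdf_path)):
        return False
    # As far as I know, pdfsandwich writes its result next to the input
    # with an _ocr suffix (book.pdf -> book_ocr.pdf) unless told otherwise.
    subprocess.run(["pdfsandwich", pdf_path], check=True)
    return True
```

Calling ocr_if_needed("book.pdf") would leave a PDF that already has a text layer alone and hand an image-only one to pdfsandwich. The whitespace stripping matters because pdftotext still emits page-break characters for pages with no text.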
Thx. Crude scans are the only way to get a lot of the more fringe books on https://annas-archive.li/ etc.