I have a PDF article (not created by me), but I cannot search for text in it. Every PDF viewer I've tried returns zero results for words that are obviously in there. I've tried Adobe Acrobat Professional 8, SumatraPDF and Google Chrome.
How can I find out why the document is not searchable?
Things I've checked:
- The PDF producer is reported as 'pdftopdf' and the PDF version is reported as 1.3. However, it seems to have been created in something like MS Word or OpenOffice (but not *TeX).
- It is definitely not a scanned document, as the font is crisp-clear at all zoom levels, and text is selectable.
- If I look at the security settings (ctrl-D in Adobe Acrobat), everything is allowed (like printing, copying, ...).
- My search options do not have 'match case' turned on.
- I cannot turn it into a searchable document using Acrobat's 'Recognize text using OCR', as it reports: 'This page contains renderable text'.
So, what else could be the reason for the PDF not being searchable? And how can I make it text-searchable?
5 Answers
It may have a custom font encoding that assigns code points to characters in a way that is incompatible with established encodings such as ASCII or UTF-8/Unicode.
It may render characters individually and out of sequence.
It may have had its characters flattened to paths.
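As a rough first check for the causes listed above, the raw bytes of the file can be scanned for font dictionaries and `/ToUnicode` entries. The following is only a heuristic sketch: the function name `scan_pdf_bytes` is made up for illustration, and it assumes the font objects are stored uncompressed, which is plausible for a PDF 1.3 file but not for newer files that use object streams.

```python
import re

# Font dictionaries carry "/Type /Font"; the \b keeps /FontDescriptor from matching.
FONT_DICT = re.compile(rb"/Type\s*/Font\b")
# A /ToUnicode entry points to a CMap that maps glyph codes back to Unicode text.
TO_UNICODE = re.compile(rb"/ToUnicode\b")


def scan_pdf_bytes(data: bytes) -> dict:
    """Count font dictionaries and /ToUnicode entries in raw PDF bytes (heuristic)."""
    fonts = len(FONT_DICT.findall(data))
    to_unicode = len(TO_UNICODE.findall(data))
    if fonts == 0:
        hint = ("no font dictionaries found: the text may have been flattened "
                "to paths, or the objects are stored in compressed streams")
    elif to_unicode == 0:
        hint = ("fonts present but no /ToUnicode maps: search relies on each "
                "font's /Encoding, which fails if that encoding is custom")
    else:
        hint = ("/ToUnicode maps present: if search still fails, the maps may "
                "be wrong or the glyphs may be drawn out of reading order")
    return {"fonts": fonts, "to_unicode": to_unicode, "hint": hint}
```

It could be called as `scan_pdf_bytes(open("article.pdf", "rb").read())`, where `article.pdf` is a placeholder file name. The result is only a hint about where to look, not a diagnosis.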
See Stack Overflow questions How do you debug PDF files? and the now deleted PDF Font encoding — why can't I copy text from a PDF?
To make it text searchable, the best way may be to go back to the original source (e.g. a Word document) and use a different process to produce the PDF. Alternatively you could try rendering your current PDF as a bitmap and then using OCR, but this will be tedious and produce poor results.
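The render-then-OCR route could be scripted roughly as follows. This is a sketch under assumptions, not a tested recipe: it assumes Poppler's `pdftoppm` and the `tesseract` command-line tools are installed, that the text is English, and the helper names (`render_command`, `ocr_command`, `ocr_pdf`) are invented for illustration.

```python
import subprocess
from pathlib import Path


def render_command(pdf_path: str, prefix: str, dpi: int = 300) -> list:
    """Command that renders every page to PNG files named <prefix>-<page>.png."""
    return ["pdftoppm", "-r", str(dpi), "-png", pdf_path, prefix]


def ocr_command(image_path: str, language: str = "eng") -> list:
    """Command that OCRs one image into a searchable PDF next to it."""
    out_base = str(Path(image_path).with_suffix(""))  # tesseract appends ".pdf"
    return ["tesseract", image_path, out_base, "-l", language, "pdf"]


def ocr_pdf(pdf_path: str, work_dir: str) -> list:
    """Render the PDF to bitmaps, then OCR each page; returns the per-page PDFs."""
    work = Path(work_dir)
    work.mkdir(parents=True, exist_ok=True)
    subprocess.run(render_command(pdf_path, str(work / "page")), check=True)
    outputs = []
    for image in sorted(work.glob("page-*.png")):
        subprocess.run(ocr_command(str(image)), check=True)
        outputs.append(image.with_suffix(".pdf"))
    return outputs
```

This yields one searchable PDF per page, which would still need to be merged with a separate tool. A higher `dpi` tends to help recognition at the cost of larger intermediate files.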
I found a way around this problem. I went to Tools -> Edit Document Text, then for each page I hit Control-A (select all), right-clicked, went to Properties, and changed the font to something else. After I did this, the text was searchable and I could copy it!
I was having the same problem and, in frustration, googled to find an answer. It turns out that for me, the problem was simply that I was using Preview on my iMac to view and search the PDF. In most cases, searching works in Preview. But for a large book downloaded from Google Books, it didn't.
What worked was simply opening the PDF in Adobe Reader. (Duh, what a concept, I know.) Now I can search. This probably won't work for everyone with a Mac, but it might help someone.
Go to Edit / Preferences, select 'Search' on the left-hand side of the preferences screen, then click 'Purge Cache Contents'. Select OK, then close and reopen the document.
So, after trying a lot of things that didn't work, here's how I actually got this done:
Find yourself a PDF to Word converter or something. (I recommend )
Follow all the necessary steps to convert, BUT before converting:
Find the button that says something like 'optical character recognition' and click that
Convert your file and you should be golden.