How to Create an Indexable Colored PDF File
Can you create indexable text in PDF files?
Many people believe that Google and other search engines cannot index PDF files. Although PDF files are often not ideal for SEO, search engines can index them if they are properly optimized. There are times when you may want to use a PDF file on your website, such as for material that is meant to be downloaded: brochures, product manuals and spec sheets. Proper PDF optimization helps ensure that Google includes the PDF in its search results, and in some cases it will rank quite high. For your PDF to achieve a decent ranking, make sure it contains indexable text.
How to create indexable text in PDF files
There is more than one type of PDF file. The PDF you create may be:
PDF Image Only – Any text you see in a PDF Image Only file is not searchable, because it is an image of text rather than actual text. Completing the document properties, such as title, author and subject, makes this type of PDF somewhat searchable, but because search engine crawlers can’t read the content, it won’t rank very high. This format is a good fit, however, when you want to keep text private, as in invoices and transaction documents.
PDF Searchable Image (Exact) – This PDF file type stores the image information on a visible layer and a text version of the document on a hidden layer. This makes files easy to index, although they will be larger. It is the format normally used for searchable PDF scanning and is often referred to simply as PDF Searchable. Colors are saved within the PDF file as 8-bit to 24-bit images.
PDF Searchable Image (Compact) – Similar to PDF Searchable Image (Exact), but uses a color-segmentation process to make files smaller. This format is often used when the document combines two types of content: color images and text. When you save a file in this format, image parts are stored in the PDF file as JPEG data, while text is stored as G4 or Zip compressed data.
PDF Formatted Text and Graphics (PDF Normal) – The usual PDF output produced from a word processing or authoring environment, such as Microsoft Word. This format uses a single layer and contains real computer-generated text and graphics rather than bit-mapped images. Documents are fully searchable, and the text and graphics scale without loss of definition.
PDF indexability increases a great deal if you use a PDF format that Google and other search engines can read. Basically, this means any PDF format except PDF Image Only. Unfortunately, many designers go straight to Photoshop or Dreamweaver and create an image-only file for things like your company brochure, which will then not be indexed when you add it to your site.
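If you are unsure which kind of file a designer has handed you, a quick script can give a rough answer. The sketch below is only a heuristic, not a full PDF parser: it assumes the file is unencrypted and that its page content is either uncompressed or Flate (Zip) compressed, and it simply looks for the text-showing operators (Tj and TJ) that real text leaves in a PDF. The function names are made up for this illustration.

```python
import re
import zlib

# Matches the payload between a "stream" keyword and the next "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# A string or array operand followed by a text-showing operator (Tj or TJ).
TEXT_SHOW_RE = re.compile(rb"[)>\]]\s*(?:Tj|TJ)\b")


def stream_payloads(pdf_bytes):
    """Yield each stream body, Flate-decompressed when possible."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj ignores the line break left before "endstream".
            yield zlib.decompressobj().decompress(raw)
        except zlib.error:
            yield raw  # not Flate data, for example JPEG image data


def looks_like_text(payload, threshold=0.8):
    """Return True if most bytes are printable ASCII or whitespace.

    Assumption: page content streams are mostly printable, while image
    data is mostly binary. This filters out accidental matches in images.
    """
    if not payload:
        return False
    printable = sum(1 for b in payload if 32 <= b < 127 or b in (9, 10, 13))
    return printable / len(payload) >= threshold


def has_extractable_text(pdf_bytes):
    """Heuristic: True if any text-like stream shows text with Tj or TJ."""
    return any(
        looks_like_text(payload) and TEXT_SHOW_RE.search(payload)
        for payload in stream_payloads(pdf_bytes)
    )
```

To try it, read a file in binary mode and pass the bytes in, for example has_extractable_text(open("brochure.pdf", "rb").read()), where brochure.pdf is a placeholder name. A result of False suggests a PDF Image Only file; True suggests one of the searchable formats. Files that use other compression filters or encryption can fool this check, so treat the result as a hint rather than proof.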
Ensuring text graphics in PDF files are indexable
In some cases, such as with brochures, a PDF file may have very little text other than that contained in graphics. To make sure the text in your PDF graphics is indexable, create the PDF using Adobe Illustrator rather than Photoshop. Illustrator is vector based, so there is no distortion when image sizes are changed, and it offers an option to save as a PDF file that allows text in graphics to be read. If your graphics were initially created in Photoshop, all you need to do is convert them to Adobe Illustrator and save the result as a PDF file.