Exchange Search: Searching Adobe PDF

by Bharat Suneja

Exchange Search is a little-known and mostly unappreciated feature of modern Exchange Servers (yes, we mean Exchange 2010 and Exchange 2007). It’s different. It’s fast. It mostly works under the hood, without any sexy admin interfaces that would expose its features, functionality, and performance.

In these days of ever-growing, large mailboxes (think 10 gigabytes or larger), Exchange Search becomes super-important for wading through the tens of thousands or millions of messages one may have in a mailbox. And if you’re using Exchange 2010’s Personal Archives, add another few years’ worth of messages in archive mailboxes to the mix. The new Discovery features in Exchange 2010 (aka “Multi-Mailbox Search”) also use the content indexes generated by Exchange Search.

Besides indexing the text in your email messages, Exchange Search also indexes supported attachments. To index different types of file attachments, Exchange Search (or, to be more precise, the underlying Windows Search service) must be able to parse the file. This is done through components called IFilters. Microsoft ships IFilters for some popular file formats, including its own file formats for Microsoft Office apps – Word, Excel, PowerPoint, OneNote, Visio, etc. IFilters for other file formats are also available from third parties.
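
To make the idea concrete, here is a toy sketch in Python of how an indexer might pick a parser by file extension. It is only a conceptual model: real IFilters are COM components registered with the operating system, and the names used here (`filters`, `register_filter`, `extract_text`) are hypothetical and not part of Exchange or Windows Search.

```python
# Toy model only: real IFilters are COM components, not Python callables.
from typing import Callable, Dict, Optional

# Hypothetical type: a filter turns raw attachment bytes into indexable text.
FilterFn = Callable[[bytes], str]

# Hypothetical registry mapping a file extension (".txt") to its filter.
filters: Dict[str, FilterFn] = {}


def register_filter(extension: str, fn: FilterFn) -> None:
    """Associate a file extension with a text-extraction function."""
    filters[extension.lower()] = fn


def extract_text(filename: str, data: bytes) -> Optional[str]:
    """Return indexable text, or None when no filter handles this file type."""
    dot = filename.rfind(".")
    ext = filename[dot:].lower() if dot != -1 else ""
    fn = filters.get(ext)
    return fn(data) if fn is not None else None


# Plain text needs no real parsing, so its "filter" just decodes the bytes.
register_filter(".txt", lambda data: data.decode("utf-8", errors="replace"))
```

With only the `.txt` filter registered, `extract_text("report.pdf", data)` returns `None`. That mirrors the situation described above: an attachment's contents can't be indexed until a filter for its format is available.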

Installation of the Office 2007 Filter Pack is a prerequisite on Exchange 2010 RTM Mailbox and Hub Transport servers, and you must register the filters by running a PowerShell script.

Update: Exchange 2010 SP1 requires the Microsoft Office 2010 Filter Packs, and Exchange setup registers the filters automatically.

You can install IFilters for file formats that Exchange Search doesn’t index by default or through the IFilters included in the Office Filter Pack. When you install a new IFilter, it’s registered with Windows Search. You must still register the IFilter with Exchange Search. For details, see Register Filter Pack IFilters with Exchange 2010.

Indexing Adobe PDF

Adobe’s Portable Document Format (PDF) is a platform for secure, multi-platform distribution of documents. Given the popularity of the PDF file format, the Adobe PDF IFilter is perhaps one of the more commonly installed IFilters on Exchange Servers.

You can download Adobe PDF IFilter v6.0 from the Adobe web site. According to Adobe, current versions of the PDF IFilter are included in the free Acrobat Reader and are not available as standalone IFilter downloads.

Update 7/19/2011: While updating this post, I noticed Adobe has released an updated version of the PDF IFilter. Download it from Adobe PDF iFilter 9 for 64-bit platforms.

Adobe’s download pages don’t include a date, so I’m not quite sure when this was released. The download page does include a link to Configuring PDF iFilter for MS Exchange Server 2007.pdf – perhaps an indicator that the IFilter was released before Exchange 2010.

Paul Robichaux ran into an issue with Adobe’s PDF IFilter recently. As Paul notes:

I hope that Adobe fixes its IFilter to work properly; it’s a shame that Adobe’s poor implementation is making Exchange search look bad.

Head over to Exchange Search Indexing and the Problem with PDFs on Windows for the details.
