Search Imaged Text

ContentCrawler Overview

ContentCrawler is an integrated analysis, processing and reporting framework that gives document management professionals peace of mind knowing their content is 100% searchable.

The automated end-to-end process assesses image-based documents in the content repository, converts those that qualify to a searchable format, and re-profiles the converted documents.

Key Benefits

- Increase organizational productivity
- Simplify management of image-based documents
- Reduce non-compliance risks
- Increase efficiency through automation
- Leverage investment in DMS and search technology
- Reduce the cost of managing OCR technology

Key ContentCrawler Features

Search and Find

ContentCrawler finds image-based documents stored in a content repository, including those inside email attachments. These documents are OCR’ed and converted to text-searchable PDFs.

Super Smart Technology

ContentCrawler will not OCR documents that already have a text layer, or documents identified as having little or no text. The Administrator can set the “text threshold” so that documents with minimal text are ignored.
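The skip rules described here can be pictured as a simple filter. The following is a minimal sketch, not ContentCrawler's actual implementation: the `DocumentInfo` type, its fields, and the idea of measuring the threshold in estimated characters are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DocumentInfo:
    """Hypothetical summary of a document; not a ContentCrawler type."""
    name: str
    has_text_layer: bool       # True if the file already contains searchable text
    estimated_text_chars: int  # assumed estimate of how much text the image holds


def needs_ocr(doc: DocumentInfo, text_threshold: int) -> bool:
    """Return True only for image-based documents with enough text to convert."""
    if doc.has_text_layer:
        return False  # already searchable, nothing to do
    if doc.estimated_text_chars < text_threshold:
        return False  # little or no text, ignored per the threshold
    return True


docs = [
    DocumentInfo("contract_scan.tif", False, 4200),
    DocumentInfo("report.pdf", True, 9800),
    DocumentInfo("photo.jpg", False, 12),
]
to_convert = [d.name for d in docs if needs_ocr(d, text_threshold=100)]
print(to_convert)
```

With a threshold of 100, only the scanned contract is selected: the report already has a text layer and the photo falls below the threshold.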

Monitoring Modes

Processing can run in either or both of two modes: Convert Backlog and Active Monitoring. Convert Backlog converts all legacy documents to text-searchable PDFs, while Active Monitoring converts documents as soon as they are profiled into the content repository.
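One way to picture how the two modes combine is the sketch below. It is illustrative only: the `Mode` flag, the `ProfiledDoc` type, and the rule that a "legacy" document is one profiled before monitoring started are assumptions, not documented ContentCrawler behavior.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Flag, auto


class Mode(Flag):
    """Hypothetical mode flags; they can be combined with |."""
    CONVERT_BACKLOG = auto()
    ACTIVE_MONITORING = auto()


@dataclass
class ProfiledDoc:
    name: str
    profiled_at: datetime


def select_for_processing(docs, modes, monitoring_started):
    """Pick documents covered by the enabled mode(s)."""
    selected = []
    for doc in docs:
        # Assumption: "legacy" means profiled before monitoring began.
        is_legacy = doc.profiled_at < monitoring_started
        if is_legacy and Mode.CONVERT_BACKLOG in modes:
            selected.append(doc.name)
        elif not is_legacy and Mode.ACTIVE_MONITORING in modes:
            selected.append(doc.name)
    return selected


start = datetime(2024, 1, 1)
docs = [
    ProfiledDoc("old_scan.tif", datetime(2019, 5, 1)),
    ProfiledDoc("new_fax.tif", datetime(2024, 3, 15)),
]
backlog_only = select_for_processing(docs, Mode.CONVERT_BACKLOG, start)
both_modes = select_for_processing(
    docs, Mode.CONVERT_BACKLOG | Mode.ACTIVE_MONITORING, start
)
```

Here `backlog_only` contains just the older scan, while `both_modes` covers both documents.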

Reprofiling Documents

Documents are converted to text-searchable PDFs and automatically saved as New Versions, Attachments or Related Documents in the DMS. The converted documents can then be found by your DMS search technology.

Automated or Manual Process

Converting image-based documents to text-searchable PDFs can be an automated end-to-end process or a manual one with built-in “Hold for Review” stages before Convert to PDF and/or Save Back into the DMS.
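The difference between the automated and manual processes can be sketched as an ordered list of stages with optional review holds. This is a hypothetical model: the function name and its two parameters are invented for illustration, and only the stage labels come from the description above.

```python
from typing import List


def build_pipeline(hold_before_convert: bool = False,
                   hold_before_save: bool = False) -> List[str]:
    """Return the ordered stages a document passes through (illustrative only)."""
    stages = []
    if hold_before_convert:
        stages.append("Hold for Review")  # reviewer approves conversion
    stages.append("Convert to PDF")
    if hold_before_save:
        stages.append("Hold for Review")  # reviewer approves the save
    stages.append("Save Back into the DMS")
    return stages


automated = build_pipeline()
manual = build_pipeline(hold_before_convert=True, hold_before_save=True)
print(automated)
print(manual)
```

With both flags off the process runs end to end without stopping; enabling either flag adds a review hold before the corresponding stage.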

ContentCrawler Integrations

ContentCrawler integrates with the following systems:

- Autonomy iManage
- OpenText
- MS SharePoint
- OpenText Content Server
- OpenText eDOCS DM
- OpenText Livelink
- ProLaw
- Worldox