DocFetcher
DocFetcher is an open-source desktop search application that lets you quickly locate content in your documents with a keyword search.
Unlike many other desktop search tools, the program is not designed to index your entire hard drive, but only the folders that you actually want to include in the search, thereby reducing overhead and increasing search performance. DocFetcher works by extracting the text content from supported document types and storing it in a searchable index. The search results show the extracted text content and highlight the search term for easy reference. You can further customize the indexing process by including or excluding files based on a regular expression. Other features include Windows Explorer right-click integration, special support for HTML files, automatic index updates and more. DocFetcher supports the following file formats: MS Office, OpenOffice.org, PDF, HTML, plain text, RTF, AbiWord and CHM files.
Features:
- DocFetcher creates so-called index files on which the searches are performed. You can either create permanent indexes for large document repositories that change infrequently, or temporary ones for quick full-text searches in small folders.
- Temporary indexes can be created by right-clicking on a folder and selecting the menu item "Search With DocFetcher". They are automatically discarded when the program exits, unless you tell DocFetcher to keep them.
- Indexing might take a few minutes for larger repositories and is usually a matter of seconds for small folders (200 documents take about 1 minute).
- After index creation, you can type keywords into DocFetcher's search box, e.g. "fourier analysis", and hit Enter. DocFetcher will then list all documents inside the selected folders that contain these words, most of the time in less than a second.
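To make the index-then-search workflow above more concrete, here is a minimal sketch of an inverted index in Python. It is not DocFetcher's actual implementation; the function names (`tokenize`, `build_index`, `search`) and the sample documents are assumptions made up for illustration, and the text is assumed to have already been extracted from the files.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it.

    `documents` is a dict of {name: extracted text}. A real tool would
    obtain the text from format-specific extractors (PDF, DOC, ...).
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the sorted names of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical sample data standing in for extracted document text.
docs = {
    "notes.txt": "Fourier analysis decomposes a signal into sinusoids.",
    "todo.txt": "Buy milk; revise the analysis chapter.",
    "paper.txt": "An introduction to Fourier series.",
}
index = build_index(docs)
print(search(index, "fourier analysis"))  # ['notes.txt']
```

The index is built once per folder, so each query only needs a few set intersections instead of rereading every file. This is the general reason an indexed search can answer in well under a second.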
But There's More To It…
- Automatic index updates: Indexes are updated automatically when files in the corresponding folders are modified, even when DocFetcher isn't running. This is done via a daemon that waits in the background and watches all indexed folders. The daemon has very low CPU usage because, rather than indexing files itself, it only remembers which indexes need to be updated the next time DocFetcher is launched.
- A portable version: Runs on both Windows and Linux. You can put all your documents in it and then freely move the entire folder around (i.e., DocFetcher + indexes + documents). Possible destinations include other computers, encrypted volumes (TrueCrypt), CD-ROMs and USB drives. The portable version can also be used for sharing an indexed document repository across a local area network, or across the operating systems of a Windows/Linux dual-boot system.
- Detection of HTML pairs, e.g. "foo.htm" and a folder named "foo_files". Each pair is treated as a single document. This feature may seem rather useless at first sight, but it dramatically improves the quality of the search results when you're searching for HTML files, since all the "clutter" inside the HTML folders disappears from the results.
- Search in source code files: The file extensions by which DocFetcher recognizes plain text and HTML files can be fully customized, so you can use DocFetcher to search in any kind of source code.
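The HTML pair detection described above can be sketched roughly as follows. This is an illustrative guess at the matching rule, based only on the "foo.htm" / "foo_files" example; the function name `find_html_pairs` and the input format are assumptions, not DocFetcher's own code.

```python
import os

# Assumed set of extensions that count as HTML files.
HTML_EXTENSIONS = (".htm", ".html")


def find_html_pairs(entries):
    """Group names like 'foo.htm' with a sibling folder 'foo_files'.

    `entries` maps each name in one directory to True if it is a folder.
    Returns (pairs, remaining): `pairs` is a dict {html_file: folder} and
    `remaining` is the sorted list of entries that belong to no pair.
    """
    pairs = {}
    for name, is_dir in entries.items():
        stem, ext = os.path.splitext(name)
        if is_dir or ext.lower() not in HTML_EXTENSIONS:
            continue
        folder = stem + "_files"
        if entries.get(folder) is True:
            pairs[name] = folder
    paired = set(pairs) | set(pairs.values())
    remaining = sorted(n for n in entries if n not in paired)
    return pairs, remaining


# Hypothetical directory listing.
entries = {
    "foo.htm": False,
    "foo_files": True,
    "bar.html": False,
    "notes.txt": False,
    "baz_files": True,
}
pairs, remaining = find_html_pairs(entries)
print(pairs)      # {'foo.htm': 'foo_files'}
print(remaining)  # ['bar.html', 'baz_files', 'notes.txt']
```

Once a pair is identified, an indexer can treat the folder's contents as part of the HTML file rather than listing each image, script or stylesheet as a separate search result.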
Other Notable Features:
- Regular expression based exclusion of files from indexing.
- Various file operations on the document repository can be performed through DocFetcher's interface (e.g. creating folders, inserting new files).
- Preview panel with search-term highlighting and a simple built-in web browser.
- Search results can be sorted and filtered by different criteria (filetype, filesize, path, etc.).
- Global hotkey to bring DocFetcher to the front.
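As a rough illustration of regular-expression-based exclusion, the sketch below drops files whose names match any of a list of patterns before indexing. The helper name `filter_files`, the sample patterns, and the choice to match the whole file name (rather than the full path) are assumptions; DocFetcher's exact matching rules are not described here.

```python
import os
import re


def filter_files(paths, exclude_patterns):
    """Return the paths whose file name matches none of the patterns.

    Each pattern must match the entire file name (assumed behaviour).
    """
    compiled = [re.compile(p) for p in exclude_patterns]
    kept = []
    for path in paths:
        name = os.path.basename(path)
        if not any(rx.fullmatch(name) for rx in compiled):
            kept.append(path)
    return kept


# Hypothetical file list and exclusion rules.
paths = [
    "docs/report.pdf",
    "docs/~$report.docx",
    "docs/backup/old.tmp",
    "src/main.py",
]
patterns = [r".*\.tmp", r"~\$.*"]
print(filter_files(paths, patterns))  # ['docs/report.pdf', 'src/main.py']
```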
Supported Document Formats:
- HTML and plain text (both customizable)
- Portable Document Format (pdf)
- Microsoft Office (doc, xls, ppt)
- Microsoft Office 2007 (docx, xlsx, pptx)
- OpenOffice.org Writer, Calc, Draw and Impress (odt, ods, odg, odp)
- Rich Text Format (rtf)
- AbiWord (abw, abw.gz, zabw)
- Microsoft Compiled HTML Help (chm)
- Microsoft Visio (vsd)
- Scalable Vector Graphics (svg)
Required: Java Runtime Environment
A Java Runtime Environment (JRE), version 1.6.0 or higher, is required. To find out what JRE version you have, open a command prompt and type in "java -version".
You can download Java for Windows from here. If you're on Linux, have a look at the official software repository of your distro.
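If you would rather check the requirement from a script, the following sketch parses the output of "java -version" and compares it against 1.6. The function names are made up for illustration, and the regular expression assumes the usual banner format such as `java version "1.6.0_45"` or `openjdk version "11.0.2"`; other vendors may print something different.

```python
import re
import subprocess


def parse_java_version(output):
    """Extract (major, minor) from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def meets_requirement(version, minimum=(1, 6)):
    """True if the parsed version is at least the minimum (default 1.6)."""
    return version is not None and version >= minimum


def installed_java_version(java="java"):
    """Run `java -version`; the JVM prints its version banner to stderr."""
    try:
        result = subprocess.run(
            [java, "-version"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return None  # no Java executable on the PATH
    return parse_java_version(result.stderr + result.stdout)


print(meets_requirement(parse_java_version('java version "1.6.0_45"')))  # True
print(meets_requirement(parse_java_version('java version "1.5.0_22"')))  # False
```

Calling `meets_requirement(installed_java_version())` performs the same check against the Java installation found on your PATH.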
Binary Downloads For Current Version Of DocFetcher
Note: If you have a 64-bit OS, you might have to replace an installed 64-bit Java Runtime with its 32-bit counterpart to make DocFetcher work, because 64-bit Java is currently not supported. A portable version is also available for download.
OS Requirements:
Win 2000/Win 2003/Win Me