Fast File Search
Current versions:
Stable: 1.0.14
Development: 1.1.13


What is Fast File Search?

Fast File Search is a crawler for FTP servers and SMB shares, whether they are exported by Windows machines or by UNIX systems running Samba. It provides a web interface for searching the indexed files. It is optimized for wildcard searches whose mask begins or ends with literal characters (characters other than '*' and '?'), for example '*.iso'.
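
Why the position of the literal characters matters can be seen at the SQL level. The sketch below is only an illustration, assuming a files table with a filename column plus a rev_filename column holding the reversed name; it is not necessarily how FFSearch lays out its schema:

  -- '*.iso' as a plain pattern forces a full scan: the leading '%'
  -- prevents MySQL from using an index on filename
  SELECT path, filename FROM files WHERE filename LIKE '%.iso';

  -- with the reversed name stored as well, the same suffix mask becomes
  -- a prefix match, which an index on rev_filename can serve
  SELECT path, filename FROM files WHERE rev_filename LIKE 'osi.%';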

What do I need to run it?

The Fast File Search crawler runs on UNIX (currently only Linux has been tested, but I do not know of any reason why it should not work on other UNIXes). Fast File Search uses a MySQL database, the web interface needs a web server with PHP >= 4.0.3, and the crawler needs some Perl modules (for details check the installation page).

How does it work?

The crawler (ffsearch.pl) crawls the network (FTP servers from the list and all reachable SMB hosts on the local network) and stores information about the files it finds in the database. It is invoked at certain times each day via crontab entries (see the example below). The crawler has two modes of operation: a complete crawl and an incremental crawl. It expects a command line argument that tells it which mode to run in (-c or --complete for a complete crawl, -i or --incremental for an incremental crawl). Both modes retrieve a list of the active SMB hosts in all workgroups.
The complete crawl tries to scan all active hosts and all hosts listed in the database. It should be run once a day.
The incremental crawl tries to scan the active hosts and those hosts in the database that have not been scanned since the last complete crawl because they were unreachable. It should be run several times a day, for example every 3 hours.
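
A possible crontab setup matching this schedule (the installation path is an assumption; adjust it to wherever ffsearch.pl lives):

  # complete crawl once a day, at 04:00
  0 4 * * *    /usr/local/ffsearch/ffsearch.pl --complete
  # incremental crawl every 3 hours
  0 */3 * * *  /usr/local/ffsearch/ffsearch.pl --incremental
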
How does the crawler know whether a host has been crawled since the last complete crawl? Each time a complete crawl is executed, the expire count of every host is incremented first. When a host is crawled, its expire count is set to zero. So all hosts with an expire count > 0 have not been reachable since the last complete crawl. Moreover, when the expire count reaches the value specified in the configuration (i.e. the host was unreachable for that many consecutive complete crawls), the information about the files on the "expired" host is deleted from the database.
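
In SQL terms this bookkeeping boils down to three statements. The hosts/files tables and column names below are illustrative, not FFSearch's actual schema, and the multi-table DELETE requires MySQL 4.0 or later:

  -- at the start of each complete crawl: age every host by one crawl
  UPDATE hosts SET expire_count = expire_count + 1;

  -- whenever a host is successfully crawled: mark it fresh again
  UPDATE hosts SET expire_count = 0 WHERE host_id = 42;

  -- after the crawl: drop the file records of hosts that stayed
  -- unreachable for the configured number of complete crawls (here 3)
  DELETE files FROM files, hosts
  WHERE files.host_id = hosts.host_id AND hosts.expire_count >= 3;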

The web interface is used to search the files in the database; details on how to search are described in the Help section of the search page.
Through the web interface you can also add an FTP server to the FTP server list, edit a server already in the list, or delete one from it. To keep arbitrary visitors from tampering with the server list, the record about host abcdef is editable only from host abcdef itself. There are also admins who can edit all records in the server list; the admins log in through the web interface.
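
The ownership rule amounts to a single lookup before an edit is accepted. A minimal sketch, again with an assumed schema (an ftp_servers table with an owner_host column recording the host the record was added from):

  -- permit the edit only when the requesting host owns the record;
  -- a logged-in admin bypasses this check entirely
  SELECT server_id FROM ftp_servers
  WHERE server_id = 7               -- record being edited
    AND owner_host = 'abcdef';      -- resolved hostname of the requester
  -- zero rows returned => the edit is rejected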