Network Computing is part of the Informa Tech Division of Informa PLC


Panning for Gold: Page 4 of 18

Panoptic supports its own Java-based crawler, called FunnelBack. When you set up a Web collection, you define how the crawler will gather data for the search engine to index. In the advanced settings, you can directly edit a collection configuration file that contains the options for FunnelBack. For example, you can limit the length of time the crawler runs, cap the number of pages it stores, limit the number of clicks (links) away from the home page and define many other settings. In our tests, we excluded a file type so the crawler would disregard Netgravity links. All the crawlers we tested have a similar feature for excluding certain directories or files, in addition to following the directives in a robots.txt file.
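To make those limits concrete, here is a minimal Python sketch of a breadth-first crawl that honors a page cap, a click-depth limit and a set of exclusion patterns. It is not FunnelBack's code or configuration syntax: the function name, the parameter names, the in-memory link graph and the "*.ng" pattern are all hypothetical, chosen only for illustration. The time limit is left out to keep the example short.

```python
from collections import deque
from fnmatch import fnmatchcase


def crawl(link_graph, start, max_pages=100, max_depth=3, exclude_patterns=()):
    """Breadth-first crawl over an in-memory link graph with simple limits.

    link_graph maps each page to the pages it links to (a stand-in for
    fetching and parsing real pages).
    """
    seen = {start}
    queue = deque([(start, 0)])  # (page, clicks away from the start page)
    stored = []
    while queue and len(stored) < max_pages:
        url, depth = queue.popleft()
        stored.append(url)
        if depth == max_depth:
            continue  # at the click limit: keep the page, ignore its links
        for link in link_graph.get(url, []):
            if link in seen:
                continue
            # Skip anything matching an exclusion pattern (hypothetical "*.ng").
            if any(fnmatchcase(link, pattern) for pattern in exclude_patterns):
                continue
            seen.add(link)
            queue.append((link, depth + 1))
    return stored


# Hypothetical site map used only for this example.
site = {
    "/": ["/news", "/ads/banner.ng", "/reviews"],
    "/news": ["/news/2003", "/"],
    "/reviews": ["/reviews/search", "/ads/popup.ng"],
    "/reviews/search": ["/reviews/search/page2"],
}
pages = crawl(site, "/", max_pages=10, max_depth=2, exclude_patterns=["*.ng"])
print(pages)
```

With a depth limit of two clicks, the crawl stores the home page and the four pages reachable within two links, skips both excluded ad links and never reaches the page three clicks away.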




Search Engine Features



FunnelBack took just under nine hours to crawl our production Web site and index 34,720 documents, more than any other participant. Once it completed the crawl, Panoptic made the collection's results immediately available to the default search form.

Unlike Kanisa and MondoSearch, Panoptic does not provide a preview or prepublishing database for testing before going live. Instead, it has two options that protect you from putting a partially collected database into production. A changeover-percentage option specifies the minimum size a newly gathered collection must reach, relative to the collection it is replacing, before it is made available. In addition, Panoptic has a "vital_servers" option, which prevents an update from overwriting your production database if a server is down during the collection process.
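The logic behind those two safeguards can be sketched in a few lines of Python. This is an illustration of the idea, not Panoptic's implementation: the function and parameter names are hypothetical, and it assumes collection size is measured in documents and that the changeover threshold is a percentage of the live collection's size.

```python
def should_publish(new_doc_count, live_doc_count, changeover_percent,
                   vital_servers=(), servers_reached=()):
    """Return True if a freshly crawled collection may replace the live one."""
    # Safeguard 1: every server marked as vital must have been reachable.
    missing = set(vital_servers) - set(servers_reached)
    if missing:
        return False
    # Nothing is live yet, so there is nothing to protect.
    if live_doc_count == 0:
        return True
    # Safeguard 2: the new collection must reach the minimum size,
    # expressed as a percentage of the collection it replaces.
    return new_doc_count * 100 >= live_doc_count * changeover_percent


# A crawl that gathered only about a third of the live collection is held back.
print(should_publish(12000, 34720, changeover_percent=50))
# A full-size crawl is held back anyway if a vital server was down.
print(should_publish(34720, 34720, 50,
                     vital_servers=["www.example.com"], servers_reached=[]))
```

Both calls return False, so in each case the existing production collection would stay in place.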

Panoptic's easy-to-use administrative interface set it apart from Kanisa and MondoSearch. In addition to setting parameters, you can use a form to schedule collection updates through the crontab file; the same task is a multistep process in Kanisa and MondoSearch. Panoptic also keeps extensive log files, but it does not provide the reporting that Kanisa does.

Panoptic Enterprise Search Engine, CSIRO (Commonwealth Scientific and Industrial Research Organisation). +61-2-6216-7060. www.panopticsearch.com


Kanisa Site Search put its best foot forward as a question-and-answer system. Users can enter a question as a sentence or phrase they would use in natural speech. Kanisa answers that question with a question of its own, guiding the user to the most appropriate answer by grouping relevant Web-site content with the guidance question. Compared with the approaches of our other participants, Kanisa's method of getting the end user to the appropriate answer was the most complicated and costly. Although the result can be rewarding, it did not justify the means or the cost.