crawler.secondary_store_root
Location of secondary (previous) store - used in incremental crawling.
Key: crawler.secondary_store_root
Type: String
Can be set in: collection.cfg
Description
This parameter specifies the location of a secondary storage area, for example data stored by previous crawls. During incremental crawling, the web crawler checks the secondary store and does not download content from the web again if it has not changed.
For example, this can be pointed at a "live" data directory and the crawler will make use of this data when crawling into the offline data area.
Note: The path should not include protocol or server names. For example, use something like /opt/cache/data and NOT /opt/cache/data/http
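The rule in the note can be checked with a small script before a value is written to collection.cfg. The following is only an illustrative sketch: the function name check_store_root and the list of protocol segment names are assumptions made for this example and are not part of Funnelback. It also cannot detect server names, since those can look like any ordinary directory name.

```python
from pathlib import PurePosixPath
from typing import List

# Hypothetical list of directory names that suggest a protocol
# directory was appended to the store root by mistake.
PROTOCOL_SEGMENTS = {"http", "https", "ftp"}


def check_store_root(path: str) -> List[str]:
    """Return a list of problems found in a candidate store root path."""
    problems = []
    # A URL-style value such as http://host/... is never a valid store root.
    if "://" in path:
        problems.append("contains a URL scheme")
    # Flag any path component that is named after a protocol.
    parts = PurePosixPath(path).parts
    bad = [p for p in parts if p.lower() in PROTOCOL_SEGMENTS]
    if bad:
        problems.append("contains protocol segment(s): " + ", ".join(bad))
    return problems


if __name__ == "__main__":
    for candidate in ("/opt/cache/data", "/opt/cache/data/http"):
        print(candidate, check_store_root(candidate) or "looks OK")
```

An empty list means the path passed these simple checks. It does not confirm that the directory exists or that it holds crawl data.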
When a web collection is created, the Funnelback administration interface inserts the correct location for this parameter.
Default Value
crawler.secondary_store_root=$SEARCH_HOME/data/$COLLECTION_NAME/live/data
Examples
Location of the live data area for a given collection:
crawler.secondary_store_root=/opt/funnelback/data/collection/live/data
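The example value follows the same pattern as the default, with the two placeholders filled in. The sketch below shows that relationship using Python's string.Template. It assumes that $SEARCH_HOME and $COLLECTION_NAME behave as simple substitution variables, and the values /opt/funnelback and collection are taken from the example above rather than from any real installation. The helper name resolve_default is hypothetical.

```python
from string import Template

# Default value pattern as documented for crawler.secondary_store_root.
DEFAULT_PATTERN = "$SEARCH_HOME/data/$COLLECTION_NAME/live/data"


def resolve_default(search_home: str, collection_name: str) -> str:
    """Fill in the two placeholders of the default pattern (illustrative only)."""
    return Template(DEFAULT_PATTERN).substitute(
        SEARCH_HOME=search_home,
        COLLECTION_NAME=collection_name,
    )


if __name__ == "__main__":
    # Assumed install location and collection name, matching the example.
    path = resolve_default("/opt/funnelback", "collection")
    print("crawler.secondary_store_root=" + path)
```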