crawler.incremental_logging (collection.cfg setting)
This parameter controls whether information about the different URL types seen during an incremental crawl is logged. If this setting is enabled, the following log files will be created:
- new_urls.log: All new URLs, where a new URL is one that was not stored in the previous crawl.
- copied_urls.log: All URLs whose content was copied from the previous crawl because it had not changed and so was not downloaded again.
The logs are written to the log directory of the relevant "view" of the collection and can be viewed using the log viewer.
The default value is false, i.e. this logging is not performed.
Turn on incremental logging:
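A minimal sketch, assuming the usual key=value syntax of collection.cfg and that "true" is the enabling counterpart of the default "false" described above:

```
crawler.incremental_logging=true
```

As the description above implies, the new_urls.log and copied_urls.log files are only produced by an incremental crawl, so a crawl that starts from scratch would not be expected to populate them.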