crawler.store_empty_content_urls (collection.cfg setting)

Description

This parameter tells the web crawler to store URLs even if they contain no content after filtering. Storing such URLs can be useful when, for example, they are PDF documents containing only images, which can still be returned on the basis of anchor text or words in the URL alone.

When enabled, any URLs which are stored despite having no content will be listed in the url_no_content.log file.
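
As a minimal sketch, enabling this behaviour would mean overriding the default in the collection's collection.cfg with the line below. Only the setting name comes from this page; the value true is assumed to be the counterpart of the documented default of false.

```
crawler.store_empty_content_urls=true
```

With this line in place, URLs stored despite having no content should then appear in url_no_content.log, as described above.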

Default value

crawler.store_empty_content_urls=false

v15.16.0