
crawler.store_empty_content_urls

Specifies whether URLs that contain no content after filtering should be stored.

Key: crawler.store_empty_content_urls
Type: Boolean
Can be set in: collection.cfg

Description

This parameter tells the web crawler to store URLs even if they contain no content after filtering. Storing such URLs can be useful when, for example, they are PDF documents that contain only images: these can still be returned on the basis of anchor text or words in the URL alone.

When this option is enabled, any URLs that are stored despite having no content are listed in the url_no_content.log file.
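
As a minimal illustration, the option could be enabled by adding the following line to the collection's collection.cfg. This simply inverts the default value and assumes no other crawler settings need to change:

crawler.store_empty_content_urls=true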

Default Value

crawler.store_empty_content_urls=false



v15.22.0