crawler.max_dir_depth

Specifies the maximum number of subdirectories a URL may contain before the crawler ignores it.

Key: crawler.max_dir_depth
Type: Integer
Can be set in: collection.cfg

Description

This option sets the maximum number of directories allowed in a valid URL. The crawler ignores any URL that has more than this number of directories. A URL with a very large number of directories is likely to be a crawler trap, so this limit should not be set too high.

Note: this limit is not checked for dynamic URLs, e.g. ones containing a '?'.

Default Value

crawler.max_dir_depth=15

Examples

crawler.max_dir_depth=2

Will have the following effect (the first URL has two directories and is accepted; the second has three and is ignored):

http://host/one/two/ok
http://host/one/two/three/fails
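
The rule above can be illustrated with a short Python sketch. This is not Funnelback's implementation: the helper names dir_depth and is_within_depth are hypothetical, and the sketch assumes that the last path segment is a file name unless the path ends in '/', and that a URL is "dynamic" whenever it contains a '?'.

```python
from urllib.parse import urlsplit


def dir_depth(url: str) -> int:
    """Count the directories in a URL path (hypothetical helper).

    Assumption: the final segment is a file name, not a directory,
    unless the path ends with a slash.
    """
    path = urlsplit(url).path
    segments = [s for s in path.split("/") if s]
    if path.endswith("/") or not segments:
        return len(segments)
    return len(segments) - 1


def is_within_depth(url: str, max_dir_depth: int = 15) -> bool:
    """Return True if the URL would pass the directory depth check."""
    # Dynamic URLs (assumed here: any URL containing '?') are not checked.
    if "?" in url:
        return True
    return dir_depth(url) <= max_dir_depth


if __name__ == "__main__":
    for url in ("http://host/one/two/ok", "http://host/one/two/three/fails"):
        verdict = "accepted" if is_within_depth(url, 2) else "ignored"
        print(url, verdict)
```

With a limit of 2, the sketch accepts the first example URL and ignores the second, matching the behaviour described in this section.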

v15.22.0