
crawler.accept_files (collection.cfg setting)

Description

A comma-separated list of file extensions that the crawler will download. It is normally left empty, so that the crawler accepts all valid content regardless of suffix.

Default value

The default value is empty. This means there are no restrictions on which files will be downloaded.

Examples

crawler.accept_files=htm,html,asp,php,txt,stm,jsp,xml,cfm,pdf

In this example, a specific list of file types (identified by suffix) is given. Only files whose suffix appears in the list will be downloaded.
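The suffix check described above can be illustrated with a minimal Python sketch. This is not the crawler's actual implementation: the function names `parse_accept_files` and `is_accepted` are hypothetical, and the sketch assumes that matching is case-insensitive and that a URL with no suffix is rejected when the list is non-empty. Neither behaviour is stated on this page.

```python
import posixpath
from urllib.parse import urlparse


def parse_accept_files(value):
    """Split a comma-separated setting value into a set of lowercase suffixes."""
    return {ext.strip().lower() for ext in value.split(",") if ext.strip()}


def is_accepted(url, accepted):
    """Return True if the URL's path suffix is in the accepted set.

    An empty set means no restriction, mirroring the empty default.
    Assumption: case-insensitive matching; URLs without a suffix are
    rejected when the set is non-empty.
    """
    if not accepted:
        return True
    path = urlparse(url).path
    suffix = posixpath.splitext(path)[1].lstrip(".").lower()
    return suffix in accepted


accepted = parse_accept_files("htm,html,asp,php,txt,stm,jsp,xml,cfm,pdf")
print(is_accepted("http://example.com/docs/report.pdf", accepted))
print(is_accepted("http://example.com/img/logo.png", accepted))
```

With the example list, the PDF URL passes the check and the PNG URL does not. Passing an empty string to `parse_accept_files` yields an empty set, so every URL is accepted.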

v15.16.0