filter.ignore.mimeTypes
Specifies a list of MIME types for the filter to ignore.
Key: filter.ignore.mimeTypes
Type: List<String>
Can be set in: collection.cfg
Description
This parameter specifies an optional comma-separated list of MIME types that the filter should ignore. Documents served with a listed MIME type are stored as is, without filtering.
Default Value
filter.ignore.mimeTypes=
Examples
If some .mov video files are being served with the MIME type application/octet-stream and you want to store them as is (without filtering):
filter.ignore.mimeTypes=application/octet-stream
You may also need to add the relevant suffix (in this case ".mov") to the crawler.non_html parameter and remove it from crawler.reject_files. You may also need to consider which crawler.classes.URLStore to use; for example, MirrorStore will store the content as separate files.
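Because the value is a comma-separated list, several MIME types can be ignored in one setting. As a sketch only, assuming the same .mov files are sometimes also served as video/quicktime (an assumption for illustration, not something stated above):
filter.ignore.mimeTypes=application/octet-stream,video/quicktime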