crawler.reject_files

Do not crawl files with these extensions.

Key: crawler.reject_files
Type: List<String>
Can be set in: collection.cfg

Description

A comma-separated list of file extensions to reject, written without a leading dot (as in the default value). The crawler will not download any file whose URL ends with an extension in this list.
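
The rule above can be illustrated with a short Python sketch. This is not the crawler's actual implementation: the helper names `parse_reject_files` and `is_rejected` are hypothetical, and the sketch assumes that only the URL path is inspected (any query string is ignored) and that the comparison is case-sensitive. The real crawler may behave differently on both points.

```python
from urllib.parse import urlsplit


def parse_reject_files(line):
    """Parse a 'crawler.reject_files=a,b,c' line into a set of extensions."""
    key, _, value = line.partition("=")
    if key.strip() != "crawler.reject_files":
        raise ValueError("not a crawler.reject_files setting")
    return {ext.strip() for ext in value.split(",") if ext.strip()}


def is_rejected(url, extensions):
    """Return True if the URL's file name ends with '.<ext>' for a listed extension.

    Assumptions (not confirmed by the documentation): only the path part of
    the URL is checked, and matching is case-sensitive.
    """
    path = urlsplit(url).path
    filename = path.rsplit("/", 1)[-1]
    # rpartition yields an empty separator when the file name has no dot
    _, dot, ext = filename.rpartition(".")
    return bool(dot) and ext in extensions


# A shortened list, written as it would appear in collection.cfg
rejected = parse_reject_files("crawler.reject_files=gif,jpg,zip")

print(is_rejected("http://example.com/images/logo.gif", rejected))
print(is_rejected("http://example.com/index.html", rejected))
```

Under these assumptions, the first URL would be skipped and the second would be crawled. Setting the key in collection.cfg replaces the default list, so any default extensions that should still be rejected need to be repeated in the new value.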

Default Value

crawler.reject_files=asc,asf,asx,avi,bat,bib,bin,bmp,bz2,c,class,cpp,css,deb,dll,dmg,dvi,exe,fits,fts,gif,gz,h,ico,jar,java,jpeg,jpg,lzh,man,mid,mov,mp3,mp4,mpeg,mpg,o,old,pgp,png,ppm,qt,ra,ram,rpm,svg,swf,tar,tcl,tex,tgz,tif,tiff,vob,wav,wmv,wrl,xpm,zip,Z

v15.24.0