
filter.jsoup.classes (collection.cfg setting)

Description

This setting specifies a list of Java/Groovy classes that are run by the Jsoup filter.

Values

The value of this setting is a comma-separated list of filter class names. The filters run in the order specified (left to right).

Each name should be either a fully qualified Java/Groovy class name or a simple class name; simple class names are assumed to exist within the com.funnelback.common.filter.jsoup package. Groovy classes are loaded from $SEARCH_HOME/lib/java/groovy or the collection's @groovy directory. A Groovy class declared within a package must be placed in a subdirectory structure, below one of those directories, that mirrors its package name.
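The naming rules above can be illustrated with a small, self-contained Java sketch. It is not Funnelback's own loader: the class name FilterClassNames and its methods are hypothetical, and trimming whitespace, skipping empty entries, and the .groovy file extension are assumptions made for the illustration. It only shows how a setting value maps to fully qualified class names, in order, and where a packaged Groovy class would be expected relative to the Groovy directory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Funnelback.
class FilterClassNames {
    // Package that simple class names are assumed to live in (from the text above).
    static final String DEFAULT_PACKAGE = "com.funnelback.common.filter.jsoup";

    // Splits the setting value on commas and returns fully qualified names,
    // preserving the left-to-right order in which the filters would run.
    // Assumption: surrounding whitespace is trimmed and empty entries are skipped.
    static List<String> resolve(String settingValue) {
        List<String> resolved = new ArrayList<>();
        for (String raw : settingValue.split(",")) {
            String name = raw.trim();
            if (name.isEmpty()) {
                continue;
            }
            // A name containing a dot is treated as already fully qualified.
            resolved.add(name.contains(".") ? name : DEFAULT_PACKAGE + "." + name);
        }
        return resolved;
    }

    // Maps a fully qualified class name to the relative path a Groovy source
    // file would need so that its location mirrors its package.
    static String groovyRelativePath(String qualifiedName) {
        return qualifiedName.replace('.', '/') + ".groovy";
    }

    public static void main(String[] args) {
        String value = "ContentGeneratorUrlDetection,UndesirableText,com.example.CustomFilter";
        for (String name : resolve(value)) {
            System.out.println(name);
        }
        System.out.println(groovyRelativePath("com.example.CustomFilter"));
    }
}
```

Under these assumptions, a class declared in the com.example package would sit at com/example/CustomFilter.groovy below $SEARCH_HOME/lib/java/groovy or the collection's @groovy directory.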

Default value

filter.jsoup.classes=ContentGeneratorUrlDetection,FleschKincaidGradeLevel,UndesirableText

Example

Add an additional custom Jsoup filter (com.example.CustomFilter) that will process the HTML after all the default Jsoup filters have run:

filter.jsoup.classes=ContentGeneratorUrlDetection,FleschKincaidGradeLevel,UndesirableText,com.example.CustomFilter

v15.16.0