Stemming
Introduction
Stemming reduces words to a common stem so that a search can match different variants of a word that share that stem.
The default Funnelback configuration supports automatic stemming so that queries match closely related words, e.g. "parties" may also match "party". However, stemming can sometimes harm retrieval effectiveness, e.g. by returning documents containing "Hawk" or "Hawkins" for the query "Hawking".
Stemming is controlled with the query processor option -stem.
Light stemming
This is the default for Funnelback. Light stemming treats the singular and plural forms of the same word as equivalent, e.g. dog/dogs, worry/worries. Support is provided for English and French words.
Light stemming is applied by setting the query processor option:
query_processor_options= -stem=2
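To make the idea of singular/plural matching concrete, here is a minimal Python sketch. It is not Funnelback's stemming algorithm; the function names light_stem and matches and the suffix rules are assumptions chosen for illustration, and they cover only a few regular English plurals.

```python
def light_stem(word: str) -> str:
    """Toy plural-to-singular reduction for English (illustrative only)."""
    w = word.lower()
    # worries -> worry, parties -> party
    if len(w) > 4 and w.endswith("ies"):
        return w[:-3] + "y"
    # dogs -> dog, but leave words such as "glass" or "status" alone
    if len(w) > 3 and w.endswith("s") and not w.endswith(("ss", "us", "is")):
        return w[:-1]
    return w


def matches(query_term: str, document_term: str) -> bool:
    """Two terms match when they reduce to the same stem."""
    return light_stem(query_term) == light_stem(document_term)


if __name__ == "__main__":
    for query, term in [("party", "parties"), ("dogs", "dog"), ("Hawking", "Hawk")]:
        print(query, term, matches(query, term))
```

Under these toy rules "party" matches "parties", while "Hawking" does not match "Hawk" because only plural endings are stripped.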
Heavy stemming
Heavy stemming is a limited extension designed to cover subject/professional matching, e.g. science/scientist, biology/biologist. It does not stem participles, so "bullying" is not considered equivalent to "bully", just as "Hawking" is not equivalent to "Hawk" or "Hawks".
Heavy stemming is applied by setting the query processor option:
query_processor_options= -stem=3
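The subject/professional matching described above can be pictured as mapping a profession onto its subject before comparing terms. The following Python sketch uses a small hand-written lookup table; the table SUBJECT_OF and the functions heavy_key and equivalent are hypothetical, and this is an assumption about the behaviour, not how Funnelback implements heavy stemming.

```python
# Hypothetical lookup table: profession -> subject (illustrative only).
SUBJECT_OF = {
    "scientist": "science",
    "biologist": "biology",
}


def heavy_key(word: str) -> str:
    """Map a profession to its subject; leave every other word unchanged."""
    w = word.lower()
    return SUBJECT_OF.get(w, w)


def equivalent(a: str, b: str) -> bool:
    """Two terms are equivalent when they share the same key."""
    return heavy_key(a) == heavy_key(b)


if __name__ == "__main__":
    print(equivalent("science", "scientist"))  # subject/professional pair
    print(equivalent("bullying", "bully"))     # participle, left untouched
```

Because participles are not in the table, "bullying" and "bully" keep different keys, mirroring the behaviour described for participles.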
Disable stemming
Stemming is disabled by setting the query processor option:
query_processor_options= -stem=0
Note: -stem=1 is a discontinued option and has the same effect as setting -stem=0.
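As a summary of the values described on this page, the short Python sketch below maps each stemming mode to its configuration line and normalises the discontinued value 1 to 0. The names STEM_LEVELS, stem_option and effective_level are hypothetical helpers for illustration and are not part of Funnelback.

```python
# Stemming modes and their -stem values, as listed on this page.
STEM_LEVELS = {"off": 0, "light": 2, "heavy": 3}


def stem_option(mode: str) -> str:
    """Return the configuration line for a named stemming mode."""
    return f"query_processor_options= -stem={STEM_LEVELS[mode]}"


def effective_level(level: int) -> int:
    """-stem=1 is discontinued and behaves like -stem=0."""
    if level not in (0, 1, 2, 3):
        raise ValueError(f"unsupported -stem value: {level}")
    return 0 if level == 1 else level


if __name__ == "__main__":
    for mode in STEM_LEVELS:
        print(stem_option(mode))
```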