Controlling metadata field weighting
Funnelback's ranking algorithm includes settings that control metadata weightings. These settings can be used to upweight or downweight results where the query terms appear within specific metadata fields. This is achieved by setting the wmeta ranking options.
Scoring mode 2
The -sco=2 ranking option allows specification of the metadata fields that will be considered as part of the ranking algorithm.
By default, link text, clicked queries and titles are included (-sco=2[k,K,t]). The list of metadata fields to use with -sco=2 is defined within square brackets when setting the value.
For example, scoring can be applied to the default metadata fields as well as customField1 and customField2 by listing all five fields inside the square brackets.
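A sketch of what such a setting might look like, assuming customField1 and customField2 are metadata classes already mapped for the collection (both names are illustrative):

```
-sco=2[k,K,t,customField1,customField2]
```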
Once scoring mode 2 is enabled, separate weightings can be assigned to each defined field using a corresponding wmeta ranking option.
A default weighting of 1.0 is applied to all listed metadata fields except for anchor text (k) and click information (K) which both receive a default weighting of 0.5.
A larger value provides a bigger upweight.
Individual weights can be applied. For example, the default upweighting of the t metadata field can be reduced by setting its wmeta option to a value below 1.0.
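As an illustration, the options below would halve the weighting given to the t field while leaving the other default fields untouched. The value 0.5 is an arbitrary choice for this example, not a recommended setting:

```
-sco=2[k,K,t] -wmeta.t=0.5
```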
Assume that the following metadata is mapped for a collection (in the metamap.cfg):
description,1,dc.description
author,0,dc.author
section,0,site.section
datePublished,0,dc.date.published
dateModified,0,date.modified
articleText,1,article.content
articleTitle,1,article.title
articleKeywords,1,article.subjects
articleAbstract,1,articleAbstract
The following ranking options (set as part of the query_processor_options within collection.cfg) could be used to upweight the text within the articleTitle, articleAbstract and articleText metadata classes:
-sco=2[articleText,articleTitle,articleAbstract] -wmeta.articleText=0.3 -wmeta.articleAbstract=0.75 -wmeta.articleTitle=1.0
This tells Funnelback to consider the articleText, articleTitle and articleAbstract fields when ranking (the -sco=2 parameter), then sets a weighting for each field: articleText and articleAbstract receive reduced weightings of 0.3 and 0.75, while articleTitle is set explicitly to 1.0, which is the same as the default.
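A sketch of how these options might appear in collection.cfg, assuming the usual key=value layout and that no other query processor options are already set for the collection (any existing options on that line would need to be kept alongside the new ones):

```
query_processor_options=-sco=2[articleText,articleTitle,articleAbstract] -wmeta.articleText=0.3 -wmeta.articleAbstract=0.75 -wmeta.articleTitle=1.0
```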