Padre Cooler Options

Description

This page describes the options available for tuning ranking with the cool query processor option. For more information about how ranking works, see Funnelback_Ranking_Algorithms.

These options can be set either in the query processor options (collection.cfg) or as CGI parameters (e.g. ...&cool.2=12&cool.3=34...).
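As an illustration of the CGI form, the following Python sketch builds a search URL carrying one cool.N parameter per cooler weight. The endpoint, the collection name and the helper build_search_url are hypothetical and only serve the example; the cool.N parameter names follow the pattern shown above.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, collection, cool_weights):
    """Return a search URL with one cool.<number> parameter per cooler weight.

    base_url and collection are placeholders for this sketch; cool_weights
    maps a cooler option number (see the list on this page) to its weight.
    """
    params = {"collection": collection, "query": query}
    # Each cooler option becomes a CGI parameter named cool.<number>.
    for number, weight in sorted(cool_weights.items()):
        params[f"cool.{number}"] = weight
    return f"{base_url}?{urlencode(params)}"


url = build_search_url(
    "https://search.example.com/s/search.html",  # hypothetical endpoint
    "annual report",
    "example-collection",  # hypothetical collection name
    {2: 12, 3: 34},  # offlink weight 12, urllen weight 34
)
print(url)
```

The query string produced ends with cool.2=12&cool.3=34, matching the form of the CGI example above.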

List of cooler options

Number Description
0 content: content weight
1 onlink: onsite link weight
2 offlink: offsite link weight
3 urllen: URL length weight
4 qie: external evidence (qie) weight
5 date_proximity: proximity to current date weight
6 urltype: URL attractiveness (homepages favoured; copyright pages and URLs with lots of punctuation penalised)
7 annie: annotation weight (annie)
8 domain_weight: weight associated with this domain
9 geoprox: geographical proximity to origin
10 nonbin: non-binariness (1 for html, xml, txt, 0 otherwise)
11 no_ads: freedom from ads
12 imp_phrase: implicit phrase match score
13 consistency: consistency of evidence. (Extra reward for docs with non-zero scores on both content and annie.)
14 log_annie: logarithm of annotation weight (log(annie))
15 anlog_annie: absolute-normalised logarithm of annotation weight.
16 annie_rank: annotation rank = (k - rank) / k, where k = 2 x highest rank requested; if rank > k, rank is set to k
17 BM25F: field-weighted Okapi score
18 an_okapi: absolute-normalised Okapi score.
19 BM25F_rank: field-weighted Okapi rank.
20 mainhosts: bias in favour of principal servers (web search only).
21 comp_wt: component collection weighting. (meta collections only).
22 document_number: document number in the crawl. An early position in the crawl may correlate with importance
23 host_incoming_link_score
24 host_click_score
25 host_linking_hosts_score
26 host_linked_hosts_score
27 host_rank_in_crawl_order_score
28 host_domain_shallowness_score
29 doc_matches_regex: document matches administrator supplied regex
30 doc_does_not_match_regex: document does not match administrator supplied regex
31 titleWords: number of words in title
32 contentWords: number of indexed words in document
33 compressionFactor: compressibility of document text
34 entropy: entropy of document
35 stopwordFraction: fraction of stopwords in the document
36 stopwordCover: fraction of stopword list present in the document
37 averageTermLen: average term length
38 distinctWords: number of distinct words in the document
39 maxFreq: frequency of most frequently occurring term
40 titleWords_neg: Neg number of words in title
41 contentWords_neg: Neg number of indexed words in document
42 compressionFactor_neg: Neg compressibility of document text
43 entropy_neg: Neg entropy of document
44 stopwordFraction_neg: Neg fraction of stopwords in the document
45 stopwordCover_neg: Neg fraction of stopword list present in the document
46 averageTermLen_neg: Neg average term length
47 distinctWords_neg: Neg number of distinct words in the document
48 maxFreq_neg: Neg frequency of most frequently occurring term
49 titleWords_abs: Abs number of words in title
50 contentWords_abs: Abs number of indexed words in document
51 compressionFactor_abs: Abs compressibility of document text
52 entropy_abs: Abs entropy of document
53 stopwordFraction_abs: Abs fraction of stopwords in the document
54 stopwordCover_abs: Abs fraction of stopword list present in the document
55 averageTermLen_abs: Abs average term length
56 distinctWords_abs: Abs number of distinct words in the document
57 maxFreq_abs: Abs frequency of most frequently occurring term
58 titleWords_abs_neg: Neg abs number of words in title
59 contentWords_abs_neg: Neg abs number of indexed words in document
60 compressionFactor_abs_neg: Neg abs compressibility of document text
61 entropy_abs_neg: Neg abs entropy of document
62 stopwordFraction_abs_neg: Neg abs fraction of stopwords in the document
63 stopwordCover_abs_neg: Neg abs fraction of stopword list present in the document
64 averageTermLen_abs_neg: Neg abs average term length
65 distinctWords_abs_neg: Neg abs number of distinct words in the document
66 maxFreq_abs_neg: Neg abs frequency of most frequently occurring term
67 lexical_span_score
68 doc_matches_cgscope1: Documents which match gscope defined by -cgscope1 (if defined)
69 doc_matches_cgscope2: Documents which match gscope defined by -cgscope2 (if defined)
70 doc_does_not_match_cgscope1: Documents which do not match gscope defined by -cgscope1 (if defined)
71 doc_does_not_match_cgscope2: Documents which do not match gscope defined by -cgscope2 (if defined)
72 raw_annie: Untransformed annie score linearly scaled to 0..1
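The formula given for option 16 (annie_rank) can be worked through in a few lines of Python. This is only a reading of the description in the list above, not Padre's implementation; the function name and the interpretation of "highest rank requested" as a plain integer argument are assumptions.

```python
def annie_rank_score(rank, highest_rank_requested):
    """Score (k - rank) / k with k = 2 x highest rank requested.

    Ranks beyond k are clamped to k, so the score never drops below 0.
    Assumption: both arguments are positive integers.
    """
    k = 2 * highest_rank_requested
    if k <= 0:
        raise ValueError("highest_rank_requested must be positive")
    clamped_rank = min(rank, k)  # if rank > k, rank = k
    return (k - clamped_rank) / k


# With 10 results requested, k is 20.
print(annie_rank_score(1, 10))   # top-ranked annotation match
print(annie_rank_score(25, 10))  # rank beyond k is clamped, score is 0.0
```

Under this reading the score falls linearly from just under 1 for the best rank to 0 at rank k and beyond.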

Values

Values are unbounded, but typical weights range from 0 to 100.

Example

To set the query processor to ignore URL length, but give a high weight to phrase matches implied by the query:

 query_processor_options=-cool.3=0 -cool.12=100
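When several weights are tuned together, the configuration line can be generated rather than typed by hand. The sketch below is a hypothetical helper, not part of Funnelback; it takes the option names and numbers from the list on this page and emits a line in the same form as the example above.

```python
# Subset of the cooler list on this page: option name -> option number.
COOLER_NUMBERS = {
    "content": 0,
    "urllen": 3,
    "imp_phrase": 12,
}


def cooler_options_line(weights_by_name):
    """Build a query_processor_options line from named cooler weights."""
    numbered = {COOLER_NUMBERS[name]: w for name, w in weights_by_name.items()}
    # Emit flags in ascending option-number order for a stable result.
    flags = [f"-cool.{n}={w}" for n, w in sorted(numbered.items())]
    return "query_processor_options=" + " ".join(flags)


line = cooler_options_line({"urllen": 0, "imp_phrase": 100})
print(line)  # query_processor_options=-cool.3=0 -cool.12=100
```

Referring to options by name keeps the intent readable, while the output still uses the numeric cool.N form.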

v15.24.0