Padre Cooler Options

Description

This page describes the options available for tuning ranking with the cool query processor option. For more information about how ranking works, see Funnelback_Ranking_Algorithms.

These options can be set either in the query processor options (collection.cfg) or as CGI parameters (e.g. ...&cool.2=12&cool.3=34...).
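
As a rough illustration of the CGI form, the sketch below renders a mapping of cooler numbers to weights as a parameter string. The helper name cool_cgi_params is hypothetical and not part of Funnelback; only the cool.N=value parameter shape comes from this page.

```python
from urllib.parse import urlencode


def cool_cgi_params(weights):
    """Render {cooler_number: weight} as CGI parameters, e.g. cool.2=12&cool.3=34.

    Hypothetical helper for illustration only; it is not a Funnelback API.
    """
    # Sort by cooler number so the generated string is stable.
    ordered = sorted(weights.items())
    return urlencode({f"cool.{number}": weight for number, weight in ordered})


if __name__ == "__main__":
    # Append the result to an existing search URL's query string.
    print(cool_cgi_params({3: 34, 2: 12}))
```

The resulting string would be appended to the other parameters of a search request, which are not shown here.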

List of cooler options

Number	Description
0	content: content weight
1	onlink: onsite link weight
2	offlink: offsite link weight
3	urllen: URL length weight
4	qie: external evidence (qie) weight
5	date_proximity: proximity to current date weight
6	urltype: URL attractiveness (Homepages favoured. Copyright pages and URLs with lots of punctuation deprecated.)
7	annie: annotation weight (annie)
8	domain_weight: weight associated with this domain
9	geoprox: geographical proximity to origin
10	nonbin: non-binariness (1 for html, xml, txt, 0 otherwise)
11	no_ads: freedom from ads
12	imp_phrase: implicit phrase match score
13	consistency: consistency of evidence. (Extra reward for docs with non-zero scores on both content and annie.)
14	log_annie: logarithm of annotation weight (log(annie))
15	anlog_annie: absolute-normalised logarithm of annotation weight.
16	annie_rank: annotation rank = (k - rank) / k, where k = 2 x highest rank requested; if rank > k, rank is set to k
17	BM25F: field-weighted Okapi score
18	an_okapi: absolute-normalised Okapi score.
19	BM25F_rank: field-weighted Okapi rank.
20	mainhosts: bias in favour of principal servers (web search only).
21	comp_wt: component collection weighting. (meta collections only).
22	document_number: document number in the crawl. An early position in the crawl may correlate with importance
23	host_incoming_link_score
24	host_click_score
25	host_linking_hosts_score
26	host_linked_hosts_score
27	host_rank_in_crawl_order_score
28	host_domain_shallowness_score
29	doc_matches_regex: document matches administrator supplied regex
30	doc_does_not_match_regex: document does not match administrator supplied regex
31	titleWords: number of words in title
32	contentWords: number of indexed words in document
33	compressionFactor: compressibility of document text
34	entropy: entropy of document
35	stopwordFraction: fraction of stopwords in the document
36	stopwordCover: fraction of stopword list present in the document
37	averageTermLen: average term length
38	distinctWords: number of distinct words in the document
39	maxFreq: frequency of most frequently occurring term
40	titleWords_neg: Neg number of words in title
41	contentWords_neg: Neg number of indexed words in document
42	compressionFactor_neg: Neg compressibility of document text
43	entropy_neg: Neg entropy of document
44	stopwordFraction_neg: Neg fraction of stopwords in the document
45	stopwordCover_neg: Neg fraction of stopword list present in the document
46	averageTermLen_neg: Neg average term length
47	distinctWords_neg: Neg number of distinct words in the document
48	maxFreq_neg: Neg frequency of most frequently occurring term
49	titleWords_abs: Abs number of words in title
50	contentWords_abs: Abs number of indexed words in document
51	compressionFactor_abs: Abs compressibility of document text
52	entropy_abs: Abs entropy of document
53	stopwordFraction_abs: Abs fraction of stopwords in the document
54	stopwordCover_abs: Abs fraction of stopword list present in the document
55	averageTermLen_abs: Abs average term length
56	distinctWords_abs: Abs number of distinct words in the document
57	maxFreq_abs: Abs frequency of most frequently occurring term
58	titleWords_abs_neg: Neg abs number of words in title
59	contentWords_abs_neg: Neg abs number of indexed words in document
60	compressionFactor_abs_neg: Neg abs compressibility of document text
61	entropy_abs_neg: Neg abs entropy of document
62	stopwordFraction_abs_neg: Neg abs fraction of stopwords in the document
63	stopwordCover_abs_neg: Neg abs fraction of stopword list present in the document
64	averageTermLen_abs_neg: Neg abs average term length
65	distinctWords_abs_neg: Neg abs number of distinct words in the document
66	maxFreq_abs_neg: Neg abs frequency of most frequently occurring term
67	lexical_span_score
68	doc_matches_cgscope1: Documents which match gscope defined by -cgscope1 (if defined)
69	doc_matches_cgscope2: Documents which match gscope defined by -cgscope2 (if defined)
70	doc_does_not_match_cgscope1: Documents which do not match gscope defined by -cgscope1 (if defined)
71	doc_does_not_match_cgscope2: Documents which do not match gscope defined by -cgscope2 (if defined)
72	raw_annie: Untransformed annie score linearly scaled to 0..1
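
The annie_rank formula (option 16) can be sketched as follows. This is one reading of the table entry, not Padre's actual implementation: the function and parameter names are hypothetical, and it assumes "highest rank requested" is a positive integer.

```python
def annie_rank_score(rank, highest_rank_requested):
    """Compute (k - rank) / k with k = 2 x highest rank requested.

    Ranks greater than k are clamped to k, so the score never drops below 0.
    Hypothetical sketch of the formula in the table, not Padre source code.
    """
    k = 2 * highest_rank_requested
    if k <= 0:
        raise ValueError("highest_rank_requested must be positive")
    clamped_rank = min(rank, k)  # if rank > k, rank = k
    return (k - clamped_rank) / k


if __name__ == "__main__":
    # With 10 results requested, k = 20.
    for rank in (1, 10, 50):
        print(rank, annie_rank_score(rank, 10))
```

Under this reading the score falls linearly as the rank grows and reaches 0 for any rank at or beyond k.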

Values

Values are unbounded, but typical weights range from 0 to 100.

Example

To set the query processor to ignore URL length, but give a high weight to phrase matches implied by the query:

 query_processor_options=-cool.3=0 -cool.12=100
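
A line like the one above can also be generated from option names rather than numbers. The sketch below is a minimal illustration: the COOLER_NUMBERS mapping copies two entries from the table on this page, and the function name is hypothetical. It emits only the cool flags, so any other query processor options already in collection.cfg would need to be merged in separately.

```python
# Subset of the name-to-number table on this page; extend as needed.
COOLER_NUMBERS = {
    "urllen": 3,
    "imp_phrase": 12,
}


def query_processor_options(weights_by_name):
    """Build a collection.cfg line from {cooler_name: weight}.

    Hypothetical helper for illustration; it only writes -cool.N=value flags.
    """
    flags = [
        f"-cool.{COOLER_NUMBERS[name]}={weight}"
        for name, weight in weights_by_name.items()
    ]
    return "query_processor_options=" + " ".join(flags)


if __name__ == "__main__":
    # Ignore URL length, weight implicit phrase matches heavily.
    print(query_processor_options({"urllen": 0, "imp_phrase": 100}))
```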
