Query Logs

Introduction

The Funnelback query logs record the queries that have been processed by the search service. They are the basis for tasks such as generating reports on how the search service is being used.

Log Locations

The Funnelback query log is located at:

data_root/collection_name/live/log/queries.log (e.g. /opt/funnelback/data/shakespeare/live/log/queries.log)

This log file is updated for each query.

When the collection is updated, the log file is archived by moving it to the directory:

data_root/collection_name/archive

The archived file is named according to the current date:

Example: /opt/funnelback/data/shakespeare/archive/queries.log.20040715
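The two locations above can be sketched in Python as follows. This is only an illustration: the helper names live_query_log and archived_query_log are hypothetical, and the YYYYMMDD date suffix is inferred from the single example file name above rather than from a documented rule.

```python
from datetime import date
from pathlib import PurePosixPath


def live_query_log(data_root, collection):
    """Path of the live query log for a collection."""
    return PurePosixPath(data_root) / collection / "live" / "log" / "queries.log"


def archived_query_log(data_root, collection, day):
    """Path of the query log archived on the given day.

    Assumption: the suffix is the date formatted as YYYYMMDD,
    as in the example queries.log.20040715.
    """
    return PurePosixPath(data_root) / collection / "archive" / f"queries.log.{day:%Y%m%d}"


print(live_query_log("/opt/funnelback/data", "shakespeare"))
print(archived_query_log("/opt/funnelback/data", "shakespeare", date(2004, 7, 15)))
```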

Log Formats

Query log files contain one line per query processed. Each line consists of fields separated by commas. Some example lines from a query log:

...
Mon Sep  8 14:51:13 2008,100.200.300.400,gdp graph,g"0",,1,10,2x,0,1,28,info
Mon Sep  8 14:51:26 2008,99.88.77.66,public holidays,g"0",,1,10,2x,21677,425628,7,
Mon Sep  8 14:51:51 2008,4.3.2.1,tax,,,1,10,2x,129974,0,4,
...

The comma-separated fields are described in the following table:

Field no.  Field name       Notes
1          date_time        The format is: Fri Feb 22 12:44:22 2002
2          requester_IP     The IP address of the request (note: it may be a proxy, not the end-user's workstation).
3          query            The canonical query as actually processed by PADRE.
4          include_scope    Either a number indicating an fscope value, or an expression of the form g"expr", where expr is a valid gscope expression (for example, g"0"). An optional textual scope will be appended after a '''
5          exclude_scope    See include_scope.
6          start_rank       The result rank for the first item. For example, page 2 may have start_rank = 21 if the first page contained 20 results.
7          num_ranks        The number of results per page.
8          codes            Query processing settings, represented by single characters as shown in the next table.
9          full_matches     Number of items matching all query conditions.
10         partial_matches  Number of partial matches.
11         elapsed_time     Time taken by PADRE to process the query (in milliseconds).
12         profile          The profile used by the query.
13         user_id          Unique identifier of a user, if search session and history is enabled. If not, a dash "-" is written to this field.
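As an illustration of the field layout, the following Python sketch splits one log line into named fields. It is not part of Funnelback; the function name parse_query_log_line is hypothetical. It assumes that no field contains a comma (a query containing a comma would not be handled) and that trailing fields such as user_id may be absent, as in the example lines above.

```python
from datetime import datetime

# Field names in the order given in the table above.
FIELD_NAMES = [
    "date_time", "requester_IP", "query", "include_scope", "exclude_scope",
    "start_rank", "num_ranks", "codes", "full_matches", "partial_matches",
    "elapsed_time", "profile", "user_id",
]
INT_FIELDS = ("start_rank", "num_ranks", "full_matches",
              "partial_matches", "elapsed_time")


def parse_query_log_line(line):
    """Split one query log line into a dict keyed by field name.

    Assumption: fields never contain commas. Trailing fields that are
    missing from the line are simply left out of the result.
    """
    values = line.rstrip("\r\n").split(",")
    if not 11 <= len(values) <= len(FIELD_NAMES):
        raise ValueError(f"unexpected number of fields: {len(values)}")
    record = dict(zip(FIELD_NAMES, values))
    # date_time looks like "Fri Feb 22 12:44:22 2002".
    record["date_time"] = datetime.strptime(
        record["date_time"], "%a %b %d %H:%M:%S %Y")
    for name in INT_FIELDS:
        record[name] = int(record[name])
    return record


example = 'Mon Sep  8 14:51:26 2008,99.88.77.66,public holidays,g"0",,1,10,2x,21677,425628,7,'
record = parse_query_log_line(example)
print(record["query"], record["full_matches"], record["elapsed_time"])
```

Splitting on every comma keeps the sketch short. A report generator for real logs would need to decide how to treat lines whose field count falls outside the expected range.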

The following table gives an explanation of the query processing codes:

Field code  Notes
C           Results came from query cache.
I           Results used to initialise query cache.
G           PADRE run directly from CGI.
S           Automatic query word stemming in force.
Z           Expired or killed documents included in ranking.
r           Top documents reranked by combination of score, recency and "homepageness".
u           Top documents reranked by URL.
t           Top documents reranked by title.
d           Top documents reranked by recency only.
?           Unknown reranking method.
h           Directly generated HTML.
w           Old style result formatting (search.cgi).
x           XML result format.
*           Other result presentation format.
0-9         Scoring mode.
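The codes field can be expanded into readable notes with a simple lookup. The Python sketch below is illustrative only; the names CODE_NOTES and decode_codes are hypothetical, and the descriptions are copied from the table above. Characters not listed in the table are reported as unrecognised rather than guessed at.

```python
# Descriptions taken from the table of query processing codes.
CODE_NOTES = {
    "C": "Results came from query cache",
    "I": "Results used to initialise query cache",
    "G": "PADRE run directly from CGI",
    "S": "Automatic query word stemming in force",
    "Z": "Expired or killed documents included in ranking",
    "r": 'Top documents reranked by combination of score, recency and "homepageness"',
    "u": "Top documents reranked by URL",
    "t": "Top documents reranked by title",
    "d": "Top documents reranked by recency only",
    "?": "Unknown reranking method",
    "h": "Directly generated HTML",
    "w": "Old style result formatting (search.cgi)",
    "x": "XML result format",
    "*": "Other result presentation format",
}


def decode_codes(codes):
    """Return one note per character of the codes field."""
    notes = []
    for ch in codes:
        if ch in "0123456789":
            notes.append(f"Scoring mode {ch}")
        else:
            notes.append(CODE_NOTES.get(ch, f"Unrecognised code {ch!r}"))
    return notes


# The example log lines above all carry the codes value "2x".
print(decode_codes("2x"))
```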

Logging of IP addresses

An administrator can control how search request IP addresses are logged via the "user ID to log" collection setting.
