Result Collapsing

Overview

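Result Collapsing is the ability to collapse similar results into one when they are displayed on the search results page. Results are considered similar when their content is similar, or when they share the same value for one or more metadata fields.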

The list of fields to consider for similarity is controlled by the indexing.collapse_fields setting.

Workflow

Every time the collection is updated, a signature file is generated. This file contains one or more signatures for each document, depending on the list of fields that has been configured. The signature file is then used at query time to control how the query processor collapses similar results.

Because the signature file is generated at indexing time, any change to the indexing.collapse_fields setting requires re-indexing the collection to take effect.
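The indexing step described above can be pictured with a small conceptual sketch. It is not the indexer's actual implementation: the document structure, the `build_signature_file` and `signature` names, and the use of an exact MD5 hash (standing in for whatever similarity measure the indexer really applies to content) are all assumptions made for illustration.

```python
import hashlib

# Hypothetical field groups, mirroring a setting such as [$],[a],[X].
# "$" stands for the document content, as in this guide.
FIELD_GROUPS = [("$",), ("a",), ("X",)]


def signature(doc, group):
    """Hash the values of the fields in `group` into a single signature."""
    values = [
        doc["content"] if field == "$" else doc["metadata"].get(field, "")
        for field in group
    ]
    joined = "\x1f".join(values)  # separator keeps field boundaries distinct
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def build_signature_file(docs, groups=FIELD_GROUPS):
    """Return {url: {group_label: signature}} with one signature per group."""
    table = {}
    for doc in docs:
        table[doc["url"]] = {
            "[" + ",".join(group) + "]": signature(doc, group)
            for group in groups
        }
    return table


docs = [
    {"url": "/job/1", "content": "Nurse wanted",
     "metadata": {"a": "Acme", "X": "NSW"}},
    {"url": "/job/2", "content": "Chef wanted",
     "metadata": {"a": "Acme", "X": "VIC"}},
]
signatures = build_signature_file(docs)
```

In this sketch both documents end up with the same `[a]` signature and different `[X]` signatures, and adding a group to `FIELD_GROUPS` only takes effect once `build_signature_file` is run again, which mirrors the re-indexing requirement.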

Presentation

At query time the query processor detects which results should be collapsed together. The most relevant result for the current query constraints is chosen as the "main" result, and other similar results are collapsed with it. Collapsing is enabled with the -collapsing=on query processor option. It can be specified as a setting in the collection's collection.cfg file, or as a per-request CGI parameter, e.g. collapsing=on.
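The selection of a "main" result can be sketched as a single pass over the ranked result list. This is only an illustration of the idea, not the query processor's code: the `collapse` function, the result dictionaries and the short signature strings are hypothetical.

```python
def collapse(ranked_results, sig_key):
    """Group ranked results by signature.

    The first result seen for a signature is the most relevant one, so it
    becomes the main result; later results with the same signature are
    attached to it.
    """
    mains = []          # main results, in ranking order
    by_signature = {}   # signature -> main result entry
    for result in ranked_results:
        sig = result["signatures"][sig_key]
        if sig in by_signature:
            by_signature[sig]["collapsed"].append(result["url"])
        else:
            entry = {"url": result["url"], "collapsed": []}
            by_signature[sig] = entry
            mains.append(entry)
    return mains


ranked = [  # already sorted from most to least relevant
    {"url": "/job/1", "signatures": {"[a]": "acme", "[X]": "nsw"}},
    {"url": "/job/2", "signatures": {"[a]": "acme", "[X]": "vic"}},
    {"url": "/job/3", "signatures": {"[a]": "globex", "[X]": "nsw"}},
]
by_employer = collapse(ranked, "[a]")
by_state = collapse(ranked, "[X]")
```

The same ranked list yields different groupings depending on which signature is used: `/job/2` folds into `/job/1` when grouping by employer, while `/job/3` folds into `/job/1` when grouping by state.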

The display of collapsed results can be controlled from the search form by using custom FreeMarker tags and inspecting the Data Model. Display options range from simply displaying the number of collapsed results next to the "main" result, to displaying a simplified view of each collapsed result as a "sub-result" of the main one. Display options can be controlled with the -collapsing_sig, -collapsing_num_ranks and -collapsing_SF query processor options.

To set up result collapsing on your collection, please follow the instructions below. As an example, we will be considering a collection containing job offers, on which:

  • The X metadata field is mapped to the state where the job is advertised.
  • The a metadata field is mapped to the employer offering the job.

This guide will explain how to configure the collection so that results can be collapsed on their content similarity, by state, or by employer.

Configure the collection

Navigate to Administration Home -> Administer Tab -> Edit Collection Settings and make the following changes:

  • Under the Interface tab, add -collapsing=on to Query processor options. This will enable result collapsing at query time.
  • Under the Indexer tab, set the Result collapsing fields to: [$],[a],[X]. This will generate a signature file based on the document content and on the a and X metadata classes.

Update or re-index the collection so that the signature file gets generated.

Configure the form file

Collapsed results can be displayed with the <@fb.Collapsed /> tag. In its simplest form this tag just displays the number of collapsed results with a link to access them:

Collapsing-UI-Simple.png

Query the collection

Result collapsing was enabled in the previous steps and should now be active. By default, however, results are collapsed on the similarity of their content. To collapse results on a specific metadata field, use the collapsing_sig parameter, either as a CGI parameter (http://server/s/search?collection=...&collapsing_sig=[a]) or as a query processor option (-collapsing_sig=[a]).
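When the CGI form is used from a script, the square brackets in the signature should be percent-encoded. A minimal sketch, assuming a hypothetical collection named example-jobs (the guide leaves the collection name unspecified) and a hypothetical helper named `search_url`:

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def search_url(base, collection, collapsing_sig=None, **extra):
    """Build a search URL with collapsing enabled for this request."""
    params = {"collection": collection, "collapsing": "on"}
    if collapsing_sig is not None:
        params["collapsing_sig"] = collapsing_sig
    params.update(extra)
    # urlencode percent-encodes "[" and "]" as %5B and %5D.
    return base + "?" + urlencode(params)


# "example-jobs" is a placeholder collection name, not one from this guide.
url = search_url("http://server/s/search", "example-jobs", collapsing_sig="[a]")

# Decoding the query string gives back the original signature value.
decoded = parse_qs(urlsplit(url).query)
```

Here `url` ends with `collapsing_sig=%5Ba%5D`, and `decoded["collapsing_sig"]` is `["[a]"]` again after parsing.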

With collapsing_sig set to [a], 1 job offer for the same employer is collapsed with our example result:

Collapsing-UI-Simple.png

With collapsing_sig set to [X], 6 job offers in the same state are collapsed with our example result:

Collapsing-UI-More.png

Use different labels for different metadata fields

The <@fb.Collapsed /> tag can be configured to use a different label depending on which metadata field is used for collapsing:

<@fb.Collapsed labels={ "X": "{0} results in the same state", "a": "{0} results from the same employer"} />
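The label lookup can be pictured as follows. This sketch only mimics the behaviour suggested by the tag above; it assumes that {0} is replaced by the number of collapsed results, and the `collapsed_label` function and `DEFAULT_LABEL` fallback text are hypothetical rather than part of the tag.

```python
LABELS = {
    "X": "{0} results in the same state",
    "a": "{0} results from the same employer",
}
DEFAULT_LABEL = "{0} similar results"  # hypothetical fallback, not from the tag


def collapsed_label(collapsing_sig, count, labels=LABELS):
    """Pick the label for the active signature and fill in the count."""
    field = collapsing_sig.strip("[]")  # "[X]" -> "X"
    return labels.get(field, DEFAULT_LABEL).format(count)


state_label = collapsed_label("[X]", 6)
content_label = collapsed_label("[$]", 3)
```

With these inputs `state_label` uses the "X" entry, while `content_label` falls back to the default because no label is defined for the content signature.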

When collapsing on [a]:

Collapsing-UI-employer.png

...and on [X]:

Collapsing-UI-state.png

Display each collapsed result

By default, a link is generated to access the collapsed results. This link uses a special query syntax to return all the documents sharing the same signature.

The form can also be configured to directly display each collapsed result. The number of results to show is controlled by the -collapsing_num_ranks query processor option, and the metadata fields to show are controlled by -collapsing_SF.

Edit the collection settings, and on the Interface tab set the following Query processor options: -collapsing_num_ranks=2 -collapsing_SF=[a,X].

Then, in your form file, add the following snippet after the <@fb.Collapsed /> tag:

<#if s.result.collapsed??>
  <#list s.result.collapsed.results as r>
   <p><a href="${r.indexUrl?html}">${r.title}</a> by ${r.metaData.a} in ${r.metaData.X}</p>
  </#list>
</#if>

This will cause the first 2 collapsed results to be displayed. For each result, its title, employer (a) and state (X) will be shown.

When collapsing on [a]:

Collapsing-UI-Complex-employer.png

...and on [X]:

Collapsing-UI-Complex-state.png

Note that even if there are 6 collapsed results, only the first 2 will be shown due to -collapsing_num_ranks=2.
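The effect of the limit can be mimicked outside the form. The sketch below reproduces the paragraph markup from the snippet above; the `render_collapsed` function and the sample data are hypothetical, and in a real search the limit is applied by the query processor rather than by slicing a list.

```python
from html import escape


def render_collapsed(collapsed_results, num_ranks):
    """Render at most `num_ranks` collapsed results as HTML paragraphs."""
    lines = []
    for r in collapsed_results[:num_ranks]:
        lines.append(
            '<p><a href="{url}">{title}</a> by {a} in {X}</p>'.format(
                url=escape(r["indexUrl"], quote=True),
                title=escape(r["title"]),
                a=escape(r["metaData"]["a"]),
                X=escape(r["metaData"]["X"]),
            )
        )
    return lines


# Six collapsed results, as in the state example above (sample data).
collapsed = [
    {"indexUrl": f"/job/{i}", "title": f"Job {i}",
     "metaData": {"a": "Acme", "X": "NSW"}}
    for i in range(1, 7)
]
shown = render_collapsed(collapsed, num_ranks=2)
```

Only the first two entries are rendered, while the count of six is still available for the label displayed next to the main result.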

Advanced usage

The signature file can be configured to combine multiple fields together. For example, setting indexing.collapse_fields=[a],[a,X],[X,Y,Z] will generate 3 different signatures:

  • A signature on the sole a field value,
  • A signature on the concatenation of the a and X field values,
  • A signature on the concatenation of the X, Y and Z field values.

The -collapsing_sig parameter is then used in a similar fashion to collapse results on those combinations: -collapsing_sig=[a], -collapsing_sig=[a,X], -collapsing_sig=[X,Y,Z].
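How a combined setting breaks down into field groups can be sketched with a small parser. The `parse_collapse_fields` and `composite_key` functions, and the "|" separator used to join values, are assumptions for illustration and do not describe the indexer's internal format.

```python
import re


def parse_collapse_fields(setting):
    """Split a value such as "[a],[a,X]" into tuples of field names."""
    groups = re.findall(r"\[([^\]]*)\]", setting)
    return [tuple(field.strip() for field in group.split(",")) for group in groups]


def composite_key(metadata, group):
    """Concatenate the values of every field in the group, in order."""
    return "|".join(metadata.get(field, "") for field in group)


groups = parse_collapse_fields("[a],[a,X],[X,Y,Z]")
key = composite_key({"a": "Acme", "X": "NSW"}, groups[1])
```

Two documents share the `[a,X]` key only when both their employer and their state match, which is what distinguishes it from collapsing on `[a]` or `[X]` alone.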
