Facet categories

Description

Facet category values are the individual filter choices that appear beneath a facet.

Each facet contains one or more facet categories. Selecting a facet category filters the search results to include only results that belong to that category (subject to any other selected categories and the matching logic).

Category sources

Facet categories can be sourced in several different ways, depending on the facet type. Understanding how the facets and facet categories relate to the source data is important for creating effective faceted navigation (and for planning site redevelopments to make use of facets).

Facets based on metadata

A facet that has category values sourced from predefined values contained within a metadata or XML field (metadata) has a 1:1 relationship between the facet and a Funnelback metadata class.

The categories that can appear beneath a metadata facet are all the unique values found within the metadata class. The count for each category is the number of results in the result set that contain the metadata value.
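The relationship between a metadata class and its categories can be illustrated with a minimal Python sketch. This is not Funnelback code: the result structure, the `state` field and the `metadata_categories` helper are all hypothetical, and the sketch assumes a value is counted once per result.

```python
from collections import Counter

# Hypothetical result set: each result carries the values of one metadata class.
results = [
    {"url": "https://example.com/a", "state": ["NSW"]},
    {"url": "https://example.com/b", "state": ["NSW", "VIC"]},
    {"url": "https://example.com/c", "state": ["VIC"]},
    {"url": "https://example.com/d", "state": []},
]

def metadata_categories(results, field):
    """Return {category value: number of results containing that value}."""
    counts = Counter()
    for result in results:
        # set() so a value repeated within one result is counted once
        counts.update(set(result.get(field, [])))
    return dict(counts)

categories = metadata_categories(results, "state")
```

Every distinct value becomes a category, and a result with no value for the field contributes to none of them.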

Category values based on URL patterns or generalised scopes

A facet that has category values sourced from predefined URL sets (gscope) or a set of URL patterns (URL pattern) has a 1:1 relationship between each category and a defined URL pattern set or generalised scope.

The category label is determined by the label assigned when creating the facet.

The count for each category is the number of items in the result set that belong to the defined URL pattern set or gscope.
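As a rough illustration of the 1:1 relationship between a category and a URL pattern set, consider the Python sketch below. It is not Funnelback code: the facet definition, the labels and the use of simple substring matching are assumptions made for the example.

```python
# Hypothetical facet definition: category label -> set of URL patterns.
facet = {
    "Courses": ["/courses/"],
    "News and events": ["/news/", "/events/"],
}

result_urls = [
    "https://example.com/courses/physics",
    "https://example.com/news/2017/open-day",
    "https://example.com/events/graduation",
    "https://example.com/about",
]

def url_pattern_counts(urls, facet):
    """Count the results matching at least one pattern of each category."""
    counts = {}
    for label, patterns in facet.items():
        # Substring matching is an assumption made for this sketch.
        counts[label] = sum(1 for url in urls if any(p in url for p in patterns))
    return counts

counts = url_pattern_counts(result_urls, facet)
```

The labels come from the facet definition rather than from the content, so a result that matches no pattern set (such as `/about` here) is not counted under any category.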

Category values based on URL path or directory structure

A facet that has category values sourced from the item's location in a URL path or directory structure (URL fill) has a 1:1 relationship between the facet and a defined root URL or folder.

The categories are determined by splitting the URL on the folder structure (with folder names as the possible categories) to a determined depth. The depth defines a hierarchical structure to apply to the facet.

The count for each category is the number of results that sit beneath the folder.
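The splitting described above can be sketched in Python as follows. This is an illustration only, not Funnelback's implementation: the `url_fill_categories` helper, the example URLs and the choice to treat the last path element as the document are assumptions.

```python
from collections import Counter
from urllib.parse import urlparse

def url_fill_categories(urls, root, depth=1):
    """Count results beneath each folder, up to `depth` levels below `root`."""
    root_path = urlparse(root).path.rstrip("/") + "/"
    counts = Counter()
    for url in urls:
        path = urlparse(url).path
        if not path.startswith(root_path):
            continue  # outside the facet's root folder
        # Folder names only: the last path element is treated as the document.
        folders = path[len(root_path):].split("/")[:-1]
        for level in range(1, min(depth, len(folders)) + 1):
            counts["/".join(folders[:level])] += 1
    return dict(counts)

urls = [
    "https://example.com/docs/guides/install.html",
    "https://example.com/docs/guides/upgrade.html",
    "https://example.com/docs/reference/api.html",
    "https://example.com/blog/post.html",
]
counts = url_fill_categories(urls, "https://example.com/docs/", depth=1)
```

With a depth of 1 the categories are `guides` and `reference`. Increasing the depth adds nested categories such as `docs/guides` when the root is the site root, which gives the hierarchical structure. Only the path is inspected, never the hostname.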

Note: URL based categories should not be used for large collections with a flat structure. For example, if every URL path was of the form 1/index.html, 2/index.html, ..., n/index.html then the facet would show the categories 1, 2, ..., n, which are not helpful for a user to drill down on. Additionally, when gathering from web collections, the hostname cannot be used as a category within a URL based facet; only directories (path elements) can be used in this way.

Category values based on date

A facet that has category values sourced from groupings based on the item's date (dynamic date fill) has a 1:1 relationship between the facet and the document's indexed date value.

Dates are grouped into the following categories:

  • Coming year
  • Coming month
  • Coming week
  • Tomorrow
  • Today
  • Yesterday
  • Past week
  • Past fortnight
  • Past month
  • Past 3 months
  • Past 6 months
  • Past year
  • YEAR (e.g. 2017, 2014)
  • Uncertain

Additional categories can't be defined, but categories can be removed using the blacklisting options or renamed.
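The grouping of a single date into these categories can be sketched in Python. The exact boundaries Funnelback applies are not stated here, so every threshold below (for example, a month as 30 days, or a missing date mapping to Uncertain) is an assumption, as is the `date_categories` helper itself.

```python
from datetime import date, timedelta

def date_categories(doc_date, today):
    """Return the labels a document date falls under (assumed boundaries)."""
    if doc_date is None:
        return ["Uncertain"]  # assumption: no usable date on the document
    labels = [str(doc_date.year)]
    age = (today - doc_date).days  # negative for dates in the future
    if age == 0:
        labels.append("Today")
    elif age == 1:
        labels.append("Yesterday")
    elif age == -1:
        labels.append("Tomorrow")
    # Rolling windows: a document may fall under several of them.
    past = [("Past week", 7), ("Past fortnight", 14), ("Past month", 30),
            ("Past 3 months", 90), ("Past 6 months", 180), ("Past year", 365)]
    coming = [("Coming week", 7), ("Coming month", 30), ("Coming year", 365)]
    labels += [name for name, days in past if 0 <= age <= days]
    labels += [name for name, days in coming if 0 < -age <= days]
    return labels

today = date(2017, 6, 15)
yesterday_labels = date_categories(today - timedelta(days=1), today)
```

The point of the sketch is that one document can count towards several date categories at once, which is why the category counts in a date facet do not sum to the size of the result set.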

Category values based on results from a query

A facet that has category values sourced from documents which match a particular query has a 1:1 relationship between each category and the set of URLs returned for the defined query.

The category label is determined by the label assigned when creating the facet.

The count for each category is the number of items in the result set that are in the set of results returned for the defined query.

Category values based on source collection

A facet that has category values sourced from documents which belong to a particular collection has a 1:1 relationship between each category and a defined set of collections.

The category label is determined by the label assigned when creating the facet.

The count for each category is the number of items in the result set that belong to each defined set of collections.

Category values based on numerical range

A facet that has category values sourced from documents whose metadata value lies within a numerical range has a 1:1 relationship between each category and a defined numerical range.

The category label is determined by the label assigned when creating the facet.

The count for each category is the number of items in the result set that fall within the corresponding pre-defined range.

The following content and configuration requirements must be met before a numerical range facet can be created:

  • Documents must include a metadata or XML field containing a numeric value (e.g. <meta name="price" content="30.0">).
  • This field needs to be added to the collection's metadata mappings as a numeric (type 3) metadata field in either metamap.cfg or xml.cfg.
  • The collection index must have been rebuilt since the metadata mappings were defined.

Once these prerequisites are met a numeric range facet can be created, with ranges defined for the category values.

Example

  • Create a numeric range facet called price. This requires a metadata field, price, containing numeric values to exist within the indexed content.

  • Add several ranges to capture the following price ranges:

    • under $10 corresponding to 0 <= price < 10
    • $10-50 corresponding to 10 <= price < 50
    • $50-100 corresponding to 50 <= price < 100
    • $100+ corresponding to price >= 100
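The way results are assigned to ranges can be sketched in Python. This is not Funnelback code: the `range_counts` helper and the sample prices are hypothetical, and the sketch assumes half-open ranges (lower bound inclusive, upper bound exclusive) with the last range starting at 100 so that no price is left uncategorised.

```python
import math

# (label, lower bound inclusive, upper bound exclusive)
PRICE_RANGES = [
    ("under $10", 0, 10),
    ("$10-50", 10, 50),
    ("$50-100", 50, 100),
    ("$100+", 100, math.inf),
]

def range_counts(values, ranges):
    """Count how many values fall within each labelled range."""
    counts = {label: 0 for label, _, _ in ranges}
    for value in values:
        for label, low, high in ranges:
            if low <= value < high:
                counts[label] += 1
                break  # ranges do not overlap, so stop at the first match
    return counts

prices = [4.5, 10, 30.0, 99.99, 100, 250]
counts = range_counts(prices, PRICE_RANGES)
```

Boundary values show why the ranges should meet exactly: a price of 10 belongs to $10-50 and a price of 100 belongs to $100+, with no gaps or overlaps between adjacent categories.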

Category counts

The counts displayed for category values are estimates that are calculated based on the result set. Because of this the accuracy will reduce as the search index becomes larger. The numbers may also change when a facet is selected as the estimates are recalculated every time the result set is produced.

The accuracy of the counts can be increased by altering the -daat query processor option to consider more documents before producing the estimate. However increasing the daat limit will have an impact on the response time of the search results so the decision needs to balance performance against accuracy.

For example, to increase the daat limit to 100,000 documents add -daat=100000 to the query processor options, either via the edit collection settings option on the administer tab in the administration interface, or by adding the option to the query_processor_options setting contained within the collection's collection.cfg.

See: Document at a time

Sorting category values

Facet category values can be sorted by a number of different attributes:

  • count: the category values are sorted by the category count, either largest to smallest, or smallest to largest.
  • label: the category values are sorted by the label, either alphabetically (A-Z) or reverse alphabetically (Z-A).
  • numeric label: the category values are sorted by the label, but with numbers interpreted numerically. e.g. a numeric label sort would sort as follows: 1, 10, 11, 100, compared with a label sort which would sort: 1, 10, 100, 11.
  • date: the category values are sorted by date, either most recent to oldest, or oldest to most recent.
  • selected values first: selected category values are placed above unselected category values.
  • configuration order: for facet types where the individual category values are defined, the order in which they appear in the configuration will be the order that is used to display the category values. The order can be changed by dragging the items around in the configuration.
  • custom sort logic: custom sort logic, implemented as a Groovy comparator, is used to sort the category values. Can be used to apply custom sort ordering to metadata based facets.
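The difference between a label sort and a numeric label sort, and the effect of placing selected values first, can be demonstrated with a short Python sketch. The sample data and the `numeric_key` helper are hypothetical and are not part of Funnelback.

```python
import re

labels = ["100", "11", "1", "10"]

def numeric_key(label):
    """Split a label into chunks so that runs of digits compare numerically."""
    return [(0, int(chunk), "") if chunk.isdigit() else (1, 0, chunk.lower())
            for chunk in re.findall(r"\d+|\D+", label)]

label_sorted = sorted(labels)                     # plain label sort
numeric_sorted = sorted(labels, key=numeric_key)  # numeric label sort

# Selected values first, then largest count to smallest.
categories = [
    {"label": "VIC", "count": 30, "selected": False},
    {"label": "NSW", "count": 12, "selected": True},
    {"label": "QLD", "count": 45, "selected": False},
]
ordered = sorted(categories, key=lambda c: (not c["selected"], -c["count"]))
ordered_labels = [c["label"] for c in ordered]
```

Here `label_sorted` compares the labels character by character, while `numeric_sorted` compares the digits as whole numbers. In `ordered_labels` the selected value NSW comes first despite having the smallest count.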

Renaming category values

Renaming of faceted navigation category values, though not formally supported by Funnelback, can be accomplished using a post process hook script.

If you previously used the unofficial faceted navigation v2 code available via GitHub then the hook script for renaming category values should continue to work.

The following is a pared down version of the script that will just rename category values based on renames defined in the collection.cfg.

To rename faceted navigation categories:

  1. Add the following to the collection's post process hook script (hook_post_process.groovy):

    	// Rename faceted categories as configured in collection.cfg
    	// config format: faceted_navigation.FACETNAME.rename.CATEGORYOLDNAME=CATEGORYNEWNAME
    	// e.g. faceted_navigation.State.rename.NSW=New South Wales
    	if ( transaction.response != null && transaction.response.facets != null
    	        && transaction.response.facets.size() > 0 ) {
    	  transaction.response.facets.each() {
    
    	    def facetname=it.name;
    
    	    // Apply any configured renames to each category of this facet
    	    it.categories.each() {
    	      renameCategory(it,facetname);
    	    }
    	  }
    	}
    
    	// Perform category rename if configured
    	def renameCategory(category,facetname) {
    	    category.values.each() {
    	        // Look the rename up once; null means no rename is configured for this label
    	        def newLabel = transaction.question.collection.configuration.value("faceted_navigation."+facetname+".rename."+it.label);
    	        if (newLabel != null) {
    	            it.label = newLabel;
    	        }
    	    }
    	}
    
  2. Add rename definitions to the collection.cfg. Definitions follow the format faceted_navigation.FACETNAME.rename.CATEGORYOLDNAME=CATEGORYNEWNAME.

    	# Rename the 'NSW' category of the 'State' facet to 'New South Wales'
    	faceted_navigation.State.rename.NSW=New South Wales
    	# Rename the 'Past week' category of the 'Date' facet to 'Last 7 days'
    	faceted_navigation.Date.rename.Past week=Last 7 days
    

Blacklisting and whitelisting category values

Two collection.cfg options can be used to restrict the category values that can be displayed within a facet.

  • faceted_navigation.black_list: category values listed will be removed from the facet if they are returned in the list of facets.
  • faceted_navigation.white_list: only the category values listed will be displayed if they are returned in the list of facets. Other category values are removed.
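The effect of the two options can be sketched in Python. This only illustrates the filtering logic described above: the `filter_categories` helper and the sample values are hypothetical, and the sketch does not reproduce the syntax used to list values in collection.cfg.

```python
def filter_categories(values, black_list=(), white_list=()):
    """Drop blacklisted values; if a whitelist is given, keep only its values."""
    blocked = set(black_list)
    allowed = set(white_list)
    kept = [v for v in values if v not in blocked]
    if allowed:
        kept = [v for v in kept if v in allowed]
    return kept

returned = ["NSW", "VIC", "QLD", "Unknown"]
without_unknown = filter_categories(returned, black_list=["Unknown"])
only_listed = filter_categories(returned, white_list=["NSW", "VIC", "TAS"])
```

A whitelisted value that is not returned for the query (TAS in this example) is not added to the facet; both lists only ever remove values.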

v15.12.0