Funnelback 11.0.0

Release notes for Funnelback 11.0.0

Released: 26 August 2011

New Features

  • Redeveloped query processing layer for more efficient query processing and improved search presentation customisation.
  • New Push collection type for feeding non-web content into a Funnelback index from a remote system over time, without the scalability limitations of instant updates.
  • New Directory collection type for searching Active Directory and LDAP repositories.
  • Administrator search tuning system allowing search ranking factors to be optimised for specific collections.
  • Content optimisation system which provides detailed guidelines for content authors on how to improve a specific result's ranking.
  • Preview and publish system for developing search form files without affecting production search presentation.
  • Ability to blend result sets for multiple queries from spelling suggestions, synonyms and other sources into a single result list.
  • Assorted web crawling improvements including support for revisiting infrequently changing content less often.

Upgrade Issues

  • Result summaries are no longer highlighted by default, so that form authors have complete control over query highlighting. To restore summary highlighting, use the <s:boldicize /> tag on your existing forms.
  • When upgrading TRIM collections from version 10, a full update of the collection is required to update the URLs of records to support the new instant update functionality.
  • The <s:boldicize /> and <s:italicize /> tags now output <strong> and <em> HTML tags instead of <b> and <i>. If your CSS stylesheet targets the <b> or <i> tags, you'll need to update it.
  • Using the Crawler form interaction system no longer disables cookie support by default. If a collection is using the form interaction system and can't crawl password protected sites successfully after the upgrade, please explicitly disable cookie support by setting crawler.accept_cookies=false.
  • The default treatment of nepotistic links has been changed to limit their effect. This will reduce indexing time, and should have a positive effect on the ranking in most web collections, particularly large ones covering multiple domains. This change can be reverted by setting the -nep_action indexer option value to zero.
  • The isolated mode filter has been renamed IsolatedFilterProvider (previously IsolatedPublishorFilterProvider) and can now use any filter class.
    • It will use the Tika filter provider by default, so you'll need to update your collection configurations if you want to continue using the Davisor filters in isolated mode.
  • The <s:Truncate> tag no longer supports the stripMiddle attribute.
  • The default behaviour for the web crawler is now to skip revisiting a proportion of infrequently changing pages during each crawl. This behaviour can be configured through the crawler revisit policy.
  • Data reports are now specific to web collections and are no longer available for other collection types.

Selected improvements and bugfixes

  • Increased permitted number of meta collection components.
  • Added ability to analyse URLs remaining in a web crawl frontier.
  • Support for gathering multiple Exchange mail boxes through the EntropySoft connector in a single collection.
  • Added ability for web crawler to read cookies from a file on startup.
  • Improved crawler form interaction cookie handling.
  • Improved handling of non-UTF-8 web content.
  • Improved query highlighting in results, especially with UTF-8 characters.
  • Corrected handling of UTF-8 form files.
  • Support for collection profiles when tuning search quality.
  • Added ability to index HTTP header and Facebook Open Graph protocol metadata.
  • Fixed incorrect addition of collection name to C metadata by default.
  • Reworked query completion JavaScript to avoid conflicts with other JavaScript libraries.
  • Support for multiple facets per tag in FreeMarker templates.
  • Added distance from origin to XML output when searching geospatial data.
  • Reduced warning messages from result transforms on missing metadata.
  • Added support for resolving relative links within the IncludeURL form tag.
  • Better handling of special characters in indexer options.
  • Added spelling whitelist file for words which should be provided as spelling suggestions.
  • Changed boldicize tag to use HTML strong tags rather than bold tags.
  • Changed query processing ordering to apply spelling suggestions after synonym expansion.
  • Introduced ability to execute custom code during query processing.
  • Eliminated log files produced by inactive crawler threads.
  • Fixed incorrect permission settings on init.d scripts.
  • Improved layout and display of the Funnelback administration interface.
  • Fixed handling of column names with special characters during database gathering.
  • Added setup documentation for IIS 7.5.
  • Automated installation of 64-bit versions of search indexing and query processing components.
  • Improved crawler tolerance for timeouts on seed pages.
  • Improved index 'warm up' scripts.
  • Fixed sorting of results when early binding security is used.
  • Added headers to CSV exports from the analytics dashboard.
  • Added support for instant updates on TRIM collections.
  • Improved JavaScript link extraction logic to avoid some invalid link cases.
  • Improved ordering of collections in Funnelback's administration interface.
  • Added tools for managing WARC archive files.
  • Fixed collection configuration cache clearing under mod_perl.
