Funnelback 8.0.0
Release notes for Funnelback 8.0.0
Released: 30th May 2008
Upgrade issues
- Database collections have changed in layout, and now require an additional 'primary key' parameter. Please see the version 8 database collection upgrade guide for details.
- Perl 5.8.8 is strongly recommended for all platforms:
  - Some features do not work out of the box under Perl 5.10 and Solaris.
  - Perl 5.8.5 and earlier have a bug in HTML::Entities, which may lead to incorrect encoding of apostrophes in the Funnelback system.
- Queries are now logged in their expanded form, not their pre-expansion form.
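As a rough illustration of the Perl version guidance above, the following sketch classifies a Perl version string against the notes in this section. The function names and the returned messages are hypothetical and are not part of Funnelback; the thresholds are taken only from the bullets above.

```python
def parse_version(text):
    """Turn a dotted version string such as '5.8.8' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def perl_version_advice(version):
    """Classify a Perl version using only the guidance in these release notes.

    Hypothetical helper: the messages are illustrative, not Funnelback output.
    """
    v = parse_version(version)
    if v <= (5, 8, 5):
        # Perl 5.8.5 and earlier: HTML::Entities bug noted above.
        return "upgrade: HTML::Entities bug may mis-encode apostrophes"
    if v == (5, 8, 8):
        return "recommended"
    if v[:2] == (5, 10):
        # Perl 5.10: some features do not work out of the box.
        return "caution: some features do not work out of the box"
    return "not specifically covered by these notes"


if __name__ == "__main__":
    for candidate in ("5.8.5", "5.8.8", "5.10.0"):
        print(candidate, "->", perl_version_advice(candidate))
```

The version string itself would need to be obtained separately, for example from the output of `perl -v` on the target host.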
New features
- Document gathering from Microsoft SharePoint and Lotus Domino
- Faceted navigation
- User tagging of results
- User feedback on results
- Basic Chinese / Japanese / Korean / Thai (CJKT) support
- Feeds API
- Crawling of content behind web forms
- Automatically generated "support package"
Improvements
- Allow pre/post commands to use collection.cfg parameters
- Broken link detection script for featured pages
- Capability for fetching resources at query time for multiple collection types (databases, filecopy, TRIM)
- Context sensitive help links open in new pages, not the current page
- Display real-time collection update status on the admin UI home page
- Import and export of featured pages and query expansions
- Instant updates support filecopy collections
- Instant update support for more collection types
- Java is bundled with Funnelback
- Logs for a collection go in a collection specific log dir, not the "system logs" dir
- Log text on the "view file" page is more readable
- Numerous improvements to form parsing (fixes for nested tags, res* tags that contain curly braces, etc.)
- Option to remove all data during uninstall
- Reporting uses much less memory
- Reports are viewable while they are generating, and a reporting error will no longer leave the reports unusable
- Significantly improved database search, with "workflow" interface, incremental gathering and compressed storage
- Support for extracting links from JavaScript-generated web pages
- Updates for all collection types may now be halted (the halt may not occur until the end of the current update phase for some collection types)
- When upgrading an installation, the license key is preserved
Selected bug fixes
- Add support for filtering .dot (MS Word Template) files
- Admin UI should include crawler.reject_files in its processing of the "file types to crawl" checkboxes
- Allow collection parameter editing security model (parameter whitelists) to be applied on a per collection basis
- Allow / ignore whitespace in various collection parameters
- Ampersands in query* parameters are not parsed correctly
- cache.cgi displays "XML parsing error" for pages in funnelback_documentation
- cache.cgi does not perform security checks
- cache.cgi links do not get properly URL encoded parameters
- cache.cgi should strip meta refresh from its displayed contents to avoid sending users to incorrect locations
- Cached XLS files don't display correctly in IE6
- Can enter empty featured page and query expansion
- Can't map the same XPath to multiple metadata classes
- Change crawler to use MIME type rather than URL suffix when storing binary files
- Check Windows password is valid in installer
- .ckpt index files should be removed by default
- click.cgi links do not properly URL encode arguments
- Clicking on filecopy results displays text in the error log
- Click tracking not working by default
- collection.cfg settings not being updated to point at new locations on an upgrade
- Collection parameter whitelist not greying out fields
- Collection summary rows should show successful update (green tick) after a successful index upgrade
- Command line administration / Unix scheduling / Apache integration will not work if the Perl binary is not at /usr/bin/perl
- Command line updates fail if not started from the bin directory
- crawler_binaries parameter not being updated properly on an upgrade
- Creating local collections with an unfindable source directory displays a confusing error message
- _disabled__see_start_urls_file parameter being displayed in update log
- Documentation CSS is indexed in the funnelback_documentation collection
- Enable data reports for web collections on an upgrade
- Filters not picking up title metadata from some Word docs
- Fluster crashes when a query contains "(" or ")"
- Fluster links have redundant CGI parameters
- funnelback_documentation collection shouldn't be deletable from admin interface
- Funnelback installer should complain if empty input is given for some fields
- htpasswd_modify is not fixed in an upgrade
- Improved handling of URL case sensitivity in the crawler
- Incorrect handling of numeric entities in crawled URLs
- Investigate fallback for external filters
- Investigate how to make query expansion work with Fluster
- java_libraries contain duplicated path after upgrade
- Local collection URL prefixes don't work as expected
- Long logs are difficult to scroll
- new-collection.pl does not create start.urls file
- Old Jetty HTTPS server not shut down during upgrade
- Padre displays result counts in minresults mode
- PADRE failing to parse XML with empty elements
- Padre date sorts don't work for documents in the 16th / 17th century
- Padre produces invalid XML for some documents that contain ampersands in their title
- Padre segfault under rare combinations of gscopes and metadata searches
- Parsing of meta parameters is broken
- PDF not extracted correctly but output file with binary content was created
- PDF results include shell error output
- Permission errors under IIS
- Remove trailing space in spelling suggestions
- Reporting date routines do not handle leap years
- Report links do not work under IIS
- rss.cgi crashes when xsltproc is not found
- RTF files filtered in TRIM collections do not have meaningful titles
- Schedule updates page on Windows incorrectly handles invalid input
- Security violation displayed when empty filename is submitted for upload
- Start URL parameter in instant update add doesn't check for a protocol
- The "results can't be displayed because this collection has never updated" page looks awful
- Various .cgi files do not have execute permission
- Very rare hang caused by schtasks when upgrading from Funnelback 6.0.x to Funnelback 7.0.x
- Viewing data reports forces the user displayed on the header to "admin"
- Visual bugs when viewing administration under IIS
- When editing a collection, changes are lost when navigating between tabs
- Word expansion does not work with query_* parameters
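Several of the fixes above concern URL encoding of link parameters (cache.cgi, click.cgi, ampersands in query parameters). The sketch below shows the general class of problem using Python's standard library; it is not Funnelback's implementation, and the parameter names `collection` and `url` are assumptions chosen for illustration.

```python
from urllib.parse import urlencode, parse_qs


def build_link(script, params):
    """Build a CGI link with every parameter value percent-encoded."""
    return script + "?" + urlencode(params)


# A target URL that itself contains '?', '=' and '&'. Without encoding,
# these characters would be read as part of the outer query string.
target = "http://example.com/a?b=1&c=2"

link = build_link("cache.cgi", {"collection": "example", "url": target})

# Decoding the query string recovers the original target URL intact.
decoded = parse_qs(link.split("?", 1)[1])

if __name__ == "__main__":
    print(link)
    print(decoded["url"][0])
```

If the target URL were concatenated without encoding, its `&c=2` would be parsed as a separate parameter of the outer link, which is the kind of failure the ampersand and URL encoding fixes describe.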