Social media collections
Funnelback provides support for gathering content from a range of social media sources for inclusion within search results. Currently supported sources are:
Setting up a social media collection
Social media collections within Funnelback are created using custom collections, allowing the social media gathering code to be customised where required. For each of the social media sites noted above, Funnelback includes custom gathering templates and default metadata mappings, providing a starting point for configuring the custom collection appropriately.
Social media collections read their settings from collection.cfg (see each social media platform's individual help page for details of the settings). However, it may be desirable to customise the gathering script to gain finer control over the types of entities returned, or to filter the crawled data by an arbitrary criterion (for example, only recent posts). The following section gives an overview of the structure of the gathering templates to support such customisation.
General gathering script format
The following code outlines the general structure of the social media gathering template scripts. Broadly:
- Any login specific to the social media type is performed
- A record processing object is created and configured to store gathered content in the appropriate Funnelback data directory
- A list of query objects is created, describing the content to be requested from the social media site
- Both the record processor and the list of query objects are given to a fetcher object
The call to the fetcher's execute method gathers the results for each of the given queries, converts each result into XML suitable for Funnelback, and then stores this XML ready for indexing.
import com.funnelback.common.config.*;
import com.funnelback.common.io.store.xml.*;
import com.funnelback.socialmedia.fetcher.*;
import com.funnelback.socialmedia.processing.*;
import com.funnelback.socialmedia.SOCIAL_MEDIA_TYPE.*;
import com.funnelback.socialmedia.utils.*;
import java.net.URL;

// Any SOCIAL_MEDIA_TYPE specific login

// Assumption: args[0] is the Funnelback installation directory and
// args[1] is the collection name, as passed to the gathering script.
def config = ConfigFactory.createNoOptionsConfigWithLogging(new File(args[0]), args[1]);

// Stores each gathered record as XML in the collection's data directory
def processor = new CollectionStoreRecordProcessor(new XmlStoreFactory(config));

List<SOCIAL_MEDIA_TYPEQuery> queries = [
    // SOCIAL_MEDIA_TYPE specific Query objects
];

// Runs every query, converts the results to XML and stores them
new SOCIAL_MEDIA_TYPEFetcher(processor, config).execute(queries);
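The processor, query and fetcher roles described above, together with the kind of filtering mentioned earlier (for example, keeping only recent posts), can be illustrated with a small self-contained sketch. It is written in Java, which Groovy scripts can generally use directly. Every type in it (Post, RecordProcessor, Query, StoringProcessor, FilteringProcessor, Fetcher) is a hypothetical stand-in written for this illustration and is not part of the Funnelback API; the real templates may expose different interfaces, so treat this as one possible shape for a customisation rather than the shipped implementation.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-ins only: none of these types are Funnelback classes.
class GatherSketch {

    /** A gathered item, reduced to the fields this sketch needs. */
    static final class Post {
        final String url;
        final String text;
        final Instant created;

        Post(String url, String text, Instant created) {
            this.url = url;
            this.text = text;
            this.created = created;
        }
    }

    /** Plays the role of the record processor: receives each gathered post. */
    interface RecordProcessor {
        void process(Post post);
    }

    /** Plays the role of a query object: returns the posts it matches. */
    interface Query {
        List<Post> run();
    }

    /** Converts each post to a small XML string and keeps it in memory. */
    static final class StoringProcessor implements RecordProcessor {
        final List<String> stored = new ArrayList<>();

        @Override
        public void process(Post post) {
            stored.add("<post><url>" + escape(post.url) + "</url><text>"
                    + escape(post.text) + "</text></post>");
        }
    }

    /** Wraps another processor and forwards only the posts that match. */
    static final class FilteringProcessor implements RecordProcessor {
        private final RecordProcessor delegate;
        private final Predicate<Post> keep;

        FilteringProcessor(RecordProcessor delegate, Predicate<Post> keep) {
            this.delegate = delegate;
            this.keep = keep;
        }

        @Override
        public void process(Post post) {
            if (keep.test(post)) {
                delegate.process(post);
            }
        }
    }

    /** Plays the role of the fetcher: runs each query, passes results on. */
    static final class Fetcher {
        private final RecordProcessor processor;

        Fetcher(RecordProcessor processor) {
            this.processor = processor;
        }

        void execute(List<Query> queries) {
            for (Query query : queries) {
                for (Post post : query.run()) {
                    processor.process(post);
                }
            }
        }
    }

    /** Escapes the characters that are significant in XML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Gathers all queries, storing only posts created at or after the cutoff. */
    static List<String> gatherRecent(List<Query> queries, Instant cutoff) {
        StoringProcessor store = new StoringProcessor();
        RecordProcessor recentOnly =
                new FilteringProcessor(store, p -> !p.created.isBefore(cutoff));
        new Fetcher(recentOnly).execute(queries);
        return store.stored;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        Query query = () -> List.of(
                new Post("https://example.com/1", "new & noteworthy", now.minus(1, ChronoUnit.DAYS)),
                new Post("https://example.com/2", "old news", now.minus(90, ChronoUnit.DAYS)));
        // Keep only posts from the last 30 days
        gatherRecent(List.of(query), now.minus(30, ChronoUnit.DAYS))
                .forEach(System.out::println);
    }
}
```

Placing the filter in a wrapper around the processor leaves the query and fetcher steps unchanged, so the same criterion could be reused across different social media types. Whether the real record processor can be wrapped in this way depends on the interfaces the gathering templates actually provide.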