Flickr collections
Introduction
Flickr is a social media site focused on sharing photos.
Funnelback can crawl users and groups on Flickr. It gathers the textual data available for each photo and provides information such as the photo's name and link.
Please note that your usage of Funnelback to gather content from Flickr must comply with Flickr's terms of service.
Getting API keys and tokens
Gathering Flickr content requires an API key, API secret and user authentication tokens.
Getting your API key and secret
To get your API key and secret, first create a Flickr account, then apply for an app. To find API keys and secrets you already have, visit API - registered keys.
Getting user authentication tokens
An API key and secret only let you talk to Flickr's API and fetch some public content, such as the public content on a user's page. To fetch content from a group you may need a user authentication token, even though the photos themselves are public. To generate authentication tokens for your Flickr account, ensure you are logged in to Flickr as that user and run:
# Unix
java -cp "$SEARCH_HOME/lib/java/all/*" com.funnelback.socialmedia.flickr.FlickrAuthentication apiKey apiSecret
# Windows
java -cp %SEARCH_HOME%\lib\java\all\^* com.funnelback.socialmedia.flickr.FlickrAuthentication apiKey apiSecret
This will return a URL. Make sure you visit that URL as the correct Flickr user, because the authentication tokens are issued for the user that is currently logged in.
Configuration options
Flickr collections support the following settings:
- flickr.api-key: API key.
- flickr.api-secret: API secret.
- flickr.auth-token: Authentication token.
- flickr.auth-secret: Authentication secret.
- flickr.user-ids: Comma-delimited list of user account IDs to crawl.
- flickr.groups.public: List of group IDs to crawl with a "public" view. Only public photos that are part of the group will be retrieved; private photos will be skipped.
- flickr.groups.private: List of group IDs to crawl with a "private" view. All photos that are part of the group (public and private) will be retrieved. Note that the user ID specified in flickr.user-ids must be a member of the group to access private photos.
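As an illustration of how these settings fit together, the following minimal Python sketch renders them as key=value lines of the kind used in collection.cfg. All values are hypothetical placeholders, and the comma delimiter for the group lists is an assumption (this page only states it for flickr.user-ids).

```python
# Hypothetical placeholder values; replace with your own keys, tokens and IDs.
settings = {
    "flickr.api-key": "YOUR_API_KEY",
    "flickr.api-secret": "YOUR_API_SECRET",
    "flickr.auth-token": "YOUR_AUTH_TOKEN",
    "flickr.auth-secret": "YOUR_AUTH_SECRET",
    "flickr.user-ids": ["USER_ID_1", "USER_ID_2"],
    # Assumption: group lists are comma-delimited like flickr.user-ids.
    "flickr.groups.private": ["GROUP_ID_1"],
}


def render_settings(settings):
    """Render settings as key=value lines, joining list values with commas."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        lines.append(f"{key}={value}")
    return "\n".join(lines)


print(render_settings(settings))
```

The printed lines can be compared against your collection's configuration; the script itself does not talk to Funnelback or Flickr.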
Metadata mappings
Flickr collections include a number of Flickr-specific metadata mappings:
Class ID | Type | Behaviour | Explanation | Metadata fields included |
---|---|---|---|---|
author | text | content | | /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto/photoOwnerFullName , /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhotoset/ownwer/realName |
c | text | content | Description | /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto/photoDescription , /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhotoset/description |
d | date | date | Date | /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhotoset/dateUpdated |
imageSmall | text | display | Image URL - small | /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto/photoSmallImageUrl |
imageMedium | text | display | Image URL - medium | /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto/photoMediumImageUrl |
imageLarge | text | display | Image URL - large | /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto/photoLargeImageUrl |
imageSquare | text | display | Image URL - small (square) | /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto/photoSmallImageSquareUrl |
image320Pixels | text | display | Image URL - small (320px) | /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto/photoSmallImage320Url |
image640Pixels | text | display | Image URL - medium (640px) | /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto/photoMediumImage640Url |
image800Pixels | text | display | Image URL - medium (800px) | /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto/photoMediumImage800Url |
imageId | text | display | | /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto/photoId |
imageThumbnail | text | display | Thumbnail URL | /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto/photoThumbNailUrl |
latLong | geospatial x/y co-ordinate | N/A | | /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhotoset/photoLatLong |
t | text | content | Title | /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto/photoTitle , /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhotoset/title |
username | text | display | | /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto/photoOwnerUserName , /com.funnelback.socialmedia.flickr.FlickrXmlRecordPhotoset/ownwer/username |
Use the -SF query processor option to access these metadata fields in the search response and in the templates (e.g. `-SF=[author,username]`).
Limits
Please note that Flickr applies limits to the volume of content which can be retrieved from their APIs, and so in the case of large photo streams Funnelback may be unable to gather all historical content.
Working with the fetched data
Funnelback will crawl Flickr and convert responses into XML. You can use the metadata customisation tool to map elements to metadata classes.
Note: To preview the crawled records, enable debug mode by setting flickr.debug=true in the collection.cfg file.
The XML that Funnelback generates for a Flickr collection is as follows:
<com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto>
  <photoId>photo_id</photoId>
  <url>url to photo in flickr, do not use this url to get the actual image</url>
  <photoTitle>some title</photoTitle>
  <photoThumbNailUrl>https://farm4.static.flickr.com/x/photo_id_qwert_t.jpg</photoThumbNailUrl>
  <photoSmallImageUrl>https://farm4.static.flickr.com/x/photo_id_qwert_m.jpg</photoSmallImageUrl>
  <photoSmallImage320Url>https://farm4.static.flickr.com/x/photo_id_qwert_n.jpg</photoSmallImage320Url>
  <photoSmallImageSquareUrl>https://farm4.static.flickr.com/x/photo_id_qwert_s.jpg</photoSmallImageSquareUrl>
  <photoMediumImageUrl>https://farm4.static.flickr.com/x/photo_id_qwert.jpg</photoMediumImageUrl>
  <photoMediumImage640Url>https://farm4.static.flickr.com/x/photo_id_qwert_z.jpg</photoMediumImage640Url>
  <photoMediumImage800Url>https://farm4.static.flickr.com/x/photo_id_qwert_c.jpg</photoMediumImage800Url>
  <photoLargeImageUrl>https://farm4.static.flickr.com/x/photo_id_qwert_b.jpg</photoLargeImageUrl>
  <photoOwnerFullName>some username</photoOwnerFullName>
  <photoOwnerUserName>username</photoOwnerUserName>
</com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto>
The <url> element does not give the image URL; it gives the URL of the photo's page on Flickr. To get the actual picture URL (i.e. something ending in .jpg), use an element whose name matches <*Url>. For example, <photoLargeImageUrl> gives the URL of a large version of the picture, which can be included in an <img> HTML tag.
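The difference between <url> and the <*Url> elements can be sketched in a few lines of Python. This is only an illustration using the standard library, not part of Funnelback: the image_tag helper is hypothetical, the page URL in the sample record is made up, and the record is trimmed to the elements the sketch needs.

```python
import xml.etree.ElementTree as ET
from html import escape

# Trimmed sample record using element names from the XML above.
# The <url> value is a made-up placeholder for a Flickr photo page.
RECORD = """
<com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto>
  <photoId>photo_id</photoId>
  <url>https://www.flickr.com/photos/example/photo_id</url>
  <photoTitle>some title</photoTitle>
  <photoLargeImageUrl>https://farm4.static.flickr.com/x/photo_id_qwert_b.jpg</photoLargeImageUrl>
</com.funnelback.socialmedia.flickr.FlickrXmlRecordPhoto>
"""


def image_tag(record_xml, size_element="photoLargeImageUrl"):
    """Build an <img> tag from one record, linked to the photo's Flickr page."""
    root = ET.fromstring(record_xml.strip())
    image_url = root.findtext(size_element)  # the actual .jpg URL
    if image_url is None:
        raise ValueError(f"record has no <{size_element}> element")
    page_url = root.findtext("url", default="")  # Flickr page, not an image
    title = root.findtext("photoTitle", default="")
    img = f'<img src="{escape(image_url)}" alt="{escape(title)}">'
    return f'<a href="{escape(page_url)}">{img}</a>' if page_url else img


print(image_tag(RECORD))
```

Passing a different element name, such as photoThumbNailUrl, selects another image size, provided the record contains that element.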