
Social media collections: YouTube

Introduction

YouTube is a social media site focused on video sharing.

A YouTube gathering template is included as part of Funnelback's social media collections support, so that content from YouTube can be gathered and then presented within Funnelback search results.

Please note that your usage of Funnelback to gather content from YouTube must comply with YouTube's terms of service.

Setting up collection

To create a YouTube collection, you will need to create a social media collection and select the YouTube template.

Once you have created the collection, you will need to fill out the collection.cfg file. To do this, go to the Administer tab of the admin home page and click the Browse Collection Configuration Files link.

Getting API key

Before you can crawl YouTube you will need an API key, which in turn requires a Google account. Following Google's instructions for creating a key in the Google API console should give you your API key.

Configuration options

The YouTube gathering template reads its configuration from collection.cfg. The following settings are supported:

  • youtube.api-key: API key retrieved from the Google API console.
  • youtube.channel-ids: Comma-delimited list of YouTube channel IDs to crawl.
  • youtube.liked-videos: Boolean flag to enable fetching of the videos liked by a channel. This applies to all channels listed in youtube.channel-ids.
  • youtube.playlist-ids: Comma-delimited list of YouTube playlist IDs to crawl.
  • youtube.debug: Boolean flag to enable debug mode. When debug mode is enabled the gathering script will print out the crawled records in XML form.

Example

youtube.api-key=...
# This is Funnelback's YouTube Channel ID
youtube.channel-ids=UC28P4i0bRdTb08l86PhCXHA
youtube.liked-videos=false
youtube.debug=false
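
The comma-delimited and boolean settings above can be turned into Groovy values along the following lines. This is only a sketch: it assumes the key=value lines can be read with java.util.Properties, which may differ from how Funnelback itself reads collection.cfg, and the second channel ID is a made-up placeholder.

```groovy
// Illustrative settings; the second channel ID is a made-up placeholder.
def cfg = '''
youtube.channel-ids=UC28P4i0bRdTb08l86PhCXHA, UC_hypothetical_channel
youtube.liked-videos=false
'''

// Assumption: the key=value lines are parsed with java.util.Properties,
// which may differ from how Funnelback itself reads collection.cfg.
def props = new Properties()
props.load(new StringReader(cfg))

// Split the comma-delimited value, trim whitespace and drop empty entries.
def channelIDs = props.getProperty('youtube.channel-ids', '').split(',')*.trim().findAll { it }

// Boolean flags are stored as the strings "true" or "false".
def likedVideos = Boolean.parseBoolean(props.getProperty('youtube.liked-videos', 'false'))

println "Channels: ${channelIDs}, liked videos: ${likedVideos}"
```

Trimming each entry means a space after a comma in the setting does not end up inside a channel ID.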

The gathering template can be further customised to crawl only specific entity types (e.g. channels, playlists).

Metadata mappings

The YouTube gathering template includes a number of YouTube-specific metadata mappings:

| Class ID | Type | Behaviour | Explanation | Metadata fields included |
|---|---|---|---|---|
| author | text | content | | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/channelTitle |
| authorUrl | text | display | | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/channelUrl |
| c | text | content | Description | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/description |
| category | text | content | | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/category |
| commentCount | number | N/A | | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/commentCount |
| d | date | date | Date | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/publishedDate |
| dislikeCount | number | N/A | | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/dislikes |
| duration | text | display | | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/durationPretty |
| durationInSeconds | number | N/A | | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/durationInSeconds |
| favoriteCount | number | N/A | | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/favoriteCount |
| identifier | text | display | | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/videoId |
| imageSmall | text | display | Image URL - small | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/thumbnails/veryDefault/thumbNailUrl |
| imageMedium | text | display | Image URL - medium | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/thumbnails/medium/thumbNailUrl |
| imageLarge | text | display | Image URL - large | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/thumbnails/high/thumbNailUrl |
| latLong | geospatial x/y co-ordinate | N/A | | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/latLong |
| likeCount | number | N/A | | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/likes |
| t | text | content | Title | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/title |
| viewCount | number | N/A | | /com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord/viewCount |

Limits

Please note that YouTube applies limits to the volume of content which can be retrieved from their APIs, and so in the case of large channels Funnelback may be unable to gather all historical content.

Customising gathering template

To apply further customisation please edit custom_gather.groovy.

Define queries

To crawl YouTube you need to tell Funnelback what to crawl. This is done by specifying queries to YouTube.

Types of queries

UploadsChannelQuery

Gathers the videos on the 'Uploads' playlists for a list of up to 50 channel IDs.

new UploadsChannelQuery(List<String> channelIDs)

LikesChannelQuery

Gathers the videos on the 'Likes' playlists for a list of up to 50 channel IDs.

new LikesChannelQuery(List<String> channelIDs)

PlayListQuery

Gathers videos from provided playlist IDs.

new PlayListQuery(List<String> playListIDs)

Example

def queries = []
queries.add(new UploadsChannelQuery(channelIDs))
queries.add(new LikesChannelQuery(channelIDs))

In this example, the first query gets all videos uploaded by the channels in channelIDs, and the second query gets all videos liked by those channels.
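
Because the channel queries accept at most 50 channel IDs each, a longer list has to be split across several queries. The sketch below shows one way to do that with Groovy's collate method, and also adds a PlayListQuery. The two classes defined at the top are stand-ins written for this example so that it runs on its own (the classes supplied by the gathering template would be used in custom_gather.groovy), and all IDs are hypothetical.

```groovy
// Stand-in classes so the sketch runs on its own. In custom_gather.groovy the
// query classes supplied by the gathering template would be used instead.
class UploadsChannelQuery {
    final List<String> channelIDs
    UploadsChannelQuery(List<String> channelIDs) { this.channelIDs = channelIDs }
}
class PlayListQuery {
    final List<String> playListIDs
    PlayListQuery(List<String> playListIDs) { this.playListIDs = playListIDs }
}

// Hypothetical IDs, for illustration only.
def channelIDs = (1..120).collect { 'channel-' + it }
def playListIDs = ['playlist-a', 'playlist-b']

def queries = []
// collate(50) splits the list into sub-lists of at most 50 entries,
// so each UploadsChannelQuery stays within the documented limit.
channelIDs.collate(50).each { batch ->
    queries.add(new UploadsChannelQuery(batch))
}
queries.add(new PlayListQuery(playListIDs))

println "Built ${queries.size()} queries"
```

With 120 channel IDs this builds three upload queries (50, 50 and 20 IDs) plus one playlist query.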

Working with the crawled data

Funnelback will crawl YouTube and convert responses into XML. You can use the metadata customisation tool to map elements to metadata classes. The XML that Funnelback generates for a YouTube collection is as follows:

<com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord>
  <url>http://www.youtube.com/watch?v=the_video_id&amp;feature=youtube_gdata_player</url>
  <title/>
  <description/>
  <videoId>the_video_id</videoId>
  <viewCount>1</viewCount>
  <commentCount>0</commentCount>
  <latLong/>
  <thumbnails>
    <veryDefault>
      <thumbNailUrl/>
      <thumbNailUrlWidth>120</thumbNailUrlWidth>
      <thumbNailUrlHeight>90</thumbNailUrlHeight>
      <thumbNailUrlExtras class="linked-list"/>
    </veryDefault>
    <medium>
      <thumbNailUrl/>
      <thumbNailUrlWidth>320</thumbNailUrlWidth>
      <thumbNailUrlHeight>180</thumbNailUrlHeight>
      <thumbNailUrlExtras class="linked-list"/>
    </medium>
    <high>
      <thumbNailUrl/>
      <thumbNailUrlWidth>480</thumbNailUrlWidth>
      <thumbNailUrlHeight>360</thumbNailUrlHeight>
      <thumbNailUrlExtras class="linked-list"/>
    </high>
  </thumbnails>
  <publishedDate>2018-02-18T17:00:00.000Z</publishedDate>
  <category/>
  <durationInSeconds>239</durationInSeconds>
  <durationPretty>00:03:59</durationPretty>
  <likes>0</likes>
  <dislikes>0</dislikes>
  <favoriteCount>0</favoriteCount>
  <channelTitle/>
  <channeld/>
  <channelUrl/>
  <embedHtml>&lt;iframe width=&quot;480&quot; height=&quot;270&quot; src=&quot;//www.youtube.com/embed/the_video_id&quot; frameborder=&quot;0&quot; allow=&quot;autoplay; encrypted-media&quot; allowfullscreen&gt;&lt;/iframe&gt;</embedHtml>
  <extras class="linked-list"/>
</com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord>
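
The element paths listed under Metadata mappings can be read as XPath-style locations within this record. As a rough illustration, and not a description of how Funnelback applies the mappings internally, the sketch below evaluates a few of those paths against a trimmed-down record using the XPath support in the JDK. The record content reuses the placeholder values from the sample; the valueAt helper is a name introduced here.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.XPathFactory

// Trimmed-down record reusing placeholder values from the sample record.
def xml = '''<com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord>
  <url>http://www.youtube.com/watch?v=the_video_id&amp;feature=youtube_gdata_player</url>
  <videoId>the_video_id</videoId>
  <viewCount>1</viewCount>
  <thumbnails>
    <medium>
      <thumbNailUrlWidth>320</thumbNailUrlWidth>
    </medium>
  </thumbnails>
  <durationInSeconds>239</durationInSeconds>
  <durationPretty>00:03:59</durationPretty>
</com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord>'''

def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
def doc = builder.parse(new ByteArrayInputStream(xml.getBytes('UTF-8')))
def xpath = XPathFactory.newInstance().newXPath()

// Every mapping path starts at the record's root element.
def root = '/com.funnelback.socialmedia.youtube.v3.YouTubeXmlRecord'

// Hypothetical helper: returns the text found at a path below the root.
def valueAt = { String path -> xpath.evaluate(root + path, doc) }

def videoId = valueAt('/videoId')
def seconds = valueAt('/durationInSeconds').toInteger()
def mediumWidth = valueAt('/thumbnails/medium/thumbNailUrlWidth').toInteger()
def url = valueAt('/url')   // the &amp; entity is decoded to & by the parser

println "${videoId}: ${seconds}s, medium thumbnail width ${mediumWidth}"
```

Checking a path this way before entering it in the metadata customisation tool can help catch typos in the long element names.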

v15.14.0