crawler.request_header
Optional additional header to be inserted in HTTP(S) requests made by the webcrawler.
Key: crawler.request_header
Type: String
Can be set in: collection.cfg
Description
This parameter specifies an optional additional header to be inserted in HTTP(S) requests made by the web crawler. For example, sending a cookie header may help the web crawler gain access to a web site which uses cookies to store login information. An alternative approach is to specify in_crawl (crawler.form_interaction.in_crawl.groupId.url_pattern) or pre_crawl (crawler.form_interaction.pre_crawl.groupId.url) form interaction entries to log in to a specific site.
Default Value
(Empty)
Examples
Send a cookie string:
crawler.request_header=Cookie: phpbb2mysql_data=xyx; phpbb2mysql_sid=123
This cookie information can be obtained by loading the relevant website in a web browser and then examining the cookies the site tries to set and store.
Notes
- If you are sending cookie strings, set crawler.accept_cookies to "false" so that the cookie strings you are trying to send are not overridden.
- You will probably also want to set the crawler.request_header_url_prefix parameter to limit which URLs the crawler sends these request headers to.
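Taken together, the example and notes above suggest a collection.cfg fragment along the following lines. This is a sketch only: the host forum.example.com is hypothetical, the cookie values are the placeholders from the example above, and it assumes that crawler.request_header_url_prefix takes a single URL prefix and that lines beginning with # are treated as comments. Check the documentation for each parameter before relying on it.

```
# Send a fixed cookie header with crawler requests (placeholder values).
crawler.request_header=Cookie: phpbb2mysql_data=xyx; phpbb2mysql_sid=123

# Stop cookies set by the site from overriding the header above.
crawler.accept_cookies=false

# Assumed format: only send the header to URLs starting with this prefix.
# The host below is hypothetical.
crawler.request_header_url_prefix=http://forum.example.com/
```

Restricting the header to one prefix keeps the cookie from being sent to other sites reached during the same crawl.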