http_source_host (collection.cfg setting)


This parameter specifies the IP address or hostname the crawler uses on a machine that has more than one available, e.g. a multihomed machine with two or more physical network interfaces.

The default value is empty, which means the crawler uses the system default network interface. When a value is set, that IP address or hostname is used as the source of the network requests the web crawler makes.

You might set this value to a specific IP address or hostname if you want the web servers the crawler contacts to see its requests as coming from a particular internal or external network.
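
As an illustration, a collection.cfg entry for this setting might look like the sketch below. The address 192.0.2.10 is a placeholder from a range reserved for documentation, not a recommended value; substitute an IP address or hostname that is actually assigned to one of the crawl machine's network interfaces.

```
# Hypothetical example: send crawler requests from this local address
http_source_host=192.0.2.10
```

Leaving the value empty (http_source_host=) keeps the default behaviour of using the system default network interface.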

