crawler.max_url_repeating_elements (collection.cfg setting)
The crawler will ignore any URL that contains more than this number of repeating elements. For example, a URL such as

http://example.com/a/a/a/a/a/a/page.html

will be ignored under the default limit of 5, as it contains 6 repeating "a" elements (directories). This check guards against crawler traps and badly configured web servers.
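The check above can be sketched as follows. This is a minimal illustration, not the crawler's actual implementation; it assumes the check counts the longest run of consecutive identical path segments, and the function names (`max_repeating_elements`, `should_ignore`) are hypothetical.

```python
from urllib.parse import urlparse

def max_repeating_elements(url: str) -> int:
    # Count the longest run of consecutive identical path segments.
    segments = [s for s in urlparse(url).path.split("/") if s]
    longest = run = 0
    prev = None
    for seg in segments:
        run = run + 1 if seg == prev else 1
        longest = max(longest, run)
        prev = seg
    return longest

def should_ignore(url: str, limit: int = 5) -> bool:
    # Mirrors the documented behaviour: ignore any URL whose repeating
    # element count exceeds the configured limit (default 5).
    return max_repeating_elements(url) > limit
```

With the default limit of 5, a URL containing six consecutive identical directories would be ignored, while one with five would still be crawled.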