IP reject (Query processor option)
The PADRE query processor contains an option to provide limited assistance in managing load on a single machine. This option (ipreject) monitors the frequency of queries being sent from individual IP addresses and will not process queries for IP addresses that have exceeded a certain threshold. This option should be used to prevent occasional query spikes from individual users bringing down query processing machines. This option is not a complete solution to preventing machine load or preventing Distributed Denial of Service (DDoS) attacks. It should be used in conjunction with other measures.
Simply add the following option to your 'query_processor_options' configuration item.
For example, to allow roughly 4 queries in a 2-second period:
This option is available for Linux systems only.
How it works
When the 'ipreject' option is enabled, PADRE checks very early in its processing whether a query should be allowed from a particular IP address. If the query is not allowed, PADRE exits with an appropriate error message. This early exit helps to limit load on your server, since PADRE does not perform the resource-intensive query processing operations.
Whenever PADRE is started, it obtains the user's IP address from the REMOTE_ADDR CGI environment variable. It then checks its own records (stored at $SEARCH_HOME/log/iptracker) to see how many queries the user has performed in the last n seconds, where n is given by the windowSeconds parameter. If this number is below the queryLimit parameter, the query is allowed. Otherwise, the query is not performed.
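The windowed check described above can be sketched roughly as follows. This is only an illustration, not PADRE's implementation: the class name, the in-memory dictionary (standing in for the iptracker records) and the choice to record only allowed queries are all assumptions, and the sketch ignores upperQueryLimit entirely.

```python
from collections import defaultdict, deque


class SlidingWindowTracker:
    """Hypothetical in-memory stand-in for per-IP query records."""

    def __init__(self, query_limit, window_seconds):
        self.query_limit = query_limit
        self.window_seconds = window_seconds
        # Maps an IP address to the timestamps of its recently allowed queries.
        self._history = defaultdict(deque)

    def allow(self, ip, now):
        """Return True if a query from `ip` at time `now` (seconds) is allowed."""
        history = self._history[ip]
        # Forget queries that fall outside the last window_seconds.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        # The query is allowed only while the recent count is below the limit.
        if len(history) < self.query_limit:
            history.append(now)
            return True
        return False


if __name__ == "__main__":
    tracker = SlidingWindowTracker(query_limit=2, window_seconds=5)
    for t in (0.0, 1.0, 2.0, 5.0):
        print(t, tracker.allow("192.0.2.10", t))
```

With a limit of 2 queries per 5 seconds, the third query inside the window is refused, while a query from a different IP address is unaffected.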
Multiple PADRE instances synchronise their access to the IP records through an operating system semaphore. Note that killing a PADRE process while it holds this semaphore may leave a stale semaphore behind. If this happens, the semaphore must be deleted manually. It is located in the /dev/shm directory.
The upperQueryLimit parameter provides a flexible way to block IPs that issue too many query requests. Every time a query is performed from a particular IP address, a count is incremented for that address. This count is reduced over time according to the limits set by the queryLimit and windowSeconds parameters, but it never exceeds upperQueryLimit. For example, the option
-ipreject=2,5,20
allows users to submit 2 queries every 5 seconds. If a user submits 1000 queries in the first second, however, all but the first two will be blocked, and the user will have to wait about 50 seconds before being able to submit any more queries. If
-ipreject=2,5,100
were used instead, the user would have to wait about 250 seconds (approximately 4 minutes) before being able to submit any more queries.
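The waiting times quoted above follow from simple arithmetic, sketched here under stated assumptions: the per-IP count is capped at upperQueryLimit and drains linearly at queryLimit counts per windowSeconds. The helper names are hypothetical, and the field order (queryLimit, windowSeconds, upperQueryLimit) is inferred from the examples rather than confirmed elsewhere.

```python
from typing import NamedTuple


class IpRejectSettings(NamedTuple):
    query_limit: int
    window_seconds: int
    upper_query_limit: int


def parse_ipreject(option: str) -> IpRejectSettings:
    """Parse a string such as '-ipreject=2,5,20' (field order is an assumption)."""
    name, _, values = option.partition("=")
    if name != "-ipreject":
        raise ValueError(f"not an ipreject option: {option!r}")
    # Unpacking raises ValueError unless exactly three integers are given.
    query_limit, window_seconds, upper_query_limit = (
        int(v) for v in values.split(",")
    )
    return IpRejectSettings(query_limit, window_seconds, upper_query_limit)


def estimated_wait_seconds(settings: IpRejectSettings) -> float:
    """Approximate lock-out after a flood, assuming the count sits at the cap
    and drains at query_limit counts per window_seconds."""
    return settings.upper_query_limit / settings.query_limit * settings.window_seconds


if __name__ == "__main__":
    for option in ("-ipreject=2,5,20", "-ipreject=2,5,100"):
        print(option, estimated_wait_seconds(parse_ipreject(option)))
```

Under these assumptions the estimate is 50 seconds for a cap of 20 and 250 seconds for a cap of 100. A larger upperQueryLimit therefore penalises heavy bursts for longer without changing the steady-state rate that ordinary users are allowed.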