Background
I'm actually a dev, but I think this question fits here as it's about maths. I'm implementing a site-wide throttle on too many failed requests as protection against distributed brute force attacks.
The question I am stuck with is, after how many failed login requests should I start to throttle?
Now one reasonable way is, as mentioned here, "using a running average of your site's bad-login frequency as the basis for an upper limit". If the site has an average of $100$ failed logins, $300$ (buffer added) might be a good threshold.
Now I don't have a running average, and I don't want someone to have to actively increase the upper limit as the user base grows. I want a dynamic formula that calculates this limit based on the number of active users.
The difficulty is that a site with only a few users should have a much higher threshold-to-user ratio than a site with, say, $100,000$ users. For example, for $50$ users the limit could be set at $50\%$ of the total user count, which means allowing $25$ failed login requests site-wide in a given timespan. For 100k users, however, the ratio should drop to around $1\%$: $1000$ failed login requests within the same timespan (say, an hour) is a lot. (These numbers are probably not accurate at all, as I am not a security expert; they are only examples to illustrate.)
The Question
I was wondering, is there any mathematical formula that could achieve this in a neat way?
This is a chart of what I think the formula should be calculating approximately:
Here is what I have now (I know it's terrible, any suggestion will be better I'm sure):
$threshold = 1;
if ($activeUsers <= 50) {
    // Global limit is the same as the total of each users individual limit
    $threshold *= $activeUsers; // If user limit is 4, global threshold will be 4 * user amount
} elseif ($activeUsers <= 200) {
    // Global requests allows each user to make half of the individual limit simultaneously
    // over the last defined timespan
    $threshold = $threshold * $activeUsers / 2;
} elseif ($activeUsers <= 600) {
    $threshold = $threshold * $activeUsers / 2.5;
} elseif ($activeUsers <= 1000) {
    $threshold = $threshold * $activeUsers / 3.5;
} else { // More than 1000
    $threshold = $threshold * $activeUsers / 5;
}
return $threshold;

TL;DR $$y = 659.113 \log\left(\frac{x}{240.399} + 1\right) + \frac{-1213.555 x}{x + 739.845}$$
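For readers who want to drop the formula into code, here is a minimal PHP sketch of it. The function name `globalThreshold`, the clamping of negative input to zero, and the rounding down to an integer are illustrative assumptions, not part of the fit itself. `log()` is the natural logarithm, matching R's default `log`.

```php
<?php
declare(strict_types=1);

/**
 * Site-wide failed-login threshold from the fitted curve
 *   y = a * log(x / b + 1) + r * x / (x + s)
 * The constants are the rounded fit quoted above; the function name and the
 * floor-to-integer step are assumptions made for this sketch.
 */
function globalThreshold(int $activeUsers): int
{
    $a = 659.113;
    $b = 240.399;
    $r = -1213.555;
    $s = 739.845;

    // The curve is only meant for x >= 0, so clamp negative input.
    $x = max(0, $activeUsers);

    // log() in PHP is the natural logarithm.
    $y = $a * log($x / $b + 1) + $r * $x / ($x + $s);

    return (int) floor($y);
}

foreach ([0, 50, 1000, 100000] as $users) {
    printf("%6d users -> threshold %d\n", $users, globalThreshold($users));
}
```

For small user counts the result stays close to one failed login per user, and the ratio then shrinks as the user base grows, which is the behaviour asked for in the question.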
Let $f: [0, +\infty) \to [0, +\infty)$ be the function that maps the number of users to the threshold, i.e. the green curve.
$f$ should satisfy these properties:
To fit the green curve, we first take some measurements (blue dots). Then we use R to perform regression analysis (using the nls_multstart function from the nls.multstart package).
Because of the $\Theta(\log(n))$ requirement, the obvious choice is to assume $y = a \log\left(\frac{x}{b} + 1\right)$ for some unknown $a, b \in \mathbb{R}$. Unfortunately, this gives us a poor fit.
We need to improve our model. To proceed, let's assume there is some correction term such that $y = a \log(\frac{x}{b} + 1) + \mathcal{O}(1)$. A commonly used $\mathcal{O}(1)$ nonlinear model is $\frac{rx}{x + s}$ for some unknown $r, s \in \mathbb{R}$. So let's add it to our model $$y = a \log\left(\frac{x}{b} + 1\right) + \frac{rx}{x + s}$$
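A short check of why this correction term counts as $\mathcal{O}(1)$, assuming $s > 0$ (an assumption that is not stated explicitly, though it holds for the fitted value reported in this answer): for $x \ge 0$,
$$\left|\frac{rx}{x + s}\right| = |r| \cdot \frac{x}{x + s} \le |r|, \qquad \lim_{x \to \infty} \frac{rx}{x + s} = r$$
so the term stays bounded and cannot change the logarithmic growth rate of the first term.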
nls_multstart finds us the following best fit (rounded to 3 decimal places): $$y = 659.113 \log\left(\frac{x}{240.399} + 1\right) + \frac{-1213.555 x}{x + 739.845}$$
Next, we need to show that properties $1$ - $6$ hold. Property $1$ obviously holds because $y = \Theta(\log x) + \mathcal{O}(1) = \Theta(\log x)$ as $x \to \infty$. To see that property $2$ holds, observe
$$f'(x) = \frac{a}{x + b} + \frac{rs}{(x + s)^2} = \frac{a(x + s)^2 + r s (x + b)}{(x + b)(x + s)^2}$$
Now we can obtain $f^{(k)}$ by repeatedly applying the quotient rule. Because of how the quotient rule works and because $f'$ is a rational function, $f^{(k)}$ can only fail to be differentiable at $-b = -240.399$ or $-s = -739.845$, which means $f$ is infinitely differentiable on $[0, +\infty)$.
Property $3$ holds by simple computation. Property $4$ holds because $$f'(0) = \frac{659.113}{0 + 240.399} + \frac{-1213.555 \cdot 739.845}{(0 + 739.845)^2} = 1.101... \approx 1$$
Now $f'(x) = 0$ exactly when $$a(x + s)^2 + r s (x + b) = 659.113 x^2 + 77440.315994 x + 144938631.1619988 = 0$$ which has negative discriminant, implying $f'$ has no real roots and hence never crosses the $x$-axis. From above, $f'(0) = 1.101... > 0$ and $f'$ is differentiable, hence continuous, on $[0, +\infty)$. By the Intermediate Value Theorem, $f'(x) > 0$ on $[0, +\infty)$ (property $5$). Similar arguments show that $f''$ has no non-negative real roots, is continuous, and takes a negative value at $x = 0$; hence property $6$ holds, again by the IVT.
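The sign claims for $f'$ and $f''$ can be checked numerically. The PHP sketch below is one possible way to do so, assuming the rounded parameters from the fit. The auxiliary cubic $p(x) = a(x + s)^3 + 2rs(x + b)^2$, for which $f''(x) = -p(x) / \left((x + b)^2 (x + s)^3\right)$, and all variable names are notation introduced here for illustration. If $p(0) > 0$ and the quadratic $p'$ has positive leading coefficient and negative discriminant, then $p$ is increasing and positive on $[0, +\infty)$, so $f'' < 0$ there.

```php
<?php
declare(strict_types=1);

// Fitted parameters (rounded to 3 decimal places).
$a = 659.113;
$b = 240.399;
$r = -1213.555;
$s = 739.845;

// f'(0) = a/b + r/s.
$f1At0 = $a / $b + $r / $s;

// Numerator of f'(x): a(x+s)^2 + rs(x+b) = A x^2 + B x + C.
$A = $a;
$B = 2 * $a * $s + $r * $s;
$C = $a * $s ** 2 + $r * $s * $b;
$discF1 = $B ** 2 - 4 * $A * $C;

// p(x) = a(x+s)^3 + 2rs(x+b)^2, so p'(x) = 3a(x+s)^2 + 4rs(x+b)
//      = A2 x^2 + B2 x + C2.
$P0 = $a * $s ** 3 + 2 * $r * $s * $b ** 2;   // p(0)
$A2 = 3 * $a;
$B2 = 6 * $a * $s + 4 * $r * $s;
$C2 = 3 * $a * $s ** 2 + 4 * $r * $s * $b;
$discP1 = $B2 ** 2 - 4 * $A2 * $C2;

printf("f'(0)                   = %.3f\n", $f1At0);
printf("constant term of f' num = %.3f\n", $C);
printf("discriminant (f' num)   < 0 ? %s\n", $discF1 < 0 ? 'yes' : 'no');
printf("p(0)                    > 0 ? %s\n", $P0 > 0 ? 'yes' : 'no');
printf("discriminant (p')       < 0 ? %s\n", $discP1 < 0 ? 'yes' : 'no');
```

All three yes/no checks should come out as "yes" for these parameters; a different fit would need to be re-checked.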
Here is the R code I use for regression and plots: