Consider the following ODE system to be solved numerically:
\begin{align} \dot{x}_1 &= -x_1 \\ \dot{x}_2 &= -2x_2 + \beta x_4 \\ \dot{x}_3 &= 2x_2 \\ \dot{x}_4 &= x_1 - \beta x_4 \end{align} When I solved the system numerically, I found that increasing the value of $\beta$ increases the number of iterations until convergence in some solvers (e.g. Matlab ode113), but the number of iterations stays relatively constant in other solvers (e.g. Matlab ode15s), and in fact it does it in fewer iterations.
My question: how does the magnitude of $\beta$ in this system affect the rate of convergence of a numerical method? Is there any specific relation that clearly shows that?


Your system converges to a fixed point with $x_1^*=x_3^*=x_4^*=0$ and $x_2^*=c_1+c_2+c_3+c_4$ where $x(0)=c$. That last follows from the fact that the sum over the derivatives cancels all the terms on the right side, so that the sum over the components is constant.
The non-stiff solver
ode113uses Adams-Bashford-Moulton, but in a PECE implementation which makes this an explicit method. As such it has a bounded stability region for any order that is used internally. Assume that the order stabilizes. Then the step size controller will increase the step size $h$ until that boundary is reached and surpassed for $\lambda h$ where $\lambda=-\beta$ is the dominant eigenvalue of the Jacobian. Remember that the stability region is where the step propagator is contracting (non-expanding) if the flow around the exact solution is contracting (non-expanding).At this point, outside the stability region, the distance from the fixed point will grow, and with it the step error, so that the step size gets regulated down. Good step size controllers will minimize the resulting oscillation. The end result still is that for the most part of the integration interval you get $\beta h=const.$ where the constant is some value typically in the interval $[2,3]$. The number of evaluations over some interval $[0,T]$ will then be about $N=T/h$ which is proportional to $\beta$.
Note that this behavior is largely independent of the target error tolerance, the equilibrium step size will not change, only the equilibrium distance from the stationary point will change.
In the stiff solver
ode15sthe BDF methods that are truly implemented as implicit methods do not have (all) such restricted stability regions. Thus the error estimator is less wrong and keeps the step size in the region that is optimal for the error tolerance. As the components that have reached their asymptotic value fall below the absolute error tolerance, they do not influence the step size controller. Thus the step size dynamic is largely oriented to approximate the $e^{-t}$ and $e^{-2t}$ components correctly, which is independent of $\beta$ and gives thus the almost constant step counts.