Consider a linear system that has a known impulse response function $h(t)=e^{-at}H(t)(at)^n$ where $a$ and $n$ are known coefficients and $H(t)$ is the Heaviside function. If there is an external force applied to the system, $x(t)$, the output of the system can be written as a convolution
$$ y(t) = \int_{-\infty}^\infty h(t - \tau) x(\tau)\ d\tau. $$
My question is for non-integer $n > 0$, how can we write the system in the first order ODE format? That is,
$$ \frac{d \mathbf{s}}{dt} = \mathbf{f}(\mathbf{s}, x) $$ for some states $\mathbf{s}\in\mathbb{R}^m$ and some function $\mathbf{f}(\cdot)$, and the output is determined by $$ y = g(\mathbf{s}) $$ with some function $g(\cdot)$.
What I have done
For integer $n$, I know how to solve it. First, take the simplest example where $n = 0$. The output $y$ can be written as
$$ y(t) = \int_{-\infty}^\infty e^{-a(t-\tau)}H(t - \tau) x(\tau) d\tau $$ where the derivative is $$ \frac{dy}{dt}(t) = \int_{-\infty}^\infty \left[-a H(t - \tau) + \delta(t-\tau)\right] e^{-a(t-\tau)} x(\tau) d\tau $$ with $\delta(\cdot)$ as the Dirac delta function, and can be simplified into $$ \frac{dy}{dt}(t) =-a y(t) + x(t) $$
So in the case of $n = 0$, we can write $ds/dt = f(s, x) = -a s + x$ with the output $y = g(s) = s$. The similar procedure can be repeated for $n = 1$, but in this case we need to differentiate $y$ twice, and get $$ \frac{d^2y}{dt^2}(t) =-2 a \frac{dy}{dt}(t) - a^2 y(t) + 2 a x(t). $$ With the equation above, we can write the first order ODE form by choosing the states $\mathbf{s} = (dy/dt, y)$.
Similar procedures can be done for integer $n \geq 0$. However, it is unclear to me how to get such equation with non-integer $n$.
For linear time invariant systems one can only achieve systems with impulse responses which have integer values for $n$, by having $n+1$ repeated poles at $-a$. Namely, by taking the Laplace transform of $h(t)$ yields
$$ \mathcal{L}\{h(t)\}(s) = \frac{n!\,a^n}{(s+a)^{n+1}}. $$
However, it can be quite easy to solve this problem by adding time dependency to $g(s)$. If you do not want to add time as input parameter one can also achieve the same behavior using an extra state with a time derivative of always one, for example using $s\in\mathbb{R}^2$
\begin{align} \frac{d}{dt} \begin{bmatrix} s_1 \\ s_2 \end{bmatrix}&= \begin{bmatrix} -a\,s_1+x \\ 1 \end{bmatrix}, \\ y &= s_1\,(a\,s_2)^n, \end{align}
since $s_2$ would be equal to $t$ (assuming that the impulse response starts at $t=0$ and $s_2(0)=0$). But this only produces the desired $h(t)$ if the impulse is applied at $t=0$.
One can tweak the proposed model a little, such that the impulse response does match when applied at different times using
\begin{align} \frac{d}{dt} \begin{bmatrix} s_1 \\ s_2 \\ s_3 \end{bmatrix}&= \begin{bmatrix} -a\,s_1+x \\ s_3 \\ x \end{bmatrix}, \\ y &= s_1\,(a\,s_2)^n, \end{align}
with $s_2(0)=0$ and $s_3(0)=0$. Namely, applying an impulse at time $t_i$ yields that at an infinitesimal time step after $t=t_i$ $s_3(t)=1$ and thus the rest of the dynamics is equivalent to the previous model but starting at $t=t_i$ instead of $t=0$. This probably also doesn't give the intended behavior, for example applying a different magnitude impulse yields that $s_2$ would grow at a different rate as time.
Instead, one can also construct a linear time invariant system that approximates the impulse response. Such an approximation can for example be obtained by taking a Padé approximant of the factional part of the given Laplace transform of $h(t)$ given above and use is as the transfer function for the system.
For example taking a (3,3) order Padé approximant using $n = 1/2$ and $a=1$ yields
$$ \frac{1}{(s+1)^{1/2}} \approx \frac{s^3 + 24\,s^2 + 80\,s + 64}{7\,s^3 + 56\,s^2 + 112\,s + 64}, $$
$$ \frac{1/2!}{(s+1)^{3/2}} \approx \frac{1/2!}{(s+1)}\frac{s^3 + 24\,s^2 + 80\,s + 64}{7\,s^3 + 56\,s^2 + 112\,s + 64}. $$
Plotting the impulse response of this transfer function, denoted with $\tilde{h}(t)$, and comparing it to $h(t)$ yields