I am trying to understand mathematically how the $P$, $I$, and $D$ parameters act on a system, but I'm having a hard time doing so.
Using a simple example, I've only been able to show that the steady-state error (SSE) never becomes zero under pure $P$ control; I'm unable to show the corresponding results for the other terms, and I would be very grateful if anyone could demonstrate them mathematically.
SSE: Consider the plant \begin{equation} G(s) = \frac{1}{s^2+s+1} \end{equation} with the controller being only a gain, $G_c(s) = k$. For a unit step input $R(s) = 1/s$, the error signal is \begin{equation} E(s) = \frac{R(s)}{1+G_c(s)G(s)} = \frac{1}{s}\cdot\frac{1}{1+ \frac{k}{s^2+s+1}}. \end{equation} Applying the final value theorem, \begin{equation} SSE = \lim_{s \to 0} sE(s) = \frac{1}{1 + k}, \end{equation} thus showing the SSE decreases with $k$ but never becomes zero.
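Here's a quick numerical sanity check of that limit, using the type-0 second-order plant $G(s) = 1/(s^2+s+1)$ (for which the step-input SSE under a pure gain $k$ works out to $1/(1+k)$); the gains, step size, and horizon below are illustrative choices:

```python
# Simulate y'' + y' + y = u (i.e. G(s) = 1/(s^2 + s + 1)) under pure
# proportional control u = k*(r - y), with a unit step reference r = 1,
# and report the remaining error once transients have decayed.

def step_sse(k, dt=1e-3, t_end=50.0):
    y, ydot = 0.0, 0.0
    r = 1.0
    t = 0.0
    while t < t_end:
        u = k * (r - y)        # P controller
        yddot = u - ydot - y   # plant dynamics
        ydot += dt * yddot     # forward-Euler integration
        y += dt * ydot
        t += dt
    return r - y               # steady-state error

for k in (1.0, 10.0, 100.0):
    print(k, step_sse(k), 1.0 / (1.0 + k))
```

Raising $k$ shrinks the error toward zero but, as the formula predicts, never reaches it.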
I may have found a solution for overshoot as well, but the $I$ and $D$ terms are still tricky to handle: I cannot see how the damping ratio changes when an $I$ or $D$ term is added.
Since you're using a second-order plant, I'll tailor my answer to that.
As you've noted above, the proportional gain, $k_P$, determines most of the controller's response to the error, though a $P$ term alone generally leaves a steady-state error (this is sometimes called "droop").
To fix this steady-state error, we introduce the integral gain, $k_I$. This helps to eliminate steady-state error because it looks not only at the magnitude of the error, but also at its duration. Specifically, if we let $e(t)$ denote our error signal, a $PI$ controller takes the form \begin{equation} u_{PI}(t) = k_Pe(t) + k_I\int_{0}^{t}e(\tau)\, d\tau \end{equation} and has transfer function \begin{equation} \frac{U(s)}{E(s)} = k_P + \frac{k_I}{s}. \end{equation}
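A small simulation makes the effect of the integral term concrete. This sketch assumes an illustrative type-0 plant $y'' + y' + y = u$ and arbitrary stable gains, not values from the post:

```python
# Compare P-only vs PI control of the plant y'' + y' + y = u
# (G(s) = 1/(s^2 + s + 1)) tracking a unit step reference.

def simulate(kp, ki, dt=1e-3, t_end=80.0):
    y, ydot, integral = 0.0, 0.0, 0.0
    r = 1.0
    t = 0.0
    while t < t_end:
        e = r - y
        integral += e * dt          # running integral of the error
        u = kp * e + ki * integral  # PI control law
        yddot = u - ydot - y        # plant dynamics
        ydot += dt * yddot          # forward-Euler integration
        y += dt * ydot
        t += dt
    return r - y                    # remaining error

print(simulate(2.0, 0.0))  # P only: nonzero droop remains
print(simulate(2.0, 1.0))  # PI: error driven to (nearly) zero
```

At steady state the error is zero, yet the plant still needs a nonzero input to hold its output; the accumulated integral is what supplies it.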
If we want to control a second-order system with transfer function \begin{equation} \tag{1} \frac{Y(s)}{U(s)} = \frac{A}{s^2 + a_1s + a_2}, \end{equation} and have it track a reference signal $R(s)$, our controller becomes \begin{equation} U(s) = k_P(R(s) - Y(s)) + k_I\left(\frac{R(s) - Y(s)}{s}\right). \end{equation} If we substitute this into Equation $(1)$ and do some rearranging, we find that our characteristic equation is \begin{equation} s^3 + a_1s^2 + (a_2 + Ak_P)s + Ak_I = 0. \end{equation} Here we can control only two of the coefficients, but we'd like to control three (all except the one on $s^3$), since that would let us place all three roots of the characteristic polynomial. In practice, $PI$ controllers can lead to overshoot and oscillations. Thus, while a $PI$ controller solves the problem of steady-state error, it can still cause other problems, and for these reasons we introduce the $D$ term.
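For completeness, the rearranging step looks like this: substituting the controller into $(1)$ gives \begin{equation} \left(s^2 + a_1s + a_2\right)Y(s) = A\left(k_P + \frac{k_I}{s}\right)\bigl(R(s) - Y(s)\bigr), \end{equation} and multiplying through by $s$ and collecting the $Y(s)$ terms yields \begin{equation} \left[s^3 + a_1s^2 + (a_2 + Ak_P)s + Ak_I\right]Y(s) = A\left(k_Ps + k_I\right)R(s), \end{equation} whose bracketed factor, set to zero, is the characteristic equation.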
Going through the above manipulations again with a $PID$ controller, which has \begin{equation} \frac{U(s)}{E(s)} = k_P + \frac{k_I}{s} + k_Ds \end{equation} as its transfer function, we see that the characteristic equation of our second-order system with a $PID$ controller is \begin{equation} s^3 + (a_1 + Ak_D)s^2 + (a_2 + Ak_P)s + Ak_I = 0. \end{equation} Here we can choose all three coefficients and hence determine all three roots of the characteristic polynomial. This means that, as with the $PI$ controller, the steady-state error is under our control, but now the system's oscillations are as well. In practice, the $D$ term lets us damp the oscillations that appear when the $I$ term is introduced.
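That coefficient-matching argument can be turned directly into a pole-placement recipe: pick three desired closed-loop poles, expand the target polynomial, and solve for the gains. The plant numbers and target poles below are illustrative choices:

```python
# Pole placement via the PID characteristic equation
#   s^3 + (a1 + A*kD)s^2 + (a2 + A*kP)s + A*kI = 0.

A, a1, a2 = 2.0, 3.0, 5.0                   # plant A/(s^2 + a1 s + a2)
poles = [-2.0, -1.0 + 1.0j, -1.0 - 1.0j]    # desired closed-loop poles

# Expand (s - p1)(s - p2)(s - p3) = s^3 + c2 s^2 + c1 s + c0
c2 = -(poles[0] + poles[1] + poles[2]).real
c1 = (poles[0]*poles[1] + poles[0]*poles[2] + poles[1]*poles[2]).real
c0 = -(poles[0] * poles[1] * poles[2]).real

# Match coefficients term by term to solve for the three gains
kD = (c2 - a1) / A
kP = (c1 - a2) / A
kI = c0 / A

# Sanity check: each desired pole is a root of the closed-loop polynomial
for s in poles:
    val = s**3 + (a1 + A*kD)*s**2 + (a2 + A*kP)*s + A*kI
    assert abs(val) < 1e-9

print(kP, kI, kD)
```

Because each gain appears in exactly one coefficient, the system is solved by simple subtraction and division; this is what "choosing all three coefficients" buys you.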