In class, we have seen that the covariant derivative of some form $R$ can be written as:
$$DR = dR + [A, R] = dR + A\wedge R - R\wedge A \tag1$$
Here, $d$ is the exterior derivative on forms and $A$ is the local connection form, defined via the pull-back along a section $S: U_i \subset M \rightarrow P(M, G)$, where $P(M, G)$ is the principal bundle with $M$ the base space and $G$ the Lie group that plays the role of the fiber. Therefore, $A = S^*\omega$, with $\omega \in \Omega^1(P)\otimes T_eG$ and $\Omega^1(P)$ the set of 1-forms on $P(M, G)$. So while $\omega$ is a connection on all of $P$, $A$ is only defined over $U_i$.
So by Eq. (1) we can write:
$$D = d + [A,\ \cdot\ ] \tag2$$
Eq. (2) is pretty similar to the one used in QFT:
$$D_\mu = \partial_\mu + igA_\mu \tag3$$
Here $g$ is just the coupling constant of the interaction, so $igA_\mu$ is somehow equivalent to the connection $A$ of Eq. (1). I understand that the index $\mu$ comes from the fact that in Eq. (1) we work with forms, so
$$A\sim A_\mu dx^\mu \tag4$$
But what I don't see is how to relate the commutator in Eq. (2) to the simple form $igA_\mu$.
The formal definitions of horizontal and vertical spaces are not too important right now; for us they will just serve as a tool. The idea is that by means of horizontal spaces we can talk about how a connection (i.e., a background field) changes the differentiation rule. You can compare this with general relativity, where the curvature of spacetime does, indeed, force you to use the covariant differential operator $\nabla_{\mu}$ (with respect to some coordinate system). Let me try to make this claim precise!
Formal preliminaries
Throughout this section, we will fix a representation $\rho:G \to GL(V)$ of the structure group on some finite dimensional vector space $V$ and a connection form $A$ on $P$.
Then you get:
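Judging from how the result is used in (1.) and (2.) below, the statement meant here is presumably the standard formula for the exterior covariant derivative of horizontal, $\rho$-equivariant forms. As a sketch in the notation of this answer (the symbol $\alpha$ and the degree $k$ are only placeholders): for $\alpha \in \Omega^k_{\mathrm{hor}}(P,V)^{(G, \rho)}$, with $D_A\alpha$ defined as $d\alpha$ evaluated on the horizontal parts of its arguments,

$$D_A \alpha = d\alpha + \rho_*(A)\wedge\alpha ,$$

where $\rho_*: T_eG \to \mathrm{End}(V)$ is the derivative of $\rho$ at the identity and the wedge product lets the $\mathrm{End}(V)$-valued 1-form $\rho_*(A)$ act on the $V$-valued form $\alpha$.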
Relating the two derivatives:
(1.) $\quad$ In your first equation, you are looking at horizontal 1-forms of type $Ad$, where $Ad:G \to GL(T_eG)$ is the adjoint representation of the Lie group $G$ on its Lie algebra. Since $Ad_*(X) = [X, \ \cdot \ ]$, you get your equation as a special case of the preceding theorem (after everything is pulled back to a suitable set $U \subset M$ via a section $s: U \to P$).
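To make the origin of the commutator explicit, here is a short sketch, assuming for concreteness that $G$ is a matrix group so that $Ad_g Y = gYg^{-1}$. Differentiating along the curve $g(t) = e^{tX}$ gives

$$Ad_*(X)\,Y = \frac{d}{dt}\Big|_{t=0} e^{tX}\, Y\, e^{-tX} = XY - YX = [X, Y].$$

Inserting $\rho_* = Ad_*$ into a covariant derivative of the form $d + \rho_*(A)\wedge$ and keeping track of the sign picked up when $dx^\mu$ is moved past a $p$-form $R$, one finds

$$DR = dR + A\wedge R - (-1)^p\, R\wedge A ,$$

which agrees with Eq. (1) as written for even $p$, for instance for the curvature 2-form. For a 0-form $\phi$ with values in the Lie algebra this reads, in coordinates, $D_\mu \phi = \partial_\mu \phi + [A_\mu, \phi]$. If one additionally adopts the common physics convention $A_\mu = i g\, \mathcal{A}^a_\mu T_a$ with Hermitian generators $T_a$ and real components $\mathcal{A}^a_\mu$ (an assumption here; the symbols $\mathcal{A}^a_\mu$, $T_a$ are not part of the notation above, and signs differ between textbooks), this becomes

$$D_\mu \phi = \partial_\mu \phi + i g\, [\mathcal{A}^a_\mu T_a, \phi],$$

so the commutator of Eq. (2) is what $igA_\mu$ turns into when the field sits in the adjoint representation.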
(2.) $\quad$ Now, let's take any smooth function $\psi: P \to V$ which satisfies $\psi(pg)= \rho(g^{-1})\psi(p)$. If you pull it back via the section $s:U \to P$ you get a smooth function $\psi' := \psi \circ s: U \to V$, which you can think of as a gauged (fermion) field. These are the functions on which the derivative induced by the connection (which corresponds to the gauge bosons) is meant to act. Observe that if you choose a different gauge $s': U \to P$ then you can find a function $\mu: U \to G$ with $s' = s\mu$, such that $\psi'' := \psi \circ s' = \rho(\mu( \ \cdot \ )^{-1}) \psi'$, which is why you want to look at this kind of function: they precisely capture the transformation property of your fields under a given gauge transformation. Now, on $\psi$ we have, since $\psi \in \Omega^0_{\mathrm{hor}}(P,V)^{(G, \rho)}$: $$D_A \psi = d\psi + \rho_*(A)\, \psi$$ which, after pulling it back via $s$ and then writing it in coordinates, becomes: $$D_{\mu} \psi' = \partial_{\mu}\psi' + \rho_*(A^{s}_{\mu})\, \psi' $$ where $A^s = s^*A$ is the pulled-back connection, i.e. your ''gauge potential'', and the index $\mu$ denotes its components in your coordinate system of choice. The factor $ig$ is, as far as I know, a convention: the coupling constant $g$ and a factor $i$ are pulled out of $\rho_*(A^s_\mu)$, which also emphasizes that the representation $\rho$ is not trivial, i.e. $\rho_* \neq 0$. The explicit representation is usually not mentioned in the physics literature.
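As a sanity check, here is a minimal worked example for the simplest case, under assumptions that are not part of the text above: $G = U(1)$, $V = \mathbb{C}$, $\rho(e^{i\theta}) = e^{i\theta}$ (a field of unit charge), and the physics convention $A^s = i g\, \mathcal{A}_\mu\, dx^\mu$ with real $\mathcal{A}_\mu$. Since the Lie algebra of $U(1)$ is $i\mathbb{R}$ and $\rho_*$ is just multiplication, the formula from (2.) becomes

$$D_\mu \psi' = \partial_\mu \psi' + i g\, \mathcal{A}_\mu\, \psi' ,$$

which is Eq. (3). Changing the gauge to $s' = s\mu$ with $\mu = e^{i\alpha(x)}$ gives $\psi'' = e^{-i\alpha}\psi'$ and $A^{s'} = A^s + \mu^{-1}d\mu = A^s + i\, d\alpha$, i.e. $g\mathcal{A}_\mu \mapsto g\mathcal{A}_\mu + \partial_\mu \alpha$. The derivative then transforms like the field itself:

$$\begin{aligned}
\partial_\mu \psi'' + i\,(g\mathcal{A}_\mu + \partial_\mu\alpha)\,\psi''
&= e^{-i\alpha}\big(\partial_\mu\psi' - i\,\partial_\mu\alpha\,\psi'\big) + i\,(g\mathcal{A}_\mu + \partial_\mu\alpha)\, e^{-i\alpha}\psi' \\
&= e^{-i\alpha}\big(\partial_\mu \psi' + i g\, \mathcal{A}_\mu\, \psi'\big).
\end{aligned}$$

The two $\partial_\mu\alpha$ terms cancel, which is exactly the property that makes $D_\mu$ a covariant derivative.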
Concluding Remarks:
For further reading, I'd recommend ''Gauge Theory and Variational Principles'' by D. Bleecker. This is not the easiest read, but he has many physical examples and I think it'll help you. Furthermore, the preceding is technical, but that's really it! I didn't write everything out in detail, since I think it'll be a good exercise to verify it yourself (mainly because it doesn't seem too ''brutal'' anymore after you've done a few calculations with it).
Edit: Maybe Bleecker's book is overkill; I think any book about connections on principal fibre bundles will do the job. Also be aware that what I've written is meant at the classical level, i.e. before second quantisation.