I'm currently taking a classical electrodynamics course. I have a mathematical background and I know that the classical theorems of integral calculus (Stokes, Gauss, ...) are just particular versions of Stokes' theorem for manifolds and that the curl, divergence, ... operators are particular forms of the exterior derivative.
But in this physics course most of the students don't know much mathematics, so the teacher solves most of the problems without justifying that the results from calculus he uses actually apply.
For instance, in one exercise I have to prove that a magnetic monopole does not satisfy Maxwell's equations. To do this we'll prove that $\nabla \cdot \vec{B} \neq 0$.
The field created by a magnetic monopole is:
$$\vec{B}_m = \frac{\mu_0 q_m}{4 \pi} \frac{\vec{r}}{\lvert\vec{r}\rvert^3} $$
And the solution given by my teacher is that the divergence of this field is:
$$\nabla \cdot \vec{B}_m = \mu_0 q_m \delta^{(3)}(\vec{r})$$
where the superscript indicates that it is a 3-dimensional delta. I don't know how to justify this rigorously because the field has a singularity at $\vec{r}=0$ and the $\delta$ is not a function but a distribution. Is there any way to generalize Stokes' theorem to be able to show this? What is the formal way to prove this statement?
I would really appreciate any mathematical reference where I could find how to deal with this kind of singularity.
In short, your professor is not taking the care to specify which sense of "derivative" is meant in Maxwell's equations and, by extension, which functions the equations apply to.
If you take the usual definition of the (partial) derivative, then Maxwell's equations have to be understood to also include the statement "all fields are differentiable everywhere". For example, Gauss's law for magnetism $\nabla \cdot \mathbf{B} = 0$ needs to be interpreted as stating that 1) $\nabla \cdot \mathbf{B}$ exists everywhere and 2) it equals zero everywhere. Your field $\mathbf{B}_m$ is not even defined at $\mathbf{r} = 0$, let alone differentiable there, and is for that reason not allowed to be a magnetic field. The problem isn't that the divergence is nonzero; the problem is that the field isn't differentiable!
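To make this concrete, here is a minimal sketch (plain Python, standard library only) that estimates the ordinary divergence with central differences. The function names and the step size `h` are my own choices for illustration, and the prefactor $\mu_0 q_m / 4\pi$ is set to 1. Away from the origin the estimate should come out numerically zero, so the classical derivative gives no hint of a source; the only trouble is at $\mathbf{r} = 0$, where the field cannot even be evaluated.

```python
def monopole_field(x, y, z):
    """Return r / |r|^3, i.e. B_m with the prefactor mu_0 q_m / (4 pi) set to 1."""
    r_cubed = (x * x + y * y + z * z) ** 1.5
    return (x / r_cubed, y / r_cubed, z / r_cubed)


def divergence(field, point, h=1e-5):
    """Central-difference estimate of div(field) at `point`.

    Only meaningful where the field is smooth on the scale of h,
    so keep |point| much larger than h.
    """
    x, y, z = point
    d_bx = (field(x + h, y, z)[0] - field(x - h, y, z)[0]) / (2 * h)
    d_by = (field(x, y + h, z)[1] - field(x, y - h, z)[1]) / (2 * h)
    d_bz = (field(x, y, z + h)[2] - field(x, y, z - h)[2]) / (2 * h)
    return d_bx + d_by + d_bz


if __name__ == "__main__":
    for point in [(0.3, -0.4, 0.5), (1.0, 2.0, -1.5)]:
        print(point, divergence(monopole_field, point))
    try:
        monopole_field(0.0, 0.0, 0.0)
    except ZeroDivisionError:
        print("field is undefined at the origin")
```

The individual partial derivatives are not small at these points; it is only their sum that cancels.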
Your confusion arises because your professor is silently switching his or her definition of derivative to the distributional derivative: in the sense of distributions, we have your statement $\nabla \cdot \mathbf{B}_m (\mathbf{r})= \mu_0 q_m \delta^{(3)} (\mathbf{r})$, which of course isn't zero (even in the sense of distributions!). This is unnecessary, as I show above, and logically pointless: your professor is claiming that $\mathbf{B}_m$ doesn't satisfy Maxwell's equations with normal derivatives by showing that $\mathbf{B}_m$ doesn't satisfy them with distributional derivatives. However, if he or she doesn't connect the dots between the usual Maxwell equations and the distributional Maxwell equations, then logically nothing can be concluded from this exercise.
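For completeness, here is a sketch of how the distributional identity itself can be justified. The test function $\varphi$ (smooth, compactly supported) and the cutoff radius $\varepsilon$ are notation I'm introducing here, not something taken from your professor's solution. The distributional divergence is defined by moving the derivative onto $\varphi$, which makes sense because $\mathbf{B}_m$ is locally integrable ($|\mathbf{B}_m| \sim 1/|\mathbf{r}|^2$ in three dimensions):

$$\langle \nabla \cdot \mathbf{B}_m, \varphi \rangle := -\int_{\mathbb{R}^3} \mathbf{B}_m \cdot \nabla \varphi \, dV = -\lim_{\varepsilon \to 0} \int_{|\mathbf{r}| > \varepsilon} \mathbf{B}_m \cdot \nabla \varphi \, dV.$$

For $|\mathbf{r}| > \varepsilon$ the field is smooth with $\nabla \cdot \mathbf{B}_m = 0$, so $\mathbf{B}_m \cdot \nabla \varphi = \nabla \cdot (\varphi \mathbf{B}_m)$ and the ordinary divergence theorem applies. Since $\varphi$ vanishes far away, the only boundary is the sphere $|\mathbf{r}| = \varepsilon$, whose normal pointing out of the region $|\mathbf{r}| > \varepsilon$ is $-\hat{\mathbf{r}}$:

$$-\int_{|\mathbf{r}| > \varepsilon} \nabla \cdot (\varphi \mathbf{B}_m) \, dV = \oint_{|\mathbf{r}| = \varepsilon} \varphi \, \mathbf{B}_m \cdot \hat{\mathbf{r}} \, dS = \frac{\mu_0 q_m}{4 \pi \varepsilon^2} \oint_{|\mathbf{r}| = \varepsilon} \varphi \, dS \xrightarrow{\varepsilon \to 0} \mu_0 q_m \, \varphi(0).$$

The last step uses that $\frac{1}{4 \pi \varepsilon^2} \oint_{|\mathbf{r}| = \varepsilon} \varphi \, dS$ is the average of $\varphi$ over the sphere, which tends to $\varphi(0)$ by continuity. Since $\langle \delta^{(3)}, \varphi \rangle = \varphi(0)$, this is the statement $\nabla \cdot \mathbf{B}_m = \mu_0 q_m \delta^{(3)}$ in the sense of distributions.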
The resolution of course comes from generalizing Maxwell's laws to allow them to be applied to distributions. Perhaps the simplest way to do this is to use the integral forms of the laws rather than the differential forms. Instead of $\nabla \cdot \mathbf{B} = 0$, require that $\oint_{\partial \Omega} \mathbf{B} \cdot d\mathbf{S} = 0$ for every bounded region $\Omega$ with closed boundary surface $\partial \Omega$. We can't invoke the divergence theorem here, as $\mathbf{B}_m$ isn't differentiable at the origin, so let's just evaluate the surface integral directly.
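Choose $\Omega$ to be the ball of radius 1 centered at the origin, so that $\partial \Omega$ is the unit sphere, oriented outward. On $\partial \Omega$ we have $|\mathbf{r}| = 1$ and $d\mathbf{S} = \hat{\mathbf{r}} \, dS$ with $\hat{\mathbf{r}} = \frac{\mathbf{r}}{|\mathbf{r}|}$, so $\mathbf{B}_m \cdot d\mathbf{S} = \frac{\mu_0 q_m}{4 \pi} \, dS$. Then we have
$$\oint_{\partial \Omega} \mathbf{B}_m \cdot d\mathbf{S} = \frac{\mu_0 q_m}{4 \pi} \oint_{\partial \Omega} dS = \frac{\mu_0 q_m}{4 \pi} \cdot 4\pi = \mu_0 q_m.$$
Since $\mu_0 q_m \ne 0$ for any nonzero monopole charge $q_m$, this field $\mathbf{B}_m$ doesn't satisfy Maxwell's equations.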
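If you'd like to check the surface integral numerically, here is a rough sketch in plain Python. It integrates the purely geometric part of the field, $\mathbf{r}/|\mathbf{r}|^3$ (that is, $\mathbf{B}_m$ with the prefactor $\mu_0 q_m / 4\pi$ stripped off), over a sphere using a midpoint rule in spherical angles. The function names, the grid sizes, and the choice of test spheres are my own assumptions for illustration.

```python
import math


def monopole_field(x, y, z):
    """Return r / |r|^3, i.e. B_m with the prefactor mu_0 q_m / (4 pi) set to 1."""
    r_cubed = (x * x + y * y + z * z) ** 1.5
    return (x / r_cubed, y / r_cubed, z / r_cubed)


def flux_through_sphere(center, radius, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the outward flux of monopole_field through
    the sphere with the given center and radius.

    The sphere must not pass through the origin, where the field is undefined.
    """
    cx, cy, cz = center
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Outward unit normal of the sphere at (theta, phi).
            nx, ny, nz = sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t
            bx, by, bz = monopole_field(
                cx + radius * nx, cy + radius * ny, cz + radius * nz
            )
            b_dot_n = bx * nx + by * ny + bz * nz
            # Area element on the sphere: R^2 sin(theta) dtheta dphi.
            total += b_dot_n * radius ** 2 * sin_t * d_theta * d_phi
    return total


if __name__ == "__main__":
    print(flux_through_sphere((0.0, 0.0, 0.0), 1.0))   # encloses the origin
    print(flux_through_sphere((0.0, 0.0, 0.0), 0.01))  # tiny sphere, still encloses it
    print(flux_through_sphere((3.0, 0.0, 0.0), 1.0))   # does not enclose the origin
```

The first two results should be close to $4\pi \approx 12.566$ regardless of the radius, and the third close to zero. Multiplying by $\mu_0 q_m / 4\pi$ restores the physical flux, and the "all or nothing" dependence on whether the origin is enclosed is what the delta function encodes.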
This answer side-steps your question about generalizing the various vector calculus theorems to distributions. But, as you expect, the answer is yes: you can reformulate these theorems to apply to distributional derivatives. I'll leave the exploration of this topic as an exercise for the interested reader.