This is a sort of soft-question to which I can't find any satisfactory answer. At heart, I feel I have some need for a robust and well-motivated formalism in mathematics, and my work in geometry requires me to learn some analysis, and so I am confronted with the task of understanding weak solutions to PDEs. I have no problems understanding the formal definitions, and I don't need any clarification as to how they work or why they produce generalized solutions. What I don't understand is why I should "believe" in these guys, other than that they are a convenience.
Another way of putting the issue is that I don't see any reason to invent weak solutions, other than a sort of (and I'm dreadfully sorry if this is offensive to any analysts) mathematical laziness. So what if classical solutions don't exist? My tongue-in-cheek instinct is just to say that that is the price one has to pay for working with bad objects! In other words, I do not find the justification "well, it makes it possible to find solutions" a very convincing one.
A justification I might accept, is if there was a good mathematical reason for us to a priori expect there to be solutions, and for some reason, they could not be found in classical function spaces like $C^k(\Omega)$, and so we had to look at various enlargements in order to find solutions. If this is the case, what is the heuristic argument that tells me whether or not I should expect a PDE (subject to whatever conditions you want in order to make your argument clear) to have solutions, and what function space(s) are appropriate to look at to actually find these solutions?
Another justification that I would accept is if there was some good analytic reason to discard the classical notion of differentiability altogether. Perhaps the correct thing to do is to think of weak derivatives as simply the 'correct' notion of differentiability in the first place. My instinct is to say that maybe weak solutions are a sort of 'almost-everywhere' type generalization of differentiability, similar to the Lebesgue integral being a replacement for the Riemann integral which is more adept at dealing with phenomena only occurring in sets of measure $0$.
Or maybe both of these hunches are just completely wrong. I am basically brand new to these ideas, and wrestling with my skepticism about these ideas. So can somebody make me a believer?
Worth noting is that there is already a question on this site here, but the answer in this link is essentially that there exist a bunch of nice theorems if you do this, or that physically we don't care very much about what happens pointwise, only in terms of integrals over small regions. It should be clear why I don't like the first reason, and the second reason I may accept if it could be turned into something that looks like my proposed justification #2 - if integrals over small regions of derivatives are the 'right' mathematical formalism for PDEs. I just don't understand how to make that leap. In other words, I would like a reason to find weak solutions interesting for their own sake.
Let's have a look at the Dirichlet problem on some bounded domain $\Omega$ (say with smooth boundary), i.e. $$ -\Delta u=f \text{ in } \Omega\\ u=0~ \text{ on } \partial \Omega $$ for $f \in \text{C}^0(\overline{\Omega})$. Then Dirichlet's principle states that a classical solution is a minimizer of an energy functional, namely $E(u):=\dfrac{1}{2}\int_\Omega \left|\nabla u\right|^2 \mathrm{d}x-\int_\Omega f u ~\mathrm{d}x$. (Here we need some boundary condition on $\Omega$ for the first integral to be finite.)
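To see why minimizing $E$ is tied to the PDE, here is a short sketch of the computation, assuming $u$ and the perturbation $v$ are smooth enough for every integral to make sense and that $v=0$ on $\partial\Omega$ (the symbols $v$ and $t$ are introduced only for this illustration). Expanding the energy along the line $u+tv$, $t\in\mathbb{R}$, gives
$$
E(u+tv)=E(u)+t\left(\int_\Omega \nabla u\cdot\nabla v~\mathrm{d}x-\int_\Omega f v~\mathrm{d}x\right)+\frac{t^2}{2}\int_\Omega \left|\nabla v\right|^2\mathrm{d}x .
$$
If $u$ minimizes $E$, the derivative in $t$ at $t=0$ must vanish, so the bracket is zero for every admissible $v$. Conversely, if the bracket is zero for every admissible $v$, then $E(u+v)=E(u)+\frac12\int_\Omega|\nabla v|^2\,\mathrm{d}x\geq E(u)$, so $u$ is a minimizer. Note that the bracket involves only first derivatives of $u$.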
So the question one may ask is: if I have some PDE, why not just take the corresponding energy functional, minimize it in the right function space, and obtain a solution of the PDE? So far so good. But the problem that may occur is finding this minimizer. It can be shown that such functionals are bounded from below, so we have some infimum. As also stated in the Wikipedia article, it was simply assumed (e.g. by Riemann) that this infimum is always attained, which, as Weierstrass showed, is unfortunately not always the case (see also this answer on MO).
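For concreteness, the counterexample usually attributed to Weierstrass concerns a different, degenerate functional rather than the Dirichlet energy itself; the following is a sketch of it from memory, not a quotation of the linked answer. Consider
$$
J(u):=\int_{-1}^{1} x^2\,u'(x)^2~\mathrm{d}x,\qquad u\in \text{C}^1([-1,1]),\quad u(-1)=-1,\quad u(1)=1 .
$$
Clearly $J\geq 0$. For $\varepsilon>0$ take $u_\varepsilon(x):=\dfrac{\arctan(x/\varepsilon)}{\arctan(1/\varepsilon)}$, which satisfies the boundary conditions. Then
$$
J(u_\varepsilon)=\frac{1}{\arctan(1/\varepsilon)^2}\int_{-1}^{1}\frac{x^2\varepsilon^2}{(x^2+\varepsilon^2)^2}~\mathrm{d}x
\leq \frac{1}{\arctan(1/\varepsilon)^2}\int_{-1}^{1}\frac{\varepsilon^2}{x^2+\varepsilon^2}~\mathrm{d}x
=\frac{2\varepsilon}{\arctan(1/\varepsilon)}\longrightarrow 0 \quad (\varepsilon\to 0),
$$
so $\inf J=0$. But $J(u)=0$ would force $u'(x)=0$ for all $x\neq 0$, hence $u$ constant by continuity, which contradicts $u(-1)\neq u(1)$. The infimum is therefore not attained among $\text{C}^1$ functions.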
Hence, we find differentiable functions which are "close" (in some sense) to a "solution" of the PDE, but no actual differentiable solution. I feel that this is quite unsatisfactory.
So how could we save this? We can multiply the PDE (take the Poisson equation above for simplicity) by some test function and integrate by parts to obtain $$ \int_\Omega \nabla u \cdot \nabla v~\mathrm{d}x= \int_\Omega fv~\mathrm{d}x $$ for all test functions $v$. But from what space should $u$ come? What do we need in order to make sense of the integrals?
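Spelled out, the integration by parts is Green's first identity. As a sketch, assume $u\in \text{C}^2(\overline{\Omega})$ and take test functions $v$ that are smooth and vanish on $\partial\Omega$ (this choice of test space is an assumption made for the illustration; $\nu$ denotes the outward unit normal and $\mathrm{d}S$ the surface measure):
$$
\int_\Omega (-\Delta u)\,v~\mathrm{d}x
=\int_\Omega \nabla u\cdot\nabla v~\mathrm{d}x-\int_{\partial\Omega} v\,\frac{\partial u}{\partial \nu}~\mathrm{d}S
=\int_\Omega \nabla u\cdot\nabla v~\mathrm{d}x ,
$$
where the boundary integral drops out because $v=0$ on $\partial\Omega$. Replacing $-\Delta u$ by $f$ on the left gives the identity. The left-hand side needs two derivatives of $u$; the right-hand side needs only one, and only in an integrated sense.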
Well, $\nabla u \in \text{L}^2(\Omega)$ would be nice, because then the first integral is well-defined via Cauchy-Schwarz. But as shown by Weierstrass, classical derivatives are not enough, so we need a weaker notion of derivative. This is where Sobolev spaces come in, and looking again at the last formula, we recognize the weak formulation.
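To make this concrete, here is a minimal numerical sketch in one dimension, assuming the model problem $-u''=1$ on $(0,1)$ with $u(0)=u(1)=0$ and continuous piecewise-linear ("hat") functions as both trial and test functions. All names (`solve_tridiagonal`, `energy`, `coeffs`, the choice `n = 9`) are illustrative and not taken from the discussion above. A piecewise-linear function is not twice differentiable, so it cannot be a classical solution, yet its derivative is square-integrable, so both the weak formulation and the energy make sense for it.

```python
# Sketch: weak formulation of -u'' = 1 on (0, 1), u(0) = u(1) = 0,
# discretized with piecewise-linear hat functions on a uniform grid.

def solve_tridiagonal(sub, diag, sup, rhs):
    """Thomas algorithm; sub[0] and sup[-1] are ignored by the result."""
    n = len(diag)
    sup_mod = [0.0] * n
    rhs_mod = [0.0] * n
    sup_mod[0] = sup[0] / diag[0]
    rhs_mod[0] = rhs[0] / diag[0]
    for i in range(1, n):
        pivot = diag[i] - sub[i] * sup_mod[i - 1]
        sup_mod[i] = sup[i] / pivot
        rhs_mod[i] = (rhs[i] - sub[i] * rhs_mod[i - 1]) / pivot
    x = [0.0] * n
    x[n - 1] = rhs_mod[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = rhs_mod[i] - sup_mod[i] * x[i + 1]
    return x

n = 9                      # number of interior nodes (illustrative choice)
h = 1.0 / (n + 1)          # mesh width
nodes = [(i + 1) * h for i in range(n)]

# Stiffness matrix entries: integral of phi_i' * phi_j' over (0, 1).
sub = [-1.0 / h] * n
diag = [2.0 / h] * n
sup = [-1.0 / h] * n
# Load vector entries: integral of f * phi_i with f = 1.
load = [h] * n

# Discrete weak formulation: stiffness * coeffs = load.
coeffs = solve_tridiagonal(sub, diag, sup, load)

def energy(values):
    """E(u) = 1/2 * int |u'|^2 - int f*u for the piecewise-linear u
    with the given interior nodal values and zero boundary values."""
    padded = [0.0] + list(values) + [0.0]
    grad_term = sum((padded[i + 1] - padded[i]) ** 2
                    for i in range(len(padded) - 1)) / h
    load_term = h * sum(values)
    return 0.5 * grad_term - load_term

if __name__ == "__main__":
    for x, c in zip(nodes, coeffs):
        print(f"x = {x:.1f}   u_h = {c:.6f}   exact = {x * (1 - x) / 2:.6f}")
    print("discrete energy:", energy(coeffs))
```

The coefficient vector solving the discrete weak formulation is also the minimizer of `energy` among all piecewise-linear functions on this grid, which mirrors the link between the weak formulation and Dirichlet's principle.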
I am aware that this does not give a full explanation of why one should "believe" in weak solutions, Sobolev spaces and so on. What I stated above is a quick run-through of how the step from classical to weak theory was motivated in my course on PDE, and I, at least, was quite happy with it.