This question is somewhat open-ended and there is unlikely to be a single fixed answer to this, but I would love to hear some thoughts of the people with a deeper understanding of this field than I have.
I'm taking a graduate level class in Combinatorial Optimisation, and we have spent some of the first few weeks going over proofs in Graph Theory and problems in Network Flow.
I come from a background in engineering but I've taken Discrete Mathematics and Algorithms in the past, and so I do have a functional understanding of these topics. Where I find myself lacking, however, is converting the intuition I have for these techniques/theorems into actual proofs.
For the sake of an example, consider the case of the "Bottleneck Claim" as taken from here:
Theorem 1 If $(G = (V, E), s, t, c)$ is a network in which the cost of the maximum flow is $\text{opt}$, then there is a path from $s$ to $t$ in which every edge has capacity at least $\frac{\text{opt}}{\vert E \vert}$.
It then goes on to prove the theorem using Flow Decomposition described as follows
Lemma 2 (Flow Decomposition) Let $(G = (V, E), s, t, c)$ be a network, and $f$ be a flow in the network. Then there is a collection of feasible flows $f_1, \dots , f_k$ and a collection of $s \rightarrow t$ paths $p_1, \dots , p_k$ such that:
$k \le \vert E \vert$;
the cost of $f$ is equal to the sum of the costs of the flows $f_i$
the flow $f_i$ sends positive flow only on the edges of $p_i$.
Now there is a reasonable amount of intuition behind all of this — for any path $s \rightarrow t$ path, $i$, you reduce flow by $f_i$ (the bottleneck); you can do this at most $\vert E \vert$ times before there are no more paths. And using this, it's simple enough to show that Theorem 1 is correct.
So clearly, the intuition is there, but I struggle with is the process of thinking about or constructing such a proof independently.
The question, then, is: what kind of resources/tools are available to someone hoping to delve deeper into Theoretical Computer Science, in order to sharpen the intuition into something more formal? Most of the textbooks I have recently tried to read, I have found somewhat inaccessible and densely packed with notation that is not very helpful.
Any comments, philosophical or otherwise, and/or general words of advice are also welcome.
I don't know whether that is the answer you are looking for, but for me personally, the key to understanding network flow is to simply think of as water flow. I.e., suppose at the vertex $s$ there is some water pump pumping $x$ amount of water out per second, and at the the vertex $s$ there is a hole, draining $x$ amount of water per second. In the middle, the water travels along some edges at different flow rates, which you can't really control. But the maximum flow rate among an edge is bounded by its capacity, which you can imagine as the thickness of a corresponding water pipe. (A small technicality is that you can technically also have water flowing in cycles somewhere, kind of disobeying the laws of physics and forming a perpetuum mobile.)
The flow decomposition theorem then simply states, that each $s$-$t$-flow can be decompositioned into at most $|E|$ "rivers" form $s$ to $t$ (and/or circular flows). If you accept the flow decomposition theorem as given, Theorem 1 becomes really easy: If there are at most $|E|$ rivers from $s$ to $t$, at least one of them has to carry at least the average flow, which is at least $\text{opt}/|E|$.
Now, how would one prove the flow decomposition theorem? Assume you have some $s$-$t$-flow, then select some edge $e$ such that there is nonzero flow over $e$, say a flow of 10 units/second for example. Now consider the directed graph $H$ consisting out of all edges over which some water flows and the edge directed into the direction which the water flows. Then you can extend $e$ either into a directed $s$-$t$-path or a directed cycle $P$ in $H$ (because the water must be coming from somewhere, right?). Now, the flow along the edges of $P$ is not necessarily 10 units/second everywhere, but there is some bottleneck edge in $P$, say with a flow of $c$ units/second. Then you can mentally "subtract" a flow along $P$ of value $c$. Note that now the bottleneck edge disappeared from $H$, so you can proceed by induction and are guaranteed to take $|E|$ steps at most.
Now, of course not every theorem has an intuitive explanation like this. Overall, intuition in mathematics is very hard to obtain and takes a lot of time and is different for everyone. In particular, exercises help a lot with it, in my opinion.