There are lots of methods of proof: direct proof, proof by contrapositive, proof by contradiction, proof by induction, proof by cases, computer proof... On the other hand, the process of mathematical discovery seems quite mysterious to me. Take induction as an example. In order to do induction, I need to be given the right conjecture first. Say I want to
Prove $$\sum_{k=1}^{n} k^2=\frac{n(n+1)(2n+1)}{6}$$
compared with
Find a general formula of $$\sum_{k=1}^{n} k^2$$
The latter is much harder. I would do the first one by induction, but the second one is a different story. More generally, how do mathematicians make discoveries? A historical example of a heuristic approach would be enough. Sorry if my question is too broad.
Often mathematicians make discoveries by working on an applied problem, and being confronted with a mathematical problem that is either new, or at least not well-known. Solving the required mathematics for an immediate applied problem can yield new mathematical results, and once this is done, there are often natural extensions/generalisations to the applied problem that can lead to broader mathematical problems, which have correspondingly broader results. If the mathematical problem turns out to arise in a lot of diverse applied problems (which is often the case) then the mathematical result becomes of direct interest.
The example you give in your question is a well-known mathematical problem, and it arises in a large number of applied problems in all sorts of applied mathematical fields. Although it is now well-known, once upon a time this was new, and had no known general formula. The formula for the sum of squares of the first $n$ natural numbers (called the square pyramidal numbers) was derived by Fibonacci in 1202 in his book Liber Abbaci (The Book of Calculation). A natural generalisation of the problem is to look at other positive integer powers, and this latter problem leads to a general form called the Bernoulli-Faulhaber formula. This general form of this summation for any positive integer power was published by Jacob Bernoulli in 1713 in his book Ars Conjectandi (The Art of Conjecturing), and was later independently discovered by Johann Faulhaber. In this particular case the formulae were given but a formal proof of correctness did not occur until the nineteenth century. There were many later extensions extending this sum to allow complex numbers, looking at the relationship with the Reimann-zeta function, etc.
As you can see from this example, mathematical discoveries may involve an initial discovery of a solution to some relatively narrow mathematical problem, and then later discovery of solutions to natural generalisations of that problem. This can occur in a gradual expansion as more and more extensions are made, and relationships between other results and mathematical objects are derived. Often an initial discovery is not made via strict proof, but may instead done with a heuristic demonstration, and formal proof might only come later (often from a different author).