Where does the Lagrangian function come from? I know about gradient vectors and $\lambda$: $$ \nabla f = \lambda \nabla g $$
But from there what steps were done to get the Lagrangian function? (This one): $$ L\left(x,\ y,\ \lambda\right)=f\left(x,y\right)-\lambda\left(g\left(x,y\right)-c\right) $$
Why can the constant $c$ be included here?
I am a high school student trying to explain it, and the full story most probably goes over my head, so I am mostly looking for a reference proof I can cite in my bibliography. I can't find one; thank you for any help in that regard.

I’ll try to stick with your notation. You are minimizing $f(x,y)$ subject to the constraint that $g(x,y) = c$.
You can learn about Lagrange multipliers without mentioning the Lagrangian. Once we write down the optimality conditions $$ \nabla f(x,y) - \lambda \nabla g(x,y) = 0 \quad \text{and} \quad g(x, y) - c = 0 $$ someone clever might notice that if we introduce the function $$ L(x,y, \lambda) = f(x,y) - \lambda (g(x,y) -c) $$ then these equations can be written as $$ \frac{\partial L}{\partial x} = 0, \quad\frac{\partial L}{\partial y}=0, \quad \frac{\partial L}{\partial \lambda}= 0 $$ which looks kind of neat. This also answers your question about the constant $c$: including it costs nothing, because $c$ vanishes when we differentiate with respect to $x$ or $y$, and it is exactly what makes $\partial L/\partial \lambda = 0$ reproduce the constraint $g(x,y) = c$.
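To see those three equations in action, here is a minimal SymPy sketch on a concrete example I made up for illustration ($f = x^2 + y^2$ with the constraint $x + y = 1$); it is just one instance, not a proof:

```python
import sympy as sp

x, y, lam = sp.symbols('x y lambda')

# Hypothetical example: minimize f = x^2 + y^2 subject to g = x + y = c, c = 1.
f = x**2 + y**2
g = x + y
c = 1

# The Lagrangian L(x, y, lambda) = f - lambda*(g - c)
L = f - lam * (g - c)

# Setting all three partial derivatives of L to zero encodes
# grad f = lambda * grad g together with the constraint g = c.
eqs = [sp.diff(L, v) for v in (x, y, lam)]
sol = sp.solve(eqs, (x, y, lam), dict=True)

print(sol)  # the minimizer x = y = 1/2 with multiplier lambda = 1
```

Note that $\partial L/\partial \lambda = -(g - c)$, so the third equation is literally the constraint.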
Here is another way you might think of introducing the Lagrangian. One might try to enforce the constraint $g(x,y) = c$ by including a penalty term $-\lambda(g(x,y) - c)$ in the objective function. With this strategy, our new optimization problem is to minimize $$ \tag{1} f(x,y) - \lambda (g(x,y) - c) $$ with no constraints on $x$ and $y$. If we solve this problem and find that $g(x,y) < c$, then let’s increase the value of $\lambda$ a bit (which increases the penalty for having $g(x,y) < c$) and try again. We can hope to find a perfect value of $\lambda$ such that a point $(x,y)$ which minimizes (1) also satisfies $g(x,y) = c$.
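This "adjust $\lambda$ until the constraint holds" strategy can be sketched numerically. The example below uses the same made-up instance as above ($f = x^2 + y^2$, $g = x + y$, $c = 1$), where the unconstrained minimizer of (1) can be written down by hand, so we only have to iterate on $\lambda$; the step size $0.5$ is an arbitrary choice, not part of the theory:

```python
# For f(x, y) = x^2 + y^2 and penalty term -lam*(x + y - c), the
# unconstrained minimizer of (1) solves 2x = lam and 2y = lam.
def unconstrained_minimizer(lam):
    return lam / 2, lam / 2

c = 1.0
lam = 0.0
for _ in range(100):
    x, y = unconstrained_minimizer(lam)
    gap = c - (x + y)      # positive whenever g(x, y) < c
    if abs(gap) < 1e-10:
        break              # constraint (numerically) satisfied
    lam += 0.5 * gap       # increase lam while g(x, y) < c, decrease if g > c

print(lam, x, y)  # converges to lam = 1 and x = y = 1/2
```

The "perfect" $\lambda$ this loop finds is exactly the Lagrange multiplier from the optimality conditions; in optimization this outer iteration on $\lambda$ is known as dual ascent.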