I am learning the classical definition of the Green's function from Hunter's *Applied Analysis*.
Consider a second-order ordinary differential operator $A$ of the form $$Au=au''+bu'+cu,$$ where $a$, $b$, and $c$ are sufficiently smooth functions on $[0,1]$.
Now consider the Dirichlet boundary value problem for the second-order differential operator $A$ defined above: $$Au=f,\qquad u(0)=u(1)=0,\tag{10.9}$$ where $f:[0,1]\to{\mathbb C}$ is a given continuous function.
The author gives a heuristic discussion using the Dirac delta function: the Green's function $g(x,y)$ associated with the boundary value problem (10.9) is the solution of the following problem: $$Ag(x,y)=\delta(x-y),\qquad g(0,y)=g(1,y)=0.\tag{10.13}$$ He then reformulates $(10.13)$ in classical, pointwise terms. The book says that we want $g(x,y)$ to satisfy the homogeneous ODE (as a function of $x$) when $x\neq y$, and **we want the jump in $a(x)g_x(x,y)$ across $x=y$ to equal one in order to obtain a delta function after taking a second $x$-derivative**. We therefore make the following definition:†
† A function $g:[0,1]\times[0,1]\to{\mathbb C}$ is a Green's function for (10.9) if it satisfies the following conditions.
(a) The function $g(x,y)$ is continuous on the square $0\leq x,y\leq 1$, and twice continuously differentiable with respect to $x$ on the triangles $0\leq x\leq y\leq 1$ and $0\leq y\leq x\leq 1$, meaning that the partial derivatives exist in the interiors of the triangles and extend to continuous functions on the closures. The left and right limits of the partial derivatives on the line $x=y$ are not required to be equal, however.
(b) The function $g(x,y)$ satisfies the ODE with respect to $x$ and the boundary conditions: $$\begin{align} Ag&=0\qquad \text{in}~0<x<y<1~\text{and}~0<y<x<1,\\ g(0,y)&=g(1,y)=0\qquad\text{for}~0\leq y\leq 1. \end{align}$$
(c) The jump in $g_x$ across the line $x=y$ is given by $$g_x(y^+,y)-g_x(y^-,y)=\frac{1}{a(y)},$$ where the subscript $x$ denotes a partial derivative with respect to the first variable in $g(x,y)$, and $$g_x(y^+,y)=\lim_{x\to y^+}g_x(x,y),\qquad g_x(y^-,y)=\lim_{x\to y^-}g_x(x,y).$$
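For concreteness, here is a quick example of my own (not from the book): take the simplest operator $Au=u''$, i.e. $a\equiv 1$ and $b\equiv c\equiv 0$. Then $$g(x,y)=\begin{cases}x(y-1), & 0\le x\le y,\\ y(x-1), & y\le x\le 1,\end{cases}$$ satisfies all three conditions: it is continuous (both branches equal $y(y-1)$ at $x=y$), it solves $g_{xx}=0$ off the diagonal and vanishes at $x=0$ and $x=1$, and $g_x$ jumps from $y-1$ to $y$ across $x=y$, a jump of $1=1/a(y)$.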
The words in bold,

> ...we want the jump in $a(x)g_x(x,y)$ across $x=y$ to equal one in order to obtain a delta function after taking a second $x$-derivative,

refer to condition (c) in the definition above.
Here is my question:
How can one get $Ag(x,y)=\delta(x-y)$ from $$g_x(y^+,y)-g_x(y^-,y)=\frac{1}{a(y)}\,?$$

**Added:**
My confusion is that I don't know what the words in bold mean. In the end we want $$Ag(x,y)=\delta(x-y),$$ but what is the relation between "taking a second $x$-derivative" of $a(x)g_x(x,y)$ and $Ag(x,y)$?
Let $f:(a,b]\to\mathbb{R}$ and $g:[b,c)\to\mathbb{R}$ be $C^1$-functions. In the distributional sense, the derivative of $F:(a,c)\to\mathbb{R}$ defined by $$F(x):=\begin{cases}f(x) & x<b \cr g(x) & x>b\end{cases}$$ equals $\tilde F + (g(b)-f(b))\delta_b$, where $\tilde F$ is the classical derivative $$\tilde F(x):=\begin{cases}f'(x) & x<b \cr g'(x) & x>b.\end{cases}$$ In other words, if a function has a jump, then its distributional derivative contains a delta distribution whose weight is the height of the jump, in addition to the classical derivative.
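For completeness, here is the standard verification (a short computation I am adding, following the usual distributional definition): pair $F$ with a test function $\varphi\in C_c^\infty((a,c))$ and integrate by parts on each piece, using $\varphi(a)=\varphi(c)=0$:
$$\begin{align}
\langle F',\varphi\rangle
&= -\int_a^c F(t)\,\varphi'(t)\,dt
 = -\int_a^b f\,\varphi'\,dt - \int_b^c g\,\varphi'\,dt\\
&= \Big(\int_a^b f'\varphi\,dt - f(b)\varphi(b)\Big)
 + \Big(\int_b^c g'\varphi\,dt + g(b)\varphi(b)\Big)\\
&= \langle \tilde F,\varphi\rangle + \big(g(b)-f(b)\big)\varphi(b)
 = \big\langle \tilde F + (g(b)-f(b))\delta_b,\ \varphi\big\rangle.
\end{align}$$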
See also my answer to this question to get a very quick overview on distributions, and what it means for a function to have a distribution or a measure as its derivative (note that the delta distribution is actually a measure).
Edit: Here is a heuristic reason why this is true. The function $F$ is defined piecewise by $f$ on the left side of $b$ and by $g$ on the right side of $b$ (the value at $b$ itself is not important; you can think of the graph as having a vertical segment at $b$), but the two pieces don't fit together at $b$. What could the derivative of $F$ reasonably be?

If you just ignore the discontinuity at $b$ and differentiate $f$ and $g$ separately, you get $\tilde F$, but you see that it's not a good candidate for a derivative as soon as you test the fundamental theorem of calculus: let $x\in (a,b)$ and $y\in (b,c)$; then $$\int_x ^y \tilde F(t)\, dt=\int_x ^b f'(t)\, dt + \int_b ^y g'(t)\, dt=f(b)-f(x)+g(y)-g(b),$$ where it should be $g(y)-f(x)$. So we are off by $g(b)-f(b)$, which is the height of the jump, and we correct this by adding that number times $\delta_b$. Here $\delta_b$ has the property that it is zero outside of $b$, and $$\int_x ^y \delta_b(t)\, dt=1.$$ Of course such a function does not exist; that's why this is heuristic. Making it rigorous requires the theory of distributions.
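You can even see this numerically. Below is a small sketch of my own (the functions and the point $b=1$ are just an arbitrary choice for illustration): integrating the naive derivative $\tilde F$ across the jump falls short of $g(y)-f(x)$ by exactly the jump height.

```python
# Toy example (my own choice, not from the text above): b = 1 and the pieces
#   f(x) = x**2   on (0, 1],   g(x) = x + 3   on [1, 2),
# so the jump height is g(1) - f(1) = 4 - 1 = 3.

def F_tilde(t):
    """Classical derivative of the piecewise function, ignoring the jump at b = 1."""
    return 2 * t if t < 1 else 1.0

def integrate(func, lo, hi, n=100_000):
    """Midpoint rule -- crude, but accurate enough for this check."""
    h = (hi - lo) / n
    return sum(func(lo + (k + 0.5) * h) for k in range(n)) * h

x, y = 0.5, 1.5
classical = integrate(F_tilde, x, y)   # integral of F~ over [x, y], ~ 1.25
true_diff = (y + 3) - x**2             # g(y) - f(x) = 4.25
jump = (1 + 3) - 1**2                  # g(b) - f(b) = 3

print(classical)               # ~ 1.25
print(true_diff - classical)   # ~ 3.0: the missing mass is the jump height
```

The discrepancy `true_diff - classical` is precisely the weight that the term $(g(b)-f(b))\delta_b$ restores.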