Context
The context of this problem begins with the variational principle and Lagrange's equations of motion, as well as the principle of least action [1,2]. However, the problem can be cast more generally in terms of the branch of mathematical physics called the calculus of variations [3].
In the standard first treatment found in [1,2,3], we begin with a functional $J$ (often $S$ is used for the action functional) and an unknown function $y$. We write $y$ as $$y = y(x,\alpha) = y(x,0) + \alpha\,\eta{(x)}.$$ We then place restrictions on $\eta$. The first set of restrictions is essentially a pair of boundary conditions, given by the equations $$\eta{(x_1)}=\eta{(x_2)}=0\,.$$ To this structure, we add the condition for an extreme value, which is that $$ \left[\frac{\partial J}{\partial \alpha} \right]_{\alpha = 0} = 0\,. $$ Running through the calculus of variations, we arrive at an Euler equation. Physicists are probably most familiar with the form of the Euler equation known as the Euler-Lagrange equations of motion.
In a typical course in mechanics, the calculus of variations comes up a second time in the context of the principle of least action. This treatment is substantially similar to the first, with one difference: there is no restriction on the variation at the second boundary, $x_2$. Running through the calculus of variations without this restriction on the second boundary results in the following. $$ \frac{\partial J}{\partial \alpha} = \left[\eta(x) \frac{\partial f}{\partial y_x}\right]_{x_2}. $$
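The step behind this result can be sketched as follows, assuming the functional has the standard form $J(\alpha)=\int_{x_1}^{x_2} f(y,y_x,x)\,dx$ with $y_x=\partial y/\partial x$ (the integrand is not written out above, so this form is an assumption based on the usual treatment in [3]). Differentiating under the integral sign and integrating by parts gives $$ \frac{\partial J}{\partial \alpha} = \int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y}-\frac{d}{dx}\frac{\partial f}{\partial y_x}\right)\eta(x)\,dx + \left[\eta(x)\frac{\partial f}{\partial y_x}\right]_{x_1}^{x_2}. $$ The integral vanishes when $y(x,0)$ satisfies the Euler equation, and the lower boundary term vanishes because $\eta(x_1)=0$, which leaves only the contribution at $x_2$.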
I understand from [3] that occasionally (as in [2]) we see $$\delta{J} = \alpha \left[\frac{\partial J}{\partial \alpha}\right]_{\alpha = 0}.$$ So, in this case, I have that $$ \delta{J} =\alpha \left[ \frac{\partial J}{\partial \alpha}\right]_{\alpha = 0} = \alpha \left( \left[\eta(x) \frac{\partial f}{\partial y_x}\right]_{x_2} \right)_{\alpha = 0}. \tag{10}$$
In the case of the principle of least action, we have an analogue of Equation (10). Now, $J$ is the action $S$; $f$ the Lagrangian $L$; the variables are re-lettered as
\begin{equation}
x \to t
\quad \text{and}\quad
y\to q;
\end{equation}
the endpoint variation is $\alpha\,\eta(t_2) = \delta{q}(t_2) = \delta{q}$; and $\frac{\partial L}{\partial q_t} $ is the generalized momentum $p$. We have that
\begin{equation}
\delta{S} = \alpha \left( \frac{\partial S}{\partial \alpha}\right)_{\alpha = 0} = \left( p\,\delta{q} \right)_{\alpha = 0}.
\end{equation}
Landau makes the following proposition [2]:
From this relation it follows that the partial derivative of the action with respect to the [co-ordinate is] equal to the corresponding [momentum]: $$ \frac{\partial S}{\partial q} = p.\tag{43.3}$$
I do not see how this proposition is true.
Question
Given that \begin{equation} \delta{S} = \left( p\,\delta{q} \right)_{\alpha = 0}, \end{equation} prove that $$ \frac{\partial S}{\partial q} = p.$$
Bibliography
[1] Goldstein, 3rd Ed., p. 356.
[2] Landau, Volume 1, 3rd Ed., p. 138.
[3] Arfken, 5th Ed., p. 1018.
OP's main issue is possibly related to the fact that one must distinguish between the off-shell action functional $$ I[q;t_i,t_f]~:=~ \int_{t_i}^{t_f}\! {\rm d}t \ L(q(t),\dot{q}(t),t), \tag{A} $$ and the Dirichlet on-shell action function $$ S(q_f,t_f;q_i,t_i)~:=~I[q_{\rm cl};t_i,t_f], \tag{B} $$
where $q_{\rm cl}:[t_i,t_f] \to \mathbb{R}$ is the extremal/classical path, which satisfies the Euler-Lagrange (EL) equation $$\frac{\delta I}{\delta q} ~:=~\frac{\partial L}{\partial q} - \frac{\mathrm d}{\mathrm dt} \frac{\partial L}{\partial \dot{q}}~\approx~ 0,\tag{C} $$ with the Dirichlet boundary conditions $$ q(t_i)~=~q_i \qquad \text{and}\qquad q(t_f)~=~q_f.\tag{D}$$ This distinction is only mentioned in words in a paragraph below eq. (43.1) on p. 138.
Once this distinction between eqs. (A) & (B) is made clear, the proof of eq. (43.3) is relatively straightforward [as indicated in Ref. [LL] around eqs. (2.5) & (43.2)] with some extra assumptions:
1. The classical path is unique and exists for each set of Dirichlet boundary conditions (D),
2. The classical path is uniformly continuous with respect to changes of the Dirichlet boundary conditions (D).
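A sketch of that argument, in the notation of eqs. (A)-(D), under the two assumptions above and keeping $t_i$, $t_f$ and $q_i$ fixed (the symbols $\delta q_{\rm cl}$, $\delta q_f$ and $p_f$ are introduced here for illustration only): shifting the final position $q_f \to q_f+\delta q_f$ changes the classical path by some $\delta q_{\rm cl}(t)$ with $\delta q_{\rm cl}(t_i)=0$ and $\delta q_{\rm cl}(t_f)=\delta q_f$. To first order, $$ \delta S ~=~ I[q_{\rm cl}+\delta q_{\rm cl};t_i,t_f]-I[q_{\rm cl};t_i,t_f] ~=~ \int_{t_i}^{t_f}\! {\rm d}t \ \frac{\delta I}{\delta q}\,\delta q_{\rm cl} + \left[\frac{\partial L}{\partial \dot{q}}\,\delta q_{\rm cl}\right]_{t_i}^{t_f} ~\approx~ p_f\,\delta q_f, $$ where the integral vanishes by the EL equation (C) and $p_f:=\left.\frac{\partial L}{\partial \dot{q}}\right|_{t=t_f}$ is evaluated on the classical path. Since the left-hand side is the change of the function (B) under a change of its argument $q_f$ alone, one reads off $$ \frac{\partial S}{\partial q_f}~=~p_f, $$ which is eq. (43.3).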
For more details, see e.g. eq. (11) in my Phys.SE answer here.
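As a numerical sanity check of eq. (43.3), here is a minimal sketch for a one-dimensional harmonic oscillator, $L=\tfrac{m}{2}\dot{q}^2-\tfrac{m\omega^2}{2}q^2$. The choice of system, the parameter values, and all function names are illustrative assumptions, not taken from Ref. [LL]. The code evaluates the off-shell functional (A) on the classical path to obtain the on-shell function (B), differentiates it with respect to $q_f$ by a central finite difference, and compares the result with the final momentum.

```python
import math

# Illustrative mass and angular frequency (assumed values).
M, OMEGA = 1.0, 2.0


def classical_path(q_i, q_f, t_i, t_f):
    """Return (q, qdot) for the oscillator path with q(t_i)=q_i and q(t_f)=q_f.

    The path is unique only if sin(OMEGA * (t_f - t_i)) != 0.
    """
    s = math.sin(OMEGA * (t_f - t_i))

    def q(t):
        return (q_i * math.sin(OMEGA * (t_f - t))
                + q_f * math.sin(OMEGA * (t - t_i))) / s

    def qdot(t):
        return OMEGA * (-q_i * math.cos(OMEGA * (t_f - t))
                        + q_f * math.cos(OMEGA * (t - t_i))) / s

    return q, qdot


def on_shell_action(q_i, q_f, t_i, t_f, n=2000):
    """Integrate L along the classical path with the composite Simpson rule.

    n is the number of subintervals and must be even.
    """
    q, qdot = classical_path(q_i, q_f, t_i, t_f)

    def lagrangian(t):
        return 0.5 * M * qdot(t) ** 2 - 0.5 * M * OMEGA ** 2 * q(t) ** 2

    h = (t_f - t_i) / n
    total = lagrangian(t_i) + lagrangian(t_f)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * lagrangian(t_i + k * h)
    return total * h / 3


# Boundary data (assumed values); OMEGA * (t_f - t_i) = 1, so sin(...) != 0.
q_i, q_f, t_i, t_f = 0.3, 1.1, 0.0, 0.5
eps = 1e-5

# Central finite difference of S with respect to the final position q_f.
dS_dqf = (on_shell_action(q_i, q_f + eps, t_i, t_f)
          - on_shell_action(q_i, q_f - eps, t_i, t_f)) / (2 * eps)

# Final momentum p = m * qdot evaluated on the classical path at t_f.
_, qdot = classical_path(q_i, q_f, t_i, t_f)
p_f = M * qdot(t_f)

print("dS/dq_f =", dS_dqf)
print("p_f     =", p_f)
```

The two printed numbers should agree up to discretization and round-off error. Note that the derivative is taken of the function $S(q_f,t_f;q_i,t_i)$, i.e. each evaluation uses a different classical path, which is exactly the distinction between (A) and (B).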
References: