I've been trying to understand how backpropagation works, so I came up with a very simple model that I want to optimize:
$f_{p}(x) = p x$
For some parameter $p$.
My toy training data look like the following:
$X = \{(1, 1), (2, 2), (3,3)\}$
Therefore $p$ should obviously end up being $1$.
I'm using a very simple loss function:
$E(X, p) = \frac 12 (\hat{y} - f_p(x))^2$
The update of $p$ at training step $t$ with learning rate $\alpha$ is defined as:
$p^{t+1} = p^t - \alpha \frac{\partial E (X, p^t)}{\partial p}$
I'm unsure how I should compute the above; here's my attempt at the first step so far:
$$\begin{eqnarray} p^{0} &=& -1 \\ p^{1} &=& -1 - \alpha \frac{\partial E(X,p^0)}{\partial p} \\ \frac{\partial E(X,p^0)}{\partial p} &=& \frac{\partial}{\partial p} \frac 12 (\hat y - f_{p}(x))^2 \end{eqnarray}$$
But I'm kind of stuck here. Any help would be much appreciated!
\begin{align*} \nabla &= \frac{\partial{E(X, p)}}{\partial p} = \frac{\partial ~0.5(\hat y - f_p(x))^2}{\partial p} = \frac{\partial ~0.5(\hat y - px)^2}{\partial p} \end{align*}
Here, let's set $q \equiv (\hat y - px)$, so that we can write
$$ \nabla = \frac{\partial ~0.5(\hat y - px)^2}{\partial p} = \frac{\partial 0.5q^2}{\partial p} = 0.5 \frac{\partial{q^2}}{{\partial p}} $$
We now invoke the chain rule to get:
$$ \nabla = 0.5 \frac{\partial{q^2}}{{\partial p}} = 0.5 \frac{\partial{q^2}}{{\partial q}} \frac{\partial{q}}{{\partial p}} = 0.5 \cdot 2q \cdot \frac{dq}{dp} = q \frac{dq}{dp} $$
We now evaluate $\frac{dq}{dp}$ as:
\begin{align*} \frac{dq}{dp} &= \frac{\partial (\hat y - px)}{\partial p} \\ &= \frac{\partial \hat y}{\partial p} - \frac{\partial (px)}{\partial p}\\ &= 0- x\frac{\partial p}{\partial p} \\ &= - x \cdot 1 = -x \end{align*}
This gives us the full expression:
$$ \nabla = q \frac{dq}{dp} = -qx = (\hat y - px) \cdot (-x) $$
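With this per-sample gradient in hand, your first update step can be completed. Assuming the loss over the data set is the sum of the per-sample losses, plugging in $p^0 = -1$ and your three data points gives:

$$ \frac{\partial E(X, p^0)}{\partial p} = \sum_{(x, \hat y) \in X} (\hat y - p^0 x)(-x) = (1 + 1)(-1) + (2 + 2)(-2) + (3 + 3)(-3) = -28 $$

so $p^1 = -1 - \alpha \cdot (-28) = -1 + 28\alpha$.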
The important thing to remember is that we assume that $x, \hat y$ are independent of the value of $p$ (the parameter), since that's the data. Therefore:
$$ \frac{\partial x}{\partial p} = 0 \qquad \frac{\partial \hat y}{\partial p} = 0 $$
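To see the whole update loop in action, here is a minimal sketch in plain Python using the gradient derived above, summed over the data set. The learning rate of $0.1$ and the 20-step budget are arbitrary choices for this toy problem:

```python
# Gradient descent for f_p(x) = p*x on the toy data,
# using the analytic gradient dE/dp = (y_hat - p*x) * (-x) derived above.

X = [(1, 1), (2, 2), (3, 3)]  # (x, y_hat) pairs
p = -1.0                      # initial parameter p^0
alpha = 0.1                   # learning rate (arbitrary choice)

for step in range(20):
    # Sum the per-sample gradients over the whole data set.
    grad = sum((y_hat - p * x) * (-x) for x, y_hat in X)
    p = p - alpha * grad

print(p)  # converges toward 1
```

Because the gradient here is linear in $p$, each step multiplies the error $|p - 1|$ by a constant factor, so $p$ converges to $1$ quickly.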
Also note that in "real world" implementations of automatic differentiation, one does not compute the derivatives symbolically. Rather, techniques such as reverse-mode automatic differentiation record the intermediate values of the forward computation and then apply the chain rule to numeric values, propagating derivatives backward from the output.
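To make that concrete, here is a minimal hand-written sketch (not a real autodiff library) of how reverse-mode differentiation would evaluate $\frac{dE}{dp}$ for this model: the forward pass stores intermediates, and the backward pass multiplies local derivatives via the chain rule, without ever forming a symbolic expression for the gradient:

```python
# Hand-rolled reverse-mode differentiation for E = 0.5*(y_hat - p*x)^2.
# This mirrors the derivation above: dE/dp = dE/dq * dq/dp = q * (-x).

def grad_E(p, x, y_hat):
    # Forward pass: compute and store intermediate values.
    q = y_hat - p * x      # q = y_hat - p*x
    E = 0.5 * q * q        # loss value (computed, but not needed below)

    # Backward pass: accumulate numeric derivatives from the output inward.
    dE_dq = q              # d(0.5*q^2)/dq = q
    dq_dp = -x             # d(y_hat - p*x)/dp = -x
    return dE_dq * dq_dp   # chain rule: dE/dp = q * (-x)

print(grad_E(-1.0, 2, 2))  # q = 2 - (-1*2) = 4, so gradient = 4 * (-2) = -8.0
```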
Here is a good reference on the topic.