Tikhonov regularization for a linear operator equation $Tx = y$, with $T: X \rightarrow Y$, means minimizing the least-squares problem
$$\begin{align*} \lVert Tx - y \rVert^2_Y + \alpha \lVert x\rVert^2_X \rightarrow \min_{x\in X} \end{align*}$$
This is equivalent to solving the regularized normal equations
$$\begin{align} (T^*T+\alpha I)x = T^*y, \end{align}$$
where $T^*$ is the adjoint operator of $T$.
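For concreteness, this can be checked numerically; the following is a minimal NumPy sketch (the matrix $T$, data $y$, and $\alpha$ are arbitrary made-up values) verifying that the solution of the regularized normal equations is a stationary point of the Tikhonov functional:

```python
import numpy as np

# Arbitrary small example: T as a 5x3 matrix, alpha > 0 (made-up values).
rng = np.random.default_rng(0)
T = rng.standard_normal((5, 3))
y = rng.standard_normal(5)
alpha = 0.1

# Solve the regularized normal equations (T^* T + alpha I) x = T^* y.
x = np.linalg.solve(T.T @ T + alpha * np.eye(3), T.T @ y)

# The gradient of ||Tx - y||^2 + alpha ||x||^2 (up to a factor of 2)
# vanishes at the minimizer:
grad = T.T @ (T @ x - y) + alpha * x
print(np.allclose(grad, 0))  # expected: True
```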
I have now been reading about the nonlinear case. Apparently, minimizing the analogous Tikhonov functional (with $T$ nonlinear) leads to the equation
$$\begin{align} T'(x)^* (T(x)-y) + \alpha x = 0 \quad \Rightarrow \quad T'(x)^*T(x)+\alpha x = T'(x)^*y \end{align}$$
Comparing the two expressions, I am wondering why the derivative of $T$ appears in the nonlinear case but not in the linear one.
If I plug a linear operator into the nonlinear expression, I feel it should reduce to the linear regularized normal equations. But there is definitely a difference between a linear operator and its derivative, which would be a constant operator.
What am I missing here?
I think you're confused by the notation. Let's say we have some transformation $f$ acting on $x$, i.e. $f(x)$.
The problem can be stated as:
$$ \min_x \,\{\lVert f(x)-y\rVert^2 + \lambda\lVert x\rVert^2\} $$
Taking the gradient w.r.t. $x$ (and dropping the common factor of $2$) yields: $$ \frac{1}{2}\frac{d\Phi}{dx} = f'(x)^*(f(x)-y)+\lambda x $$ Note the adjoint $f'(x)^*$: the Jacobian $f'(x)$ maps into the data space, so its adjoint is needed to map the residual back into $X$.
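This gradient formula can be verified numerically for a concrete nonlinear map. The sketch below (the choice $f(x) = A\sin(x)$ and all values are made up) compares the analytic gradient, which uses the adjoint of the Jacobian, against central finite differences:

```python
import numpy as np

# Hypothetical nonlinear map f: R^3 -> R^5, f(x) = A sin(x) (componentwise sin),
# with Jacobian f'(x) = A diag(cos(x)). All values are made up.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
y = rng.standard_normal(5)
lam = 0.1
x = rng.standard_normal(3)

def f(v):
    return A @ np.sin(v)

def jacobian(v):
    return A @ np.diag(np.cos(v))

def Phi(v):
    return np.sum((f(v) - y) ** 2) + lam * np.sum(v ** 2)

# Analytic gradient: 2 * (f'(x)^* (f(x) - y) + lambda x).
# The adjoint (transpose) maps the residual from R^5 back to R^3.
analytic = 2 * (jacobian(x).T @ (f(x) - y) + lam * x)

# Central finite differences for comparison.
eps = 1e-6
numeric = np.array([(Phi(x + eps * e) - Phi(x - eps * e)) / (2 * eps)
                    for e in np.eye(3)])
print(np.allclose(analytic, numeric, atol=1e-5))  # expected: True
```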
If $f$ is a linear transformation, then $$ f(x) = Tx $$ and hence $$ f'(x) = T $$ for every $x$: the derivative is the constant operator $T$, so $f'(x)^* = T^*$. Setting the gradient to zero then gives $$ T^*(Tx-y)+\lambda x = 0, $$ which in turn is again $$ (T^*T+\lambda I)x = T^*y $$
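Indeed, for a linear map the derivative at any point is $T$ itself, which is easy to confirm with finite differences (a minimal sketch; the example matrix and evaluation point are arbitrary):

```python
import numpy as np

# For a linear map f(x) = T x, the derivative f'(x) is the constant
# operator T itself, independent of the point x. Arbitrary example matrix:
rng = np.random.default_rng(2)
T = rng.standard_normal((4, 3))

def f(v):
    return T @ v

# Finite-difference Jacobian at a random point:
x0 = rng.standard_normal(3)
eps = 1e-6
J = np.column_stack([(f(x0 + eps * e) - f(x0 - eps * e)) / (2 * eps)
                     for e in np.eye(3)])

# The numerical Jacobian matches T, so f'(x)^* = T^* and the nonlinear
# normal equations reduce to the linear ones.
print(np.allclose(J, T))  # expected: True
```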