In some numerical work I'm doing, I have a specific expression for a quantity $h$ that I'm looking for, but the expression involves several integrals over an unknown function $g(x)$. In other words, something like this: $$ h(x,g(x))= A(x)\int_{x-a}^x B(x^{\prime})g(x^{\prime})\,dx^{\prime}\left(\int_{x-a}^x C(x^{\prime})g(x^{\prime})\,dx^{\prime}\right)^{1/2} $$ where the prime on $x$ denotes the dummy integration variable (a prime anywhere else denotes differentiation). Fortunately, I also have a nonlinear equation for $g(x)$ that I can solve (say, $f(g(x))=0$), built the same way from multiple integrals over $g(x)$. The problem is that I'm not sure how to solve it, because I don't know which operations I'm allowed to perform on the expression. $x$ itself is not really an independent variable; it depends in a highly nonlinear way on input parameters (it's a position) and acts more as a label. In fact, we don't really view $h(x,g(x))$ or $g(x)$ as functions of $x$, but rather $h(g)$ as a function of $g$ (and other variables), with $g$ an independent variable, a physical quantity. They just 'happen' to depend on $x$, the position.
The question I have may seem like a stupid one, but I'm a bit mixed up right now. Say I want to solve $f(g(x))=0$ with a Newton-Raphson method, which means I need $f^{\prime}(g)=\frac{df(g(x))}{dg(x)}$: since $g(x)$ is treated as a variable, can I take the derivative with respect to $g(x)$ without having recourse to the chain rule? To me that feels like cheating. But in a much simpler analogue case I constructed a function $f(g(x))=0$ to solve with Newton-Raphson and computed $f^{\prime}(g)$ this way: $$ \frac{d}{dg(x)}\int_{x-a}^{x}A(x^{\prime})g(x^{\prime})\,dx^{\prime} = \int_{x-a}^{x}A(x^{\prime})\frac{dg(x^{\prime})}{dg(x)}\,dx^{\prime} = \int_{x-a}^{x}A(x^{\prime})\,dx^{\prime} $$ where the integral represents any term of $f(g)$. Since in this case $f(g)$ can be written as $f(g(x))=p(g(x))+g(x)=0$, I also rewrote $f(g(x))$ as $$ g(x)=-p(g(x)) $$ and solved it as a regular relaxation (fixed-point) problem. Both approaches gave exactly the same answer.
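To make the comparison concrete, here is a minimal sketch of the two routes on a made-up scalar example ($p$, its root, and all numbers below are my own illustration, not taken from the actual problem):

```python
# Hypothetical example: p(g) = 0.5*g - 1, so f(g) = p(g) + g = 1.5*g - 1,
# whose unique root is g = 2/3.
def relax(p, g0, tol=1e-12, max_iter=1000):
    """Plain relaxation (fixed-point) iteration g <- -p(g)."""
    g = g0
    for _ in range(max_iter):
        g_new = -p(g)
        if abs(g_new - g) < tol:
            return g_new
        g = g_new
    raise RuntimeError("relaxation did not converge")

p = lambda g: 0.5 * g - 1.0
g_relax = relax(p, 0.0)

# Newton-Raphson on f(g) = p(g) + g, with f'(g) = 1.5 here.
f = lambda g: p(g) + g
df = lambda g: 1.5
g_newton = 0.0
for _ in range(50):
    g_newton -= f(g_newton) / df(g_newton)

# Both iterations converge to the same root, g = 2/3.
```

This mirrors the observation above: the Newton route (using the plain derivative with respect to $g$) and the relaxation route land on the same answer.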
In another case, there is a complete working (i.e. converging) code that needs the quantity $$ \frac{dh(x,g(x))}{dg(x)} $$ where, for simplicity, $$ h(x,g(x))=a(x)g(x). $$ In that code this derivative is computed as $$ \frac{dh(x,g(x))}{dg(x)}=a(x). $$
Since it seems to work in the last two examples, why can we do this (if we really can)? Is there a rigorous justification, or is it some kind of trick peculiar to numerical problems? Or is this one of those gray areas a physicist would venture into?
I still have not understood your problem completely, so this is only a partial answer.
I believe you are interested in a function $g : I \rightarrow J$, but you do not have an explicit formula for $g$. Here $I$ and $J$ are real intervals. However, it is known that $y = g(x)$ satisfies an equation of the form $$f(x,y) = 0,$$ where $f : I \times J \rightarrow \mathbb{R}$. In this situation, you can certainly try to apply Newton's method. It takes the following form $$ y_{n+1} = y_n - \frac{f(x,y_n)}{\frac{\partial f}{\partial y}(x,y_n)},$$ where the initial value $y_0$ should be close to the target value $y=g(x)$ to ensure convergence. I want to stress that the derivative is merely the regular partial derivative of $f$ with respect to its second variable and that the chain rule is not involved here.
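As a sketch of this iteration in code, here is Newton's method in $y$ at fixed $x$; the function $f(x,y)=y^2-x$ and the starting point are hypothetical stand-ins, chosen so that $g(x)=\sqrt{x}$:

```python
def newton_in_y(f, df_dy, x, y0, tol=1e-12, max_iter=50):
    """Solve f(x, y) = 0 for y at fixed x by Newton's method.

    df_dy is the ordinary partial derivative of f with respect to
    its second argument; no chain rule enters anywhere.
    """
    y = y0
    for _ in range(max_iter):
        step = f(x, y) / df_dy(x, y)
        y -= step
        if abs(step) < tol:
            return y
    raise RuntimeError("Newton iteration did not converge")

# Hypothetical example: f(x, y) = y**2 - x, so g(x) = sqrt(x).
f = lambda x, y: y**2 - x
df_dy = lambda x, y: 2.0 * y

root = newton_in_y(f, df_dy, 2.0, 1.0)  # approximates sqrt(2)
```

Note that $x$ is carried along purely as a parameter, exactly in the spirit of treating it as a label.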
It is entirely possible that computing this derivative will be quite expensive in your specific case. It may be worthwhile to apply the secant method instead. It converges at a slower rate than Newton's method, but can be substantially faster in practice because each iteration requires only one function evaluation rather than a function evaluation plus a derivative evaluation. Ideally, you should combine the secant method with bisection to obtain a robust method.
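A minimal sketch of such a hybrid (a simplified Dekker-style combination, not a full Brent implementation): try a secant step from the current bracket, and fall back to bisection whenever the step leaves the bracket.

```python
import math

def secant_bisection(f, a, b, tol=1e-12, max_iter=100):
    """Find a root of f in [a, b], assuming f(a) and f(b) have
    opposite signs. One function evaluation per iteration."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    c = 0.5 * (a + b)
    for _ in range(max_iter):
        # Secant step through the bracket endpoints.
        if fb != fa:
            c = b - fb * (b - a) / (fb - fa)
        else:
            c = 0.5 * (a + b)
        # Fall back to bisection if the step leaves [a, b].
        if not (min(a, b) < c < max(a, b)):
            c = 0.5 * (a + b)
        fc = f(c)
        if abs(fc) < tol or abs(b - a) < tol:
            return c
        # Keep the sub-interval that still brackets the root.
        if fa * fc < 0:
            b, fb = c, fc
        else:
            a, fa = c, fc
    return c

# Hypothetical example: the root of cos(x) = x in [0, 1].
root = secant_bisection(lambda x: math.cos(x) - x, 0.0, 1.0)
```

In production code one would typically reach for a library routine (e.g. a Brent-type solver) rather than hand-rolling this, but the sketch shows the idea.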
You mentioned a specific example, that of $$h(x,g(x)) = a(x)g(x).$$ I much prefer to write $$h(x,y) = a(x)y.$$ With this notation all confusion evaporates and we have $$\frac{\partial h}{\partial y}(x,y) = a(x).$$ The notation you cited is common in physics texts. I object to it precisely because of the confusion it can generate.
EDIT: It seems prudent to mention the related case of $$k(x) = h(x,g(x)).$$ Then if $h = h(x,y)$ and $g = g(x)$ are differentiable, we have that $k=k(x)$ is differentiable, with \begin{align} k'(x) &= \frac{\partial h}{\partial x}(x,g(x))\cdot \frac{\partial x}{\partial x}(x) + \frac{\partial h}{\partial y}(x,g(x))\cdot \frac{\partial g}{\partial x} (x) \\ & = \frac{\partial h}{\partial x}(x,g(x)) + \frac{\partial h}{\partial y}(x,g(x))g'(x). \end{align}
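This total-derivative formula is easy to check numerically against a finite difference; the particular $h$, $g$, and evaluation point below are hypothetical choices for illustration only:

```python
import math

# Hypothetical instances: h(x, y) = x * y**2 and g(x) = sin(x),
# so k(x) = h(x, g(x)) = x * sin(x)**2.
h = lambda x, y: x * y**2
dh_dx = lambda x, y: y**2          # partial of h in its first slot
dh_dy = lambda x, y: 2.0 * x * y   # partial of h in its second slot
g, dg = math.sin, math.cos

x = 0.7
# Chain-rule formula: k'(x) = h_x(x, g(x)) + h_y(x, g(x)) * g'(x).
k_prime = dh_dx(x, g(x)) + dh_dy(x, g(x)) * dg(x)

# Central finite difference on k(x) = h(x, g(x)) as a check.
k = lambda t: h(t, g(t))
eps = 1e-6
fd = (k(x + eps) - k(x - eps)) / (2.0 * eps)
# k_prime and fd agree closely.
```

This is the situation where the chain rule genuinely enters, in contrast to the partial derivative $\partial h/\partial y$ used in the Newton iteration above.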