Applying the chain rule to a line integral over gradients to get the Integrated Gradients method


I'm trying to understand how the authors of the paper 'Axiomatic Attribution for Deep Networks' determined the formula for Integrated Gradients.

The path function $\gamma = (\gamma_1, ..., \gamma_n): [0, 1] \rightarrow \mathbb{R}^n$ specifies a smooth path from the baseline $x' \in \mathbb{R}^n$ to the input $x \in \mathbb{R}^n$, where $\gamma(0)=x'$ and $\gamma(1)=x$. The function $F : \mathbb{R}^n \rightarrow [0, 1]$ is represented by a deep neural network in this case. The authors integrate the gradients over the path $\gamma$ with $\alpha \in [0,1]$ using the following line integral:

$$\text{PathIntegratedGrads}_i^\gamma(x)::=\int_{\alpha=0}^1 \frac{\partial F(\gamma(\alpha))}{\partial \gamma_i(\alpha)} \frac{\partial \gamma_i(\alpha)}{\partial \alpha} d\alpha. \tag{1}$$

Now, the authors consider the straight line path:

$$\gamma(\alpha) = x' + \alpha \cdot (x - x') \text{ for } \alpha \in [0, 1]. \tag{2}$$

Plugging the straight line path into the above formula for $\text{PathIntegratedGrads}_i^\gamma(x)$ they get:

$$\text{IntegratedGrads}_i(x)::= (x_i - x'_i)\int_{\alpha=0}^1 \frac{\partial F(x' + \alpha \cdot (x - x'))}{\partial x_i} d\alpha. \tag{3}$$

Since

$$\frac{\partial \gamma_i(\alpha)}{\partial \alpha} = (x_i - x'_i), \tag{4}$$

it follows that: $$\frac{\partial F(x' + \alpha \cdot (x - x'))}{\partial (x_i' + \alpha \cdot (x_i - x_i'))} \overset{!}{=} \frac{\partial F(x' + \alpha \cdot (x - x'))}{\partial x_i}. \tag{5}$$

However, applying the chain rule, I got:

$$\frac{\partial F(x' + \alpha \cdot (x - x'))}{\partial (x_i' + \alpha \cdot (x_i - x_i'))} = \frac{\partial F(x' + \alpha \cdot (x - x'))}{\partial x_i} \frac{1}{\alpha}. \tag{6}$$

Shouldn't Integrated Gradients then instead be:

$$\text{IntegratedGrads}_i(x)::= (x_i - x'_i)\int_{\alpha=0}^1 \frac{\partial F(x' + \alpha \cdot (x - x'))}{ \partial (x_i' + \alpha \cdot (x_i - x_i'))} d\alpha. \tag{7}$$

This is also the way it is implemented in the authors' GitHub repository:

# Scale input and compute gradients.
scaled_inputs = [baseline + (float(i)/steps)*(inp-baseline) for i in range(0, steps+1)]
predictions, grads = predictions_and_gradients(scaled_inputs, target_label_index)  # shapes: <steps+1>, <steps+1, inp.shape>

avg_grads = np.average(grads[:-1], axis=0)
integrated_gradients = (inp-baseline)*avg_grads # shape: <inp.shape>
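As a sanity check of this Riemann-sum scheme (my own sketch, not from the repository), one can substitute a simple analytic function, say $F(x) = \sum_i x_i^2$ with gradient $2x$, for the network. The attributions produced by the same left-Riemann averaging should then sum to $F(x) - F(x')$ (the completeness property the paper proves), and for this $F$ that only works out if the gradient is evaluated at the scaled input, with no extra $\alpha$ factor:

```python
import numpy as np

def F(x):
    # Simple analytic stand-in for the network: F(x) = sum_i x_i^2.
    return np.sum(x ** 2)

def grad_F(x):
    # Analytic gradient of F, dF/dx_i = 2 * x_i, evaluated at the
    # point it is given (here: the scaled input).
    return 2.0 * x

def integrated_gradients(inp, baseline, steps=1000):
    # Same left-Riemann-sum scheme as the authors' reference code.
    scaled_inputs = [baseline + (float(i) / steps) * (inp - baseline)
                     for i in range(0, steps + 1)]
    grads = np.array([grad_F(s) for s in scaled_inputs])
    avg_grads = np.average(grads[:-1], axis=0)
    return (inp - baseline) * avg_grads

x = np.array([1.0, 2.0, -3.0])
x_prime = np.zeros(3)
ig = integrated_gradients(x, x_prime)

# Completeness: the attributions should sum to F(x) - F(x').
print(ig.sum(), F(x) - F(x_prime))
```

With `steps=1000` the attribution sum matches $F(x) - F(x')$ up to the Riemann-sum discretization error, and each $\text{IG}_i$ approaches $x_i^2$ for the zero baseline.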

Did I go wrong somewhere or am I missing something?

EDIT: I'm still having some trouble following this. Let's say we have the path integral and plug in the straight line path; then we get Integrated Gradients:

$$ \text{IntegratedGrads}_i(x)::=\int_{\alpha=0}^1 \frac{\partial F(x' + \alpha \cdot (x - x'))}{\partial (x'_i + \alpha \cdot (x_i - x'_i))} \underbrace{\frac{\partial (x'_i + \alpha \cdot (x_i - x'_i))}{\partial \alpha}}_{(x_i - x'_i)} d\alpha.\tag{8} $$

Then I have a $\partial (x'_i + \alpha \cdot (x_i - x'_i))$ in the denominator of the integrand, whereas the paper only has a $\partial x_i$. Therefore, equation (8) is different from equation (3), which is the Integrated Gradients formula from the paper. When I rewrite the integrand of Integrated Gradients from the paper with the chain rule I get the following:

$$ \frac{\partial F(x' + \alpha \cdot (x - x'))}{\partial x_i} = \frac{\partial F(x' + \alpha \cdot (x - x'))}{\partial (x'_i + \alpha \cdot (x_i - x'_i))} \underbrace{\frac{\partial (x'_i + \alpha \cdot (x_i - x'_i))}{\partial x_i}}_\alpha. \tag{9} $$

This explains the additional $\alpha$ in (6), and why the LHS and RHS of equation (5) are not equal. (I changed the equality sign in equation (5) to emphasize that.) What am I missing, such that I get the formula of Integrated Gradients as stated in the paper?

Best answer:

Regarding the interpretation of the math in the paper: I think you have a $\partial (x'_i + \alpha \cdot (x_i - x'_i))$ term where the paper has a $\partial x_i$ term in the denominator of the LHS of equation (5) ("it follows that").

(I also wonder if there is some confusion about notation: the derivative is taken with respect to a variable, $x_i$, and then evaluated at a particular point, $x'_i + \alpha \cdot (x_i - x'_i)$.)

Regarding the implementation in the GitHub repository, I am not seeing the extra $\alpha$ term there. (If it helps, the derivative returned by the ML library (e.g. TensorFlow) corresponds to $\partial F(x' + \alpha \cdot (x - x'))/\partial x_i$ and not to $\partial F(x' + \alpha \cdot (x - x'))/\partial (x'_i + \alpha \cdot (x_i - x'_i))$.)
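To make that concrete, here is a small finite-difference sketch (using a hypothetical quadratic $F$, not the authors' code): the derivative of $F(\gamma(\alpha))$ with respect to $\alpha$ matches the gradient of $F$ evaluated at the scaled input $\gamma(\alpha)$, dotted with $(x - x')$, with no $\alpha$ or $1/\alpha$ factor anywhere.

```python
import numpy as np

def F(x):
    # Hypothetical test function F(x) = sum_i x_i^2; its gradient is 2x.
    return np.sum(x ** 2)

x = np.array([1.0, 2.0, -3.0])
x_prime = np.array([0.5, -1.0, 0.0])

def gamma(alpha):
    # Straight-line path gamma(alpha) = x' + alpha * (x - x').
    return x_prime + alpha * (x - x_prime)

alpha, eps = 0.3, 1e-6

# Finite-difference derivative of F(gamma(alpha)) with respect to alpha.
lhs = (F(gamma(alpha + eps)) - F(gamma(alpha - eps))) / (2 * eps)

# Chain rule: gradient of F *evaluated at the scaled input* gamma(alpha),
# dotted with dgamma/dalpha = (x - x'). No extra alpha factor appears.
grad_at_scaled_input = 2.0 * gamma(alpha)  # what an ML library would return
rhs = grad_at_scaled_input @ (x - x_prime)

print(lhs, rhs)
```

The two numbers agree to finite-difference precision, which is exactly what equation (1) integrates over $\alpha$.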

(If you want to do the complete derivation, first use the fundamental theorem of calculus, then the partial-derivative chain rule as in Case 1 here: http://tutorial.math.lamar.edu/Classes/CalcIII/ChainRule.aspx, and the result should match up.)
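A sketch of that complete derivation (my own working under the paper's definitions, not quoted from it): apply the fundamental theorem of calculus to $g(\alpha) = F(\gamma(\alpha))$ and then the multivariable chain rule:

$$F(x) - F(x') = \int_{0}^{1} \frac{d F(\gamma(\alpha))}{d\alpha}\, d\alpha = \int_{0}^{1} \sum_{i=1}^{n} \frac{\partial F}{\partial x_i}\big(\gamma(\alpha)\big)\, \frac{d \gamma_i(\alpha)}{d\alpha}\, d\alpha = \sum_{i=1}^{n} (x_i - x'_i) \int_{0}^{1} \frac{\partial F}{\partial x_i}\big(x' + \alpha \cdot (x - x')\big)\, d\alpha.$$

The $i$-th summand is exactly $\text{IntegratedGrads}_i(x)$ from equation (3): the partial derivative is taken with respect to the $i$-th input variable and then evaluated at the scaled input, so no extra $\alpha$ factor appears, and summing the attributions recovers $F(x) - F(x')$ (the completeness axiom).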