Effect of perturbation on softmax


Let's say we have the following function $$S_i(\textbf{x},T)=\frac{\exp(f_i(\textbf{x})/T)}{\sum_{j=1}^N\exp(f_j(\textbf{x})/T)}$$ where $f_i(\textbf{x})$, $i=1,\dots,N$, are the outputs of a neural network. We then perturb the input: $$\tilde{\textbf{x}}=\textbf{x}-\epsilon\ \text{sgn}\left(-\nabla_\textbf{x}\log S_i(\textbf{x},T) \right)$$ The claimed result is: $$\log S_i(\tilde{\textbf{x}},T)=\log S_i(\textbf{x},T)+\epsilon||\nabla_\textbf{x}\log S_i(\textbf{x},T)||_1+o(\epsilon)$$ The paper says that this is the first-order Taylor expansion, but I cannot figure out how it has been derived. Any hint? Thanks.
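For what it's worth, the identity can be checked numerically. Below is a minimal sketch, assuming a toy linear model $f(\textbf{x})=W\textbf{x}$ standing in for the network (the choices of $W$, $\textbf{x}$, $T$, $\epsilon$, and the class index $i$ are made up for illustration); it compares $\log S_i(\tilde{\textbf{x}},T)$ against $\log S_i(\textbf{x},T)+\epsilon\|\nabla_\textbf{x}\log S_i\|_1$ and the two agree up to an $o(\epsilon)$ error:

```python
import numpy as np

# Numerical sanity check (not from the paper) of the claimed identity
#   log S_i(x~, T) = log S_i(x, T) + eps * ||grad_x log S_i(x, T)||_1 + o(eps)
# for a toy linear "network" f(x) = W @ x. All constants here are arbitrary.

rng = np.random.default_rng(0)
N, d = 5, 8
W = rng.normal(size=(N, d))  # toy linear model in place of a real network
x = rng.normal(size=d)
T = 2.0                      # softmax temperature
i = 0                        # class index of interest
eps = 1e-4                   # perturbation magnitude

def log_softmax_i(x):
    """log S_i(x, T) for the toy model, computed stably."""
    z = W @ x / T
    return z[i] - (z.max() + np.log(np.sum(np.exp(z - z.max()))))

# Analytic gradient: grad_x log S_i = (W[i] - sum_j S_j W[j]) / T
z = W @ x / T
S = np.exp(z - z.max())
S /= S.sum()
grad = (W[i] - S @ W) / T

# x~ = x - eps * sgn(-grad) = x + eps * sgn(grad),
# so the first-order term is eps * grad . sgn(grad) = eps * ||grad||_1
x_tilde = x + eps * np.sign(grad)

lhs = log_softmax_i(x_tilde)
rhs = log_softmax_i(x) + eps * np.linalg.norm(grad, 1)
print(abs(lhs - rhs))  # residual is O(eps^2), far smaller than eps
```

The key step the code mirrors is that the increment $\tilde{\textbf{x}}-\textbf{x}=\epsilon\,\text{sgn}(\nabla_\textbf{x}\log S_i)$ dotted with the gradient collapses to the $\ell_1$ norm.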