I want to calculate the Lipschitz constant of softmax with cross-entropy in the context of neural networks. If anyone can give me some pointers on how to go about it, I would be grateful.
Given a true label $Y=i$, the only non-zero element of the one-hot ground-truth vector is at the $i^{\text{th}}$ index, so the softmax-CE loss function can be written as:
$$ \mathrm{CE}(x) = - \log S_{i} (x) = -\log \left(\frac{e^{x_{i}}}{\sum_{j} e^{x_{j}}}\right) $$
That is, I am looking for a constant $L$ such that, for all logit vectors $x$ and $y$,
$$ \left| \log S_{i} (x) - \log S_{i} (y) \right| \leq L \, \| x - y \| $$
I would like to estimate the value of $L$ (or at least a tight upper bound on it). Thank you.
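In case it helps frame the question: since $\nabla_x \big({-\log S_i(x)}\big) = S(x) - e_i$, the Lipschitz constant (w.r.t. the Euclidean norm) equals $\sup_x \|S(x) - e_i\|_2$, which is bounded above by $\sqrt{2}$. Here is a small NumPy sketch I used to probe that bound empirically by sampling random logits; the sampling scale and class count are arbitrary choices, not part of any standard recipe:

```python
import numpy as np

def softmax(x):
    # numerically stable softmax
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def ce_grad(x, i):
    # gradient of -log S_i(x) w.r.t. x is softmax(x) - e_i
    g = softmax(x)
    g[i] -= 1.0
    return g

rng = np.random.default_rng(0)
best = 0.0
for _ in range(10_000):
    # scale 5.0 and 10 classes are arbitrary illustrative choices
    x = rng.normal(scale=5.0, size=10)
    best = max(best, np.linalg.norm(ce_grad(x, 0)))

# best is an empirical lower bound on L; theory gives sup ||S(x) - e_i|| < sqrt(2)
print(best)
```

With large-scale logits concentrated on a single wrong class, the gradient norm approaches (but never reaches) $\sqrt{2}$, which suggests $L = \sqrt{2}$ is the tight constant.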