I am looking for a proof that multi-class softmax logistic regression with maximum-likelihood estimation has a convex cost function.
In particular I am interested in showing the function:
$$ -\ln\Biggl(\frac{e^{{w_i}^T x}}{\sum_{j} e^{{w_j}^T x}}\Biggr) $$
is convex with respect to the weight vectors (I assume all weight vectors need to be considered jointly).
The function simplifies to: $$ \log \left(\sum_{j} e^{{w_j}^T x}\right) - {w_i}^T x.$$ Log-sum-exp is convex (see *Convex Optimization* by Boyd and Vandenberghe). Since each ${w_j}^T x$ is an affine function of the stacked weight vector, composing log-sum-exp with this affine map preserves convexity, and subtracting the linear term ${w_i}^T x$ preserves convexity as well, so the whole expression is jointly convex in the weight vectors.
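As a numerical sanity check (not part of the proof itself), one can also verify convexity directly: the Hessian of the loss with respect to the stacked weights works out to $(\operatorname{diag}(p) - pp^T) \otimes xx^T$, where $p$ is the softmax probability vector, and this should be positive semidefinite. A minimal sketch with arbitrary small sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
K, d = 4, 3                      # number of classes and features (arbitrary)
x = rng.normal(size=d)           # a random input vector
W = rng.normal(size=(K, d))      # random weight vectors, one row per class

# Softmax probabilities p_j = exp(w_j^T x) / sum_k exp(w_k^T x)
z = W @ x
p = np.exp(z - z.max())          # shift for numerical stability
p /= p.sum()

# Hessian of log-sum-exp(w_1^T x, ..., w_K^T x) w.r.t. the stacked weights;
# the linear term -w_i^T x contributes nothing to the Hessian.
H = np.kron(np.diag(p) - np.outer(p, p), np.outer(x, x))

# All eigenvalues should be nonnegative (up to floating-point tolerance).
eigs = np.linalg.eigvalsh(H)
print(eigs.min() >= -1e-10)
```

The Kronecker product of two positive semidefinite matrices is positive semidefinite, which is another way to see why this Hessian can never have a negative eigenvalue.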