In Section 12.3 of *Understanding Machine Learning: From Theory to Algorithms*, it is stated that the hinge loss, $\ell^{\text{hinge}}(w) = \max\{0, 1 - y\langle w, x\rangle\}$, is convex. The justification given is Claim 12.5, which states that a pointwise maximum of convex functions is itself convex. So it would suffice to show that both $0$ and $1 - y\langle w, x\rangle$ are convex (as functions of $w$) in order to show that the hinge loss is convex.
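As a numerical sanity check (my own experiment in Python/NumPy, not from the book), the defining inequality of convexity, $\ell(\alpha u + (1-\alpha)v) \le \alpha\,\ell(u) + (1-\alpha)\,\ell(v)$, does seem to hold for the hinge loss at randomly sampled points; the example $(x, y)$ below is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def hinge(w, x, y):
    """Hinge loss l(w) = max(0, 1 - y * <w, x>), viewed as a function of w."""
    return max(0.0, 1.0 - y * np.dot(w, x))

# Fix an arbitrary example (x, y); w is the variable of the function.
x = rng.normal(size=5)
y = 1.0

# Convexity: l(a*u + (1-a)*v) <= a*l(u) + (1-a)*l(v) for all u, v and a in [0, 1].
for _ in range(10_000):
    u = rng.normal(size=5)
    v = rng.normal(size=5)
    a = rng.uniform()
    lhs = hinge(a * u + (1 - a) * v, x, y)
    rhs = a * hinge(u, x, y) + (1 - a) * hinge(v, x, y)
    assert lhs <= rhs + 1e-9  # allow tiny floating-point slack
print("convexity inequality held at all sampled points")
```

Of course this is not a proof, which is why I am asking about the second argument of the max below.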
It is clear to me why $0$ is convex. However, I don't understand why $1 - y\langle w, x\rangle$ is convex. Here is what I considered. Lemma 12.3 states that a function $f$ is convex if and only if $f'$ is monotonically nondecreasing; but $1 - y\langle w, x\rangle$ is a function of the multivariable vector $w$, so it has no single-variable derivative and Lemma 12.3 does not apply. There is also Claim 12.4, which states that convexity of $g$ implies convexity of $f(w) = g(\langle w, x\rangle + y)$. However, $1 - y\langle w, x\rangle$ is of a different form than $f(w) = g(\langle w, x\rangle + y)$, so Claim 12.4 does not seem to help either.
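For what it's worth, I ran the same kind of numerical check on $1 - y\langle w, x\rangle$ alone (again my own experiment, not from the book), and the convexity inequality appears to hold with *equality* at every sampled point, which makes me suspect its linear structure in $w$ is the relevant fact:

```python
import numpy as np

rng = np.random.default_rng(1)

def f(w, x, y):
    """f(w) = 1 - y * <w, x>, the second argument of the max in the hinge loss."""
    return 1.0 - y * np.dot(w, x)

# Fix an arbitrary example (x, y); w is the variable of the function.
x = rng.normal(size=5)
y = -1.0

for _ in range(10_000):
    u = rng.normal(size=5)
    v = rng.normal(size=5)
    a = rng.uniform()
    lhs = f(a * u + (1 - a) * v, x, y)
    rhs = a * f(u, x, y) + (1 - a) * f(v, x, y)
    # The convexity inequality holds -- and numerically always with equality.
    assert abs(lhs - rhs) <= 1e-9
print("equality held at all sampled points")
```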
Do you know why $1 - y\langle w, x\rangle$ is convex?
This is from https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/understanding-machine-learning-theory-algorithms.pdf