It is well known that if $f:\mathbb{R}^n\to\mathbb{R}$ is a $C^1$ function on an open convex domain, then $f$ is convex if and only if $$f(y)\geq f(x)+\nabla f(x)^T(y-x),\ \forall x,y\in \text{dom}(f).$$
I suspect that the same relationship also holds for a $C^0$ non-smooth function whose gradient exists almost everywhere (e.g., a piecewise smooth function), and I would like to prove it. Currently, I am having difficulty modifying the gradient inequality above to account for the "almost everywhere" condition. Any references and counterexamples are very welcome.
If $\nabla f(x)$ only exists for almost all $x$, then you have to modify the inequality to take this into account. Assume $f$ is continuous and almost everywhere differentiable, and satisfies $$ f(y) \ge f(x) + \nabla f(x)^T(y-x) $$ for all $y$ and all $x$ such that $\nabla f(x)$ exists.
Take $x,y$ and $\lambda\in(0,1)$ such that $\nabla f(\lambda x + (1-\lambda )y)$ exists. Then $$ f(x) \ge f(\lambda x + (1-\lambda )y) + (1-\lambda) \nabla f(\lambda x + (1-\lambda )y)^T(x-y) $$ and $$ f(y) \ge f(\lambda x + (1-\lambda )y) - \lambda \nabla f(\lambda x + (1-\lambda )y)^T(x-y). $$ Multiplying the first inequality by $\lambda$ and the second by $1-\lambda$, then adding them, gives $$ \lambda f(x) + (1-\lambda)f(y) \ge f(\lambda x + (1-\lambda )y) $$ for all $x,y$ and $\lambda$ such that $\nabla f(\lambda x + (1-\lambda )y)$ exists.
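To spell out the addition step, abbreviate $\lambda x + (1-\lambda)y$ by $z$ (a shorthand used only here). The two inequalities come from applying the gradient inequality at $z$ with $$ x - z = (1-\lambda)(x-y), \qquad y - z = -\lambda(x-y), $$ and in the weighted sum the gradient terms cancel: $$ \begin{aligned} \lambda f(x) + (1-\lambda) f(y) &\ge \big(\lambda + (1-\lambda)\big) f(z) + \big(\lambda(1-\lambda) - (1-\lambda)\lambda\big)\nabla f(z)^T(x-y) \\ &= f(z). \end{aligned} $$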
Now let $x,y$ and $\lambda\in (0,1)$ be arbitrary, and set $z:=\lambda x + (1-\lambda) y$. The set of points where $\nabla f$ exists has full measure and is therefore dense, so for all $\epsilon>0$ there exists $z_\epsilon \in B_\epsilon(z)$ such that $\nabla f(z_\epsilon)$ exists. Set $x_\epsilon:= x+(z_\epsilon-z)$, $y_\epsilon:=y+(z_\epsilon-z)$; then $z_\epsilon = \lambda x_\epsilon + (1-\lambda)y_\epsilon$, and $x_\epsilon$, $y_\epsilon$ lie in the (open) domain for small $\epsilon$. By the previous step, this implies $$ f( \lambda x_\epsilon+ (1-\lambda)y_\epsilon ) \le \lambda f(x_\epsilon) + (1-\lambda)f(y_\epsilon). $$ Since $x_\epsilon\to x$, $y_\epsilon\to y$ and $z_\epsilon\to z$ as $\epsilon\searrow0$, passing to the limit gives $$ \lambda f(x) + (1-\lambda)f(y) \ge f(\lambda x + (1-\lambda )y) $$ by continuity of $f$.
(In the last step it is important to perturb $x$ and $y$. Perturbing only $\lambda$ is not enough: a line has measure zero in dimension $n>1$, so the line through $x$ and $y$ may contain no point at which $\nabla f$ exists.)
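As a numerical sanity check (not a proof), the argument can be tried on a concrete function. The sketch below assumes $f(x)=|x_1|+|x_2|$ on $\mathbb{R}^2$, which is continuous and differentiable wherever both coordinates are nonzero. The function choice and all names (`f`, `grad_f`, `perturbed_gap`, and so on) are illustrative assumptions, not part of the argument above.

```python
import random

def f(p):
    # Assumed example: the l1 norm on R^2 (continuous, non-smooth on the axes).
    return abs(p[0]) + abs(p[1])

def grad_f(p):
    # Gradient of the l1 norm; returns None where it does not exist.
    if p[0] == 0 or p[1] == 0:
        return None
    return (1.0 if p[0] > 0 else -1.0, 1.0 if p[1] > 0 else -1.0)

def gradient_inequality_holds(x, y, tol=1e-12):
    # Checks f(y) >= f(x) + grad f(x)^T (y - x) whenever grad f(x) exists.
    g = grad_f(x)
    if g is None:
        return True  # nothing is required at non-differentiable points
    linear = f(x) + g[0] * (y[0] - x[0]) + g[1] * (y[1] - x[1])
    return f(y) >= linear - tol

def convexity_gap(x, y, lam):
    # lam*f(x) + (1-lam)*f(y) - f(lam*x + (1-lam)*y); nonnegative for convex f.
    z = (lam * x[0] + (1 - lam) * y[0], lam * x[1] + (1 - lam) * y[1])
    return lam * f(x) + (1 - lam) * f(y) - f(z)

def perturbed_gap(x, y, lam, shift):
    # Shift x and y by the same vector, so z moves by that vector too.
    x_eps = (x[0] + shift[0], x[1] + shift[1])
    y_eps = (y[0] + shift[0], y[1] + shift[1])
    return convexity_gap(x_eps, y_eps, lam)

rng = random.Random(0)

def sample():
    return (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))

pairs = [(sample(), sample()) for _ in range(1000)]
all_gradient_ok = all(gradient_inequality_holds(x, y) for x, y in pairs)
all_convex_ok = all(convexity_gap(x, y, rng.random()) >= -1e-12 for x, y in pairs)

# A case where z = (0, 0) is a point of non-differentiability.
x, y, lam = (1.0, 1.0), (-1.0, -1.0), 0.5
gaps = [perturbed_gap(x, y, lam, (eps, eps / 2)) for eps in (1e-1, 1e-2, 1e-3)]
limit_gap = convexity_gap(x, y, lam)
```

Here `gaps` holds the convexity gap at the shifted points $x_\epsilon, y_\epsilon$, where $\nabla f(z_\epsilon)$ exists, and `limit_gap` is the gap at the original points, mirroring the limit $\epsilon\searrow0$ in the proof.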