The primal problem is $min_{w\in R^d}: P(w)$ where $P(w)=\frac{1}{n}\sum_{i=1}^n\phi_i(w^Tx_i)+\frac{\lambda}{2}||w||^2$.
The dual problem is $max_{\alpha\in R^n}: D(\alpha)$ where $D(\alpha)=\frac{1}{n}\sum_{i=1}^n-\phi_i^*(-\alpha_i)+\frac{\lambda}{2}\left \|\frac{1}{\lambda n}\sum_{i=1}^n \alpha_ix_i\right \|^2$, where $\phi_i^*$ is the convex conjugate of $\phi_i$.
I spent almost two days, but still cannot figure it out why these two are primal-dual problem. Because the primal problem has no constrains, is the dual problem means its conjugate problem? I know if the primal problem has constraints, then I can use Lagrange form to find its dual. But this problem has no constraint, I kind of don't know what to do.
Here is my solution that can be used to obtain dual of ERM without forming constrained problem as shown by Michael above.
From Fenchel conjugacy one has: \begin{equation} \phi(w^Tx) = \sup_z \langle w^Tx,z \rangle - \phi^*(z) \end{equation} from where we conclude that \begin{equation} \phi(w^Tx) \geq \langle w^Tx,z \rangle - \phi^*(z) \end{equation} Using the above trick, we can lower bound the given objective function as: \begin{equation} F(w,\alpha) = \lambda r(w) -\frac{1}{n}\sum_i(w^tx_i\alpha_i + \phi^*_i(-\alpha_i) ) \leq P(w) \end{equation} where I have taken ${any}$ regularizer $r(w)$ and $z$ replaced by $-\alpha_i$. Next, we minimize the lower bound as: \begin{aligned} D(\alpha)&=\inf_w F(w,\alpha)\\ &=\inf_w \left( \lambda r(w) -\frac{1}{n}\sum_i(w^tx_i\alpha_i + \phi^*_i(-\alpha_i) )\right)\\ &=-\sup_w \langle w,\frac{1}{n}\sum_i x_i\alpha_i \rangle -\lambda r(w)-\frac{1}{n}\sum_i\phi^*_i(-\alpha_i)\\ &=-\lambda r^*(\frac{1}{\lambda n}\sum_i x_i\alpha_i)-\frac{1}{n}\sum_i\phi^*_i(-\alpha_i) \end{aligned} where in step 3, we used the linear algebra fact $\sum_i <.,.>=<.,\sum_i .>$; inf-sup equality and in the last step, the definition of conjugate and one property of conjuagtes (conjugate of $ah(x)$ is $ah^*(y/a)$).