Suppose you have an $n$-dimensional data vector $x = (x_1, \ldots, x_n)$ and two classes, $y = 0$ and $y = 1$. Assume the dimensions of $x$ are conditionally independent given $y$, and that the conditional likelihood of each $x_i$ is Gaussian with means $\mu_{i0}$ and $\mu_{i1}$ for the two classes and a shared standard deviation $\sigma_i$.
Under these assumptions, $p(y = 1 \mid x)$ takes the form of a logistic function $\sigma(w^\top x + b)$.
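For concreteness, here is the standard calculation behind that claim (I am assuming class priors $p(y=1)$ and $p(y=0)$; the quadratic terms in $x_i$ cancel because the standard deviation is shared):

$$\log\frac{p(x \mid y=1)\,p(y=1)}{p(x \mid y=0)\,p(y=0)}
= \sum_{i=1}^n \frac{(\mu_{i1}-\mu_{i0})\,x_i}{\sigma_i^2}
+ \sum_{i=1}^n \frac{\mu_{i0}^2-\mu_{i1}^2}{2\sigma_i^2}
+ \log\frac{p(y=1)}{p(y=0)},$$

so $w_i = (\mu_{i1}-\mu_{i0})/\sigma_i^2$, $b$ collects the remaining $x$-free terms, and $p(y=1 \mid x) = \sigma(w^\top x + b)$.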
Now assume a Gaussian prior is placed on each weight, $w_i \sim \mathcal{N}(0, \tau^2)$ (I write $\tau$ for the prior standard deviation so it doesn't collide with the $\sigma_i$ above). My question is: how do we find the posterior distribution $p(w, b \mid N)$ from this prior and the likelihood function of logistic regression? (Here $N$ denotes our sample of $N$ data points.)
In the end, the log of this posterior, $L(w, b)$, is supposed to take the form of an $L_2$-regularized version of the log-likelihood.
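That is, writing $\tau$ for the prior standard deviation and $\ell(w, b)$ for the logistic log-likelihood, the target form should be

$$L(w, b) = \ell(w, b) - \frac{1}{2\tau^2}\sum_{i=1}^n w_i^2 + \text{const},$$

i.e. the usual $L_2$ penalty with coefficient $\lambda = 1/(2\tau^2)$ (assuming a flat prior on $b$, so it is not penalized).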
I started by deriving an expression for the log-likelihood. Writing $x^{(i)}$ for the $i$-th sample (to avoid clashing with the dimension index), I got $$- \sum_i \log\!\left(1 + e^{\,b + w^\top x^{(i)}}\right) + \sum_i y_i\left(b + w^\top x^{(i)}\right).$$
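As a quick sanity check on that expression (a minimal numerical sketch; the toy data and function names are just illustrative), it should agree with the per-sample Bernoulli log-probability $\sum_i \left[\, y_i \log p_i + (1 - y_i)\log(1 - p_i)\,\right]$ where $p_i = \sigma(b + w^\top x^{(i)})$:

```python
import numpy as np

def log_likelihood(w, b, X, y):
    """Log-likelihood in the compact form:
    sum_i y_i*(b + w.x_i) - sum_i log(1 + exp(b + w.x_i))."""
    z = X @ w + b
    return np.sum(y * z) - np.sum(np.log1p(np.exp(z)))

def log_likelihood_bernoulli(w, b, X, y):
    """Equivalent Bernoulli form: sum_i y_i*log(p_i) + (1-y_i)*log(1-p_i)."""
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))        # 20 samples, 3 dimensions (toy data)
y = rng.integers(0, 2, size=20)     # binary labels
w = rng.normal(size=3)
b = 0.5
print(np.isclose(log_likelihood(w, b, X, y),
                 log_likelihood_bernoulli(w, b, X, y)))  # True
```

The equivalence follows from $\log \sigma(z) = z - \log(1 + e^z)$ and $\log(1 - \sigma(z)) = -\log(1 + e^z)$.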
Now I was thinking of using Bayes' rule to derive the posterior, but I don't know how to calculate the $p(N \mid b, w)$ term.
A hint would be much appreciated.
This question is basically the same as asking how the expression for penalized logistic regression is derived from prior probabilities.