Let $Y$ be a random variable and $E$ an event. In case $Y$ is discrete, I know that $$\mathbb{P} (E) = \sum _{y} \mathbb{P} (E | Y=y) \ \mathbb{P} (Y=y)$$ is just a variant on the Law of Total Probability.
However, for continuous $Y$ I often see the formula, $$\mathbb{P} (E) = \int _{-\infty} ^{+\infty} \mathbb{P} (E | Y=y) f_Y (y) \ dy,$$ where $f_Y$ is the probability density function of $Y$.
How do I make sense of the conditional probability $\mathbb{P} (E|Y=y)$, since $\mathbb{P} (E \cap \{ Y=y\})$ divided by $\mathbb{P} (Y=y)$ yields $0/0$? And how can I prove the above formula? Is there a simple explanation, or do I need measure theoretic probability theory in order to comprehend this?
There are different ways to define this "conditional probability" with respect to the zero-probability event $\{Y=y\}$, and they agree in an appropriate sense.
1) If you know the notion of conditional expectation, recall the following statement:
Let $(\Omega, \mathcal F)$ be a measurable space and $Y:\Omega \to \mathbb R$, $g:\Omega\to \mathbb R$ measurable functions such that $g$ is in fact $\sigma (Y)$-measurable. Then there exists a measurable function $h:\mathbb R \to \mathbb R$ such that $g = h(Y)$ (this is the Doob–Dynkin lemma).
In these terms, take $g = \mathbb P (E \mid Y)$ (i.e. the conditional expectation $\mathbb E(\mathbf 1_E \mid Y)$); then there exists $h$ such that $\mathbb P (E\mid Y) = h(Y)$, and one defines $\mathbb P (E \mid Y= y) := h(y)$.
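As a numerical sanity check of this definition (the concrete setup here is my own illustration, not from the question): take $Y \sim N(0,1)$ and $E = \{Z + Y > 0\}$ with $Z \sim N(0,1)$ independent of $Y$. Then $h(y) = \mathbb P(Z > -y) = \Phi(y)$, and averaging $h(Y)$ over samples of $Y$ should reproduce $\mathbb P(E) = 1/2$:

```python
import math
import random

random.seed(0)

def phi_cdf(y):
    """Standard normal CDF Phi(y)."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

n = 100_000
# Monte Carlo estimate of P(E) = P(Z + Y > 0) directly.
direct = sum(random.gauss(0, 1) + random.gauss(0, 1) > 0 for _ in range(n)) / n
# Estimate via the tower rule: P(E) = E[h(Y)] = E[Phi(Y)].
via_h = sum(phi_cdf(random.gauss(0, 1)) for _ in range(n)) / n

print(direct, via_h)  # both should be close to 0.5
```

Both estimates agree with $\mathbb P(E) = 1/2$ (which also follows by symmetry, since $Z + Y$ is a centered normal), illustrating that averaging the conditional probability $h(Y)$ recovers the unconditional one.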
2) Second possibility is the way of Lucas' answer.
3) Another matching definition can be defining exactly what you need:
If $X$ is another random variable (discrete or absolutely continuous), let $f(x,y)$ be the joint density of $X,Y$ with respect to $\mu \otimes \lambda$, where $\mu \in \{\lambda , \chi\}$, $\lambda$ denotes the Lebesgue measure and $\chi$ the counting measure (so $X$ may be continuous or discrete). Then define
$$\mathbb E (X \mid Y = y) := \int_{\mathbb{R}} x \, \frac{f(x,y)}{f_Y(y)} \, \mathrm d \mu (x).$$ In particular, taking $X = \mathbf 1_E$ (with $\mu = \chi$) gives $\mathbb P (E \mid Y = y)$.
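To see the law of total probability come out of this density-ratio definition numerically, here is a small grid-based check (the bivariate-normal setup and event are my own choice): with $f(x,y)$ a standard bivariate normal density with correlation $\rho$ and $E = \{X > 0\}$, the integral $\int \mathbb P(E \mid Y=y)\, f_Y(y)\, \mathrm dy$ should return $\mathbb P(X > 0) = 1/2$.

```python
import numpy as np

rho = 0.6
x = np.linspace(-8.0, 8.0, 2001)
y = np.linspace(-8.0, 8.0, 2001)
dx = x[1] - x[0]
X, Y = np.meshgrid(x, y, indexing="ij")

# Joint density f(x, y) of a standard bivariate normal with correlation rho.
f = np.exp(-(X**2 - 2 * rho * X * Y + Y**2) / (2 * (1 - rho**2))) / (
    2 * np.pi * np.sqrt(1 - rho**2)
)

f_Y = f.sum(axis=0) * dx                   # marginal density f_Y(y) on the grid
cond = f[x > 0, :].sum(axis=0) * dx / f_Y  # P(X > 0 | Y = y) via the density ratio
total = (cond * f_Y).sum() * dx            # integral of P(E | Y = y) f_Y(y) dy

print(total)  # should be close to P(X > 0) = 0.5
```

The Riemann sums here are crude but accurate enough for the Gaussian tails on this grid; the point is only that the conditional probability built from $f(x,y)/f_Y(y)$ integrates against $f_Y$ to the unconditional probability.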
All in all, Lucas' way is in my opinion the most intuitive one, but with 1) and 3) the law of total probability is immediately satisfied.