I'm struggling with the concept of conditional expectation. First of all, if you have a link to any explanation that goes beyond showing that it is a generalization of elementary intuitive concepts, please let me know.
Let me get more specific. Let $\left(\Omega,\mathcal{A},P\right)$ be a probability space and $X$ an integrable real random variable defined on $(\Omega,\mathcal{A},P)$. Let $\mathcal{F}$ be a sub-$\sigma$-algebra of $\mathcal{A}$. Then $E[X|\mathcal{F}]$ is the a.s. unique random variable $Y$ such that $Y$ is $\mathcal{F}$-measurable and for any $A\in\mathcal{F}$, $E\left[X1_A\right]=E\left[Y1_A\right]$.
The common interpretation seems to be: "$E[X|\mathcal{F}]$ is the expectation of $X$ given the information of $\mathcal{F}$." I'm finding it hard to get any meaning from this sentence.
In elementary probability theory, expectation is a real number. So the sentence above makes me think of a real number instead of a random variable. This is reinforced by $E[X|\mathcal{F}]$ sometimes being called "conditional expected value". Is there some canonical way of getting real numbers out of $E[X|\mathcal{F}]$ that can be interpreted as elementary expected values of something?
In what way does $\mathcal{F}$ provide information? Knowing that some event occurred is something I would call information, and I have a clear picture of conditional expectation in that case. To me, $\mathcal{F}$ is not a piece of information, but rather a "complete" set of pieces of information one could possibly acquire in some way.
Maybe you say there is no real intuition behind this, $E[X|\mathcal{F}]$ is just what the definition says it is. But then, how does one see that a martingale is a model of a fair game? Surely, there must be some intuition behind that!
I hope you have got some impression of my misconceptions and can rectify them.
I happened to read the Wikipedia article on conditional expectation today, and it clarified many of the same questions for me. I hope it helps!
For your first question: the linked article defines the conditional expectation of a r.v. $X: \Omega \rightarrow \mathbb{R}$ given a sub-$\sigma$-algebra $\mathcal{F}$ of the $\sigma$-algebra $\mathcal{A}$ on $\Omega$. It is an $\mathcal{F}$-measurable function $\Omega \rightarrow \mathbb{R}$, denoted $E(X \vert \mathcal{F})$. If you evaluate this function at a point $\omega \in \Omega$, you get a real number $E(X \vert \mathcal{F})(\omega)$, which is called the conditional expectation of $X$ given $\mathcal{F}$ at $\omega$.
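To make the evaluation at $\omega$ concrete, here is a minimal sketch on a finite sample space. The setup is an assumption chosen for illustration and does not come from the article: a fair six-sided die, $X(\omega)=\omega$, and $\mathcal{F}$ generated by the partition into odd and even outcomes. The helper name `cond_exp` is hypothetical. When $\mathcal{F}$ is generated by a finite partition, $E(X \vert \mathcal{F})(\omega)$ is the elementary conditional expectation of $X$ given the block that contains $\omega$.

```python
from fractions import Fraction

# Assumed toy setup: fair die, X(w) = w, F generated by the partition {odd, even}.
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}
X = {w: Fraction(w) for w in omega}
partition = [{1, 3, 5}, {2, 4, 6}]

def cond_exp(X, P, partition):
    """Return E(X | F) as a dict on omega, with F generated by the partition."""
    Y = {}
    for block in partition:
        p_block = sum(P[w] for w in block)
        # Elementary conditional expectation of X given this block.
        avg = sum(X[w] * P[w] for w in block) / p_block
        for w in block:
            Y[w] = avg  # constant on each block, hence F-measurable
    return Y

Y = cond_exp(X, P, partition)
print(Y[1], Y[2])  # 3 4

# Defining property: E[X 1_A] = E[Y 1_A] for each generating block A of F.
for A in partition:
    assert sum(X[w] * P[w] for w in A) == sum(Y[w] * P[w] for w in A)
```

So the real numbers one gets out of $E(X \vert \mathcal{F})$ in this toy case are $3$ (the die shows an odd number) and $4$ (the die shows an even number).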
When the r.v. $X$ is the indicator function $1_A$ of some measurable set $A \in \mathcal{A}$, its conditional expectation given $\mathcal{F}$ is called the conditional probability of $A$ given $\mathcal{F}$, denoted $P( A \vert \mathcal{F})$. It is again a mapping $\Omega \rightarrow \mathbb{R}$.
If we let $A$ vary within $\mathcal{A}$, the conditional probability $P( \cdot \vert \mathcal{F})$ becomes a mapping $\mathcal{A} \times \Omega \rightarrow \mathbb{R}$. In some cases one can choose versions such that, for every $\omega \in \Omega$, $P( \cdot \vert \mathcal{F})(\omega)$ is a probability measure on $(\Omega, \mathcal{A})$; $P(\cdot \vert \mathcal{F})$ is then called a regular conditional probability.
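As a hedged illustration of the two-argument mapping $(A, \omega) \mapsto P(A \vert \mathcal{F})(\omega)$, the sketch below assumes a fair six-sided die and $\mathcal{F}$ generated by the partition into odd and even outcomes. Both the setup and the helper name `cond_prob` are assumptions made for the example. On such a finite space the conditional probability reduces to $P(A \cap B)/P(B)$, where $B$ is the block containing $\omega$, and for each fixed $\omega$ it is a probability measure in $A$.

```python
from fractions import Fraction

# Assumed toy setup: fair die, F generated by the partition {odd, even}.
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}
partition = [{1, 3, 5}, {2, 4, 6}]

def cond_prob(A, w):
    """P(A | F)(w) = P(A ∩ B) / P(B), where B is the block containing w."""
    block = next(b for b in partition if w in b)
    p_block = sum(P[v] for v in block)
    return sum(P[v] for v in A & block) / p_block

A = {1, 2, 3}
print(cond_prob(A, 1))  # 2/3  (given "odd", two of the three odd outcomes lie in A)
print(cond_prob(A, 2))  # 1/3  (given "even", only the outcome 2 lies in A)

# For fixed w, the map A -> P(A | F)(w) behaves like a probability measure:
for w in omega:
    assert cond_prob(set(omega), w) == 1
    assert cond_prob({1, 2}, w) + cond_prob({3}, w) == cond_prob({1, 2, 3}, w)
```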
When $\mathcal{F}$ is generated by another r.v. $Y$, i.e. $\mathcal{F} = \sigma(Y)$, the conditional expectation and conditional probability are said to be given the r.v. $Y$ and are written $E(X \vert Y)$ and $P(A \vert Y)$.
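A small sketch of the case $\mathcal{F} = \sigma(Y)$, under assumptions chosen only for illustration: a fair six-sided die, $X(\omega) = \omega$, and $Y(\omega) = \omega \bmod 3$. For a discrete $Y$, the blocks of $\sigma(Y)$ are the level sets $\{Y = y\}$, so $E(X \vert Y)$ can be written as $g(Y)$ with $g(y) = E[X 1_{\{Y=y\}}]/P(Y=y)$. The names `g` and `cond_exp_given_Y` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Assumed toy setup: fair die, X(w) = w, Y(w) = w mod 3.
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}
X = {w: Fraction(w) for w in omega}
Y = {w: w % 3 for w in omega}

# Accumulate E[X 1_{Y=y}] and P(Y=y) for each value y taken by Y.
num = defaultdict(Fraction)
den = defaultdict(Fraction)
for w in omega:
    num[Y[w]] += X[w] * P[w]
    den[Y[w]] += P[w]

# g(y) is the elementary conditional expectation E[X | Y = y].
g = {y: num[y] / den[y] for y in den}

# E(X | Y) is the random variable w -> g(Y(w)); it depends on w only through Y(w).
cond_exp_given_Y = {w: g[Y[w]] for w in omega}

print(g[0], g[1], g[2])  # 9/2 5/2 7/2

# Averaging E(X | Y) recovers E[X] (the defining property with A = omega).
assert sum(cond_exp_given_Y[w] * P[w] for w in omega) == sum(X[w] * P[w] for w in omega)
```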