Here's a definition and an example/framework from our class.
Definition 5.4.3 (Martingale) A martingale is a sequence $Z_{0}, Z_{1}, \ldots, Z_{m}$ of random variables such that, for each $1 \leq i \leq m$, we have $\mathbb{E}\left[Z_{i} \mid\left\{Z_{j}: j<i\right\}\right]=Z_{i-1}$.
Loosely speaking, given what has previously transpired, we expect nothing to change in the $i$th step.
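As a sanity check of the definition (my own experiment, not from class), a fair $\pm 1$ random walk satisfies it, and for a short walk this can be verified exactly by enumerating all paths:

```python
from itertools import product

# Fair random walk: Z_0 = 0, Z_i = X_1 + ... + X_i with independent ±1 steps.
# Check E[Z_i | {Z_j : j < i}] = Z_{i-1} exactly by enumerating all 2^m paths.
m = 4
for i in range(1, m + 1):
    histories = {}  # history (Z_0, ..., Z_{i-1})  ->  list of Z_i values
    for steps in product([-1, 1], repeat=m):
        Z = [0]
        for x in steps:
            Z.append(Z[-1] + x)
        histories.setdefault(tuple(Z[:i]), []).append(Z[i])
    for hist, vals in histories.items():
        # all paths are equally likely, so conditional expectation is a plain average
        assert sum(vals) / len(vals) == hist[-1]  # = Z_{i-1}
```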
Recall
- $G \sim G(n, p)$, and we are exploring a graph parameter $f(G)$
- $S_{i}=\left\{e_{j}: j \leq i\right\}$, where $e_{1}, \ldots, e_{m}$ is a fixed ordering of the $m=\binom{n}{2}$ potential edges
- $Z_{i}=\mathbb{E}\left[f(G) \mid E(G) \cap S_{i}\right]$
Conditional expectations
- $\mathbb{E}\left[Z_{i+1} \mid E(G) \cap S_{i}\right]= \mathbb{E}\left[\mathbb{E}\left[f(G) \mid E(G) \cap S_{i+1}\right] \mid E(G) \cap S_{i}\right] = \mathbb{E}\left[f(G) \mid E(G) \cap S_{i}\right]=Z_{i}$
- $\Rightarrow$ this is a martingale
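Numerically the claim does check out: for small $n$ one can compute every $Z_i$ exactly by summing over all $2^{\binom{n}{2}}$ graphs (my own experiment, not from the notes; here with $f$ = number of triangles):

```python
from itertools import combinations, product

# Exact check of the recap above for G(4, 1/2) with f = number of triangles.
n, p = 4, 0.5
edges = list(combinations(range(n), 2))  # fixed ordering e_1, ..., e_m
m = len(edges)

def f(E_G):
    return sum(1 for T in combinations(range(n), 3)
               if all(e in E_G for e in combinations(T, 2)))

def Z(i, revealed):
    """Z_i = E[f(G) | E(G) ∩ S_i = revealed], by summing over all 2^m graphs."""
    tot = exp = 0.0
    for bits in product([0, 1], repeat=m):
        E_G = {e for e, b in zip(edges, bits) if b}
        if E_G & set(edges[:i]) == revealed:  # consistent with the observation
            w = p ** len(E_G) * (1 - p) ** (m - len(E_G))
            tot += w
            exp += w * f(E_G)
    return exp / tot

# E[Z_{i+1} | observation at step i] = Z_i, for every possible observation:
for i in range(m):
    for bits in product([0, 1], repeat=i):
        R = {e for e, b in zip(edges[:i], bits) if b}
        lhs = p * Z(i + 1, R | {edges[i]}) + (1 - p) * Z(i + 1, R)
        assert abs(lhs - Z(i, R)) < 1e-9
```

So I can verify the identity computationally, but I don't understand why it holds, hence the questions below.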
I have a few questions:
- Why is the condition in $\mathbb{E}\left[Z_{i+1} \mid E(G) \cap S_{i}\right]$ defined that way? By the definition of martingales, I was expecting the condition to be in terms of the expected values of the parameter $f(G)$, given $E(G) \cap S_j$ for $j \leq i$, i.e. the condition should be $\mathbb{E}[f(G) | E(G) \cap S_j: j \leq i]$. How are the two equivalent?
- How do we get this equality $\mathbb{E}\left[\mathbb{E}\left[f(G) \mid E(G) \cap S_{i+1}\right] \mid E(G) \cap S_{i}\right] = \mathbb{E}\left[f(G) \mid E(G) \cap S_{i}\right]$?
- In general, is there an intuitive interpretation of martingales, or are they just a convenient mechanism that we use?
- I'm also analyzing the case where $f$ is $c$-(edge-)Lipschitz, i.e., for some $c > 0$ and any edge $e$, $|f(G)-f(G \Delta e)| \leq c$. The aim is to make it possible to use Azuma's inequality, which requires $Z_0 = 0$ and $|Z_i - Z_{i-1}| \leq 1$. So for the martingale above, we can normalize it to obtain the martingale: $$Z_{i}=\frac{1}{c}\left(\mathbb{E}\left[f(G) \mid E(G) \cap S_{i}\right]-\mathbb{E}[f(G)]\right).$$ And so: $$|Z_{i} - Z_{i-1}| = \frac{1}{c} |\mathbb{E}[f(G) \mid E(G) \cap S_{i}] - \mathbb{E}[f(G) \mid E(G) \cap S_{i-1}]|.$$ I'm a bit lost about what operations I can carry out on the expression between the absolute value signs, so that I can obtain $$|\mathbb{E}[f(G) \mid E(G) \cap S_{i}] - \mathbb{E}[f(G) \mid E(G) \cap S_{i-1}]| \leq c.$$ My interpretation is that, from time $i-1$ with $(E(G) \cap S_{i-1})$ to time $i$ with $(E(G) \cap S_{i})$, the status of at most one additional edge of $G$ is revealed, so the value of $f(G)$ changes by at most $c$ (as per the stipulated condition). That makes sense but I can't verify it with computations, plus we're actually concerned with $\mathbb{E}[f(G)|...]$, not just $f(G)$, and I don't know if there's a difference between the two.
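For what it's worth, the bound does seem to hold in a small example I tried (my own experiment: $f$ = number of triangles on $n = 4$ vertices, which is $c$-Lipschitz for $c = n - 2 = 2$, since one edge lies in at most $n-2$ triangles); what I'm missing is the argument:

```python
from itertools import combinations, product

# My own check of |Z_i - Z_{i-1}| <= c for G(4, 1/2) with f = number of
# triangles, which is c-Lipschitz for c = n - 2 = 2.
n, p, c = 4, 0.5, 2
edges = list(combinations(range(n), 2))  # fixed ordering e_1, ..., e_m
m = len(edges)

def f(E_G):
    return sum(1 for T in combinations(range(n), 3)
               if all(e in E_G for e in combinations(T, 2)))

def Z(i, revealed):
    """E[f(G) | E(G) ∩ S_i = revealed], by summing over all 2^m graphs."""
    tot = exp = 0.0
    for bits in product([0, 1], repeat=m):
        E_G = {e for e, b in zip(edges, bits) if b}
        if E_G & set(edges[:i]) == revealed:
            w = p ** len(E_G) * (1 - p) ** (m - len(E_G))
            tot += w
            exp += w * f(E_G)
    return exp / tot

worst = 0.0
for i in range(1, m + 1):
    for bits in product([0, 1], repeat=i):   # every observation at step i
        R_i = {e for e, b in zip(edges[:i], bits) if b}
        R_prev = R_i - {edges[i - 1]}        # what was known at step i-1
        worst = max(worst, abs(Z(i, R_i) - Z(i - 1, R_prev)))
assert worst <= c + 1e-9  # the bound I'd like to prove
```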
Question 1 is the hard one...
The definition of martingale you were given is incomplete. The proper general definition of martingale is given in terms of a "filtration".
If you are familiar with the concept of a "filtration" and/or sigma-algebras, please skip ahead. Otherwise, think of a filtration as an increasing sequence $(\mathcal{F}_n)_{n\geq0}$ of known "information" indexed by time. If we are observing a stochastic system over time then, even though we (usually) don't know exactly what's going to happen in the future, at any time $n$ we know what the system looks like now, and how it looked at every point in the past. A filtration captures this notion: specifically, the set $\mathcal{F}_n$ is the complete history of the system up to and including time $n$.
The proper definition of a martingale states that a sequence $Z_n$ is a martingale with respect to a filtration $(\mathcal{F}_n)_{n\geq0}$, if the sequence is "adapted" to the filtration (meaning that it's part of the information that we're tracking, so the filtration $\mathcal{F}_n$ includes complete information about $Z_0,\dots,Z_{n}$, and possibly other additional information as well) and we have: $$E[Z_{n+1}\mid\mathcal{F}_n] = Z_n$$ for all times $n$.
Let me point out again that the filtration generally contains complete information about the entire observed system that we're considering, as it evolves over time, while a particular $Z_n$ may just be some part of the system's behavior.
The definition you were given isn't wrong, because we often talk about $Z_n$ being a martingale with respect to the filtration "generated" by $Z_n$ itself. This is the filtration that contains information about $Z$ as it evolves over time, but nothing else. In that case, conditioning on $\mathcal{F}_n$ means exactly the same thing as conditioning on $Z_0,\dots,Z_n$, so if you adjust the indices appropriately, you'll get back $E[Z_i\mid\{Z_j : j < i\}] = Z_{i-1}$.
So, using this more general definition of martingale, the claim being made is that an "exploration" of a graph over time: $$Z_i = E[f(G) \mid E(G)\cap S_i]$$ is a martingale with respect to the filtration generated by the sequence $E(G)\cap S_i$.
The sequence $Z_i$ is adapted to this filtration by the properties of conditional expectation. If you know everything about $E(G)\cap S_i$, then you know the value of any expectation conditioned on that information, including $Z_i$. In addition, the claim is being made that: $$E[Z_{i+1} \mid E(G) \cap S_i] = Z_i$$ or, written in terms of filtrations: $$E[Z_{i+1} \mid \mathcal{F}_i] = Z_i$$ which matches the general definition.
It turns out that if $Z$ is a martingale with respect to any filtration $\mathcal{F}_n$, it's also a martingale with respect to the filtration generated by $Z$ (i.e., matching your original definition). This is because: $$E[Z_{i+1} \mid \{Z_j : j \leq i\} ] =E[ E[Z_{i+1} \mid \mathcal{F}_i] \mid \{Z_j : j \leq i\} ] =E[ Z_i \mid \{Z_j : j \leq i\} ] = Z_i $$ Here, the first equality is an application of the law of iterated expectations (see the general version on Wikipedia), because the "inner" conditional expectation conditions on a larger filtration (more information) than the "outer" conditional expectation. Remember that $\mathcal{F}_i$ contains at least all the information about $Z_j$ up to time $i$. The second equality follows from the martingale property with respect to $\mathcal{F}_n$. The third equality follows because $Z_i$ is part of the information on which we're conditioning.
For question 2, this is also the law of iterated expectations, because the inner conditional expectation conditions on more information than the outer conditional expectation. That is, it's true for any filtration and any random variable that: $$E[E[Y\mid\mathcal{F}_{i+1}]\mid\mathcal{F}_i] = E[Y\mid\mathcal{F}_i]$$ and this is just a particular choice of random variable and filtration.
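A minimal sketch of this identity on a toy discrete example (the setup is mine, purely illustrative): three independent fair coin flips, with conditional expectations computed as plain averages over matching outcomes.

```python
from itertools import product

# Toy check of E[ E[Y | X1, X2] | X1 ] = E[Y | X1] (law of iterated
# expectations) for three independent fair coin flips and Y = X1 + X2*X3.
outcomes = list(product([0, 1], repeat=3))  # 8 equally likely outcomes

def cond_exp(g, coords):
    """E[g | the flips at positions `coords`], as a table keyed by those flips."""
    table = {}
    for w in outcomes:
        table.setdefault(tuple(w[i] for i in coords), []).append(g(w))
    return {k: sum(v) / len(v) for k, v in table.items()}

Y = lambda w: w[0] + w[1] * w[2]
inner = cond_exp(Y, (0, 1))                            # E[Y | X1, X2]
outer = cond_exp(lambda w: inner[(w[0], w[1])], (0,))  # E[ E[Y|X1,X2] | X1 ]
direct = cond_exp(Y, (0,))                             # E[Y | X1]
assert outer == direct
```

The inner expectation conditions on more information (both $X_1$ and $X_2$) than the outer one ($X_1$ alone), which is exactly the situation in question 2.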
For question 3, the martingale property can be generalized to: $$E[Z_n | \mathcal{F}_m] = Z_m$$ for any $n > m$, which means that -- for a martingale -- its expected future value at any point in time, conditioned on what we know now, is its current value. So, intuitively, the expected value of a martingale is "stable".
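This generalized form follows by iterating the one-step property together with the law of iterated expectations: for $n > m$, $$E[Z_n \mid \mathcal{F}_m] = E[\,E[Z_n \mid \mathcal{F}_{n-1}] \mid \mathcal{F}_m\,] = E[Z_{n-1} \mid \mathcal{F}_m] = \cdots = E[Z_{m+1} \mid \mathcal{F}_m] = Z_m.$$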
To sum up the answers:
- Conditioning on $E(G)\cap S_i$ is conditioning on the filtration generated by the exploration, which carries at least as much information as $Z_0,\dots,Z_i$; the law of iterated expectations then recovers the definition you were given.
- The equality in question 2 is exactly the law of iterated expectations: the inner conditional expectation conditions on more information than the outer one.
- Intuitively, a martingale is a process whose expected future value, given everything observed so far, equals its current value.