There's a theorem in my book (*Complexity and Cryptography* by Talbot and Welsh, chapter 4) whose proof I don't fully understand in places:
THEOREM: Suppose $f \in \mathbb Z[x_1,\dots, x_n]$ has degree at most $k$ and is not identically zero. If $a_1,\dots,a_n$ are chosen independently and uniformly at random from $\{1,\dots,N\}$ then $$\Pr[f(a_1,\dots,a_n) = 0] \leq \frac kN$$

Proof: We use induction on $n$. $\color{blue}{\text{For } n = 1}$ the result holds since a polynomial of degree at most $k$ in a single variable has at most $k$ roots. So let $n > 1$ and write $$f = f_0+ f_1x_1+ f_2x_1^2+\dots+ f_tx_1^t$$ where $f_0,\dots, f_t$ are polynomials in $x_2, x_3,\dots, x_n$; $f_t$ is not identically zero and $t \geq 0$. If $t = 0$ then $f$ is a polynomial in $n-1$ variables, so the result holds. So we may suppose that $1 \leq t \leq k$; note that $f_t$ is of degree at most $k-t$. We let $E_1$ denote the event '$f(a_1,\dots,a_n) = 0$' and $E_2$ denote the event '$f_t(a_2,\dots,a_n) = 0$'. Now $$\Pr[E_1] = \Pr[E_1 \mid E_2] \Pr[E_2]+\Pr[E_1 \mid \text{not } E_2] \Pr[\text{not } E_2] \leq \Pr[E_2]+\Pr[E_1 \mid \text{not } E_2]\tag{1}$$ Our inductive hypothesis implies that $$\Pr[E_2] = \Pr[f_t(a_2,\dots,a_n) = 0] \leq\frac{k-t}N$$ since $f_t$ has degree at most $k-t$. Also $$\Pr[E_1 \mid \text{not } E_2] \leq \frac tN$$ This is true because $a_1$ is chosen independently of $a_2,\dots,a_n$, so if $a_2,\dots,a_n$ are fixed and we know that $f_t(a_2,\dots,a_n) \neq 0$, then $f$ is a polynomial in $x_1$ that is not identically zero. Hence $f$, as a polynomial in $x_1$, has degree $t$ and so has at most $t$ roots. Putting this together we obtain $$\Pr[f(a_1,\dots,a_n) = 0] \leq \frac{k-t}N +\frac tN \leq \frac kN$$ as required.
- I don't understand how it holds for $n=1$ and $t=0$.
- Where did the formula $(1)$ come from? I don't get it.
For your first question, notice that the given proof doesn't actually have a case "$n=1$ and $t=0$", because the case $n=1$ is treated separately, and $t$ is introduced only in the argument for $n>1$. If you nevertheless want to consider $n=1$ and $t=0$, then you're dealing with a constant polynomial (because $t=0$) of one variable (because $n=1$). Furthermore, the constant isn't $0$, because of the hypothesis in the theorem that $f$ isn't identically zero. So $f$ is a non-zero constant, and the probability that $f(a_1,\dots,a_n)=0$ is zero, as the theorem claims.
For your second question, equation (1) comes from combining three facts. First, the event $E_1$ is the disjoint union of the two events $E_1\cap E_2$ and $E_1\cap\sim E_2$ (where I'm using $\sim$ for "not"). Second, by definition of conditional probability, $$ Pr[E_1\cap E_2]= Pr[E_1\mid E_2]\cdot Pr[E_2], $$ and similarly with $\sim E_2$ in place of $E_2$. That gives the equality at the beginning of (1). Third, all probabilities and all conditional probabilities are between $0$ and $1$, inclusive. Applying that to $Pr[E_1\mid E_2]$ and to $Pr[\sim E_2]$, you get the inequality at the end of (1).
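The decomposition in (1) can be checked concretely by exhaustive counting. Here is a small sketch (the polynomial $f(x_1,x_2)=(x_2-1)x_1-x_2$ and the value of $N$ are my own illustrative choices, not from the book) that verifies both the equality from the law of total probability and the inequality obtained by dropping the factors that are at most $1$:

```python
from fractions import Fraction
from itertools import product

N = 10

# Hypothetical example: f(x1, x2) = (x2 - 1)*x1 - x2,
# written as f = f_0 + f_1*x1 with f_1(x2) = x2 - 1 playing the role of f_t.
f  = lambda a1, a2: (a2 - 1) * a1 - a2
ft = lambda a2: a2 - 1

pts = list(product(range(1, N + 1), repeat=2))

E1 = [p for p in pts if f(*p) == 0]        # event f(a1, a2) = 0
E2 = [p for p in pts if ft(p[1]) == 0]     # event f_t(a2) = 0

p_E1 = Fraction(len(E1), len(pts))
p_E2 = Fraction(len(E2), len(pts))
p_not_E2 = 1 - p_E2

# Conditional probabilities by direct counting.
p_E1_given_E2 = Fraction(len([p for p in E1 if p in E2]), len(E2))
p_E1_given_notE2 = Fraction(len([p for p in E1 if p not in E2]),
                            len(pts) - len(E2))

# Equality at the start of (1): law of total probability.
assert p_E1 == p_E1_given_E2 * p_E2 + p_E1_given_notE2 * p_not_E2
# Inequality at the end of (1): Pr[E1|E2] <= 1 and Pr[not E2] <= 1.
assert p_E1 <= p_E2 + p_E1_given_notE2
```

Every probability here is an exact fraction, so the asserts check the identities exactly rather than up to floating-point error.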
EDIT, in answer to a comment: The case $n=1$ is covered by the fact from elementary algebra that a polynomial, in one variable, of degree $k$ cannot have more than $k$ roots. So of the $N$ possible choices for $a_1$, at most $k$ make $f(a_1)=0$. So, if $a_1$ is chosen at random from the $N$ possibilities, the probability of getting $f(a_1)=0$ is at most $k/N$.
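The single-variable case can be made concrete by simply counting roots among the $N$ choices. In this sketch the degree-$3$ polynomial and $N$ are my own illustrative choices, deliberately picked so that all $k$ roots land in $\{1,\dots,N\}$, which makes the bound $k/N$ tight:

```python
from fractions import Fraction

N = 20
k = 3
# Hypothetical example: f(x) = (x - 2)(x - 5)(x - 11), degree 3,
# with all three roots inside {1, ..., N}.
f = lambda x: (x - 2) * (x - 5) * (x - 11)

roots = [a for a in range(1, N + 1) if f(a) == 0]
prob = Fraction(len(roots), N)

# A degree-k polynomial has at most k roots, so Pr[f(a_1) = 0] <= k/N.
assert len(roots) <= k
assert prob <= Fraction(k, N)
```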
For $n>1$ and $t=0$, the polynomial $f$, though apparently a polynomial in $n$ variables $x_1$ to $x_n$, doesn't actually involve $x_1$ at all; that's what $t=0$ means. So it's really a polynomial in $n-1$ (or fewer) variables. Since the proof is by induction on $n$, the desired conclusion for this $f$ will be given by the induction hypothesis.
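Putting the pieces together, the theorem's bound can also be verified exhaustively for a small multivariate example (again, the choice of $f$, $k$, and $N$ below is mine, just as a sanity check of the statement, not part of the proof):

```python
from fractions import Fraction
from itertools import product

N = 8
k = 2
# Hypothetical example: f(x1, x2, x3) = x1*x2 - x3, degree 2, n = 3 variables.
f = lambda a1, a2, a3: a1 * a2 - a3

pts = list(product(range(1, N + 1), repeat=3))
zeros = sum(1 for a in pts if f(*a) == 0)
prob = Fraction(zeros, len(pts))

# The theorem promises Pr[f(a1, a2, a3) = 0] <= k/N.
assert prob <= Fraction(k, N)
```

Here the zeros are exactly the triples with $a_1 a_2 = a_3 \leq N$, and their exact fraction comes out well under $k/N = 1/4$.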