This is an exercise mentioned in lots of places. I have searched around for detailed answers, but none of them has explained clearly on a critical part of analysis.
Setting: we have an oblivious Deterministic Turing Machine $M$ with 1 input tape, 1 work tape and 1 output tape; with input $x$ and $u$, where $u$ is the verifier. The execution of machine can be seen as a series of snapshots, and each snapshot is denoted as: $Z_i = <a, b, q>$ where $a$ is the input read from input tape, $b$ is the input read from working tape, and $q$ is the state of $M$ at step $i$.
Usually the answer goes like this: As in Cook-Levin Reduction, we have defined a series of snapshots $Z_i$, and each snapshot can be determined by previous state and the transition function by: $Z_i=F(Z_{i-1},Z_{prev(i)}, y_{inputpos(i)})$, where $prev(i)$ denotes the last step before $i$ when machine $M$'s head was in the same cell on its work tape that it is on during step $i$; $inputpos(i)$ denotes the location of $M$'s input tape head at the $i$th step. And we denote the final output SAT instance of Cook-Levin reduction as $\Psi_x$.
First notice that by the proof we know the length of $\Psi_x$, since we can "simulate" a machine that creates the formula without writing anything to the work or output tapes and only by keep track of the number of elements that would be otherwise written to the output tape.
The following proof all follows from this critical claim that we can simulate $M$ by running its snapshots within log space.
My Confusion: While it confused me why we can simulate this machine without writing anything to the work tape. Since to reach the next snapshot $Z_i$, we not only need the current snapshot $Z_{i-1}$ to know how to move the cursor and what to write to the current cursor position, but also we need to know what has been written under the cursor position we are moving to ($Z_{prev(i)}$).
One might argue this is a local behavior so we only need to track a small piece of information, while this is not true. We cannot know where we are moving to, and which piece of information we shall keep. Even with obliviousness, which means we know exactly where our cursor will be at each single step, so one might argue when computing $Z_i$, we can simply pre-calculate $Z_{i-1}$ and $Z_{prev(i)}$, but notice this is a recursive call, as we now also need to calculate $Z_{i-1}$ and $Z_{prev(i)}$. And at each level of recursion we need at least a counter so denote which step we are at (For example when calculate $Z_i$ we need to have a counter i, so we know we are calculating $Z_i$), and since each recursive stack requires a counter of size $O(log(|x|))$ (Even if we only need constant space at each recursive call this may not work), this could easily expand into a polynomial space calculation instead of a log space calculation.
Some may again argue that we do not keep any information when pre-calculate $Z_{prev(i)}$, while notice when calculating this you will need $Z_{prev(prev(i))}$, and how do you keep track of this without keeping anything on the stack?