A multi-head Turing machine is a Turing machine with a single tape but multiple heads. Initially, all the heads are positioned above the first cell of the input. The transition function is modified to allow reading, writing, and moving of the heads. Formally,
$\delta: Q \times \Gamma^k \rightarrow Q \times \Gamma^k \times \{L, R\}^k$
where $k$ is the number of heads. The expression
$\delta(q_i, a_1, ..., a_k) = (q_j, b_1, ..., b_k, L, R, ..., L)$
means that if the machine is in state $q_i$ and the heads from 1 to $k$ read symbols $(a_1, ..., a_k)$, then the machine transitions to state $q_j$, writes symbols $(b_1, ..., b_k)$, and moves the heads left and right as specified.
Prove that any multi-head Turing machine can be simulated by a single-tape deterministic Turing machine.
I already know how to demonstrate multi-tape Turing machines are equivalent to normal Turing machines. I could easily demonstrate that for any multi-head Turing machine $M_{MH}$, there exists some multi-tape Turing machine $M_{MT}$ such that L($M_{MH}$) = L($M_{MH}$).
The exercise asks you to explicitly use single-tape deterministic Turing machine, unfortunately even using different markers and settings for $w$ I can't solve the exercise.
Any hint?

Here's a rough idea. Given a $k$-multi-headed Turing machine $M=(Q,\Gamma,q_0,\delta,F)$, build a single-tape deterministic Turing machine $M'$ as follows. Let $\Gamma' = \Gamma \sqcup \{z_i\}_{i=0}^k$. These new symbols will be used to delimit "zones" on the tape, with each zone belonging to a head on the original $M$. Make it so that when $M'$ starts (with the tape empty), it sets the tape something like:
z_0␣z_1␣z_2␣...␣z_kwith␣meaning empty cell, andz_imeaning the symbol $z_i$.Once $M'$ has finished setting up the tape, it may start simulating the multiple heads, cycling between them. It starts at the cell between $z_0$ and $z_1$, assuming the role of the first head. Since there are a finite amount of heads, a finite amount of symbols and a finite amount of states in $M$, you can just hardcode what $M'$ should do on reading the symbol between $z_0$ and $z_1$, using $\delta$. If $M'$ "runs out of space", i.e. it reads the symbol that starts the zone allocated to the next or previous head, you can go into an "expansion mode" where you shift over the contents of the tape until you have enough space. Once the shifting is done, you go back into "simulation mode", finish that step for the first head. Then $M'$ goes to the next zone and starts simulating the next head. Once $M'$ reads the symbol $z_k$, it has to cycle back and start simulating the first head again.
Hope that helps.