The typical set $S$ consists of all sequences $x$ with at most three 1s.
Here $x \in \{0,1\}^{200}$ and $P(0) = 0.99.$
We can use $\lceil \log_{2}|S| \rceil$ bits to encode such sequences, and that is the best one can do with equal-length codewords, I think.
But how do I get the minimum expected number of bits required to encode the sequences in $S$?
Should I think in terms of the ceiling of the self-information of the least probable such sequence, times the number of such sequences?
You should have specified that you were interested in minimizing the expected codeword length. Let's look at arbitrary $n$ and consider $\{0,1\}^n.$
In that case you can assign a 1-bit codeword, say $0$, to the all-zero sequence, which has probability $$p^n, \qquad p = P(0).$$ In general, you can specify any weight-$w$ sequence with $$W_w = \left\lceil \log_2 \binom{n}{w}\right\rceil$$ bits, for $w = 1, 2, 3.$ You have 4 classes of typical sequences, $w = 0, 1, 2, 3,$ and you can specify which class you have with a 2-bit prefix [note: for $p = 0.99$ there may be a slightly more efficient use of the prefix, but I don't think so, since $0.99^{200}$ is already a small number]. For the all-zero sequence, just use the prefix $00$; no extra bits are necessary.
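One concrete way to realize the $W_w$-bit index (my own sketch, not part of the argument above; the combinatorial number system is a standard choice) is to rank the positions of the 1s among all weight-$w$ subsets of $\{0,\dots,n-1\}$:

```python
from math import comb

def rank(positions):
    """Colexicographic index of sorted 1-positions among all weight-w
    subsets of {0, ..., n-1}; a bijection onto {0, ..., C(n, w) - 1}."""
    return sum(comb(c, i + 1) for i, c in enumerate(positions))

def unrank(r, n, w):
    """Inverse of rank: recover the sorted 1-positions from the index."""
    positions = []
    for i in range(w, 0, -1):
        # greedily take the largest c with C(c, i) <= r
        c = i - 1
        while comb(c + 1, i) <= r:
            c += 1
        r -= comb(c, i)
        positions.append(c)
    return positions[::-1]

# A weight-3 sequence in {0,1}^200 with 1s at positions 5, 17, 140:
idx = rank([5, 17, 140])
assert idx < comb(200, 3)              # so it fits in W_3 = 21 bits
assert unrank(idx, 200, 3) == [5, 17, 140]
```

Since the index is strictly less than $\binom{n}{w}$, it always fits in the $W_w$ bits budgeted above, and the map is invertible, so decoding is exact.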
So there exists a good code with small expected codeword length $$ \overline{L} = 2 + \sum_{w=1}^{3} W_w \binom{n}{w} p^{n-w} (1-p)^w $$ (the $w=0$ class contributes only the 2-bit prefix).
Using $\log_2 |S|$ I get $20.347$ bits, so a fixed-length code needs $21$ bits.
The sum above is approximately $10.05$ bits with the ceilings $W_w$ (about $9.64$ if the $\log_2$'s are not rounded up), so you can get an expected codeword length of roughly $10 + 2 = 12$ bits, if I have calculated correctly (please check).
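The numbers above can be checked directly (a quick sketch; $n$, $p$, and the weight cutoff are as in the question):

```python
from math import comb, log2, ceil

n, p = 200, 0.99

# Size of the typical set S: all sequences of weight <= 3
S = sum(comb(n, w) for w in range(4))
print(S, log2(S))        # 1333501, about 20.347 -> 21 bits fixed-length

# Per-class index widths W_w and the expected-length sum
W = {w: ceil(log2(comb(n, w))) for w in range(1, 4)}    # {1: 8, 2: 15, 3: 21}
tail = sum(W[w] * comb(n, w) * p**(n - w) * (1 - p)**w for w in range(1, 4))
print(tail)              # about 10.05 with the ceilings (9.64 without)
print(2 + tail)          # expected codeword length: about 12.05 bits
```

Note that the four weight classes carry total probability $\sum_{w=0}^{3}\binom{n}{w}p^{n-w}(1-p)^w \approx 0.86$, i.e. the scheme covers exactly the typical set and ignores atypical sequences, as in the question.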