For a)
$$z =\frac{ \bar{x} - \mu }{ \frac{\sigma }{ \sqrt n}}$$
I have deciphered that sample mean is $$\frac{20 + 23 + 21 + 22}{ 4} = 21.5$$
I came up with $1.29099..$ for Standard Deviation.
Sample size is $4$ since $4$ sharks
$$z =\frac{ 21.5 - 20 }{ \frac{1.209 }{ \sqrt 4}}$$
I came up with $2.168870...$ for the test statistic.
For critical value of $z$, I used the given alpha to find the value of $2.326$ from a confidence interval of 98%.
What are the answers and what am I doing wrong ?

@Flounderer's Answer is correct (+1). This is for clarification, intuition, and verification.
You cannot do a z-test here. The population standard deviation $\sigma$ is not known, and so has to be estimated by the sample standard deviation $S = 1.291.$ So your test statistic is $$T = \frac{\bar X - \mu_0}{S/\sqrt{n}}.$$ You should not write $\sigma$ in the denominator because you do not know it.
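As a quick check of the arithmetic, here is a minimal sketch that computes $T$ from the four lengths given in the question, using only Python's standard library. The variable names (`lengths`, `mu0`, `t_stat`) are my own choices for illustration.

```python
import math
import statistics

# Shark lengths (ft) from the question; mu0 is the hypothesized mean.
lengths = [20, 23, 21, 22]
mu0 = 20

n = len(lengths)
xbar = statistics.mean(lengths)   # sample mean
s = statistics.stdev(lengths)     # sample SD (divides by n - 1)

# One-sample t statistic: (xbar - mu0) / (S / sqrt(n))
t_stat = (xbar - mu0) / (s / math.sqrt(n))

print(f"xbar = {xbar}, S = {s:.4f}, T = {t_stat:.4f}")
```

Note that `statistics.stdev` uses the $n - 1$ divisor, which is what the t statistic requires.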
The critical value for t in this one-sided test cuts off 1% from the upper tail of Student's t distribution with 3 degrees of freedom. So the critical value is 4.541. You should look in the row for 3 degrees of freedom of the printed t table in your text to find that value.
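If you do not have a table at hand, the value 4.541 can be verified directly. The sketch below assumes the closed-form CDF of Student's t distribution that holds for exactly 3 degrees of freedom, $F(t) = \tfrac12 + \tfrac1\pi\left[\frac{u}{1+u^2} + \arctan u\right]$ with $u = t/\sqrt 3$; the helper name `t3_cdf` is hypothetical, and the formula does not carry over to other degrees of freedom.

```python
import math

def t3_cdf(t):
    """CDF of Student's t with 3 degrees of freedom (closed form, df = 3 only)."""
    u = t / math.sqrt(3)
    return 0.5 + (u / (1 + u * u) + math.atan(u)) / math.pi

# Upper-tail probability beyond the tabled critical value 4.541
upper_tail = 1 - t3_cdf(4.541)
print(f"P(T > 4.541) = {upper_tail:.4f}")
```

The upper-tail probability should come out very close to 0.01, which is what "cuts off 1% from the upper tail" means.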
Here is a printout of this one-sided one-sample t test from Minitab 17 statistical software.
The P-value 0.051 indicates that this null hypothesis could not (quite) be rejected at the 5% level because the P-value is slightly above 0.050. But you want to test at the 1% level and the decision not to reject is nowhere near the borderline.
In terms of the critical value 4.541 from the t table, you cannot reject because the computed value of the T statistic is 2.32. You could reject at the 1% level only if $T > 4.541.$
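Both ways of reaching the decision (P-value and critical value) can be reproduced in a few lines. This is a sketch under the same assumption as before: the closed-form CDF is valid only for 3 degrees of freedom, and the names `t3_cdf`, `p_value`, and `reject_1pct` are illustrative, not from any library.

```python
import math
import statistics

def t3_cdf(t):
    """CDF of Student's t with 3 degrees of freedom (closed form, df = 3 only)."""
    u = t / math.sqrt(3)
    return 0.5 + (u / (1 + u * u) + math.atan(u)) / math.pi

lengths = [20, 23, 21, 22]   # data from the question
mu0 = 20                     # H0: mu <= 20 vs Ha: mu > 20

n = len(lengths)
t_stat = (statistics.mean(lengths) - mu0) / (statistics.stdev(lengths) / math.sqrt(n))

# One-sided P-value: probability of a T at least this large under H0
p_value = 1 - t3_cdf(t_stat)

t_crit = 4.541                      # 1% critical value from the t table, 3 df
reject_1pct = t_stat > t_crit       # critical-value rule
reject_5pct = p_value < 0.05        # P-value rule at the 5% level

print(f"T = {t_stat:.2f}, P-value = {p_value:.3f}")
print(f"Reject at 1%? {reject_1pct}   Reject at 5%? {reject_5pct}")
```

The two rules always agree at a given level; they are shown at different levels here only to mirror the discussion above.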
Just based on intuition, you should not expect to reject $H_0: \mu \le 20$ against the alternative $H_a: \mu > 20.$ One of the sharks captured is 20 ft long and the other three are only a little longer. Expensive and dangerous or not, catching just four sharks is not enough.
According to a more advanced computation, if the true mean length of sharks off Bermuda were 22 ft (with an SD around 1.3), then a sample size of at least $n = 6$ would have been required in order to have a reasonable chance of detecting that they average over 20 ft.