This problem comes from R. Hartley & A. Zisserman, Multiple View Geometry in Computer Vision, page 171.
With the camera center at infinity, projecting a point in 3D projective space to its image point in a 2D projective plane can be modeled by an affine camera. In particular, in the orthographic projection case, the camera model is $$\begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&0&1\end{bmatrix} \begin{bmatrix}R & \mathbf t \\ {\mathbf0}^T & 1\end{bmatrix}_{4 \times 4}$$ Here $R$ is a $3 \times 3$ rotation matrix; write $R$ as $\begin{bmatrix}r_1^T \\ r_2^T \\r_3^T \end{bmatrix}$. So the camera matrix becomes $$\begin{bmatrix}r_1^T & t_1 \\r_2^T & t_2 \\ \mathbf 0^T & 1 \end{bmatrix}_{3 \times 4}$$ My question is: how can one see that the above transformation has 5 degrees of freedom? In particular, I cannot understand why the matrix block $\begin{bmatrix}r_1^T \\ r_2^T \end{bmatrix}$ contributes 3 degrees of freedom.
Any 3d rotation has three real degrees of freedom. You can picture this in various ways: three Euler angles, simultaneous rotations around three axes, or similar for other rotation formalisms.
With a camera, you can picture the camera pointing at any point on the sky sphere, and you can describe that point with two coordinates, e.g. latitude and longitude. But even looking at the same point you can roll the camera around the optical axis, which adds the third parameter.
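Removing one of the rows of the matrix doesn't really change the information contained therein: any row of a rotation matrix is, up to sign, the cross product of the other two, and the sign is fixed by $\det R = 1$ (for example, $r_3 = r_1 \times r_2$). So you still need all three degrees of freedom for the $2\times3$ submatrix you described. Equivalently, its six entries are subject to three constraints ($\|r_1\| = 1$, $\|r_2\| = 1$, $r_1 \cdot r_2 = 0$), which leaves $6 - 3 = 3$. Together with $t_1$ and $t_2$, that gives the 5 degrees of freedom of the camera matrix. If you go down to a single row, you lose one degree of freedom, leaving two, since a unit-length vector has one degree of freedom fewer than it has elements.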
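If it helps to see this numerically, here is a minimal sketch in plain Python. It assumes one particular Euler-angle convention, $R = R_z(\text{yaw})\,R_y(\text{pitch})\,R_x(\text{roll})$, and the function names are mine, not from the book. It builds a rotation from three angles, throws away the third row, and recovers it from the first two with a cross product.

```python
import math

def rotation_from_angles(yaw, pitch, roll):
    """Return R = Rz(yaw) * Ry(pitch) * Rx(roll) as a list of three rows.

    The Z-Y-X convention is an assumption made for this illustration;
    any other three-parameter description of a rotation would do.
    """
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]

def cross(a, b):
    """Cross product of two 3-vectors given as lists."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]

def dot(a, b):
    """Dot product of two 3-vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

# Three angles in, full rotation matrix out.
R = rotation_from_angles(0.3, -0.7, 1.1)
r1, r2, r3 = R

# Keep only the 2x3 block [r1; r2] and rebuild the dropped row.
r3_recovered = cross(r1, r2)
max_error = max(abs(a - b) for a, b in zip(r3, r3_recovered))

# The three constraints that cut the six entries of the block down to 3 DOF.
constraints = [dot(r1, r1) - 1.0, dot(r2, r2) - 1.0, dot(r1, r2)]

print("max |r3 - r1 x r2| =", max_error)
print("constraint residuals =", constraints)
```

Both the recovery error and the constraint residuals should be zero up to floating-point rounding, which is the numerical counterpart of saying that the $2\times3$ block carries exactly as much information as the full rotation.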