Intrinsic camera parameters from three vanishing points


I would like to recover the intrinsic parameters of a camera from three vanishing points in a digital image. I read the relevant part of the well-known textbook Multiple View Geometry in Computer Vision (p. 226), but I have not been able to get a satisfactory result. Specifically, I need to recover the matrix $K$ with three unknowns: the focal length $f$, and $u$ and $v$. This is a simplified case where $K$ has the form: $$ K = \begin{bmatrix}f & 0 & u\\0 & f & v\\0 & 0 & 1\end{bmatrix} $$ I start from three vanishing points in pixel coordinates, computed as the intersections of three sets of parallel lines. In homogeneous form they are: $$ V_1 = \begin{bmatrix}x_1\\y_1\\1\end{bmatrix}= \begin{bmatrix}212\\2138\\1\end{bmatrix}, V_2 = \begin{bmatrix}x_2\\y_2\\1\end{bmatrix}= \begin{bmatrix}-49\\42\\1\end{bmatrix}, V_3 = \begin{bmatrix}x_3\\y_3\\1\end{bmatrix}= \begin{bmatrix}1105\\146\\1\end{bmatrix} $$

The authors suggest considering the matrix $\omega$:

$$ \omega = \begin{bmatrix}\omega_1 & 0 & \omega_2\\0 & \omega_1 & \omega_3\\\omega_2 & \omega_3 & \omega_4\end{bmatrix} $$ and recovering its entries from the orthogonality of the vanishing points, stacking three equations as constraints:

$$V_1^T \omega V_2=0$$

$$V_1^T \omega V_3=0$$

$$V_3^T \omega V_2=0$$ This generates a homogeneous linear system of the form $A_{3,4}\, \omega_{4,1} = 0_{3,1}$, where the subscripts denote dimensions and $\omega_{4,1}$ stacks the unknowns $\omega_1, \dots, \omega_4$. Then $\omega$ can be computed, up to scale, as the null space of $A$. Finally, $K$ can be recovered by a Cholesky factorization, since $\omega = (KK^T)^{-1}$.
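To make the linear step concrete, here is a minimal sketch in Python with NumPy (my assumption; any tool with an SVD would do). With the form of $\omega$ above, each constraint expands to $V_i^T \omega V_j = \omega_1 (x_i x_j + y_i y_j) + \omega_2 (x_i + x_j) + \omega_3 (y_i + y_j) + \omega_4 = 0$, which gives one row of $A$. The function names `constraint_row` and `omega_from_vanishing_points` are made up for illustration, and the sketch covers only the null-space step, not the Cholesky factorization.

```python
import numpy as np


def constraint_row(vi, vj):
    """Coefficients of (w1, w2, w3, w4) in the constraint vi^T * omega * vj = 0.

    vi and vj are (x, y) pixel coordinates of two vanishing points.
    """
    xi, yi = vi
    xj, yj = vj
    return [xi * xj + yi * yj, xi + xj, yi + yj, 1.0]


def omega_from_vanishing_points(v1, v2, v3):
    """Hypothetical helper: omega (scaled so omega[2, 2] == 1) from three
    vanishing points of mutually orthogonal directions."""
    # Stack the three orthogonality constraints into the 3x4 matrix A.
    A = np.array(
        [constraint_row(v1, v2), constraint_row(v1, v3), constraint_row(v3, v2)],
        dtype=float,
    )
    # The null space of A is spanned by the last right singular vector.
    _, _, vt = np.linalg.svd(A)
    w1, w2, w3, w4 = vt[-1]
    omega = np.array([[w1, 0.0, w2], [0.0, w1, w3], [w2, w3, w4]])
    # omega is only defined up to scale, so fix the scale with omega[2, 2] = 1.
    return omega / w4


# Vanishing points from the question, in pixel coordinates.
V1, V2, V3 = (212, 2138), (-49, 42), (1105, 146)
omega = omega_from_vanishing_points(V1, V2, V3)
print(omega)
```

Normalizing by $\omega_4$ is just one way to fix the free scale; it assumes $\omega_4 \neq 0$.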

Here is my $\omega$ matrix: $$ \omega = \begin{bmatrix}1.6372\cdot 10^{-6} & 0 & -6.2870\cdot 10^{-4} \\0 & 1.6372\cdot 10^{-6} & -4.7153\cdot 10^{-4}\\-6.2870\cdot 10^{-4} & -4.7153\cdot 10^{-4} & 1\end{bmatrix} $$ which gives me

K = chol(inv(w));

$$ K = \begin{bmatrix}921 & 193 & 0.970\\ 0 & 841 & 0.396 \\0 & 0 & 1\end{bmatrix} $$ Unfortunately, the result is not what I expected: the $K$ matrix has a non-zero entry at position (1,2), and the entries at (1,1) and (2,2) differ. I think I made a mistake somewhere along the way. Could you help me find it? Could it be that pixel coordinates are not suitable for this procedure?

I link here a picture representing the scene: [image]