Given a set of 3D points such that each point $p_i^j$ is the point $p_i^k$ rotated about some unknown axis by a known angle $\alpha_j^k$, the goal is to find the axis of rotation.
For three points $p_i^j$ with $j=1,2,3$ we could compute the circle passing through them (without using the angles) to solve the problem. However, the points are noisy, and I am interested in a solution that minimizes the transformation error of the obtained axis.
What methods are appropriate to compute the axis, with or without using the rotation angles? Since the angles are known very precisely, using them should improve the result.
Since this is a rather general problem, below I outline some specifics of my instance of it:
The points are identified by circular markers on a flat board which is placed on a high-accuracy rotary table with a typical absolute accuracy of 0.006 degrees. The points are at most 200 mm from the rotation axis. Given the accuracy of the table, this yields an uncertainty of 20 µm measured along the circle of rotation. This is roughly a factor of 10 better than the measurement precision of the marker points (which may not be independent of the rotation angle, since rotating changes the angle under which the circular marks are observed by the measurement device). Currently I take three measurements by driving the rotary table to some angular position $\theta_j$ in its own coordinate system. Hence we have $\alpha_j^k=\theta_k-\theta_j$ for $j,k \in \{1,2,3\}$. Typically I see around $400$ corresponding points in each pair of rotated frames. Usually we use $\theta_{j+1} = \theta_j + \Delta$ with $\Delta = 15°$.
Thanks to the inspiring post by "Nominal Animal", I came up with the following two-stage solution:
Compute the rotary table's coordinate system using the SVD, exploiting the property that all vectors $p_i^k - p_i^j$ are perpendicular to the axis of rotation. Thus, create a matrix $A$ with a row $p_i^k - p_i^j$ for each corresponding pair of points. Since every row is (up to noise) perpendicular to the axis, the axis is the right singular vector of $A$ associated with the smallest singular value. Using the SVD $A=USV^T$, the columns of $V$ form an orthonormal basis of the rotary table's coordinate system, with the last column along the axis.
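A minimal numpy sketch of this first step (the function name `estimate_axis` and the array shapes are my own assumptions, not from the post). Because each difference vector is perpendicular to the axis, the axis lies in the near-null space of $A$, i.e. it is the right singular vector belonging to the smallest singular value:

```python
import numpy as np

def estimate_axis(p_a, p_b):
    """Estimate the rotation-axis direction from corresponding point sets.

    p_a, p_b: (N, 3) arrays of corresponding points before/after rotation.
    Returns a (3,) unit vector along the axis and the orthonormal basis V
    (columns) of the rotary table's coordinate system.
    """
    A = p_b - p_a                # each row is perpendicular to the axis
    _, _, Vt = np.linalg.svd(A)  # A = U S V^T, singular values descending
    axis = Vt[-1]                # right singular vector of the smallest
                                 # singular value: the axis direction
    return axis, Vt.T
```

Note that `np.linalg.svd` returns singular values in descending order, so the last row of $V^T$ is the direction least represented in the difference vectors, which is exactly the axis.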
Compute a point $c$ in space that supports the line of the rotation axis. To this end, let the rotary table rotate about its $z$-axis and let $R_j^k$ be the rotation matrix by $\alpha_j^k$ in the rotary table's coordinate system. Given the matrix $V$ from the first step, for each pair of corresponding points we have $$R_j^k V^T (p_i^j - c) = V^T (p_i^k - c)$$ Letting $M_j^k = V R_j^k V^T$, we can rewrite this identity as a system of linear equations that can be solved for $c$: $$(I_3 - M_j^k)\,c = p_i^k - M_j^k p_i^j$$
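A sketch of the second step in numpy, stacking the equation $(I_3 - M)\,c = p_i^k - M p_i^j$ over all point pairs for a single world-frame rotation $M = V R V^T$ (the function name and the restriction to one angle pair are my simplifications). The stacked system is rank-deficient along the axis direction, so `lstsq` returns the minimum-norm point on the axis:

```python
import numpy as np

def estimate_axis_point(p_a, p_b, M):
    """Solve the stacked system (I - M) c = p_b_i - M p_a_i for c.

    M is the rotation in world coordinates (M = V R V^T in the post's
    notation). Since c is only determined up to translation along the
    axis, lstsq yields the minimum-norm point on the axis line.
    """
    n = p_a.shape[0]
    A = np.tile(np.eye(3) - M, (n, 1))     # stack (I - M) once per pair
    b = (p_b - p_a @ M.T).reshape(-1)      # p_b_i - M p_a_i, flattened
    c, *_ = np.linalg.lstsq(A, b, rcond=None)
    return c
```

With several angle pairs, one would stack the blocks $(I_3 - M_j^k)$ and the corresponding right-hand sides into one larger system in the same way.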
This works well, since both steps yield a least-squares-optimal solution via the SVD. We only use systems of linear equations, and it may be possible to combine the two steps into one; however, I have not figured that out yet.
This is not an answer, but a few notes on the properties of the problem at hand.
I've worked with a rather similar problem, except with $\theta$ being unknown too: estimating the rotation of atom clusters in molecular dynamics simulations (for the purpose of rotation cancellation). With certain temperature control schemes, this would be extremely useful. While there is no observational error, there is considerable noise in the positions due to a finite temperature, and often with atomic interactions at the outer layers of the cluster. It is often very hard to determine whether an atom belongs to the cluster or not, unless you have several observations. Finally, memory restrictions are sometimes such that one cannot store the coordinates for several observations; otherwise, the overall "costs" are too high, and simply using a different thermal control scheme, one where cluster rotation is not induced, is a better solution.
Let the unit vector describing the axis of rotation be $$\hat{u} = \left ( u_x, u_y, u_z \right )$$ and the angle of rotation $\theta$. For simplicity, let's also use $$\begin{align}c &= \cos\theta \\ s &= \sin\theta \\ n &= 1 - \cos\theta \end{align}$$ The matrix describing the rotation is then $$\mathbf{R} = \begin{bmatrix} u_x u_x n + c & u_x u_y n - u_z s & u_x u_z n + u_y s \\ u_y u_x n + u_z s & u_y u_y n + c & u_y u_z n - u_x s \\ u_z u_x n - u_y s & u_z u_y n + u_x s & u_z u_z n + c \end{bmatrix}$$ and the overall problem is to fit $\hat{u}$ into $$\vec{q}_i = \mathbf{R} \vec{p}_i + \vec{T} + \vec{\epsilon}_i$$ where $\vec{T}$ is the translation keeping some point $\vec{o}$ stationary, $\vec{T} = \vec{o} - \mathbf{R}\vec{o}$, i.e. $$\vec{q}_i = \mathbf{R} \left ( \vec{p}_i - \vec{o} \right ) + \vec{o} + \vec{\epsilon}_i$$ and $\vec{\epsilon}_i$ are small unknown errors in the observations.
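For reference, here is the rotation matrix above transcribed directly into numpy (a straightforward Rodrigues-formula helper; the function name is mine):

```python
import numpy as np

def rotation_matrix(u, theta):
    """Rotation matrix about unit axis u by angle theta, term-by-term
    identical to the component form given above (c = cos, s = sin,
    n = 1 - cos)."""
    ux, uy, uz = u
    c, s = np.cos(theta), np.sin(theta)
    n = 1.0 - c
    return np.array([
        [ux*ux*n + c,    ux*uy*n - uz*s, ux*uz*n + uy*s],
        [uy*ux*n + uz*s, uy*uy*n + c,    uy*uz*n - ux*s],
        [uz*ux*n - uy*s, uz*uy*n + ux*s, uz*uz*n + c   ],
    ])
```

A quick sanity check is that $\mathbf{R}\hat{u} = \hat{u}$ (the axis is fixed) and that $\mathbf{R}$ is orthogonal with determinant $1$.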
If we assume all errors $\vec{\epsilon}_i$ have the same distribution, say uniform within some range, or perhaps Gaussian, then the errors in the observations of points $\vec{p}_i$ closest to the axis affect the result the most, because a fixed positional error there corresponds to a larger angular error. Because of that, you'll want to add a weight $w_i$ to each point for fitting, using some monotonic function that is zero at the axis and increases with the distance from point $\vec{p}_i$ to the axis.
If we assume point $\vec{o}$ is on the axis, then the distance from $\vec{p}_i$ to the axis is $$d_i = \left\lvert \hat{u} \times \left ( \vec{o} - \vec{p}_i \right ) \right\rvert$$ and I personally would start with $w_i = d_i^2$.
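The suggested weight is a one-liner in numpy (a sketch, assuming $\hat{u}$ is a unit vector and $\vec{o}$ a point on the axis; the function name is mine):

```python
import numpy as np

def weights(u, o, p):
    """w_i = d_i^2, where d_i = |u x (o - p_i)| is the distance from
    point p_i to the axis through o with unit direction u.

    u: (3,) unit axis direction, o: (3,) point on axis, p: (N, 3) points.
    """
    d = np.cross(u, o - p)                 # broadcasts over the rows of p
    return np.einsum('ij,ij->i', d, d)     # squared row norms, i.e. d_i^2
```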
Why not use $$\Delta_i = \left\lvert \vec{q}_i - \vec{p}_i \right\rvert$$ and find the axis (where $\Delta_i = 0$)?
That should actually work if $\theta$ is small, so that the straight-line distance is a good approximation of the length of the circular arc traced by point $i$. The larger $\theta$ is (in magnitude, up to $180°$, of course), the larger the difference between the arc length and the endpoint distance. For a sensible fitting model, the two should be approximately equal for the fit to correctly locate the axis (at $\Delta_i = 0$).
For small $\theta$, one could actually simply assume $$\Delta_i = \left\lvert \vec{q}_i - \vec{p}_i \right\rvert + \epsilon_i$$where $\epsilon_i$ is some signed random number, with the same distribution as the radial component in the observational error for point $i$.
Again, this means that larger $\Delta_i$ should be weighted more heavily than smaller $\Delta_i$ when fitting (to find the axis, where $\Delta_i = 0$).
An iterative scheme, where the known angle and the axis from the previous step are used to calculate the expected displacement for point $i$, would allow one to estimate the amount of error in the observations for each point -- in fact, the error vectors $\vec{\epsilon}_i$ themselves. If we know or assume some distribution for them, we could use the error in the distribution to adjust the axis location. As long as the adjustment results in a better error distribution, this should improve the solution.
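The quantity such an iterative loop would monitor is the set of per-point error vectors $\vec{\epsilon}_i$ for a candidate axis. A numpy sketch of that residual computation, under my own naming (`residuals`, with `u`, `o` the candidate unit axis and axis point), using the rotation matrix in the component form given earlier:

```python
import numpy as np

def residuals(u, o, theta, p, q):
    """Error vectors eps_i = q_i - (R (p_i - o) + o) for a candidate
    axis (unit direction u, point o) and the known rotation angle.

    p, q: (N, 3) arrays of corresponding points before/after rotation.
    """
    ux, uy, uz = u
    c, s = np.cos(theta), np.sin(theta)
    n = 1.0 - c
    R = np.array([[ux*ux*n + c,    ux*uy*n - uz*s, ux*uz*n + uy*s],
                  [uy*ux*n + uz*s, uy*uy*n + c,    uy*uz*n - ux*s],
                  [uz*ux*n - uy*s, uz*uy*n + ux*s, uz*uz*n + c   ]])
    return q - ((p - o) @ R.T + o)
```

An outer loop could then adjust `u` and `o` (e.g. by weighted least squares or a generic minimizer over the residual norms) and keep the adjustment whenever the residual distribution improves.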