Okay, I need to develop an alorithm to take a collection of 3d points with x,y,and z components and find a line of best fit. I found a commonly referenced item from Geometric Tools but there doesn't seem to be a lot of information to get someone not already familiar with the method going. Does anyone know of an alorithm to accomplish this? Or can someone help me work the Geometric Tools approach out through an example so that I can figure it out from there? Any help would be greatly appreciated.
Best Fit Line with 3d Points
19.1k Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail AtThere are 3 best solutions below
On
In $\mathbb{R}^3$ a line can be defined by the following set of points
$$ \{\vec \ell = \vec a + \vec n \, t \, : \, t \in \mathbb R\} $$
where $\vec a$ is some point on the line, and $\vec n$ is (normalized) line direction.
To find $\vec a$ from given set of points $\{(x_i,y_i,z_i)\}$ is quite easy, just calculate average position $(\overline x, \overline y, \overline z)$
$$ \vec a = (\overline x, \overline y, \overline z) = \frac 1N \sum_i (x_i, y_i, z_i). $$
To find line direction $\vec n$, one can solve eigen problem $K\,\vec v_k = k\,\vec v$ for the covariance matrix, expressed by
$$ \text{cov}(x,y,z) = K =\left[ \begin{matrix} \overline{x^2} - \overline x^2 & \overline{xy} - \overline x\, \overline y & \overline{xz} - \overline x\,\overline z\\[0.5em] \overline{xy} - \overline x\,\overline y & \overline{y^2} - \overline y^2 & \overline{yz} - \overline y\,\overline z\\[0.5em] \overline{xz} - \overline{x}\,\overline{z} & \overline{yz} - \overline y\,\overline z & \overline{z^2} - \overline z^2 \end{matrix}\right], $$
and then by selecting eigenvector $\vec v_k$, which corresponds to the largest eigenvalue $k$, one finds the $\vec n = \text{max}_k\,\vec v_k$.
An example code (C++) can be found here.
On
I think that I have found an alternative and easy approach. A 3D straight line would have:
xi=a.q+x'
yi=b.q+y'
zi=c.q+z'
here the variable a, b, c are the normal direction vector of 3D line. Each point on the line are (xi, yi, zi).
when average of all these points are taken for xi, yi, and zi separately; the xbar, ybar, zbar may be accepted as (x', y', z'). Now substract the averages from xi, yi, zi values to convert the equations into
xi=a.q
yi=b.q
zi=c.q
Now, all poins are alingned around (0, 0, 0) point. Therefore, all xi, yi, zi values are constant replicates of each other.
In other words, yi = C * xi; and zi = D * xi and zi = E * xi etc.
These ratios would provide us the direction vector of the line.
Just take average of all yi/xi.
Then take average of all zi/xi.
These two ratios will be the imperfect normal vector by assuming x direction value is one. i.e., (1, average(yi/xi), average(zi/xi)) is the direction vector.
At this stage, this vector's amplitude can be calculated by A=sqrt(1+ average(yi/xi)^2+average(zi/xi)^2)
The direction vector will be (1, average(yi/xi), average(zi/xi))/A = (a, b, c)
The line will pass from point (x', y', z')
Therefore, the equation of the best fit line is
xi=a.q+x'
yi=b.q+y'
zi=c.q+z'
I hope you find this easy to understand and useful. Ahmet Turer.
Here's how you can do a principal component analysis:
Let $X$ be a $n\times 3$ matrix in which each $1\times3$ row is one of your points and there are $n$ of them.
Let $\displaystyle (\bar x,\bar y,\bar z) = \frac 1 n \sum_{i=1}^n (x_i,y_i,z_i)$ be the average location of your $n$ points. Replace each row $(x_i,y_i,z_i)$ with $(x_i,y_i,z_i) - (\bar x,\bar y,\bar z)$. This is a linear transformation that amounts to replacing $X$ with $PX$, where $P$ is the $n\times n$ matrix in which every diagonal entry is $1-(1/n)$ and ever off-diagonal entry is $-1/n$. Note that $PX\in\mathbb R^{n\times 3}$. Notice that $P^2 = P = P^T$, so that $P^T P = P$.
Now look at the $3\times3$ matrix $(PX)^T PX = (X^T P^T) (P X) = X^T (P^T P) X = X^T P X$.
This matrix $X^T PX$ is $n$ times the matrix of covariances of the three variables $x$, $y$, and $z$. It is a symmetric positive definite matrix (or nonnegative definite, but not strictly positive definite, if the $n$ points all lie in a common plane). Since this matrix symmetric and positive definite, the spectral theorem tells us that it has a basis of eigenvectors that are mutually orthogonal, and the eigenvalues are nonnegative. Let $\vec e$ be the (left) eigenvector with the largest of the three eigenvalues. The the line you seek is $$ \left\{ (\bar x,\bar y,\bar z) + t\vec e\ : \ t\in\mathbb R \right\} $$ where $t$ is a parameter that is different at different points on the line, and $t=0$ at the average point $(\bar x,\bar y,\bar z)$.