Given a set of $N$ lines $\mathbf{L}$, each defined by a point $\mathbf{a}$ and a unit direction vector $\hat{\mathbf{d}}$, I have the following function:
$$ C = \frac{1}{N}\sum_{i=1}^{N}\Big(\| \mathbf{c} - \mathbf{a_i} \|^2 - [(\mathbf{c} -\mathbf{a_i})\cdot \mathbf{\hat d_i}]^2\Big) $$
You may recognize this as the average of the squared distances from a point $\mathbf{c}$ to a set of lines.
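For concreteness, here is a small numerical sketch of this cost (using NumPy; the function name and array layout are my own):

```python
import numpy as np

def line_cost(c, a, d):
    """Average squared distance from point c to the lines (a_i, d_i).

    a, d : (N, 3) arrays of line points and unit directions.
    """
    r = c - a                           # rows are c - a_i
    proj = np.einsum('ij,ij->i', r, d)  # (c - a_i) . d_i
    return np.mean(np.sum(r * r, axis=1) - proj**2)
```

For example, for the $x$- and $y$-axes the cost at the origin is $0$, and at $(0,0,1)$ it is $1$.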
I want to solve for $\frac{\partial C}{\partial \mathbf{a_j}}$ and $\frac{\partial C}{\partial \mathbf{\hat{d_j}}}$ under the constraint that $\frac{\partial C}{\partial\mathbf{c}} = 0$.
I am having trouble solving for these partial derivatives because $\mathbf{c}$ cannot be treated as a constant: it is a function of all of the $\mathbf{a_i}$ and $\mathbf{\hat{d_i}}$ describing the lines in $\mathbf{L}$.
If it helps, I arrived at this equation trying to solve the following problem:
"Given a set of $N$ lines $\mathbf{L}$, each defined by a point $\mathbf{a}$ and a unit direction vector $\hat{\mathbf{d}}$, that do not all intersect in a single point, how does the minimum distance between the lines change as the $\mathbf{a}$ and $\hat{\mathbf{d}}$ are varied? Here the minimum distance between the $N$ lines is defined as the average distance from the lines to the point $\mathbf{c}$, where $\mathbf{c}$ is the point that is closest, on average, to all $N$ lines."
What I'm really trying to figure out is a measure of how "close" a set of lines are to intersecting perfectly, and how changing the line parameters slightly affects that "closeness."
If there is any better way to solve this problem than what I have proposed, please let me know. Thanks!
Let $A$ be the matrix whose columns are the $\{a_k\}$ vectors, and $D$ be the matrix whose columns are $\{d_k\}$.
Next define a matrix $$C=\sum_k ce_k^T$$ all of whose columns are equal to the vector $c$.
Finally define the matrix $M=(A-C)$.
Given the standard basis vectors $\{e_k\}$ we can write $$ \eqalign { a_k &= Ae_k, &\,\,\,\,d_k = De_k, &\,\,\,\,(a_k-c) = Me_k \cr } $$
Since $C$ has been used to represent a matrix, let's use the symbol $L$ for the cost function multiplied by $N$ (not to be confused with the set of lines $\mathbf{L}$). Let's write it in terms of the above matrices and find its differential.
$$ \eqalign { L &= \sum_k Me_k:Me_k - (Me_ke_k^TD^T)^T:(Me_ke_k^TD^T) \cr\cr dL &= 2\,\sum_k Me_k:dM\,e_k - (Me_ke_k^TD^T)^T:d(Me_ke_k^TD^T) \cr &= 2\,\sum_k ME_{kk}:dM - (DE_{kk}M^T):(dM\,E_{kk}D^T+ME_{kk}\,dD^T) \cr &= 2\,\sum_k (ME_{kk}- DE_{kk}M^TDE_{kk}):dM - (E_{kk}M^TDE_{kk}M^T):dD^T \cr &= 2\,\sum_k (ME_{kk}- DE_{kk}M^TDE_{kk}):(dA-dC) - (ME_{kk}D^TME_{kk}):dD \cr } $$ where the colon denotes the double-contraction product, i.e. $$A:B={\rm tr}(A^TB)$$ Now account for the dependence of $c$ on the line parameters. Since $dC=\sum_k dc\,e_k^T$, collecting the $dC$ terms shows that their total contribution to $dL$ is $\left(\frac{\partial L}{\partial c}\right)^T dc$, which vanishes by virtue of the constraint $\frac{\partial L}{\partial c}=0$ (this is the envelope theorem). So the $dC$ terms can be dropped, and the remaining gradients can be read directly off the differential.
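As an aside, the constraint itself determines $c$ in closed form: differentiating the cost with respect to $c$ directly gives $\sum_k(I-d_kd_k^T)(a_k-c)=0$, a $3\times 3$ linear system. A minimal sketch (assuming the lines are not all parallel, so the system matrix is invertible; the function name is my own):

```python
import numpy as np

def closest_point(a, d):
    """Solve sum_k (I - d_k d_k^T)(a_k - c) = 0 for c.

    a, d : (N, 3) arrays of line points and unit directions.
    Assumes the lines are not all parallel, so the 3x3 system
    matrix is invertible.
    """
    P = np.eye(3) - np.einsum('ni,nj->nij', d, d)  # projectors I - d_k d_k^T
    return np.linalg.solve(P.sum(axis=0), np.einsum('nij,nj->i', P, a))
```

For the skew lines through $(0,0,0)$ along $x$ and through $(0,0,1)$ along $y$, this returns $(0,0,0.5)$, midway between them.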
This leaves the gradients with respect to $A$ and $D$ $$ \eqalign { \frac{\partial L}{\partial a_j} &= 2\,(ME_{jj}- DE_{jj}M^TDE_{jj})\,e_j \cr &= 2\,(I-d_jd_j^T)(a_j-c) \cr\cr \frac{\partial L}{\partial d_j} &= -2\,Me_jd_j^TMe_j \cr &= 2\,(c-a_j)\,d_j^T(a_j-c) \cr } $$ Note that the constraint only forces the sum $\sum_j\frac{\partial L}{\partial a_j}$ to vanish; the individual gradients are generally non-zero. Note also that $d_j$ has been treated as unconstrained here; to keep $\hat d_j$ a unit vector, project the second gradient onto the tangent plane of the unit sphere by multiplying it with $(I-d_jd_j^T)$. To recover your original cost function, divide these expressions by $N$.
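A quick finite-difference sanity check of the $d_j$ gradient, re-solving for $c$ at each perturbed configuration so that the check exercises the full dependence of $c$ on the line parameters (all names here are my own):

```python
import numpy as np

def cost_with_optimal_c(a, d):
    """L = sum of squared line distances, with c re-solved each call."""
    P = np.eye(3) - np.einsum('ni,nj->nij', d, d)
    c = np.linalg.solve(P.sum(axis=0), np.einsum('nij,nj->i', P, a))
    r = c - a
    proj = np.einsum('ij,ij->i', r, d)
    return np.sum(np.sum(r * r, axis=1) - proj**2), c

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 3))
d = rng.normal(size=(4, 3))
d /= np.linalg.norm(d, axis=1, keepdims=True)

_, c = cost_with_optimal_c(a, d)
j, eps = 1, 1e-6
grad = 2 * (c - a[j]) * np.dot(d[j], a[j] - c)  # dL/dd_j from the answer

for i in range(3):  # central differences in each component of d_j
    dp, dm = d.copy(), d.copy()
    dp[j, i] += eps
    dm[j, i] -= eps
    num = (cost_with_optimal_c(a, dp)[0] - cost_with_optimal_c(a, dm)[0]) / (2 * eps)
    assert abs(num - grad[i]) < 1e-5
```

Note that the perturbed $d_j$ is deliberately not renormalized, which matches the unconstrained gradient above.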