Let $\lVert A\rVert = \left(\sum_{i,j=1}^n |a_{ij}|^2\right)^{\frac{1}{2}} = \sqrt{\operatorname{tr}(A^\top A)}$ be the Frobenius norm on $n \times n$ matrices.
Fix $A \in GL_n(\mathbb{R})$.
1) Is there a formula for $\operatorname{dist}(A,O(n))$?
Here $\operatorname{dist}^2(A,O(n)) = \min_{X \in O(n)} \|A - X\|^2$.
($O(n)$ is the orthogonal group, i.e. the matrices satisfying $X^TX=I$. The minimum exists since $O(n)$ is compact.)
Note: in general we don't always have a unique minimizer (i.e. there can be more than one orthogonal matrix closest to $A$ in $O(n)$), at least if we allow non-invertible matrices $A$.
For example, $A=0$ is at the same distance from every element of $O(n)$, since the Frobenius norm of any orthogonal matrix is $\sqrt n$.
Question: Can we prove the minimizer is unique (if $A \in GL_n(\mathbb{R})$)? If so, is the function $GL_n(\mathbb{R}) \to \mathbb{R}^{n^2}$, $A \mapsto X(A)$, where $X(A)$ is the minimizer, smooth? Can we give an explicit formula for it?
I have tried using Lagrange multipliers, but so far with no success. (I couldn't determine whether the gradients of all the $n^2$ constraints are always linearly independent.)
Remark:
\begin{align}
\|A - X\|^2 &= \mathrm{tr}\left( (A-X)^t (A-X) \right) \\
&= \mathrm{tr}\left( (A^t - X^t)(A-X) \right) \\
&= \mathrm{tr}(A^tA - A^tX - X^tA + X^tX) \\
&= \mathrm{tr}(A^tA) - \mathrm{tr}(A^tX) - \mathrm{tr}\left((A^tX)^t\right) + \mathrm{tr}(I) \\
&= n + \mathrm{tr}(A^tA) - 2\,\mathrm{tr}(A^tX)
\end{align}
so minimizing $\|A - X\|$ is equivalent to maximizing $\mathrm{tr}(A^tX)$. (In particular, the objective function to be optimized is linear, not quadratic.)
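As a quick numerical sanity check of this identity (not essential to the question), here is a short sketch assuming NumPy/SciPy; the variable names and the use of `scipy.stats.ortho_group` are my own choices, not part of the argument above.

```python
import numpy as np
from scipy.stats import ortho_group

# Check the identity ||A - X||_F^2 = n + tr(A^t A) - 2 tr(A^t X)
# for a random A and a random orthogonal X.
rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
X = ortho_group.rvs(dim=n, random_state=rng)

lhs = np.linalg.norm(A - X, 'fro') ** 2
rhs = n + np.trace(A.T @ A) - 2 * np.trace(A.T @ X)
print(np.isclose(lhs, rhs))  # expected: True
```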
Let us prove Omnomnomnom's suspicion. Fix $A \in M_n(\mathbb{R})$. Following the OP's observation, we want to maximize the linear functional $\varphi \colon M_n(\mathbb{R}) \rightarrow \mathbb{R}$ given by $\varphi(X) = \mathrm{tr}(A^tX)$ subject to the constraint $X^t X = I$. First, let us assume that $A$ is a diagonal matrix with non-negative entries on the diagonal. Then, given an orthogonal $X$ we have
$$ \varphi(X) = \mathrm{tr}(A^tX) = \sum_{i = 1}^n a_{ii} x_{ii} \leq \sum_{i=1}^n a_{ii} = \mathrm{tr}(A^t) = \varphi(I) $$
(because the columns of an orthogonal matrix are unit vectors, so $-1 \leq x_{ij} \leq 1$ for all $1 \leq i, j \leq n$). Thus, the maximizer (equivalently, the distance minimizer) in this case is just $X = I$. Note that if $a_{ii} > 0$ for all $1 \leq i \leq n$ then the proof shows that $X = I$ is the unique maximizer, while if some $a_{ii} = 0$ there can be infinitely many maximizers.
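If it helps to see the diagonal case numerically, here is a minimal sketch, again assuming NumPy/SciPy; the sampling is only an illustration of the inequality, not a proof.

```python
import numpy as np
from scipy.stats import ortho_group

# Diagonal case: for diagonal A with positive diagonal entries,
# tr(A^t X) over sampled orthogonal X should never exceed tr(A) = phi(I).
rng = np.random.default_rng(1)
n = 4
A = np.diag(rng.uniform(0.5, 2.0, size=n))   # positive diagonal entries
values = [np.trace(A.T @ ortho_group.rvs(dim=n, random_state=rng))
          for _ in range(1000)]
print(max(values) <= np.trace(A))  # expected: True
```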
Now, given an arbitrary $A$, use a singular value decomposition to write $A = U \Sigma V^t$. Then $A^t = V \Sigma U^t$ and, given an orthogonal matrix $X$, we have
$$ \varphi(X) = \mathrm{tr}(V \Sigma U^t X) = \mathrm{tr}(V^t (V \Sigma U^t X) V) = \mathrm{tr}(\Sigma U^t X V) = \mathrm{tr}(\Sigma W(X)) $$
where $W(X) = U^t X V$ is also an orthogonal matrix. When $X$ runs over all orthogonal matrices, so does $W(X)$, and so we have reduced the problem to the previous case: the $X$ for which $W(X) = U^t X V = I$ is a maximizer. Solving for $X$, we get $X = UV^t$. If $A$ is invertible, the matrix $\Sigma$ has strictly positive entries on the diagonal, and so by the previous case the maximizer is in fact unique. This does look surprising: it implies that while the SVD is highly non-unique in general, the product $UV^t$ must be unique, at least when $A$ is invertible. It would be interesting to check this directly.
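Here is a small numerical sketch of this conclusion (assuming NumPy/SciPy; names like `X_opt` are mine): it forms $UV^t$ from `numpy.linalg.svd` and checks that its distance to $A$ beats that of randomly sampled orthogonal matrices.

```python
import numpy as np
from scipy.stats import ortho_group

# Candidate minimizer X = U V^t from an SVD A = U diag(S) V^t;
# compare its distance to A against randomly sampled orthogonal matrices.
rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))          # generically invertible

U, S, Vt = np.linalg.svd(A)              # A == U @ np.diag(S) @ Vt
X_opt = U @ Vt

d_opt = np.linalg.norm(A - X_opt, 'fro')
d_rand = min(np.linalg.norm(A - ortho_group.rvs(dim=n, random_state=rng), 'fro')
             for _ in range(1000))
print(d_opt <= d_rand)  # expected: True
```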
Regarding your question about a formula, you can also decompose $A$ using the polar decomposition $A = OP$, where $O$ is orthogonal and $P$ is symmetric positive semidefinite. Orthogonally diagonalizing $P$ as $P = UDU^t$, you obtain an SVD $A = (OU)DU^t$, and the result above shows that $(OU)U^t = O$ is a minimizer. If $A$ is invertible, then $O$ is determined uniquely and is given by $O = A(\sqrt{A^t A})^{-1}$, where $\sqrt{A^tA}$ is the unique positive definite square root of $A^t A$. However, since $\sqrt{A^t A}$ doesn't really have an "explicit" formula, I'm not sure how much this helps.
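For completeness, a short sketch (assuming SciPy; `scipy.linalg.polar` and `scipy.linalg.sqrtm` are standard routines, but the comparison itself is mine) checking that the polar factor, the SVD product $UV^t$, and the formula $A(\sqrt{A^tA})^{-1}$ all agree for an invertible $A$.

```python
import numpy as np
from scipy.linalg import polar, sqrtm

# Check that the orthogonal polar factor of A = OP agrees with U V^t from
# the SVD and with the closed form A (A^t A)^{-1/2}, for an invertible A.
rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n))

O_polar, P = polar(A)                    # A == O_polar @ P, P >= 0
U, S, Vt = np.linalg.svd(A)
O_svd = U @ Vt
O_formula = A @ np.linalg.inv(sqrtm(A.T @ A))

print(np.allclose(O_polar, O_svd), np.allclose(O_polar, O_formula))
# expected: True True
```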