I am reading the paper "On the Convergence of Adam and Beyond". In this paper, the authors propose the following framework of adaptive methods.

I am confused about the last step: $x_{t+1} = \Pi_{\mathcal{F},\sqrt{V_t}} (\hat{x}_{t+1}).$
The definition of the operation is the following:
For $A\in \mathcal{S}^+_{d}$ (the set of positive definite $d\times d$ matrices) and $y \in \mathbb{R}^d$, $\Pi_{\mathcal{F},A} (y) = \arg\min_{x\in \mathcal{F}} \|A^{1/2}(x-y)\|.$ In the step above, $y = \hat{x}_{t+1}$ and $A = \sqrt{V_t}$.
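To make the definition concrete, here is a minimal numerical sketch. It is an illustration under stated assumptions, not code from the paper: the feasible set is assumed to be a half-space $\mathcal{F} = \{x : c^\top x \le b\}$, $A$ is assumed diagonal, and the function name `weighted_projection_halfspace` is hypothetical. For this particular $\mathcal{F}$ the minimizer has the closed form $x = y - A^{-1}c\,\frac{c^\top y - b}{c^\top A^{-1} c}$ whenever $y$ is infeasible, which follows from the KKT conditions.

```python
import numpy as np

def weighted_projection_halfspace(y, a_diag, c, b):
    """Sketch: argmin over {x : c @ x <= b} of ||A^{1/2} (x - y)||,
    with A = diag(a_diag) and all entries of a_diag positive (assumed)."""
    y = np.asarray(y, dtype=float)
    a_diag = np.asarray(a_diag, dtype=float)
    c = np.asarray(c, dtype=float)
    violation = c @ y - b
    if violation <= 0:
        # y is already feasible, so it is its own projection.
        return y.copy()
    a_inv_c = c / a_diag  # A^{-1} c for a diagonal A
    return y - a_inv_c * violation / (c @ a_inv_c)

y = np.array([2.0, 2.0])   # infeasible point, since 2 + 2 > 2
c = np.array([1.0, 1.0])
b = 2.0

# A = I recovers the ordinary Euclidean projection.
x_euclid = weighted_projection_halfspace(y, np.ones(2), c, b)
# A large weight on coordinate 0 makes moving that coordinate expensive.
x_weighted = weighted_projection_halfspace(y, np.array([100.0, 1.0]), c, b)

print(x_euclid)
print(x_weighted)
```

With $A = I$ the result is the usual Euclidean projection. With unequal weights the projected point still lies on the boundary of $\mathcal{F}$, but it moves less along the heavily weighted coordinate, so the two projections generally differ.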
My understanding is that this is a projection that maps $\hat{x}_{t+1}$ onto the feasible set $\mathcal{F}$. But I don't understand why we need the $A$, i.e. $\sqrt{V_t}$.
Can someone help me understand it? Thank you in advance.