In the paper "Attention is all you need" (Section 3.5), the authors choose the following function to encode the position of a word in a sequence:
$$ PE(\text{pos}, 2i) = \sin\!\left(\text{pos} / 10000^{2i/d_{\text{model}}}\right) $$
For the purposes of this question this function can be simplified to:
$ PE(pos) = sin(pos) $
The text states that "for any fixed offset $k$, $PE(pos+k)$ can be represented as a linear function of $PE(pos)$". This did not seem obvious to me, given the nonlinearity of the sine function. Other resources such as Attention is all you need Explained mention this property but do not go deeper into it.
I attempted to use linear regression in Python to derive such a transform, but was unable to find one that fits. As $k$ increases and the sine waves produced by $PE(pos)$ drift out of sync, the correlation between the transformed values and the ground truth decreases.
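A minimal sketch of the regression I attempted (the offset $k$ and the sample positions are illustrative choices of mine): fitting $PE(pos+k)$ as a scalar linear function of $\sin(pos)$ alone leaves a large residual.

```python
import numpy as np

# Illustrative setup: try to express sin(pos + k) as a linear function
# of the single scalar sin(pos), i.e. y ≈ a*x + b.
k = 5.0
pos = np.arange(0, 100, dtype=float)
x = np.sin(pos)        # simplified PE(pos)
y = np.sin(pos + k)    # simplified PE(pos + k)

# Least-squares fit of y against [x, 1].
A = np.stack([x, np.ones_like(x)], axis=1)
(a, b), residuals, *_ = np.linalg.lstsq(A, y, rcond=None)

print(residuals[0])    # large residual: no scalar linear map fits
```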
Did I misapprehend the statement in the paper, or is my code or my understanding of the underlying math at fault?
Upon closer inspection, the paper defines the function $\operatorname{PE}$ separately for even and odd dimensions as
\begin{eqnarray*} \operatorname{PE}(\text{pos},2d) &=&\sin(\text{pos}/c^d),\\ \operatorname{PE}(\text{pos},2d+1) &=&\cos(\text{pos}/c^d), \end{eqnarray*}
for the constant $c=10000^{\frac{2}{d_{\text{model}}}}$. The angle-addition identities
$$\sin(\alpha+\beta)=\sin(\alpha)\cos(\beta)+\cos(\alpha)\sin(\beta),\qquad \cos(\alpha+\beta)=\cos(\alpha)\cos(\beta)-\sin(\alpha)\sin(\beta)$$
then yield
\begin{eqnarray*} \operatorname{PE}(\text{pos}+k,2d) &=&\operatorname{PE}(\text{pos},2d)\cos(k/c^d)+\operatorname{PE}(\text{pos},2d+1)\sin(k/c^d),\\ \operatorname{PE}(\text{pos}+k,2d+1) &=&\operatorname{PE}(\text{pos},2d+1)\cos(k/c^d)-\operatorname{PE}(\text{pos},2d)\sin(k/c^d). \end{eqnarray*}
For a fixed offset $k$, the coefficients $\cos(k/c^d)$ and $\sin(k/c^d)$ do not depend on $\text{pos}$, so each pair $\bigl(\operatorname{PE}(\text{pos},2d),\operatorname{PE}(\text{pos},2d+1)\bigr)$ is mapped onto its shifted counterpart by a constant $2\times 2$ rotation matrix. This is what the authors mean by a linear function of $\operatorname{PE}(\text{pos})$. Note that the simplification $PE(pos)=\sin(pos)$ discards the cosine components, which is why no fitting linear transform could be found.
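A quick numerical check (a NumPy sketch; the single frequency and the chosen positions are my own illustrative setup, not from the paper) confirms that a constant $2\times 2$ matrix with entries $\cos(k/c^d)$ and $\sin(k/c^d)$ maps each sine/cosine pair exactly onto its shifted counterpart:

```python
import numpy as np

k = 5.0                          # fixed offset
freq = 1.0                       # stands in for 1 / c**d at one dimension d
pos = np.arange(0, 100, dtype=float)

# Stack the sine and cosine components as rows: shape (2, N).
pe = np.stack([np.sin(pos * freq), np.cos(pos * freq)])
pe_shifted = np.stack([np.sin((pos + k) * freq),
                       np.cos((pos + k) * freq)])

# Rotation matrix whose entries depend only on k, not on pos.
M = np.array([[ np.cos(k * freq), np.sin(k * freq)],
              [-np.sin(k * freq), np.cos(k * freq)]])

print(np.allclose(M @ pe, pe_shifted))   # True: the map is exactly linear
```

Once both components are kept, the transform is exact for every position, so no regression is needed at all.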