I know that the discrete Fourier transform of a given data set $x$ is calculated as:\begin{aligned}X_{k}&=\sum _{n=0}^{N-1}x_{n}\cdot e^{-i2\pi kn/N}\end{aligned}and I have no problem with this definition.
I tried this on a simple data set:\begin{aligned}x=\begin{bmatrix}1\\0\\0\\1\end{bmatrix}\end{aligned} and I get:
\begin{aligned}X=\begin{bmatrix}2\\1+i\\0\\1-i\end{bmatrix}\end{aligned}
but when I try this here I get $X/4$ instead of $X$.
I know that when I use the inverse I divide by the number of elements, but that site divides in the forward DFT instead.
My question is: when do I divide by $N$? In the DFT or in the inverse?
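It is purely a matter of convention. The only requirement is that the scaling factors of the transform and the inverse transform multiply to $1/N$, so that applying one after the other returns the original data. Have a look at this question and its answers over at DSP.SE. You can also use the scaling constant $1/\sqrt{N}$ for both the transform and the inverse transform. In that case the DFT becomes a unitary transform.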
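To make this concrete, here is a minimal sketch in plain Python that applies the three conventions to the data set from the question. The helper names `dft`, `idft`, and the `scale` parameter are my own choices for illustration and do not come from any particular library; the sum is evaluated directly from the definition, which is fine for $N=4$ but not how one would compute a large transform.

```python
import cmath

def dft(x, scale=1.0):
    """Forward DFT from the definition, multiplied by `scale` (hypothetical helper)."""
    N = len(x)
    return [scale * sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                        for n in range(N))
            for k in range(N)]

def idft(X, scale=1.0):
    """Inverse DFT from the definition, multiplied by `scale` (hypothetical helper)."""
    N = len(X)
    return [scale * sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N)
                        for k in range(N))
            for n in range(N)]

x = [1, 0, 0, 1]
N = len(x)

# (forward scale, inverse scale); each pair multiplies to 1/N.
conventions = {
    "1/N on the inverse": (1.0, 1.0 / N),
    "1/N on the forward": (1.0 / N, 1.0),
    "1/sqrt(N) on both (unitary)": (N ** -0.5, N ** -0.5),
}

for name, (fwd, inv) in conventions.items():
    X = dft(x, fwd)
    x_back = idft(X, inv)
    # Round to hide floating-point noise when printing.
    print(name)
    print("  X      =", [complex(round(v.real, 6), round(v.imag, 6)) for v in X])
    print("  x_back =", [round(v.real, 6) for v in x_back])
```

All three conventions recover the original $x$ after the round trip. The first one reproduces the $X$ computed by hand in the question, the second one gives $X/4$ as the site does, and only the third one preserves $\sum_n |x_n|^2$ in the transformed data.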