According to Wikipedia, the total variation of a real-valued function $f$, defined on an interval $[a,b]\subset \mathbb{R}$, is the quantity $$V_a^b(f)=\sup_{P\in\mathcal{P}}\sum_{i=0}^{n_P-1}\left | f(x_{i+1})-f(x_i)\right |$$ where $\mathcal{P}= \left \{P=\{x_0,\ldots, x_{n_P}\} \mid P \text{ is a partition of } [a,b]\right \}$.
This is actually the more traditional definition of total variation, found for example in Rudin's Principles of Mathematical Analysis from the 1960s. However, in modern image analysis applications, the total variation is defined differently: $$\text{TV}(f,\Omega)= \sup \, \bigg\{ \int_{\Omega} f\, \operatorname{div} \phi \, dx : \phi \in C_c^{\infty} (\Omega,\mathbb{R}^N), \, \lvert \phi (x) \rvert \leq 1\ \forall x\in \Omega \bigg \} $$ where $\Omega$ is the domain on which we define the TV (unlike in the previous definition, it can have dimension higher than 1).
Are these definitions somehow related? Perhaps equivalent?
The two definitions, at least in the one-dimensional case, are related but not equivalent.
You can see at once that they are not equivalent: the second definition depends only on the equivalence class of $f$ in $L^1$ (it does not change if $f$ is modified on a set of measure zero), whereas the first definition depends strongly on the pointwise values of the function. As an example, the Dirichlet function on $[0,1]$ (i.e. the characteristic function of the rationals) has $V_0^1 = +\infty$, whereas $TV = 0$ because the function vanishes almost everywhere.
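A smaller example may make the dependence on pointwise values more concrete. The function $g$ below is not from the question; it is chosen only for illustration. Let $g$ be the characteristic function of the single point $\tfrac12$ on $[0,1]$. For the partition $\{0,\tfrac12,1\}$ the sum is $$\left|g\!\left(\tfrac12\right)-g(0)\right|+\left|g(1)-g\!\left(\tfrac12\right)\right| = 1+1 = 2,$$ and no partition does better, since a partition that contains $\tfrac12$ contributes exactly $2$ and one that does not contributes $0$. Hence $V_0^1(g)=2$. On the other hand, $g=0$ almost everywhere, so $$\int_0^1 g\,\phi'\,dx = 0 \quad\text{for every admissible } \phi, \qquad\text{and therefore}\qquad TV(g,(0,1))=0.$$ Replacing $g$ by the zero function, which agrees with $g$ almost everywhere, gives pointwise variation $0$, matching the value of $TV$.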
On the other hand, it can be proved that if $f\in L^1(a,b)$ is such that $TV(f, (a,b))$ is finite, then there exists a representative $\tilde{f}$ of $f$ such that $V_a^b(\tilde{f}) = TV(f, (a,b))$.
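As a sanity check in the simplest setting, here is a sketch under the extra assumption (not made above) that $f\in C^1([a,b])$, in which case the two quantities agree without changing the representative. For $\phi\in C_c^\infty((a,b))$ with $|\phi|\le 1$, integration by parts (the boundary terms vanish because $\phi$ has compact support) gives $$\int_a^b f\,\phi'\,dx = -\int_a^b f'\,\phi\,dx \le \int_a^b |f'|\,dx,$$ and choosing $\phi$ to approximate $-\operatorname{sign}(f')$ shows that the supremum is attained in the limit, so $$TV(f,(a,b)) = \int_a^b |f'(x)|\,dx.$$ For the partition definition, the fundamental theorem of calculus gives $$\sum_{i=0}^{n_P-1}\left|f(x_{i+1})-f(x_i)\right| = \sum_{i=0}^{n_P-1}\left|\int_{x_i}^{x_{i+1}} f'(x)\,dx\right| \le \int_a^b |f'(x)|\,dx,$$ and the left-hand side tends to the right-hand side as the partition is refined, so $V_a^b(f)=\int_a^b |f'(x)|\,dx$ as well.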