I'm reading Hindry & Silverman's book on Diophantine Geometry, specifically the section on height functions.
At the beginning they speak of two important properties that a height function $h:V\rightarrow [0,\infty)$ should satisfy: first, the set of points in the complex projective variety $V$ of bounded height should be finite. Second, it should translate geometric properties of points in $V$ to arithmetic properties.
The second property is a natural thing to strive for, if somewhat difficult to quantify. But I don't see any particular reason to ask for the first property to be true. I understand that such a property is fundamental in many finiteness proofs - to show that $V(\mathbb{Q})$ is finite, for example, one can simply show that its points have bounded height.
So, what exactly is the motivation for seeking this property in height functions? Or was the theory actually developed the other way around - geometers knew that they needed functions with this property in order to tackle certain problems?
It might help if you substitute the phrase "complexity of an object" for "height of an object." It's very natural, when studying infinite sets such as the set of points in $\mathbb P^N(K)$ where $K$ is a number field, to try to (roughly) order them by complexity. For any sort of counting result, one wants only finitely many elements whose complexity is smaller than any given bound $B$.

In principle, and I think this is the correct intuition, the height (complexity) of an object $\mathcal{O}$ is equal to (again, somewhat roughly)

$$ \operatorname{height}(\mathcal{O}) = \operatorname{Complexity}(\mathcal{O}) = \text{Number of bits it takes to describe $\mathcal{O}$.} $$

E.g., for a nonzero integer $n\in\mathbb Z$, we have $h(n)=\log|n|$. (Use $\log_2$ if you want, but it won't matter much.)

Here's a typical problem: How many primes are there? The way people approach the problem is to ask how fast the following function grows:

$$ \pi(X) = \#\{ \text{primes $p$} : \operatorname{Complexity}(p) \le X \}. $$

Similarly for $\mathbb P^N(K)$. So heights are not used only to prove finiteness theorems by showing that the elements of a set have bounded height; for infinite sets, they are also used to give an arithmetic (complexity-theoretic) measure of how large the set is, by measuring the growth of the height (complexity) counting function.
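To make the counting-function idea concrete, here is a minimal sketch for the simplest case, $\mathbb P^1(\mathbb Q)$. It assumes the usual height on $\mathbb P^1(\mathbb Q)$: write a point as $[a:b]$ with coprime integers $a,b$ and set $H([a:b])=\max(|a|,|b|)$, $h=\log H$. The function names `height` and `count_points` are made up for this illustration.

```python
from math import gcd, log

def height(a, b):
    """Logarithmic height of [a:b] in P^1(Q); a, b integers, not both zero."""
    g = gcd(a, b)  # divide out the common factor to get coprime coordinates
    return log(max(abs(a), abs(b)) // g)

def count_points(B):
    """Number of points of P^1(Q) with multiplicative height H <= B (B >= 1)."""
    count = 1  # the point at infinity [1:0]
    # Every other point has a unique representative [a:b] with b > 0, gcd(a, b) = 1.
    for b in range(1, B + 1):
        for a in range(-B, B + 1):
            if gcd(a, b) == 1:
                count += 1
    return count

if __name__ == "__main__":
    for B in (1, 2, 3, 10, 100):
        print(B, count_points(B))
```

The count is finite for every bound $B$, which is exactly the first property in the question, and its growth rate in $B$ is the analogue of asking how fast $\pi(X)$ grows.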
Then, when trying to solve problems and prove theorems, one is naturally led to analyze how the complexity of objects changes when one has a map between sets of objects. It would be nice if the complexity always varied in some regular way, but generally that's only true if the maps are nice enough. So it's a balancing act.
For an illustration of this, Weil heights in arithmetic geometry have nice functorial transformation properties for morphisms between smooth (or at least not too singular) projective varieties. But for dominant rational maps, and indeed even for birational automorphisms, heights can behave in ways that are very hard to analyze.
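A small numerical sketch of this contrast, under assumptions of my own choosing: on $\mathbb P^2(\mathbb Q)$ take the height $H([x:y:z])=\max(|x|,|y|,|z|)$ for coprime integer coordinates, compare the degree-2 morphism $[x:y:z]\mapsto[x^2:y^2:z^2]$ with the standard Cremona involution $[x:y:z]\mapsto[yz:xz:xy]$, which is birational but not defined at the three coordinate points. The helper names are hypothetical.

```python
from math import gcd

def normalize(P):
    """Scale integer coordinates so that they are coprime (P must not be all zero)."""
    g = 0
    for c in P:
        g = gcd(g, c)
    return tuple(c // g for c in P)

def H(P):
    """Multiplicative height of a point of P^2(Q) given by integer coordinates."""
    return max(abs(c) for c in normalize(P))

def power_map(P):
    """Degree-2 morphism [x:y:z] -> [x^2:y^2:z^2]."""
    x, y, z = P
    return (x * x, y * y, z * z)

def cremona(P):
    """Cremona involution [x:y:z] -> [yz:xz:xy]; undefined at [1:0:0], [0:1:0], [0:0:1]."""
    x, y, z = P
    return normalize((y * z, x * z, x * y))

if __name__ == "__main__":
    for P in [(2, 3, 5), (1, 1, 7)]:
        print(P, "H =", H(P),
              "| morphism:", H(power_map(P)),
              "| Cremona:", H(cremona(P)))
```

For the morphism, $H$ is squared at every point, so $h(\varphi(P)) = 2h(P)$ exactly. For the Cremona map the two sample points behave differently: at $[2:3:5]$ the height grows from 5 to 15, while at $[1:1:7]$ it stays at 7, and applying the map twice returns the original point and the original height. This is only a toy case, but it shows the kind of irregularity the paragraph above refers to.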