This question will make brief forays beyond pure mathematics and into the world of application, so bear with me. This question also requires some small setup.
First, I have a set of functions,
$$\left\{ f(x,y_1), f(x,y_2),\dots, f(x,y_n) \right\}$$
where each $y_1, y_2, \dots, y_n$ is a single value in a predefined range of $y$. These functions have a domain (in $x$) of (0,$\infty$), and codomain of [0,$\infty$), although in practice the range is typically scaled to lie in [0,1]. In general, $f(x,y)$ is not linear in $y$; however, $f$ is also in practice a "black box" function with no (accessible) analytical representation. By extension, we do not have the ability to (readily) compute $f(x,y)$ for any $y$ not in the set $\{y_1,y_2,...,y_n\}$ - but these solutions do exist. This distinction is important. (A side note: This is of course a simplification. In practice, $f$ is highly multidimensional. The reason we must treat $f$ as a "black box" is due to it's complexity - computing a single curve $f(x,y)$ requires ~7 minutes of computation to simulate a physical observation.)
Second: We may make approximations between two curves $f(x,y_i)$ and $f(x,y_{i+1})$ by means of a weighted, linear interpolation. Given some $y_{i+\lambda}, 0 < \lambda < 1$, we note that
$$F(x,y_{i+\lambda}) = (1-\lambda)f(x,y_i) + \lambda f(x,y_{i+1}) \approx f(x,y_{i+\lambda})$$
Note that this approximation approaches equality in the case that $\Delta y = | y_i - y_{i+1} |$ approaches 0, or as $\lambda$ approaches 0 or 1. The nonlinearity of $f$ also indicates that as $\Delta y$ is large, the approximation becomes poorer. Let us define some measure of the degree of accuracy of this approximation as
$$\Delta f(x,y_{i+\lambda}) = |f(x,y_{i+\lambda}) -F(x,y_{i+\lambda})| = |f(x,y_{i+\lambda}) - [ (1-\lambda)f(x,y_i) + \lambda f(x,y_{i+1})]|$$
Which brings me to the application, and my question. I am able to define the spacing $\Delta y$, and further, it may be a function of $y$ - that is, the spacing does not need to be the same at all values of $y$. I wish to define this spacing in a way that minimizes both $n$ (the number of values $y_i$ computed) and $\Delta f$. (In application, I am computing a grid of model data sets. I want to minimize the number of models computed while still allowing reasonably accurate approximations as described above.) How can I do this efficiently?
My first thought to solve this problem, was to sample the relevant domain in an arbitrary way, then find some way to compute the distance between the curves in order to refine my spacing. To this end I have looked at metrics such as Hausdorff and Fréchet distances, which I found to be useful in broad senses but lacking in the necessary subtlety. I am struggling with the concepts behind Dynamic Time Warping, although it seems as if it might be useful, as well as the version presented in this paper, which I also barely understood. (I am more physicist than mathematician.)
Are there any other metrics which might capture the information I'm looking for? Or some other method that will suit my purposes? I would also appreciate directions towards a more lay-friendly discussion of DTW.