Measure of complexity of a function

902 Views Asked by At

Generally, when we try to write a function to fit to a group of points (curve fitting), it is important to find a balance between the accuracy and complexity of the function. The following picture shows different examples of this.

enter image description here

If the function is excessively simple at the cost of accuracy, it will underfit (such as with the line). On the other hand, if the function is excessively complex, it will overfit (such as with the curve on the right).

I am looking for a formal and widely accepted definition for the complexity of the function. One definition I was thinking about was how much the slope varies. Letting the average slope of the function in an interval be $\bar{m}$, then I can write the complexity as $$\int_a^b (f'(x)-\bar{m})^2 dx$$

where $f(x)$ is the function, $a$ is the lower endpoint, and $b$ is the upper endpoint.

Expanding the square, I get $$\int_a^b (f'(x)^2 - 2\bar{m}f'(x) + \bar{m}^2)dx = \int_a^bf'(x)^2dx - 2\bar{m}\int_a^bf'(x)dx + \bar{m}^2\int_a^bdx$$

Using the Fundamental Theorem of Calculus on the second integral, I get $$\int_a^bf'(x)^2dx - 2\bar{m}(f(b)-f(a)) + \bar{m}^2(b-a)$$

Because $\bar{m}$ is $$\frac{f(b)-f(a)}{b-a}$$ I can simplify the above equation to get $$\int_a^bf'(x)^2dx - \frac{\left(f(b)-f(a)\right)^2}{b-a}$$

The higher this value is, the more "complex" and varying the function is. If this value is $0$, then $f(x)$ is a line from $a$ to $b$.

Are there other measures of the complexity of a function? If so, what are the corresponding equations for them?

Edit: I am not looking for a complexity function dependent on the points. For example, $f(x) = x$ should have the same complexity no matter what points are on the graph.

Edit 2: A problem with the above definition of complexity is that $c^x$ would have a very high complexity (when I believe it should have a lower value). Specifically, from $a$ to $b$, the complexity is $$ \ln(c) \frac{c^{2b}-c^{2a}}{2} - \frac{(c^b-c^a)^2}{b-a}$$ which increases exponentially with respect to $b$.

4

There are 4 best solutions below

1
On

This may not be what you are looking for.Since you are using in context of overfitting, I thought you might be interested to look into this.

One thing I can quickly think is of VC- dimension . VC dimension is complexity of whole class of functions not just single function.It forms whole basis of Statistical Learning Theory.There is something called fundamental theorem of Learning Theory that states that class of functions is learnable (under snotion of PAC Learnability) if and only if class of functions have finite VC dimension.It actually gives both lower and upper bounds for sample comlexity.Now larger the VC dimension , larger is sample complexity and hence you need large number of points to reliably learn the class

Some examples of VC-Dimension

  1. VC dimension of hyperplanes in $f:\mathbb{R}^d $ is $d$
  2. VC dimension of affine hyperplanes in $\mathbb{R}^d$ is $d+1$
  3. VC dimension of all set of functions on $\mathbb{R}^d$ is $\infty$

note that above is all in context of classification (functions range are $\{ 0 , 1\}$) there is extensions to regression case as well

1
On

Have you heard of Kalman filters? They balance noise level against precision.
Rather than a single formula, it uses a weighted mean of previous function values to predict the next function value.

0
On

I had the same question as you, and after some digging, I found the curvature of a function, which is the reciprocal of the radius of the osculating circle and can be defined for a function $y = f(x)$ as $$ \kappa = \frac{\left|y''\right|}{\left(1+y'^2\right)^{3/2}} $$ It seemed natural to take the integral over the interval to find the total "complexity" over the interval ($x \in [a, b]$): $$ C = \int_a^b \frac{\left|y''\right|}{\left(1+y'^2\right)^{3/2}} \, dx $$

For any line, $C = 0$ for any interval. For a circle with radius $r$, $C = \frac{b\,-\,a}{r}$ over the interval $[a, b]$, while $C$ over $[-r, r]$ for a circle is always $2$.

The formula above is what I'm calling the "total complexity" over the interval. I specifically didn't divide by $b-a$, to get the average over an interval because that would result in the complexity of a point being non-zero, which I don't think is accurate. However, it could be the "average complexity" over an interval: $$ C_{avg} = \frac{1}{b-a} \int_a^b \frac{\left|y''\right|}{\left(1+y'^2\right)^{3/2}} \, dx $$

For this, you get $$ \lim_{b \to a} C_{avg} = \kappa(a) $$ which I don't think fits, because it makes sense that the complexity at a point should be $0$. But perhaps you think it is a better candidate than the other one.

0
On

This question is too broad. There are many arbitrary ways you could define what you mean by "complexity". I am assuming you are making it independent from how well it fits around the data.

If you are fitting a polynomial, then clearly the degree gives you a natural way to construct more and more complicated models.

Degrees of freedom is a concept used in general for regression models based on how many values free to vary for a given model.

Perhaps also look into function variation (bounded variation) as another example.

In the context of information theory, (assuming your function is a probability density function) entropy is a measure of uncertainty or information content.