How To Evaluate How Well a Mathematical Model Works... Modelled VS Measured Values?


Let's say I am working with a mathematical model. The output of the model is always a value between 0 and 1, and is dependent on a number of input variables.

I am also measuring the same phenomenon in a laboratory.

As far as I can tell, generally one would run linear regression on the two datasets and get a coefficient of determination (R^2) in order to determine how much of the variation in the measured values is explained by the modelled values.

Are there any other useful types of statistical analysis that can be run in the case of evaluating Modelled Vs. Measured values?


There are 1 best solutions below


Questions and possibilities. Not an answer yet. This is a substantial and interesting statistical question, but I need more information in order to give responsible help.

From your comment I suppose the input to both the theoretical model and the lab determination is the amount of light. Your output is a fraction of that, a number between 0 and 1. There would be no need for a model without a source of variability. Being totally ignorant of the situation, I suspect the possibilities might include different intensities of incident light, different colors or textures of the reflective surface, different characteristics of the medium between surface and detector, and so on. In any case, I wonder whether a 'reflectance' between 0 and 1 is the best way to record the output for what might be hugely varying inputs.

You have paired reflectance measurements $T_i$ from the theoretical model and $L_i$ from the lab measurement. Specifically, on what basis is a particular $T_i$ paired with a particular $L_i$? Would that be the intensity of incident light or something else?

I guess you would not be asking questions if there were no random component to either the $T_i$s or the $L_i$s. Most likely the $L_i$s are subject to unknown factors you want to model as random variation. Is there also randomness in the $T_i$s, or are they purely deterministic? If the model for the $T_i$ contains a random component, then please describe it.

You say that reflectance measurements lie between 0 and 1. But there are many possible distributions that lie in that interval. Do your measurements tend to be spread roughly uniformly throughout the interval? Somewhat clustered near the center? Mostly near one or both ends? (Look at the start of the Wikipedia article on 'Beta distribution' to see graphs of a few possible distribution curves.) Also, are your $L_i$s fairly well behaved? Somewhat noisy? Subject to occasional extreme outliers?
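One quick way to answer the "where do the values sit" question is to count observations in equal-width bins across the interval. The following is only a minimal sketch: the function name `unit_interval_counts` and the sample values are made up for illustration, and ten bins is an arbitrary choice.

```python
import numpy as np

def unit_interval_counts(values, bins=10):
    """Count observations in equal-width bins covering [0, 1]."""
    values = np.asarray(values, dtype=float)
    counts, edges = np.histogram(values, bins=bins, range=(0.0, 1.0))
    return counts, edges

# Made-up values standing in for lab measurements L_i (not real data).
example_values = [0.05, 0.15, 0.95, 0.96]
example_counts, example_edges = unit_interval_counts(example_values)

# Crude text histogram: one '#' per observation in each bin.
for lo, hi, c in zip(example_edges[:-1], example_edges[1:], example_counts):
    print(f"[{lo:.1f}, {hi:.1f}) {'#' * int(c)}")
```

Counts piled up in the end bins, in the middle bins, or spread evenly correspond to the three shapes asked about above.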

What is your central purpose? Do you hope that a $T_i$ can somehow predict the corresponding $L_i$? Do you hope the discrepancy $|T_i - L_i|$ tends to be very small? Those are two quite different objectives. For example, if $L_i$ is always $0.9T_i$ (or $T_i^2$), then you have perfect predictability but maybe substantial discrepancies. Or do you have some other goal I have not thought of? A useful statistical approach depends, among other things, upon your purpose, and also on how "close" in some sense you would need to be in order to consider the result a success. (Note: Either excellent predictability or tiny discrepancies might or might not tend to give $r^2 \approx 1$, so $r^2$ quite possibly does not answer your real question.)
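To make the distinction between predictability and discrepancy concrete, here is a small sketch that computes $r^2$ alongside a few discrepancy summaries for the two examples just mentioned. The helper `agreement_summary`, the choice of summaries (mean difference, mean absolute difference, root mean square difference), and the values of `T` are all assumptions for illustration, not your data or a recommended analysis.

```python
import numpy as np

def agreement_summary(t, l):
    """Squared Pearson correlation plus simple discrepancy summaries for paired values."""
    t = np.asarray(t, dtype=float)
    l = np.asarray(l, dtype=float)
    diff = l - t                      # signed discrepancy L_i - T_i
    r = np.corrcoef(t, l)[0, 1]       # Pearson correlation
    return {
        "r2": r ** 2,
        "bias": diff.mean(),                  # mean signed discrepancy
        "mae": np.abs(diff).mean(),           # mean of |T_i - L_i|
        "rmse": np.sqrt((diff ** 2).mean()),  # root mean square discrepancy
    }

# Made-up model values on a grid (illustrative only).
T = np.linspace(0.1, 0.9, 9)

scaled = agreement_summary(T, 0.9 * T)   # L_i = 0.9 T_i
squared = agreement_summary(T, T ** 2)   # L_i = T_i^2

print("L = 0.9 T:", scaled)
print("L = T^2  :", squared)
```

In the first case $r^2$ is 1 (up to rounding) even though every $L_i$ falls short of its $T_i$. In the second case $L_i$ is an exact function of $T_i$, yet $r^2$ drops below 1 because the relationship is not linear.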

How many $(T_i, L_i)$ pairs will you have? That also might make a big difference in the method of analysis.

It would be a huge help, possibly even crucial to success, if I could see true examples of your data. All of them, if there are a few dozen pairs. Something like every third, tenth, or hundredth observation if there are very many. (And then with the mean and standard deviation of all the $T_i$ and of all the $L_i$.)
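If it helps, a listing like that could be prepared along the following lines. This is only a sketch: `thin_pairs` and `mean_sd` are hypothetical helper names, the arrays are randomly generated stand-ins for real pairs, and the sample standard deviation (divisor $n-1$) is my assumption about which one to report.

```python
import numpy as np

def thin_pairs(t, l, step):
    """Keep every `step`-th (T_i, L_i) pair, starting with the first."""
    t = np.asarray(t, dtype=float)
    l = np.asarray(l, dtype=float)
    if t.shape != l.shape:
        raise ValueError("T and L must have the same length")
    return np.column_stack((t[::step], l[::step]))

def mean_sd(x):
    """Mean and sample standard deviation (divisor n - 1) of all values."""
    x = np.asarray(x, dtype=float)
    return x.mean(), x.std(ddof=1)

# Randomly generated stand-ins for real pairs (illustrative only).
rng = np.random.default_rng(0)
T_all = rng.uniform(0.0, 1.0, size=300)
L_all = np.clip(T_all + rng.normal(0.0, 0.05, size=300), 0.0, 1.0)

subset = thin_pairs(T_all, L_all, step=10)   # every tenth pair
print(subset[:5])                            # first few thinned pairs
print("T mean, SD:", mean_sd(T_all))         # summaries use all values
print("L mean, SD:", mean_sd(L_all))
```

Note that the summaries are computed from the full data, not from the thinned subset.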

I am mainly retired, but teaching a huge class right now, so there may be both short and long response times. Please use some combination of an edited Question and Comments to provide answers, and leave a Comment directed to me so I'll know to look.