How to verify how good my data is?


I'm in an undergrad physics lab right now and I have taken some data. The theoretical curve should be proportional to $\cos^2(\theta)$. How can I quantify how close my data values are to this theoretical curve? Can I linearize it somehow and then let Excel do a least squares type line for me? Or is there some better way of quantifying how close my data is? Sorry I haven't taken any statistics and this physics lab doesn't explain anything to me. Thanks in advance for any answers. :)


There are 2 best solutions below

0
On BEST ANSWER

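Suppose one of your input values is $\theta$ and the experimental value for that $\theta$ is $\hat y.$ The theoretical output value is $y=\cos^2\theta$ (if your measurements are only proportional to $\cos^2\theta$, normalize them first or include the constant of proportionality in $y$). The difference between $\hat y$ and $y$ measures how far the experimental data is from the theory for that particular $\theta$ value.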

To measure how good the experimental data is overall, you could take the average of the absolute values of these differences over all data points. Suppose that your experimental data points are $(\theta_1,\hat y_1),(\theta_2,\hat y_2),\ldots,(\theta_n,\hat y_n).$ The corresponding points on the theoretical curve are $(\theta_1,y_1),(\theta_2,y_2),\ldots,(\theta_n,y_n),$ where $y_i=\cos^2\theta_i.$ The average of the absolute differences is then $$\frac{1}{n}\sum_{i=1}^n|\hat y_i-y_i|$$ This is called the mean absolute error.
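As a rough illustration of the mean absolute error formula, here is a minimal Python sketch. The function name `mean_absolute_error` and the three readings are made up for this example, and the angles are assumed to be in radians (spreadsheet and Python cosine functions both expect radians).

```python
import math

def mean_absolute_error(thetas, measured):
    """Average of |measured_i - cos^2(theta_i)| over all points (theta in radians)."""
    predicted = [math.cos(t) ** 2 for t in thetas]
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(thetas)

# Hypothetical readings, invented purely for illustration.
thetas = [0.0, math.pi / 4, math.pi / 2]
measured = [0.9, 0.6, 0.1]

mae = mean_absolute_error(thetas, measured)
print(f"mean absolute error: {mae:.4f}")
```

The result is in the same units as your measurements, so it reads directly as "on average, my points are this far from the curve."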

Another alternative is to take the square root of the mean of the squared errors $$\sqrt{\frac{1}{n}\sum_{i=1}^n(\hat y_i-y_i)^2}$$ This is called the root-mean-square error. The formula seems more magical, but it is more commonly used in practice (as far as I can tell). Because the errors are squared before averaging, a few large misses count for more than they do in the mean absolute error.
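The root-mean-square error can be sketched in the same way. Again, the function name `root_mean_square_error` and the data are hypothetical, and the angles are assumed to be in radians.

```python
import math

def root_mean_square_error(thetas, measured):
    """Square root of the mean of (measured_i - cos^2(theta_i))^2 (theta in radians)."""
    predicted = [math.cos(t) ** 2 for t in thetas]
    squared_errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical readings, invented purely for illustration.
thetas = [0.0, math.pi / 4, math.pi / 2]
measured = [0.9, 0.7, 0.1]

rmse = root_mean_square_error(thetas, measured)
print(f"root-mean-square error: {rmse:.4f}")
```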

Both of these quantify how close the experimental data is to the theoretical curve. The smaller either of these values is, the better the experimental data is.

10
On

Well, first off, I cannot fathom a huge collection of data (from experiments, analytical or empirical) that follows a $\cos^2(\theta)$ distribution! Naturally, most "good" data distributions/collections should follow a normal distribution (preferably with mean 0 and standard deviation 1) for a large number of observations.

But I'll leave the validity of your second statement to you.

Coming to the closeness of the data: for an undergraduate, it's generally very helpful to learn MATLAB.

It's software used for many purposes (which you might want to google). It would also be helpful to learn basic statistics: mean, standard deviation, variance, mean square values, statistical distributions, etc. Concepts like cross-correlation can help too.

But I would stress learning to use MATLAB. There are a lot of resources online to help you learn it.

Pertaining to your question, you can get a "goodness" measure for your data against the theoretical curve you want it to follow by using the Curve Fitting app in MATLAB (part of the Curve Fitting Toolbox). There you can check how close your data is to your desired model by checking that the SSE (sum of squared errors, i.e. the sum of the squared residuals) is close to 0. It will never be exactly 0, because random factors contribute to the deviation in the data.
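If MATLAB is not available, the same idea can be sketched by hand. This is not what the Curve Fitting app does internally; it is a plain least-squares fit of the single amplitude $A$ in the assumed model $y = A\cos^2\theta$, which is a straight line through the origin when the measurements are plotted against $\cos^2\theta$. The closed form is $A = \sum_i \hat y_i c_i / \sum_i c_i^2$ with $c_i = \cos^2\theta_i$. The function names and the readings below are hypothetical.

```python
import math

def fit_amplitude(thetas, measured):
    """Least-squares A for the assumed model y = A * cos^2(theta), theta in radians."""
    c = [math.cos(t) ** 2 for t in thetas]
    return sum(m * ci for m, ci in zip(measured, c)) / sum(ci * ci for ci in c)

def sum_squared_errors(thetas, measured, amplitude):
    """SSE: sum of squared residuals between the data and A * cos^2(theta)."""
    return sum((m - amplitude * math.cos(t) ** 2) ** 2
               for t, m in zip(thetas, measured))

# Hypothetical readings, invented purely for illustration.
thetas = [0.0, math.pi / 6, math.pi / 4, math.pi / 3, math.pi / 2]
measured = [2.05, 1.45, 1.02, 0.48, 0.03]

amplitude = fit_amplitude(thetas, measured)
sse = sum_squared_errors(thetas, measured, amplitude)
print(f"fitted amplitude: {amplitude:.4f}, SSE: {sse:.5f}")
```

In a spreadsheet, the equivalent is to add a column of $\cos^2\theta$ values and fit a trendline through the origin to the measurements against that column.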

Does that help?