How to translate a least mean squares regression equation to a percentage


Suppose I have temperatures (T1) on the x-axis and their frequencies (F1) on the y-axis, so that each of the 10 temperatures on the x-axis has a corresponding frequency on the y-axis. I also have temperatures (T2) on the x-axis of another graph, with frequencies (F2) on the y-axis. I am thinking of fitting a least mean squares regression to each graph. Can I use y1 = a1x1 + b1 (for T1) and compare it with y2 = a2x2 + b2 (for T2) to find a percentage (%) similarity between the two temperature graphs? Is there a better way?

Best answer

So it looks like you have two sets of univariate data: temperatures $T_1$ and temperatures $T_2$. Further, you want a way to compare how "close" the two datasets are to each other, with "closeness" measured on the unit interval $[0,1]$.

One way to do this is to obtain the empirical distribution function (EDF) for each dataset and then compare the two with a statistical distance. One option is the Jensen-Shannon divergence, a symmetrized and smoothed version of the Kullback-Leibler (KL) divergence. Its advantage is that it is bounded between 0 and 1 when computed with log base 2. You may instead use its square root, which is a valid metric. Roughly, then, one minus the Jensen-Shannon divergence (or one minus its square root) gives a "percent similarity" between your two datasets. This distance has applications in, e.g., genome comparison and machine learning. It may also help to review the literature for applications in your kind of setting.
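As a rough illustration of the idea, here is a minimal Python sketch. It rests on assumptions that are not in the question: each dataset is turned into a discrete distribution by binning both samples on a shared set of bin edges (the bin count of 10 is arbitrary), and the function names `js_divergence` and `percent_similarity` are made up for this example. Treat it as one possible way to compute the quantity, not a definitive implementation.

```python
import numpy as np


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) of two discrete distributions.

    p and q are non-negative weights over the same bins; they are
    normalized to sum to 1 here. The result lies in [0, 1].
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a, b):
        # KL divergence in bits; bins where a == 0 contribute nothing.
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def percent_similarity(t1, t2, bins=10):
    """100 * (1 - JSD) for two samples binned on shared edges.

    Assumption: a histogram with `bins` equal-width bins spanning both
    samples is an adequate estimate of each distribution.
    """
    t1 = np.asarray(t1, dtype=float)
    t2 = np.asarray(t2, dtype=float)
    lo = min(t1.min(), t2.min())
    hi = max(t1.max(), t2.max())
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(t1, bins=edges)
    q, _ = np.histogram(t2, bins=edges)
    return 100.0 * (1.0 - js_divergence(p, q))


if __name__ == "__main__":
    # Hypothetical temperature samples, for illustration only.
    rng = np.random.default_rng(0)
    sample_1 = rng.normal(20.0, 3.0, size=500)
    sample_2 = rng.normal(22.0, 3.0, size=500)
    print(percent_similarity(sample_1, sample_2))
```

If the frequencies F1 and F2 are already counts over the same 10 temperature values, they can presumably be passed straight to `js_divergence(F1, F2)` without binning. The result depends on the choice of bins, so it is worth checking how stable it is for a few bin counts.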