Suppose I have two sets of numbers. To help frame my question, suppose these numbers come from two different temperature sensors. In this first example, both sensors are placed in the same environment, so they should read the same temperature:
Col 1 Col 2
10 10
20 19
30 29
20 20
20 19
30 30
20 19
10 9
20 20
30 28
Since the sensors are in the same environment they should read the same, but they don't, so I need to correct for their offset. To calculate a correction factor between these two sets of numbers, so that Column 2 matches Column 1 as closely as possible, I run a regression analysis. A linear regression gives the equation:
y = 0.8041x + 3.7143
or
Col 2 = 0.8041 * Col 1 + 3.7143
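A fit like this can be reproduced with NumPy (a minimal sketch; the variable names are mine, and the coefficients you get from exactly these ten points will not necessarily reproduce the quoted 0.8041 and 3.7143, which may have come from a different or larger data set):

```python
import numpy as np

# Column 1 and Column 2 from the first table (same environment).
col1 = np.array([10, 20, 30, 20, 20, 30, 20, 10, 20, 30])
col2 = np.array([10, 19, 29, 20, 19, 30, 19, 9, 20, 28])

# Ordinary least-squares fit: col2 ≈ slope * col1 + intercept
slope, intercept = np.polyfit(col1, col2, 1)
print(f"Col 2 = {slope:.4f} * Col 1 + {intercept:.4f}")
```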
Now suppose I have a second set of numbers. In this second example the numbers come from the same sensors, but this time they are placed in different environments. So I expect them to read differently, but I also expect them to retain the same error I calculated above:
Col 3 Col 4
11 10
21 19
30 27
20 20
21 19
30 25
20 18
11 15
20 20
30 25
My question is: is there a way to apply the same correction factor, calculated from the first set of numbers, to the second set? To be more specific, I am not looking to do this:
Col 4 = 0.8041 * Col 3 + 3.7143
and get this:
Col 3 Col 4 (new based on regression)
11 12.5
21 20.6
30 27.8
20 19.7
21 20.6
30 27.8
20 19.7
11 12.5
20 19.7
30 27.8
as that loses all information about the original Column 4. I am hoping to use the correction factor from Columns 1 and 2 as a "calibration", and apply it to Column 4 in a way that retains the original information in that column while adjusting it to reflect the calibration equation.
If I assume Col 3 is correct and Col 4 is off, I was thinking the equation would look something like this:
Corrected Col 4 = Col 4 * (??correction factor??)
To answer your question, let's look at what "error" means and what types of error you could have.
In your first problem you have an overdetermined system: two measurements of one quantity at each time, so a linear regression amounts to solving the linear least-squares problem $A^TAv=A^Tb$, with $y = v_1x+v_2$.
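As a concrete sketch of that normal-equations formulation (my variable names, using the question's first two columns):

```python
import numpy as np

x = np.array([10, 20, 30, 20, 20, 30, 20, 10, 20, 30])  # sensor 1 (Col 1)
b = np.array([10, 19, 29, 20, 19, 30, 19, 9, 20, 28])   # sensor 2 (Col 2)

# Design matrix A: one column for the slope v1, one for the intercept v2.
A = np.column_stack([x, np.ones_like(x)])

# Solve the normal equations A^T A v = A^T b directly.
v = np.linalg.solve(A.T @ A, A.T @ b)

# np.linalg.lstsq solves the same least-squares problem more stably.
v_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
```

Both routes give the same $v = (v_1, v_2)$; `lstsq` is preferable in practice because it avoids forming $A^TA$, which squares the condition number.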
What this results in is a model in which the second sensor $y$ returns a scaled version of the first sensor $x$ plus an offset. It is not correct to call the offset $v_2$ the "error" unless $v_1$ is close to $1$: the scaling factor $v_1$ is a slope, and the offset is chosen to minimize the squared error over the entire range of values.
Error can be random (uncertain or unknowable fluctuations in the process being observed) or systematic (a mean shift in the observed value due to bias in the observation process). What you are looking to compute is the systematic error of sensor 2 with respect to sensor 1.
In this case, I would compute the average difference between the measurements rather than use the offset of the linear regression. This gives you an estimate of the amount by which sensor 2 differs from sensor 1; only then can you quantify the relative drift in a different environment.
So, $$\epsilon = \frac{1}{n}\displaystyle\sum_{i=1}^n (x_i-y_i),$$ and the corrected sensor-2 reading is $$y_{\text{corrected}} = y+\epsilon.$$
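A minimal sketch of this mean-difference correction, using the question's columns (variable names are mine):

```python
import numpy as np

# Calibration data: both sensors in the same environment.
col1 = np.array([10, 20, 30, 20, 20, 30, 20, 10, 20, 30])  # sensor 1 (x)
col2 = np.array([10, 19, 29, 20, 19, 30, 19, 9, 20, 28])   # sensor 2 (y)

# Systematic offset of sensor 2 with respect to sensor 1.
epsilon = np.mean(col1 - col2)  # 0.7 for this data

# New environment: add the same offset to sensor 2's readings.
# This preserves the structure of the original Column 4, unlike
# replacing it with regression predictions from Column 3.
col4 = np.array([10, 19, 27, 20, 19, 25, 18, 15, 20, 25])
col4_corrected = col4 + epsilon
```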