Assume that we have the following over-determined linear system \begin{cases} z_{1}=c + \phi z_0\\ z_{2}=c + \phi z_{1}\\ \dots\\ z_{n} = c + \phi z_{n-1} \end{cases} with $n>2$ and all $z_{0}, \dots, z_{n}$ distinct. Then, using ordinary least squares (OLS), we can obtain the solution $(\hat{c}, \hat{\phi}_{1})$ of the system (which is, in fact, an estimate).
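For concreteness, here is a minimal sketch (with made-up values of $c$ and $\phi$; the data are generated so the recursion holds exactly) of what solving the first system by OLS looks like: stack the equations into a design matrix with a column of ones and a column of lagged values, and solve with `np.linalg.lstsq`.

```python
import numpy as np

# Made-up parameters; the z values satisfy the recursion exactly,
# so the system is consistent and OLS recovers c and phi exactly
c_true, phi_true = 0.5, 0.9
z = np.empty(20)
z[0] = 1.0
for t in range(1, 20):
    z[t] = c_true + phi_true * z[t - 1]

# Rows of the system z_t = c + phi * z_{t-1}, stacked as A @ [c, phi] = y
A = np.column_stack([np.ones(len(z) - 1), z[:-1]])
y = z[1:]
(c_hat, phi_hat), *rest = np.linalg.lstsq(A, y, rcond=None)
```

With noise added, the same call returns the least-squares estimate instead of the exact solution.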
Next, let us consider the following system \begin{cases} z_{2} - z_{1} = \phi(z_{1} - z_{0})\\ (z_{3} - z_{2})= \phi (z_{2} - z_{1})\\ \dots\\ (z_{n} - z_{n-1})=\phi (z_{n-1} - z_{n-2}) \end{cases} and, using OLS again, obtain the solution $\hat{\phi}_{2}$.
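The differenced system has no intercept, so its OLS solution is a regression through the origin, $\hat{\phi}_{2} = \sum x_t y_t / \sum x_t^2$. A sketch with made-up simulated data:

```python
import numpy as np

# Simulate a noisy AR(1) with made-up parameters
rng = np.random.default_rng(0)
c_true, phi_true, sigma = 0.5, 0.95, 0.08
z = np.empty(500)
z[0] = c_true / (1 - phi_true)
for t in range(1, 500):
    z[t] = c_true + phi_true * z[t - 1] + rng.normal(0, sigma)

# Differenced system (z_t - z_{t-1}) = phi * (z_{t-1} - z_{t-2}):
# no intercept, so the OLS solution is sum(x*y) / sum(x*x)
d = np.diff(z)
x, y = d[:-1], d[1:]
phi_hat_2 = (x @ y) / (x @ x)
```

On noisy data this $\hat{\phi}_{2}$ lands nowhere near the $\hat{\phi}_{1}$ from the first system, which is the puzzle below.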
Why is the solution in the second case so different from the one in the first case?
From what I can see in the simulations below, one cannot simply subtract rows in the system, even though this operation is valid for a consistent system.
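The effect is easy to reproduce on a tiny system. With the made-up values $z = (0, 1, 1.5, 2.5)$ the three equations $z_t = c + \phi z_{t-1}$ are inconsistent, and subtracting consecutive rows changes the least-squares answer:

```python
import numpy as np

z = np.array([0.0, 1.0, 1.5, 2.5])  # made-up, inconsistent data

# OLS on the original system z_t = c + phi * z_{t-1}
A = np.column_stack([np.ones(3), z[:-1]])
(c_hat, phi_hat_1), *rest = np.linalg.lstsq(A, z[1:], rcond=None)

# OLS on the row-differenced system d_t = phi * d_{t-1}
d = np.diff(z)
phi_hat_2 = (d[:-1] @ d[1:]) / (d[:-1] @ d[:-1])

# phi_hat_1 = 13/14 ≈ 0.9286, while phi_hat_2 = 0.8
```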
The Python code for the simulations is below.
import numpy as np

"""
function for simulating z_{t} (the previous value plus some noise)
"""
def simulate_z(nSample, phi, sigma_e, fVal, c):
    noise_e = np.random.normal(0, sigma_e, nSample)
    z = np.zeros(nSample)
    z[0] = fVal
    for period in range(1, nSample):
        z[period] = c + phi * z[period - 1] + noise_e[period]
    return z

"""
OLS estimation
"""
def est_c_ph(z):
    x = z[0:-1]
    y = z[1:]
    p = np.polyfit(x, y, 1)  # fit y = p[0] * x + p[1]
    # Estimate phi (the slope)
    phi_est = p[0]
    # Estimate c via the stationary mean: E[z] = c / (1 - phi)
    c_est = np.mean(z) * (1 - phi_est)
    return [c_est, phi_est]
"""
values of the parameters for simulation
"""
phi = 0.95 # slope
c = 0.5 # intercept
sigma_e = 0.08 # standard deviation of observation noise
nSample = 500 # sample size
E = c / (1 - phi) # mean value
fVal = E # first value of the simulated process
"""
simulation of AR(1)
"""
z = simulate_z(nSample, phi, sigma_e, fVal, c)
c_est, phi_est = est_c_ph(z)
print("OLS [c, phi]: ", [c_est, phi_est])
"""
differencing of data
"""
z_dif = z[1:] - z[0:-1]
c_d_est, phi_d_est = est_c_ph(z_dif)
print("diff z OLS [c, phi]: ", [c_d_est, phi_d_est])