I was trying to understand how edge detection works in digital image processing. Here is what I understand so far.
If we have a continuous intensity function for one row of an image, I(x), the gradient of this function would be the derivative, G(x) = $$\lim_{h\to0}\frac{I(x+h) - I(x)}{h}$$. But in the case of one row of an image, we don't have a continuous function, only discrete values. So we can approximate the limit with the smallest step, h = 1, by averaging the forward and backward differences: $$G(x) = \frac{I(x + 1) - I(x) + I(x) - I(x - 1)}{2} = \frac{I(x+1) - I(x-1)}{2}$$ Since we are talking about a row vector, finding the derivative at each point with this equation is equivalent to convolving the row vector with the kernel $\frac{1}{2}$[-1 0 1].
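To check this numerically, here is a small sketch with NumPy (the example row values are my own invention). Note that `np.convolve` flips the kernel before sliding it, so to realize $(I(x+1) - I(x-1))/2$ the kernel is written in flipped order:

```python
import numpy as np

# An example image row (made-up values for illustration).
row = np.array([2.0, 3.0, 4.0, 6.0, 9.0])

# np.convolve flips the kernel, so [1, 0, -1]/2 here computes the
# central difference (I(x+1) - I(x-1)) / 2, with zero padding at the
# borders ("same" mode).
k = np.array([1.0, 0.0, -1.0]) / 2
grad = np.convolve(row, k, mode="same")
print(grad)
```

At the interior points this reproduces the central difference exactly; at the two borders the missing neighbor is treated as zero.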
All this has been for a one-dimensional row of an image. But an image has two dimensions, so we have to take partial derivatives to find the gradient along x and along y, and then combine them (for example, into a gradient magnitude). The math is the same for each individual partial, and $$G_x(x,y) = \frac{I(x + 1,y) - I(x-1, y)}{2}$$ Correct me if I am wrong up to this point.
But my question is: if everything until now is correct, why can't we convolve images with [-1 0 1] to get the gradient along the x-axis? Say the image is represented by the gray-scale matrix $$ A = \begin{bmatrix} 2 &3& 4\\5 & 6 & 7 \\ 0 &2 &1\end{bmatrix}$$. Then convolving each row with k = $\frac{1}{2}$[-1 0 1] (with zero padding at the borders) would result in $$\begin{bmatrix} 1.5 &1& -1.5\\ 3 & 1 & -3 \\ 1 &0.5 &-1\end{bmatrix}$$.
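This result matrix can be reproduced with a few lines of NumPy, applying the 1x3 kernel row by row (again writing the kernel in flipped order because `np.convolve` flips it back):

```python
import numpy as np

A = np.array([[2.0, 3.0, 4.0],
              [5.0, 6.0, 7.0],
              [0.0, 2.0, 1.0]])

# Apply the central-difference kernel to each row independently.
# np.convolve flips the kernel, so [1, 0, -1]/2 computes
# (I(x+1, y) - I(x-1, y)) / 2, with zero padding at the borders.
k = np.array([1.0, 0.0, -1.0]) / 2
Gx = np.array([np.convolve(r, k, mode="same") for r in A])
print(Gx)
```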
This, however, isn't what we use for edge detection. We instead use a 3x3 kernel: the Prewitt operator, among others. So why in particular is the Prewitt operator a 3x3 kernel if a 1x3 kernel could do the job, and why are we not dividing by 2 or 9 in the case of the Prewitt operator? Thanks!
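For reference, this is the Prewitt x-kernel I mean (in one common sign convention), which is separable into a vertical averaging column times the same 1x3 difference row:

```python
import numpy as np

# Prewitt kernel for the x-direction (one common sign convention).
prewitt_x = np.array([[-1, 0, 1],
                      [-1, 0, 1],
                      [-1, 0, 1]])

# It factors into an averaging column and the 1x3 difference row
# from the question.
col = np.array([[1], [1], [1]])
row = np.array([[-1, 0, 1]])
print(col @ row)
```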