Why transpose the weight vector in Perceptron Convergence Theorem


We have a training set $D$ consisting of input vectors $x$ and their desired outputs $y$.

We are aiming to train a perceptron that separates the input vectors $x$ in accordance with their desired outputs $y$. The goal of the perceptron algorithm is to find a weight vector $w$ such that, for every input vector $x$ with desired output $y$, the following holds:

$$ y \, (w^{T} x) > 0 $$
This must hold for every point in the dataset before training can finish, and the perceptron convergence theorem proves that it can be achieved in a finite number of update steps. What I do not understand, and what I would appreciate some help with, is why we transpose the weight vector $w$. Why does it need to be $w^{T}$? Why can't the plain weight vector $w$ be used?
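To make the condition concrete, here is a small sketch (with made-up numbers) of how the check $y \, (w^{T} x) > 0$ is computed in practice. The point it illustrates: when $w$ and $x$ are column vectors of the same length, $w^{T} x$ is simply their dot product, a single scalar.

```python
import numpy as np

# Hypothetical toy values: w and x are vectors of the same length,
# so w^T x is just their dot product -- one scalar "score".
w = np.array([0.5, -1.0, 2.0])   # weight vector
x = np.array([1.0, 0.0, 1.5])    # input vector
y = 1                            # desired label, either +1 or -1

score = w @ x                    # equivalent to np.dot(w, x), i.e. w^T x
correctly_classified = y * score > 0
```

Here `score` is $0.5 \cdot 1 + (-1) \cdot 0 + 2 \cdot 1.5 = 3.5$, and since $y \cdot 3.5 > 0$ this point is classified correctly, so the perceptron would make no update for it.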

I appreciate all help, thank you. Also, I hope this is the correct place to post this question.