Perceptron - Convergence proof


I studied the perceptron algorithm and I'm trying to prove its convergence by myself. However, I'm going wrong somewhere and I'm not able to find the error.

Assumptions: 1) We assume that there is some $\gamma > 0$ such that $$y_{t}(\Theta ^{*})^{T}\bar{x}_{t} \geq \gamma $$ for all $t = 1, \dots, n$. The number $\gamma > 0$ ensures that each example is classified correctly with a finite margin.

2) We will assume that all the (training) images have bounded Euclidean norms, i.e., $$\left \| \bar{x}_{t} \right \|\leq R$$ for all $t$ and some finite $R$.

By hypothesis the learning rule is $$\Theta ^{(k)}= \Theta ^{(k-1)} + \mu y_{t}\bar{x}_{t},$$ applied whenever example $t$ is misclassified by $\Theta ^{(k-1)}$.
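In code, the update rule above can be sketched as follows (the function name, the zero initialisation $\Theta^{(0)}=0$ and the epoch cap are my own assumptions; only the update line itself comes from the notes):

```python
import numpy as np

def perceptron(X, y, mu=1.0, max_epochs=100):
    """Perceptron with learning rate mu; returns the weights and the
    number of updates k (one per mistake).

    X: (n, d) array of examples x_t; y: (n,) array of labels in {-1, +1}.
    """
    n, d = X.shape
    theta = np.zeros(d)  # Theta^(0) = 0, as the induction argument assumes
    k = 0                # number of updates (mistakes) made so far
    for _ in range(max_epochs):
        mistakes_this_pass = 0
        for t in range(n):
            if y[t] * (theta @ X[t]) <= 0:        # example t misclassified
                theta = theta + mu * y[t] * X[t]  # Theta^(k) = Theta^(k-1) + mu*y_t*x_t
                k += 1
                mistakes_this_pass += 1
        if mistakes_this_pass == 0:  # a full clean pass: converged
            return theta, k
    return theta, k
```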

Now, $$(\Theta ^{*})^{T}\Theta ^{(k)}=(\Theta ^{*})^{T}\Theta ^{(k-1)} + \mu y_{t}(\Theta ^{*})^{T}\bar{x}_{t} \geq (\Theta ^{*})^{T}\Theta ^{(k-1)} + \mu \gamma $$ so, by induction (starting from $\Theta ^{(0)}=0$), $$(\Theta ^{*})^{T}\Theta ^{(k)}\geq k\mu \gamma $$

At the same time, $$\left \| \Theta ^{(k)} \right \|^{2} = \left \| \Theta ^{(k-1)}+\mu y_{t}\bar{x}_{t} \right \|^{2} = \left \| \Theta ^{(k-1)} \right \|^{2}+2\mu y_{t}(\Theta ^{(k-1)})^{T}\bar{x}_{t}+\mu ^{2}\left \| \bar{x}_{t} \right \|^{2} \leq \left \| \Theta ^{(k-1)} \right \|^{2}+\mu ^{2}\left \| \bar{x}_{t} \right \|^{2}\leq \left \| \Theta ^{(k-1)} \right \|^{2}+\mu ^{2}R^{2}$$ where the cross term drops because the update is only made when example $t$ is misclassified, i.e. $y_{t}(\Theta ^{(k-1)})^{T}\bar{x}_{t}\leq 0$.

So, by induction, $$\left \| \Theta ^{(k)} \right \|^{2} \leq k\mu ^{2}R^{2}$$

We can now combine parts 1) and 2) to bound the cosine of the angle between $\Theta ^{*}$ and $\Theta ^{(k)}$:

$$\cos(\Theta ^{*},\Theta ^{(k)}) =\frac{(\Theta ^{*})^{T}\Theta ^{(k)}}{\left \| \Theta ^{*} \right \|\left \|\Theta ^{(k)} \right \|} \geq \frac{k\mu \gamma }{\sqrt{k\mu ^{2}R^{2}}\left \|\Theta ^{*} \right \|} = \frac{\sqrt{k}\,\gamma }{R\left \|\Theta ^{*} \right \|}$$

Since cosine is bounded by one, we get:

$$k \leq \frac{R^{2}\left \|\Theta ^{*} \right \|^{2}}{\gamma ^{2}}$$

The problem is that the correct result should be:

$$k \leq \frac{\mu ^{2}R^{2}\left \|\Theta ^{*} \right \|^{2}}{\gamma ^{2}}$$
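To convince myself, I also tried a quick numerical sanity check (entirely my own sketch: the toy data, the margin $\gamma = 0.2$, the true separator $\Theta^{*}$ and the random seed are all assumptions, not from the notes). Since $\Theta^{(0)}=0$, rescaling $\mu$ rescales every $\Theta^{(k)}$ by the same factor, so the sign of each prediction, and hence the mistake count, shouldn't depend on $\mu$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Separable toy data: keep only points that clear a margin gamma w.r.t.
# an assumed true separator theta_star.
theta_star = np.array([1.0, -1.0])
gamma = 0.2
X = rng.uniform(-1, 1, size=(200, 2))
scores = X @ theta_star
keep = np.abs(scores) >= gamma       # enforce y_t * theta_star^T x_t >= gamma
X, y = X[keep], np.sign(scores[keep])

R = np.max(np.linalg.norm(X, axis=1))
bound = (R**2 * np.linalg.norm(theta_star)**2) / gamma**2

def count_mistakes(mu):
    """Run the perceptron with learning rate mu; return total mistakes k."""
    theta = np.zeros(2)
    k = 0
    for _ in range(1000):
        clean_pass = True
        for t in range(len(y)):
            if y[t] * (theta @ X[t]) <= 0:   # mistake on example t
                theta += mu * y[t] * X[t]
                k += 1
                clean_pass = False
        if clean_pass:
            break
    return k

for mu in (0.1, 1.0, 10.0):
    print(mu, count_mistakes(mu), "bound:", bound)
```

In my run the mistake count is the same for all three learning rates and stays under $R^{2}\left \|\Theta ^{*} \right \|^{2}/\gamma ^{2}$, which is consistent with $\mu$ cancelling in the cosine bound above.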