Most Important Feature from Weights


I have a dataset which I am writing a classifier for. I am using the following algorithm to calculate the $n$ weights

while abs(error) >= 0.5:
    for each example e in the training set:
        error = actual label of e - pred(e)
        update = eta * error
        for each i in [0, n]:
            w_i = w_i + update * X_i(e)

where eta is the learning rate, $X_i(e)$ is the $i$th feature of $e$ and $w_i$ is the associated weight. I also have
\begin{align}\text{pred}(e) &= f\left(\sum_{i=0}^n w_i X_i (e)\right)\\ f(x) &= \frac{1}{1+e^{-x}}\end{align}
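The update rule above can be sketched in Python roughly as follows. This is only an illustration of the procedure as I understand it (the function names, the `max_epochs` cap, and the stopping tolerance are my own additions, not part of the algorithm as stated):

```python
import math

def sigmoid(x):
    # logistic function f(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

def pred(weights, features):
    # weighted sum of the features passed through the logistic function
    return sigmoid(sum(w * x for w, x in zip(weights, features)))

def train(examples, labels, eta=0.5, tol=0.3, max_epochs=1000):
    # examples: list of binary feature vectors (X_0 = 1 acts as a bias term)
    # labels:   0/1 targets
    n = len(examples[0])
    weights = [0.0] * n
    for _ in range(max_epochs):
        max_error = 0.0
        for features, y in zip(examples, labels):
            error = y - pred(weights, features)   # actual - predicted
            max_error = max(max_error, abs(error))
            update = eta * error
            for i in range(n):
                weights[i] += update * features[i]
        if max_error < tol:                       # stop once all errors are small
            break
    return weights
```

For example, trained on a small OR-like dataset with a constant bias feature, the learned weights push `pred` above 0.5 for positive examples and below 0.5 for negative ones.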

This produces $n$ weights for my dataset. Let's say that

\begin{align}w_0 &= 3.00\\ w_1 &= 5.21\\ w_2 &= -2.34\\ w_3 &= -0.09\\ w_4 &= 0.21\end{align}

To classify a new example, I find the closest example using the weighted distance formula:

$$d(a,b) = \left|\sum_{i=0}^n w_i \left(X_i (a)-X_i(b)\right)^2\right|$$
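A minimal sketch of this distance and the nearest-neighbour classification step, assuming binary feature vectors and using the example weights above (the helper names are mine, purely for illustration):

```python
def weighted_distance(weights, a, b):
    # d(a, b) = | sum_i w_i * (a_i - b_i)^2 |
    return abs(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def classify(weights, training_set, labels, new_example):
    # return the label of the training example closest under the weighted distance
    best = min(range(len(training_set)),
               key=lambda j: weighted_distance(weights, training_set[j], new_example))
    return labels[best]
```

With the weights listed above, two examples differing only in feature 1 are at distance $|w_1| = 5.21$, while a pair differing only in feature 3 are at distance $|w_3| = 0.09$.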

My question is: which of these features is the most important in classifying my new data? Intuitively, I want to say that the weight with the largest absolute value, in this case $w_1$, is the most important, and the one with the smallest absolute value, in this case $w_3$, is the least important. I took the absolute value because my data is all binary, so swapping the choice of values for a feature would simply reverse the sign. However, I am now doubting myself. As far as I understand the weights with respect to the distance formula, each weight scales its axis, so a larger weight spreads out the points along that axis, placing them further from each other.

Can someone help me understand which feature is most important, please?