About classification using a neural network


I am trying to remember how a neural network is used for classification (it's been a while since I took the course).

Let the training data have output labels $1,2,3$, consider a single-layer perceptron, and for simplicity let there be only one training sample. Let the activation function be denoted by $f$. I remember that for a training sample $x\in \mathbb{R}^n$, the output of the network is $f(w_0+w_1x_1+\ldots+w_nx_n)$, where $f$ is a linear function, and the network learns the weights iteratively by minimizing an objective function such as the squared difference between the predicted and observed output values.
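The setup described above can be sketched in a few lines. This is a minimal illustration, assuming the identity activation $f(z)=z$, a squared-error loss, and made-up sample values and learning rate (none of these specifics come from the question itself):

```python
import numpy as np

# Sketch: one training sample x with target y, identity activation f(z) = z,
# squared-error loss L = (f(w0 + w.x) - y)^2, minimized by gradient descent.
# The sample, target, and learning rate are illustrative assumptions.

def train_perceptron(x, y, lr=0.01, steps=1000):
    """Learn weights (w0, w) so that w0 + w.x approximates y."""
    w0 = 0.0
    w = np.zeros_like(x, dtype=float)
    for _ in range(steps):
        pred = w0 + w @ x          # linear activation: f(z) = z
        grad = 2.0 * (pred - y)    # dL/dpred
        w0 -= lr * grad            # since dpred/dw0 = 1
        w  -= lr * grad * x        # since dpred/dw_i = x_i
    return w0, w

x = np.array([1.0, 2.0])
w0, w = train_perceptron(x, y=3.0)
print(round(w0 + w @ x, 3))        # prediction converges toward the target 3.0
```

With a single sample the iteration shrinks the error geometrically, so the prediction lands essentially on the target after enough steps.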

Theoretically, can I use any activation function for a classification problem? For example, if I use the identity activation $f(y)=y$ for $y\in \mathbb{R}$, then the output of the neural network, after the iterative process of learning the weights, is a real number. But the desired output is an integer in $\{1,2,3\}$, which is unsettling to me.


BEST ANSWER

For classification problems, we generally use a separate output node for each label. So we will have outputs $y_1, y_2, y_3$, each computed by its own linear weighted sum followed by the activation function. The probability of each label, given the outputs, is then computed by another function, for example $p_1 = \frac{y_1}{y_1+y_2+y_3}$ (in practice the softmax function is the usual choice). Given enough data points, the network can learn these probabilities for any $\textbf{linear}$ problem.
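As a concrete sketch of the scheme above: three output scores, one per label, each a linear weighted sum of the input, followed by a softmax normalization into probabilities. The input and weight values here are illustrative assumptions, not learned ones:

```python
import numpy as np

# Sketch: three output nodes y1, y2, y3 (one per label), each a linear
# weighted sum of the input, then softmax to turn scores into probabilities.
# W, b, and x are made-up values for illustration.

def softmax(z):
    z = z - np.max(z)              # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

x = np.array([0.5, -1.0])          # one input sample
W = np.array([[1.0, 0.0],          # one weight row per output node
              [0.0, 1.0],
              [0.5, 0.5]])
b = np.zeros(3)

scores = W @ x + b                 # the three scores y1, y2, y3
probs = softmax(scores)

print(round(probs.sum(), 6))       # probabilities sum to 1
print(int(np.argmax(probs)) + 1)   # predicted label in {1, 2, 3}
```

The predicted label is simply the index of the largest probability, which resolves the question's worry: the network outputs real numbers, but the label is read off by comparing them, not by rounding a single real output to an integer.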

For the more common case of non-linear problems, you have to use a non-linear activation function such as ReLU or sigmoid, and the network also needs more than one layer.