For a statistical learning problem (classification), I have the data set $\{ (x_i,y_i) \}_{i=1}^n$ with $x_i \in \mathbb{R}^2$ being the input data and $y_i \in \{0,1\}$ the possible classes.
The data is used to compute the log-likelihood, and in that equation I have to evaluate the logistic sigmoid function
$$\sigma(x_i) = \frac{1}{e^{-x_i} + 1}$$
My problem is:
The input data is a matrix $X \in \mathbb{R}^{n \times 2}$, and I am confused about how to compute $\sigma(x_i)$ for a single data point, since each $x_i$ is not a scalar but a tuple (a vector), namely one row of this matrix.
Any hints on how to approach this problem and compute my $\sigma(x_i)$?
My matrix looks like this:
$$\begin{pmatrix} 1.55545 & -1.00055\\ -1.24155 & 1.58778\\ 1.28068 & -1.0224\\ \vdots & \vdots\\ -1.68505 & 0.290898\\ 1.73686 & 0.793386\\ \end{pmatrix}$$
Hence $x_1 = (1.55545, -1.00055)$, but then what is:
$$\sigma(1.55545, -1.00055) = \frac{1}{e^{????} + 1}$$
The only thing I have found is the vector exponential, which is said to be computed as:
$$\exp(v) = 1 \cosh(|v|) + \frac{v}{|v|} \sinh(|v|)$$
The $x_i$ from the input data is not itself the argument of the logistic sigmoid function $\sigma(x)$; the $x$ in that definition is just an arbitrarily chosen variable name. The actual argument of $\sigma$ is the value of a scalar-valued function $f$.
In this case the function is $f(y,x)$, and more specifically: $$f(1,x) = \phi(x)^T \cdot \beta$$
Here $\phi(x)$ is the input vector extended with additional features, and $\phi(x)^T$ is its transpose. The product with $\beta$ is a scalar, and therefore $$\sigma(f(1,x)) = \frac{1}{e^{-f(1,x)} + 1}$$ can be easily computed.
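To make this concrete, here is a minimal Python sketch of computing $\sigma(\phi(x_i)^T \beta)$ for each row of the matrix. The feature map `phi` (which simply prepends a constant 1 as an intercept feature) and the values in `beta` are assumptions chosen for illustration; they are not given in the problem, and in practice $\beta$ would come from maximizing the log-likelihood.

```python
import math


def phi(x):
    """Hypothetical feature map: prepend a constant 1 (intercept) to the input."""
    return [1.0, x[0], x[1]]


def sigmoid(z):
    """Logistic sigmoid 1 / (1 + exp(-z)) of a scalar z.

    The two branches are algebraically identical; splitting on the sign of z
    only avoids overflow in exp() for large |z|.
    """
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(X, beta):
    """Return sigma(phi(x)^T beta) for every row x of X."""
    probs = []
    for x in X:
        # The dot product phi(x)^T beta is the scalar f(1, x).
        f = sum(p * b for p, b in zip(phi(x), beta))
        probs.append(sigmoid(f))
    return probs


# First three rows of the matrix from the question.
X = [(1.55545, -1.00055), (-1.24155, 1.58778), (1.28068, -1.0224)]

# Made-up parameter vector, for illustration only.
beta = [0.0, 1.0, -1.0]

probs = predict_proba(X, beta)
print(probs)
```

Each entry of `probs` is a number strictly between 0 and 1, one per row of $X$. The sigmoid itself only ever sees the scalar $\phi(x_i)^T \beta$, never the vector $x_i$ directly.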