I'm currently studying machine learning using the book Pattern Recognition and Machine Learning (Bishop, 2006) and faced some confusion regarding the vector/matrix representation of $K$ linear discriminant functions. More specifically, this is from Chapter 4.1.3: Least Squares for Classification.
The specific portion of the book that I'm referring to states that:
Each class $C_k$ is described by its own linear model so that:
$$y_k(\mathbf{x}) = \mathbf{w}_k^T \mathbf{x} + w_{k0}$$
where $k = 1, \dots , K$. We can conveniently group these together using vector notation so that
$$\mathbf{y}(\mathbf{x}) = \tilde{\mathbf{W}}^T\tilde{\mathbf{x}}$$
where $\tilde{\mathbf{W}}$ is a matrix whose $k$th column comprises the $(D + 1)$-dimensional vector $\tilde{\mathbf{w}}_k = (w_{k0}, \mathbf{w}_k^T)^T$ and $\tilde{\mathbf{x}}$ is the corresponding augmented input vector $(1, \mathbf{x}^T)^T$ with a dummy input $x_0 = 1$.
I'm mainly having trouble understanding how to interpret the representations of $\tilde{\mathbf{w}}_k$ and $\tilde{\mathbf{x}}$. My interpretation of the above equation is that we're performing the operation:
$$ \begin{bmatrix} \mathbf{w}_k^T & w_{k0} \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ 1 \end{bmatrix} = \mathbf{w}_k^T\mathbf{x} + w_{k0} = y_k(\mathbf{x}) $$
a total of $K$ times, and therefore we can conveniently represent these linear models as a compact vector.
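As a quick sanity check of this interpretation (a minimal NumPy sketch with made-up dimensions and random numbers), stacking the $K$ per-class computations does reproduce $\tilde{\mathbf{W}}^T\tilde{\mathbf{x}}$:

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 3, 4                      # input dimension and number of classes (arbitrary)

W = rng.standard_normal((D, K))  # column k is the weight vector w_k
w0 = rng.standard_normal(K)      # biases w_k0
x = rng.standard_normal(D)

# Per-class models: y_k(x) = w_k^T x + w_k0, computed K separate times.
y_separate = np.array([W[:, k] @ x + w0[k] for k in range(K)])

# Augmented form: W_tilde's k-th column is (w_k0, w_k^T)^T,
# and x_tilde = (1, x^T)^T with dummy input x_0 = 1.
W_tilde = np.vstack([w0, W])     # shape (D + 1, K)
x_tilde = np.concatenate([[1.0], x])

y_grouped = W_tilde.T @ x_tilde  # y(x) = W_tilde^T x_tilde

print(np.allclose(y_separate, y_grouped))  # True
```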
For $\tilde{\mathbf{w}}_k$, I can somewhat infer that since $\tilde{\mathbf{w}}_k \in \Bbb{R}^{D + 1}$, its transpose (since the actual equation uses $\tilde{\mathbf{W}}^T$) would be in $\Bbb{R}^{1 \times (D + 1)}$, and multiplying it by the $(D + 1)$-dimensional vector $\tilde{\mathbf{x}}$ gives a scalar value (i.e. the output of the linear model).
The two questions that I have following these thoughts would be:
- How should I interpret the representation of $(w_{k0}, \mathbf{w}_k^T)^T$? Is it supposed to be:
$$\tilde{\mathbf{w}}_k = \begin{bmatrix}w_{k0} & \mathbf{w}_k^T\end{bmatrix}$$
since
$$(w_{k0}, \mathbf{w}_k^T) = \begin{bmatrix}w_{k0} \\ \mathbf{w}_k^T\end{bmatrix}$$
- Why does $\mathbf{x}$ have a transpose? My interpretation was that we're multiplying $\mathbf{w}_k^T$ and $\mathbf{x}$, not $\mathbf{x}^T$?
$(w_{k0}, w_{k}^T)$ is the horizontal concatenation of $w_{k0}$ and $w_k^T$, i.e. a row vector. Transposing it makes it a column vector. That is, $\tilde{w}_k$ is the $k$-th column of the matrix $\tilde{W}$, and $\tilde{w}_k^T$ is the $k$-th row of the matrix $\tilde{W}^T$.
We have
$$\begin{bmatrix} w_{k0} & w_k^T\end{bmatrix} \begin{bmatrix} 1 \\ x\end{bmatrix}=y_k(x)$$
The same reasoning applies to the second question: $(1, x^T)$ is the horizontal concatenation of $1$ and $x^T$, resulting in a row vector, after which we transpose it, $(1, x^T)^T$, to make it a column vector.
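To make this concrete (a small NumPy illustration with made-up numbers): concatenating $w_{k0}$ and $w_k$ horizontally gives a row of length $D + 1$; transposing yields the column $\tilde{w}_k$, and the row $(w_{k0}, w_k^T)$ times the column $\tilde{x} = (1, x^T)^T$ is exactly $y_k(x)$.

```python
import numpy as np

w_k = np.array([2.0, -1.0, 0.5])   # weight vector w_k (made-up values)
w_k0 = 0.3                         # bias w_k0
x = np.array([1.0, 2.0, 3.0])      # input x

# (w_k0, w_k^T): horizontal concatenation -> a row of length D + 1.
row = np.concatenate([[w_k0], w_k])

# Its transpose is the column vector w_tilde_k, i.e. the k-th column of W_tilde.
w_tilde_k = row.reshape(-1, 1)     # shape (D + 1, 1)

# Augmented input x_tilde = (1, x^T)^T.
x_tilde = np.concatenate([[1.0], x])

# [w_k0  w_k^T] [1; x] = w_k^T x + w_k0 = y_k(x)
y_k = row @ x_tilde
print(y_k, w_k @ x + w_k0)         # both give 1.8
```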