Linear regression vector notation


Overview: I'm attempting to understand Linear Regression but am currently confused with regard to notation. I'm using a simple profit/loss sheet as a framework to assist my learning.

What I have: Naturally, the profit/loss sheet runs over the 12 months of the year, Jan to Dec. I have split the data into two separate matrices. One matrix, 'y', contains the dependent variable, i.e. the profit/loss for each month of the year (shape = 12x1). The other matrix, 'x' (the design matrix), contains the 15 explanatory variables upon which the profit/loss for each month depends (shape = 15x12).

The problem: I am following a course text which states: "x is a D-dimensional matrix" (see related image below).

[Image: Design Matrix]

My question: Is it the number of independent variables (i.e. 15) or the number of months of the year (i.e. 12) that we refer to as the dimension of this matrix?

My current viewpoint: The course text seems to suggest that the months of the year relate to the dimension 'D'. But if I have a vector consisting of an (x,y) pair, we call it a 2-dimensional vector, which can be represented in a 2-d vector space. If I have many such vectors, for argument's sake say 100 vectors, each containing an (x,y) pair, we don't call this a 100-dimensional vector/matrix; we simply represent it graphically as 100 points in the same 2-d vector space.


There is 1 best solution below


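The design matrix is $N \times (D+1)$. Each of the $N$ rows of the matrix represents one datapoint (one month), and each of the $D+1$ columns of the matrix represents an explanatory variable or an intercept term (the all-ones column). So $D$ is the number of explanatory variables ($D = 15$), not the number of months ($N = 12$), and your design matrix has size $12 \times (15+1)$. In other words, it is the transpose of your $15 \times 12$ matrix with a column of ones added, which matches your intuition that 100 $(x,y)$ pairs are 100 points in a 2-d space, not one 100-dimensional object.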
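To make the shapes concrete, here is a minimal sketch in NumPy. Everything in it is an assumption for illustration: the data are random placeholders rather than real profit/loss figures, the names `x_raw`, `X`, `y` and `w` are made up, and putting the ones column first is just one common convention.

```python
import numpy as np

rng = np.random.default_rng(0)

N, D = 12, 15  # N = months (datapoints), D = explanatory variables

# Placeholder data standing in for the spreadsheet (random, not real figures).
x_raw = rng.normal(size=(D, N))  # stored as 15x12, as in the question
y = rng.normal(size=(N, 1))      # 12x1 profit/loss column

# Transpose so each row is one month, then prepend the all-ones intercept column.
X = np.hstack([np.ones((N, 1)), x_raw.T])
print(X.shape)  # (12, 16), i.e. N x (D+1)

# Least-squares weights: one per column of X (intercept + 15 variables).
w, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w.shape)  # (16, 1)
```

Note that with only 12 rows and 16 columns the system is underdetermined, so `np.linalg.lstsq` returns the minimum-norm solution rather than a unique fit; the point of the sketch is only the shapes.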