I'm looking over the answers to old statistics exams in preparation for my own exam. While reviewing a question concerning discriminant analysis, I found something really odd.
I have input variables and a class label for each observation:
x1 = [2, 10, 6, 14, 6, 10]
x2 = [8, 6, 4, 10, 12, 8]
t = [1, 1, 1, 2, 2, 2]
Normally for DA I would calculate the mean for each group (both groups have mean 8). However, the exam asks for the 'group mean vectors', which according to the answers should be:
u1 = [6, 6]
u2 = [10, 10]
However, I can't find anywhere how these were calculated. A Google search for 'group mean vector' turns up nothing, and if I search for discriminant analysis methods, all I find is using the means for each group (scalars, not vectors).
The second strange thing happens in calculating the covariance matrix. Instead of calculating one covariance matrix for both groups, they ask for a "common group covariance matrix" (again Google is useless here, so strange!).
Apparently you have to calculate a matrix per group first, then add the matrices for both groups together (cell by cell) and divide each cell by 2. For x1 this 'group matrix' is:
top_left = 0.5[(2-6)^2 + (6-6)^2 + (10-6)^2] = 16
bottom_right = 0.5[(8-6)^2 + (6-6)^2 + (4-6)^2] = 4
others = 0.5[(2-6)(8-6) + (10-6)(6-6) + (6-6)(4-6)] = -4
=> [ 16  -4 ]
   [ -4   4 ]
A couple of strange things here: the number 14 from data set x1 is never used! The number 8 does not appear in group x1 (it does appear in group 2, but no other numbers from group 2 are used).
I'm at a loss: how is a 'group mean vector' calculated and how is the matrix for each individual group calculated?
For reference, the answer I found later (reproduced from slides)
where D is the dimension (2 in the example in the question) and K is the number of classes (there are 2 classes, or targets, in my example).
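The group mean vectors from the exam answer can be reproduced with a small NumPy sketch. It assumes that each observation is the pair (x1[i], x2[i]) and that t says which group that observation belongs to; the names X and group_means are only illustrative and do not come from the slides.

```python
import numpy as np

x1 = np.array([2, 10, 6, 14, 6, 10])
x2 = np.array([8, 6, 4, 10, 12, 8])
t = np.array([1, 1, 1, 2, 2, 2])

# Assumption: each observation is the pair (x1[i], x2[i]), one row per observation.
X = np.column_stack([x1, x2])

# Group mean vector: average the rows that carry the same class label.
group_means = {int(k): X[t == k].mean(axis=0) for k in np.unique(t)}

print(group_means[1])  # [6. 6.]
print(group_means[2])  # [10. 10.]
```

Read this way, group 1 consists of the points (2, 8), (10, 6) and (6, 4), which is why 14 never shows up in its calculation while 8 does.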
The matrix for each group is then:
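As a check on the numbers in the question, here is a minimal NumPy sketch. It assumes the divisor is the group size minus one (the 0.5 factor for three observations) and that the common matrix is the plain average of the per-group matrices, as the exam answer describes. The names group_cov, covs and common_cov are illustrative, not taken from the slides.

```python
import numpy as np

# One row per observation (x1[i], x2[i]); t holds the class label of each row.
X = np.array([[2, 8], [10, 6], [6, 4], [14, 10], [6, 12], [10, 8]], dtype=float)
t = np.array([1, 1, 1, 2, 2, 2])


def group_cov(Xk):
    """Covariance of one group's rows, dividing by (group size - 1)."""
    d = Xk - Xk.mean(axis=0)        # deviations from the group mean vector
    return d.T @ d / (len(Xk) - 1)  # sum of outer products, then scale


# One matrix per group.
covs = {int(k): group_cov(X[t == k]) for k in np.unique(t)}

# Common matrix as described in the exam answer: add cell by cell, divide by 2.
common_cov = sum(covs.values()) / len(covs)

# covs[1], covs[2] and common_cov all have the values [[16, -4], [-4, 4]].
print(covs[1])
print(common_cov)
```

Both groups have three observations here, so the plain average matches the usual pooled estimate; with unequal group sizes, the pooled estimate would weight each group's matrix by its size minus one.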