I am new to graph theory and graph neural networks and don't know much yet. Assume I have an incidence matrix $\mathbf{B}$ such as
| visitor | item1 | item2 | item3 | item4 |
|---|---|---|---|---|
| A | 1 | 0 | 0 | 1 |
| B | 1 | 1 | 0 | 0 |
| C | 0 | 1 | 0 | 0 |
| D | 0 | 0 | 1 | 1 |
| E | 1 | 0 | 1 | 1 |
where 1 represents a purchase and 0 represents no purchase. Suppose I build an undirected adjacency matrix $\mathbf{A}$ for items as $\mathbf{B}^{\mathsf{T}}\mathbf{B}$ with the diagonal set to 0. Then $\mathbf{A}$ is
| | item1 | item2 | item3 | item4 |
|---|---|---|---|---|
| $\textbf{item1}$ | 0 | 1 | 1 | 2 |
| $\textbf{item2}$ | 1 | 0 | 0 | 0 |
| $\textbf{item3}$ | 1 | 0 | 0 | 2 |
| $\textbf{item4}$ | 2 | 0 | 2 | 0 |
$\textbf{Question1:}$ Can I say that, if a visitor purchased $\textbf{item3}$, a recommender system should recommend $\textbf{item4}$ and $\textbf{item1}$? $\textbf{item4}$ would be the 1st recommendation because its weight (co-purchase count) is 2, and $\textbf{item1}$ the 2nd recommendation because its weight is 1.
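A minimal sketch of that ranking, reusing the $\mathbf{B}$ from the question (the item/column indexing is an assumption for illustration):

```python
import numpy as np

# Incidence matrix from the question: rows = visitors, columns = items
B = np.array([
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 1],
])

# Co-purchase matrix with the diagonal zeroed
A = B.T @ B
np.fill_diagonal(A, 0)

# Rank co-purchased items for item3 (column index 2) by descending weight
row = A[2]
ranking = np.argsort(-row)
recommendations = [i for i in ranking if row[i] > 0]
print(recommendations)  # → [3, 0]: item4 first (weight 2), then item1 (weight 1)
```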
$\textbf{Question2:}$ Should I normalize the item matrix $\mathbf{A}$ by row, like the following, OR
```python
import numpy as np

B = np.array([
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 1],
])

A = B.T.dot(B)
np.fill_diagonal(A, 0)

# Divide each row by its sum (row-stochastic normalization)
print(A / A.sum(axis=1).reshape(4, 1))
```

which prints

```
[[0. 0.25 0.25 0.5 ]
 [1. 0. 0. 0. ]
 [0.33333333 0. 0. 0.66666667]
 [0.5 0. 0.5 0. ]]
```
$\tilde{\mathbf{A}} = \mathbf{D}^{-1/2}\mathbf{A}\mathbf{D}^{-1/2}$, where $\mathbf{D}$ is the degree matrix? The resulting $\tilde{\mathbf{A}}$ is
```python
import numpy as np
from scipy.linalg import sqrtm, inv

B = np.array([
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 1],
])

A = B.T.dot(B)

# Degree matrix built from the diagonal of B^T B (per-item purchase counts)
D = np.diag(np.diagonal(A))
D_inv_sqrt = inv(sqrtm(D))  # D^(-1/2)

out = D_inv_sqrt.dot(A).dot(D_inv_sqrt)
np.fill_diagonal(out, 0)
print(out)
```

which prints

```
[[0. 0.40824829 0.40824829 0.66666667]
 [0.40824829 0. 0. 0. ]
 [0.40824829 0. 0. 0.81649658]
 [0.66666667 0. 0.81649658 0. ]]
```
Many thanks.
What you can definitely say from the numbers $1, 0, 0, 2$ you have in matrix $\mathbf A$ is that out of the five visitors you've collected data on, there was $1$ who purchased item3 together with item1, and there were $2$ who purchased item3 together with item4.
Interpreting that any further is more about common sense than it is about mathematics, and also depends a lot on the application: what do you plan to use this conclusion for?
For example, if the plan is "if a customer purchases item3, I will use this data to recommend item4 as well", then using matrix $\mathbf A$ is a mistake, for the following reason. Imagine a data set in which item1 is batteries, and item2, item3, item4 are various battery-powered items: an emergency radio, a flashlight, and a calculator. Almost everyone who buys one of these items buys batteries for it as well. Then matrix $\mathbf A$ will rank batteries as the top recommendation after nearly every purchase, simply because batteries co-occur with everything, and such a recommendation tells the customer nothing they did not already know.
In such a scenario, you might want to normalize the columns of $\mathbf B$ first (say, so that each column adds up to $1$). Then $\mathbf B^{\mathsf T}\mathbf B$ will make more specific recommendations: if item1 often gets purchased with item2 and item3, but item2 also gets purchased a lot for other reasons, while item3 is almost always bought only with item1, then item3 will be the better recommendation.
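A minimal sketch of that column normalization, reusing the $\mathbf B$ from the question (down-weighting popular items is the point; the exact scaling is one of several reasonable choices):

```python
import numpy as np

B = np.array([
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 1],
], dtype=float)

# Scale each column so it sums to 1: items bought by many visitors
# contribute less per co-occurrence than rarely bought items
Bn = B / B.sum(axis=0)

# Similarity matrix from the normalized incidence matrix
S = Bn.T @ Bn
np.fill_diagonal(S, 0)
print(S)
```

Popular items now contribute a small amount to every entry of their row, so an item that co-occurs with almost everything no longer dominates the ranking.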
On the other hand, if your goal is prediction rather than recommendation, then your original solution is the right choice. It is accurate that more flashlight buyers will go on to buy batteries than emergency radios, and (in the example above) it is accurate that nearly all of them will buy batteries as well.