Evaluating 'equalness' of matrices: what function to use?


I'm working on a research project at university involving a convolutional neural network.
I have neurons/filters described by 5x5 matrices with values from 0 to 1. At the very beginning, the filters are initialised with random values, all drawn from the same distribution (Gaussian, same mean and variance). As training goes on, I expect these filters to change, so that each develops areas with high and low values.
I want to compare these matrices statistically. My thinking is that at the very beginning one can show that these matrices were generated from the same distribution. What function would describe this? Covariance/correlation?
As the filters specialise/change, they should become less equal regarding the distribution of high and low values inside the matrix, so the value of that function should go down.
When I have some number of filters > 2, should I compare them pairwise, or could I also compare the 'equalness' of the whole set at once?

Thanks in advance.

Edit: I should mention that the position of the values is important for the analysis. Right now I have pictures to visualize my filters, and I'm sifting through them to see which parameters produced good filters and which produced bad ones. That's what I want to do with a mathematical model instead.

Here are some pictures (yellow is high values, purple is low values):

- A pretty much unlearned filter: green indicates values somewhere near the mean.
- An upper edge detector: values are near 1 at the top and near 0 at the bottom of the picture.
- A bottom edge detector.
- Something in the making.

My thinking was that I could show the first and last filter have some statistical value in common, something that decreases when the filters start to change. I'd argue that filters which don't 'correlate' with the Gaussian distribution are better filters, and that filter sets whose members don't 'correlate' with each other are better filter sets. What statistical property could I use?
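Just to make the pairwise idea concrete, something like the following sketch is what I have in mind (the filter count of 4 and the random seed are only illustrative): flattening each filter and computing pairwise Pearson correlations with `np.corrcoef`. Flattening preserves position, so two filters only correlate highly when their high/low values sit in the same places.

```python
import numpy as np

# Illustrative sketch: pairwise Pearson correlation between flattened filters.
rng = np.random.default_rng(0)
filters = rng.normal(size=(4, 5, 5))      # four freshly initialised filters
flat = filters.reshape(len(filters), -1)  # each row is one flattened filter
corr = np.corrcoef(flat)                  # 4x4 matrix of pairwise correlations
print(corr.round(2))
```

For freshly initialised filters I'd expect the off-diagonal entries to sit near zero, while learned filters detecting the same feature should show large correlations.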

1 Answer


I will try a practical example. I might be misunderstanding (so please let me know if I'm making incorrect assumptions here), but I assume we start with two $5 \times 5$ matrices with entries sampled from the same Gaussian distribution; I will just use the standard normal with mean 0 and standard deviation 1. Using Python to generate two such matrices:

import numpy as np

A = np.random.normal(size=(5,5))
B = np.random.normal(size=(5,5))

I will not display the result here to save space. Then I calculate a histogram with 6 bins for the first matrix:

np.histogram(A.flatten(),bins=6)

For me this resulted in

(array([1, 1, 5, 6, 9, 3]),
 array([-2.74347505, -2.1026386 , -1.46180215, -0.82096569, -0.18012924,
         0.46070721,  1.10154366]))

The first array in the result is the count in each bin, the second indicates the bin edges. Now I calculate a histogram for $B$ with the same bins:

np.histogram(B.flatten(), bins=[-2.74347505, -2.1026386, -1.46180215, -0.82096569,
                                -0.18012924, 0.46070721, 1.10154366])

For my $B$ matrix the result was then

(array([0, 0, 3, 8, 8, 5]),
 array([-2.74347505, -2.1026386 , -1.46180215, -0.82096569, -0.18012924,
         0.46070721,  1.10154366])) 

Now I want to apply a statistical test to see whether [1, 1, 5, 6, 9, 3] and [0, 0, 3, 8, 8, 5] approximate the same distribution. I have developed a simple distance-based metric for this: normalize the histograms so they become frequency functions, calculate the $l_1$ distance between these frequency functions, and then use a CDF to convert that distance into a probability score. Normalizing and calculating the distance:

a = [x/sum([1, 1, 5, 6, 9, 3]) for x in [1, 1, 5, 6, 9, 3]]
b = [x/sum([0, 0, 3, 8, 8, 5]) for x in [0, 0, 3, 8, 8, 5]]
dist = sum([abs(x-y) for (x,y) in zip(a,b)])

Printing these results, I get

a= [0.04, 0.04, 0.2, 0.24, 0.36, 0.12]
b= [0.0, 0.0, 0.125, 0.3333333333333333, 0.3333333333333333, 0.20833333333333334]
dist= 0.3633333333333334
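Putting the steps above together (minus my CDF conversion, which needs the joblib files mentioned below), the whole procedure can be wrapped in one small function; the helper name and the seed here are just my choices for the sketch:

```python
import numpy as np

def histogram_l1_distance(A, B, bins=6):
    """L1 distance between the normalized histograms of two matrices.

    Bin edges are taken from A, mirroring the steps above; entries of B
    that fall outside A's range are simply not counted.
    """
    counts_a, edges = np.histogram(A.flatten(), bins=bins)
    counts_b, _ = np.histogram(B.flatten(), bins=edges)
    freq_a = counts_a / counts_a.sum()   # normalize each histogram by its
    freq_b = counts_b / counts_b.sum()   # own total count, as above
    return float(np.abs(freq_a - freq_b).sum())

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
B = rng.normal(size=(5, 5))
dist = histogram_l1_distance(A, B)
print(dist)  # values near 0 suggest similar distributions; the maximum is 2
```

Since both frequency vectors sum to 1, the distance is always between 0 and 2, and comparing a matrix with itself gives exactly 0.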

Now I run it through my CDF for dimension 6:

import joblib

cdf_6 = joblib.load('repos/bias_distance/cdf/l_1_lin_interp_dim_6.joblib')
cdf_6(dist)

which results in a p-value of array(0.08840493).

Now for comparison, let's say I change $B$ to B = np.random.uniform(-2,2,size=(5,5)), which results in the following histogram counts for the same bins as before: [0, 4, 1, 4, 7, 5]. Repeating the procedure and calculating the distance, I get

dist= 0.5371428571428571
print(cdf_6(dist))

array(0.33260882)

and there is obviously more "bias" between $A$ and $B$ now.

You would have to decide on reasonable cut-offs for considering two matrices "the same"; for this metric, the closer the value is to zero, the more alike the samples are.

At first, as I mentioned in the comments, I thought about just using Pearson's chi-squared test, but that test requires at least 5 observations in each bin, so for matrices of the size you have it won't work. To read more about the distance-based bias metric I am using here, please see this preprint. If you want to use it, I can supply you with the (Python/sklearn-based) CDF functions you will need to convert the distances to probability scores.