Why does softmax preserve the same pattern when applied along different axes of a matrix?


I am calculating the softmax function over a matrix containing random float values using the following methods:

  1. row-wise
  2. column-wise
  3. over the whole matrix
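
For reference, here is a minimal sketch of the three computations. It assumes NumPy, and the `softmax` helper, the matrix shape, and the random seed are illustrative choices of mine, so the notebook code may differ in detail:

```python
import numpy as np

def softmax(x, axis=None):
    """Softmax along `axis`; axis=None normalizes over the whole matrix."""
    # Subtract the maximum first so np.exp cannot overflow.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)

rng = np.random.default_rng(0)   # fixed seed, chosen only for repeatability
m = rng.random((4, 5))           # random floats in [0, 1)

row_wise = softmax(m, axis=1)    # each row sums to 1
col_wise = softmax(m, axis=0)    # each column sums to 1
whole = softmax(m)               # all entries together sum to 1
```

Each result has the same shape as `m`, so the three can be plotted as heatmaps side by side.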

After calculating the values, I drew a heatmap of each resulting matrix. I noticed that the pattern in the heatmap (the relative sizes of the cells) is the same for all three methods. I repeated this for several random matrices and observed the same result each time.

What is the reason for this?

If you want to experiment, I have created a Google Colab notebook with the Python code.