I have read a few of the posts here on Shapley values and have started to form an intuition about them, especially in connection with the explainability of ML models. However, I still do not understand part of the reasoning behind them. The formula for the Shapley value, which assigns a payoff to player $i$, is $$\phi_i(N, v) = \frac{1}{|N|!}\sum_{S\subset N\setminus \{i \}} |S|!(|N|-|S|-1)!\bigg(v(S\cup \{i\})-v(S)\bigg).$$ The aspects I do not understand are:
Why do we have to consider the order in which the group is formed? As I understand it, the weights count the number of ways each $S\subset N\setminus \{i \}$ can be realised. But shouldn't the order be irrelevant, since a coalition is just a set of players, however it was assembled?

I read this answer to improve my understanding, but got stuck here: why is the marginal payoff $v(P_i^R \cup\{i\}) - v(P_i^R)$, where (as I understand it) $P_i^R$ is the set of players that precede $i$ in the ordering $R$, and not $v(N) - v(N\setminus \{i\})$?
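To make the two readings concrete, here is a minimal Python sketch that may help frame the question. It assumes a made-up 3-player game (the `WORTH` table and all function names are hypothetical, chosen only for illustration). It computes the payoff once with the subset formula above, once by averaging $v(P_i^R \cup\{i\}) - v(P_i^R)$ over all orderings $R$, and also evaluates the alternative $v(N) - v(N\setminus \{i\})$ for comparison.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import factorial

# Hypothetical 3-player game; the worths are invented for illustration.
PLAYERS = (0, 1, 2)
WORTH = {
    frozenset(): 0,
    frozenset({0}): 10,
    frozenset({1}): 20,
    frozenset({2}): 30,
    frozenset({0, 1}): 50,
    frozenset({0, 2}): 60,
    frozenset({1, 2}): 70,
    frozenset({0, 1, 2}): 100,
}


def v(coalition):
    """Worth of a coalition, given as any iterable of players."""
    return WORTH[frozenset(coalition)]


def shapley_by_subsets(i):
    """Subset formula: weight |S|! (|N|-|S|-1)! on each marginal, divided by |N|!."""
    others = [p for p in PLAYERS if p != i]
    n = len(PLAYERS)
    total = Fraction(0)
    for size in range(len(others) + 1):
        weight = factorial(size) * factorial(n - size - 1)
        for S in combinations(others, size):
            total += weight * (v(set(S) | {i}) - v(S))
    return total / factorial(n)


def shapley_by_orderings(i):
    """Average of v(P ∪ {i}) - v(P) over all orderings, P = players before i."""
    total = Fraction(0)
    for order in permutations(PLAYERS):
        predecessors = order[:order.index(i)]
        total += v(set(predecessors) | {i}) - v(predecessors)
    return total / factorial(len(PLAYERS))


def last_in_marginal(i):
    """The alternative from the question: v(N) - v(N \\ {i})."""
    rest = [p for p in PLAYERS if p != i]
    return v(PLAYERS) - v(rest)


for i in PLAYERS:
    print(i, shapley_by_subsets(i), shapley_by_orderings(i), last_in_marginal(i))
print("v(N) =", v(PLAYERS))
print("sum of Shapley values =", sum(shapley_by_subsets(i) for i in PLAYERS))
print("sum of last-in marginals =", sum(last_in_marginal(i) for i in PLAYERS))
```

In this toy game the subset formula and the ordering average give the same payoffs, and those payoffs add up to $v(N)$, whereas the quantities $v(N) - v(N\setminus \{i\})$ add up to more than $v(N)$. I can see that the two computations agree numerically, but I would still like to understand the reasoning behind it.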