What is the difference between $P(S = s, N = n)$ and $P(S = s | N = n)$? I know the former is a joint probability and the latter is a conditional probability, and that $P(S=s, N=n) = P(S=s | N=n)P(N=n)$; however, I can't seem to distinguish between the meanings of the two in certain problems.
For this discussion, let's consider $S$ to be the random variable for the sum of numbers we draw from a hat. $N$ is the number of draws. Let's just say we draw with replacement for simplicity.
So $P(S=s, N=n)$, in words, is the probability that we draw $n$ times AND the sum of the $n$ draws is $s$. $P(S=s | N=n)$, in words, is the probability of drawing a sum $s$ given that we draw $n$ times. In this situation the two sound identical to me, which would imply $P(N=n)=1$. But is this really the case, or am I not understanding this correctly?
You have an event of drawing from the hat a specific number of times, and then a compound event of drawing from the hat a specific number of times and drawing a specific sum.
You need to carefully determine what your universal set $U$ is -- the set whose cardinality you put in the denominator to calculate probabilities. Suppose the number of draws belongs to a finite set $P=\{n_1,n_2,\ldots,n_m\}$, and the numbers in the hat are from the set $X$. Counting outcomes this way treats every tuple in $U$ as equally likely.
While calculating $P(S=s, N=n_1)$, each instance of $n_1$ draws yielding the sum $s$ is counted in the numerator, i.e. the number of $n_1$-tuples adding up to $s$,$$\sum_{\substack{x_1+x_2+\cdots+x_{n_1}=s\\x_i\in X}}1$$and each instance of any sum being obtained after any number of draws is counted in the denominator (which is your compound event), i.e. the number of tuples of any length $n_i\in P$,$$\begin{aligned}|U|&=|S=s,N=n_1|+|S\neq s,N=n_1|+|S=s,N\neq n_1|+|S\neq s,N\neq n_1|\\&=\sum_{i=1}^m|X|^{n_i}\end{aligned}$$
While calculating $P(S=s|N=n_1)$, the universal set is restricted to $n_1$-tuples only. You count only those $n_1$-tuples whose sum is $s$ in the numerator, i.e.$$\sum_{\substack{x_1+x_2+\cdots+x_{n_1}=s\\x_i\in X}}1$$ and all $n_1$-tuples in the denominator, i.e.$$|U|=|N=n_1|=|S=s,N=n_1|+|S\neq s,N=n_1|=|X|^{n_1}$$
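As a small worked instance of these two counts, take the illustrative values $X=\{1,2\}$, $P=\{1,2\}$ (so $n_1=1$, $n_2=2$) and $s=2$; these numbers are not from the question, and every tuple is assumed equally likely. Only the $1$-tuple $(2)$ has sum $2$, so the numerator is $1$ in both cases, while the denominators differ:$$P(S=2,N=1)=\frac{1}{|X|^{1}+|X|^{2}}=\frac{1}{2+4}=\frac16,\qquad P(S=2|N=1)=\frac{1}{|X|^{1}}=\frac12$$The ratio of the two is $P(N=1)=\frac{|X|^{1}}{|X|^{1}+|X|^{2}}=\frac26=\frac13$, which agrees with $P(S=s,N=n)=P(S=s|N=n)P(N=n)$, since $\frac12\cdot\frac13=\frac16$.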
Indeed, when the only element of $P$ is $n_1$ (so $m=1$), $P(N=n_1)=\frac{|X|^{n_1}}{\sum_{i=1}^m|X|^{n_i}}=1$ and the two probabilities are identical.
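The counting argument can be checked by brute force. The following is a minimal Python sketch, assuming (as the counting does) that every tuple of every allowed length is equally likely; the function names `count_tuples`, `joint`, `conditional` and `prob_n`, and the sample values for $X$ and $P$, are made up for illustration.

```python
from fractions import Fraction
from itertools import product


def count_tuples(values, n, s=None):
    """Count n-tuples over `values` (drawn with replacement).

    If `s` is given, count only the tuples whose entries sum to `s`.
    """
    return sum(
        1 for t in product(values, repeat=n) if s is None or sum(t) == s
    )


def joint(values, draw_counts, s, n):
    """P(S=s, N=n): the denominator counts tuples of every allowed length."""
    total = sum(len(values) ** k for k in draw_counts)
    return Fraction(count_tuples(values, n, s), total)


def conditional(values, s, n):
    """P(S=s | N=n): the denominator counts n-tuples only."""
    return Fraction(count_tuples(values, n, s), len(values) ** n)


def prob_n(values, draw_counts, n):
    """P(N=n) under the assumption that all tuples are equally likely."""
    total = sum(len(values) ** k for k in draw_counts)
    return Fraction(len(values) ** n, total)


# Illustrative values: X = {1, 2, 3}, allowed draw counts P = {1, 2}.
X = (1, 2, 3)
P = (1, 2)

print(joint(X, P, s=3, n=2))        # 1/6
print(conditional(X, s=3, n=2))     # 2/9
print(prob_n(X, P, n=2))            # 3/4
```

Here $\frac29\cdot\frac34=\frac16$, so the product rule holds. Calling `joint(X, (2,), 3, 2)` with a single allowed draw count returns $\frac29$, the same value as the conditional probability, which is the case $P(N=n)=1$ from the question.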