Online, I heard that
When you draw $10$ cards without replacement from a standard $52$-card deck, the expected number of aces is $\displaystyle \frac{10}{13}$. This follows simply from linearity of expectation, since expectation is linear even for random variables that are not independent.
When I tried to study probability and expectation in a rigorous mathematical manner, I learned that $\Omega$ is a set called the sample space, and an event $A$ is just a subset of the sample space. The probability can be defined as a function $P : \mathcal{P}(\Omega) \to \mathbb{R}$ where $P(A) = |A|/|\Omega|$.
A random variable $X$ is then defined as a function $X : \Omega \to \mathbb{R}$ that assigns a real value to each outcome in the sample space. I also learned some notational definitions:
Denote the set $\{ a \le X \le b\} = \{ \omega \in \Omega \, : \, a \le X(\omega) \le b \}$; then $P(a \le X \le b)$ is a notation for $P(\{a \le X \le b\})$.
Let $A \subseteq \mathbb{R}$, then $P(X \in A) = P(\{\omega \in \Omega \, : \, X(\omega) \in A\})$. Thus, $P(X = x)$ is a notation for $P(X \in \{x\})$.
Let $\mathcal{F} = \{f \mid f : \Omega \to \mathbb{R}\}$; then the expectation is the function $\mathbb{E} : \mathcal{F} \to \mathbb{R}$ defined as $$\mathbb{E}(f) = \sum_{\omega \in \Omega}f(\omega)P(\{ \omega \})$$ and from this definition we can easily show linearity of expectation by some manipulations of summations.
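As a minimal sketch of that definition (my own toy example, not from the question): take $\Omega$ to be the faces of a fair die with the uniform measure, and compute $\mathbb{E}(f)$ as the weighted sum over $\Omega$, which also makes linearity visible.

```python
from fractions import Fraction

# Omega = faces of a fair die; P({omega}) = 1/|Omega| (uniform measure).
omega = [1, 2, 3, 4, 5, 6]

def expectation(f, omega):
    """E(f) = sum over omega of f(omega) * P({omega}), per the definition above."""
    p = Fraction(1, len(omega))
    return sum(Fraction(f(w)) * p for w in omega)

print(expectation(lambda w: w, omega))          # E(X) = 7/2
# Linearity: E(2X + 3) = 2 E(X) + 3 = 10
print(expectation(lambda w: 2 * w + 3, omega))  # 10
```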
All of this assumes that $P$ is uniform; I know of no corresponding definition for non-uniform distributions. So we fix the sample space $\Omega$ and then evaluate expectations over it. I still don't understand the rigorous treatment of conditional probability or conditional expectation, or what dependency between random variables means.
Now, for drawing $10$ cards without replacement, the short (and, to me, utterly unsatisfying) explanation is that we can define $X_i$ as the random variable that takes the value $1$ if the $i$th card drawn is an ace and $0$ otherwise, and then "conclude" (using $\mathbb{E}(X_i) = \mathbb{E}(X_1)$ by symmetry) that $$\mathbb{E}(X_1 + X_2 + \dots + X_{10}) = \mathbb{E}(X_1) + \mathbb{E}(X_2) + \dots + \mathbb{E}(X_{10}) = 10\,\mathbb{E}(X_1) = \frac{10}{13}$$
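(For what it's worth, the value $10/13$ can be checked independently of linearity: the number of aces among the $10$ cards is hypergeometric, so $\mathbb{E} = \sum_k k \, P(\text{exactly } k \text{ aces})$ can be computed exactly. This check is my own addition.)

```python
from fractions import Fraction
from math import comb

# E(number of aces) computed directly from the hypergeometric distribution:
# P(exactly k aces) = C(4,k) * C(48,10-k) / C(52,10), k = 0,...,4.
expected = sum(
    Fraction(k * comb(4, k) * comb(48, 10 - k), comb(52, 10))
    for k in range(5)
)
print(expected)  # 10/13
```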
But what does that mean, strictly speaking? When we draw one card, the sample space changes, while the definition of expectation was defined over a fixed sample space. What would expectation on a changing sample space mean?
Am I getting these definitions wrong? They were inspired by/drawn from Chung and AitSahlia's Elementary Probability Theory. Is there a way to study probability and expectation with rigor, maybe via recommended resources? Is there a way that is simple and doesn't need two months of heavy study (with Calculus and Measure Theory as likely prerequisites) for me to understand simple discrete expectation, like the $10/13$ problem above?
P.S. The above is just an example of linearity of expectation applied to dependent variables, but I want to understand what it really means to apply linearity of expectation to dependent variables in general, and what dependency of random variables means, strictly speaking.
Any help would be appreciated.
The sample space when you are drawing $10$ cards without replacement is the space $\Omega$ of all $10$-permutations of the set $\{1, 2, 3, \dots, 52\}$, equipped with the probability measure $$\mathbb{P}(\omega_1, \omega_2, \dots, \omega_{10}) = \frac{1}{10!\binom{52}{10}} \quad \forall \, (\omega_1, \omega_2, \dots, \omega_{10}) \in \Omega,$$ since $|\Omega| = 52 \cdot 51 \cdots 43 = 10!\binom{52}{10}$.
Then $X_i : \Omega \to \mathbb{R}$ is the function that maps $(\omega_1, \omega_2, \dots, \omega_{10})$ to $1$ if $\omega_i$ is an ace and to $0$ otherwise, and we have $X = \sum_{i=1}^{10} X_i$.
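This construction can be verified exactly on a scaled-down deck, since the full $52$-card space is too large to enumerate. The following sketch (my own illustration) uses $6$ cards of which cards $1$ and $2$ are "aces", drawing $3$ without replacement; linearity predicts $\mathbb{E}(X) = 3 \cdot \frac{2}{6} = 1$.

```python
from fractions import Fraction
from itertools import permutations

# Omega = all 3-permutations of {1,...,6}, with the uniform measure P.
aces = {1, 2}
omega = list(permutations(range(1, 7), 3))

# X(w) = number of aces among the drawn cards; E(X) = sum_w X(w) * P({w}).
expected = sum(
    Fraction(sum(1 for c in w if c in aces), len(omega)) for w in omega
)
print(expected)  # 1
```

Each $X_i$ here is literally a function on the one fixed sample space of full draw sequences, which is the answer's point: the sample space never "changes" between draws.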