I am in a probability class and we are just getting into random variables.
A random variable, to me, is a function that maps an event to some real number. But the concept of an event is difficult to swallow.
For example, let f(x) be a function that maps a real number x to a real number y. This is the notion of a function or mapping that I am most familiar with.
Now let f instead be X, with x $\in$ {head, tail}; then X(head) = 0 and X(tail) = 1 is one possible mapping. But head and tail themselves are ill defined: how can we ever operate on strings? I would be more comfortable if they were numbers, since we cannot add, subtract, multiply or divide heads, but we can do all of that with actual numbers. So something is off here: either X is not a function in the sense I imagine, or the event space itself is ill defined.
How do I wrap my head around this?
I'm afraid it is your idea of what a function is that is wrong, rather than anything in probability theory. You also don't quite have the right probability spaces and functions, but that's easier to fix.
Let's first go back to what a function is. In mathematics, to have a function you first need two sets $A$ and $B$ (which may be equal). A function from $A$ to $B$ is a set of ordered pairs $f\subseteq\{(a,b)\mid a\in A, b\in B\}$ such that every $a\in A$ appears as the first coordinate of exactly one pair. The uniqueness part is often written as $((a,b)\in f) \land ((a,c)\in f)\implies b=c$. This condition is what allows us to write $f(a)=b$ as we are used to, since there is no confusion about which $b$ we should be getting (the unique $b$ such that $(a,b)\in f$).
Putting the above paragraph in even less mathematical terms, we can think of a function as a map from $A$ to $B$ such that each element of $A$ maps to exactly one element of $B$.
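If it helps to see this concretely, here is a minimal Python sketch of "a function is a set of ordered pairs". The names `is_function`, `evaluate`, `fruit`, `colours` and `colour_of` are made up for this illustration, and the check uses the "exactly one" condition stated just above.

```python
def is_function(pairs, domain, codomain):
    """Check that `pairs` is a function from `domain` to `codomain`."""
    pairs = list(pairs)
    # Every pair must live in domain x codomain.
    if any(a not in domain or b not in codomain for a, b in pairs):
        return False
    # Every element of the domain must be a first coordinate exactly once.
    first_coords = [a for a, _ in pairs]
    return all(first_coords.count(a) == 1 for a in domain)


def evaluate(pairs, a):
    """Return the unique b with (a, b) in pairs, i.e. f(a)."""
    (b,) = [y for x, y in pairs if x == a]
    return b


# Hypothetical example sets: nothing here is a number.
fruit = {"apple", "banana", "cherry"}
colours = {"red", "yellow", "green"}
colour_of = {("apple", "red"), ("banana", "yellow"), ("cherry", "red")}

# Giving "apple" two colours breaks the uniqueness condition.
two_colours = colour_of | {("apple", "green")}

print(is_function(colour_of, fruit, colours))    # a valid function
print(is_function(two_colours, fruit, colours))  # not a function
print(evaluate(colour_of, "banana"))
```

Note that nothing in `is_function` ever adds, multiplies or compares the elements by size; it only asks whether something is in a set and whether two things are equal.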
Notice that the definition of a function $f$ requires absolutely no structure on the sets. For all we care, $A$ may be a set of fruit and $B$ a set of colours, with each fruit mapped to its colour (assuming each fruit has only one colour). We certainly can't normally add fruit, and it doesn't really matter whether we are adding apples to oranges or apples to apples: no addition operation is defined on either. The familiar addition is defined on the cardinalities of sets of fruit.
For an example of a function that's closer to your probability example and still uses fruit, think of each fruit being assigned its weight. That gives you a function $w$ from fruit to $\mathbb{R}^+$. We still can't add fruit (the domain) but we can easily add their weights (elements of the codomain). This is very similar to how probability works: you can't usually add "heads" and "tails", but you can add the probabilities that either will happen. Your random variable $X$ works the same way: its inputs are just labels, and arithmetic is only ever done on its outputs $0$ and $1$.
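The same point can be sketched in Python with your coin example. This is only an illustration under my own choices: the `Coin` enum, the dictionary `X`, the helper `can_add` and the list `tosses` are all hypothetical names, and the toss sequence is made up.

```python
from enum import Enum


class Coin(Enum):
    """The outcomes are plain labels with no arithmetic defined on them."""
    HEAD = "head"
    TAIL = "tail"


# The random variable X from the question: labels in, real numbers out.
X = {Coin.HEAD: 0, Coin.TAIL: 1}


def can_add(a, b):
    """Report whether a + b is defined for these two objects."""
    try:
        a + b
        return True
    except TypeError:
        return False


# Arithmetic is not defined on the domain ...
print(can_add(Coin.HEAD, Coin.TAIL))
# ... but it is defined on the values X takes.
print(can_add(X[Coin.HEAD], X[Coin.TAIL]))

# A made-up sequence of tosses: summing X over it counts the tails.
tosses = [Coin.HEAD, Coin.TAIL, Coin.TAIL]
number_of_tails = sum(X[t] for t in tosses)
print(number_of_tails)
```

The dictionary lookup `X[t]` is all the function ever does with a head or a tail; every later computation happens on the numbers it returns.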
Now to the slight misunderstanding you had about probability. The probability function (usually denoted $P$) is not defined on the set $\Omega=\{\text{"heads"},\text{"tails"}\}$; rather (I'll simplify a bit here) it is defined on $\mathcal{P}(\Omega)$, the power set of $\Omega$. This is important for a number of reasons, one of which is that you want to be able to talk about the event that either "heads" or "tails" happens.
Another useful thing about having $P$ defined on $\mathcal{P}(\Omega)$ is that you do actually get "addition" in some sense on the base set, in the form of union. This is also part of the definition of the probability function: $P(A\sqcup B)=P(A)+P(B)$, where $\sqcup$ denotes disjoint union. (Again I'm simplifying a bit here; the definition of a probability space requires countable additivity.)
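To make the last two paragraphs concrete, here is a small Python sketch that builds $\mathcal{P}(\Omega)$ for the coin and checks additivity on disjoint events. It assumes a fair coin (my assumption, not something fixed by the question), and the names `power_set`, `outcome_prob` and `P` are chosen just for this example.

```python
from fractions import Fraction
from itertools import chain, combinations

omega = frozenset({"heads", "tails"})


def power_set(s):
    """Return all subsets of s as frozensets."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(c) for c in subsets]


# Assumption for this sketch: the coin is fair.
outcome_prob = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}


def P(event):
    """Probability of an event, i.e. of a subset of omega."""
    return sum((outcome_prob[o] for o in event), Fraction(0))


events = power_set(omega)
for event in events:
    print(sorted(event), P(event))

# Finite additivity: P(A ∪ B) = P(A) + P(B) whenever A and B are disjoint.
additive = all(
    P(A | B) == P(A) + P(B)
    for A in events
    for B in events
    if A.isdisjoint(B)
)
print(additive)
```

Here the inputs to `P` are sets of labels, the "addition" on the input side is the set union `A | B`, and the ordinary `+` only ever appears between the probabilities.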