I'm having trouble understanding why "the time until an event occurs" is modeled as:
$S(t) = P[T > t]$ where $T$ is time until event and $t$ is some time.
I don't see how $P[T<t]$ or $P[T=t]$ would be phrased, and I'm confused about how to know whether the time until an event occurs should be taken as greater than, less than, or equal to some time $t$.
For example, consider the following problem:
In analyzing the risk of a catastrophic event, an insurer uses the exponential distribution with mean $\alpha$ as the distribution of the time until the event occurs. The insurer has $n$ independent catastrophe policies of this type. Find the expected time until the insurer will have the first catastrophe claim.
How do I know to use $P[T>t]$ and not $P[T<t]$ or $P[T=t]$?
The reason for this particular formulation lies in the wording of the problem.
We are told that there are $n$ independent and identically distributed (i.i.d.) random variables $T_1, T_2, \ldots, T_n$, one for each policy. We want to find the expected value of the minimum $T_\text{min}$ of these random variables, which here is the time until the first catastrophe claim.
We now observe that $T_\text{min} > t$ if and only if $T_1 > t, T_2 > t, \ldots, T_n > t$. This is the critical observation. It permits us to conclude
$$ P(T_\text{min} > t) = P(T_1 > t, T_2 > t, \ldots, T_n > t) $$
From independence, we can then write
$$ P(T_\text{min} > t) = P(T_1 > t) \times P(T_2 > t) \times \cdots \times P(T_n > t) $$
and from the fact that they are identically distributed, we can write
$$ P(T_\text{min} > t) = [P(T_1 > t)]^n $$
(Note that we cannot assert $T_\text{min} < t$ if and only if $T_1 < t, T_2 < t, \ldots, T_n < t$; the minimum is less than $t$ as soon as at least one of the $T_i$ is. So this useful factorization does not occur if we approach the problem that way.)
Now, for the exponential distribution, fortunately, we have $P(T_1 > t) = 1 - P(T_1 < t) = 1-[1-e^{-\lambda t}] = e^{-\lambda t}$ for $t \geq 0$ (where $\lambda = 1/\alpha$), so
$$ P(T_\text{min} > t) = e^{-n\lambda t} $$
That is, the minimum of $n$ i.i.d. exponentially distributed random variables is distributed identically to a single exponentially distributed random variable with $n$ times the rate (or, equivalently, $1/n$ of the mean). The expected time until the first claim is therefore $\alpha/n$.
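The expected time asked for in the problem can also be read off from the survival function itself, using the standard identity $E[T] = \int_0^\infty P(T > t)\,dt$ for a nonnegative random variable $T$:
$$ E[T_\text{min}] = \int_0^\infty P(T_\text{min} > t)\,dt = \int_0^\infty e^{-n\lambda t}\,dt = \frac{1}{n\lambda} = \frac{\alpha}{n} $$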
Long story short: The analysis focuses on $P(T > t)$ because, for a minimum, that is the event that factors into a product over the individual variables. If we were interested in the maximum of those random variables, we would instead use $P(T_\text{max} < t)$, since $T_\text{max} < t$ if and only if every $T_i < t$, which gives $P(T_\text{max} < t) = [1-e^{-\lambda t}]^n$. Unfortunately, in that case the result does not simplify as nicely: the maximum of $n$ i.i.d. exponentially distributed random variables (for $n > 1$) is not itself exponentially distributed.
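The result for the minimum can be sanity-checked numerically. The following is only a minimal simulation sketch using Python's standard library; the function names and the values $\alpha = 10$, $n = 5$ are illustrative assumptions, not part of the original problem.

```python
import math
import random


def mean_of_min(alpha, n, trials, seed=0):
    """Sample mean of min(T_1, ..., T_n), each T_i exponential with mean alpha."""
    rng = random.Random(seed)
    rate = 1.0 / alpha  # random.expovariate takes the rate, not the mean
    total = 0.0
    for _ in range(trials):
        total += min(rng.expovariate(rate) for _ in range(n))
    return total / trials


def survival_of_min(alpha, n, t, trials, seed=1):
    """Empirical estimate of P(T_min > t)."""
    rng = random.Random(seed)
    rate = 1.0 / alpha
    hits = 0
    for _ in range(trials):
        if min(rng.expovariate(rate) for _ in range(n)) > t:
            hits += 1
    return hits / trials


# Illustrative values (assumed): mean alpha = 10, n = 5 policies.
alpha, n, t = 10.0, 5, 2.0

print("simulated E[T_min]  :", mean_of_min(alpha, n, trials=100_000))
print("theoretical alpha/n :", alpha / n)

print("simulated P(T_min>t):", survival_of_min(alpha, n, t, trials=100_000))
print("theoretical value   :", math.exp(-n * t / alpha))
```

With enough trials, the simulated mean should land close to $\alpha/n$ and the simulated survival probability close to $e^{-n\lambda t}$, up to sampling noise.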