$\newcommand{\set}[1]{\{#1\}}$ $\newcommand{\mc}{\mathcal}$ $\newcommand{\R}{\mathbf R}$ $\newcommand{\ST}{\mathbf S}$
Introduction
The purpose of this post is to understand the theorem of time dilation in special relativity. Before even stating the theorem, I wanted to make precise the fundamentals so that one can write a rigorous proof. Following is my attempt to come up with an axiomatization of special relativity. I was not able to follow the thought experiments in physics books (though I was able to appreciate them). More precisely, I would not like to use trains and lightning bolts in my arguments since they are not precise mathematical objects.
Also, I have seen a purely mathematical treatment in Gregory Naber's Geometry of Minkowski Spacetime, but I was not able to follow the motivation behind postulating a bilinear form of a certain index on spacetime. I would like to anchor things to "the speed of light is constant in all inertial frames" and, I believe, the following achieves this. Basically, I would like to have a "well-motivated" development of the subject. Of course, this is inherently a subjective matter. (Here is my previous attempt at formalizing special relativity).
The problem I am facing is described in the last section. Briefly, I am stuck at proving time dilation without having at least two spatial dimensions. Further, the only axiom I can currently come up with to exploit the extra dimension is ugly. More details below.
I have not given the proofs of any of the theorems below for two reasons. One is that this post is already very long and the second is that the proofs of these theorems are routine given the definitions.
Lastly, I hope the motivations of the axioms and definitions are clear to the reader and I will make every effort to clarify any of them if needed.
Axiomatization
We postulate that spacetime is a $2$-dimensional oriented affine space, which we denote as $\ST$, and whose elements are called events, equipped with a collection $\mc P$ of $1$-dimensional affine subspaces, whose elements are called photons.
A particle is any $1$-dimensional affine subspace of $\ST$. Thus each photon is a particle.
Axiom 1. There exist at least two non-parallel photons.
A frame of reference is an orientation preserving affine isomorphism $T:\R^2\to \ST$. The idea is that any frame of reference assigns coordinates to spacetime. We think of the $x$-axis of $\R^2$ as the "spatial axis" and the $y$-axis as the "temporal axis." More formally, given a frame of reference $T:\R^2\to \ST$ and an event $p$ in $\ST$, we define the spatial coordinate of $p$ as measured by $T$ as the $x$-coordinate of $T^{-1}(p)$, and the temporal coordinate of $p$ as measured by $T$ as the $y$-coordinate of $T^{-1}(p)$. The observer corresponding to $T$ is defined as the image of the $y$-axis under $T$. The idea here is that the events to which $T$ assigns $0$ as the spatial coordinate are the ones "experienced by $T$."
Let $T$ be a frame of reference. We say that two events $p$ and $q$ are simultaneous with respect to $T$ if $p$ and $q$ have the same temporal coordinate with respect to $T$. Let $\alpha$ be a particle. The velocity of $\alpha$ as measured by $T$ is defined as the slope of the line $T^{-1}(\alpha)$. The magnitude of the velocity is called the speed.
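The definitions above can be tried out numerically. The following is only a minimal sketch under an assumption that is not part of the post: spacetime $\ST$ is identified with $\R^2$, and a frame is the affine map $T(u) = Au + p$ for a $2\times 2$ matrix $A$ with positive determinant. The names `make_frame`, `coords`, `simultaneous` and `velocity` are hypothetical helpers.

```python
import math

# Assumption: spacetime is identified with R^2 and a frame T is the affine
# map T(u) = A u + p, where A = ((a, b), (c, d)) is given row by row.
def make_frame(A, p):
    (a, b), (c, d) = A
    det = a * d - b * c
    if det <= 0:
        raise ValueError("a frame must be orientation preserving (det > 0)")
    return {"A": A, "p": p, "det": det}

def coords(frame, event):
    """Return T^{-1}(event) as (spatial coordinate, temporal coordinate)."""
    (a, b), (c, d) = frame["A"]
    x = event[0] - frame["p"][0]
    y = event[1] - frame["p"][1]
    det = frame["det"]
    # Apply the inverse matrix (1/det) * ((d, -b), (-c, a)).
    return ((d * x - b * y) / det, (-c * x + a * y) / det)

def simultaneous(frame, e1, e2, tol=1e-12):
    """True if e1 and e2 have the same temporal coordinate in the frame."""
    return abs(coords(frame, e1)[1] - coords(frame, e2)[1]) < tol

def velocity(frame, e1, e2):
    """Slope of T^{-1}(line through e1 and e2), following the definition above."""
    (x1, y1), (x2, y2) = coords(frame, e1), coords(frame, e2)
    if x1 == x2:
        return math.inf  # the line is vertical in T's coordinates
    return (y2 - y1) / (x2 - x1)
```

Note that with velocity taken literally as a slope, the observer's own line (the image of the $y$-axis) is vertical in its own coordinates, so the sketch reports `math.inf` for it.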
A frame of reference $T$ is said to be admissible if each photon has speed $1$ with respect to it.
Axiom 2. There exists an admissible frame of reference.
Axiom 3. If a particle has speed $1$ with respect to an admissible frame of reference then the particle is a photon.
Theorem 1. Let $\alpha$ and $\beta$ be two photons which are not parallel. Then every particle parallel to $\alpha$ or $\beta$ is a photon, and every photon is parallel to $\alpha$ or $\beta$. Further, through every event there pass exactly two photons.
Now let $S$ and $T$ be two admissible frames of reference and assume that $T(0) = S(0) = p$ for simplicity of the discussion. Let $v_1 = T^{-1}(Se_1)$ and $v_2 = T^{-1}(Se_2)$. Let $\ell_i$ be the line passing through the origin and $v_i$. Let $m_1$ and $m_2$ be the two photons passing through $p$ and let $r_i = T^{-1}(m_i)$. Since each photon has speed $1$ with respect to $T$, we have $$ r_1 = \set{(x, y)\in \R^2:\ x=y} \quad \text{ and }\quad r_2 = \set{(x, y)\in \R^2:\ x+y=0} $$ Now using the fact that each photon has speed $1$ with respect to $S$, and the fact that both $S$ and $T$ are orientation preserving, we see that $r_1$ passes through $v_1+v_2$ and $r_2$ passes through $v_1-v_2$. From this it follows that the angle which $v_1$ makes with the $x$-axis is the same as the angle which $v_2$ makes with the $y$-axis. (The following lemma is not exactly true, for in the figure following the lemma we could very well replace $v_1$ by $-v_1$ and $v_2$ by $-v_2$. I still think it is true enough and hence I do not remove it.)
Lemma 2. Let $S$ and $T$ be two admissible frames. Write $v_i = T^{-1}(Se_i)$ for $i=1, 2$. Let $l_i = \set{tv_i:\ t\in \R}$, $m_1$ denote the $x$-axis, and $m_2$ denote the $y$-axis. Then the following hold.
- The orientation of $(v_1, v_2)$ is the same as the orientation of $(e_1, e_2)$.
- The line joining $v_1$ and $v_2$ is parallel to $\set{(x, y)\in \R^2:\ x+y=0}$, and the line joining the origin and $v_1+v_2$ is parallel to $\set{(x, y)\in \R^2:\ x=y}$.
- The lengths of the two vectors $v_1$ and $v_2$ are equal.
- The signed angles $\angle(m_1, l_1)$ and $\angle(m_2, l_2)$ are equal.
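The items of the lemma can be checked numerically. This is a sketch, not a proof, and it rests on assumptions of mine. Since $v_1+v_2$ lies on $x=y$ and $v_1-v_2$ lies on $x+y=0$, one can write $v_1=(b,a)$ and $v_2=(a,b)$; I further assume $b>|a|$, which fixes the sign ambiguity mentioned before the lemma. For the signed angles I assume that $\angle(m_1,l_1)$ is measured counterclockwise and $\angle(m_2,l_2)$ clockwise, that is, both towards the line $x=y$. The helper names are hypothetical.

```python
import math

def transition_columns(a, b):
    """Assumed parametrisation: v1 = (b, a), v2 = (a, b) with b > |a|."""
    if not b > abs(a):
        raise ValueError("this sketch assumes b > |a|")
    return (b, a), (a, b)

def check_lemma(a, b, tol=1e-12):
    v1, v2 = transition_columns(a, b)
    det = v1[0] * v2[1] - v1[1] * v2[0]      # orientation of (v1, v2)
    diff = (v1[0] - v2[0], v1[1] - v2[1])    # direction of the line joining v1 and v2
    summ = (v1[0] + v2[0], v1[1] + v2[1])    # the point v1 + v2
    # Assumed sign convention: counterclockwise from the x-axis for v1,
    # clockwise from the y-axis for v2.
    angle1 = math.atan2(v1[1], v1[0])
    angle2 = math.atan2(v2[0], v2[1])
    return {
        "orientation_preserved": det > 0,
        "joining_line_parallel_to_x_plus_y_0": abs(diff[0] + diff[1]) < tol,
        "sum_lies_on_x_equals_y": abs(summ[0] - summ[1]) < tol,
        "equal_lengths": abs(math.hypot(*v1) - math.hypot(*v2)) < tol,
        "equal_angles": abs(angle1 - angle2) < tol,
    }
```

For example, `check_lemma(0.6, 1.25)` returns a dictionary whose values are all `True`. If both angles were measured counterclockwise instead, they would come out as negatives of each other.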
Given two frames $S$ and $T$, we define the velocity of $S$ with respect to $T$ as the velocity of the observer corresponding to $S$ with respect to $T$.
Theorem 3. Let $S$ and $T$ be two admissible frames. Then the velocity of $S$ with respect to $T$ is the negative of the velocity of $T$ with respect to $S$.
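A quick numerical check of Theorem 3, again only a sketch. It assumes $T(0)=S(0)$ and that $T^{-1}S$ has columns $v_1=(b,a)$ and $v_2=(a,b)$ with $b>|a|$ and $a\neq 0$. Velocity is taken literally as a slope, as defined earlier; using the reciprocal (change in spatial coordinate per unit of temporal coordinate) gives the same sign relation. The function names are hypothetical.

```python
def slope(v):
    """Slope of the line through the origin and v."""
    if v[0] == 0:
        raise ZeroDivisionError("vertical line: slope is undefined")
    return v[1] / v[0]

def relative_velocities(a, b):
    # Assumption: T^{-1} S is linear with columns v1 = (b, a), v2 = (a, b).
    if not (b > abs(a) and a != 0):
        raise ValueError("this sketch assumes b > |a| and a != 0")
    det = b * b - a * a
    # Observer of S as seen by T: the line through v2 = T^{-1} S e2.
    s_in_t = slope((a, b))
    # Observer of T as seen by S: the line through S^{-1} T e2, which is the
    # second column of the inverse matrix (1/det) * ((b, -a), (-a, b)).
    t_in_s = slope((-a / det, b / det))
    return s_in_t, t_in_s
```

Calling `relative_velocities(0.6, 1.25)` gives two numbers that sum to zero up to rounding, as the theorem predicts.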
Question
Now suppose $S$ and $T$ are two admissible frames with $S$ having speed $v$ with respect to $T$. It is immediately clear that if $p$ and $q$ are two events that are observed to occur in the same place by $T$, then the temporal difference between $p$ and $q$ is measured differently in the two frames. We would like to quantify this difference. The problem then essentially asks to find out the length of $v_2$ (or $v_1$) in the figure above, since the dilation factor is nothing but $XO/BO$. However, the axioms so far cannot be used to derive the length of $v_2$.
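One way to see the obstruction concretely is the following sketch. It assumes, as before, that $T^{-1}S$ has columns $v_1=(b,a)$ and $v_2=(a,b)$. Multiplying this matrix by any $\lambda>0$ still sends the lines $x=y$ and $x+y=0$ to themselves and keeps the determinant positive, so the rescaled frame passes the same checks and has the same relative velocity, while the length of $v_2$ is multiplied by $\lambda$. The check below is only a sufficient condition matching the form in Lemma 2, and the names are hypothetical.

```python
import math

def preserves_photon_lines(M, tol=1e-12):
    """True if M has det > 0 and maps x = y and x + y = 0 to themselves."""
    (p, q), (r, s) = M                       # M is given row by row
    d1 = (p + q, r + s)                      # image of the direction (1, 1)
    d2 = (p - q, r - s)                      # image of the direction (1, -1)
    det = p * s - q * r
    return det > 0 and abs(d1[0] - d1[1]) < tol and abs(d2[0] + d2[1]) < tol

def scaled_transition(a, b, lam):
    """lam times the matrix with columns v1 = (b, a) and v2 = (a, b)."""
    return ((lam * b, lam * a), (lam * a, lam * b))

def length_of_v2(M):
    """Euclidean length of the second column of M."""
    return math.hypot(M[0][1], M[1][1])
```

For instance, `scaled_transition(0.6, 1.25, lam)` passes `preserves_photon_lines` for `lam` equal to `0.5`, `1.0` and `2.0`, yet `length_of_v2` differs in each case.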
In Einstein's thought experiment we have a train housing a light clock which fires a photon perpendicular to the motion of the train, and it is tacitly assumed that the dimension perpendicular to the motion of the train "behaves the same with respect to both observers." I am able to make this thought experiment precise, but only with a very ugly axiom (that directions perpendicular to the relative motion are unaffected). Also, it is unsettling that one needs an extra spatial dimension to establish the factor of time dilation.
EDIT: Andreas Blass pointed out in the comments that the temporal order of two events may be different for different observers. This exposed an error in my formalism and resulted in this edit.
