I am told that the definition of a polyhedron is
$$P = \{ x \vert a_j^T x \le b_j, j = 1, \dots, m, c_j^T x = d_j, j = 1, \dots, p \}.$$
I am then told that the compact notation is
$$P = \{ x \vert Ax \preceq b, Cx = d \},$$
where
$$A = \begin{bmatrix} a_1^T \\ \vdots \\ a_m^T \end{bmatrix}, \ \ \ \ \ \ \ \ \ \ C = \begin{bmatrix} c_1^T \\ \vdots \\ c_p^T \end{bmatrix},$$
and the symbol $\preceq$ denotes vector inequality or componentwise inequality in $\mathbb{R}^m$: $u \preceq v$ means $u_i \le v_i$ for $i = 1, \dots, m$.
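To make the compact notation concrete, here is a small numerical check (a hypothetical example with numpy, not from the book) that tests membership in a polyhedron using the componentwise inequality:

```python
import numpy as np

# Hypothetical polyhedron in R^2: the unit square, written with four
# inequality constraints a_j^T x <= b_j and no equality constraints.
A = np.array([[ 1.0,  0.0],   #  x1 <= 1
              [-1.0,  0.0],   # -x1 <= 0, i.e. x1 >= 0
              [ 0.0,  1.0],   #  x2 <= 1
              [ 0.0, -1.0]])  # -x2 <= 0, i.e. x2 >= 0
b = np.array([1.0, 0.0, 1.0, 0.0])

def in_polyhedron(x, A, b):
    # "Ax <= b" componentwise: every entry of Ax is at most the
    # corresponding entry of b.
    return bool(np.all(A @ x <= b))

print(in_polyhedron(np.array([0.5, 0.5]), A, b))  # True
print(in_polyhedron(np.array([1.5, 0.5]), A, b))  # False
```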
I am then told that simplexes are an important family of polyhedra. If the $k + 1$ points $v_0, \dots, v_k \in \mathbb{R}^n$ are affinely independent, which means $v_1 - v_0, \dots, v_k - v_0$ are linearly independent, then the simplex determined by them is given by
$$C = \mathbf{\text{conv}}\{v_0, \dots, v_k\} = \{ \theta_0 v_0+ \dots + \theta_k v_k \vert \theta \succeq 0, \mathbf{1}^T \theta = 1 \},$$
where $\mathbf{\text{conv}}$ is the convex hull, and $\mathbf{1}$ denotes the vector with all entries one.
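As a sanity check on this definition, one can sample a point of a simplex by drawing nonnegative weights that sum to one (again a hypothetical numpy sketch, not from the book):

```python
import numpy as np

# Hypothetical example: the simplex determined by the three affinely
# independent points (0,0), (1,0), (0,1) in R^2 is the standard triangle.
V = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])          # row i is v_i

rng = np.random.default_rng(seed=0)
theta = rng.random(3)
theta = theta / theta.sum()         # theta >= 0 and 1^T theta = 1

x = theta @ V                       # x = theta_0 v_0 + theta_1 v_1 + theta_2 v_2

# For this triangle, membership is easy to check directly:
# x lies in conv{v_0, v_1, v_2} iff x >= 0 componentwise and x_1 + x_2 <= 1.
assert np.all(x >= 0) and x.sum() <= 1
```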
It is then said that, to describe the simplex as a polyhedron (that is, to describe it in the form $P = \{ x \vert Ax \preceq b, Cx = d \}$), we proceed as follows. By definition, $x \in C$ if and only if $x = \theta_0 v_0 + \theta_1 v_1 + \dots + \theta_k v_k$ for some $\theta \succeq 0$ with $\mathbf{1}^T \theta = 1$. Equivalently, if we define $y = (\theta_1, \dots, \theta_k)$ and
$$B = \begin{bmatrix} v_1 - v_0 & \cdots & v_k - v_0 \end{bmatrix} \in \mathbb{R}^{n \times k},$$
we can say that $x \in C$ if and only if
$$x = v_0 + By$$
for some $y \succeq 0$ with $\mathbf{1}^T y \le 1$. Now we note that affine independence of the points $v_0, \dots, v_k$ implies that the matrix $B$ has rank $k$. Therefore there exists a nonsingular matrix $A = (A_1, A_2) \in \mathbb{R}^{n \times n}$ such that
$$AB = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} B = \begin{bmatrix} I \\ 0 \end{bmatrix}.$$
I was unable to understand a number of steps here. First of all, I don't understand why, if we define $y = (\theta_1, \dots, \theta_k)$ and
$$B = \begin{bmatrix} v_1 - v_0 & \cdots & v_k - v_0 \end{bmatrix} \in \mathbb{R}^{n \times k},$$
then we can say that $x \in C$ if and only if
$$x = v_0 + By.$$
It is not clear to me why $x = v_0 + By$ follows from this, nor do I understand why $x \in C$ holds if and only if it does. Secondly, I was unable to understand why affine independence of the points $v_0, \dots, v_k$ implies that the matrix $B$ has rank $k$. If $B$ is formed by subtracting the points as $v_1 - v_0, \dots, v_k - v_0$, then does this not create a clear linear dependence? (I'm a novice to affine geometry, so there might be some property of affine spaces that I am not understanding here; I'm just trying to use my basic linear algebra understanding of linear independence in Euclidean spaces to understand this.) Thirdly, how does the author get that
$$AB = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} B = \begin{bmatrix} I \\ 0 \end{bmatrix}?$$
Sorry for the lengthy question, but I've been trying to understand this for a while, and I have as of yet been unable to make any progress. Thank you.
I will use the usual inequalities; it should be clear what they mean. So let us look at your first question. You want to show that $x \in C$ iff $\exists y\geq 0:\mathbf{1}^Ty\leq1 \text{ and }x=v_0+By$. Let us take an $x \in C$. By definition there exist nonnegative $\theta_i$'s that sum to unity, such that $x=\sum_{i=0}^k\theta_iv_i$. Notice that since $\sum_{i=0}^k \theta_i=1,$ we have that $\theta_0=1-\sum_{i=1}^k \theta_i$. With this in mind let us expand our expression for $x$:
$$x=\sum_{i=0}^k\theta_iv_i=\theta_0v_0+\theta_1v_1+\dots+\theta_kv_k=(1-\sum_{i=1}^k \theta_i)v_0+\theta_1v_1+\dots+\theta_kv_k$$
$$= v_0-\theta_1v_0-\theta_2v_0-\dots-\theta_kv_0+\theta_1v_1+\dots+\theta_kv_k$$
$$=v_0+\theta_1(v_1-v_0)+\theta_2(v_2-v_0)+\dots+\theta_k(v_k-v_0) $$ $$=v_0+B \begin{bmatrix} \theta_1,&\cdots, &\theta_k \end{bmatrix}^T$$ Notice that $\begin{bmatrix} \theta_1,&\cdots, &\theta_k \end{bmatrix}^T$ has non-negative entries, since each $\theta_j$ was non-negative. And since all the $\theta_j$ sum to unity, dropping $\theta_0$ must mean that $$ \mathbf{1}^T\begin{bmatrix} \theta_1,&\cdots, &\theta_k \end{bmatrix}^T \leq 1. $$ The "if" direction can be proved similarly, by running the same argument backwards.
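If it helps, the algebra above can be checked numerically; the following sketch (with made-up points in $\mathbb{R}^3$, not from the book) verifies that the convex combination and the $v_0 + By$ form agree:

```python
import numpy as np

# Made-up affinely independent points in R^3 (so k = 2 here).
v0 = np.array([1.0, 1.0, 0.0])
v1 = np.array([2.0, 1.0, 0.0])
v2 = np.array([1.0, 3.0, 1.0])

B = np.column_stack([v1 - v0, v2 - v0])    # B = [v_1 - v_0  v_2 - v_0]

theta = np.array([0.2, 0.3, 0.5])          # theta >= 0, sums to 1
x_direct = theta[0] * v0 + theta[1] * v1 + theta[2] * v2
y = theta[1:]                              # y = (theta_1, ..., theta_k)
x_affine = v0 + B @ y                      # the v_0 + By form

assert np.allclose(x_direct, x_affine)
assert np.all(y >= 0) and y.sum() <= 1     # y >= 0 and 1^T y <= 1
```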
For your second question, you ask whether the columns of $B$ aren't clearly linearly dependent. Counterexamples to this are easy to construct and you should find them yourself, but the independence of the vectors $\{v_1-v_0,v_2-v_0,\dots,v_k-v_0\}$ comes from requiring affine independence of $\{v_0,v_1,\dots,v_k\}$: it is exactly the definition. You write this yourself before defining the convex hull of the set of vectors.
For your last question, we have that $B \in \mathbb{R}^{n\times k}$ has rank $k$ by assumption. Notice $k\leq n$, since we can't have $k>n$ and still have the columns be linearly independent. From your first course in Linear Algebra, you know that it is possible to reduce the matrix $B$ to reduced row echelon form, where the last $n-k$ rows are zero rows, since all the columns are linearly independent. Each time you perform a row operation on a matrix, you actually do a left-multiplication with an $n \times n$ elementary matrix. Each elementary matrix is invertible. The product of all the elementary matrices corresponding to your row operations is your matrix $A$. The following notes give a good example of using elementary matrices to construct such a matrix: https://people.math.carleton.ca/~kcheung/math/notes/MATH1107/wk05/05_elementary_matrices_example.html
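The elementary-matrix construction is one way to get $A$. As a numerical illustration, here is a different but equivalent construction (a sketch using a pseudoinverse and the SVD rather than explicit row operations, with a made-up rank-$k$ matrix $B$): take $A_1$ to be a left inverse of $B$ and $A_2$ to have rows spanning the left null space of $B$.

```python
import numpy as np

# Hypothetical rank-2 matrix B in R^{3x2} (columns linearly independent).
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
n, k = B.shape

# A1 is a left inverse of B (it exists because rank(B) = k), so A1 @ B = I_k.
A1 = np.linalg.pinv(B)                     # shape (k, n)

# A2 has rows spanning the left null space of B, so A2 @ B = 0.
U, s, Vt = np.linalg.svd(B)
A2 = U[:, k:].T                            # shape (n - k, n)

A = np.vstack([A1, A2])                    # nonsingular n x n matrix
AB = A @ B                                 # top k rows: I_k, bottom rows: 0

assert np.allclose(AB[:k], np.eye(k))
assert np.allclose(AB[k:], 0.0)
assert np.linalg.matrix_rank(A) == n       # A is indeed nonsingular
```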