Let $C$ be a linear code with parity check matrix $H$. Then $d(C) = d$ if and only if every set of $d - 1$ columns of $H$ is linearly independent and some set of $d$ columns of $H$ is linearly dependent.
The proof in my lecture notes goes as follows:
Let $v \in C$ be a non-zero codeword and let $i_1, \dots, i_u$ be the positions of its non-zero components. Then the columns $i_1, \dots, i_u$ of the parity check matrix must be linearly dependent. Choosing a codeword $v$ of weight $d$, we obtain $d$ linearly dependent columns of $H$. On the other hand, $u \ge d$ must hold if every set of $d-1$ columns is linearly independent.
I really don't see how this proof works. I understand why the columns $i_1, \dots, i_u$ of the parity check matrix must be linearly dependent when $i_1, \dots, i_u$ are the positions of the non-zero components of $v$, but the rest of the proof doesn't tell me anything.
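The rest of the proof runs the first part in the opposite direction. Recall that $Hv^T = \sum_j v_j h_j$, where $h_j$ is the $j$-th column of $H$, so a codeword is exactly a vector whose entries give a linear combination of the columns equal to zero. Suppose some set of $u < d$ columns of $H$, with indices $i_1, \dots, i_u$, is linearly dependent. Then some $k \le u < d$ of these columns admit a linear combination equal to zero in which all $k$ coefficients are non-zero (over $\mathbb{F}_2$, these $k$ columns simply add up to zero). Placing those coefficients at the corresponding $k$ positions and zeros elsewhere gives a vector $v$ of weight $k$ with $Hv^T = 0$, so $v \in C$. Now take any codeword $v_1 \in C$; then $v_1 + v \in C$ by linearity, and $d(v_1 + v, v_1) = \operatorname{wt}(v) = k < d$, which contradicts $d(C) = d$. Hence every set of at most $d - 1$ columns must be linearly independent.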
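To see the correspondence between codewords and dependent columns on a concrete case, here is a minimal sketch in Python. It assumes a binary code and uses the parity check matrix of the $[7,4]$ Hamming code as an illustrative choice; the matrix `H` and the helper names (`syndrome`, `min_distance`, `rank_gf2`, `smallest_dependent_size`) are my own and not part of the original proof. It computes the minimum distance by brute force and, independently, the size of the smallest linearly dependent set of columns, so you can check that the two numbers agree.

```python
from itertools import combinations, product

# Parity check matrix of the binary [7,4] Hamming code (illustrative choice).
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]
n = len(H[0])


def syndrome(v):
    """Return H v^T over GF(2) as a list of bits."""
    return [sum(h * x for h, x in zip(row, v)) % 2 for row in H]


def min_distance():
    """Smallest weight of a non-zero v with H v^T = 0 (brute force over all v)."""
    return min(
        sum(v)
        for v in product([0, 1], repeat=n)
        if any(v) and not any(syndrome(v))
    )


def rank_gf2(vectors):
    """Rank over GF(2) of vectors encoded as integer bitmasks."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clear the leading bit of b from v if it is set
        if v:
            basis.append(v)
    return len(basis)


# Encode column j of H as an integer bitmask (bit i holds H[i][j]).
columns = [sum(H[i][j] << i for i in range(len(H))) for j in range(n)]


def smallest_dependent_size():
    """Smallest k such that some k columns of H are linearly dependent."""
    for k in range(1, n + 1):
        for subset in combinations(columns, k):
            if rank_gf2(subset) < k:
                return k
    return None


print(min_distance(), smallest_dependent_size())
```

For this matrix the first three columns add up to zero over $\mathbb{F}_2$, which corresponds to the weight-3 codeword $(1,1,1,0,0,0,0)$, while no single column is zero and no two columns are equal.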