How was the determinant of matrices generalized for matrices bigger than $2 \times 2$?
A long time ago I read a book that said something like this:
Given a system of two equations with two unknowns:
$$ ax_1+bx_2=y_1 \\ cx_1+dx_2=y_2 $$
Multiplying the first equation by $d$, the second by $b$, and subtracting the second from the first, we get:
$$ s:(ad-cb)x_1=dy_1-by_2 $$
Then
1) $ad-cb=0 \wedge dy_1-by_2=0 \iff s \text{ has infinite solutions.}$
2) $ad-cb=0 \wedge dy_1-by_2\neq0 \iff s \text{ has no solutions.}$
3) $ad-cb\neq0 \iff s \text{ has a unique solution.}$
And $ad-cb$ is called the determinant of this system of equations (or matrix).
How was this generalized for bigger systems/matrices?
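The three cases above can be checked numerically. This is a small sketch (the coefficient values are my own, not from the book), assuming the coefficients are not all zero:

```python
# Classify the system s: (ad-cb)x1 = dy1 - by2 by the three cases above.
def classify(a, b, c, d, y1, y2):
    det = a * d - c * b        # the determinant ad - cb
    rhs = d * y1 - b * y2      # the right-hand side dy1 - by2
    if det != 0:
        return "unique"                          # case 3
    return "infinite" if rhs == 0 else "none"    # cases 1 and 2

# Case 3: ad - cb = 1*4 - 3*2 = -2 != 0, so a unique solution.
print(classify(1, 2, 3, 4, 5, 6))   # unique
# Case 1: second equation is twice the first, det = 0 and rhs = 0.
print(classify(1, 2, 2, 4, 3, 6))   # infinite
# Case 2: same left-hand sides but inconsistent right-hand sides.
print(classify(1, 2, 2, 4, 3, 7))   # none
```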

Determinant = product of the eigenvalues. That says it all. All the properties of the determinant flow from this if you understand eigenvalues.
The determinant = 0 if and only if the matrix is singular, which is true if and only if at least one eigenvalue = 0. That's very powerful right there.
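Here is a quick sketch of both facts with numpy (the sample matrices are my own choices for illustration):

```python
import numpy as np

# det(A) equals the product of A's eigenvalues.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
eigvals = np.linalg.eigvals(A)
assert np.isclose(np.linalg.det(A), np.prod(eigvals))

# A singular matrix (second row is twice the first) has determinant 0,
# and correspondingly at least one eigenvalue equal to 0.
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])
assert np.isclose(np.linalg.det(B), 0.0)
assert np.any(np.isclose(np.linalg.eigvals(B), 0.0))
```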
Eigenvalues are your friends. If you don't know what they are, pick up a linear algebra book and find out. You can try wading through http://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors , but without a strong enough background it may be tough sledding beyond the historical information.

A really nifty and very useful result is that the sum of the eigenvalues of a matrix is equal to the sum of its diagonal elements, with this sum being known as the trace of the matrix.
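The trace identity can be verified the same way (again with a sample matrix of my own choosing):

```python
import numpy as np

# Sum of the eigenvalues equals the sum of the diagonal entries (the trace).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
# Eigenvalues of A are 5 and 2; both sums equal 7.
assert np.isclose(np.trace(A), np.sum(np.linalg.eigvals(A)))
```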