A path to truly understanding probability and statistics


I'm embarrassed to say that I have a PhD and hold an assistant professorship, but I get tripped up when reading statistics research. I am in a field of Business that is similar to IO Psychology or Social Psych. I spend too much time reading applied stats books, but I find even with all the reading I don't have a firm grasp of what I'm actually doing. Everything is very 'seat of the pants.' (As sad as it seems, I think this is not a unique situation among the faculty in the social sciences...) The biggest problem comes when I need to apply a rarely used stat technique. I can find an article from a mathematical stats journal with the equations that would solve my problem, but I don't have the math to convert those into code. I am forever relying on other professors' R packages and crossing my fingers that they will work (I can't even verify whether they did). It's been over 15 years since I took Calculus and Algebra in undergrad, and I think I want to start at the beginning and truly understand probability and statistics.

I am starting with Gelfand's Algebra and Trigonometry books for a quick refresher of the basics -- I know it's hard to believe, but in an applied research field we rarely have use for sin or cos. I'm even trying to finally learn how to correctly do a proof, using the books from Velleman ("How to Prove It") and Houston ("How to Think Like a Mathematician") -- I'm serious about doing this right and understanding the subject. From there I want to move on to (correctly) learn the Calculus and Linear Algebra I need to tackle probability and statistics. I was thinking of using Strang's Calculus and Algebra books. But Apostol's Calculus comes highly recommended as well. After that I am completely at a loss. Further, I don't know how far to go into Calculus or Linear Algebra before I reach diminishing returns. (Apostol introduces Probability in the second half of Vol. 2 -- is it vital that I work through everything preceding it before tackling Probability?)

So my question is: if you had to do it over again with the goal of truly, deeply understanding statistics, where would you start? What books are the modern path to deep understanding? I would like to follow a modern path so that I can understand current research in statistics, including Bayesian approaches. But not in a machine learning context (which seems to be all the rage at the moment), rather a social science / design and analysis of experiments / multilevel modeling context. Perhaps my goal would be the work of Andrew Gelman; his and Hill's book showed me how I should be looking at modeling and statistics (simulation, uncertainty estimates everywhere, Bayesian inference, and so on). How should I go about relearning this material with that end goal in mind?


Update 1: Possible texts, starting from scratch with a focus on proofs and deep understanding. Not necessarily one after another.

Relearn the basics:

Calculus (which one(s), and how deep?):

Linear Algebra (which one(s) and how deep?):

Probability (which one(s)?):

Core Statistics (which one(s)?):

Other suggestions? Again with the goal of understanding and developing (or at least implementing) new methods in hierarchical modelling (generalized and linear).


There are 3 best solutions below

7
On

My humble contribution to your book list: Linear Algebra Done Right by Axler. It's a brilliant book that makes a lot of abstract things very clear. It had been recommended to me many times.

Also, I recently found a book entitled Statistical Methods: The Geometric Approach. I haven't read through all of it yet, but it gives a very basic introduction to probability from a linear algebra perspective, which I think is very intuitive (much easier on the eyes than looking at sigmas with a bunch of random indices I feel).

(Sorry, I'm too noob on this website to post a comment.)

4
On

As someone who started out their career thinking of statistics as a messy discipline, I'd like to share my epiphany regarding the matter. For me, the insight came from Linear Algebra, so I would urge you to push in that direction.

Specifically, once you realize that the sum of squares, $\sum_i X_i^2$, and sum of products, $\sum_i X_i Y_i$, are both inner products (aka dot products), you realize that nearly all of statistics can be thought of as various operations from linear algebra.

If you sample $n$ values from a population, you have an $n$-dimensional vector. The sample mean comes from projecting this vector onto the $n$-dimensional all-ones vector. The standard deviation comes from the projection onto the $(n-1)$-dimensional hyperplane normal to the all-ones vector: it is the length of that projection divided by $\sqrt{n-1}$ (finally an intuitive reason for the "$n-1$" in the denominator!). Specifically, for the sample variance $s^2$ for sample $X$, here is the linear algebra:

First, we work with deviations from the mean. The mean in linear algebra terms is

$\bar{X}=\frac{\langle X,\mathbf{1}\rangle}{\langle \mathbf{1},\mathbf{1}\rangle} \mathbf{1}$

where $\langle \cdot, \cdot \rangle$ is the inner product and $\mathbf{1}$ is the $n$-dimensional ones vector, so $\bar{X}$ here is the vector whose every entry equals the usual sample mean. Then the deviation from the mean is

$x = X - \bar{X}$

Note that $x$ is constrained to an $(n-1)$-dimensional subspace, because it is orthogonal to the ones vector: $\langle x, \mathbf{1} \rangle = 0$. The usual equation for variance is

$s^2 = \dfrac{\sum_i (X_i - \bar{X})^2}{n-1}$

For us, that's

$s^2 = \dfrac{\langle x, x \rangle}{\langle \mathbf{1}, \mathbf{1} \rangle}$

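which, without going into too much detail (too late), is the squared length of the deviation vector, normalized. The trick there is that the $\mathbf{1}$ in this denominator is a new ones vector of dimension $n-1$, matching the subspace that $x$ lives in, so $\langle \mathbf{1}, \mathbf{1} \rangle = n-1$.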
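To make the variance calculation above concrete, here is a minimal sketch in plain Python. The sample values and the helper name `inner` are my own, chosen only for illustration; the check against `statistics.variance` is just a sanity test, not part of the argument.

```python
import math
import statistics


def inner(u, v):
    """Inner (dot) product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


X = [2.0, 4.0, 4.0, 5.0, 7.0]  # made-up sample, for illustration only
n = len(X)
ones = [1.0] * n               # the n-dimensional all-ones vector

# Project X onto the ones vector. The coefficient is the sample mean,
# and the projection X_bar is a vector with the mean in every entry.
coef = inner(X, ones) / inner(ones, ones)
X_bar = [coef * o for o in ones]

# Deviation vector: what is left of X after removing the projection.
x = [a - b for a, b in zip(X, X_bar)]

# Squared length of the deviation vector, divided by the dimension (n - 1)
# of the subspace it lives in.
s2 = inner(x, x) / (n - 1)

print("mean from projection:", coef)
print("<x, 1> (should be ~0):", inner(x, ones))
print("variance from inner products:", s2)
print("matches statistics.variance:", math.isclose(s2, statistics.variance(X)))
```

The near-zero value of `inner(x, ones)` is the orthogonality that confines the deviations to an $(n-1)$-dimensional subspace.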

The other good example is that correlation between two samples is related to the angle between them in that $n$-dimensional space. To see this, consider that the angle between two vectors $v$ and $w$ is:

$\theta = \arccos \dfrac{\langle v, w \rangle}{\|v\|\|w\|}$

where $\|\cdot\|$ is vector length. Take $v$ and $w$ to be the deviation vectors of the two samples (each sample minus its mean), compare this to one of the forms for the Pearson Correlation, and you will see that $r = \cos \theta$.
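A quick numerical check of the correlation-as-angle idea, again as a sketch in plain Python with made-up data. The function names (`center`, `cos_angle`, `pearson_textbook`) are hypothetical helpers I wrote for this example. Note that the samples have to be mean-centered first; the cosine between the raw vectors is generally a different number.

```python
import math


def inner(u, v):
    """Inner (dot) product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def center(v):
    """Subtract the mean from every entry (the deviation vector)."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def cos_angle(v, w):
    """Cosine of the angle between v and w."""
    return inner(v, w) / (math.sqrt(inner(v, v)) * math.sqrt(inner(w, w)))


def pearson_textbook(X, Y):
    """Pearson r from the usual computational formula with raw sums."""
    n = len(X)
    sx, sy = sum(X), sum(Y)
    num = n * inner(X, Y) - sx * sy
    den = math.sqrt((n * inner(X, X) - sx ** 2) * (n * inner(Y, Y) - sy ** 2))
    return num / den


X = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up samples, for illustration only
Y = [2.0, 1.0, 4.0, 3.0, 5.0]

r_geometric = cos_angle(center(X), center(Y))
r_textbook = pearson_textbook(X, Y)
theta = math.acos(r_geometric)  # angle between the deviation vectors, in radians

print("cos(theta) on centered vectors:", r_geometric)
print("textbook Pearson r:            ", r_textbook)
print("angle in degrees:              ", math.degrees(theta))
print("cos on raw (uncentered) vectors:", cos_angle(X, Y))
```

The first two printed values agree up to floating-point rounding, while the last one differs, which is why the centering step matters.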

There are many other examples, and these have barely been explained here, but I just hope to give an impression of how you can think in these terms.

0
On

I think to 'truly, deeply understand' statistics, you have to understand probability theory. Here are some resources to gain a strong conceptual foundation:

Harvard Stat 110 http://projects.iq.harvard.edu/stat110 The psets are gold.

The MIT course on applied probability is of equal quality, and you can find it on edX:

https://www.edx.org/course/mitx/mitx-6-041x-introduction-probability-1296

An informative and entertaining read to hone intuition:

http://www.amazon.com/Lady-Luck-Theory-Probability-Mathematics/dp/0486243427