$\hat\beta$ is the least-squares estimator and $P_x=X(X^TX)^{-1}X^T$
Why does $Y^T P_x Y \sim \sigma^2 \chi^2(p,\sigma^{-2}\beta^TX^TP_xX\beta)$ and RSS $\sim\sigma^2\chi^2(p,\sigma^{-2}\beta^TX^TX\beta)$ hold?
The assumption is that $Y$ is normally distributed. If I understand correctly, $Y\sim N(X\beta,I\sigma^2)$
I thought $\frac{Y^TY}{\sigma^2}$ is distributed as $\chi^2$.
In particular, $(Y-X\beta)\sigma I^{-1}\sim N(0,1)$ and hence $\sigma I^{-1}(Y-X\beta)^T(Y-X\beta)\sigma I\sim \chi^2$
I'm not sure about degrees of freedom.
$Y^TP_xY=Y^TY-e^Te$ where $e=Y-X\hat\beta$, the error vector. I also know that $\hat\beta \sim N(\beta, \sigma^2(X^TX)^{-1})$
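A quick numerical check of the identity I'm using (a small simulated design, names and sizes made up, just to convince myself it holds):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma = 50, 3, 2.0
X = rng.normal(size=(n, p))
beta = np.array([1.0, -2.0, 0.5])
Y = X @ beta + sigma * rng.normal(size=n)

# P_x = X (X^T X)^{-1} X^T, the projection onto the column space of X
P = X @ np.linalg.solve(X.T @ X, X.T)
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
e = Y - X @ beta_hat  # residual vector

# Y^T P_x Y = Y^T Y - e^T e
print(np.isclose(Y @ P @ Y, Y @ Y - e @ e))  # True
```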
When you write $(Y-X\beta)\sigma I^{-1}\sim N(0,1)$ you're confusing matrices with scalars. The notation $N(0,1)$ is that of a scalar-valued random variable.
Where you write $N(X\beta, \sigma^2 I)$ (I've always seen the scalar factor written to the left of the matrix) the vector $X\beta$ is $n\times 1$ and the matrix $\sigma^2 I$ is $n\times n$. Generally if you have $$ W\sim N(a,M) $$ where $a$ is $n\times 1$ and $M$ is $n\times n$, then for a $k\times n$ matrix $A$ you have $$ AW \sim N(Aa, AMA^T) $$ and so $AMA^T$ is a $k\times k$ matrix. That's the matrix counterpart of a familiar identity for the variance of a constant multiple of a random variable. If you want to multiply by something to get unit variance, you rely on a result from linear algebra that says a positive-definite symmetric matrix $M$ with real entries has a positive-definite symmetric square root, whose inverse we call $M^{-1/2}$. So if $$ W\sim N(0, M) $$ then $$ M^{-1/2} W \sim N(0, I) $$ and there you have unit variance, the $n\times n$ identity matrix. Also, note that $I^{-1} = I$ and $(\sigma I)^{-1} = \sigma^{-1} I$, not $\sigma I^{-1}$.
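Here's a minimal numpy sketch of that square-root whitening; the covariance $M$ below is an arbitrary positive-definite example, and the sample size is just large enough to make the empirical covariance close to $I$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
A = rng.normal(size=(n, n))
M = A @ A.T + n * np.eye(n)  # an arbitrary positive-definite covariance

# symmetric square root via the eigendecomposition M = Q diag(w) Q^T
w, Q = np.linalg.eigh(M)
M_inv_half = Q @ np.diag(w ** -0.5) @ Q.T  # M^{-1/2}

W = rng.multivariate_normal(np.zeros(n), M, size=200_000)
Z = W @ M_inv_half  # each row is M^{-1/2} W (M^{-1/2} is symmetric)
print(np.round(np.cov(Z.T), 2))  # approximately the identity matrix
```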
You should have $P_X$, not $P_x$, i.e. $P_X = X(X^T X)^{-1} X^T$ is the orthogonal projection onto the column space of $X$.
Since $(I-P_X)X=0$ and $Y \sim N(X\beta, \sigma^2 I)$, you have $$ (I-P_X)Y \sim N(0, \sigma^2 (I-P_X)(I-P_X)^T). $$ An easy bit of algebra tells you that $(I-P_X)(I-P_X)^T = I-P_X$. (Use the fact that $P_X^2 = P_X^T = P_X$.) So you have $$ (I-P_X)Y \sim N(0, \sigma^2 (I-P_X)) $$ with (note well) expected value $0$.
The matrix $(I-P_X)$ is a projection onto a space of dimension $n-k$ where $k$ is the number of columns of $X$, and the expected value of $Y$, which is $X\beta$, projects to the zero vector. So let us choose an orthonormal basis of the column space of $X$ and extend it to an orthonormal basis of $\mathbb R^n$. In that basis the matrix of $I-P_X$ is $$ \begin{bmatrix} 1 \\ & 1 \\ & & \ddots \\ & & & 1 \\ & & & & 0 \\ & & & & & \ddots \\ & & & & & & 0 \end{bmatrix} $$ with $n-k$ ones and $k$ zeros. Transforming $N(0,\sigma^2(I-P_X))$ into that basis, you see that the square of the norm of a vector with that distribution is distributed as $\sigma^2 \chi^2_{n-k}$.
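You can see this in a simulation (a hypothetical small design; the check compares the first two moments of $\|(I-P_X)Y\|^2$ against those of $\sigma^2\chi^2_{n-k}$):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, sigma = 10, 3, 1.5
X = rng.normal(size=(n, k))
beta = rng.normal(size=k)
P = X @ np.linalg.solve(X.T @ X, X.T)
I_minus_P = np.eye(n) - P

reps = 200_000
Y = X @ beta + sigma * rng.normal(size=(reps, n))
# ||(I - P_X) Y||^2 for each simulated Y (I - P_X is symmetric idempotent)
rss = np.einsum('ij,ij->i', Y @ I_minus_P, Y)

# sigma^2 chi^2_{n-k} has mean sigma^2 (n-k) and variance 2 sigma^4 (n-k)
print(rss.mean(), sigma**2 * (n - k))
print(rss.var(), 2 * sigma**4 * (n - k))
```

The simulated mean and variance match the $\sigma^2\chi^2_{n-k}$ values even though $X\beta \ne 0$, because $X\beta$ projects to zero.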
That is the vector, not of the errors, but of the residuals. The difference is this: \begin{align} \text{vector of errors} & = Y-X\beta, \\[10pt] \text{vector of residuals} & = Y - X\widehat\beta. \end{align}
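The distinction shows up numerically as well (same kind of toy design as above; the point is that the residuals are the errors with their component in the column space of $X$ projected away):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, sigma = 8, 2, 1.0
X = rng.normal(size=(n, k))
beta = np.array([2.0, -1.0])
errors = sigma * rng.normal(size=n)
Y = X @ beta + errors                      # vector of errors: Y - X beta

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
residuals = Y - X @ beta_hat               # vector of residuals: Y - X beta_hat

# residuals = (I - P_X) Y = (I - P_X) errors, since (I - P_X) X = 0
P = X @ np.linalg.solve(X.T @ X, X.T)
print(np.allclose(residuals, (np.eye(n) - P) @ errors))  # True
```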