I'm studying the chi-square test of independence.
According to my understanding, we first hypothesize independence between the variables and treat them as normally distributed. Then we calculate the test statistic as
$$ Z_i = \frac{O_i-E_i}{\sqrt{E_i}}$$ where $O_i$ is the observed value from the table and $E_i$ is the expected value computed under the hypothesis of independence.
- Is $Z_i \sim \mathcal{N}(0,1)$ ?
- Is $\sum_1^k Z_i^2 \sim \chi^2(k)$ ?
Continuous distributions. First, if $Z \sim N(0,1)$, then $Z^2 \sim Chisq(df=1)$. This is easily proved using direct integration or moment generating functions. Next, if $Z_1, \dots, Z_k$ are independently $N(0,1),$ then $\sum_{i=1}^k Z_i^2 \sim Chisq(df=k),$ as you say. This is easily proved by noticing that the $k$th power of the MGF of $Chisq(1)$ is the MGF of $Chisq(k)$.
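You can check this relationship numerically. Here is a quick simulation (not part of the argument above, just an illustration using NumPy/SciPy with an arbitrary seed): the sum of squares of $k$ independent standard normals should match the $Chisq(k)$ distribution, whose mean is $k$ and variance is $2k$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
k = 5

# Sum of squares of k independent standard normals, 100,000 replicates
samples = (rng.standard_normal((100_000, k)) ** 2).sum(axis=1)

# Chisq(k) has mean k and variance 2k
print(samples.mean())  # close to 5
print(samples.var())   # close to 10

# Kolmogorov-Smirnov distance to the Chisq(k) CDF should be tiny
ks = stats.kstest(samples, stats.chi2(df=k).cdf)
print(ks.statistic)
```

The small KS distance confirms the simulated sums are indistinguishable from $Chisq(5)$ at this sample size.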
Discrete distributions. However, the above results do not apply directly to your question about a chi-squared test of independence or of goodness-of-fit. In this case the $O_i$ are integers, and so your $Z_i$ are necessarily discrete (although not generally integer valued). So the distribution of $Z_i$ is at best only approximately $N(0,1),$ and hence $Z_i^2$ is at best approximately $Chisq(1).$
The usual assumption here is that the $O_i$ are Poisson with means $E_i$ and standard deviations $\sqrt{E_i}.$ Accordingly, your $Z_i$ are standardized Poissons, which are approximately standard normal for sufficiently large $E_i.$
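A brief simulation (mine, not from the answer; $E = 100$ and the seed are arbitrary choices) shows how close a standardized Poisson gets to standard normal when the mean is large:

```python
import numpy as np

rng = np.random.default_rng(1)
E = 100  # a mean large enough for the normal approximation to work well

# Draw Poisson counts and standardize them: Z = (O - E) / sqrt(E)
O = rng.poisson(E, size=100_000)
Z = (O - E) / np.sqrt(E)

print(Z.mean())  # near 0
print(Z.std())   # near 1
```

Repeating this with a small mean like $E = 2$ shows visibly skewed $Z$ values, which is why the usual rules of thumb require the $E_i$ to be reasonably large.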
Goodness-of-Fit tests. A very simple goodness-of-fit test would be to roll a fair die $n=600$ times, so that there are $k = 6$ possible values $1, \dots, 6,$ each with probability $p_i = 1/6$ and $E_i = np_i = 600(1/6) = 100.$ Here, the $O_i$ are the actual number of times each value (face) appears. They are actually binomial, approximately Poisson, and approximately normal.
However, in this case, $\sum_{i=1}^k Z_i^2 \sim Chisq(k-1).$ There are $k-1 = 5$ degrees of freedom instead of $k = 6$, roughly speaking, because there is one linear constraint: the $O_i$ must sum to $n = 600$.
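The die-rolling example can be carried out directly in a few lines (a sketch using NumPy/SciPy; the seed is arbitrary). Note the statistic computed by hand matches `scipy.stats.chisquare`, which uses $k - 1 = 5$ degrees of freedom by default:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, k = 600, 6

# Simulate 600 rolls of a fair die and tally the face counts
rolls = rng.integers(1, 7, size=n)
O = np.bincount(rolls, minlength=7)[1:]  # observed counts for faces 1..6
E = np.full(k, n / k)                    # E_i = 100 under fairness

# Chi-squared statistic and p-value from Chisq(k - 1)
Q = ((O - E) ** 2 / E).sum()
p = stats.chi2(df=k - 1).sf(Q)
print(Q, p)

# SciPy's built-in goodness-of-fit test gives the same answer
stat, pval = stats.chisquare(O, E)
```

Since the die really is fair here, $Q$ will usually be unremarkable and the p-value large; a loaded die would inflate $Q$ into the right-hand tail.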
Tests of Independence. In a typical chi-squared test of independence there are two categorical variables and we are trying to judge their independence. This is not the place for a full review of such tests, but we can briefly state the approximate distribution theory.
If the 'contingency table' is based on $r$ rows and $c$ columns (corresponding to levels of the two categorical variables), then you have the test statistic $Q = \sum_{ij} (O_{ij} - E_{ij})^2 /E_{ij},$ where the $E_{ij}$ are computed assuming independence of the two categorical variables. Then $Q$ is approximately distributed as $Chisq((r-1)(c-1)),$ provided the $E_{ij}$ are sufficiently large. Again here, the number of degrees of freedom is adjusted downward from $cr$ to account for the linear constraints imposed by the structure of the computations.
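The computation just described can be sketched as follows (the $2\times 3$ table is made-up illustrative data; `scipy.stats.chi2_contingency` implements exactly this test). The $E_{ij}$ come from the independence assumption: each is (row total)(column total)/(grand total).

```python
import numpy as np
from scipy import stats

# Hypothetical r=2 by c=3 contingency table of observed counts
table = np.array([[30, 40, 30],
                  [20, 50, 30]])

# Expected counts under independence: row_total * col_total / grand_total
row = table.sum(axis=1, keepdims=True)
col = table.sum(axis=0, keepdims=True)
E = row * col / table.sum()

# Q = sum over cells of (O - E)^2 / E, compared to Chisq((r-1)(c-1))
Q = ((table - E) ** 2 / E).sum()
df = (table.shape[0] - 1) * (table.shape[1] - 1)  # (2-1)(3-1) = 2
p = stats.chi2(df).sf(Q)
print(Q, df, p)

# SciPy agrees (correction=False disables Yates' continuity correction,
# which otherwise modifies the statistic for 2x2 tables)
chi2_stat, pval, dof, expected = stats.chi2_contingency(table, correction=False)
```

A small p-value would be evidence against independence of the row and column variables.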
Simulation studies have shown that the fit to a chi-squared distribution is "sufficiently good" in the right-hand tail (where accept/reject decisions are made), if all the $E_{ij} > 3$ and "all but a few" have $E_{ij} > 5.$