I have two normally distributed populations X and Y that have the same variance and are independent. I took two samples from X and Y and got the following:
For $X:\; n = 10,\; \sum_{i=1}^{10} X_i = 78,\; \sum_{i=1}^{10} X_i^2 = 634.$
For $Y:\; n = 8,\; \sum_{i=1}^8 Y_i = 36,\; \sum_{i=1}^8 Y_i^2 = 184.$
Now I need to build a confidence interval with 98% probability for the difference of the means.
I understand that because the samples are small, I need to use Student's t distribution. My question is how to find the standard deviation to use in the confidence interval formula, given that I only have the sum of the $X_i$ and the sum of the $X_i^2.$
Shall I use that $Var(X) = E(X^2) - E(X)^2?$
I will get you started on this by showing you how to compute the required sample standard deviations, pointing you to the formulas for the CI, and showing you the CI I got from software.
Formulas for sample variance. The sample variance for the $X_i$ can be obtained from the following formula:
$$\begin{aligned} S^2_x &= \frac{\sum_{i=1}^n(X_i-\bar X)^2}{n-1} =\frac{\sum_{i=1}^n X_i^2 - \frac 1n\left(\sum_{i=1}^n X_i\right)^2}{n-1}\\ &=\frac{\sum_{i=1}^n X_i^2 - n\bar X^2}{n-1}. \end{aligned}$$ Take the square root of the sample variance to get the sample standard deviation.
Sample means and standard deviations for your data. Using R as a calculator: The sample mean (average) of the first sample is $\bar X = 7.8$
and (using the formula above) the sample standard deviation of the first sample is $S_x = 1.6865.$
Similarly, for the second sample: $\bar Y = 4.5$ and $S_y = 1.7728.$
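If you would rather check these values in Python than in R, here is a minimal sketch of the shortcut formula. The function name `sample_sd_from_sums` is my own, chosen only for illustration.

```python
import math

def sample_sd_from_sums(n, sum_x, sum_x2):
    """Sample SD from n, the sum of the x's, and the sum of their squares."""
    # Shortcut formula: S^2 = (sum x^2 - (sum x)^2 / n) / (n - 1)
    variance = (sum_x2 - sum_x**2 / n) / (n - 1)
    return math.sqrt(variance)

s_x = sample_sd_from_sums(10, 78, 634)   # first sample
s_y = sample_sd_from_sums(8, 36, 184)    # second sample
print(round(s_x, 4), round(s_y, 4))      # prints 1.6865 1.7728
```

Note the divisor $n-1$: plugging sample averages into $E(X^2) - E(X)^2$ would divide by $n$ instead, which gives a slightly smaller value than the sample variance used in t procedures.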
Finding the CI for the difference in population means. Now you are ready to find the formulas for the pooled estimate of the common population standard deviation and the formula for a confidence interval of the difference $\mu_x - \mu_y$ in population means.
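For reference, the standard textbook forms of these two formulas are as follows. The subscripted sample sizes $n_x, n_y$ and the symbol $t^*$ are my notation, not from your question:

$$S_p^2 = \frac{(n_x-1)S_x^2 + (n_y-1)S_y^2}{n_x+n_y-2}, \qquad (\bar X - \bar Y) \pm t^*\, S_p\sqrt{\frac{1}{n_x}+\frac{1}{n_y}}.$$

Here $t^*$ cuts probability $\alpha/2$ from the upper tail of Student's t distribution with $n_x+n_y-2$ degrees of freedom, where $1-\alpha$ is the confidence level.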
Instead of doing these computations here, I put sample sizes, means, and standard deviations into a recent release of Minitab in order to get the 95% CI for the difference in population means from Minitab's pooled t test procedure. Results below:
The difference in sample means is $\bar X - \bar Y$ $= 7.80 - 4.50$ $= 3.30,$ which is the center of the 95% CI for the difference in population means $(1.566, 5.034),$ which also uses the pooled estimate $S_p = 1.7248$ of the common population standard deviation. (The pooled method assumes that the two populations have the same standard deviation.) The 98% CI you were asked for has the same center, but it is wider because it uses a larger quantile of the t distribution.
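As a rough cross-check of the pooled computation, here is a short Python sketch. The function name `pooled_t_interval` is hypothetical, and the critical values 2.120 (95%) and 2.583 (98%) are assumptions copied from a printed t table for 16 degrees of freedom, so the endpoints agree with software output only up to rounding.

```python
import math

def pooled_t_interval(n_x, mean_x, s_x, n_y, mean_y, s_y, t_crit):
    """Pooled SD and two-sided CI for mu_x - mu_y, given a t critical value."""
    df = n_x + n_y - 2
    # Pooled estimate of the common population standard deviation
    s_p = math.sqrt(((n_x - 1) * s_x**2 + (n_y - 1) * s_y**2) / df)
    margin = t_crit * s_p * math.sqrt(1 / n_x + 1 / n_y)
    diff = mean_x - mean_y
    return s_p, (diff - margin, diff + margin)

# t_crit values are table lookups for df = 16 (assumed, not computed here)
s_p, ci95 = pooled_t_interval(10, 7.8, 1.6865, 8, 4.5, 1.7728, t_crit=2.120)
_, ci98 = pooled_t_interval(10, 7.8, 1.6865, 8, 4.5, 1.7728, t_crit=2.583)
print(round(s_p, 4), ci95, ci98)
```

With the 95% critical value this should reproduce the interval quoted above; swapping in the 98% critical value gives the wider interval asked for in the question.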
Note: The methods above are correct, but please proofread my typing, data entry, and computations.