Suppose we have i.i.d. samples $X_1,\dots,X_n$ with the number of samples $n$ unknown, but I can observe their sum $m=\sum_{i=1}^n X_i$. Further suppose $\mathbb{E}[X_i]=\mu$ and $\mathrm{Var}[X_i]=\sigma^2$, with both $\mu$ and $\sigma$ known.
If I want to estimate the number of samples $n$, intuitively one would round $\frac{m}{\mu}$ to the nearest integer (or is there a better way to estimate $n$?). If I want the estimate to be correct with 95% confidence, I suspect there must be some requirements on the variance $\sigma^2$ and the true sample size $n$.
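To make the rounding estimator concrete, here is a quick simulation; the values $n = 100$, $\mu = 2$, $\sigma = 1$ are made up purely for illustration:

```python
import random

# Made-up values for illustration; the true n is hidden from the estimator.
random.seed(0)
n_true, mu, sigma = 100, 2.0, 1.0

# One observation of the sum m = X_1 + ... + X_n.
m = sum(random.gauss(mu, sigma) for _ in range(n_true))

# Naive estimator: round m / mu to the nearest integer.
n_hat = round(m / mu)
```

Here $m/\mu$ has standard deviation $\sigma\sqrt{n}/\mu = 5$, so the rounded estimate typically lands within a few units of the true $n = 100$.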
My attempt:
Suppose $n$ is huge; by the central limit theorem, the distribution of $\frac{m}{n}$ is approximately $\mathcal{N}(\mu,\frac{\sigma^2}{n})$. But I have trouble handling the rounding function. Moreover, the central limit theorem may not be appropriate here, since it describes what happens as $n$ goes to infinity, while $n$ is exactly the quantity we are trying to estimate.
I also tried Hoeffding's inequality, but since the number of terms $n$ in the sum is itself unknown, I am not sure Hoeffding's inequality applies to this setting.
This may not be the most useful answer, but since there aren't any answers yet, I figured it is better than nothing.
If the $X_{i}$ are $N(\mu, \sigma^{2})$, the sum $m$ is distributed as $N(n\mu, n\sigma^{2})$, so the problem of estimating $n$ is equivalent to estimating $\theta$ for $Y \sim N(\theta, k\theta)$, where $\theta = n\mu$ and $k = \sigma^{2}/\mu$ is a known proportionality constant.
As pointed out in the comments, one approach is the maximum likelihood estimator, which is rather efficient in general. The log-likelihood is \begin{align*} L(\theta \mid y) &= -\frac{1}{2}\log(k\theta) - \frac{1}{2}\left( \frac{y - \theta}{\sqrt{k\theta}}\right)^{2} + \mathrm{const} \\ &= -\frac{1}{2}\log(\theta) - \frac{1}{2}\left(\frac{y^{2}}{k\theta} - 2 \frac{y}{k} + \frac{\theta}{k} \right) + \mathrm{const}. \end{align*} The derivative with respect to $\theta$ is $$ L'(\theta \mid y) = -\frac{1}{2\theta} + \frac{y^{2}}{2k\theta^{2}} - \frac{1}{2k}, $$ and setting it to zero (multiplying through by $2k\theta^{2}$) yields the quadratic $\theta^{2} + k\theta - y^{2} = 0$, whose positive root is the maximum likelihood estimator $$ \hat{\theta} = \frac{\sqrt{k^{2} + 4y^{2}} - k}{2}. $$ Since $k=\sigma^{2}/\mu$ and $n = \theta/\mu$, $$ \hat{n} = \frac{1}{2\mu}\left(\sqrt{\frac{\sigma^{4}}{\mu^{2}} + 4m^{2}} - \frac{\sigma^{2}}{\mu}\right). $$ Interestingly, for large $m$ this is approximately $\frac{m}{\mu} - \frac{\sigma^{2}}{2\mu^{2}}$, i.e. essentially the moment estimator you suggested, shifted down by a small variance correction.
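As a sanity check, here is a small simulation of this estimator: it takes the positive root of the score equation $\theta^{2} + k\theta - y^{2} = 0$ in closed form and converts back to $n$. The parameter values and the helper name `mle_n` are arbitrary choices for illustration:

```python
import math
import random

def mle_n(m, mu, sigma):
    """MLE of n: positive root of theta**2 + k*theta - m**2 = 0,
    with theta = n * mu and k = sigma**2 / mu, then divided by mu."""
    k = sigma**2 / mu
    theta_hat = (math.sqrt(k**2 + 4 * m**2) - k) / 2
    return theta_hat / mu

# Arbitrary illustrative values.
random.seed(1)
n_true, mu, sigma = 500, 2.0, 1.0
m = sum(random.gauss(mu, sigma) for _ in range(n_true))
n_hat = mle_n(m, mu, sigma)
```

For values like these the large-$m$ approximation $m/\mu - \sigma^{2}/(2\mu^{2})$ is almost indistinguishable from the exact root.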
If this were unbiased, it would be a minimum-variance unbiased estimator (maximum likelihood estimators attain the Cramér-Rao bound asymptotically), but unfortunately it is not: since $$ \mathrm{E}[y^{2}] = \theta^{2} + k\theta, $$ we have $k^{2} + 4\,\mathrm{E}[y^{2}] = (2\theta + k)^{2}$, and Jensen's inequality (the square root is concave) gives $$ \mathbb{E}[\hat{\theta}] \le \frac{\sqrt{k^{2} + 4\,\mathrm{E}[y^{2}]} - k}{2} = \theta, $$ so the estimator is biased downward. However, it is probably still reasonably efficient.
The confidence interval is a bit tougher, since the exact distribution of $\hat{n}$ is awkward to work with: $m^{2}/(n\sigma^{2})$ is non-central $\chi^{2}$ with one degree of freedom, and $\hat{n}$ is a monotone transformation of it. If $n$ is large, a normal approximation should be accurate; alternatively, one can obtain an exact confidence set by inverting the normal pivot for $m$ directly.
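One way to avoid the approximation entirely is to collect every integer $n$ consistent with the normal pivot, i.e. all $n$ with $|m - n\mu| \le 1.96\,\sigma\sqrt{n}$, since $m \sim N(n\mu, n\sigma^{2})$. A minimal sketch, with hypothetical values $\mu = 2$, $\sigma = 1$, and observed sum $m = 1000$:

```python
import math

def confidence_set(m, mu, sigma, z=1.96, n_max=10_000):
    """All integers n for which the observed m is within z standard
    deviations of its mean n*mu under m ~ N(n*mu, n*sigma**2)."""
    return [n for n in range(1, n_max + 1)
            if abs(m - n * mu) <= z * math.sqrt(n) * sigma]

# Hypothetical values: mu = 2, sigma = 1, observed sum m = 1000.
ns = confidence_set(1000.0, 2.0, 1.0)
```

For these numbers the set is a contiguous block of integers around $m/\mu = 500$, roughly $500 \pm 22$; its width grows like $\sqrt{n}$, which quantifies the dependence on $\sigma^{2}$ and $n$ raised in the question.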