Assume we have a vector $X$ consisting of $N$ independent random variables $X_1, \dots, X_N$. We determine ("measure") the values of $k$ of them (say, for simplicity, the first $k$), while the rest are (at least for now) not revealed.
I am interested in statements about, e.g., the sum of the remaining $N-k$ entries of the vector. An example would be a probabilistic statement of the form $\Pr\left[ \frac{1}{N-k} \sum_{i=1}^{N-k} X_{k+i} \geq f(\epsilon) \frac{1}{k} \sum_{i=1}^{k} X_i\right] \leq \epsilon$, where $f$ is some function (or a constant).
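To get intuition for candidate pairs $(f, \epsilon)$, I have been estimating the left-hand side by simulation. A minimal sketch; the i.i.d. Gaussian entries and the value plugged in for $f(\epsilon)$ are placeholder choices of mine, not part of the question:

```python
# Monte Carlo estimate of Pr[ mean(X_{k+1..N}) >= f(eps) * mean(X_{1..k}) ].
# The distribution N(mu, sigma^2) and the constant f_eps are placeholders.
import numpy as np

rng = np.random.default_rng(0)

def estimate_probability(N=100, k=20, mu=1.0, sigma=1.0, f_eps=1.5, trials=50_000):
    """Estimate the probability in the inequality above by simulation."""
    X = rng.normal(mu, sigma, size=(trials, N))
    observed_mean = X[:, :k].mean(axis=1)   # average of the k measured entries
    hidden_mean = X[:, k:].mean(axis=1)     # average of the N-k unrevealed entries
    return np.mean(hidden_mean >= f_eps * observed_mean)

print(estimate_probability())
```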
I know that such a statement can be made if $X_i \sim \mathcal{N}(0,\sigma^2)$: after normalisation, $X$ is then uniformly distributed on the unit sphere of $\mathbb{R}^N$. But I wonder whether statements like that exist under weaker requirements on $X$.
For example, what if each $X_i$ is drawn with probability $p$ from $\mathcal{N}(\mu_1, \sigma^2)$ and with probability $1-p$ from $\mathcal{N}(\mu_2, \sigma^2)$? Then our random variable is only sub-Gaussian.
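For what it's worth, my understanding is that independent sub-Gaussian variables still admit Hoeffding-type tail bounds, e.g. (with common mean $\mu$ and sub-Gaussian parameter $\sigma$)
$$\Pr\left[ \left| \frac{1}{m} \sum_{i=1}^{m} (X_i - \mu) \right| \geq t \right] \leq 2 \exp\left( -\frac{m t^2}{2\sigma^2} \right),$$
but I do not see how to turn this into a comparison between the measured and the unrevealed partial averages as above.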
What if we do not know anything about the distribution?
Is "concentration of probability measure" the right name for statements like that or is there a more proper term?
Thank you in advance for your help!
Bayesian statistics lets you derive the probability of some value $x_{k+1}$ given $k$ previously observed values and a general model with parameter $\theta$:
$$p(x_{k+1} \mid x_1, \dots, x_k) = \int_{-\infty}^{\infty} p(x_{k+1} \mid \theta)\, p(\theta \mid x_1, \dots, x_k) \, d\theta$$
Let $X_{N-k}$ be the vector made of the still-unknown values. You can use maximum-likelihood estimation under some assumed high-entropy distribution to find whichever parameter $\theta$ fits your data $x_1, \dots, x_k$ best, and then find
$$p(X_{N-k} \mid x_1, \dots, x_k) = \int_{-\infty}^{\infty} p(X_{N-k} \mid \theta)\, p(\theta \mid x_1, \dots, x_k) \, d\theta$$
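To make this concrete, here is a sketch of the predictive computation in the conjugate Normal-Normal case with known observation variance, where the integral above has a closed form. The prior parameters $m_0, s_0$ and the sample data are illustrative assumptions of mine, not implied by the question:

```python
# Posterior predictive for the conjugate Normal-Normal model with known
# observation variance sigma^2: theta ~ N(m0, s0^2), X_i | theta ~ N(theta, sigma^2).
# The integral then evaluates to N(post_mean, post_var + sigma^2).
import numpy as np

def posterior_predictive(x_obs, sigma=1.0, m0=0.0, s0=10.0):
    """Return mean and variance of p(x_new | x_obs) under the conjugate model."""
    k = len(x_obs)
    # Posterior over theta: precision-weighted combination of prior and data.
    post_prec = 1.0 / s0**2 + k / sigma**2
    post_var = 1.0 / post_prec
    post_mean = post_var * (m0 / s0**2 + np.sum(x_obs) / sigma**2)
    # Predictive: integrate N(x_new | theta, sigma^2) against the posterior on theta.
    return post_mean, post_var + sigma**2

x_obs = np.array([1.2, 0.7, 1.5, 0.9])   # the k "measured" values
mean, var = posterior_predictive(x_obs)
print(f"p(x_new | data) = N({mean:.3f}, {var:.3f})")
```

When no conjugate form is available, the same integral can be approximated by sampling $\theta$ from the posterior and averaging $p(X_{N-k} \mid \theta)$ over the draws.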
This is the best approximation to your problem that I can see, though I do not know how well it ultimately serves your purpose.