First, a warning: this is not the most interesting question, but I want to update my understanding of independence now that I'm taking first-year statistics.
I often heard in my first-year probability class that "independent RVs don't give any information about each other".
Rolling a fair die $99$ times gives no information about the hundredth roll.
However, suppose $X_1, ..., X_{100}$ are i.i.d. discrete uniform on $\{1, ..., \theta\}$, where $\theta$ is unknown.
Then, by observing $X_1,...,X_{99}$, you can make inferences about the hundredth observation and construct a prediction interval for it.
Is it safe to say the following: given that you know all the parameters (the complete pmf/pdf of $F$), independent RVs don't give any information about each other; however, when there are unknown parameters, independent RVs do give you information about each other. Does that even make sense to say?
I know this is more of an English question than one about formal definitions (factoring joint pmfs/pdfs), but I'd like to try to be precise about this. Thanks for your help and patience.
Note Nate Eldredge's comment on the two different views of the unknown $\theta$.
1) If $\theta$ is a random variable, then you can say $\{X_1, ..., X_{100}\}$ are conditionally i.i.d. given $\theta$, but they are not independent, because they all depend on the common random variable $\theta$. This is the "Bayesian" view of $\theta$.
2) If $\theta$ is a fixed (but possibly unknown) constant, then indeed you can say $\{X_1, ..., X_{100}\}$ are i.i.d. because \begin{align} &P[X_1\leq x_1, ..., X_{100}\leq x_{100}] \\ &= \prod_{i=1}^{100}P[X_i\leq x_i] \quad \forall (x_1, ..., x_{100}) \in \mathbb{R}^{100} \quad \text{(Eq. 1)} \end{align} It can be shown that (Eq. 1) implies $$ E[h(X_{100})|X_1,...,X_{99}]=E[h(X_{100})] \quad \text{(Eq. 2)}$$ for all (measurable) functions $h$. Equations (Eq. 1) and (Eq. 2) hold regardless of whether the value of $\theta$ is known. In particular, the expressions in (Eq. 1)-(Eq. 2) may depend on $\theta$, but they exist (and the left-hand sides equal the right-hand sides) whether or not $\theta$ is known to an observer.
You can indeed interpret (Eq. 1)-(Eq. 2) as saying "the variables provide no information about each other": knowing the outcomes of $X_1, ..., X_{99}$ does not change probabilities or expectations that involve only $X_{100}$. Of course, those probabilities/expectations are themselves unknown if $\theta$ is not known. So at a "higher level" you can indeed say that $X_1, ..., X_{99}$ give "information" about the unknown $\theta$, and hence "information" about $X_{100}$. For example, if we observe $X_{99}=207$, then we know it is possible that $X_{100}>200$. However, this does not change $P[X_{100}>200]$, because that probability itself depends on $\theta$. It is difficult to quantify what "information" means without taking the "Bayesian" approach of treating the unknown $\theta$ itself as a random variable.
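As a quick Monte Carlo sanity check of (Eq. 2), here is a sketch with $\theta$ fixed at $250$ (an arbitrary choice for the demo): conditioning on the first $99$ draws does not change the empirical frequency of an event involving only $X_{100}$.

```python
import numpy as np

rng = np.random.default_rng(1)
theta, trials = 250, 50_000

# Each row is one realization of X_1, ..., X_100
X = rng.integers(1, theta + 1, size=(trials, 100))
last = X[:, 99]

# Unconditional frequency of the event {X_100 > 200}
p_all = np.mean(last > 200)

# Frequency of the same event among trials where the first 99
# draws happen to contain a large value
cond = X[:, :99].max(axis=1) > 240
p_cond = np.mean(last[cond] > 200)

# Both are approximately (theta - 200)/theta = 0.2
print(p_all, p_cond)
```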
On the other hand, there are some interesting things that can be said about $\theta$ when we just treat it as a constant (not a random variable), such as the mean-square error of the sample mean as an estimator of $\theta/2$: if $\{X_i\}_{i=1}^{\infty}$ are i.i.d. uniform over $[0,\theta]$, then $$ E\left[\left(\frac{\theta}{2} - \frac{1}{n}\sum_{i=1}^n X_i\right)^2\right] = \frac{Var(X_1)}{n} = \frac{\theta^2}{12 n}$$ Of course this bound itself depends on $\theta$, but if we somehow know that $\theta \leq 100$, then we can say the mean-square error is no more than $100^2/(12n)$.
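A short simulation sketch of this identity (with $\theta = 100$ and $n = 50$ chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(2)
theta, n, trials = 100.0, 50, 50_000

# Each row is one i.i.d. sample of size n from Uniform[0, theta]
X = rng.uniform(0.0, theta, size=(trials, n))
sample_means = X.mean(axis=1)

# Empirical mean-square error of the sample mean about theta/2,
# compared to the theoretical value theta^2 / (12 n)
mse = np.mean((theta / 2 - sample_means) ** 2)
print(mse, theta**2 / (12 * n))
```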
Here is an estimator of $\theta$ with an improved mean-square error: define \begin{align} \hat{\theta}_n &= \frac{2}{n}\sum_{i=1}^n X_i\\ \tilde{\theta}_n &= \max\left\{\hat{\theta}_n, X_1, X_2, ..., X_n\right\} \end{align} It can be shown that, surely (i.e., for every outcome): $$ (\tilde{\theta}_n-\theta)^2 \leq (\hat{\theta}_n -\theta)^2$$ (Indeed, if $\hat{\theta}_n \geq \max_i X_i$, the two estimators coincide; otherwise $\hat{\theta}_n < \tilde{\theta}_n = \max_i X_i \leq \theta$, so $\tilde{\theta}_n$ is strictly closer to $\theta$.) Hence $$ E\left[\left(\tilde{\theta}_n-\theta\right)^2\right] \leq E\left[\left(\hat{\theta}_n-\theta\right)^2\right] = \frac{\theta^2}{3n}$$ Some useful notes on other improvements are here:
http://www-stat.wharton.upenn.edu/~dsmall/stat512-s05/notes2.doc
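For what it's worth, here is a quick simulation sketch comparing the two estimators above (again $\theta = 100$ and $n = 50$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
theta, n, trials = 100.0, 50, 50_000

X = rng.uniform(0.0, theta, size=(trials, n))
hat = 2.0 * X.mean(axis=1)              # moment estimator: 2 * sample mean
tilde = np.maximum(hat, X.max(axis=1))  # improved estimator

# The pathwise inequality guarantees mse_tilde <= mse_hat exactly;
# mse_hat should be near the theoretical value theta^2 / (3 n).
mse_hat = np.mean((hat - theta) ** 2)
mse_tilde = np.mean((tilde - theta) ** 2)
print(mse_hat, mse_tilde)
```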