I have been looking for some resources on these types of meta probability questions, but haven't been lucky so far.
The question I was pondering: say I suspect someone of being an alien and would like to find out whether that is true. One strategy I have thought of is to gather a bunch of reaction samples from people who I am certain aren't aliens; that is, I ask them "Are you an alien?" and record the reaction. I assume that the number of reactions a human could give to that question is finite (albeit possibly huge). Then I ask the suspect the same question. If I have encountered the reaction before, I guess that the suspect is human. However, if the reaction is new, I guess that the suspect is an alien.
My question is, what is the probability that the guess is correct?
The way I tried to formalize this was:
Let X1,..., Xn be i.i.d. random variables defined on the same probability space and taking values in a finite set Z. Let Y be another random variable on that space with values in Z, but we know neither its distribution nor how it is related to the X variables. Let S be a subset of Z such that the probability that all the X variables take values in S, but Y doesn't, is nonzero.
Given the event that the X variables map into S, but Y doesn't, what is the probability that Px=Py, where Px and Py are the respective image measures?
However, this isn't really formally correct since I am still missing a probability space for the meta probability. And here is where I am stuck.
Does this whole endeavor make sense, and can anyone point me in a helpful direction please? Also, sorry for the lack of convenient math notation, but I am on my phone and just had to get this question out there.
It's possible to describe a "meta" probability space in which the values of random variables are themselves probability spaces/random variables/etc., but this isn't useful for practical modeling, because random variables in the same probability space can already depend on each other in complex, hierarchical ways. For example, a Bayesian network specifies a collection of random variables in terms of their dependencies and conditional distributions. I'd recommend studying Bayesian methods - they provide a clarifying perspective on most practical inference problems.
For your alien problem, a sensible quantitative approach might be: Ask known humans the question until you get enough duplicates that only 5% of the answers you've heard are unique (that is, heard exactly once); infer from this that a human will give one of the known responses with probability roughly $0.95$. Then if your suspect gives a novel response, you can reject the hypothesis that they're human at significance level $\alpha=0.05$. This is just a hypothesis test, though. If you want to calculate an actual probability that the suspect is human, you also need a prior probability that the suspect is human as well as a model for aliens' responses (not just humans') in order to apply Bayes' rule (here "unique response" means the suspect's response is one you have not heard before):
$$P(\text{human}|\text{unique response})=\frac{P(\text{unique response}|\text{human})P(\text{human})}{P(\text{unique response})}$$
where
$$P(\text{unique response}) =P(\text{unique response}|\text{human})P(\text{human}) +P(\text{unique response}|\lnot\text{human})P(\lnot\text{human}).$$
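To make the two steps concrete, here is a minimal Python sketch. It reads "only 5% of the answers are unique" as the share of recorded answers that were heard exactly once (a Good-Turing style estimate of the chance of a novel answer), and then plugs that into Bayes' rule as written above. The function names, the sample responses, the prior of 0.99, and the alien model `p_novel_given_alien = 0.5` are all made-up assumptions for illustration, not values that follow from the problem.

```python
from collections import Counter


def novel_response_rate(responses):
    """Estimate P(novel response | human) as the share of recorded
    answers that were heard exactly once (Good-Turing style)."""
    counts = Counter(responses)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(responses)


def posterior_human(p_novel_given_human, p_novel_given_alien, prior_human):
    """Bayes' rule: P(human | novel response)."""
    numerator = p_novel_given_human * prior_human
    # Law of total probability for P(novel response).
    evidence = numerator + p_novel_given_alien * (1.0 - prior_human)
    return numerator / evidence


# Hypothetical survey of 20 known humans; only one answer was heard once.
human_responses = ["no"] * 12 + ["what?"] * 5 + ["haha"] * 2 + ["are you ok?"]
rate = novel_response_rate(human_responses)

# Assumed prior and assumed alien behaviour (both invented for this example).
post = posterior_human(rate, p_novel_given_alien=0.5, prior_human=0.99)
print(f"P(novel | human) ~ {rate:.2f}, P(human | novel) ~ {post:.3f}")
```

With these invented numbers the posterior stays high even after a novel response, because the prior strongly favors "human"; the hypothesis test alone would have rejected at the 5% level. That gap is exactly why the prior and the alien model matter.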