Situation: I'm currently creating a workflow for the design of experiments to be used for validating a numerical simulation. The simulation outputs a value $\phi$ for all values of $x$ within a domain. To determine whether the simulation accurately serves its purpose, we perform an experiment to get the 'real' value of $\phi$ for certain values of $x$. Because an experiment is always subject to error, we perform the experiment multiple times and determine the mean value and standard deviation.
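To make the last step concrete, here is a minimal sketch of how the repeated measurements at a single value of $x$ could be summarized. The function name `summarize` and the measurement values are hypothetical and only serve as an illustration; the sketch assumes the repetitions are independent.

```python
import math
import statistics


def summarize(measurements):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    n = len(measurements)
    if n < 2:
        raise ValueError("need at least two repetitions to estimate the spread")
    mean = statistics.mean(measurements)
    std = statistics.stdev(measurements)  # sample std, n - 1 in the denominator
    sem = std / math.sqrt(n)              # uncertainty of the mean itself
    return mean, std, sem


# Hypothetical repeated measurements of phi at one value of x
phi_measured = [10.2, 9.8, 10.1, 10.4, 9.9]
mean, std, sem = summarize(phi_measured)
print(f"mean={mean:.3f}, std={std:.3f}, sem={sem:.3f}")
```

The standard deviation describes the scatter of individual runs, while the standard error describes how well the mean is known and shrinks as more repetitions are added.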
Goal: I want to be able to state with a certain confidence (e.g. 95%) that the real-world value of $\phi$ is equal to the found mean value. Once I have the real-world value, I can determine the difference with the simulation.
Basic Question: How do I calculate the number of times I need to perform the experiment before I can state with 95% confidence that I have found the correct value?
Real Question: What thinking error am I making in the stated workflow? Usually, when I can't find any information applicable to my question for an entire day, I am doing something wrong, but I can't work out where I am going wrong or what I have to change. I've found information on how to calculate a sample size, but those methods require the size of the population. I could be wrong on this, but experimental measurements don't have a population size as far as I'm aware.
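To make the question concrete, here is a minimal sketch of one common sample-size calculation that does not use a population size. It is an illustration under stated assumptions, not necessarily the right method for this workflow: it assumes independent repetitions with roughly normal errors, a pilot estimate of the standard deviation, and a chosen acceptable margin of error around the mean. The function name `required_repetitions` and the example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def required_repetitions(std_estimate, margin, confidence=0.95):
    """Repetitions needed so the mean's confidence interval has half-width <= margin.

    Uses the normal approximation n = (z * s / E)^2, where s is a pilot
    estimate of the standard deviation and E is the acceptable margin.
    """
    if std_estimate <= 0 or margin <= 0:
        raise ValueError("std_estimate and margin must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # Two-sided quantile, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * std_estimate / margin) ** 2)


# Hypothetical numbers: pilot std of 0.24, acceptable margin of +/- 0.1
n = required_repetitions(std_estimate=0.24, margin=0.1)
print(n)
```

Note that this requires choosing a margin of error: the result is a statement that the true mean lies within that margin of the measured mean, not that it equals it exactly. For a small number of repetitions, a Student's t quantile in place of `z` would give a somewhat larger count.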
I hope I've made the question clear enough, but if you need any extra information, I'm happy to provide it to the best of my knowledge. Thanks in advance for any responses.