I have a research project on building an object detection model for detecting animals in thermal imagery. I already have a baseline model and want to quantify the uncertainty of its predictions.
My idea was to use bootstrap sampling. Given X images, create 30 bootstrap samples, each with X images drawn with replacement from the original training set. Then, for each bootstrap sample, train a model and run inference on the same test set.
For each test image, collect the predicted counts from the 30 models and derive a prediction interval from them.
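
In rough Python, the procedure I have in mind looks something like the sketch below. It is only an outline under assumptions: `train_and_count` is a hypothetical placeholder for "train the detector on these images and return a predicted count per test image", and the interval is a crude nearest-rank percentile over the per-model counts, which is just one possible way to summarise them.

```python
import random


def bootstrap_indices(n_images, n_boot=30, seed=0):
    """Draw n_boot index lists, each of length n_images, with replacement."""
    rng = random.Random(seed)
    return [
        [rng.randrange(n_images) for _ in range(n_images)]
        for _ in range(n_boot)
    ]


def percentile_interval(counts, alpha=0.05):
    """Crude nearest-rank percentile interval over the per-model counts."""
    ordered = sorted(counts)
    last = len(ordered) - 1
    lo = ordered[int(alpha / 2 * last)]
    hi = ordered[round((1 - alpha / 2) * last)]
    return lo, hi


def collect_intervals(n_images, test_ids, train_and_count, n_boot=30):
    """Train one model per bootstrap sample and summarise counts per test image.

    train_and_count is a hypothetical callable (an assumption, not a real API):
    it takes a list of training indices and the test ids, and returns
    {test_id: predicted_count}.
    """
    per_image = {t: [] for t in test_ids}
    for indices in bootstrap_indices(n_images, n_boot):
        predictions = train_and_count(indices, test_ids)
        for t in test_ids:
            per_image[t].append(predictions[t])
    return {t: percentile_interval(c) for t, c in per_image.items()}
```

Note that an interval built this way only reflects how much the predictions vary across resampled training sets.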
I set the number of bootstrap samples to 30 so that the central limit theorem (CLT) would apply. Let me know if this reasoning is completely wrong, or if I should increase the number of bootstrap samples.
Also, if I draw X images with replacement, the bootstrap sample will almost certainly contain fewer than X unique images, since some of the draws will be duplicates. Is this okay?
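
To make the duplicates point concrete, here is a small simulation sketch (the function name `unique_fraction` is just for illustration). Each image is missed by a single draw with probability 1 - 1/X, so the expected fraction of distinct images in a sample of size X is 1 - (1 - 1/X)^X, which approaches 1 - 1/e (about 0.632) for large X.

```python
import random


def unique_fraction(n_images, seed=0):
    """Draw n_images indices with replacement; report size and unique share."""
    rng = random.Random(seed)
    sample = [rng.randrange(n_images) for _ in range(n_images)]
    # The sample still has n_images entries; only the distinct ones shrink.
    return len(sample), len(set(sample)) / n_images


size, fraction = unique_fraction(10_000)
print(size, fraction)
```

So the sample still has X entries, but only roughly 63% of the original images appear in it at least once.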
Thank you!