I want to do a Bayesian bootstrap on sample $x_i$ (size $N$) that is already weighted with weights $w_i$. Weights are assumed to sum to $N$, i.e., $\sum w_i=N$.
Now, when weights equal 1, i.e., $w_i=1$, one would use the Dirichlet distribution with parameter $\alpha_i=1$ to sample weights. It's the continuous equivalent of resampling with replacement.
Now suppose the weights are not all equal to one. What would be the equivalent of sampling with replacement in this case? My best guess would be to use a Dirichlet distribution with $\alpha_i=w_i$.
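To make my guess concrete, here is a minimal sketch (the function name, seed, and use of the mean as the statistic are just for illustration): draw resampling weights from $\text{Dirichlet}(w_1,\dots,w_N)$, so that $\sum w_i = N$ plays the role of the total concentration.

```python
import numpy as np

def weighted_bayesian_bootstrap_mean(x, w, n_boot=2000, seed=0):
    """Bayesian bootstrap of the mean for sample x with prior weights w.

    This is my guess, not an established recipe: draw resampling
    weights from Dirichlet(alpha_i = w_i), so points with larger w_i
    receive larger (and relatively less variable) resampling weight.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    p = rng.dirichlet(w, size=n_boot)  # shape (n_boot, N); rows sum to 1
    return p @ x                       # one weighted mean per replicate
```

When all $w_i=1$ this reduces to the usual Bayesian bootstrap, which is at least a sanity check on the idea.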
Also, I'm probably not the first one to encounter this problem. So there should probably be a reference on this.
Does anyone have any ideas?
Say we have $n$ data points $y_1,\dots,y_n$, with a weight $w_i$, $i=1,\dots,n$, attached to each, and the weights are redrawn on each iteration of the bootstrap procedure.
The weights $w_i$ correspond to the data points $y_i$. For the standard bootstrap, the weights are multiples of $1/n$; multiplying by $n$ gives the number of times each data point was resampled. The counts follow a multinomial distribution, the multidimensional generalization of the binomial distribution. If the quantity of interest is the sample mean, we take the dot product of $\textbf w$ with $\textbf y$. Otherwise, each data point has been sampled $nw_i$ times, and you can use this to calculate your statistic of interest.
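A small simulation of this step (the data values and seed are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
y = np.array([2.0, 5.0, 1.0, 7.0, 4.0])  # toy data
n = len(y)

# Standard bootstrap: resample counts ~ Multinomial(n, (1/n, ..., 1/n));
# dividing the counts by n turns them into weights that are multiples of 1/n.
counts = rng.multinomial(n, np.full(n, 1.0 / n))
w = counts / n

# If the statistic of interest is the mean, it is the dot product w . y.
boot_mean = w @ y
```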
For the smooth (Bayesian) bootstrap, the weights are no longer drawn from a multinomial but from a Dirichlet distribution, the multivariate generalization of the beta distribution. In general we use $\mathrm{Dirichlet}(\alpha,\dots,\alpha)$ because we want each data point to be "equally likely." However, different values of $\alpha$ give different types of bootstrap samples; a common choice is $\alpha=1$. According to Towards Data Science, "The distribution of the random weight vector does not have to be restricted to the Dir(1, …, 1). Later investigations found that the weights having a scaled Dir(4, …, 4) distribution give better approximations (Tu and Zheng, 1987)."
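Swapping the multinomial counts for continuous Dirichlet weights looks like this (again with arbitrary toy data and seed):

```python
import numpy as np

rng = np.random.default_rng(2)
y = np.array([2.0, 5.0, 1.0, 7.0, 4.0])  # toy data
n, B = len(y), 5000

# Smooth (Bayesian) bootstrap: continuous weights ~ Dirichlet(1, ..., 1)
# instead of multinomial counts; each row of W is one weight vector.
W = rng.dirichlet(np.ones(n), size=B)  # shape (B, n); rows sum to 1
boot_means = W @ y                     # one weighted mean per replicate
```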
You can see the densities for different choices of $\alpha$ here, for instance: https://upload.wikimedia.org/wikipedia/commons/7/74/Dirichlet.pdf
You can see that when $\alpha_1=\dots=\alpha_K$ the distribution is 'evenly spread' and otherwise it is more concentrated on the higher $\alpha_k$ than the others. $\alpha=1$ corresponds to a uniform distribution, but notice that $\alpha=c$ gives the same 'spread' to each coordinate. If $\alpha<1$ there are spikes at the corners (not shown here). We cannot use unequal $\alpha$'s because this would imply that the data points come from a biased sample.
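The "same spread for each coordinate" claim is easy to check numerically with a quick simulation ($K$, the $\alpha$ values, and the seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
K, B = 3, 20000

# Symmetric Dirichlet(alpha, ..., alpha): every coordinate has mean 1/K,
# but larger alpha concentrates the weights around 1/K, while alpha < 1
# pushes mass toward the corners of the simplex.
for alpha in (0.5, 1.0, 4.0):
    W = rng.dirichlet(np.full(K, alpha), size=B)
    print(f"alpha={alpha}: mean={W.mean():.3f}, std={W.std():.3f}")
```

The mean stays at $1/K$ for every $\alpha$, while the standard deviation shrinks as $\alpha$ grows.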
Thus, regarding your question: if the weights are not all $1$, then by definition they are the weights that get updated on each iteration of the bootstrap procedure. If instead you mean you would like to give preferential treatment to particular data points, that isn't allowed, since it would imply your sample is biased.