Consider a maturity $T$. For this maturity I have market implied volatilities, denoted $\sigma^{0}_{i}$. I want to interpolate these volatilities using an entropy approach, with the $\sigma^{0}_{i}$ as the prior.
From $\sigma^0_i$, market prices of options $C_i$ can be obtained, out of which a probability distribution function $\mathbb{P}_0$ for $S_T$ is derived.
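One standard route from call prices to a density, sketched here as an assumption since the question does not say which method is used, is the Breeden-Litzenberger relation $p_0(K) = e^{rT}\,\partial^2 C/\partial K^2$. The sketch below applies a central second difference to call prices on a dense strike grid. The flat volatility, the rate, the spot and the grid are hypothetical values chosen only so the example runs; in practice the prices would come from the interpolated $\sigma^0_i$.

```python
import numpy as np
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call(S0, K, T, r, sigma):
    """Black-Scholes call price, used here only to generate example prices."""
    d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S0 * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)


# Hypothetical inputs (not from the question): flat 20% vol, zero rate.
S0, T, r, sigma = 100.0, 1.0, 0.0, 0.2
strikes = np.linspace(20.0, 300.0, 561)  # dense grid, step 0.5
calls = np.array([bs_call(S0, K, T, r, sigma) for K in strikes])

# Breeden-Litzenberger: density = exp(rT) * d^2 C / dK^2 (central difference).
dK = strikes[1] - strikes[0]
density = np.exp(r * T) * (calls[2:] - 2.0 * calls[1:-1] + calls[:-2]) / dK**2
grid = strikes[1:-1]  # interior strikes where the density is defined

mass = density.sum() * dK  # should be close to 1 if the grid is wide enough
```

If `mass` is noticeably below 1, the strike grid is probably too narrow to capture the tails.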
I want to find a distribution $\mathbb{P}$ for $S_T$ such that
- The entropy is maximized
- $\mathbb{P}_0$ is used as prior distribution
- The distribution $\mathbb{P}$ yields prices as close to market prices as possible.
I know the probability distribution $\mathbb{P}$ is given by the following minimization problem:
$\mathbb{E}^{\mathbb{P}}\left[\ln\left(\frac{d\mathbb{P}}{d\mathbb{P}_0}\right)-1\right]+\sum_{i}\omega_i\left(\mathbb{E}^{\mathbb{P}}\left[f_i(S)\right]-C_i\right)^2$
$\mathbb{P}_0$ is the market implied probability distribution of $S_T$ linked to the $\sigma^{0}_{i}$.
$f_i(S)=\max(S_T-K_i,0)$, i.e. the payoff of the $i$-th market call option, with strike $K_i$
$C_i$ are market prices of call options
$\omega_i$ are weights
I understand that the term $\sum_{i}\omega_i\left(\mathbb{E}^{\mathbb{P}}\left[f_i(S)\right]-C_i\right)^2$ penalizes deviations from market prices.
But I do not understand the term
$\mathbb{E}^{\mathbb{P}}\left[\ln\left(\frac{d\mathbb{P}}{d\mathbb{P}_0}\right)-1\right]$
What is the rationale of this term? Why does it maximize entropy?
AFAIK, continuous entropy is an ambiguous object. The first term is rather the KL divergence between $\Bbb P$ and $\Bbb P_0$ (the $-1$ is a constant and does not affect the optimum), so minimizing the expression you wrote balances the KL divergence against the pricing error on the call options. KL divergence is also known as relative entropy.
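To see why this term is associated with entropy, here is a short sketch for a discrete distribution on $n$ states, an assumption made only to sidestep the ambiguity of continuous entropy. Write $p_j$ and $q_j$ for the probabilities of state $j$ under $\Bbb P$ and $\Bbb P_0$, and $H(p) = -\sum_j p_j \ln p_j$ for the Shannon entropy. Then

$$ \mathrm{D}_{KL}(\Bbb P\|\Bbb P_0) = \mathbb{E}^{\mathbb{P}}\left[\ln\left(\frac{d\mathbb{P}}{d\mathbb{P}_0}\right)\right] = \sum_{j} p_j \ln\frac{p_j}{q_j} = -H(p) - \sum_{j} p_j \ln q_j . $$

For a uniform prior, $q_j = 1/n$, this reduces to

$$ \mathrm{D}_{KL}(\Bbb P\|\Bbb P_0) = \ln n - H(p), $$

so minimizing the KL divergence is the same as maximizing the entropy $H(p)$. With a non-uniform prior, the minimization instead picks the distribution that stays closest to $\Bbb P_0$ in the relative entropy sense, which is the usual generalization of the maximum entropy principle.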
Now, to the formulation of your problem. I have only seen the term prior applied to distributions in Bayesian statistics, where you compute a posterior from the data at hand, and that does not seem to match your situation. Also, in general one cannot hope to optimize two quantities at once unless one is an increasing function of the other, so "maximize entropy" and "match prices as closely as possible" have to be combined into a single objective. To me, as close as possible sounds like an optimization problem in which you minimize a distance.
As a result, $$ \mathbb{E}^{\mathbb{P}}\left[\ln\left(\frac{d\mathbb{P}}{d\mathbb{P}_0}\right)-1\right]+\sum_{i}\omega_i\left(\mathbb{E}^{\mathbb{P}}\left[f_i(S)\right]-C_i\right)^2 = \mathrm{D}_{KL}(\Bbb P\|\Bbb P_0) - 1 + \mathrm{MSE}_\omega( C^\Bbb P, C) $$ where $C^{\Bbb P}_i = \mathbb{E}^{\mathbb{P}}\left[f_i(S)\right]$. Since the constant $-1$ does not move the minimizer, this does the following: it minimizes the sum of the relative entropy of $\Bbb P$ w.r.t. $\Bbb P_0$ and the weighted squared error between the market option prices and those implied by $\Bbb P$. That would be the formally correct restatement of the three-bullet optimization problem you've mentioned in the OP.
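A minimal numerical sketch of this minimization, assuming $S_T$ is discretized on a finite grid so that $\Bbb P$ and $\Bbb P_0$ become probability vectors `p` and `p0` (with `p0` strictly positive). The constant $-1$ is dropped. Rather than optimizing over `p` directly, the sketch solves the concave dual with one multiplier per option, using Newton steps with backtracking; this is one possible approach, not necessarily the one used in the source of the formula. The function name, the grid, the lognormal prior, the strikes, the price bumps and the weights are all hypothetical.

```python
import numpy as np


def min_relative_entropy(p0, F, C, w, n_iter=50, tol=1e-8):
    """Minimise KL(p || p0) + sum_i w_i * (F[i] @ p - C[i])**2 over the simplex.

    Works on the concave dual in lam (one multiplier per option):
        D(lam) = -log(sum_j p0_j * exp(-(lam @ F)_j)) - lam @ C - sum(lam**2 / (4 w)),
    whose maximiser gives p_j proportional to p0_j * exp(-(lam @ F)_j).
    """
    def tilt(lam):
        a = np.log(p0) - lam @ F                  # unnormalised log-probabilities
        m = a.max()
        log_z = m + np.log(np.exp(a - m).sum())   # stable log-sum-exp
        return np.exp(a - log_z), log_z

    def dual(lam, log_z):
        return -log_z - lam @ C - np.sum(lam**2 / (4.0 * w))

    lam = np.zeros(len(C))
    p, log_z = tilt(lam)
    for _ in range(n_iter):
        model = F @ p                             # option prices under p
        grad = model - C - lam / (2.0 * w)        # gradient of the dual
        if np.abs(grad).max() < tol:
            break
        cov = (F * p) @ F.T - np.outer(model, model)
        hess = -cov - np.diag(1.0 / (2.0 * w))    # negative definite Hessian
        step = np.linalg.solve(hess, -grad)       # Newton ascent direction
        t, d0 = 1.0, dual(lam, log_z)
        while True:                               # backtracking (Armijo) search
            p_new, log_z_new = tilt(lam + t * step)
            if dual(lam + t * step, log_z_new) >= d0 + 1e-4 * t * (grad @ step) or t < 1e-8:
                break
            t *= 0.5
        lam, p, log_z = lam + t * step, p_new, log_z_new
    return p, lam


# Hypothetical example: lognormal prior on a grid, three calls with bumped prices.
s = np.linspace(20.0, 300.0, 281)
sigma, T, fwd = 0.2, 1.0, 100.0
mu = np.log(fwd) - 0.5 * sigma**2 * T
p0 = np.exp(-(np.log(s) - mu) ** 2 / (2.0 * sigma**2 * T)) / s
p0 /= p0.sum()

K = np.array([80.0, 100.0, 120.0])
F = np.maximum(s[None, :] - K[:, None], 0.0)      # payoff matrix f_i(s_j)
C = F @ p0 + np.array([0.5, 0.3, 0.4])            # "market" prices (made up)
w = np.array([10.0, 10.0, 10.0])                  # penalty weights

p, lam = min_relative_entropy(p0, F, C, w)
kl = np.sum(p * np.log(p / p0))                   # relative entropy paid
fit_error = F @ p - C                             # remaining pricing error
```

At the optimum the multipliers satisfy $\lambda_i = 2\omega_i\left(\mathbb{E}^{\mathbb{P}}[f_i(S)] - C_i\right)$, so larger weights $\omega_i$ push the model prices closer to the market prices at the cost of a larger relative entropy with respect to the prior.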