I have been trying to wrap my head around a quite general physics issue for some time. Given some density operator (or its matrix representation) $\rho$, what happens to its von Neumann entropy when some of the states can be excluded? I'll try to formulate the question as well as I can in mathematical terms.
A density operator is a positive semidefinite Hermitian operator with trace 1: $\rho=\rho^\dagger$, $\rho\geq0$ and $Tr(\rho)=1$. The entropy of $\rho$ is defined as $S(\rho)=-Tr(\rho\,\ln\rho)$.
Let $P$ be an orthogonal projector, i.e., $P^2=P=P^\dagger$ (so that $P\,(1-P)=0$), which is neither empty nor complete on $\rho$: $\rho\neq P\rho\neq0$.
Then another density operator can be defined as $\tilde{\rho}=\frac{1}{1-Tr(P\rho)}\cdot\left(\rho-P\rho\right)$, such that $Tr(\tilde{\rho})=1$, which excludes the subspace onto which $P$ projects.
What can we say about $S(\tilde{\rho})$?
For all practical purposes, the spectrum of $\rho$ is discrete and finite. However, the limit of a discrete but infinite spectrum is definitely interesting. As far as I understand the issue, $S(\tilde{\rho})$ should be well defined under these conditions.
I played around a lot to get to a solution. I'd like to outline one attempt to provide an idea of what kind of solution I am looking for.
Without loss of generality, $\rho$ may be considered diagonal with $\rho=diag(\rho_0,...,\rho_N)$ for an $(N+1)$-dimensional state space and $\rho_i\in\mathbb{R}^+_0$. Thus $1=Tr(\rho)=\sum_{i=0}^N\rho_i$ and $S(\rho)=-Tr(\rho\,\ln\rho)=-\sum_{i=0}^N\rho_i\,\ln\rho_i$. If we choose $P=diag(1,0,...)$, then $Tr(P\rho)=\rho_0$ and $\tilde{\rho}=\frac{1}{1-\rho_0}diag(0,\rho_1,...,\rho_N)$. So, we end up with: $$ \begin{align} S(\tilde{\rho})&=-\frac{1}{1-\rho_0}\sum_{i=1}^N\rho_i\left(\ln\rho_i-\ln(1-\rho_0)\right)\\ &=\frac{\ln(1-\rho_0)}{1-\rho_0}\underbrace{\sum_{i=1}^N\rho_i}_{1-\rho_0}-\frac{1}{1-\rho_0}\underbrace{\sum_{i=1}^N\rho_i\ln\rho_i}_{-S(\rho)-\rho_0\ln\rho_0}\\ &=\ln(1-\rho_0)+\frac{S(\rho)+\rho_0\,\ln\rho_0}{1-\rho_0} \end{align} $$
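As a sanity check of the diagonal-case derivation, here is a minimal Python sketch that compares the closed form with a direct evaluation on the renormalized spectrum. The function names and the example spectrum are my own illustrative choices, not part of the derivation, and the convention $0\ln 0=0$ is assumed.

```python
import math

def entropy(probs):
    """Entropy -sum p ln p of a spectrum, using the convention 0 ln 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_after_exclusion_direct(probs):
    """Drop probs[0], renormalize the remaining weights, compute the entropy."""
    rest_weight = 1.0 - probs[0]
    return entropy([p / rest_weight for p in probs[1:]])

def entropy_after_exclusion_closed_form(probs):
    """Closed form: ln(1 - rho_0) + (S(rho) + rho_0 ln rho_0) / (1 - rho_0)."""
    rho0 = probs[0]
    rho0_term = rho0 * math.log(rho0) if rho0 > 0.0 else 0.0
    return math.log(1.0 - rho0) + (entropy(probs) + rho0_term) / (1.0 - rho0)

# Illustrative spectrum (assumed values, chosen only for this check).
spectrum = [0.4, 0.3, 0.2, 0.1]
direct = entropy_after_exclusion_direct(spectrum)
closed = entropy_after_exclusion_closed_form(spectrum)
print(direct, closed)
```

Both numbers should agree up to floating-point rounding. This only covers the case where $P$ is diagonal in the eigenbasis of $\rho$.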
Is it possible to extend this result to other, non-diagonal choices of $P$, e.g., by generalizing $\rho_0\rightarrow Tr(P\rho)$ in the equation above?
Furthermore, I expect $0\leq S(\tilde{\rho})<S(\rho)$; can this be shown? Or, if this is not true in general, what properties are required from $P$ in order to make this inequality true?
I did some numerical exploration and found that there is indeed something about $S(\tilde{\rho})$ that is easily seen. First, in more than $10^6$ random pairs of $\rho$ and $P$ with random dimension up to 100, the above inequality was never violated.
More interestingly, the average entropy loss appears to depend only on $Tr(P\rho)$, i.e., $\overline{S(\rho)-S(\tilde{\rho})}= f(Tr(P\rho))$, with its variance tending to $0$ as $\sim N^{-1}$. Moreover, $f$ is a quite interesting function:
$$ \overline{S(\rho)-S(\tilde{\rho})}= f_N(Tr(P\rho)) \text{ with } f_N(x)\approx c_N\cdot\sum_{n=0}^{\infty}x^{2^n} $$
I cut the sum off at $n=8$ for my numerical exploration. In fact, $c_N$ is nearly constant across matrix dimensions: $c_N\approx\frac{1}{\sqrt{2}}+\frac{0.03}{\ln(2N)}$ fits pretty well for $3<N+1<1000$. For smaller dimensions, $c_N$ is significantly smaller than this estimate. Perhaps this very special function sparks some ideas.
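For anyone who wants to compare against this fit, here is a minimal Python sketch of the truncated series and the empirical prefactor. The function names are hypothetical; the formula for $c_N$ is just the fit quoted above, not a derived result.

```python
import math

def c_fit(N):
    """Empirical prefactor fit from the post: 1/sqrt(2) + 0.03/ln(2N)."""
    return 1.0 / math.sqrt(2.0) + 0.03 / math.log(2.0 * N)

def lacunary_sum(x, n_max=8):
    """Truncated series sum_{n=0}^{n_max} x^(2^n), built by repeated squaring."""
    total, term = 0.0, x
    for _ in range(n_max + 1):
        total += term
        term *= term  # x^(2^n) -> x^(2^(n+1))
    return total

def f_fit(x, N, n_max=8):
    """Fitted average entropy loss f_N(x) for x = Tr(P rho)."""
    return c_fit(N) * lacunary_sum(x, n_max)

# Example: fitted loss for Tr(P rho) = 0.5 in a 51-dimensional space (N = 50).
print(f_fit(0.5, 50))
```

With the cutoff at $n=8$, the neglected terms start at $x^{512}$, so the truncation only matters for $x$ very close to 1.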
Note that this is different from this question, where $\rho$ lives in a tensor product space. However, if it can be shown that my question is equivalent, given that any $\rho$ is produced by a sequence of such mappings $(\rho\rightarrow\tilde{\rho})$ beginning with $1/(N+1)\cdot\mathbb{1}$, that would also be a solution.