Let $X_1, ..., X_n$ be positive discrete r.v.s with pmfs $p^j_k = \mathbb{P}(X_j=k)$. Let $N \in \{ 1, ..., n\}$ be a discrete r.v., independent of the $X_j$, with pmf $a_j=\mathbb{P}(N=j)$. Then $X_N$ is the random variable obtained by first sampling an index $j$ according to the $a_j$ and then sampling from the distribution $p^j$.
Calculate $H(N, X_N)$ and, for $p^j_k$ fixed, find its maximum value and the value of $a_1, ..., a_n$ which achieves it.
I believe $$ H(N,X_N)=-\sum_{j=1}^n\sum_{k=1}^\infty a_jp^j_k\log(a_jp^j_k)$$
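Expanding $\log(a_jp^j_k)=\log a_j+\log p^j_k$ and using $\sum_k p^j_k=1$, this splits into the chain rule $H(N,X_N)=H(N)+H(X_N\mid N)$:
$$H(N,X_N)=-\sum_{j=1}^n a_j\log a_j+\sum_{j=1}^n a_jH(X_j).$$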
I would guess that this is maximised by $$a_j=\frac{H(X_j)}{\sum_{i=1}^nH(X_i)}$$ simply because it seems like giving a higher probability to the r.v.s with more entropy would maximise the entropy. How can I prove (or disprove) this, preferably using Lagrange multipliers?
Your problem can be stated as:
$$\max_{a\in K} -\sum_{j=1}^n \sum_{k=1}^{\infty} a_j p_{jk}\log(a_j p_{jk})$$
where $K=\{a_1,\ldots,a_n\ge 0\,\mid\,a_1+\ldots+a_n=1\}$. This is a very favorable case of convex programming (objective is concave, constraints are affine) so for example in that case KKT conditions are necessary and sufficient conditions for optimality. So if we consider:
$$\begin{align} f(a)&=-\sum_{j=1}^n \sum_{k=1}^{\infty} a_j p_{jk}\log(a_j p_{jk})\\ g_i(a)&=-a_i\\ h(a)&=a_1+\ldots+a_n-1\end{align}$$
Notice that:
$$\begin{align}\nabla f(a)&=(-1-\log a_j+H(X_j))^T_{1\le j\le n}\\ \nabla g_i(a)&=-e_i\\ \nabla h(a)&=(1,\ldots,1)^T \end{align}$$
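As a sanity check on the gradient formula $\partial f/\partial a_j=-1-\log a_j+H(X_j)$, here is a small numerical comparison against central finite differences. The distributions are arbitrary finite-support examples, not from the post (infinite support is truncated for computation):

```python
import numpy as np

# Illustrative finite-support distributions p[j] (rows sum to 1) -- assumed data.
rng = np.random.default_rng(0)
p = rng.random((3, 5))
p /= p.sum(axis=1, keepdims=True)

def f(a):
    # Joint entropy f(a) = -sum_{j,k} a_j p_{jk} log(a_j p_{jk})
    joint = a[:, None] * p
    return -np.sum(joint * np.log(joint))

a = np.array([0.2, 0.3, 0.5])
H = -np.sum(p * np.log(p), axis=1)     # entropies H(X_j)
analytic = -1 - np.log(a) + H          # gradient claimed in the answer

# Central finite differences in each coordinate
eps = 1e-6
numeric = np.array([
    (f(a + eps * np.eye(3)[j]) - f(a - eps * np.eye(3)[j])) / (2 * eps)
    for j in range(3)
])
print(np.max(np.abs(analytic - numeric)))  # should be tiny
```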
Then at the optimal $a^*$ the KKT conditions assert the existence of multipliers $\lambda_i\ge 0$ and $\mu\in\mathbb{R}$ such that
$$\nabla f(a^*)=\mu\,\nabla h(a^*)+\sum_{i=1}^n\lambda_i\,\nabla g_i(a^*),\qquad \lambda_i\,a^*_i=0.$$
Componentwise this reads $-1-\log a^*_j+H(X_j)=\mu-\lambda_j$. Since $-a_j\log a_j$ has infinite slope as $a_j\to 0^+$, the optimum has every $a^*_j>0$, so complementary slackness forces $\lambda_j=0$ and hence $\log a^*_j=H(X_j)-1-\mu$. The only way all these conditions are simultaneously satisfied is when $a^*_j=Ce^{H(X_j)}$ (for some constant $C>0$), so in the end:
$$a^*_j=\frac{e^{H(X_j)}}{\sum_{i=1}^n e^{H(X_i)}}$$
is the optimal solution. Plugging it back in gives the maximum value $H(N,X_N)=\log\left(\sum_{i=1}^n e^{H(X_i)}\right)$. You can get to the same result using Lagrange multipliers, but you would need to handle the boundary of the simplex manually.