I would like your help understanding which assumptions are crucial for the claim reported below.
Let me start with the notation:
$\mathcal{Y}\equiv \{1,2,...,M\}$.
$S$ is a random vector with support $\mathcal{S}$, which may be finite or infinite, countable or uncountable.
$\epsilon\equiv (\epsilon_1,...,\epsilon_M)$ is an $M\times 1$ random vector.
For each $s\in \mathcal{S}$, we denote by $G_{s}$ the probability distribution of $\epsilon$ conditional on $S=s$.
For each $s\in \mathcal{S}$, $\theta_s\equiv (\theta_{sy}\text{ }\forall y \in \mathcal{Y})$ is an $M\times 1$ vector of parameters belonging to $\Theta_s\subseteq \mathbb{R}^M$.
Given $s\in \mathcal{S}$ and $\theta_s\in \Theta_s$, consider the region $$ \mathcal{E}_y(s; \theta_s)\equiv \{e\in \mathbb{R}^M: \text{ }\theta_{sy}+e_y\geq \theta_{sz}+e_z \text{ }\forall z \in \mathcal{Y}\} $$
Given $s\in \mathcal{S}$ and $\theta_s\in \Theta_s$, define the $M\times 1$ vector $$ P^{\theta_s}_s\equiv \begin{pmatrix} G_{s}(\mathcal{E}_1(s; \theta_s))\\ ...\\ G_{s}(\mathcal{E}_M(s; \theta_s))\\ \end{pmatrix} $$
Given $s\in \mathcal{S}$, consider the function $H_s: \Theta_s\rightarrow \mathbb{R}$ prescribed by $$ H_s(\theta_s)= E_{G_{s} }\Big[\max_{y\in \mathcal{Y}} (\theta_{sy}+\epsilon_y)\Big]\hspace{1cm} \text{ $\forall \theta_s\in \Theta_s$ } $$ where $E_{G_s}[...]$ denotes the expectation of $[...]$ computed using the probability distribution $G_s$.
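To fix ideas, here is a minimal numerical sketch of $H_s$ in a special case that I am assuming purely for illustration: $G_s$ is the distribution of $M=3$ i.i.d. standard Gumbel errors (the logit case), where $H_s$ has the well-known closed form $\log\sum_{y}e^{\theta_{sy}}+\gamma$, with $\gamma$ Euler's constant.

```python
import numpy as np

# Illustration only: assume G_s = i.i.d. standard Gumbel in M = 3 dimensions
# (the logit case), where H_s has the closed form logsumexp(theta) + gamma.
rng = np.random.default_rng(0)
theta = np.array([1.0, 0.5, -0.2])            # a value of theta_s
eps = rng.gumbel(size=(200_000, 3))           # draws of epsilon under G_s
H_mc = np.max(theta + eps, axis=1).mean()     # Monte Carlo E[max_y (theta_sy + eps_y)]
H_cf = np.log(np.exp(theta).sum()) + np.euler_gamma  # closed form in the logit case
print(H_mc, H_cf)                             # the two agree up to Monte Carlo error
```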
Main assumption
(A1) The probability distribution of $\epsilon\equiv (\epsilon_1,...\epsilon_{M})$ conditional on $S$ is known.
We also assume that, for every $s \in \mathcal{S}$, $G_{s}$ has full support on $\mathbb{R}^M$ and is absolutely continuous with respect to the Lebesgue measure.
Claim: Under assumption (A1), for each $s\in \mathcal{S}$ and for each $\theta_s\in \Theta_s$,
$$ \theta_s=\nabla_{q} H^*_s(q)\Big|_{q=P^{\theta_s}_s} $$ where $H^*_s: \mathbb{R}^M\rightarrow \mathbb{R}\cup\{+\infty\}$ is the convex conjugate of the function $H_s$, and $\nabla_{q} H^*_s(q)\Big|_{q=P^{\theta_s}_s}$ denotes the $M\times 1$ vector of partial derivatives of $H^*_s$ with respect to $q$, evaluated at $q=P^{\theta_s}_s$.
Attempted proof with questions highlighted
Fix $s\in \mathcal{S}$; the steps below hold for every such $s$.
Step 0:
Notice that, because $G_s$ is absolutely continuous with respect to the Lebesgue measure, the components of $P^{\theta_s}_{s}$ sum to $1$. Indeed, the regions $\mathcal{E}_1(s;\theta_s),...,\mathcal{E}_M(s;\theta_s)$ cover $\mathbb{R}^M$, and the values of $\epsilon$ at which $\arg\max_{y\in\mathcal{Y}}(\theta_{sy}+\epsilon_y)$ is not unique (the overlaps between these regions) are zero-probability events.
Step 1: Note that $$ \frac{\partial H_s(\theta_s) }{\partial \theta_{sy}}=G_{s}(\mathcal{E}_y(s; \theta_s)) $$ where $\frac{\partial(...)}{\partial(...)}$ denotes derivative.
Thus, we can write the relation $$ \nabla_{\theta_s} H_s(\theta_s)=P^{\theta_s}_{s} $$
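Step 1 can be checked numerically in the same illustrative logit case (i.i.d. standard Gumbel errors, my assumption, not part of the claim): a central finite difference of the Monte Carlo estimate of $H_s$ reproduces the probabilities $G_{s}(\mathcal{E}_y(s; \theta_s))$, which in that case are the softmax of $\theta_s$.

```python
import numpy as np

# Illustration only: G_s = i.i.d. standard Gumbel (logit case), M = 3.
rng = np.random.default_rng(1)
theta = np.array([1.0, 0.5, -0.2])
eps = rng.gumbel(size=(500_000, 3))           # common draws for all evaluations

def H(t):
    """Monte Carlo estimate of H_s(t) on the common draws."""
    return np.max(t + eps, axis=1).mean()

h = 1e-3
grad = np.array([(H(theta + h * np.eye(3)[y]) - H(theta - h * np.eye(3)[y])) / (2 * h)
                 for y in range(3)])          # finite-difference gradient of H_s
P = np.exp(theta) / np.exp(theta).sum()       # G_s(E_y(s; theta_s)) in the logit case
print(grad, P)                                # componentwise close
```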
First doubt: how do we know that the function $H_s$ is differentiable? Is this implied by (A1)?
Step 2:
Let $\mathcal{Q}\equiv \{q \in [0,1]^M \text{ s.t. } \sum_{y\in \mathcal{Y}}q_y=1\}$.
Consider the convex conjugate of $H_s$, that is, the function
$H^*_s: \mathbb{R}^M \rightarrow \mathbb{R}\cup\{+\infty\}$ prescribed by
$$
H^*_s(q)=\begin{cases}
\sup_{\theta_s \in \mathbb{R}^M}\Big[\sum_{y\in \mathcal{Y}} \theta_{sy} q_y - E_{G_{s} }\Big(\max_{y\in \mathcal{Y}} (\theta_{sy}+\epsilon_y)\Big) \Big] & \forall q \in \mathcal{Q}\\
\infty & \forall q \in \mathbb{R}^M\setminus \mathcal{Q}
\end{cases}
$$
It holds that for any $a\in \Theta_s$ $$ \nabla_{\theta_s} H_s(\theta_s)\Big|_{\theta_s=a}=P^a_{s} \Leftrightarrow \nabla_{q}H^*_s(q)\Big|_{q=P^a_s}=a $$
Second doubt: does this obviously hold under (A1)? How do we know that the convex conjugate is differentiable?
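For what it is worth, in the illustrative logit case (i.i.d. standard Gumbel errors, again my assumption) everything is in closed form: on $\mathcal{Q}$ one has $H^*_s(q)=\sum_y q_y\log q_y-\gamma$, and the subgradients of $H^*_s$ at $q$ have components $\log q_y + c$ for a common constant $c$. Since adding a constant to every component of $\theta_s$ leaves $\nabla_{\theta_s}H_s$ unchanged, the constant $c$ matters for the second doubt; with $c=\log\sum_z e^{\theta_{sz}}$ one recovers $\theta_s$ exactly:

```python
import numpy as np

# Illustration only: logit case, where q = P_s^{theta_s} is the softmax of
# theta_s and log(q) recovers theta_s up to a common additive constant.
theta = np.array([1.0, 0.5, -0.2])
q = np.exp(theta) / np.exp(theta).sum()       # q = P_s^{theta_s}
c = np.log(np.exp(theta).sum())               # logsumexp(theta_s)
recovered = np.log(q) + c                     # a subgradient of H*_s at q
print(recovered)                              # equals theta exactly
```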
Step 3:
It follows that, $\forall \theta_s\in \Theta_s$, $$ \theta_s=\nabla_{q} H^*_s(q)\Big|_{q=P^{\theta_s}_{s}} $$ which concludes our proof.
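As a sanity check of the whole chain (still under my illustrative logit assumption), one can go from $\theta_s$ to $q=P^{\theta_s}_{s}$ by simulating the frequencies of $\arg\max_{y}(\theta_{sy}+\epsilon_y)$, and then back from $q$ toward $\theta_s$ via $\log q$; the recovery holds up to a common additive constant across components:

```python
import numpy as np

# Round trip theta_s -> q -> theta_s in the illustrative logit case.
rng = np.random.default_rng(2)
theta = np.array([1.0, 0.5, -0.2])
eps = rng.gumbel(size=(500_000, 3))
winners = np.argmax(theta + eps, axis=1)      # argmax_y (theta_sy + eps_y) per draw
q = np.bincount(winners, minlength=3) / len(winners)  # simulated P_s^{theta_s}
shift = np.log(q) - theta                     # should be (nearly) constant across y
print(q.sum(), shift)                         # q sums to 1; shift components agree
```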