Proof that makes use of the differentiability of a function and of its convex conjugate


I would like your help in understanding which assumptions are crucial in driving the claim reported below.


Let me start with the notation

  • $\mathcal{Y}\equiv \{1,2,...,M\}$.

  • $S$ is a random vector with support $\mathcal{S}$. $\mathcal{S}$ could be finite or not, countable or uncountable.

  • $\epsilon\equiv (\epsilon_1,...,\epsilon_M)$ is an $M\times 1$ random vector.

  • For each $s\in \mathcal{S}$, we denote by $G_{s}$ the probability distribution of $\epsilon$ conditional on $S=s$.

  • For each $s\in \mathcal{S}$, $\theta_s\equiv (\theta_{sy}\text{ }\forall y \in \mathcal{Y})$ is an $M\times 1$ vector of parameters belonging to $\Theta_s\subseteq \mathbb{R}^M$.

  • Given $s\in \mathcal{S}$ and $\theta_s\in \Theta_s$, consider the region $$ \mathcal{E}_y(s; \theta_s)\equiv \{e\in \mathbb{R}^M: \text{ }\theta_{sy}+e_y\geq \theta_{sz}+e_z \text{ }\forall z \in \mathcal{Y}\} $$

  • Given $s\in \mathcal{S}$ and $\theta_s\in \Theta_s$, define the $M\times 1$ vector $$ P^{\theta_s}_s\equiv \begin{pmatrix} G_{s}(\mathcal{E}_1(s; \theta_s))\\ ...\\ G_{s}(\mathcal{E}_M(s; \theta_s))\\ \end{pmatrix} $$

  • Given $s\in \mathcal{S}$, consider the function $H_s: \Theta_s\rightarrow \mathbb{R}$ prescribed by $$ H_s(\theta_s)= E_{G_{s} }\Big[\max_{y\in \mathcal{Y}} (\theta_{sy}+\epsilon_y)\Big]\hspace{1cm} \text{ $\forall \theta_s\in \Theta_s$ } $$ where $E_{G_s}[...]$ denotes the expectation of $[...]$ computed using the probability distribution $G_s$.
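To fix ideas, here is a minimal numerical sketch of the objects above in a hypothetical special case (not part of the question): $M=3$ and, conditional on $S=s$, the components of $\epsilon$ are i.i.d. standard Gumbel. In that case $H_s$ is known in closed form, $H_s(\theta_s)=\log\sum_y e^{\theta_{sy}}+\gamma$ with $\gamma$ the Euler–Mascheroni constant, which a Monte Carlo estimate of $E_{G_s}[\max_y(\theta_{sy}+\epsilon_y)]$ can be checked against.

```python
import numpy as np

# Hypothetical example (i.i.d. standard Gumbel case), NOT assumed in the question:
# H_s(theta) = log(sum_y exp(theta_y)) + gamma, gamma = Euler-Mascheroni constant.
rng = np.random.default_rng(0)
M = 3
theta = np.array([0.5, -0.2, 1.0])          # a fixed theta_s in Theta_s

draws = rng.gumbel(size=(500_000, M))       # Monte Carlo draws of epsilon ~ G_s
H_mc = (theta + draws).max(axis=1).mean()   # Monte Carlo estimate of H_s(theta_s)
H_closed = np.log(np.exp(theta).sum()) + np.euler_gamma

print(H_mc, H_closed)                       # the two estimates should be close
```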


Main assumption

(A1) The probability distribution of $\epsilon\equiv (\epsilon_1,...,\epsilon_{M})$ conditional on $S$ is known.

We also assume that, for every $s \in \mathcal{S}$, $G_{s}$ has full support and is absolutely continuous with respect to the Lebesgue measure.


Claim: Under assumption (A1), for each $s\in \mathcal{S}$ and for each $\theta_s\in \Theta_s$,

$$ \theta_s=\nabla_{q} H^*_s(q)\Big|_{q=P^{\theta_s}_s} $$ where $H^*_s: \mathbb{R}^M\rightarrow \mathbb{R}\cup\{+\infty\}$ is the convex conjugate of $H_s$, $\nabla_q$ denotes the $M\times 1$ vector of partial derivatives with respect to $q$, and $\nabla_{q} H^*_s(q)\Big|_{q=P^{\theta_s}_s}$ denotes that vector evaluated at $q=P^{\theta_s}_s$.


Attempted proof with questions highlighted

Fix $s\in \mathcal{S}$. The steps below hold for any $s\in \mathcal{S}$.

Step 0:

Since $G_s$ is absolutely continuous with respect to the Lebesgue measure, the values of $\epsilon$ at which $\arg\max_{y}(\theta_{sy}+\epsilon_y)$ is not unique form a zero-probability event. The regions $\mathcal{E}_1(s;\theta_s),...,\mathcal{E}_M(s;\theta_s)$ therefore cover $\mathbb{R}^M$ and overlap only on a $G_s$-null set, so summing the components of $P^{\theta_s}_{s}$ gives $1$.
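Continuing the hypothetical i.i.d. Gumbel sketch from above, we can estimate $P^{\theta_s}_s$ by the empirical frequency of each region $\mathcal{E}_y(s;\theta_s)$: since ties in the argmax never occur on the simulated draws (they are a null event under an absolutely continuous $G_s$), the frequencies sum exactly to one.

```python
import numpy as np

# Hypothetical i.i.d. Gumbel example (not assumed in the question):
# estimate P^theta_s by the empirical frequency of each region E_y(s; theta_s).
rng = np.random.default_rng(1)
M = 3
theta = np.array([0.5, -0.2, 1.0])
draws = rng.gumbel(size=(200_000, M))

winners = (theta + draws).argmax(axis=1)            # region each draw falls in
P_hat = np.bincount(winners, minlength=M) / len(draws)
print(P_hat, P_hat.sum())                           # frequencies sum exactly to 1
```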

Step 1: Note that $$ \frac{\partial H_s(\theta_s) }{\partial \theta_{sy}}=G_{s}(\mathcal{E}_y(s; \theta_s)) $$ where $\frac{\partial(...)}{\partial(...)}$ denotes the partial derivative.

Thus, we can write the relation $$ \nabla_{\theta_s} H_s(\theta_s)=P^{\theta_s}_{s} $$
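In the hypothetical i.i.d. Gumbel case used in the sketches above, this gradient relation can be checked directly: $H_s(\theta_s)=\log\sum_y e^{\theta_{sy}}+\gamma$ and $G_s(\mathcal{E}_y(s;\theta_s))$ is the softmax of $\theta_s$, so a central finite difference of $H_s$ should reproduce the choice probabilities.

```python
import numpy as np

# Hypothetical logit (i.i.d. Gumbel) example, not assumed in the question:
# check grad H_s(theta) = P^theta_s by finite differences of the closed form.
theta = np.array([0.5, -0.2, 1.0])
M = 3

def H(t):
    # closed form of H_s in the i.i.d. Gumbel case
    return np.log(np.exp(t).sum()) + np.euler_gamma

softmax = np.exp(theta) / np.exp(theta).sum()       # G_s(E_y(s; theta)) here

h = 1e-6
grad_fd = np.array([
    (H(theta + h * np.eye(M)[y]) - H(theta - h * np.eye(M)[y])) / (2 * h)
    for y in range(M)
])
print(grad_fd, softmax)                             # should agree
```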

First doubt: how do we know that the function $H_s$ is differentiable? Is this implied by (A1)?

Step 2:

Let $\mathcal{Q}\equiv \{q \in [0,1]^M \text{ s.t. } \sum_{y\in \mathcal{Y}}q_y=1\}$.

Consider the convex conjugate of $H_s$, that is, the function $H^*_s: \mathbb{R}^M \rightarrow \mathbb{R}\cup\{+\infty\}$ prescribed by
$$ H^*_s(q)=\begin{cases} \sup_{\theta_s \in \mathbb{R}^M}\Big[\sum_{y\in \mathcal{Y}} \theta_{sy} q_y - E_{G_{s} }\big(\max_{y\in \mathcal{Y}} (\theta_{sy}+\epsilon_y)\big) \Big] & \forall q \in \mathcal{Q}\\ +\infty & \forall q \in \mathbb{R}^M\setminus \mathcal{Q} \end{cases} $$

It holds that for any $a\in \Theta_s$ $$ \nabla_{\theta_s} H_s(\theta_s)\Big|_{\theta_s=a}=P^a_{s} \Leftrightarrow \nabla_{q}H^*_s(q)\Big|_{q=P^a_s}=a $$

Second doubt: does this obviously hold under (A1)? How do we know that the convex conjugate is differentiable?
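As a concrete illustration (again in the hypothetical i.i.d. Gumbel case, not assumed in the question), the conjugate is known in closed form on the simplex, $H^*_s(q)=\sum_y q_y\log q_y-\gamma$, so an unconstrained gradient is $\nabla_q H^*_s(q)_y=\log q_y+1$. Note one caveat: since $H_s(\theta_s+c\mathbf{1})=H_s(\theta_s)+c$, $\theta_s$ can only be recovered up to an additive constant, so below we compare differences $\theta_{sy}-\theta_{s1}$.

```python
import numpy as np

# Hypothetical logit example: gradient of the closed-form conjugate at
# q = P^theta_s recovers theta up to an additive constant.
theta = np.array([0.5, -0.2, 1.0])
P = np.exp(theta) / np.exp(theta).sum()    # P^theta_s = softmax(theta) here

grad_conj = np.log(P) + 1.0                # unconstrained gradient of H*_s at P
# Compare pairwise differences (theta is identified only up to a constant):
print(grad_conj - grad_conj[0], theta - theta[0])
```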

Step 3:

It follows that, $\forall \theta_s\in \Theta_s$, $$ \theta_s=\nabla_{q} H^*_s(q)\Big|_{q=P^{\theta_s}_{s}} $$ which concludes our proof.
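The whole claim can be sketched end-to-end in the hypothetical i.i.d. Gumbel example used above: map $\theta_s$ to a Monte Carlo estimate of $P^{\theta_s}_s$, evaluate the closed-form conjugate gradient there, and recover $\theta_s$ (up to the additive-constant normalisation, here $\theta_{s1}=0$).

```python
import numpy as np

# Round-trip check of the claim in the hypothetical logit example:
# theta -> P^theta_s (Monte Carlo) -> grad H*_s at P -> theta, up to a constant.
rng = np.random.default_rng(2)
M = 3
theta = np.array([0.0, -0.7, 0.5])                  # normalised: theta_1 = 0
draws = rng.gumbel(size=(1_000_000, M))

winners = (theta + draws).argmax(axis=1)
P_hat = np.bincount(winners, minlength=M) / len(draws)

theta_rec = np.log(P_hat)                           # grad H*_s up to a constant
theta_rec = theta_rec - theta_rec[0]                # impose theta_1 = 0
print(theta, theta_rec)                             # should nearly coincide
```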