Suppose we have $N$ independent and identically distributed (i.i.d.) samples $X_1,\cdots,X_N$ generated from a sub-Gaussian random variable $X \sim \mathbb{P}$. Then by definition there exists a smallest $\sigma >0$ (the optimal variance-proxy) such that $$\mathbb{E}[\exp(\theta (X - \mathbb{E}[X]))] \le \exp(\theta^2 \sigma^2/2) ~~ \forall \theta \in \mathbb{R}.$$ My question is: how can we estimate such a $\sigma$ from the $N$ i.i.d. samples, and what statistical properties can we expect from the estimator?
My attempt (a natural idea) is to apply the definition to the empirical distribution. Let $\bar X$ be the sample mean and solve the following optimization problems $$\hat \sigma_1 := \sup_{\theta >0} \frac{\sqrt{2\ln (\frac{1}{N} \sum_{i=1}^N e^{\theta(X_i - \bar X)})}}{\theta}$$ and $$\hat \sigma_2 := \sup_{\theta >0} \frac{\sqrt{2\ln (\frac{1}{N} \sum_{i=1}^N e^{\theta(\bar X - X_i)})}}{\theta}$$ and then take the maximum of the two, i.e., let $\hat \sigma := \max(\hat \sigma_1, \hat \sigma_2)$. In this way, we obtain the smallest variance-proxy for the empirical distribution. However, these optimization problems have no closed-form solution, so it is difficult to establish properties such as unbiasedness. Is there any other method for estimating the variance-proxy from the data with nice statistical properties? Any reference paper or note would be appreciated. Thank you!
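For concreteness, here is a numerical sketch of the plug-in estimator above (my own illustration, not from a reference). It approximates the supremum over $\theta > 0$ by a grid search on a log-spaced grid, and computes the log of the empirical MGF with a logsumexp-style shift for numerical stability. Note that by Jensen's inequality the log empirical MGF of the centered samples is nonnegative, so the square root is well defined, and as $\theta \to 0^+$ the objective tends to the sample standard deviation.

```python
import numpy as np

def varproxy_hat(x, thetas=None):
    """Plug-in variance-proxy estimator: max over the two signs of
    sup_{theta>0} sqrt(2 * log((1/N) * sum_i exp(theta*(x_i - xbar)))) / theta,
    with the sup approximated by a grid search over theta."""
    x = np.asarray(x, dtype=float)
    y = x - x.mean()                       # centered samples
    if thetas is None:
        thetas = np.logspace(-3, 3, 500)   # grid over theta > 0
    best = 0.0
    for sign in (1.0, -1.0):               # sign=+1 gives sigma_hat_1, sign=-1 gives sigma_hat_2
        z = sign * np.outer(thetas, y)     # shape (n_theta, N)
        m = z.max(axis=1, keepdims=True)   # shift for stable log-mean-exp
        log_mgf = m.squeeze(1) + np.log(np.exp(z - m).mean(axis=1))
        # clip tiny negative values caused by floating-point round-off
        vals = np.sqrt(2.0 * np.maximum(log_mgf, 0.0)) / thetas
        best = max(best, vals.max())
    return best

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=10_000)      # for N(0,1) the true variance-proxy is 1
print(varproxy_hat(x))                     # close to 1 for Gaussian data
```

One caveat visible even in this toy experiment: for a bounded empirical distribution the objective vanishes as $\theta \to \infty$, so the supremum is attained at a finite $\theta$, but the value at moderate $\theta$ is driven by the sample extremes, which suggests the estimator can be quite sensitive to tail observations.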