Let $\beta^* \in\mathbb R^p$ be a random variable with density $c\exp(-f(\beta))$ and let $\beta (\lambda) \in\mathbb R^p$ be a random variable with density $c_1\exp(-f(\beta) - \lambda \|\beta\|_2^2/2)$, for appropriate normalizing constants $c$ and $c_1$, with the two random variables independent of each other. Also, let $E[\|\beta^*\|^2_2] = \alpha$ and $E[\beta^*] = \delta$. Let $f$ be $m$-strongly convex and $L$-Lipschitz smooth. I am trying to find a strong upper bound on the expectation of $\|\beta^* - \beta(\lambda) \|_2^2$ in terms of $\lambda, \delta, \alpha, L, m$ and $p$: \begin{align*} E \left[ \|\beta^* - \beta(\lambda) \|_2^2 \right] \end{align*}
2026-04-06 13:04:03
Bounding expectation
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail)
There is 1 best solution below
To begin, we need some connection between $f$ and $(\alpha,\delta)$. Since $f$ is $m$-strongly convex, it has a unique minimizer, say $\beta_0$. Writing $c_0$ for the question's normalizing constant $c$, and changing $c_0$ as necessary, I assume $f(\beta_0)=0$. Moreover, as $\beta$ moves away from $\beta_0$, $f$ grows at least quadratically, $f(\beta)\geq\frac{m}{2}\|\beta-\beta_0\|_2^2$, and so the corresponding pdf decays at least as fast as the normal. (That's already very fast; see Putanumonit's discussion, but take his soccer conclusions with a grain of salt.)
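Thus we estimate \begin{align*} \mathbb{E}[\|\beta^*-\beta_0\|_2^2]&=\int_{\mathbb{R}^p}{\|\beta-\beta_0\|_2^2\cdot c_0e^{-f(\beta)}\,d^p\beta} \\ &\leq\int_{\mathbb{R}^p}{c_0\|\beta-\beta_0\|_2^2e^{-\frac{m}{2}\|\beta-\beta_0\|_2^2}\,d^p\beta} \\ &=c_0\cdot \text{vol}(S^{p-1})\int_0^{\infty}{r^2e^{-\frac{m}{2}r^2}\cdot r^{p-1}\,dr} \\ &=c_0\cdot\frac{2\pi^{\frac{p}{2}}}{\Gamma\left(\frac{p}{2}\right)}\cdot\frac{1}{2}\left(\frac{2}{m}\right)^{\frac{p}{2}+1}\Gamma\left(\frac{p}{2}+1\right) \\ &=\frac{c_0p}{2\pi}\left(\frac{2\pi}{m}\right)^{\frac{p}{2}+1} \end{align*} where $\text{vol}(S^{p-1})$ is the surface area of the unit sphere in $\mathbb{R}^p$ and $\Gamma$ is the Gamma function. To remove $c_0$: since $f$ is $L$-smooth with $\nabla f(\beta_0)=0$ and $f(\beta_0)=0$, we have $f(\beta)\leq\frac{L}{2}\|\beta-\beta_0\|_2^2$, so $\int e^{-f}\geq\left(\frac{2\pi}{L}\right)^{\frac{p}{2}}$ and hence $c_0\leq\left(\frac{L}{2\pi}\right)^{\frac{p}{2}}$; the bound above is then at most $\frac{p}{m}\left(\frac{L}{m}\right)^{\frac{p}{2}}$. By Jensen's inequality, $\|\delta-\beta_0\|_2^2\leq\mathbb{E}[\|\beta^*-\beta_0\|_2^2]$, so $\delta$ and $\beta_0$ roughly coincide whenever this bound is small.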
(Essentially, we just performed Laplace's method.)
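The closed form above can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: it compares a midpoint-rule estimate of $\text{vol}(S^{p-1})\int_0^\infty r^{p+1}e^{-\frac{m}{2}r^2}\,dr$ with $\frac{p}{2\pi}\left(\frac{2\pi}{m}\right)^{\frac{p}{2}+1}$. The function names, the integration cutoff, and the step count are my own choices.

```python
import math

def sphere_area(p):
    """Surface area of the unit sphere S^{p-1} in R^p."""
    return 2.0 * math.pi ** (p / 2) / math.gamma(p / 2)

def radial_integral(p, m, steps=50_000):
    """Midpoint-rule estimate of the integral of r^(p+1) * exp(-m r^2 / 2) on [0, upper].

    The cutoff `upper` is an assumption: it is chosen far enough into the
    Gaussian tail that the truncated part is negligible.
    """
    upper = 12.0 * math.sqrt((p + 2) / m)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r ** (p + 1) * math.exp(-0.5 * m * r * r)
    return h * total

def closed_form(p, m):
    """The factor multiplying c_0 in the bound: p/(2 pi) * (2 pi / m)^(p/2 + 1)."""
    return p / (2 * math.pi) * (2 * math.pi / m) ** (p / 2 + 1)

if __name__ == "__main__":
    for p, m in [(1, 1.0), (3, 2.0), (5, 0.5)]:
        numeric = sphere_area(p) * radial_integral(p, m)
        print(p, m, numeric, closed_form(p, m))
```

For an isotropic Gaussian, $f(\beta)=\frac{m}{2}\|\beta-\beta_0\|_2^2$ and $c_0=\left(\frac{m}{2\pi}\right)^{\frac{p}{2}}$, so $c_0$ times the closed form is exactly $p/m$, which is the true value of $\mathbb{E}[\|\beta^*-\beta_0\|_2^2]$. The estimate is tight in that case.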
We want to do the same with $\beta(\lambda)$, since then the difference is $$\beta(\lambda)-\beta^*=\beta(\lambda)-\beta_0+\beta_0-\beta^*$$ But the mean is not quite so straightforward. (I take the penalty to be $\lambda\|\beta\|_2^2$; for the question's $\lambda\|\beta\|_2^2/2$, replace $\lambda$ by $\lambda/2$ throughout.) If we approximate $\beta(\lambda)$ as a normal, then it is centered around $\left(\frac{m}{m+2\lambda}\right)\beta_0$: $$\frac{m}{2}\|\beta-\beta_0\|_2^2+\lambda\|\beta\|_2^2=\left(\frac{m}{2}+\lambda\right)\left\|\beta-\left(\frac{m}{m+2\lambda}\right)\beta_0\right\|_2^2+\frac{m}{2}\left(1-\frac{m}{m+2\lambda}\right)\|\beta_0\|_2^2$$ just from completing the square; the constant term simplifies to $\frac{m\lambda}{m+2\lambda}\|\beta_0\|_2^2$. As before, I will absorb the constant term into $c_1$, so that our bound is $$\text{pdf}_{\beta(\lambda)}(\beta)\leq c_1e^{-\left(\frac{m}{2}+\lambda\right)\left\|\beta-\frac{m}{m+2\lambda}\beta_0\right\|_2^2}$$
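Completing the square is easy to get wrong by a constant, so here is a small numeric check. It is a sketch under my own naming: it evaluates $\frac{m}{2}\|\beta-\beta_0\|_2^2+\lambda\|\beta\|_2^2$ directly and compares it with the completed-square form, with the constant term written as $\frac{m\lambda}{m+2\lambda}\|\beta_0\|_2^2$ (obtained by expanding both sides).

```python
import random

def sq_norm(v):
    """Squared Euclidean norm of a list of floats."""
    return sum(x * x for x in v)

def direct_form(beta, beta0, m, lam):
    """(m/2)||beta - beta0||^2 + lam ||beta||^2, evaluated directly."""
    diff = [b - b0 for b, b0 in zip(beta, beta0)]
    return 0.5 * m * sq_norm(diff) + lam * sq_norm(beta)

def completed_square(beta, beta0, m, lam):
    """Same quantity, recentred at (m / (m + 2 lam)) * beta0 plus a constant."""
    shrink = m / (m + 2 * lam)
    diff = [b - shrink * b0 for b, b0 in zip(beta, beta0)]
    constant = m * lam / (m + 2 * lam) * sq_norm(beta0)
    return (0.5 * m + lam) * sq_norm(diff) + constant

if __name__ == "__main__":
    rng = random.Random(0)
    beta = [rng.gauss(0, 1) for _ in range(4)]
    beta0 = [rng.gauss(0, 1) for _ in range(4)]
    print(direct_form(beta, beta0, 2.0, 0.7), completed_square(beta, beta0, 2.0, 0.7))
```

The two printed values should agree up to floating-point rounding for any choice of $\beta$, $\beta_0$, $m>0$ and $\lambda\geq0$.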
Once I have that estimate, though, the exact same argument goes through, with $\frac{m}{2}$ replaced by $\frac{m}{2}+\lambda$: $$\mathbb{E}\left[\left\|\beta(\lambda)-\frac{m}{m+2\lambda}\beta_0\right\|_2^2\right]\leq\frac{c_1p}{2\pi}\left(\frac{2\pi}{m+2\lambda}\right)^{\frac{p}{2}+1}$$
Now, by the inequality $$\|a+b+c\|^2\leq3(\|a\|^2+\|b\|^2+\|c\|^2)$$ (true in any inner product space), we have \begin{align*} \mathbb{E}[\|\beta^*-\beta(\lambda)\|_2^2]&=\mathbb{E}\left[\left\|\left(\beta^*-\beta_0\right)+\frac{2\lambda}{m+2\lambda}\beta_0+\left(\left(\frac{m}{m+2\lambda}\right)\beta_0-\beta(\lambda)\right)\right\|_2^2\right] \\ &\leq3\left(\mathbb{E}[\|\beta^*-\beta_0\|_2^2]+\frac{2\lambda}{m+2\lambda}\|\beta_0\|_2^2+\mathbb{E}\left[\left\|\left(\frac{m}{m+2\lambda}\right)\beta_0-\beta(\lambda)\right\|_2^2\right]\right) \\ &\leq3\left(\frac{c_0p}{2\pi}\left(\frac{2\pi}{m}\right)^{\frac{p}{2}+1}+\frac{2\lambda}{m+2\lambda}\|\beta_0\|_2^2+\frac{c_1p}{2\pi}\left(\frac{2\pi}{m+2\lambda}\right)^{\frac{p}{2}+1}\right) \end{align*} where the middle term uses $\left(\frac{2\lambda}{m+2\lambda}\right)^2\leq\frac{2\lambda}{m+2\lambda}$. If you want to sharpen this, you can keep the squared coefficient and calculate out the cross terms. But I don't think they'll be leading-order.
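The three-term inequality used here follows from Cauchy-Schwarz applied to $(1,1,1)$, and it can be spot-checked on random vectors. The snippet below is only an illustrative sketch with names of my choosing; it returns the gap $3(\|a\|^2+\|b\|^2+\|c\|^2)-\|a+b+c\|^2$, which should never be negative and is zero when $a=b=c$.

```python
import random

def sq_norm(v):
    """Squared Euclidean norm of a list of floats."""
    return sum(x * x for x in v)

def three_term_gap(a, b, c):
    """Return 3(|a|^2 + |b|^2 + |c|^2) - |a + b + c|^2."""
    total = [x + y + z for x, y, z in zip(a, b, c)]
    return 3 * (sq_norm(a) + sq_norm(b) + sq_norm(c)) - sq_norm(total)

if __name__ == "__main__":
    rng = random.Random(1)
    vecs = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(3)]
    print(three_term_gap(*vecs))            # nonnegative
    print(three_term_gap(vecs[0], vecs[0], vecs[0]))  # equality case, about 0
```

The equality case shows that the factor $3$ cannot be improved in general, which is why sharpening the bound requires handling the cross terms instead.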
In any case, we're almost done. The only term we haven't computed in terms of our original parameters is $\|\beta_0\|_2^2$. Well, \begin{align*} \left|\|\beta_0\|_2^2-\|\beta^*\|_2^2\right|&=\left|\|\beta^*+(\beta_0-\beta^*)\|_2^2-\|\beta^*\|_2^2\right| \\ &=\left|\|\beta^*-\beta_0\|_2^2+2\langle\beta^*,\beta_0-\beta^*\rangle\right| \\ &\leq\|\beta^*-\beta_0\|_2^2+2\|\beta^*\|_2\|\beta_0-\beta^*\|_2 \end{align*} where the last line is by the triangle inequality and Cauchy-Schwarz. Taking expectations, and using $\left|\mathbb{E}[X]\right|\leq\mathbb{E}[|X|]$ on the left, we have $$\left|\|\beta_0\|_2^2-\alpha\right|\leq\frac{c_0p}{2\pi}\left(\frac{2\pi}{m}\right)^{\frac{p}{2}+1}+2\mathbb{E}\left[\|\beta^*\|_2\|\beta_0-\beta^*\|_2\right]$$ Thus it suffices to control both terms on the right. The two factors inside the last expectation are both functions of $\beta^*$, so the expectation does not split as a product of expectations. Instead, by Cauchy-Schwarz for expectations, for any nonnegative random variables $R$ and $S$ we have $\mathbb{E}[RS]\leq\sqrt{\mathbb{E}[R^2]}\sqrt{\mathbb{E}[S^2]}$. Thus $$\left|\|\beta_0\|_2^2-\alpha\right|\leq\frac{c_0p}{2\pi}\left(\frac{2\pi}{m}\right)^{\frac{p}{2}+1}+2\sqrt{\alpha}\sqrt{\frac{c_0p}{2\pi}\left(\frac{2\pi}{m}\right)^{\frac{p}{2}+1}}$$ In particular, $\|\beta_0\|_2^2\leq\left(\sqrt{\alpha}+\sqrt{\frac{c_0p}{2\pi}\left(\frac{2\pi}{m}\right)^{\frac{p}{2}+1}}\right)^2$.
Putting it all together, \begin{align*} \mathbb{E}[\|\beta^*-\beta(\lambda)\|_2^2]&\leq\frac{3p}{2\pi}\left(\frac{2\pi}{m}\right)^{\frac{p}{2}+1}\left(c_0+c_1\left(\frac{m}{m+2\lambda}\right)^{\frac{p}{2}+1}\right)+{} \\ &\qquad\frac{6\lambda}{m+2\lambda}\left(\sqrt{\alpha}+\sqrt{\frac{c_0p}{2\pi}\left(\frac{2\pi}{m}\right)^{\frac{p}{2}+1}}\right)^2 \end{align*}
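As a sanity check on the final formula, one can compare it with a case where everything is known exactly. The sketch below assumes an isotropic Gaussian test case, $f(\beta)=\frac{m}{2}\|\beta-\beta_0\|_2^2$, and the answer's convention that the penalty is $\lambda\|\beta\|_2^2$. Then $\beta^*\sim N(\beta_0,I/m)$, $\beta(\lambda)\sim N\left(\frac{m}{m+2\lambda}\beta_0,\frac{I}{m+2\lambda}\right)$, $c_0=\left(\frac{m}{2\pi}\right)^{\frac{p}{2}}$, $c_1=\left(\frac{m+2\lambda}{2\pi}\right)^{\frac{p}{2}}$ (after absorbing the constant term), and $\alpha=\|\beta_0\|_2^2+p/m$. All function names are hypothetical.

```python
import math

def laplace_term(c, p, rate):
    """c * p/(2 pi) * (2 pi / rate)^(p/2 + 1): the second-moment bound about the centre."""
    return c * p / (2 * math.pi) * (2 * math.pi / rate) ** (p / 2 + 1)

def final_bound(p, m, lam, alpha, c0, c1):
    """The final bound on E||beta* - beta(lam)||^2 stated above."""
    a_term = laplace_term(c0, p, m)
    b_term = laplace_term(c1, p, m + 2 * lam)
    shift = 6 * lam / (m + 2 * lam) * (math.sqrt(alpha) + math.sqrt(a_term)) ** 2
    return 3 * (a_term + b_term) + shift

def gaussian_inputs(p, m, lam, beta0_sq):
    """alpha, c0, c1 for the assumed Gaussian test case (not the general setting)."""
    c0 = (m / (2 * math.pi)) ** (p / 2)
    c1 = ((m + 2 * lam) / (2 * math.pi)) ** (p / 2)
    alpha = beta0_sq + p / m
    return alpha, c0, c1

def gaussian_exact(p, m, lam, beta0_sq):
    """Exact E||beta* - beta(lam)||^2 in the Gaussian test case."""
    gap = 2 * lam / (m + 2 * lam)
    return p / m + p / (m + 2 * lam) + gap ** 2 * beta0_sq

if __name__ == "__main__":
    p, m, lam, beta0_sq = 3, 2.0, 0.5, 4.0
    alpha, c0, c1 = gaussian_inputs(p, m, lam, beta0_sq)
    print(gaussian_exact(p, m, lam, beta0_sq), final_bound(p, m, lam, alpha, c0, c1))
```

In this test case the bound always dominates the exact value, and at $\lambda=0$ it equals $6p/m$ against an exact value of $2p/m$, so the factor of $3$ from the three-term inequality is the main source of slack.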