Unconstrained minimizer as a linear combination

100 Views Asked by At

I'm given the following function:

$$f : R^n \rightarrow R$$

$$f(x)=\frac{||Ax-b||^2}{c^Tx+d}$$ where x $\in R^n$ and $dom(f) = \{x|c^Tx+d > 0\}$

Also it's given that $rank(A)=n$ and vector b is not in the range of A.

I'm asked to show: a. $f>0$ $\forall x \in dom(f)$

b. to find explicitly $x_{ls} =argmin_{x} ||Ax-b||^2$

c. show that the $x^*$, the minimizer of $f$, is a linear combination of $x_{ls}$ and $x_{1} \overset{\Delta}{=} (A^TA)^{-1}c$, that is, show that exists $t \in R$ such that $x^{*} = x_{ls} + tx_{1}$

d. show that t is the solution of the following quadratic equation: $$t^2c^Tx_1 + 2t(c^Tx_{ls}+d) - ||Ax_{ls}-b||^2 = 0$$


Here's what I did:

first, since $x \in R^n$ and $rank(A) = n$, it follows that $A \in R^{mxn}, m\geq n$ and $b \in R^m$

A.

$f's$ denominator is strictly positive in it's entire domain. The numerator is a norm that is by definition non-negative. The norm is zero iff $Ax=b$ but it is given that $b \notin Range(A)$ so there's no $x \in dom(f)$ s.t. $Ax=b$, so the numerator is strictly positive, thus $f > 0$

B. $$||Ax-b||^2 = <Ax-b,Ax-b> = (Ax-b)^T(Ax-b) = x^TA^TAx -2b^TAx+b^Tb$$

To find the minimum, I define $g(x) = x^TA^TAx -2b^TAx+b^Tb$, find the gradient and equate to zero.

$$\nabla g(x) = 2A^TAx - 2A^Tb$$

if $\nabla g(x_{ls}) = 0$ then $2A^TAx_{ls} - 2A^Tb = 0$ then $x_{ls} = (A^TA)^{-1}A^Tb$

C.

I proceed similarly. I define $h(x)=(c^Tx+d)^{-1}$

now, $f = g\cdot h$, so by the product rule $$\nabla f(x) = \nabla(g \cdot h)(x) = \nabla g(x) \cdot h + g \cdot \nabla h(x)$$

As in part b, $\nabla g(x) = 2A^TAx - 2A^Tb$. Now $\nabla h(x) = \frac{-c}{(c^Tx+d)^2}$

By substitution, $$\nabla g(x) \cdot h + g \cdot \nabla h(x) = \frac{2A^TAx - 2A^Tb}{c^Tx+d} - \frac{(x^TA^TAx -2b^TAx+b^Tb)\cdot c}{(c^Tx+d)^2} \\ = \nabla g(x)\cdot h(x) - f(x)\cdot h(x)\cdot c$$

Dividing both sides by the positive scalar $h(x)$, I get:

$$\nabla g(x) = f(x)\cdot c$$

The minimizer $x^*$ must satisfy $\nabla g(x^*) = f(x^*)\cdot c$

So,

$$2A^TAx^*-2A^tb = f(x^*)\cdot c$$

Multiplying both sides by $\frac{1}{2}(A^TA)^{-1}$

$$\frac{1}{2}(A^TA)^{-1} \cdot 2A^TAx^* =\frac{1}{2}(A^TA)^{-1}\cdot ( 2A^tb + f(x^*)\cdot c)$$

$$x = x_{ls} + \frac{1}{2}(A^TA)^{-1}\cdot f(x^*)\cdot c$$

$$x = x_{ls} + \frac{f(x^*)}{2}\cdot x_{1}$$

with $t = \frac{f(x^*)}{2} \in R$

D. Now, I tried plugging the value for t that I got into the provided equation but the value does not satisfy the equation. I suspect I got the wrong value for t, but I can't find a flaw in the solution above.

Please help!

1

There are 1 best solutions below

1
On BEST ANSWER

You did a great job deriving it all the way through c, it's flawless!

The only missing step for d is to notice this fact about quadratic functions. $$ g(x)=\|Ax-b\|_2^2 $$ is a quadratic function, i.e. a parabola centered at its global minimum which is $x_{ls}$, then it could be simplified by centering the coordinate at its minimum: it becomes a constant term + a quadratic form of $\Delta x$ $$ g(x)=\tilde g(\Delta x)=\|A(x_{ls}+\Delta x)-b\|_2^2\\ =\|Ax_{ls}-b\|_2^2+\Delta x A^TA\Delta x $$


How this fact helps us? we start with the conclusion of c. which is a fixed point equation about $x^*$. $$ x^*=x_{ls}+\frac{f(x^*)}{2}x_1\;,t\triangleq \frac{f(x^*)}{2} $$ Turn this into an equation of $t$ $$ 2t = f(x^*)=\frac{\|Ax^*-b\|^2}{c^Tx^*+d}\\ 2t(c^T(x_{ls}+tx_1)+d)=\|A(x_{ls}+tx_1)-b\|^2\\ 2c^Tx_1t^2+2c^T(x_{ls}+d)t=\|A(x_{ls}+tx_1)-b\|^2 $$ Look at the RHS, do you recognize sth similar? Yes, we can take $tx_1$ as the $\Delta x$ above and use the fact that $x_{ls}$ is the global minimum of RHS. $$ RHS=\|Ax_{ls}-b\|_2^2+(tx_1)^T A^TA\Delta (tx_1)\\ =\|Ax_{ls}-b\|_2^2+t^2x_1^TA^TA(A^TA)^{-1}c\\ =\|Ax_{ls}-b\|_2^2+t^2c^Tx_1\\ $$ Taking RHS back to the equation, we get $$ 2c^Tx_1t^2+2c^T(x_{ls}+d)t=\|Ax_{ls}-b\|_2^2+t^2c^Tx_1\\ c^Tx_1t^2+2c^T(x_{ls}+d)t-\|Ax_{ls}-b\|_2^2=0 $$

QED.


Really like the design of this problem! where does it come from?