Minimizing sum of square of three distances under monotone transformation -- is the solution unique?

33 Views Asked by At

Let $f:[0,1]\to[0,1]$ be a differentiable, strictly increasing, and concave function with $f(0)=0,$ $f(1)=1$, and $f'(0)<\infty$. Does the function $$ F(x,y) = \alpha (x-y)^2 + f^2(x) + \big(1-f(y)\big)^2 $$ have unique minimum on $[0,1]^2$ for any $\alpha>0$?

Minimizing $F$ involves a compromise between minimizing the squared distance between $x$ and $y$, and minimizing the squared distance of the images under the function $f$ of the point $x$ from 0, and $y$ from 1.

Observations:

  • The function $F$ is not necessarily convex. For instance, consider $f(x)= \frac{3x-x^3}{2}$ and $\alpha = 1$; then $F_{xx}(1,0) = -4$.
  • Any local minimum is at an interior point, because $F_x(0,y) = -2\alpha y < 0$ for any $y>0$; $F_x(1,y) = 2\alpha (1-y) + 2f'(1) > 0$ for any $y<1$; $F_y(x,0) = -2\alpha x - f'(0) < 0$ for any $x\geq 0$; and $F_y(x,1) = 2\alpha (1-x) > 0$ for any $x<1$. Thus it would be sufficient to show that there is unique interior point at which the first-order conditions $F_x(x,y)=F_y(x,y)=0$ are satisfied.

Attempt for solution:

The first-order conditions are: $$ \begin{align} \alpha(y-x) = f(x)f'(x) \tag{1} \\ \alpha(y-x) = \big(1-f(y)\big)f'(y). \tag{2} \end{align} $$

It turns out that there can be multiple points $(x,y)$ satisfying the first order conditions, our hope is that it is possible to exclude the possibility that $F$ would have the same value at both points. Consider that for a given $f$ there are two distinct points $(x_1,y_1)$ and $(x_2,y_2)$ that solve equations (1) and (2). Notice that if $y_1=y_2$, then by (2) also $x_1=x_2$, so the points wont be distinct. Thus, we can WLOG assume that $y_1<y_2$. Since both $y\mapsto 1-f(y)$ and $y\mapsto f'(y)$ are decreasing and positive, $y\mapsto \big(1-f(y)\big)f'(y)$ is decreasing to, and so $$y_1-x_1 > y_2-x_2.$$ \ldots

1

There are 1 best solutions below

0
On

It turns out that there actually can be multiple distinct global minima of the function $F$. I wrote a Matlab program that:

  1. Guesses two points $[x_1,y_1]$ and $[x_2,y_2]$ such that $x_1<x_2<y_1<y_2$ and $y_1-x_1 > y_2-x_2$.
  2. Guesses the function values $f(x_1),f(y_1)$ and $f(x_2),f(y_2)$ at these points consistent with that $f(0)=0, f(1)=1$ and that $f$ is concave.
  3. Finds the value of $\alpha$ which ensures that $F(x_1,y_1) = F(x_2,y_2)$ by solving a linear equation. If the resulting $\alpha$ is negative, guess again.
  4. Calculate the derivatives $f'(x_1), f'(y_2), f'(y_1), f'(y_2)$ from the first-order equations.
  5. Check if these derivatives are consistent with the concavity of $f$. If not, guess again.
  6. Now the points $[x_1,y_1]$ and $[x_2,y_2]$ are local minima of $F$ with equal function values. It remains to find a function $f$ such that those points are also global minima. We construct $f$ such that its graph consists of broken lines passing through the given points with desired slope. We discretize the space of $[x,y]\in [0,1]^2$, and find the minimum $m$ of $F$ over all the grid points. If $F(x_1,y_1)\leq m$, it means that we found the desired contra-example.

For example: $$ [x1, x2, y1, y2] = [0.05, 0.15, 0.25, 0.3], $$ $$ [f(x_1), f(x_2), f(y_1), f(y_2) = [0.17, 0.34, 0.45, 0.49], $$ and $$ f'(x_1), f'(x_2), f'(y_1), f'(y_2) \ \dot{=} \ [2.9782, 1.1168, 0.9205, 0.7445], $$ together with $\alpha = 2.5314$. Illustrated in the graphs below.

Graph of <span class=$f(x)$" />

Graph of the function <span class=$F(x,y)$" />

Zoomed in

Here is the Matlab code (the guessing procedure is not optimized, however despite for using brute-force it finds a solution within minutes):

function unique_minimum()
  clc
  clear
  
  step_x = 0.05; % coasness of the grid of x and y values
  step_f = 0.01; % coasness of the grid of f and g values

  h = waitbar(0,'Please wait...');
  max_iter = 5e6;

  % counters of how many guesses satisfied certain conditions
  df1_success = 0;
  df2_success = 0;
  dg1_success = 0;
  dg2_success = 0;

  success = false; % indicator of finding a solution
  for i = 1:max_iter
    if mod(i,1e4) == 0
      waitbar(i/max_iter,h,...
        sprintf(['Rate of success: ' ... 
        'df1 ... %.2f, '...
        'df2 ... %.2f, '...
        'dg1 ... %.6f, '...
        'dg2 ... %.9f\n'],...
        df1_success/i, df2_success/i,...
        dg1_success/i, dg2_success/i));
    end
    % generate two points [x,y] such that
    % x1 < x2 < y1 < y2 
    % and y2-x2 < y1-x1 
    x1 = rnd(0,1,1,6,step_x);
    y1 = rnd(x1,1,3,2,step_x);
    x2 = rnd(x1,y1,2,1,step_x);
    y2 = rnd(y1,min(1,y1+x2-x1),1,1,step_x);
    if isnan(y2), error('Unexpected problem'), end

    % function value f(x1)
    f1 = rnd(x1,1,1,3,step_f);
    if isnan(f1), continue; end

    % function value f(x2) 
    f2_min = interp1([x1 1],[f1 1],x2);
    f2_max = min(1,interp1(...
      [0 x1],[0 f1],x2,'linear','extrap'));
    f2 = rnd(f2_min,f2_max,0,2,step_f);
    if isnan(f2), continue; end

    % function value f(y1)
    g1_min = interp1([x2 1],[f2 1],y1);
    g1_max = min(1,interp1(...
      [x1 x2],[f1 f2],y1,'linear','extrap'));
    g1 = rnd(g1_min,g1_max,0,1,step_f);
    if isnan(g1), continue; end

    % function value f(y2)
    g2_min = interp1([y1 1],[g1 1],y2);
    g2_max = min(1,interp1(...
      [x2 y1],[f2 g1], y2,'linear','extrap'));
    g2 = rnd(g2_min,g2_max,0,0,step_f);
    if isnan(g2), continue; end

    alp = ((f1^2 + (1-g1)^2) - (f2^2 + (1-g2)^2)) / ...
      ( (x2-y2)^2 - (x1-y1)^2 );
    if alp <= 0, continue, end

    kxy1 = (1-g1) / f1;
    kxy2 = (1-g2) / f2;
    k12 = (y1-x1) / (y2-x2) * f2/f1;

    df1 = alp * (y1-x1) / f1;
    df2 = df1 / k12;
    dg1 = df1 / kxy1;
    dg2 = df2 / kxy2;

    df1_min = (f2-f1) / (x2-x1); 
    df1_max = f1 / x1;
    df2_min = (g1-f2) / (y1-x2); 
    df2_max = (f2-f1) / (x2-x1);
    dg1_min = (g2-g1) / (y2-y1);
    dg1_max = (g1-f2) / (y1-x2);
    dg2_min = (1-g2) / (1-y2);
    dg2_max = (g2-g1) / (y2-y1);
  
    if df1 < df1_min || df1 > df1_max
      continue
    end
    df1_success = df1_success + 1;

    if df2 < df2_min || df2 > df2_max
      continue
    end
    df2_success = df2_success + 1;
    
    if dg1 < dg1_min || dg1 > dg1_max
      continue
    end
    dg1_success = dg1_success + 1;

    if dg2 < dg2_min || dg2 > dg2_max
      continue
    end
    dg2_success = dg2_success + 1;

    xx = [0 x1 x2 y1 y2 1];
    ff = [0 f1 f2 g1 g2 1];
    dff = [2*df1 df1 df2 dg1 dg2 dg2/2];
    f = @(t) arrayfun (interp_slope(xx,ff, dff),t);
    F = @(x,y) alp * (x-y).^2 + f(x).^2 + (1-f(y)).^2;
    
    N=500;
    [xxx, yyy] = meshgrid(linspace(0,1,N),linspace(0,1,N));
    VV = F(xxx,yyy);
    M =  min(VV(:));
    if F(x1,y1)<=M && F(x2,y2) <= M
      success = true;
      break
    end
  end
  close(h);
  
  if success
    i
    disp('x1, x2, y1, y2')
    disp([x1 x2 y1 y2])
    disp('f1, f2, g1, g2')
    disp([f1 f2 g1 g2])
    disp('df1, df2, dg1, dg2')
    disp([df1 df2 dg1 dg2])
     % [f1/x1, df1, (f2-f1)/(x2-x1), (g1-f2)/(y1-x2), ...
     %   (g2-g1)/(y2-y1), (1-g2)/(1-y2)]
     % [f1/x1, df1, (f2-f1)/(x2-x1), df2, (g1-f2)/(y1-x2), ...
     %   dg1, (g2-g1)/(y2-y1), dg2, (1-g2)/(1-y2)]
     % [f1 * df1, (1-g1) * dg1]
     % [f2 * df2, (1-g2) * dg2]
     % [(y1-x1) / (y2-x2), f1 * df1 / f2 / df2]
     % 
    alp = f1 * df1 / (y1-x1)

    %%
    tt = linspace(0,1,100);
    plot(xx,ff,'ro'); hold on;
    labels = {'$[x_1,f(x_1)]$', '$[x_2,f(x_2)]$',...
      '$[y_1,f(y_1)]$', '$[y_2,f(y_2)]$'};
    for i = 1:4
        text(xx(i+1), ff(i+1), labels{i},...
          'VerticalAlignment','top',...
          'HorizontalAlignment','left',...
          'Interpreter','latex');
    end
    plot(tt, arrayfun(f,tt),'b')
    title('Function $f$', 'Interpreter','latex')

    hold off
    
    figure;
    VV = (VV - M).^0.5;
    contour(xxx,yyy,VV,200); hold on;
    plot(x1, y1,'.')
    text(x1,y1,'$[x_1,y_1]$',...
          'VerticalAlignment','top',...
          'HorizontalAlignment','left',...
          'Interpreter','latex');
    plot(x2, y2,'.')
    text(x2,y2,'$[x_2,y_2]$',...
          'VerticalAlignment','top',...
          'HorizontalAlignment','left',...
          'Interpreter','latex');
    xlabel('$x$','Interpreter','latex')
    ylabel('$y$','Interpreter','latex')
    title('Function $F(x,y)$','Interpreter','latex')
    hold off;   
  else
    disp('Failed')
  end
end

function y = rnd(m,M,left,right,step) 
% a function that generates random numbers 
% between m and m with step ='step' and
% keeping 'left' / 'right' grid point on
% the left / right
  eps = 1e-14;
  i_m = floor((m+eps)/step) + left;
  i_M = ceil((M-eps)/step) - right;
  
  if i_m > i_M
    y = NaN;
  else
    i = randi([i_m,i_M]);
    y = i * step;
  end
end

function fun = interp_slope(x,f,df)
% returns function that interpolates
%  function values f and derivatives df
% at points x
% x, f, and df have to have the same dimensions

  n = length(x);
  xx = zeros(n+1,1);
  ff = zeros(n+1,1);
  xx(1) = 0;
  ff(1) = 0;
  for i = 1:(n-1)
    % intersect lines:
    % from point [x(i),f(i)] with slope df(i)
    % from point [x(i+1),f(i+1)] with slope df(i+1)
    xx(i+1) = (f(i+1) - f(i) + df(i) * x(i)...
      - df(i+1) * x(i+1))...
      / (df(i)-df(i+1));
    ff(i+1) = f(i) + (xx(i+1)-x(i)) * df(i);
    if xx(i+1)<0 || xx(i+1)>1 ||...
        ff(i+1)<0 || ff(i+1)>1 
      error('Out of bounds');
    end
  end
  xx(end) = 1;
  ff(end) = 1;
  fun = @(z) interp1(xx,ff,z);
end