Estimating/Tuning a Coefficient in an Objective Function so the Optimal Solution Reflects Data


I am working on a problem that is a modified two-knapsack problem. I am able to find the optimal choices using Gurobi. However, I would like to estimate a coefficient, added linearly to my objective function, so that the results most closely match my data. I will describe the problem below, but first the thoughts I have had so far:

  1. Brute force/grid search: try several parameter values, re-optimizing each time. This, however, would be too computationally expensive to be feasible.
  2. Use Gurobi's multiple-scenarios feature. However, this suffers from a similar problem, as the coefficient can take any real value (although realistically it will lie in roughly $[-10, 10]$). This option also feels too computationally expensive and would lack precision unless I could choose new scenarios based on the results of previous ones (i.e., focus on values near the best outcomes so far).
  3. Set up a bilevel optimization problem that minimizes the sum of squared errors, where the values come from the lower-level optimization problem, and then apply a Kuhn-Tucker/Lagrangian transformation. However, I have been having trouble with this approach.
  4. Any other options that people can think of would be incredibly helpful!
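Expanding on options 1–2: if participation is monotone in a shared $\theta$ (it only rewards being in a knapsack, so this seems plausible, though I have not proven it), the mismatch loss is piecewise constant with breakpoints where the optimal assignment changes, and an interval-refinement search can find the best breakpoint with far fewer re-solves than a uniform grid. A rough sketch, where `solve` stands in for a hypothetical wrapper around the Gurobi model that returns the participation vector for a given $\theta$:

```python
def best_theta(solve, lo, hi, loss, tol=1e-3):
    """Search [lo, hi] for the theta whose induced assignment minimizes
    `loss`. Assumes (unproven) that the optimal assignment changes
    monotonically in theta, so identical assignments at both endpoints
    mean no breakpoint lies strictly inside the interval."""
    a, b = solve(lo), solve(hi)
    best = min((loss(a), lo), (loss(b), hi))
    if a == b or hi - lo < tol:
        return best
    mid = (lo + hi) / 2.0
    return min(best,
               best_theta(solve, lo, mid, loss, tol),
               best_theta(solve, mid, hi, loss, tol))
```

Memoizing `solve` (e.g. with `functools.lru_cache`) would avoid repeatedly re-solving at shared endpoints.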

The problem involves many schools, each of which may be placed into one of two knapsacks, $X$ or $Y$. Each school $i$ has a weight from data, $w_{i}$; schools get value from being placed in either knapsack and no value from being outside both. Knapsack $Y$ must have an average weight (the average of the $w_{i}$s of the schools in it) of at least 0.7, and knapsack $X$ must have an average weight of at most 0.7. A school placed in knapsack $Y$ contributes value 0.7, while a school placed in knapsack $X$ contributes the average weight of that knapsack (again, the average of all $w_{i}$s in $X$). This version of the problem I can solve in Gurobi.
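For very small instances, the model can be sanity-checked by brute force. Note that the average-weight constraints linearize as $\sum_j w_j X_j \le 0.7 \sum_j X_j$ (and the reverse inequality for $Y$), which also sidesteps the division by zero when a knapsack is empty. A toy reference enumerator (illustrative only; it walks all $3^n$ assignments):

```python
from itertools import product

def brute_force(w, theta):
    """Enumerate every assignment of each school to X, Y, or neither
    (3^n options, so only for tiny n) and return the best feasible one."""
    n = len(w)
    best_val, best = float("-inf"), None
    for assign in product((None, "X", "Y"), repeat=n):
        X = [i for i in range(n) if assign[i] == "X"]
        Y = [i for i in range(n) if assign[i] == "Y"]
        # Average-weight constraints in linearized form, so an empty
        # knapsack is trivially feasible (0 <= 0).
        if sum(w[j] for j in X) > 0.7 * len(X):
            continue
        if sum(w[j] for j in Y) < 0.7 * len(Y):
            continue
        A = sum(w[j] for j in X) / len(X) if X else 0.0
        val = A * len(X) + 0.7 * len(Y) + theta * (len(X) + len(Y))
        if val > best_val:
            best_val, best = val, assign
    return best_val, best
```

Comparing its output against the Gurobi model on small instances is a cheap way to catch formulation bugs before attempting the bilevel version.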

However, I now want to assume that a school gets an additional value, $\theta$, from being in either knapsack. I am interested in the minimum $\theta$ required for it to be optimal to place a school in a knapsack (or the maximum $\theta$ such that it stays out). I want to estimate $\theta$ so that my model's prediction of whether a school is in any knapsack most closely matches the reality in the data. I would be satisfied estimating either a separate $\theta$ for each school or one shared $\theta$ for the entire problem.

Below is my formulation of the single-level and bilevel problems. $A$ is the average weight of the $X$ knapsack; $X_{i}$ and $Y_{i}$ indicate placement in each knapsack, and for each school their sum must be at most 1 (a school can go into at most one knapsack). $P_{i}$ represents "participating," i.e., being placed into either knapsack. $P_{i}$ is observed in my data, and I am trying to minimize the squared error between my model's prediction of going into a knapsack ($X_{i} + Y_{i}$) and the data, $P_{i}$. As $P_{i}$, $X_{i}$, and $Y_{i}$ are binary, the first equation of the bilevel problem equals the sum of squared errors.
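For completeness, the reason the linear mismatch expression equals the sum of squared errors: since $X_{i} + Y_{i} \le 1$, the prediction $z_i = X_i + Y_i$ is itself binary, and for binary $p$ and $z$, $(p - z)^2 = p(1 - z) + (1 - p)z$. A two-line check:

```python
# Verify (p - z)^2 == p*(1 - z) + (1 - p)*z for all binary p, z,
# i.e. the linear mismatch term is exactly the squared error.
for p in (0, 1):
    for z in (0, 1):
        assert (p - z) ** 2 == p * (1 - z) + (1 - p) * z
```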

Does anyone have thoughts on how I could estimate $\theta$ to match my data? A way to reformulate the problem, a trick in Gurobi (or similar software), tips for the Kuhn-Tucker reformulation, or anything else would be greatly appreciated.

Note: this is a simplified version of my problem, but it contains the core details. I will also include screenshots from the more complicated version of this problem if you are curious or able to provide support on the real version.

Original Optimization Problem: \begin{equation} \max_{X_{i},Y_{i}} \sum_{i=1}^{n} A \cdot X_{i} + 0.7 \cdot Y_{i} + (X_{i} + Y_{i})\theta \end{equation} such that: \begin{equation} A = \frac{\sum_{j=1}^{n} w_{j}X_{j}}{\sum_{j=1}^{n} X_{j}} \leq 0.7 \end{equation} \begin{equation} \frac{\sum_{j=1}^{n} w_{j}Y_{j}}{\sum_{j=1}^{n} Y_{j}} \geq 0.7 \end{equation} \begin{equation} \forall i\quad X_{i} + Y_{i} \leq 1 \end{equation} $\forall i \quad X_{i}$ and $Y_{i}$ are Binary Variables, $\theta \in \mathbb{R}$ and $\forall j \quad w_{j} \in [0, 1]$

Bilevel Problem for Estimating $\theta$:

\begin{equation} \min_{\theta} \sum_{i=1}^{n} P_{i}(1-X_{i}-Y_{i}) + (1-P_{i})(X_{i}+Y_{i}) \end{equation} such that: \begin{equation} [X_{i},Y_{i}]_{i=1}^{n} = \arg\max_{X_{i},Y_{i}} \sum_{i=1}^{n} A \cdot X_{i} + 0.7 \cdot Y_{i} + (X_{i} + Y_{i})\theta \end{equation} \begin{equation} A = \frac{\sum_{j=1}^{n} w_{j}X_{j}}{\sum_{j=1}^{n} X_{j}} \leq 0.7 \end{equation} \begin{equation} \frac{\sum_{j=1}^{n} w_{j}Y_{j}}{\sum_{j=1}^{n} Y_{j}} \geq 0.7 \end{equation} \begin{equation} \forall i\quad X_{i} + Y_{i} \leq 1 \end{equation}

Real Bilevel Problem

My attempt at a KKT reformulation, to be used in Gurobi (with the big-$M$ and $\pi$ terms handling the bilinearity of the complementary slackness conditions):