Try to find minimal value of weighted numbers

73 Views Asked by At

Given a data set, I need to find the minimal value of the total values divided by the total weights. The data sets are of arbitrary length (each data set may have a different length, but the data set and the corresponding weights have the same lengths).

Considering a simple example:

 DataSetOne [500, 45]
 WeightsOfSetOne = [10, 1]

 DataSetTwo [850, 90]
 WeightsOfSetTwo = [10, 1]

Given these data sets, you can get 4 results.

     [1] (500 + 850)/(10+10) = 1350/20 = 67.5
     [2] (500 + 90)/(10+1) = 590/11 = 53.6
     [3] (45 + 850)/(1+10) = 895/11 = 81.4
     [4] (45 + 90)/(1 + 1) = 135/2 = 67.5

In this case, we see that [2] gives the optimal solution. However, this is a brute force method and I would rather calculate it Mathematically (if possible) because brute force simply is not possible since the number of possibilities scales exponentially with the number of data sets.

The approach I tried was as follows:

For each of the values with their corresponding weights of Data_set_one:

Start with a (value + weight) from the first Data set. Then for all the other data sets:

Traverse each data set and choose the option in the data set that gives the minimum value for Total value so far/Total Weight so far.

However. This approach failed with the data set below.

            Data_set_one_values: [50, 72, 9]
            Data_set_one_weights [52, 32, 3]
            Data_set_two_values: [87, 63, 0]
            Data_set_two_weights [43, 88, 36]
            Data_set_three_values: [83, 16, 94]
            Data_set_three_weights [38, 22, 56]
            Data_set_four_values: [22, 61, 37]
            Data_set_four_weights: [25, 13, 55]
            Data_set_five_values: [1, 15, 52]
            Data_set_five_weights: [53, 80, 43]

If there is no exact way to calculate this except a brute-force method by calculating all combinations, is there a way perhaps that I can get an approximation that works?

2

There are 2 best solutions below

4
On BEST ANSWER

One approach is to use mixed integer linear programming. For each set $i$ and item $j$, let $v_{i,j}$ denote the value and $w_{i,j}$ denote the weight, and let binary decision variable $x_{i,j}$ indicate whether the item is selected. The problem is to minimize $$\frac{\displaystyle{\sum_{i,j} v_{i,j} x_{i,j}}}{\displaystyle{\sum_{i,j} w_{i,j} x_{i,j}}}$$ subject to $$\sum_j x_{i,j} = 1$$ for all $i$.

You can linearize the fractional objective via a Charnes-Cooper transformation.

Here is SAS code for your second instance:

data indata;
   input set item value weight;
   datalines;
1 1 50 52
1 2 72 32
1 3  9  3
2 1 87 43
2 2 63 88
2 3  0 36
3 1 83 38
3 2 16 22
3 3 94 56
4 1 22 25
4 2 61 13
4 3 37 55
5 1  1 53
5 2 15 80
5 3 52 43
;

proc optmodel;
   /* declare parameters and read data */
   set <num,num> SET_ITEM;
   num value {SET_ITEM};
   num weight {SET_ITEM};
   read data indata into SET_ITEM=[set item] value weight;
   set SETS = setof {<i,j> in SET_ITEM} i;

   /* declare original variable and constraint */
   var X {SET_ITEM} binary;
   con OneItemPerSet {i in SETS}:
      sum {<(i),j> in SET_ITEM} X[i,j] = 1;

   /* declare linearization variables, objective, and constraint */
   var T >= 0 <= 1 / min {<i,j> in SET_ITEM} weight[i,j];
   var Y {SET_ITEM} >= 0 <= T.ub;
   min LinearizedRatio = sum {<i,j> in SET_ITEM} value[i,j] * Y[i,j];
   con DenominatorOne:
      sum {<i,j> in SET_ITEM} weight[i,j] * Y[i,j] = 1;
   con Linearize1 {i in SETS}:
      sum {<(i),j> in SET_ITEM} Y[i,j] = T;
   con Linearize2 {<i,j> in SET_ITEM}:
      Y[i,j] <= Y[i,j].ub * X[i,j];

   /* call MILP solver */
   solve;

   /* report solution */
   print X value weight;
quit;

The resulting optimal solution, with minimum ratio $48/139=0.345$, is: \begin{matrix} i &j &x &v &w\\ \hline 1 &1 &0 &50 &52 \\ 1 &2 &0 &72 &32 \\ 1 &3 &1 &9 &3 \\ 2 &1 &0 &87 &43 \\ 2 &2 &0 &63 &88 \\ 2 &3 &1 &0 &36 \\ 3 &1 &0 &83 &38 \\ 3 &2 &1 &16 &22 \\ 3 &3 &0 &94 &56 \\ 4 &1 &1 &22 &25 \\ 4 &2 &0 &61 &13 \\ 4 &3 &0 &37 &55 \\ 5 &1 &1 &1 &53 \\ 5 &2 &0 &15 &80\\ 5 &3 &0 &52 &43 \end{matrix}

0
On
Using the Accepted answer of Rob Pratt, 
an installation guide: https://www.youtube.com/watch?v=oA86HCkCg5k (till 2:40)
and https://www.youtube.com/watch?v=sf59_7r8QSY 
to get a general idea on how to work with the IBM CPlex.

I get the following code. (Java)

 import ilog.concert.IloLinearNumExpr;
import ilog.concert.IloNumExpr;
import ilog.concert.IloNumVar;
import ilog.cplex.IloCplex;

import java.util.*;

public class TestFileTwo {

public static long Time = 0L;

public static void main(String args[]){
    boolean TestApproximation = true;
    boolean TestMILPs = true;
    boolean TestSimulation = true;
    boolean PrintDataSet = false;
    boolean TestDataOne = false;
    boolean TestDataTwo = false;

    ArrayList<int[]> ValueList = new ArrayList<>();
    ArrayList<int[]> WeightList = new ArrayList<>();
    int DataSets = 10;
    int ElementsPerDataSet = 3;
    if(!TestDataOne && !TestDataTwo) {
        for (int i = 0; i < DataSets; i++) {
            int[] ValueArray = new int[ElementsPerDataSet];
            int[] WeightArray = new int[ElementsPerDataSet];
            for (int j = 0; j < ElementsPerDataSet; j++) {
                ValueArray[j] = GetRandom(100);
                WeightArray[j] = GetRandom(100);
            }
            ValueList.add(ValueArray);
            WeightList.add(WeightArray);
        }
    }

    if(PrintDataSet) {
        for (int i = 0; i < DataSets; i++) {
            System.out.println("ValueList.add(new int[] " + Arrays.toString(ValueList.get(i)).replace("[", "{").replace("]", "};"));
            System.out.println("WeightList.add(new int[] " + Arrays.toString(WeightList.get(i)).replace("[", "{").replace("]", "};"));
        }
    }
    if(TestDataOne) {
        ValueList.add(new int[]{50, 72, 9});
        WeightList.add(new int[]{52, 32, 3});
        ValueList.add(new int[]{87, 63, 0});
        WeightList.add(new int[]{43, 88, 36});
        ValueList.add(new int[]{83, 16, 94});
        WeightList.add(new int[]{38, 22, 56});
        ValueList.add(new int[]{22, 61, 37});
        WeightList.add(new int[]{25, 13, 55});
        ValueList.add(new int[]{1, 15, 52});
        WeightList.add(new int[]{53, 80, 43});
    }else if(TestDataTwo) {
        ValueList.add(new int[]{500, 45, 50, 510});
        WeightList.add(new int[]{10, 1, 1, 11});
        ValueList.add(new int[]{850, 90});
        WeightList.add(new int[]{10, 1});
    }

    long CurrentNanonTime = System.nanoTime();

    if(TestSimulation) {
        GetLowestFrom(ValueList, WeightList);
        System.out.println("Calculating From Simulations: " + (System.nanoTime() - CurrentNanonTime)/Math.pow(10,9));
        CurrentNanonTime = System.nanoTime();
    }
    if(TestApproximation) {
        GetLowestFromUsingMath(ValueList, WeightList);
        System.out.println("Approximation by deciding optimal averages: " + (System.nanoTime() - CurrentNanonTime)/Math.pow(10,9));
        CurrentNanonTime = System.nanoTime();
    }
    if(TestMILPs) {
        Time = System.nanoTime();
        SolveModel(ValueList, WeightList);
        System.out.println("Mixed-Integer Linearisation problem solver:" + (System.nanoTime() - CurrentNanonTime)/Math.pow(10,9));
    }
}

public static int GetRandom(int Max){
    return ((int) (Math.random() * (Max + 1)));
}


public static double GetLowestFromUsingMath(ArrayList<int[]> ValueList, ArrayList<int[]> WeightList){
    int Datasets = ValueList.size();
    int[] FirstValues = ValueList.get(0);
    int[] FirstWeights = WeightList.get(0);
    int[][] Vars = new int[FirstValues.length][];
    for(int i = 0; i < Vars.length; i++){
        Vars[i] = new int[] {FirstValues[i], FirstWeights[i]};
    }


    ArrayList<int[]> FinalList = new ArrayList<>();
    ArrayList<int[]> FinalPathList = new ArrayList<>();
    for(int Startingpoint = 0; Startingpoint < Vars.length; Startingpoint++) {
        int[] PathsTaken = new int[Datasets];
        PathsTaken[0] = Startingpoint;
        int[] Total = Arrays.copyOf(Vars[Startingpoint], Vars[Startingpoint].length);
        for (int i = 1; i < Datasets; i++) {
            int[] Values = ValueList.get(i);
            int[] LowestValues = new int[Values.length];
            int[] Weights = WeightList.get(i);
            double Average = Integer.MAX_VALUE;
            int LowestNumberIndex = 0;
            for(int ValueTest = 0; ValueTest < Values.length; ValueTest++){
                double NewVar = (((double) Total[0] + Values[ValueTest])/(Total[1] + Weights[ValueTest]));
                if(NewVar < Average){
                    Average = NewVar;
                    LowestNumberIndex = ValueTest;
                    LowestValues[0] = Values[ValueTest];
                    LowestValues[1] = Weights[ValueTest];
                }
            }
            PathsTaken[i] = LowestNumberIndex;
            Total[0] = Total[0] + LowestValues[0];
            Total[1] = Total[1] + LowestValues[1];
        }
        FinalList.add(Total);
        FinalPathList.add(PathsTaken);
    }

    double Average = Integer.MAX_VALUE;
    int[] Dummy = new int[] {0,0};
    int[] OptimalPath = null;
    for(int i = 0; i < FinalList.size(); i++){
        int[] Values = FinalList.get(i);
        double AVar = ((double) Values[0])/Values[1];
        if(AVar < Average){
            Average = AVar;
            Dummy = Values;
            OptimalPath = FinalPathList.get(i);
        }
    }

    System.out.println("[Approximation] "+Arrays.toString(Dummy) + " With average of: " + Average + " Path taken: " + Arrays.toString(OptimalPath));
    return Average;

}

public static double GetLowestFrom(ArrayList<int[]> ValueList, ArrayList<int[]> WeightList){
    int[] MaxArray = new int[ValueList.size()];
    int[] OptimalArray = new int[ValueList.size()];
    int[] IterationArray = new int[ValueList.size()];
    for(int i = 0; i < ValueList.size(); i++){
        MaxArray[i] = ValueList.get(i).length  - 1;
        IterationArray[i] = 0;
    }

    double Outcome = Integer.MAX_VALUE;
    while(true){
        double Value = 0;
        double Weight = 0;
        for(int i = 0; i < IterationArray.length; i++){
            Value += ValueList.get(i)[IterationArray[i]];
            Weight += WeightList.get(i)[IterationArray[i]];
        }
        double Average = Value/Weight;
        if(Average < Outcome){
            Outcome = Average;
            OptimalArray = Arrays.copyOf(IterationArray, IterationArray.length);
        }
        IterationArray = GetIncrementedArray(MaxArray, IterationArray);
        if(IterationArray == null){
            break;
        }
    }

    System.out.println("[Simulation] "+"The optimal Array is: " + Arrays.toString(OptimalArray) + " With average of: " + Outcome);
    return -1;
}

public static void SolveModel(ArrayList<int[]> ValueList, ArrayList<int[]> WeightList){
    double TotalWeight;
    double TotalValues = 0;

    int Datasets = ValueList.size();
    int[] NumberTaken = new int[Datasets];
    int[] DatasetLength = new int[Datasets];
    int[][] Values = new int[Datasets][];
    int[][] Weights = new int[Datasets][];
    for(int i = 0; i < Datasets; i++){
        int[] DummyValArray = ValueList.get(i);
        int[] DummyWeiArray = WeightList.get(i);
        int[] ValueArray = Arrays.copyOf(DummyValArray, DummyValArray.length);
        Values[i] = ValueArray;
        int[] WeightArray = Arrays.copyOf(DummyWeiArray, DummyWeiArray.length);
        Weights[i] = WeightArray;
        DatasetLength[i] = DummyValArray.length;
    }

    try {
        IloCplex model = new IloCplex();

        IloNumVar y[][] =new IloNumVar[Datasets][];
        /*
         * Defining
         * This means that I define the t (inverse of sum of weights of path taken)
         * as a variable between 0 and 2.1 Billion
         * */
        IloNumVar t = model.numVar(0, Integer.MAX_VALUE, "t");

        /*
         * Defining
         * This means that I define all y values as variables between 0 and 2.1 Billion
         * */
        for(int i=0;i<Datasets;i++) {
            y[i] = new IloNumVar[DatasetLength[i]];
            for (int j = 0; j < DatasetLength[i]; j++) {
                String varName = "y" + (i + 1) + "" + (j + 1) + "";
                y[i][j] = model.numVar(0, Integer.MAX_VALUE, varName);
            }
        }

        /*
         * Defining
         * This means you define the model as a lineair numerical expression
         * */
        IloLinearNumExpr objective = model.linearNumExpr();

        /*
         * Linear expression.
         * This means that I sum over all the Values
         * */
        for(int datasetIndex = 0; datasetIndex < Datasets; datasetIndex++){
            for(int elementIndex = 0; elementIndex < DatasetLength[datasetIndex]; elementIndex++){
                objective.addTerm(y[datasetIndex][elementIndex], Values[datasetIndex][elementIndex]);
            }
        }

        /*
         * Constraint
         * This means that the sum of all y's of one data set is variable t.
         * */
        for(int datasetIndex = 0; datasetIndex < Datasets; datasetIndex++) {
            model.addEq(model.sum(y[datasetIndex]), t);
        }

        /*
         * Constraint
         * This part means that the Multiplication of weight and y must be one
         * */
        IloNumExpr[] wys = new IloNumExpr[Datasets];
        for(int i=0;i<Datasets;i++) {
            IloNumExpr[] elements = new IloNumExpr[DatasetLength[i]];
            for (int j = 0; j < DatasetLength[i]; j++) {
                elements[j] = model.prod(y[i][j] ,Weights[i][j]);
            }
            wys[i] = model.sum(elements);
        }
        model.addEq(model.sum(wys), 1);

        /*
         * Objective
         * This part signals that the objective is to minimize the function.
         * */
        model.addMinimize(objective);

        if(model.solve()){
            System.out.println("t = " + model.getValue(t));

            for(int datasetIndex = 0; datasetIndex < Datasets; datasetIndex++){
                for(int elementIndex = 0; elementIndex < DatasetLength[datasetIndex]; elementIndex++){
                    if(model.getValue(y[datasetIndex][elementIndex]) != 0){
                        NumberTaken[datasetIndex] = elementIndex;
                        TotalValues += Values[datasetIndex][elementIndex];
                    }
                }
            }
            TotalWeight = Math.pow(model.getValue(t), -1);
            System.out.println("[MILP] "+"Total Value: " + TotalValues + " Total Weight: " + TotalWeight + " Value/Weight: " + TotalValues/TotalWeight);
            System.out.println("[MILP] "+"Path: taken " + Arrays.toString(NumberTaken));
            System.out.println("[MILP] "+"Time taken = " + ((System.nanoTime() - Time)/(Math.pow(10,9))));
        }else{
            System.out.println("[MILP] "+"An optimal solution has not been found.");
        }

    }catch(Exception e){
        e.printStackTrace();
    }
}

public static int[] GetIncrementedArray(int[] MaxArray, int[] ArrayToIncrement){
    if(GetIndex(MaxArray, ArrayToIncrement) == -1){
        return null;
    }
    int[] ResultingArray = Arrays.copyOf(ArrayToIncrement, ArrayToIncrement.length);
    ResultingArray[0]++;
    for(int i = 0; i < ArrayToIncrement.length; i++){
        if(ResultingArray[i] > MaxArray[i]){
            ResultingArray[i] = 0;
            ResultingArray[i+1]++;
        }else{
            return ResultingArray;
        }
    }
    return null;
}


public static int GetIndex(int[] Array1, int[] Array2){
    for(int i = 0; i < Array1.length; i++){
        if(Array1[i] != Array2[i]){
            return i;
        }
    }
    return -1;
}

}

This code contains my failed attempt to calculate it mathematically (called approximation, Calculating by simulation (brute force) and the mixed-integer linear programming (MILP) solver. I hope this is useful to someone :)