how to set up an optimization problem to split a group of people into two groups, with several constraints

Question

how to set up an optimization problem to split a group of people into two groups, with several constraints

508 Views Asked by Bumbble Comm At 29 Mar 2026 - 10:14

I am a bit stuck with this problem; I found a temporary (an perhaps suboptimal) solution using Excel, but I'd like to hear your opinion /advice, please.

9 people want to form a group and go on on a trip together.
They have 5 possible destinations and 5 possible dates.
Each person gives a score for each destination and each date (separately, so 10 scores in total per person, not 25).
A score equal to 0 means that the person won't participate if the corresponding destination or date is chosen.
All scores different from 0 are normalized to sum up to 1 for each person, by destination and by date.

Here are the data (in R):

tr_op <- structure(list(Dest_A = c(0.333333333, 0.285714286, 0.1, 0.333333333, 
0.263157895, 0.2, 0.2, 0.25, 0), Dest_B = c(0.266666667, 0, 0.5, 
0.333333333, 0.105263158, 0.2, 0.2, 0, 0), Dest_C = c(0.133333333, 
0.214285714, 0.3, 0, 0.263157895, 0.2, 0.2, 0.25, 0.5), Dest_D = c(0.2, 
0.357142857, 0.1, 0, 0.105263158, 0.2, 0.2, 0.25, 0.5), Dest_E = c(0.066666667, 
0.142857143, 0, 0.333333333, 0.263157895, 0.2, 0.2, 0.25, 0), 
    Date_1 = c(0.119047619, 0.294117647, 0.2, 0.238095238, 0.111111111, 
    0, 0.2, 0.095238095, 0.333333333), Date_2 = c(0.166666667, 
    0.058823529, 0.2, 0.238095238, 0.111111111, 0, 0.2, 0.095238095, 
    0), Date_3 = c(0.238095238, 0.294117647, 0.2, 0.047619048, 
    0.111111111, 0, 0.2, 0.095238095, 0.333333333), Date_4 = c(0.238095238, 
    0.058823529, 0.2, 0.238095238, 0.111111111, 0, 0.2, 0.238095238, 
    0), Date_5 = c(0.238095238, 0.294117647, 0.2, 0.238095238, 
    0.555555556, 1, 0.2, 0.476190476, 0.333333333)), .Names = c("Dest_A", 
"Dest_B", "Dest_C", "Dest_D", "Dest_E", "Date_1", "Date_2", "Date_3", 
"Date_4", "Date_5"), class = "data.frame", row.names = c(NA, 
-9L))

Each row is one person.
Each cell is the normalized score expressed by the person in the corresponding row, for the destination or date in the corresponding column name.

As there is no single destination or date everyone agrees on, they can't all go together, so they decide to split into two groups and do two trips (2 different destinations and 2 different dates).

The problem is to assign each person to a destination and a date, at the same time choosing the destination-date combinations that maximize the sum of normalized scores.
There is no need to include all the people.
However, there can't be more than 2 destinations or more than 2 dates, and obviously the people in the same group must all agree on the destination and on the date (i.e. a person can't be assigned to a destination or a date for which they gave score 0).

I used Excel's Solver, setting up 9+9+5+5+5+5 = 38 binary cells as solution vectors, and constraining them and their sums as needed; it concluded that the people corresponding to row 1, 3 and 4 should go to destination B on date 4, and the rest to destination D on date 5.
However, the problem as I set it up is (apparently) not linear programming. I can only guess that's due to some vector products I had to use, but I'm not sure. In any case, Excel says it can't be solved by LP simplex; even the standard solver failed, and only the evolutionary method worked. So I am not sure I got a globally optimal answer.

How would you set this up?
Can it be made into a linear programming problem?
Do you see any analogy with already known problems?
Can you suggest further reading / other posts or literature I could consult?

Thanks!

EDIT 1 (following up from comment by skr)

Two groups A and B must be formed from the 9 people.
I encoded that as two (9x1) binary vectors $a$ and $b$.
Each person can be at most in one group, so:

$a_i + b_i \le 1, \forall i \in [1,9]$

Two destinations must be chosen, one for each group.
I encoded that as two (5x1) binary vectors $wa$ and $wb$.
Each destination can be at most in one group, so:

$wa_j + wb_j \le 1, \forall j \in [1,5]$

Two dates must be chosen, one for each group.
I encoded that as two (5x1) binary vectors $da$ and $db$.
Each date can be at most in one group, so:

$da_k + db_k \le 1, \forall k \in [1,5]$

Each group must have only 1 date and only 1 destination, so:

$\sum_j {wa_j} = 1$
$\sum_j {wb_j} = 1$
$\sum_k {da_k} = 1$
$\sum_k {db_k} = 1$

The objective function (to maximize) is:

$Obj = \sum_j {[ \sum_i W_{i,j} \cdot ( a_i \cdot wa_j + b_i \cdot wb_j) ] } + \sum_k {[ \sum_i D_{i,k} \cdot ( a_i \cdot da_k + b_i \cdot db_k) ] }$

where $W_{i,j}$ is the (9x5) matrix of destination scores and $D_{i,k}$ is the (9x5) matrix of date scores, as in the data I provided.$

I suppose the problem is nonlinear because there is a product of solution vectors in the objective function(?).
And I can see that my formulation does not guarantee that dates or destinations scored 0 are not included in the solution, it only 'discourages' that from happening by requiring that the sum of scores be maximal.

If you can please comment on whether the setup of the problem can be improved, and especially if it can be formulated as a standard LP problem, that would be great.

If I expanded the destination x dates cases and formulated this as a transport problem (9 x 25 matrix of 0's and 1's), essentially deciding what person to have on which trip, it might work, but then in the standard implementations I could not constrain the total number of distinct destinations and distinct dates to 2.

EDIT 2 (after further study)

Perhaps this can be written as a linear programming problem.
I will describe it below, but I would still like to have feedback on whether this the only possible way, or perhaps there are more efficient approaches.

First, I'll simplify the initial matrix so the intermediate steps are not too large and can be visualized:

df <- tr_op[c(1,2,9),c(1,2,4,6,10)]
df
     Dest_A    Dest_B    Dest_D    Date_1    Date_5
1 0.3333333 0.2666667 0.2000000 0.1190476 0.2380952
2 0.2857143 0.0000000 0.3571429 0.2941176 0.2941176
9 0.0000000 0.0000000 0.5000000 0.3333333 0.3333333

I want to form 2 groups out of the 3 persons (1,2,9).
Each group will go to one destination on one date. The destinations and the dates must be different for the two groups.

The only way I am able to turn this into a LP problem is by generating all possible combinations of people, destinations and dates:

df2 <- expand.grid(Person=rownames(df),Dest=colnames(df)[1:3],Date=colnames(df)[4:5])
df2
   Person   Dest   Date
1       1 Dest_A Date_1
2       2 Dest_A Date_1
3       9 Dest_A Date_1
4       1 Dest_B Date_1
5       2 Dest_B Date_1
6       9 Dest_B Date_1
7       1 Dest_D Date_1
8       2 Dest_D Date_1
9       9 Dest_D Date_1
10      1 Dest_A Date_5
11      2 Dest_A Date_5
12      9 Dest_A Date_5
13      1 Dest_B Date_5
14      2 Dest_B Date_5
15      9 Dest_B Date_5
16      1 Dest_D Date_5
17      2 Dest_D Date_5
18      9 Dest_D Date_5

Each of these 18 options will correspond to a score, composed by the date and destination score for each person. First I stack the initial data:

dfdestscores <- stack(df[1:3])
colnames(dfdestscores) <- c("Dest_score","Dest")
dfdestscores["Person"] <- rep(c(1,2,9),3)
dfdestscores
  Dest_score   Dest Person
1  0.3333333 Dest_A      1
2  0.2857143 Dest_A      2
3  0.0000000 Dest_A      9
4  0.2666667 Dest_B      1
5  0.0000000 Dest_B      2
6  0.0000000 Dest_B      9
7  0.2000000 Dest_D      1
8  0.3571429 Dest_D      2
9  0.5000000 Dest_D      9

dfdatescores <- stack(df[4:5])
colnames(dfdatescores) <- c("Date_score","Date")
dfdatescores["Person"] <- rep(c(1,2,9),2)
dfdatescores
  Date_score   Date Person
1  0.1190476 Date_1      1
2  0.2941176 Date_1      2
3  0.3333333 Date_1      9
4  0.2380952 Date_5      1
5  0.2941176 Date_5      2
6  0.3333333 Date_5      9

Then I merge:

df3 <- merge(df2,dfdestscores)
df3 <- merge(df3,dfdatescores)
df3
   Person   Date   Dest Dest_score Date_score
1       1 Date_1 Dest_A  0.3333333  0.1190476
2       1 Date_1 Dest_D  0.2000000  0.1190476
3       1 Date_1 Dest_B  0.2666667  0.1190476
4       1 Date_5 Dest_B  0.2666667  0.2380952
5       1 Date_5 Dest_D  0.2000000  0.2380952
6       1 Date_5 Dest_A  0.3333333  0.2380952
7       2 Date_1 Dest_A  0.2857143  0.2941176
8       2 Date_1 Dest_B  0.0000000  0.2941176
9       2 Date_1 Dest_D  0.3571429  0.2941176
10      2 Date_5 Dest_A  0.2857143  0.2941176
11      2 Date_5 Dest_B  0.0000000  0.2941176
12      2 Date_5 Dest_D  0.3571429  0.2941176
13      9 Date_1 Dest_B  0.0000000  0.3333333
14      9 Date_1 Dest_D  0.5000000  0.3333333
15      9 Date_1 Dest_A  0.0000000  0.3333333
16      9 Date_5 Dest_A  0.0000000  0.3333333
17      9 Date_5 Dest_D  0.5000000  0.3333333
18      9 Date_5 Dest_B  0.0000000  0.3333333

The 'trip_score', i.e. the score of a particular destination+date combination, is the sum of the two scores, but must be 0 if either of the two scores is 0:

df3["trip_score"] <- with(df3,(Dest_score+Date_score)*(Dest_score*Date_score != 0))
df3
   Person   Date   Dest Dest_score Date_score trip_score
1       1 Date_1 Dest_A  0.3333333  0.1190476  0.4523810
2       1 Date_1 Dest_D  0.2000000  0.1190476  0.3190476
3       1 Date_1 Dest_B  0.2666667  0.1190476  0.3857143
4       1 Date_5 Dest_B  0.2666667  0.2380952  0.5047619
5       1 Date_5 Dest_D  0.2000000  0.2380952  0.4380952
6       1 Date_5 Dest_A  0.3333333  0.2380952  0.5714286
7       2 Date_1 Dest_A  0.2857143  0.2941176  0.5798319
8       2 Date_1 Dest_B  0.0000000  0.2941176  0.0000000
9       2 Date_1 Dest_D  0.3571429  0.2941176  0.6512605
10      2 Date_5 Dest_A  0.2857143  0.2941176  0.5798319
11      2 Date_5 Dest_B  0.0000000  0.2941176  0.0000000
12      2 Date_5 Dest_D  0.3571429  0.2941176  0.6512605
13      9 Date_1 Dest_B  0.0000000  0.3333333  0.0000000
14      9 Date_1 Dest_D  0.5000000  0.3333333  0.8333333
15      9 Date_1 Dest_A  0.0000000  0.3333333  0.0000000
16      9 Date_5 Dest_A  0.0000000  0.3333333  0.0000000
17      9 Date_5 Dest_D  0.5000000  0.3333333  0.8333333
18      9 Date_5 Dest_B  0.0000000  0.3333333  0.0000000

I need to select 3 out of these 18 entries, maximizing the trip_score sum, within the above constraints. This IMO is a binary LP problem.

Imposing the constraint that each person goes to one and only one trip can be done by making a table:

df3["Entry"] <- 1:18
table(df3$Entry,df3$Person)

     1 2 9
  1  1 0 0
  2  1 0 0
  3  1 0 0
  4  1 0 0
  5  1 0 0
  6  1 0 0
  7  0 1 0
  8  0 1 0
  9  0 1 0
  10 0 1 0
  11 0 1 0
  12 0 1 0
  13 0 0 1
  14 0 0 1
  15 0 0 1
  16 0 0 1
  17 0 0 1
  18 0 0 1

The dot product of the solution vector by each of these columns must be == 1.

However, AFAIK imposing that 2 distinct destinations and 2 distinct dates are chosen can't be done this way, because one doesn't know beforehand which specific destinations or dates will be chosen, and especially how many people will participate to each trip.

A table like this:

table(df3$Entry,df3$Dest)

     Dest_A Dest_B Dest_D
  1       1      0      0
  2       0      0      1
  3       0      1      0
  4       0      1      0
  5       0      0      1
  6       1      0      0
  7       1      0      0
  8       0      1      0
  9       0      0      1
  10      1      0      0
  11      0      1      0
  12      0      0      1
  13      0      1      0
  14      0      0      1
  15      1      0      0
  16      1      0      0
  17      0      0      1
  18      0      1      0

counts the number of people who have been assigned to each destination, so it can't be used to set constraints.
Setting its columns to <= 1 could yield a solution where each person has been assigned to a different destination, for a total of 3, not the desired 2.

Even for the date, where there are only two possibilities, this table counts the number of people who have been assigned to each date:

table(df3$Entry,df3$Date)

     Date_1 Date_5
  1       1      0
  2       1      0
  3       1      0
  4       0      1
  5       0      1
  6       0      1
  7       1      0
  8       1      0
  9       1      0
  10      0      1
  11      0      1
  12      0      1
  13      1      0
  14      1      0
  15      1      0
  16      0      1
  17      0      1
  18      0      1

Thus, setting each column to == 1 would yield an unfeasible problem, given the constraint on the people (in one group there are by necessity 2 people).
Setting each column to <= 2 would work in this specific case, because it would stop the algorithm from choosing the same date for all 3 people, and as there are overall 2 possible dates, a total of 2 dates would necessarily be selected; but if there were more than 2 dates, this strategy would fail.

For me this implies that the solution vector must be extended in order to count the number of distinct destinations and dates.

The new rows of the matrix can be made like this:

dfDest <- data.frame(Dest=colnames(df)[1:3],trip_score=0,Entry=19:21)
dfDest
    Dest trip_score Entry
1 Dest_A          0    19
2 Dest_B          0    20
3 Dest_D          0    21

dfDate <- data.frame(Date=colnames(df)[4:5],trip_score=0,Entry=22:23)
dfDate
    Date trip_score Entry
1 Date_1          0    22
2 Date_5          0    23

And then merged:

df4 <- merge(df3,dfDest,all=T)
df4 <- merge(df4,dfDate,all=T)
df4 <- df4[order(df4$Entry),]
df4
   trip_score Entry   Date   Dest Person Dest_score Date_score
15  0.4523810     1 Date_1 Dest_A      1  0.3333333  0.1190476
12  0.3190476     2 Date_1 Dest_D      1  0.2000000  0.1190476
13  0.3857143     3 Date_1 Dest_B      1  0.2666667  0.1190476
16  0.5047619     4 Date_5 Dest_B      1  0.2666667  0.2380952
14  0.4380952     5 Date_5 Dest_D      1  0.2000000  0.2380952
17  0.5714286     6 Date_5 Dest_A      1  0.3333333  0.2380952
18  0.5798319     7 Date_1 Dest_A      2  0.2857143  0.2941176
1   0.0000000     8 Date_1 Dest_B      2  0.0000000  0.2941176
20  0.6512605     9 Date_1 Dest_D      2  0.3571429  0.2941176
19  0.5798319    10 Date_5 Dest_A      2  0.2857143  0.2941176
2   0.0000000    11 Date_5 Dest_B      2  0.0000000  0.2941176
21  0.6512605    12 Date_5 Dest_D      2  0.3571429  0.2941176
3   0.0000000    13 Date_1 Dest_B      9  0.0000000  0.3333333
22  0.8333333    14 Date_1 Dest_D      9  0.5000000  0.3333333
4   0.0000000    15 Date_1 Dest_A      9  0.0000000  0.3333333
5   0.0000000    16 Date_5 Dest_A      9  0.0000000  0.3333333
23  0.8333333    17 Date_5 Dest_D      9  0.5000000  0.3333333
6   0.0000000    18 Date_5 Dest_B      9  0.0000000  0.3333333
7   0.0000000    19   <NA> Dest_A   <NA>         NA         NA
8   0.0000000    20   <NA> Dest_B   <NA>         NA         NA
9   0.0000000    21   <NA> Dest_D   <NA>         NA         NA
10  0.0000000    22 Date_1   <NA>   <NA>         NA         NA
11  0.0000000    23 Date_5   <NA>   <NA>         NA         NA

The sum of elements 19:21 of the binary solution vector is the count of distinct chosen destinations, and the sum of elements 22:23 is the count of distinct chosen dates.
So the required constraints are easily imposed by creating matrix columns of length 23, with 0's everywhere except in the appropriate positions where 1's apply (19:21 and 22:23, respectively), and setting each to == 2.

However, further constraints are needed, because there is a relationship between elements 1:18 and elements 19:21 and 22:23. E.g. if Entry 1 is set to 1 in the solution, Dest_A and Date_1 have to be set to 1, too.
I posted a question on how to achieve this elsewhere; it got no answers, so I am going to assume it's fine, and I will implement it.

The constraint matrix to force the choice of a destination once the choice of a trip is made is the Dest table with its elements 19:21 negated and multiplied by the total number of destinations:

cmDest <- table(df4$Entry,df4$Dest)
cmDest[19:21,] <- -cmDest[19:21,]*length(unique(dfDest$Dest))
cmDest

     Dest_A Dest_B Dest_D
  1       1      0      0
  2       0      0      1
  3       0      1      0
  4       0      1      0
  5       0      0      1
  6       1      0      0
  7       1      0      0
  8       0      1      0
  9       0      0      1
  10      1      0      0
  11      0      1      0
  12      0      0      1
  13      0      1      0
  14      0      0      1
  15      1      0      0
  16      1      0      0
  17      0      0      1
  18      0      1      0
  19     -3      0      0
  20      0     -3      0
  21      0      0     -3
  22      0      0      0
  23      0      0      0

The dot product of each of these columns dot by the solution vector must be <= 0.
[This corresponds to imposing: 'trip to destination X selected' -> 'destination X selected'. E.g. if Entry 1 were selected and Entry 19 were not, the first column of the above matrix would be > 0].

The condition 'no trip to destination X selected' -> 'destination X not selected' could be imposed by adding the same constraint matrix as above and setting it to >= -(number of destinations)+1.
Alternatively, it's sufficient to set the trip_score of rows 19:21 to -1, as this is a maximization problem:

df4[19:21,"trip_score"] <- -1

Same for the Date:

cmDate <- table(df4$Entry,df4$Date)
cmDate[22:23,] <- -cmDate[22:23,]*length(unique(dfDate$Date))
cmDate

     Date_1 Date_5
  1       1      0
  2       1      0
  3       1      0
  4       0      1
  5       0      1
  6       0      1
  7       1      0
  8       1      0
  9       1      0
  10      0      1
  11      0      1
  12      0      1
  13      1      0
  14      1      0
  15      1      0
  16      0      1
  17      0      1
  18      0      1
  19      0      0
  20      0      0
  21      0      0
  22     -2      0
  23      0     -2

And:

df4[22:23,"trip_score"] <- -1

Putting everything together:

#initializing
N_Person <- length(unique(df3$Person))
N_Dest <- length(unique(df3$Dest))
N_Date <- length(unique(df3$Date))
cm <- NULL
cdir <- NULL
crhs <- NULL

#constraining each person to be in 1 and only 1 trip
cm <- cbind(cm,table(df4$Entry,df4$Person))
cdir <- c(cdir,rep("==",N_Person))
crhs <- c(crhs,rep("1",N_Person))

#constraining the choice of exactly 2 distinct destinations
cm <- cbind(cm,(!is.na(df4$Dest))*(is.na(df4$Person)))
cdir <- c(cdir,rep("==",1))
crhs <- c(crhs,rep("2",1))

#constraining the choice of exactly 2 distinct dates
cm <- cbind(cm,(!is.na(df4$Date))*(is.na(df4$Person)))
cdir <- c(cdir,rep("==",1))
crhs <- c(crhs,rep("2",1))

#linking the trips to the destinations
cm <- cbind(cm,cmDest)
cdir <- c(cdir,rep("<=",N_Dest))
crhs <- c(crhs,rep("0",N_Dest))

#linking the trips to the dates
cm <- cbind(cm,cmDate)
cdir <- c(cdir,rep("<=",N_Date))
crhs <- c(crhs,rep("0",N_Date))

#setting up to objective to maximize
obj <- df4$trip_score

#solving using lpSolve
require(lpSolve)
sol <- lp("max",obj,cm,cdir,crhs,transpose.constraints=FALSE,all.bin=TRUE,num.bin.solns=1)
df4[as.logical(sol$solution),]
   trip_score Entry   Date   Dest Person Dest_score Date_score
17  0.5714286     6 Date_5 Dest_A      1  0.3333333  0.2380952
18  0.5798319     7 Date_1 Dest_A      2  0.2857143  0.2941176
3   0.0000000    13 Date_1 Dest_B      9  0.0000000  0.3333333
7  -1.0000000    19   <NA> Dest_A   <NA>         NA         NA
8  -1.0000000    20   <NA> Dest_B   <NA>         NA         NA
10 -1.0000000    22 Date_1   <NA>   <NA>         NA         NA
11 -1.0000000    23 Date_5   <NA>   <NA>         NA         NA

EDIT 3 (addendum to EDIT 2)

The solution in EDIT 2 does not meet all the initial criteria.
Although one person is assigned to 1 and only 1 trip, and 2 distinct dates and 2 distinct destinations are used, I did not consider that:

this didn't imply that in total there would be 2 unique trips (2 Dest * 2 Date yields 4 possible trips, and indeed in the above solution 3 of them are chosen)
there was still the possibility to choose an Entry with score 0 (meaning a trip the corresponding person would not be able to attend)
in fact it was NOT sufficient to use scores of -1; the condition >= -N+1 had to be used.

Finally, the solution that seems to address these issues and solves the original problem is the following:

df <- tr_op

#making the combinatorial table df2 from the initial df
NDests <- 5
NDates <- 5
NPersons <- nrow(df)
df2 <- expand.grid(Person=rownames(df),Dest=colnames(df)[1:NDests],Date=colnames(df)[(NDests+1):(NDests+NDates)])

#extracting the Dest and Date scores from the initial df
dfdestscores <- stack(df[1:NDests])
colnames(dfdestscores) <- c("Dest_score","Dest")
dfdestscores["Person"] <- rep(rownames(df),NDests)
dfdatescores <- stack(df[(NDests+1):(NDests+NDates)])
colnames(dfdatescores) <- c("Date_score","Date")
dfdatescores["Person"] <- rep(rownames(df),NDates)

#merging the Dest and Date scores into df2
df3 <- merge(df2,dfdestscores)
df3 <- merge(df3,dfdatescores)
df3["trip_score"] <- with(df3,(Dest_score+Date_score)*(Dest_score*Date_score != 0))

#tagging the rows of the initial table
df3["Tag"] <- c("comb")

#creating the Trip column in df3
df3["Trip"] <- with(df3,paste0(Dest,Date))

#making the rows for the count of unique Dest, Date and trip
dfDest <- data.frame(Dest=colnames(df)[1:NDests],trip_score=0,Tag="Dest")
dfDate <- data.frame(Date=colnames(df)[(NDests+1):(NDests+NDates)],trip_score=0,Tag="Date")
dfTrip <- expand.grid(Dest=colnames(df)[1:NDests],Date=colnames(df)[(NDests+1):(NDests+NDates)],trip_score=0)
dfTrip["Trip"] <- with(dfTrip,paste0(Dest,Date))
dfTrip <- dfTrip[c("Trip","trip_score")]    
dfTrip["Tag"] <- "Trip"
NTrips <- nrow(dfTrip)

#merging df3 with the new rows
df4 <- merge(df3,dfDest,all=T)
df4 <- merge(df4,dfDate,all=T)
df4 <- merge(df4,dfTrip,all=T)

#removing invalid entries
df4 <- df4[(df4$Tag != "comb" | (df4$trip_score != 0)),]

#sorting df4 and creating an Entry column
df4 <- df4[order(df4$Tag,df4$Trip),]
df4["Entry"] <- 1:nrow(df4)

#making the constraint matrices for Dest, Date and Trip
cmDest <- table(df4$Entry,df4$Dest)
cmDestSum <- apply(cmDest,2,sum)-1
cmDest[(df4$Tag == "Dest"),] <- -diag(cmDestSum)
cmDate <- table(df4$Entry,df4$Date)
cmDateSum <- apply(cmDate,2,sum)-1
cmDate[(df4$Tag == "Date"),] <- -diag(cmDateSum)
cmTrip <- table(df4$Entry,df4$Trip)
cmTripSum <- apply(cmTrip,2,sum)-1
cmTrip[(df4$Tag == "Trip"),] <- -diag(cmTripSum)

#initializing the LP matrices
cm <- NULL
cdir <- NULL
crhs <- NULL

#constraining each person to be in 1 and only 1 trip
cm <- cbind(cm,table(df4$Entry,df4$Person))
cdir <- c(cdir,rep("==",NPersons))
crhs <- c(crhs,rep(1,NPersons))

#exception: Person 1 can be in 2 trips
crhs[1] <- 2

#constraining the choice of exactly 2 distinct destinations
cm <- cbind(cm,as.numeric(df4$Tag == "Dest"))
cdir <- c(cdir,rep("==",1))
crhs <- c(crhs,rep(2,1))

#constraining the choice of exactly 2 distinct dates
cm <- cbind(cm,as.numeric(df4$Tag == "Date"))
cdir <- c(cdir,rep("==",1))
crhs <- c(crhs,rep(2,1))

#constraining the choice of exactly 2 distinct trips
cm <- cbind(cm,as.numeric(df4$Tag == "Trip"))
cdir <- c(cdir,rep("==",1))
crhs <- c(crhs,rep(2,1))

#linking the Entries to the destinations
cm <- cbind(cm,cmDest)
cdir <- c(cdir,rep("<=",NDests))
crhs <- c(crhs,rep(0,NDests))
cm <- cbind(cm,cmDest)
cdir <- c(cdir,rep(">=",NDests))
crhs <- c(crhs,-cmDestSum+1)

#linking the Entries to the dates
cm <- cbind(cm,cmDate)
cdir <- c(cdir,rep("<=",NDates))
crhs <- c(crhs,rep(0,NDates))
cm <- cbind(cm,cmDate)
cdir <- c(cdir,rep(">=",NDates))
crhs <- c(crhs,-cmDateSum+1)

#linking the Entries to the trips
cm <- cbind(cm,cmTrip)
cdir <- c(cdir,rep("<=",NTrips))
crhs <- c(crhs,rep(0,NTrips))
cm <- cbind(cm,cmTrip)
cdir <- c(cdir,rep(">=",NTrips))
crhs <- c(crhs,-cmTripSum+1)

#setting up to objective to maximize
obj <- df4$trip_score

#solving using lpSolve
require(lpSolve)
sol <- lp("max",obj,cm,cdir,crhs,transpose.constraints=FALSE,all.bin=TRUE,num.bin.solns=1)
df4[as.logical(sol$solution),]

#optionally,using Rsymphony, which is sometimes faster
require(Rsymphony)
solrs <- Rsymphony_solve_LP(obj,t(cm),cdir,crhs,types="B",max=TRUE)
df4[as.logical(solrs$solution),]

If anyone can suggest a better approach, that'd be great.

Original Q&A

how to set up an optimization problem to split a group of people into two groups, with several constraints

Related Questions in OPTIMIZATION

Related Questions in LINEAR-PROGRAMMING

Trending Questions

Popular # Hahtags

Popular Questions