Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add dependency matrix to constrains #450

Open
RKonstantinR opened this issue Apr 15, 2024 · 19 comments
Open

add dependency matrix to constrains #450

RKonstantinR opened this issue Apr 15, 2024 · 19 comments

Comments

@RKonstantinR
Copy link

Hello, thanks for the great package!

I'm trying to use your library to develop a model for creating an optimal project portfolio. I was able to set constraint on the total number of projects and write the resulting function that maximizes profit.

But in addition to standard constraints, I’m trying to add project dependency matrix into the model. A number of projects depend on each other. The implementation of two projects out of three does not make economic sense.

There is a way to set constraints based on a dependency matrix and check that all related projects are present in the resulting list of projects?

library(dplyr)
library(ROI)
library(ROI.plugin.symphony)
library(ompr)
library(ompr.roi)
library(ROI.plugin.glpk)

project_list <- tibble(name = c("a", "b", "c", "d"), profit = c(100, 10, 5, 50)) %>% 
  mutate(item = row_number())

max_project <- 3

# dependency matrix (if 1 - the project depends on another project)
my_matrix <- matrix(
  c(0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0), 
  nrow = 4,   
  ncol = 4,        
  byrow = TRUE         
)

rownames(my_matrix) <- c("a", "b", "c", "d")
colnames(my_matrix) <- c("a", "b", "c", "d")

m <- MIPModel() %>%
  add_variable(x[i], i=project_list$item, type="binary") %>%
  add_constraint(sum_over(x[i], i=project_list$item) <= max_project) %>% 
  set_objective(sum_over(project_list$profit[i]*x[i], i=project_list$item),"max") %>% 
  solve_model(with_ROI(solver = "symphony", verbosity=1))

This model selects projects a, b, d, but given dependencies, projects a, b, c should be selected.

@sbmack
Copy link

sbmack commented Apr 15, 2024

With dependent binary variables you have to define the types of dependencies between the variables. The side constraints are formulated based on type. For example:

x1 requires x2:
x1 <= x2
x1 – x2 <= 0

x1, x2 mutually dependent:
x1 – x2 = 0

x1, x2 mutually exclusive:
a + b <= 1

There are other cases.

Dependency constraints with more than 2 variables can be chained together. The specific constraint you need will be a function of the dependencies among a, b, c, d. Edit your question to include that info for more help.

@RKonstantinR
Copy link
Author

RKonstantinR commented Apr 16, 2024

Thanks for the reply!

Dependency constraints with more than 2 variables can be chained together. The specific constraint you need will be a function of the dependencies among a, b, c, d. Edit your question to include that info for more help.

In my case, projects that have a 1 in the dependency matrix are mutually dependent on each other. I need to add a constraint that the sum of dependencies for each row of a matrix containing only selected projects must be equal to 1.

The problem is that I don't understand how to properly include this constraint in the model.

full matrix (normalized)
  a b   c    d  Sum Profit
a 0 0.5 0.5  0    1 100
b 1 0   0    0    1 10
c 1 0   0    0    1 5 
d 0 0   0    0    0 50

subset matrix containing 3 projects with maximum profit without taking into account mutual dependence
  a b   d    Sum
a 0 0.5 0    0.5 < 1 - error
b 1 0   0    1 = 1 - ok
d 0 0   0    0 = 1 - ok

subset matrix containing 3 projects with maximum profit with taking into account mutual dependence
  a b   c    Sum
a 0 0.5 0.5  1 = 1 - ok
b 1 0   0    1 = 1 - ok
c 1 0   0    1 = 1 - ok

@sbmack
Copy link

sbmack commented Apr 16, 2024

In my case, projects that have a 1 in the dependency matrix are mutually dependent on each other. I need to add a constraint that the sum of dependencies for each row of a matrix containing only selected projects must be equal to 1.

The problem is that I don't understand how to properly include this constraint in the model.

full matrix (normalized)
  a b   c    d  Sum Profit
a 0 0.5 0.5  0    1 100
b 1 0   0    0    1 10
c 1 0   0    0    1 5 
d 0 0   0    0    0 50

subset matrix containing 3 projects with maximum profit without taking into account mutual dependence
  a b   d    Sum
a 0 0.5 0    0.5 < 1 - error
b 1 0   0    1 = 1 - ok
d 0 0   0    0 = 1 - ok

subset matrix containing 3 projects with maximum profit with taking into account mutual dependence
  a b   c    Sum
a 0 0.5 0.5  1 = 1 - ok
b 1 0   0    1 = 1 - ok
c 1 0   0    1 = 1 - ok

Again, you have to specify the type of each dependency because it determines the signs of the coefficients, the type of inequality/equality and the value of the right hand side (RHS). So for example row 1 in your matrix, what is the dependency relationship between b and c? And also the other rows?

@RKonstantinR
Copy link
Author

Again, you have to specify the type of each dependency because it determines the signs of the coefficients, the type of inequality/equality and the value of the right hand side (RHS). So for example row 1 in your matrix, what is the dependency relationship between b and c? And also the other rows?

Projects depend on each other if the matrix at their intersection has a value greater than 0. For example:

First row (project a):
  a b   c    d   Sum
a 0 0.5 0.5  0   1 - project "a" depends on project "b" and "c"

Second row (project b):
  a b   c    d   Sum
b 1 0   0    0   1 - project "b" depends on project "a"

Last row (project d):
  a b   c    d  Sum
d 0 0   0    0  0 - project "d" does not depend on any project

For example, let’s take three projects: “a”, “b”, “d” and do this check manually. To do this, you need to take the original matrix, exclude the row and column with project “с” and calculate the sum for each row.

  a b   d    Sum
a 0 0.5 0    0.5 < 1 - error
b 1 0   0    1 = 1 - ok
d 0 0   0    0 = 1 - ok

Project "a" depends on project "c" and if it is excluded, the row sum for project "a" will be less than 1.

@sbmack
Copy link

sbmack commented Apr 16, 2024

Projects depend on each other if the matrix at their intersection has a value greater than 0. For example:

But exactly HOW do they depend on each other?

If project a requires project b and project c then that maps to case 1 above:
x1 requires x2:
x1 <= x2
x1 – x2 <= 0

So for the a requires b AND c case could be written as:

a - b <= 0
a - c <= 0
or
2*a - b - c <= 0

So you can see that the coefficients, inequality and RHS are dependent on the type of constraint.

@RKonstantinR
Copy link
Author

So you can see that the coefficients, inequality and RHS are dependent on the type of constraint.

Thank you for your help and patience.

This is my first experience using this package, as well as designing MILP models, so a number of my questions may be quite primitive or I may be mistaken.

The inequality you proposed for project "b" correctly describes the dependencies.

But this is only true for one project; for other projects it will be different.

What if there are 1000 such projects? for each of them is it necessary to write down such an inequality?

Or should I specify a matrix as the decision variable?

@sbmack
Copy link

sbmack commented Apr 16, 2024

Or should I specify a matrix as the decision variable?

You could use matrices with 1 matrix block for each constraint type. The values in a matrix are the coefficients of the contraints. So with: the a requires b AND c case could be written as:

a - b <= 0
a - c <= 0
or
2*a - b - c <= 0

Better to use the 2 constraint version with the coefficients being:

a b c d
1, -1, 0, 0
1, 0, -1, 0

For the block of constraints the inequality would be <= and the RHS would be 0

@RKonstantinR
Copy link
Author

For the block of constraints the inequality would be <= and the RHS would be 0

Thanks to your hint, I improved the coefficient matrix and constrain type.

The a requires b AND c and they are completely dependent on each other:
a + b + c == 1 or 1 - a - b - c == 0

Coefficients would be:

coef_matrix <- matrix(
  c(1/3, 1/3, 1/3, 0, 0.5, 0.5, 0, 0, 0.5, 0, 0.5, 0, 0, 0, 0, 1), 
  nrow = 4,   
  ncol = 4,        
  byrow = TRUE         
)

rownames(coef_matrix ) <- c("a", "b", "c", "d")

colnames(coef_matrix ) <- c("a", "b", "c", "d")

Then I try to set the constrain:

sum(coef_matrix[,1]) + sum(coef_matrix[,2]) + sum(coef_matrix[,3]) == max_project
[1] TRUE
sum(coef_matrix[,1]) + sum(coef_matrix[,2]) + sum(coef_matrix[,4]) == max_project
[1] FALSE

But I get an error Error in add_constraint(): ! Some constraints are not proper linear constraints or are not true.:

library(dplyr)
library(ROI)
library(ROI.plugin.symphony)
library(ompr)
library(ompr.roi)
library(ROI.plugin.glpk)

project_list <- tibble(name = c("a", "b", "c", "d"), profit = c(100, 10, 5, 50)) %>% 
  mutate(item = row_number())

max_project <- 3

coef_matrix <- matrix(
  c(1/3, 1/3, 1/3, 0, 0.5, 0.5, 0, 0, 0.5, 0, 0.5, 0, 0, 0, 0, 1), 
  nrow = 4,   
  ncol = 4,        
  byrow = TRUE         
)

rownames(coef_matrix) <- c("a", "b", "c", "d")
colnames(coef_matrix) <- c("a", "b", "c", "d")

m <- MIPModel() %>%
  add_variable(x[i], i=project_list$item, type="binary") %>%
  add_constraint(sum_over(coef_matrix[,i], i = project_list$item) == max_project) %>% 
  add_constraint(sum_over(x[i], i=project_list$item) <= max_project) %>% 
  set_objective(sum_over(project_list$profit[i] * x[i], i=project_list$item),"max") %>%
  solve_model(with_ROI(solver = "symphony", verbosity=1))

@sbmack
Copy link

sbmack commented Apr 17, 2024

But I get an error Error

You have a bunch of errors in your model. Here's an update:

project_list <- tibble(name = c("a", "b", "c", "d"), profit = c(100, 10, 5, 50)) %>% 
  mutate(item = row_number())

nprojs <- nrow(project_list)

max_project <- 3

# coef_matrix <- matrix(
#   c(1/3, 1/3, 1/3, 0, 0.5, 0.5, 0, 0, 0.5, 0, 0.5, 0, 0, 0, 0, 1), 
#   nrow = 4,   
#   ncol = 4,        
#   byrow = TRUE         
# )

# rownames(coef_matrix) <- c("a", "b", "c", "d")
# colnames(coef_matrix) <- c("a", "b", "c", "d")

# constraint vector:
cvec <- c(1, 1, 1, 0)


m <- MIPModel() %>%
  # add_variable(x[i], i=project_list$item, type="binary") %>%
  add_variable(x[i], i = 1:nprojs, type = 'binary') |> 
  # add_constraint(sum_over(coef_matrix[,i], i = project_list$item) == max_project) %>% 
  add_constraint(sum_over(cvec[i] * x[i], i = 1:nprojs) == 1) |> 
  # add_constraint(sum_over(x[i], i=project_list$item) <= max_project) %>% 
  add_constraint(sum_over(x[i], i = 1:nprojs) <= max_project) |> 
  # set_objective(sum_over(project_list$profit[i] * x[i], i=project_list$item),"max")
  set_objective(sum_over(project_list$profit[i] * x[i], i = 1:nprojs),"max")

  result <- solve_model(m, with_ROI(solver = "symphony", verbosity=1))
  1. You must index over a range. You indexed over a single value
  2. The range is set from 1 to the number of projects (nprojs)
  3. All constraints must be summed over the declared variable x. a, b, c are not variables
  4. The binary side constraint coefficients are contained in a vector (cvec) since there is only 1
  5. Note that your a + b + c = 1 constraint forces the solution to be less than 3 projects, so the max_project constraint in inactive
  6. I separated the model execution from the piping and set the value to result because it's easier to review post-optimization.

@RKonstantinR
Copy link
Author

RKonstantinR commented Apr 18, 2024

Thank you for your feedback.

I was able to add the required constraint to the model (#259 (comment)).

Here is the working code:

library(dplyr)
library(ROI)
library(ROI.plugin.symphony)
library(ompr)
library(ompr.roi)
library(ROI.plugin.glpk)

project_list <- tibble(name = c("a", "b", "c", "d"), 
                       profit = c(100, 10, 5, 50)) %>% 
  mutate(item = row_number(),
         num = 1)

max_project <- 3

coef_matrix <- matrix(
  c(1/3, 1/3, 1/3, 0, 0.5, 0.5, 0, 0, 0.5, 0, 0.5, 0, 0, 0, 0, 1), 
  nrow = 4,   
  ncol = 4,        
  byrow = TRUE         
)

rownames(coef_matrix) <- c("a", "b", "c", "d")
colnames(coef_matrix) <- c("a", "b", "c", "d")

sum_func <- function(i){
  a_sub <- coef_matrix[, i] 
  a_vec <- sum(as.vector(a_sub)) 
  return(a_vec)
}

m <- MIPModel() %>%
  add_variable(x[i], i=project_list$item, type="binary") %>%
  add_constraint(sum_over(sum_func(i) * x[i], i = project_list$item) == sum_over(project_list$num[i] * x[i], i=project_list$item)) %>%
  add_constraint(sum_over(x[i], i=project_list$item) <= max_project) %>% 
  set_objective(sum_over(project_list$profit[i] * x[i], i=project_list$item),"max")

result <- solve_model(m, with_ROI(solver = "symphony", verbosity=1))

@sbmack
Copy link

sbmack commented Apr 18, 2024

Thank you for your feedback.

The use of the coef_matrix does not make sense since you have only a single constraint. Moreover you stated:

The a requires b AND c and they are completely dependent on each other:
a + b + c == 1 or 1 - a - b - c == 0

What you are saying there is that a OR b OR c must be selected. That means only 1 solution variable of the 3. Since d has no dependencies, you do not have to include d in a constraint. Simple visual inspection of your profit vector shows that a and d should be selected since a = 100 and d = 50. However, run the code:

get_solution(result, x[i]) and you see that a, b, c are selected which violates your intention.

Please clarify what you really want.

@RKonstantinR
Copy link
Author

The model generates the right solutions for my task.

Perhaps I didn't describe the constraints quite correctly:

a + b + c + d + ... + n == 1 - This condition must be met for each row of the matrix

I will try to describe the problem in simple words without mathematical equations:

It is required to select no more than three out of four projects that will bring maximum profit, and if the projects depend on each other, then they can only be included together.

In the given example, the solution of the model is projects a, b and с. These are three interdependent projects with a total profit of 115.

If change the conditions and assign a profit of 500 to the project d, then the solution of the model will be the project d. Thus, the model takes into account the dependency matrix.

project_list$profit[4] <- 500

get_solution(result, x[i])
  variable i value
1        x 1     0
2        x 2     0
3        x 3     0
4        x 4     1

@sbmack
Copy link

sbmack commented Apr 18, 2024

The model generates the right solutions for my task.

That constraint type is usually solved using auxiliary binary variables:

project_list <- tibble(name = c("a", "b", "c", "d"), profit = c(100, 10, 5, 50)) %>% 
  mutate(item = row_number())

nprojs <- nrow(project_list)

max_project <- 3

cvec <- c(1, 1, 1, 0)
abcsum <-3

m <- MIPModel() %>%
  add_variable(x[i], i = 1:nprojs, type = 'binary') |>
  add_variable(abc, type = 'binary') |> 
  add_constraint(sum_over(cvec[i] * x[i], i = 1:nprojs) - abcsum * abc == 0) |> 
  add_constraint(sum_over(x[i], i = 1:nprojs) <= max_project) |> 
  set_objective(sum_over(project_list$profit[i] * x[i], i = 1:nprojs),"max")

wd <- getwd()
file_out <- paste(wd, "test.txt", sep = "/")
ROI_write(as_ROI_model(m), file_out, "lp_cplex")

result <- solve_model(m, with_ROI(solver = "symphony", verbosity=1))

In the above code the defined coefficient abcsum is k of j linked projects permitted in the solution. In this case all 3, a, b, c. If you changed abcsum to 2, only 2 out of 3 would be selected if any. abc is the auxillary binary variable. The secondary constraint ensures that the linked projects are selected as a group or not at all. Note that no maxtrix as input or link to an external function are required.

@RKonstantinR
Copy link
Author

Your code works on this data, but I tried to add another group of interdependent projects (e and f) and the solution did not match my expectations.

project_list <- tibble(name = c("a", "b", "c", "d", "e", "f"), 
                                 profit = c(100, 10, 5, 50, 100, 15)) %>% 
                       mutate(item = row_number())

nprojs <- nrow(project_list)
max_project <- 3

cvec <- c(1, 1, 1, 0, 1, 1)
abcsum <- 1

I tried to use the model on real data. I have a data set for 1800 projects and a dependency matrix of 1800x1800.

I waited about 4 hours and, without waiting for an answer, stopped the calculation.

Is there a way to estimate the approximate time it will take to find a solution? Or maybe it's worth trying to use a different solver?

@sbmack
Copy link

sbmack commented Apr 19, 2024

Your code works on this data, but I tried to add another group of interdependent projects (e and f) and the solution did not match my expectations.

It worked for me:

> result
Status: success
Objective value: 150
> get_solution(result, x[i])
  variable i value
1        x 1     1
2        x 2     0
3        x 3     0
4        x 4     1
5        x 5     0
6        x 6     0

Your constraint says that projects 1:3, 5:6 are linked. Only one project of those 5 can be selected, so project 1 and project 4 which has no dependencies are basic. The solution is degenerate since project 5 has the same profit coefficient as project 1. A MIP problem with 1800 variables and 1800 binary constraints is considered small and should solve very quickly with any solver.

@RKonstantinR
Copy link
Author

Your constraint says that projects 1:3, 5:6 are linked. Only one project of those 5 can be selected, so project 1 and project 4 which has no dependencies are basic.

There are two groups of interdependent projects in the data set (c(a, b, c) & c(e, f) and one independent project (d).

One of the constrain is “all projects from the same group or none must be selected”.

The correct solution would be to select project d and a group of projects e and f. Changing the abcsum parameter does not produce the desired result.

Model calculation parameters

Starting Preprocessing...
Preprocessing finished...
 	 variables fixed: 569
	 variables aggregated: 568
Problem has 
	 2 constraints 
	 1226 variables 
	 2398 nonzero coefficients

Total Presolve Time: 0.000000...

Solving...

granularity set at 1.000000
solving root lp relaxation
The LP value is: -341194.000 [0,182]

@sbmack
Copy link

sbmack commented Apr 19, 2024

I

There are two groups of interdependent projects in the data set (c(a, b, c) & c(e, f) and one independent project (d).

One of the constrain is “all projects from the same group or none must be selected”.

It should be obvious that you have to add a second auxiliary constraint.

@RKonstantinR
Copy link
Author

It should be obvious that you have to add a second auxiliary constraint.

I modified your solution to take into account the dependency matrix. This could only be achieved using a loop.
Thanks for the help.

project_list <- tibble(name = c("a", "b", "c", "d", "e", "f"), 
                       profit = c(200, 10, 5, 50, 100, 15)) %>% 
  mutate(item = row_number())

nprojs <- nrow(project_list)

max_project <- 3

coef_matrix <- matrix(
  c(1, 1, 1, 0, 0, 0, 
    1, 1, 1, 0, 0, 0,
    1, 1, 1, 0, 0, 0,
    0, 0, 0, 1, 0, 0,
    0, 0, 0, 0, 1, 1,
    0, 0, 0, 0, 1, 1), 
  nrow = 6,   
  ncol = 6,        
  byrow = TRUE         
)

m <- MIPModel() %>%
  add_variable(x[i], i = 1:nprojs, type = 'binary') |>
  add_constraint(sum_over(x[i], i = 1:nprojs) <= max_project) |> 
  set_objective(sum_over(project_list$profit[i] * x[i], i = 1:nprojs),"max")

for(j in 1:nprojs) {
  
  m <- add_constraint(m, sum_over(coef_matrix[i,j] * x[i], i = 1:nprojs) - sum(coef_matrix[,j]) * x[j] == 0)
}

result <- solve_model(m, with_ROI(solver = "symphony", verbosity=1))

@sbmack
Copy link

sbmack commented Apr 24, 2024

I modified your solution to take into account the dependency matrix.

It's not clear to me what you are trying to do. I ran the model and the solution is a, b, c = 1. In coef_matrix constraint rows 1:3 are identical and so are rows 5:6. If your intent is a + b + c <= 1 and e + f <= 1 it can be formulated like this:

project_list <- tibble(name = c("a", "b", "c", "d", "e", "f"), 
                       profit = c(200, 10, 5, 50, 100, 15)) %>% 
  mutate(item = row_number())

nprojs <- nrow(project_list)

max_project <- 3

coef_matrix <- matrix(
  c(1, 1, 1, 0, 0, 0, 
    0, 0, 0, 0, 1, 1), 
  ncol = 6,        
  byrow = TRUE         
)

crows = nrow(coef_matrix)

m <- MIPModel() %>%
  add_variable(x[i], i = 1:nprojs, type = 'binary') |>
  add_constraint(sum_over(x[i], i = 1:nprojs) <= max_project) |> 
  set_objective(sum_over(project_list$profit[i] * x[i], i = 1:nprojs),"max") |> 
  
  add_constraint(sum_over(coef_matrix[i, j] * x[j], j = 1:nprojs) <= 1, i = 1:crows)

result <- solve_model(m, with_ROI(solver = "symphony", verbosity=1))

You don't need a for loop. You index down rows of constraints by indexing on the RHS of the constraint set. In this case i = 1:crows.

You're welcome...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants