Maximally Diverse Grouping Problem

Gallego, Laguna, Martí and Duarte (2011)
Optsicom project, University of Valencia (Spain)

Problem Description

The maximally diverse grouping problem (MDGP) consists of grouping a set of M elements into Gmutually disjoint groups in such a way that the diversity among the elements in each group is maximized. The diversity among the elements in a group is calculated as the sum of the individual distance between each pair of elements, where the notion of distance depends on the specific application context. The objective of the problem is to maximize the overall diversity, i.e., the sum of the diversity of all groups. Feo and Khellaf (1990) proved that the MDGP is NP-hard.

In order to formulate the MDGP in mathematical terms, we assume that each element can be represented by a set of attributes. Let sik be the state or value of the kth attribute element, where k= 1,…,K and i=1,…,M. Then, the distance dij between element i and j may be simply defined by the Eucliden calculation:

We have identified two variants of the MDGP. The first one (MDGP1) is the better known and forces all groups to have the same number of elements, with S=M/G. The second variant (MDGP2) allows the size Sg of each group g to be in the interval [ag,bg], where ag <= bg for g = 1,…,G. Clearly, MDGP1 is a special case of the MDGP2 for which ag = bg for all g.

Both variants can be formulated as quadratic integer programs with binary variables xig that take the value of 1 if element i is in group g and 0 otherwise. A quadratic integer programming formulation of the MDGP1 is:

The objective function adds the distance of all pairs of elements that belong to the same group. The first set of constraints forces the assignment of each element to a group. The second set of constraints forces the size of all groups to be equal to S. In the more general case, MDGP2, the second set of constraints is replaced with:

State of the Art Methods

The most relevant approximated methods developed to solve this problem are:

  • LCW: Multistart algorithm composed by a random constructive and an improvement method. The improvement method is developed as a modified version of LC method due to Lofti and Cerveny (1991). Weitz and Lakshminarayanan (1998)
  • LSGA: Hybrid genetic algorithm that combines a genetic algorithm and a local search procedure. The genetic aspect of LSGA is based on the encoding scheme for grouping problems proposed by Falkenauer (1998). The local search within LSGA implements a best improvement strategy based on exchanging elements between groups. Fan et al. (2011)
  • T‐LCW: Multistart algorithm composed by a greedy constructive (GC) and T-LCW, a tabu version of LCW improvement method. Gallego et al. (2011)
  • SO: Multistart algorithm composed by a greedy constructive (GC) and strategic oscillation improvement based on T-LCW. Gallego et al. (2011)

A Java 6 implementation of all state of the art methods is available for download as binary. For usage instructions execute .jar file with the command:

java -jar mdgp_jors_2011.jar

Please make sure that you have Java 6 SE installed in your system.

Instances (MDGPLIB)

We have compiled a comprehensive set of 480 benchmark instances representative of the collections previously used for computational experiments in the MDGP. We call this benchmark MDPGLIB. Furthermore we have included new hard instances. A brief description of the origin and the characteristics of the set of instances follows:

  • RanReal: This set consists of 160 M x M matrices in which the distance values dij are real numbers generated using a Uniform distribution U(0,100). The number of elements M, the number of groups Gand the limits of each group ag and bg are shown in Table 1. There are 20 instances for each combination of parameters (i.e., each row in Table 1), 10 for instances with equal group size (EGS) and 10 for instances with different group size (DGS). For the 10 instances in EGS, the group size is equal for all instances and is calculated as ag = ag = M / G. For the 10 instances in DGS, the limits of each group (ag and bg) for each instance are generated randomly in the predefined interval. That is, the value of ag is generated in the interval [agmin,agmax] and the value of bg is generated in the interval [bgmin,bgmax]. This data set was introduced by Fan et al. (2011) with M ranging from 10 to 240. The larger instances with M=480 and M=960 had been generated for inclusion in MDGPLIB.

Table 1. Summary of parameters to generate problem instances

  • RanInt: This set has the same structure and size as RanReal (shown in Table 1) but distances are generated with an integer Uniform distribution.
  • Geo: This set follows the same structure and size as the previous two, however, values are calculated as Euclidean distances between pair of points with coordinates randomly generated in [0,10]. The number of coordinates for each point is generated randomly in the 2 to 21 range.

Download MDGPLIB. The format of the instances is described here

Computational Experience

We have performed a computational comparison of the state of the art methods on 240 instances (5 EGS instances and 5 DGS instances for M=10 to 960). All methods were stopped using the same time limit, which varied according to problem size, as specified in Table 2.

Table 2. CPU time limits

In this experiment we compute for each instance and each method the relative deviation Dev (in percent) between the best solution value, Value, obtained with the method and the best known value of this instance, BestValue. For each method, we also report the number of instances #Best for which the method obtains the best value.

All algorithms were implemented in Java 6 and the experiment was performed in a Intel Core 2 Quad CPU Q 8300 with 6 GiB of RAM and Ubuntu 9.04 64 bits OS.

Download the raw data used to calculate this table.


References

  • Fan, Z. P., Y. Chen, J. Ma and S. Zeng (2011) “A hybrid genetic algorithmic approach to the maximally diverse grouping problem”, Journal of the Operational Research Society, vol. 62, pp. 92‐99.
  • Feo, T. and M. Khellaf (1990) “A class of bounded approximation algorithms for graph partitioning”, Networks, vol. 20, pp. 181‐195.
  • Lotfi V. and R. Cerveny (1991) “A final exam scheduling package”, Journal of the Operational Research Society, vol. 42, pp. 205‐216.
  • Weitz, R. R. and S. Lakshminarayanan (1998) “An empirical comparison of heuristic methods for creating maximally diverse groups”, Journal of the Operational Research Society, vol. 49, no. 6, pp. 635‐646.
  • Gallego, M., Laguna, M., Martí, R. and Duarte, A. (2011) “Tabu Search with Strategic Oscillation for the Maximally Diverse Grouping Problem”, Journal of the Operational Research Society. Accepted. Download.

Maximum Diversity Problem

Martí, Gallego and Duarte (2010)
Optsicom project, University of Valencia (Spain)

Problem Description

The maximum diversity problem (MDP) consists of selecting a subset of m elements from a set of nelements in such a way that the sum of the distances between the chosen elements is maximized. The definition of distance between elements is customized to specific applications. In most applications, it is assumed that each element can be represented by a set of attributes. Let sik be the state or value of the kth attribute of element i, where k = 1, …, K. Then the distance between elements i and j may be defined as:

In this case, dij is simply the Euclidean distance between i and j. The distance values are then used to formulate the MDP as a quadratic binary problem, where variable xi takes the value 1 if element i is selected and 0 otherwise, i = 1, …, n:

Kuo, Glover and Dhir (1993) use this formulation to show that the clique problem (which is known to be NP-complete) is reducible to the MDP.

State of the Art Methods

The most relevant approximated methods developed to solve this problem, categorized by methodology, are:

  • GRASP based methdos
    • GRASP_D2+I_LS: Constructive method coupled with a first-improvement local search. Duarte and Martí (2007).
    • GRASP+PR: Hybrid method that combines GRASP methodology with Path-Relinking. Silva et al. (2007).
  • Local search based methods
    • ITS: Iterated Tabu Search that alternates tabu search with perturbation procedures. Palubeckis (2007).
    • A_VNS: Variable neighborhood search method with a basic tabu search improvement method. Aringhieri and Cordone (2006).
  • Population based methods
    • G_SS: Scatter Search algorithm with memory structures. Gallego et al. (2009).
    • MA: Memetic Algorithm that combines a population based approach with a local search approach. Katayama and Narihisa (2005).

Instances (MDPLIB)

We have compiled a comprehensive set of benchmark instances representative of the collections previously used for computational experiments in the MDP. We call this benchmark MDPLIB. Furthermore we have included new hard instances. A brief description of the origin and the characteristics of the set of instances follows:

  • SOM: This data set consists of 70 matrices with random numbers between 0 and 9 generated from an integer uniform distribution.
    • SOM-a: These 50 instances were generated by Martí et al. (2010) with a generator developed by Silva et al. (2004). The instance sizes are such that for n = 25, m = 2 and 7; for n = 50, m = 5 and 15; for n = 100, m = 10 and 30; for n = 125, m = 12 and 37; and for n = 150, m = 15 and 45.
    • SOM-b: These 20 instances were generated by Silva et al. (2004) and used in most of the previous papers (see for example Aringhieri et al. 2008). The instance sizes are such that for n = 100, m = 10, 20, 30 and 40; for n = 200, m = 20, 40, 60 and 80; for n = 300, m = 30, 60, 90 and 120; for n= 400, m = 40, 80, 120, and 160; and for n = 500, m = 50, 100, 150 and 200.
  • GKD: This data set consists of 145 matrices for which the values were calculated as the Euclidean distances from randomly generated points with coordinates in the 0 to 10 range.
    • GKD-a: Glover et al. (1998) introduced these 75 instances in which the number of coordinates for each point is generated randomly in the 2 to 21 range. The instance sizes are such that for n = 10, m = 2, 3, 4, 6 and 8; for n = 15, m = 3, 4, 6, 9 and 12; and for n = 30, m = 6, 9, 12, 18 and 24.
    • GKD-b: Martí et al. (2010) generated these 50 matrices for which the number of coordinates for each point is generated randomly in the 2 to 21 range and the instance sizes are such that for n = 25, m = 2 and 7; for n = 50, m = 5 and 15; for n = 100, m = 10 and 30; for n = 125, m = 12 and 37; and for n = 150, m = 15 and 45.
    • GKD-c: Duarte and Martí (2007) generated these 20 matrices with 10 coordinates for each point and n = 500 and m = 50.
  • MDG: This data set consists of 100 matrices with real numbers randomly selected between 0 and 10 from a uniform distribution.
    • MDG-a: Duarte and Martí (2007) generated these 40 matrices, 20 of them with n = 500 and m = 50 and the other 20 with n = 2000 and m = 200. These instances were used in Palubeckis (2007).
    • MDG-b: This data set consists of 40 matrices generated by Duarte and Martí (2007). 20 of them have n = 500 and m = 50, and the other 20 have n = 2000 and m = 200. These instances were used in Gallego et al. (2009) and Palubeckis (2007).
    • MDG-c: This is a new data set with 20 matrices with n = 3000 and m ∈ {300, 400, 500, 600}. These are the largest instances reported of the MDPLIB. They are similar to those used in Palubeckis (2007).

The MDPLIB contains 315 instances and the best known values for the largest ones (SOM-b, GKD-c, MDG-a, MDG-b and MDG-c). They have been obtained by running G_SS and ITS for 2 hours of CPU time (in each instance) in a workstation Intel Core 2 Quad CPU Q 8300 with 6 GiB of RAM and Ubuntu 9.04 64 bits OS.

Download MDPLIB.

Computational Experience

We have performed a computational comparison of the state of the art methods on several of these the instances. We have considered a time limit in our comparative of 10 minutes per instance. In this experiment we compute for each instance and each method the relative deviation Dev (in percent) between the best solution value, Value, obtained with the method and the best known value of this instance, BestValue. For each method, we also report the number of instances #Best for which the method obtains the best value.

The experiment was performed in a Intel Core 2 Quad CPU Q 8300 with 6 GiB of RAM and Ubuntu 9.04 64 bits OS.

References

  • R. Aringhieri, R. Cordone, and Y. Melzani. Tabu search vs. grasp for the maximum diversity problem. 4OR: A Quarterly Journal of Operations Research, 6(1):45–60, 2008.
  • R. Aringhieri and R. Cordone. Better and faster solutions for the maximum diversity problem. Technical report, Universit degli Studi di Milano, Polo Didattico e di Ricerca di Crema, 2006.
  • A. Duarte and R. Martí. Tabu search and grasp for the maximum diversity problem. European Journal of Operational Research, 178:71–84, 2007.
  • M. Gallego, A. Duarte, M. Laguna, and R. Martí. Hybrid heuristics for the maximum diversity problem. Computational Optimization and Applications, 44(3):411, 2009.
  • F. Glover, C.C. Kuo and K.S. Dhir. Heuristic algorithms for the maximum diversity problem. Journal of Information and Optimization Sciences, 19(1):109–132, 1998.
  • K. Katayama and H. Narihisa. An evolutionary approach for the maximum diversity problem. In W. Hart, N. Krasnogor, and J.E. Smith, editors, Recent advances in memetic algorithms, volume 166, pages 31–47. Springer, Berlin, 2005.
  • C.C. Kuo, F. Glover, and K. S. Dhir. Analyzing and Modeling the Maximum Diversity Problem by Zero-One programming. Decision Sciences, 24(6):1171–1185, 1993.
  • R. Martí, M. Gallego, and A. Duarte. A branch and bound algorithm for the maximum diversity problem. European Journal of Operational Research, 200(1):36–44, 2010.
  • G. Palubeckis. Iterated tabu search for the maximum diversity problem. Applied Mathematics and Computation, 189:371383, 2007.
  • G.C. Silva, L.S. Ochi, and S.L. Martins. Experimental Comparison of Greedy Randomized Adaptive Search Procedures for the Maximum Diversity Problem. In Experimental and Efficient Algorithms, volume 3059 of Lecture Notes in Computer Science, pages 498–512. Springer Berlin / Heidelberg, 2004.
  • G.C. Silva, M.R.Q. Andrade, L.S. Ochi, S.L. Martins, and A. Plastino. New heuristics for the maximum diversity problem. Journal of Heuristics, 13(4):315–336, 2007.

MaxCut Problem

Martí, Duarte and Laguna (2009)
University of Valencia (Spain), University Rey Juan Carlos (Spain) and University of Colorado at Boulder (USA)

Problem Description

Consider a graph G = (VE) with vertex set V = [1,…,n] and edge set E. Let wij be the weight associated with edge (ij) in E. A cut(SS’) is a partition of V into two sets SS’ = V \ S, and its value cut(SS’) is given by the expression:

The max-cut problem consists of finding a cut in G with maximum value. Karp (1972) showed that the max-cut is an NP-hard problem.

State of the Art Methods

The most relevant metaheuristics developed to solve this problem are:

  • SS: Scatter Search. Martí et al (2009)
  • CirCut: Heuristic based on problem relaxations. Burer et al (2001)
  • VNSPR: Variable Neighbourhood Search and Path Relinking. Festa et al (2002)

MaxCut Instances

We have compiled a comprehensive set of benchmark problems including all problem instances which have so far been used for conducting computational experiments for the MaxCut problem. We give a brief description of the origin and the characteristics of the groups of problems.

  • Set 1:This set consists of instances generated by Helmberg and Rendl (2000) who used rudy, a machine-independent graph generator by Giovanni Rinaldi, to create 54 instances ranging from n = 800 to 3,000. They consist of toroidal, planar, and random graphs with weights taking the values 1, 0, or −1. Burer et al. (2001) and Festa et al. (2002) used these graphs in their experiments. Download.
  • Set2: This set contains 30 instances described in Festa et al. (2002). The first 10 are small-size instances with an average of 128 vertices and 300 edges, the second 10 instances are medium-size graphs with 1,000 vertices and density equal to 0.60%, and the last 10 are large-size instances with 2,744 vertices and density equal to 0.22%. The weight values are either 1, 0, or −1. Download.
  • Set 3: The four instances in this set come from the 7th DIMACS Implementation Challenge. The number of vertices ranges from 512 to 3,375 and the number of edges from 1,536 to 10,125. Burer et al. (2001) used all four instances in their experiments. Festa et al. (2002) considered only one.Download.

Computational Experience

We have performed an intensive experimentation with the best know methods to compute the best known values for the instances. We have also reviewed related literature searching for better values than these. The final best known values for the instances can be downloaded here.

We performed a computational comparison of the state of the art methods on the instances. The experiments were performed in a Dual Intel Xeon at 3.06 GHz with 3.2 GiB of RAM. It can be downloaded in Excel.


References

  • Burer, S., R. D. C. Monteiro, Y. Zang. 2001. Rank-two relaxation heuristics for max-cut and other binary quadratic programs. SIAM J. Optim. 12 503–521
  • Festa, P., P. M. Pardalos, M. G. C. Resende, C. C. Ribeiro. 2002. Randomized heuristics for the max-cut problem. Optim. Methods Software 7 1033–1058
  • Helmberg, C., F. Rendl. 2000. A spectral bundle method for semidefinite programming. SIAM J. Optim.10 673–696
  • Karp, R. M. 1972. Reducibility among combinatorial problems. R. Miller, J. Tatchers, eds. Complexity of Computer Computations. Plenum Press, New York, 85–103
  • Martí, R., A. Duarte, M. Laguna. 2009. Advanced Scatter Search for the Max-Cut Problem. INFORMS Journal on Computing 21(1), pp. 26–38

Equitable Dispersion Problem

Martí R., Sandoya F. (2012)

University of Valencia (Spain), Escuela Superior Politécnica del Litoral (Ecuador)

Problem Description

The Equitable Dispersion Problem (EDP) consists in selecting a subset of elements, whose cardinality is not fixed in advance, from a given set in such a way that maximizes the equity among elements. There are several equity measures in the framework of the EDP. These equity measures may also arise in the contexts ofdiverse/similar group selection,identifying diverse groups of people, and diversificationor assimilation among members of a network.

In particular, we target the Max-Mean dispersion model in which the average distance between the selected elements is maximized, this measure is used to balance the dispersion among individual elements. This model is especially suited for instances in which the distances represent affinity and are not restricted to take non-negative values.

In mathematical terms, given a set N of n elements and dij the affinity between any elements i and j, the Max-Mean DP consists of selecting a subset M of N in such a way that the dispersion mean dm(M), in terms of the affinity values, is maximized.

The dm(M) value reflects an equity measure based on an average dispersion. The Max-Mean Dispersion Problem can be trivially formulated with binary variables as:

Where variable xi takes the value 1 if element i is selected and 0 otherwise.  Note that the number of selected elements, |M|, is not set a priori as in the rest of diversity models, such as the MDP or the MMDP, where it is pre-specified as a problem constraint.

Prokopyev (2009) showed that the Max-Mean dispersion problem is strongly NP-hard if the distances (diversity measure) take both positive and negative values.

State of the Art Methods

The metaheuristics developed to solve the Max-Mean dispersion problem are:

  • GRASP1: GRASP based heuristic developed as a modified version of heuristic1 proposed by Prokopyev (2009)for solving Max-Minsum DP, which is a somewhat similar problem. Martí and Sandoya (2012)
  • GRASP2: GRASP based method developed as a modified version of GRASP_C2, proposed by Duarte and Martí (2007) for the Max-Sum model. Martí and Sandoya (2012)
  • GRASP3+PR: Hybrid method that combines GRASP methodology with a Variable Neighborhood Search improvement method and Path-Relinking. Martí and Sandoya (2012)
  • TS: Tabu Search with dynamic tenure and a diversification stage. Carrasco et al. 2014.

Instances

Most of the previous works on diversity limit themselves to problems with non-negative distances. However, as described in Glover et al. (1998), the diversity measure can be something in the nature of an affinity relationship, which expresses a relative degree of attraction between the elements as arises in settings with a behavioral component. We have constructed two sets of instances where, as explained above, the distances can take both negative and positive values.

  • Type I: This data set consists of 60 symmetric matrices with random numbers between -10 and 10 generated from a uniform distribution. They can represent the affinity between individuals in a social network. We generate 10 instances for each size of n = 20, 25, 30, 35, 150, 500, 750, and 1000.
  • Type II: This data set consists of 60 symmetric matrices with random numbers in [-10,-5] U [5,10]. These instances can reflect the polarization that occurs when people get together in groups. We can identify clusters of individuals with a high attraction within clusters and a high repulsion between clusters and with no room for indifference. We generate 10 instances for each size of n= 20, 25, 30, 35, 150, 500, 750, and 1000.

You can download the instances here.

Computational Experience

We performed extensive computational experiments, with 160 instances, the final optimum values or best values for the instances can be downloaded here.


References

Carrasco, R., Anthanh P. T., Gallego M., Gortázar F., Duarte A., Martí R. “Tabu Search for the Max-Mean Dispersion Problem“. Submitted to Knowledge Based Systems.

Martí R., Sandoya F. “GRASP and Path Relinking for the Equitable Dispersion Problem”. Computers & Operations Research, 2012.To appear

Prokopyev O.A., Kong N, Martinez-Torres DL. ”The equitable dispersion problem”, European Journal of Operational Research, 197: 59-67, (2009)

Duarte A, Martí R. “Tabu Search and GRASP for the Maximum Diversity Problem”, European Journal of Operational Research; 178, 71-84, (2007)

Page, S.E. (2007), The difference – How the power of diversity creates better groups, firms, schools, and societies, Princeton University Press.

Resende, M.G.C. and Ribeiro, C.C. “Greedy Randomized Adaptive Search Procedures”, State of-the-art Handbook in Metaheuristics, F. Glover and G. Kochenberger (Eds.), Kluwer Academic Publishers, Boston, pp. 219-250, (2001)

Glover, F., Kuo, C.C. and Dhir K.S., ”Heuristic Algorithms for the Maximum Diversity Problem”, Journal of Information and Optimization Sciences, vol. 19 (1), 109-132, (1998)

Min-Max Arc Crossing

T. Pastore, A. Martinez-Gavara, A. Napoletano, P. Festa, and R. Martí (2018)
Optsicom project, University of Valencia (Spain)

Problem Description

The crossing minimization problem in hierarchical digraphs has received a lot of attention. Even the problem in bipartite graphs has been extensively studied for more than 40 years, beginning with the Relative Degree Algorithm introduced in Carpano [6]. Early heuristics were based on simple ordering rules, reflecting the goal of researchers and practitioners of quickly obtaining solutions of reasonable quality. However, the field of optimization has recently evolved introducing complex methods, both in exact and heuristic domains. The crossing problem has benefited from these techniques, and advanced solution strategies have been proposed in the last 20 years to solve it.

We focus on a variant of the crossing problem recently introduced in Stallmann (2012). In particular, this author identified some applications in the context of VLSI circuits design in which it is more appropriate to minimize the maximum number of crossings over all edges (min-max) rather than the sum of the edge crossings over all the graph (min-sum). This application motivated the work by Stallman, originally called the bottleneck crossing minimization, where a heuristic algorithm is proposed to minimize the maximum number of crossings among all edges. More recently, Martí et al. (2018) applied a metaheuristic algorithm to solve the problem defined in Stallman’s work, which they called the max-min crossing problem. Remarkably, the solution of the min-max problem is also useful in general graph drawing softwares, where zooming highlights a specific area of the graph where it is then desirable to have a low number of crossings.


State of the Art Methods

Two previous heuristic algorithms were developed to solve this problem:

  • MCE: method based on the sifting principle, which performs passes over the layers of the graph, trying to relocate the endpoints of edges with an elevated number of crossing, until no further improvement can be found. Stallmann (2012)
  • SO: iterated greedy heuristic based on the Strategic Oscillation methodology. This procedure alternates three steps: constructive phase, neighborhood search and destructive phase, in which tries to improve the number of crossing of edges whose cost is close to the maximum (near critical edges). Martí et al. (2018).

Instances

All the instances employed in the experimentation were generated using the generator provided by Stallmann (2012). The benchmark set is made up by a total of 149 networks of heterogeneous size, with graph densities ranging from low values, with a minimum density of 1.5, to high values, with a maximum density of 14.5.

There is 1 folder in the MinMaxGDPlib.zip file: with 60 instances. The file format follows:

The first line has the following fields separed with spaces:

  • n: Integer indicating the total number of vertices in the graph
  • m: Integer indicating the total number of edges in the graph
  • k: Integer indicating the total number of layers in the graph

Starting from the second line, the number of nodes in each layer are reported. The nodes are numbered sequentially. Each of the following lines represent an edge among two adjacent layers of the graph, in the format: starting_node end_node. For example:6 4 2 3 3 1 5 2 4 3 6 3 5

is a graph with six total nodes, divided in two layers of three nodes each. The total number of edges is four, and the vertices in the first layer are labeled as 1 2 3, while the vertices in the second layer are labeled 4 5 6. the connections are between node 1 and node 5, node 2 and node 4, node 3 and 6, and node 3 and 5.

Download Min-Max Arc Crossing Instances (MinMaxGDPlib).

Computational Experience

The computational experiments described in this section were performed to assess the performances of the Tabu Search implementation proposed in this paper. The comparative testing can be divided in a two-round competitive experimentation: in the first round of experiments, we investigate the comparison between CPLEX (v12.8), MCE and our Tabu Search, while solving small-sized instances with low density. In the second round of experiments we undertake the comparison of the three heuristic methods (MCE, SO and Tabu Search), while drawing more complex networks with higher graph densities.

Min-Max Arc Crossing results


References

  • Carpano (1980). Automatic display of hierarchized graphs for computer-aided decision analysis. IEEE Transactions on Systems, Man, and Cybernetics, 10(11):705-715.
  • Martí, R., V. Campos, A. Hoff, and J. Peiró (2018). Heuristics for the min-max arc crossing problem in graphs. Expert Systems with Applications 109, 100-113.
  • Stallmann, M.F. (2012). A heuristic for bottleneck crossing minimization and its performance on general crossing minimization: Hypothesis and experimental study. ACM Journal of Experimental Algorithms, 17(1): 1-30.

Dynamic Bipartite Graph Drawing Problem

Martí R., Martínez-Gavara, A., Sánchez-Oro J., and Duarte A. (2017)
Optsicom project, University of Valencia (Spain)

Problem Description

Drawings of graphs have many applications and they are nowadays well-established tools in computer science in general, and optimization in particular. Project scheduling is one of the many areas in which representation of graphs constitutes an important instrument. The experience shows that the main quality desired for drawings of graphs is readability, and crossing reduction is a fundamental aesthetic criterion to achieve it. Incremental or dynamic graph drawing is an emerging topic in this context, where we seek to preserve the layout of a graph over successive drawings. In this paper, we target the edge crossing reduction in the context of incremental graph drawing. Specifically, we apply a mathematical programming formulation and several heuristic methods based on the tabu search methodology to solve it. In line with the previous paper on this topic, we consider bipartite graphs in our experimentation. The extensive computational experiments with more than 1,000 instances show the superiority of our proposals in both, quality and computing time.


Instances (DBDPLIB)

We employed two sets of instances in our experimentation. The first one contains 120 instances generated according to Martí and Estruch (2001), while the second one has 1000 instances and of instances based on the original number of vertices in each layer, (n1, n2), and the graph 12 was proposed by Stallmann et al. (2001). In line with previous papers, we generated the first set density d in the interval [0.065, 0.175]. Additionally, as in Martí and Estruch (2001), the instances are incremented adding vertices and edges up to pre-established numbers. These numbers are calculated as a percentage δ of the quantities in the original graph (|IVi| = δ|Vi| for each i=1,2, and |IE|=δ|E|. The second set contains 1000 instances obtained with the generator described in Stallmann et al. (2001), which is publicly available. The size of the first layer is in the range [10, 377], while the size of second layer ranges from 10 to 190 nodes. The number of edges is in the range [20, 950]. These instances are bipartite graphs, and we convert them in incremental bipartite graphs by considering a percentage of their nodes as the new nodes added to the original graph. In this way, we kept the structure and density of the instance. In particular, for each original instance we have generated three new instances obtained by selecting as new nodes the 10%, 20%, and 30% percent of the original nodes.

The DBDPLIB contains 1120 instances:

Download DBDP instances (DBDPLIB).

Computational Experience

The computational experiments described in this section were performed to test the effectiveness and efficiency of the procedures discussed above. The previous GRASP method, called prev_GRASP, by Martí and Estruch (2001), and our new procedures were implemented in Java SE 8, and the experiments were conducted on a computer with a 2.8 GHz Intel Core i7 processor with 16 GB of RAM. In particular, we report the results obtained with our constructive method, tabu search, and path relinking post-processing. Additionally, the mathematical programming formulation described in Section 2 was solved with Gurobi

Dynamic Bipartite Drawing Problem results

Uncapacitated r-Allocation p-Hub Median Problem

Peiró J., Corberán A., Martí R. (2013)
Optsicom project, University of Valencia (Spain)

Problem Description

The Uncapactitated r-Allocation p-Hub Median Problem (UrApHMP) is a variant of the classical p-hub problem. Hub location problems arise when given a set of nodes with pairwise traffic demands, we have to choose p of them as hub locations and route all the traffic through these hubs at a minimum cost. For each pair of nodes i and j, there is a traffic tij that needs to be transported. It is generally assumed that direct transportation between non-hub nodes is not possible, and the tij traffic travels on a path i→hi→hj→j, where hi and hj are hubs assigned to i and j, respectively.

Recently, Yaman (2011) proposed the UrApHMP, a very interesting variant of this classical location problem in which each node can be connected to at most r of the p hubs. The motivation of this variant comes from the fact that the single allocation version, in which a node is connected (assigned) to a single hub is too restricted for real-world situations, while the multiple allocation variant, where each node can use any of the p hubs to route its traffic, results in high fixed costs and complicated networks. The r-allocation p-hub problem, being r≤p, generalizes both versions of the p-hub median problem. When r=1we are at the single allocation version, whereas if r=p, we have the multiple allocation version.

In mathematical terms, given a network with a set of nodes N and a set of arcs A, let tij be the amount of traffic to be routed from node i to node j, i.e., through the arc (i,j), and let dij be its associated unit routing cost. The UrApHMP is then formulated (Yaman, 2011) in terms of the following variables: Given a node kzkk=1 if the node is a hub (i.e., if a hub is set or located at this node), and zkk=0 otherwise. Given a non-hub node i and a hub kzik=1 if node i is assigned or allocated to node k, and 0 otherwise. Finally, fijkl is the proportion of the traffic tij from node i to node j that travels along the path i→k→l→j, where k and l are hubs. With these variables, the problem can be formulated as follows:

The p-hub median problem belongs to the class of NP-hard problems. Even when the set of hubs is given, the sub-problem of optimal allocation of non-hub nodes to hubs is also NP-hard (Love, Morris and Wesolowski, 1988).

State of the Art Methods

A GRASP in which we consider three local search procedures. We also implement a filtering mechanism to discard low-quality constructions to help saving running time for large instances.

A scatter search algorithm in which we consider seven diversification generation methods, and path-relinking as an extension of the classical combination method.

Instances

We have tested our algorithms on three sets of instances:

  1. The CAB (Civil Aviation Board) data set. It is based on airline passenger flows between some important cities in the United States. It consists of a data file, presented by O’Kelly in 1987, with the distances and flows of a 25 nodes graph. From this original file, 75 instances with 25 nodes and p= 1,…, 5, and r= 1,…, p have been generated by several authors. The following parameter values have been widely used: χ=1,δ=1, and α= 0.2, 0.4, 0.6, 0.8, and 1.
  2. The AP (Australian Post) data set. It is based on real data from the Australian postal service and was presented by Ernst and Krishnamoorthy in 1996. The size of the original data file is 200 nodes. Smaller instances can be obtained using a code from ORLIB. As with CAB, many authors have generated different instances from the original file. We have extended this set of instances by generating 360 instances with n= 40, 50, 70, 75, 80, 85, 90, 95, 100, 150 and 200 nodes. For those instances with 40≤n≤50, p ranges from 1 to 5. For those with 70≤n≤95, p ranges from 1 to 8, and for those with 100≤n≤200, p takes values between 1 and 20. In all these cases, r<∈{1,…,p}. According with previous articles, cost parameter values are χ=3, α=0.75 and δ=2. Regarding the flows between nodes, these instances do not have symmetric flows (i.e., for a given pair of nodes iand j,tij is not necessarily equal to tji). Moreover, flows from one node to itself can be positive (i.e., tiican be strictly positive for a given i).
  3. The USA423 data set. This is a new family of instances that we introduce here based on real airline data. It consists of a data file concerning 423 cities in the United States, where real distances and passenger flows for an accumulated 3 months period are considered. From the original data, 30 instances have been generated with p∈{3,4,5,6,7} and 2≤r≤p-1. For each combination of parameters p and r, two different values for χ,α,δ have been used: 0.1, 0.07, 0.09, and 0.09, 0.075, 0.08, respectively.

You can download the instances here.

Computational Experience

We performed extensive computational experiments with 465 instances. The best values for the hardest instances can be downloaded here.

References

  • Aiex, R. M., M. G. C. Resende, and C. C. Ribeiro. 2007. TTT plots: A perl program to create time-to-target plots. Optimization Letters 1, no. 4: 355-366.
  • Alumur, S., and Kara, B. Y. Network hub location problems: The state of the art. European Journal of Operational Research 190, 1 (2008), 1-21.
  • Campbell, J. F. 1994. Integer programming formulations of discrete hub location problems. European Journal of Operational Research 72, no. 2: 387-405.
  • Campbell, J. F. and M. E. O’Kelly. 2012. Twenty-five years of hub location research. Transportation Science 46, no. 2: 153-169.
  • Contreras, I., and Fernández, E. General network design: A unified view of combined location and network design problems. European Journal of Operational Research 219, 3 (2012), 680-697.
  • Correia, I., Nickel, S., and Saldanha da Gama, F. Hub and spoke network design with single-assignment, capacity decisions and balancing requirements. Applied Mathematical Modelling 35, 10 (2011), 4841-4851.
  • Díaz, J. A., and Fernández, E. Hybrid scatter search and path relinking for the capacitated p-median problem. European Journal of Operational Research 169, 2 (2006), 570-585.
  • Ernst, A. and M. Krishnamoorthy. 1996. Efficient algorithms for the uncapacitated single allocation p-hub median problem. Location Science 4, no. 3: 139-154.
  • Farahani, R. Z., Hekmatfar, M., Arabani, A. B., and Nikbakhsh, E. Hub location problems: A review of models, classification, solution techniques, and applications. Computers & Industrial Engineering 64, 4 (2013), 1096-1109.
  • Feo, T. A. and M. G. C. Resende. 1989. A probabilistic heuristic for a computationally difficult set covering problem. Operations Research Letters 8, no. 2: 67-71.
  • Feo, T. A. and M. G. C. Resende. 1995. Greedy randomized adaptive search procedures. Journal of Global Optimization 6, no. 2: 109-133.
  • Festa, P. and M. G. C. Resende. 2011. GRASP: Basic components and enhancements. Telecommunication Systems 46, no. 3: 253-271.
  • García, S., M. Landete, and A. Marín. 2012. New formulation and a branch-and-cut algorithm for the multiple allocation p-hub median problem. European Journal of Operational Research 220, no. 1: 48-57.
  • Gelareh, S. and S. Nickel. 2011. Hub location problems in transportation networks. Transportation Research Part E: Logistics and Transportation Review 47, no. 6: 1092-1111.
  • Glover, F. A template for scatter search and path relinking. In Artificial Evolution, J.-K. Hao, E. Lutton, E. Ronald, M. Schoenauer, and D. Snyers, Eds., Lecture Notes in Computer Science 1363, Springer (1998), pp. 13-54.
  • Glover, F. Tabu search and adaptive memory programing – advances, applications and challenges. In Interfaces in Computer Science and Operations Research (1996), Kluwer, pp. 1-75.
  • Glover, F., Laguna, M., and Martí, R. Fundamentals of scatter search and path relinking. Control and Cybernetics 29, 3 (2000), 652-684.
  • Ilić, A., D. Urosević, J. Brimberg, and N. Mladenović. 2010. A general variable neighborhood search for solving the uncapacitated single allocation p-hub median problem. European Journal of Operational Research 206, no. 2: 289-300.
  • Kratica, J., Z. Stanimirović, D. Tosić, and V. Filipović. 2007. Two genetic algorithms for solving the uncapacitated single allocation p-hub median problem. European Journal of Operational Research 182, no. 1: 15-28.
  • Labbé M., and Yaman, H. Solving the hub location problem in a star-star network. Networks 51, 1 (2008), 19-33.
  • Laguna, M. and R. Martí. 1999. GRASP and path relinking for 2-layer straight line crossing minimization. INFORMS Journal on Computing 11, no. 1: 44-52.
  • Laguna, M., and Martí, R. Scatter Search: Methodology and Implementations in C. Kluwer Academic Publishers (2002).
  • Love, R.F., Morris, J.G., and Wesolowski, G.O. 1988. Facilities location: Models and methods. Elsevier Science Publishing Co., New York.
  • Martí, R., Laguna, M., and Glover, F. Principles of scatter search. European Journal of Operational Research 169, 2 (2006), 359-372.
  • Milanović, M. 2010. A new evolutionary based approach for solving the uncapacitated multiple allocation p-hub median problem. In Soft Computing in Industrial Applications, AISC 75 (X. Z. Gao et al., eds.). Springer-Verlag, Berlin, pp: 81-88.
  • O’Kelly, M. E. 1987. A quadratic integer program for the location of interacting hub facilities. European Journal of Operational Research 32, no. 3: 393-404.
  • Pacheco, J. A., and Casado, S. Solving two location models with few facilities by using a hybrid heuristic: A real health resources case. Computers & Operations Research 32, 12 (2005), 3075-3091.
  • Resende, M. G. C., Ribeiro, C. C., Glover, F., and Martí, R. Scatter search and path-relinking: Fundamentals, advances, and applications. In Handbook of Metaheuristics, M. Gendreau and J. Y. Potvin, Eds., International Series in Operations Research & Management Science 146. Springer (2010), pp. 87-107.
  • Resende, M. G. C., and Werneck, R. F. A hybrid heuristic for the p-median problem. Journal of Heuristics 10, 1 (2004), 59-88.
  • Ribeiro, C. C., and Resende, M. G. C. Path-relinking intensification methods for stochastic local search algorithms. Journal of Heuristics 18, 2 (2012), 193-214.
  • Yaman, H. 2011. Allocation strategies in hub networks. European Journal of Operational Research 211, no. 3: 442-451.

Obnoxious p-Median Problem

Colmenar, J. M., Greistorfer, P., Martí, R., Duarte A. (2016)
Optsicom project, University of Valencia (Spain)

Problem Description

Let I be a set of clients, J a set of facilities, and dij the distance between the client i ∈ I and the facility j∈ J. The Obnoxious p-Median (OpM) problem consists in finding a set S with p facilities (with S ⊆ J and p< |J|), such that the sum of the minimum distance between each client and the set of facilities is maximized. In mathematical terms:

State of the Art Methods

The OpM problem was introduced in the 1990s (Cappanera, 1999, Eiselt, Laporte, 1995 and Welch, Salhi, 1997) and, as of today, has only gained relatively few attention. It is NP-hard (Tamir, 1991) and can be formulated as a binary LP. Note that Burkard, Fathali, and Taghizadeh Kakhki (2007) proved that the special case of the OpM problem on a tree can be solved in linear time. A branch and cut method coupled with a tabu search has been recently suggested by Belotti, Labbé, Maffioli, and Ndiaye (2007).

Instances

We have used a set of 144 instances in our experimentation. In particular, they were generated by considering 24 instances of the well-known p-median problem. To this aim we considered the instances from pmed17 to pmed40, where the number of nodes ranges from 400 to 900. These p-median instances are found in the OR-Library (Beasley, J. J Oper Res Soc (1990) 41: 1069. doi:10.1057/jors.1990.166).

In order to transform a p-median instance into an obnoxious p-median one, Belotti et al. (2007) described the following procedure. Given the original instance with n nodes, this method selects n/2 nodes at random to be the set of clients. The remaining n/2 become the set of facilities. Additionally, for each original instance, Belotti et al. (2007) derived three new instances for the OpM by considering three values of p: ⌊n/2⌋, ⌊n/4⌋ and ⌊n/8⌋.

We folowed that process to generate the instances and then, for each one of the resulting files, we generated an A and a B version of them by transposing the table that stores the distances between clients and facilities. This is a simple way of obtaining two different instances with similar charasteristics for each one of the processed files.

Hence, we stored the resulting 144 instances as a library that we called OpM_LIB:

Download OpM_LIB instances.

Computational Experience

All the algorithms were implemented in Java SE 7 and the experiments were conducted on an Intel Core i5 660 CPU (3.3 GHz) and 8 GB RAM. We have considered the OpM_LIB instances described above. The individual results for each instance can be downloaded here in Excel format.

Obnoxious p-Median problem results

References

  • J. E. Beasley. OR-Library: Distributing test problems by electronic mail. Journal of the Operational Research Society, 41 (11) (1990), pp. 1069–1072.
  • P. Belotti, M. Labbé, F. Maffioli, M. Ndiaye. A branch-and-cut method for the obnoxious p-median problem.4OR, 5 (4) (2007), pp. 299–314.
  • R. Burkard, J. Fathali, H. Taghizadeh Kakhki. The p-maxian problem on a tree. Operations Research Letters, 35 (3) (2007), pp. 331–335.
  • P. Cappanera. A Survey on obnoxious facility location problems. Technical report Dipartimento di Informatica, Università di Pisa (1999).
  • H. Eiselt, G. Laporte. Objectives in location problems. Z. Drezner (Ed.), Facility location. a survey of applications and methods., Springer, New York (1995), pp. 151–180.
  • A. Tamir. Obnoxious facility location on graphs. SIAM Journal on Discrete Mathematics, 2 (4) (1991), pp. 550–567.
  • S. Welch, S. Salhi. The obnoxious P facility network location problem with facility interaction. European Journal of Operational Research, 102 (1997), pp. 302–319.