Preface

Guidelines for the use of meta-heuristics in combinatorial optimization

Alain Hertz a,*, Marino Widmer b

a École Polytechnique de Montréal - GERAD, 3000 Chemin de la Côte-Sainte-Catherine, Montréal (QC), Canada H3T 2A7
b Université de Fribourg - DIUF, Faucigny 2, CH-1700 Fribourg, Switzerland

European Journal of Operational Research 151 (2003) 247-252
www.elsevier.com/locate/dsw

Abstract

The 18th EURO Summer/Winter Institute (ESWI XVIII) took place during the spring of 2000 in Switzerland. The topic of ESWI XVIII, "Meta-heuristics in Combinatorial Optimization", was selected because of its great current scientific interest: indeed, in recent years, several meta-heuristics have proved to be highly efficient for the solution of difficult combinatorial optimization problems. The Institute focused in particular on the development and use of local search and population search algorithms. Applications of these meta-heuristics to academic or real-life problems were also discussed. This special issue of EJOR contains papers written by the participants of ESWI XVIII. These papers have benefited from fruitful discussions among the participants, the organizers and the invited speakers. We have tried to summarize below some guidelines that should help in the design of successful adaptations of meta-heuristics to difficult combinatorial optimization problems.
© 2003 Elsevier B.V. All rights reserved.

Keywords: Meta-heuristics; Local search methods; Population search methods

* Corresponding author.
E-mail addresses: alain.hertz@gerad.ca (A. Hertz), marino.widmer@unifr.ch (M. Widmer).
0377-2217/$ - see front matter © 2003 Elsevier B.V. All rights reserved.
doi:10.1016/S0377-2217(02)00823-8

1. Introduction

Combinatorial optimization plays an important role in decision-making, since optimal decisions often depend on a non-trivial combination of various factors. Most combinatorial optimization problems are NP-hard, and sharp bounds on the optimal value are typically hard to derive. This means that exact algorithms based on partial enumeration converge slowly and can solve only small instances to optimality. Real-life combinatorial optimization problems, however, are typically large, and since exact approaches are then inadequate, heuristics are commonly used in practice. There has been a steady evolution over the past forty years in the development of heuristics, which produce solutions of reasonably good quality in a reasonable amount of time. The first proposed heuristics tried to systematize decision-making processes done by hand. With the help of computers, which can test a huge number of combinations in a short time, solutions could easily be generated that turned out to be of much better quality than what an expert in the field could produce by hand. In these early heuristics, much of the emphasis was put on quickly obtaining a feasible solution and possibly applying a post-optimization procedure to it.

Over the last 15 years, much of the research effort has concentrated on the development of meta-heuristics, based mainly on two principles: local search and population search. In local search methods, an intensive exploration of the solution space is performed by moving at each step from the current solution to another promising solution in its neighbourhood. Simulated annealing [7], tabu search [3] and variable neighbourhood search [8] are the most famous local search methods.
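As a point of reference for the guidelines that follow, here is a minimal sketch of the local search principle just described. The function names (`neighbours`, `cost`) and the pure-descent acceptance rule are illustrative assumptions of this sketch; simulated annealing, tabu search and variable neighbourhood search differ precisely in how they replace that acceptance rule.

```python
def local_search(initial_solution, neighbours, cost, max_iterations=1000):
    """Generic descent-style local search.

    `neighbours(s)` returns the solutions reachable from s by a simple
    modification, and `cost(s)` measures solution quality (lower is better).
    This sketch always moves to the best neighbour and stops at a local
    optimum; meta-heuristics use other acceptance rules to escape it.
    """
    current = initial_solution
    best, best_cost = current, cost(current)
    for _ in range(max_iterations):
        candidates = list(neighbours(current))
        if not candidates:
            break
        current = min(candidates, key=cost)   # most promising neighbour
        current_cost = cost(current)
        if current_cost >= best_cost:
            break                              # local optimum reached
        best, best_cost = current, current_cost
    return best, best_cost
```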
Population search consists of maintaining a pool of good solutions and combining them in order to produce hopefully better solutions. Classical examples are genetic algorithms [5] and adaptive memory procedures [9].

Meta-heuristics are general combinatorial optimization techniques that are not dedicated to the solution of a particular problem, but are rather designed with the aim of being flexible enough to handle as many different combinatorial problems as possible. These general techniques have rapidly demonstrated their usefulness and efficiency in solving hard problems, and success stories are reported in many papers. While meta-heuristics can in theory handle any combinatorial optimization problem, a significant effort must often be devoted to finding the right way to adapt the general ingredients of these methods to the particular problem under consideration. We think that in order to be successful in the adaptation of a meta-heuristic to a combinatorial optimization problem, it is necessary to follow some basic principles. We give in the next sections some guidelines which may help in producing such successful adaptations of local search and population search methods for the solution of difficult combinatorial optimization problems.

2. Guidelines for adaptations of local search methods

Let S be a set of solutions to a particular problem, and let f be a cost function that measures the quality of each solution in S. The neighbourhood N(s) of a solution s in S is defined as the set of solutions which can be obtained from s by performing simple modifications. Roughly speaking, a local search algorithm starts off with an initial solution in S and then continually tries to find better solutions by searching neighbourhoods. A local search process can be viewed as a walk in a directed graph G = (S, A), where the vertex set S is the set of solutions and there is an arc (s, s') in A if and only if s' is in N(s). By considering the cost function as an altitude, one gets a topology on G = (S, A).

The efficiency of a local search method depends mostly on the modeling. A fine tuning of parameters will never compensate for a bad definition of the solution set, of the neighbourhood, or of the cost function. We give in this section some general guidelines for successful adaptations of local search methods.

(a) It should be easy to generate solutions in S.

It is not rare that finding a feasible solution to a combinatorial optimization problem is itself an NP-hard problem. In such a case, it would be a very bad idea to define S as the set of feasible solutions to the considered problem, since getting an initial solution would be a complex task, and moving from a solution to a neighbouring one would not be easier. For such problems, the search space S should be defined by relaxing some constraints of the original problem, and by adding a component to the cost function that penalizes violations of these constraints.

As an example of this first principle, consider the course-timetabling problem. The main constraint in this kind of problem requires that courses given by the same teacher, or having students in common, do not overlap. It is typically very difficult to generate timetables that do not contain any overlapping situation. By relaxing these constraints, it becomes much easier to produce timetables, and neighbour solutions can easily be obtained by moving a course to another period, even if such a move induces many new overlapping situations [4].
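To make guideline (a) concrete, here is a minimal sketch of such a relaxed search space with a penalized cost function for a timetabling-like problem. The data layout (a dictionary mapping courses to periods), the `soft_cost` objective and the penalty weight are illustrative assumptions of this sketch, not the formulation used in [4].

```python
def penalized_cost(timetable, conflicting_pairs, soft_cost, penalty=100.0):
    """Cost of a (possibly infeasible) timetable.

    `timetable` maps each course to its assigned period.  The hard
    constraint "conflicting courses must not overlap" is relaxed: each pair
    of conflicting courses scheduled in the same period adds `penalty` to
    the cost instead of being forbidden.  `soft_cost` evaluates the
    original objective (e.g. room or preference costs).
    """
    overlaps = sum(1 for a, b in conflicting_pairs
                   if timetable[a] == timetable[b])
    return soft_cost(timetable) + penalty * overlaps


def neighbours(timetable, periods):
    """Neighbour timetables: move a single course to another period,
    even if this creates new overlaps (the penalty discourages them)."""
    for course, current_period in timetable.items():
        for p in periods:
            if p != current_period:
                moved = dict(timetable)
                moved[course] = p
                yield moved
```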
(b) For each solution s in S, the graph G = (S, A) should contain a path linking s to an optimal solution s*.

If the search process visits a solution that does not satisfy this condition, then an optimal solution will never be reached. As an illustration, consider the vehicle routing problem, in which a fixed number m of vehicles with limited capacity are required to visit customers in order to fulfill known customer requirements. Assume that a solution is defined as a set of m routes which satisfy all customer requirements as well as the capacity constraints, and suppose that a neighbour solution is obtained either by moving a customer from one route to another, or by permuting two customers from different routes. Such a choice for S violates condition (a), since finding a solution in S is equivalent to solving a bin packing problem. Moreover, this adaptation can also violate condition (b). Indeed, assume there are three vehicles which can each transport 4 units of some product, and assume there are four customers with a demand of 2 units and four customers with a demand of 1 unit. If the first vehicle visits the four customers with a demand of 1 unit, while the two other vehicles both visit two customers with a demand of 2 units, then no move can be performed to modify the first route without violating the capacity constraints. However, the optimal solution might be one in which the first two vehicles both visit two customers with a demand of 1 unit and one customer with a demand of 2 units, while the third vehicle visits two customers with a demand of 2 units. Conditions (a) and (b) can easily be satisfied by relaxing the capacity constraints [2].

(c) The solutions in N(s) should be in some sense close to s.

One could theoretically define N(s) as the set containing all solutions s' ≠ s in S. Such a neighbourhood induces a complete graph G = (S, A) in which condition (b) is trivially satisfied, and it is possible to move from any initial solution to any optimal one in one step. However, N(s) is then so big that finding such an optimal move is a task as difficult as the original problem. It is important to define neighbourhoods N(s) in which it is possible to determine the best solution within a reasonably small amount of time. This goal can be achieved by defining N(s) as the set of solutions obtained by performing a simple modification on s. The neighbourhood N(s) of s then contains solutions which are in some sense close to s, and it is often possible to compute the value of a neighbour s' of s by determining the cost difference between s and s', instead of re-computing the value of s' from scratch. For example, consider again the vehicle routing problem, and assume that a neighbour solution s' of s is obtained by moving a customer C from a route R1 to a route R2. The cost difference between s and s' can easily be obtained by computing the saving induced by the removal of C from R1 and the cost induced by the insertion of C into R2. There is an additional advantage of defining neighbours s' in N(s) by simple modifications of s: if a neighbour s' has a better cost than s, this may indicate that the arc (s, s') belongs to a short path in G = (S, A) from s to an optimal solution, and it then becomes possible to guide the search towards an optimal solution.
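A minimal sketch of this incremental evaluation follows. The route representation (an ordered list of customer coordinates, with the vehicle starting and ending at the depot) and the Euclidean distance are assumptions made for illustration, not the formulation of [2].

```python
import math

def dist(p, q):
    """Euclidean distance between two points (x, y)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def removal_saving(route, i, depot):
    """Cost saved by removing the customer at position i of `route`."""
    prev = route[i - 1] if i > 0 else depot
    nxt = route[i + 1] if i + 1 < len(route) else depot
    c = route[i]
    return dist(prev, c) + dist(c, nxt) - dist(prev, nxt)

def insertion_cost(route, j, customer, depot):
    """Extra cost of inserting `customer` before position j of `route`."""
    prev = route[j - 1] if j > 0 else depot
    nxt = route[j] if j < len(route) else depot
    return dist(prev, customer) + dist(customer, nxt) - dist(prev, nxt)

def move_delta(route1, i, route2, j, depot):
    """Cost difference f(s') - f(s) when customer route1[i] is moved into
    route2 before position j; a negative value means the move improves s.
    Only the two affected routes are touched, nothing is recomputed from
    scratch."""
    return insertion_cost(route2, j, route1[i], depot) - removal_saving(route1, i, depot)
```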
(d) The topology induced by the cost function on G = (S, A) should not be too flat.

As explained above, the cost function can be considered as an altitude, and it therefore induces a topology on G = (S, A) with mountains, valleys and plateaus. It is difficult for a local search to escape from large plateaus, since any solution that is not on the border of such a plateau has the same cost value as its neighbours, and it is therefore impossible to guide the search towards an optimal solution. A common way to avoid this kind of topology on G = (S, A) is to add a component to the cost function which discriminates between solutions having the same value according to the original cost function.

As an example, consider the job shop scheduling problem, where operations have to be ordered on machines so that the maximal completion time of all operations, called the makespan, is minimized. For a solution s, let C_k(s) denote the time at which all operations are completed on machine k. Then the cost f(s) of solution s (i.e., its makespan) is equal to max{C_1(s), C_2(s), ..., C_m(s)}. If two solutions have the same makespan, then the second largest value in {C_1(s), C_2(s), ..., C_m(s)} can help discriminate between them. Another possibility is to minimize C_1(s)^2 + C_2(s)^2 + ... + C_m(s)^2 in order to penalize solutions having too many machines whose completion time is close to the makespan [6].
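To illustrate guideline (d), here is a minimal sketch of such a discriminating cost function, using the sum-of-squares variant mentioned above. The completion times are made-up data and the function itself is an assumption of this sketch, not the exact scheme of [6].

```python
def makespan(completion_times):
    """Original cost: the makespan max{C_1(s), ..., C_m(s)}."""
    return max(completion_times)

def refined_cost(completion_times):
    """Discriminating cost for guideline (d): sum of squared machine
    completion times.  Among solutions with the same makespan, it favours
    those in which fewer machines finish close to the makespan, so the
    search is no longer blind on plateaus of the original cost."""
    return sum(c * c for c in completion_times)

# Illustrative (made-up) completion times: both schedules have makespan 10,
# but in s2 only one machine finishes at the makespan, so the refined cost
# ranks it as the more promising solution.
s1 = [10, 10, 9]
s2 = [10, 7, 6]
assert makespan(s1) == makespan(s2) == 10
assert refined_cost(s2) < refined_cost(s1)
```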
3. Guidelines for adaptations of population search methods

Population search methods are iterative solution techniques that handle a population of individuals and make them evolve according to some rules that have to be clearly specified. At each iteration, periods of self-adaptation alternate with periods of co-operation. Self-adaptation means that individuals evolve independently, while co-operation implies an exchange of information among the individuals. Many different algorithms can be described within this framework. For example, the selection and crossover operators of genetic algorithms can be seen as co-operation procedures, while the mutation operator is part of the self-adaptation process. As for local search methods, the efficiency of a population search depends mostly on the modeling. We give in this section some general guidelines for successful adaptations of population search methods.

(e) Pertinent information should be transmitted during the co-operation phase.

In the co-operation phase, groups of individuals exchange pieces of information, and new offspring solutions are created that should combine the best features of the parent solutions. This means that the information transmitted during the co-operation phase should be pertinent. Consider for example the k-colouring problem, where a colour in {1, ..., k} must be assigned to each vertex of a graph G so that there are as few edges as possible in G having both endpoints of the same colour, such edges being called conflicting edges. An offspring solution s'' can be created from two parent solutions s and s' by colouring some vertices (chosen at random) as in s, and the others as in s'. However, such a combination operator typically produces results of poor quality. Indeed, all colours are equivalent up to a permutation, which means that the colour of a vertex is not a pertinent piece of information. A more useful piece of information is the fact that pairs or subsets of vertices have the same colour. A solution to the k-colouring problem is in fact a partition of the vertex set into k subsets, called colour classes, and the aim is to determine a partition such that as few edges as possible have both endpoints in the same colour class. Nowadays, the best population search algorithms for the k-colouring problem create offspring solutions by copying colour classes of the parent solutions [1].

(f) The combination of two equivalent parent solutions should not produce an offspring that is different from the parents.

Since an offspring solution receives information from parent solutions, it is reasonable to expect that the combination of two parent solutions containing equivalent information produces an offspring which is also equivalent to its parents. To illustrate a situation where condition (f) is not satisfied, consider again the k-colouring problem, and let G be a graph with four vertices v1, v2, v3, v4 and three edges v1v2, v2v3 and v3v4. Let s be the 2-colouring of G in which v1 and v3 have colour 1 and v2 and v4 have colour 2, and let s' be the 2-colouring of G obtained from s by permuting colours 1 and 2. Solutions s and s' can be considered as equivalent since they correspond to the same colouring, up to a permutation of the two colours. If one now copies the colours of v1 and v2 from s and those of v3 and v4 from s', one gets a solution s'' in which vertices v1 and v4 have colour 1 and vertices v2 and v3 have colour 2. Solution s'' is totally different from the two equivalent parent solutions s and s', since both s and s' are optimal solutions (i.e., they have no conflicting edge), while v2v3 is a conflicting edge in s''. This example shows once again that it is a bad idea to create an offspring solution s'' from two parent solutions s and s' by colouring some vertices (chosen at random) as in s, and the others as in s'.

Consider now the problem of finding the smallest integer k for which there exists a k-colouring without conflicting edges. One can define a solution as an ordering of the vertices: each ordering p can be transformed into a colouring by means of a procedure that sequentially colours the vertices according to the ordering defined by p, always giving the smallest possible colour (i.e., the smallest positive integer which is not yet used in the neighbourhood of the considered vertex). For example, both orderings s = v1 < v2 < v3 < v4 and s' = v3 < v4 < v1 < v2 of the above graph G produce a 2-colouring in which v1 and v3 have colour 1 and v2 and v4 have colour 2. These two orderings can therefore be considered as equivalent. An offspring can then be created by placing some vertices in the same position as in one of the parent solutions, and by placing the other vertices according to the ordering of the second parent solution. For example, assume that an offspring is created from s and s' by copying the positions of v1 and v3 from s. The offspring is then s'' = v1 < v4 < v3 < v2, which corresponds to a 3-colouring where vertices v1 and v4 have colour 1, vertex v3 has colour 2 and vertex v2 has colour 3. Hence, a 3-colouring is obtained by combining two equivalent 2-colourings. Such a situation should definitely be avoided.
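The greedy decoding procedure used in this ordering-based representation can be sketched as follows. The adjacency-list encoding of the small example graph is an assumption made only to reproduce the colourings discussed above.

```python
def greedy_colouring(ordering, adjacency):
    """Decode a vertex ordering into a colouring: follow the ordering and
    give each vertex the smallest positive colour not already used by one
    of its coloured neighbours."""
    colour = {}
    for v in ordering:
        used = {colour[u] for u in adjacency[v] if u in colour}
        c = 1
        while c in used:
            c += 1
        colour[v] = c
    return colour

# The four-vertex graph of the example: edges v1v2, v2v3, v3v4.
adjacency = {
    "v1": ["v2"],
    "v2": ["v1", "v3"],
    "v3": ["v2", "v4"],
    "v4": ["v3"],
}

s = ["v1", "v2", "v3", "v4"]    # parent ordering s
sp = ["v3", "v4", "v1", "v2"]   # equivalent parent ordering s'
spp = ["v1", "v4", "v3", "v2"]  # offspring ordering s''

# Both parents decode to the same 2-colouring, but the offspring needs 3 colours.
print(greedy_colouring(s, adjacency))    # {'v1': 1, 'v2': 2, 'v3': 1, 'v4': 2}
print(greedy_colouring(sp, adjacency))   # {'v3': 1, 'v4': 2, 'v1': 1, 'v2': 2}
print(greedy_colouring(spp, adjacency))  # {'v1': 1, 'v4': 1, 'v3': 2, 'v2': 3}
```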
(g) Diversity should be preserved in the population.

One of the major difficulties observed when using population search algorithms is the premature convergence of the process, all solutions in the population having a natural tendency to become equal to the best one among them. If this occurs, then the population search behaves more or less like a local search, since there is nothing to gain in combining equivalent solutions. In order to prevent such a phenomenon, it is important to implement operators that preserve diversity in the population. The mutation operator in genetic algorithms is an example of such a tool, but it usually produces random modifications, which is not necessarily the best thing to do. A technique that helps avoid premature convergence is to forbid the introduction of a solution into the population if too many 'similar' solutions already exist in it. But this requires the development of measures of similarity that properly detect when there is a danger of premature convergence.

4. Conclusion

It is not possible to provide a general scheme for the adaptation of a local search or a population search to a combinatorial optimization problem. We have tried in these few pages to give some guidelines that should help in the design of successful adaptations.

This special issue of EJOR contains papers written by the participants of ESWI XVIII. Each participant had an hour and a half to present a preliminary paper and his or her research on the use of meta-heuristics in combinatorial optimization. The final versions of these papers have benefited from fruitful discussions which took place between the participants, the organizers and the invited speakers (Martin Grötschel and Manuel Laguna), as well as from the above guidelines.

To conclude, we want to mention that the fantastic working atmosphere during ESWI XVIII gave the participants the energy to create a new EURO working group called EU/ME (to be pronounced "You and Me"): the EUropean Chapter on Meta-heuristics.

References

[1] P. Galinier, J.K. Hao, Hybrid evolutionary algorithms for graph coloring, Journal of Combinatorial Optimization 3 (1999) 379-397.
[2] M. Gendreau, A. Hertz, G. Laporte, A tabu search heuristic for the vehicle routing problem, Management Science 40 (1994) 1276-1290.
[3] F. Glover, Future paths for integer programming and links to artificial intelligence, Computers & Operations Research 13 (1986) 533-549.
[4] A. Hertz, Finding a feasible course schedule using tabu search, Discrete Applied Mathematics 35 (1992) 255-270.
[5] J.H. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, MI, 1975.
[6] A. Hertz, M. Widmer, An improved tabu search approach for solving the job shop scheduling problem with tooling constraints, Discrete Applied Mathematics 65 (1996) 319-346.
[7] S. Kirkpatrick, C.D. Gelatt Jr., M.P. Vecchi, Optimization by simulated annealing, Science 220 (1983) 671-680.
[8] N. Mladenović, P. Hansen, Variable neighbourhood search, Computers & Operations Research 24 (1997) 1097-1100.
[9] Y. Rochat, É.D. Taillard, Probabilistic diversification and intensification in local search for vehicle routing, Journal of Heuristics 1 (1995) 147-167.