The main ideas
Reactive Search is a methodology for solving hard optimization problems, both in the discrete and continuous domain, based on the integration of machine learning and optimization in an online manner ("learning while optimizing").Machine learning techniques act on top of search heuristics, in order to let the algorithm self-tune its operating parameters during the search operation. Thus, parameter tuning, which is usually performed offline by the researcher, becomes an integral part of the search algorithm, ensuring flexibility without human intervention.
The "learning" component is implemented as a reactive feedback scheme that uses the past history of the search to increase its efficiency and efficacy.
- Below, you can find a short introduction, with some visual help and simplifications, to the main concepts.
- A partial list of applications is also available.
- A comprehensive description of Reactive Search and its applications can be found in the technical report Battiti, Brunato - Reactive search: Machine learning for memory-based heuristics.
A short introduction
Optimization problems
Reactive search, like all local search techniques, is applied to the problem of
finding the optimal configuration of a system. This configuration usually
consists of continuously or discretely varying parameters, while the optimality criterion
is a numerical value associated to each configuration. In most cases, an
optimization problem can be reduced to finding the (global) minimum of a function
whose arguments are the configuration parameters, seen as free variables in the function's
domain space.
Consider, for instance, the
function whose graph is represented on the side. Finding the global minimum looks quite easy, since we have a "global" view of its graphical representation. However, a computer program can just evaluate one point at a time. A globa vision may not be present at the beginning, but
it can be learnt after a suitable number of evaluations.
|
Local search techniques
In order to find a minimum, a machine can use some "steepest descent" method, where a random
configuration is generated and moved towards lower function values
(see image on the right, where the same function as before is seen from top),
and the search stops when no improvement is possible. The examined configurations are represented by the moving red dot. Such kind of search is guaranteed to end in
a local minimum of the function (where no local improvement is available), but
it is impossible to know when the global minimum is found.
This technique corresponds to put a ball on the surface and let it roll down to the bottom.
Functions that arise from actual optimization problems, however, are much more complex than the depicted one, with several local minima, and repeatedly "throwing a ball" takes a long time before a good solution is found.
|
A more sophisticated example: prohibition-based search
Most real-world problems, however, have some "structure" in them, so that local minima are gathered together and around the global minimum; therefore, a mechanism to continue the search after a local minimum is found can give better results. One possible family of methods is the
prohibition-based search (or "tabu search"): keep searching even when no improvement is possible (i.e. let the ball go upwards) in order to overcome barriers between minima, but prevent the system from going back on its track. In other words, once a move is made, in the next steps that move cannot be undone (as suggested by the figure).
|
Reactive search
Prohibition of moves cannot be permanent, otherwise the system would exhaust all its degrees of freedom and could not evolve; setting the correct duration of prohibitions is crucial, and different problems can take advantage of different prohibition settings. The usual way of finding the correct parameter value is by a "trial-and-error" iterative process: the researcher repeatedly runs the algorithm with different values of the prohibition parameter and selects the most promising one.The Reactive Search technique automates such kind of parameter setting by taking into account the past search history (i.e., by learning). For instance, if some configuration is recognized to be visited too often, then it might be the case to increase the prohibition time. The feedback scheme that modifies the search parameters according to the search results is called reaction and is the core of the Reactive Search technique. Another advantage of a history-based system is the ability to recognize "hopeless" situations in which the system shows no improvement for a long time; an escape mechanism can be implemented in order to restart the system from a new random point if some conditions are verified.
Tabu search is only one example of technique that can benefit from reactive mechanisms; in principle, all algorithms that critically depend on parameters, such as simulated annealing, particle swarm, genetic algorithms, and so on, can take advantage from this heuristic.
History Usage
The focus of the Reactive Search method is on- feedback-based parameter tuning : the algorithm maintains the internal flexibility needed to cover many problems, but the tuning is automated, and executed while the algorithm runs and ``monitors'' its past behavior.
- automated balance of diversification and intensification : the ``exploration versus exploitation'' dilemma is present in many heuristics: is it better to intensify the search in the promising regions, or to diversify it to uncharted territories. An automated heuristic balance can be obtained through feedback mechanisms, for example by starting with intensification, and by progressively increasing the amount of diversification only when there is evidence that diversification is needed.
The three main characteristics of Reactive Search when applied to prohibition-based schemes (tabu search) are:
- Self-adjusted prohibition period
In Reactive Search theprohibition T is determined through feedback (reactive ) mechanisms during the search. T is equal to one at the beginning (the inverse of a given move is prohibited only at the next step), it increases only when there is evidence that diversification is needed, it decreases when this evidence disappears. The evidence that diversification is needed is signaled by the repetition of previously-visited configurations.
An example of the behavior of T during the search is illustrated iabove, for a Quadratic Assignment Problem task [BT94b]. T increases in an exponential way when repetitions are encountered, it decreases in a gradual manner when repetitions disappear.
- Escape mechanism
The basic tabu mechanism based on prohibitions is not sufficient to avoid long cycles In addition, even if ``limit cycles'' (endless cyclic repetitions of a given set of configurations) are avoided, the first reactive mechanism is not sufficient to guarantee that the search trajectory is not confined in a limited region of the search space. A ``chaotic trapping'' of the trajectory in a limited portion of the search space is still possible (the analogy is withchaotic attractors of dynamical systems, where the trajectory is confined in a limited portion of the space, although a limit cycle is not present). For both reasons, to increase the robustness of the algorithm a second more radical diversification step (escape ) is needed. Theescape phase is triggered when too many configurations are repeated too often [BT94b]. A simpleescape consists of a number of random steps executed starting from the current configuration (possibly with a bias toward steps that bring the trajectory away from the current search region). - Fast algorithms for using the search history
The storage and access of the past events is executed through the well-known hashing or radix-tree techniques in a CPU time that is approximately constant with respect to the number of iterations. Therefore the overhead caused by the use of the history is negligible for tasks requiring a non-trivial number of operations to evaluate the cost function in the neighborhood.
Pages hosted by "machine Learning and Intelligent Optimization (LION)" Group - DISI - Università di Trento - Italy.
Last updated: 2012-02-04 11:10:59
Reactive search, like all local search techniques, is applied to the problem of
finding the optimal configuration of a system. This configuration usually
consists of continuously or discretely varying parameters, while the optimality criterion
is a numerical value associated to each configuration. In most cases, an
optimization problem can be reduced to finding the (global) minimum of a function
whose arguments are the configuration parameters, seen as free variables in the function's
domain space.
Consider, for instance, the
function whose graph is represented on the side. Finding the global minimum looks quite easy, since we have a "global" view of its graphical representation. However, a computer program can just evaluate one point at a time. A globa vision may not be present at the beginning, but
it can be learnt after a suitable number of evaluations.
(see image on the right, where the same function as before is seen from top),
and the search stops when no improvement is possible. The examined configurations are represented by the moving red dot. Such kind of search is guaranteed to end in
a local minimum of the function (where no local improvement is available), but
it is impossible to know when the global minimum is found.
This technique corresponds to put a ball on the surface and let it roll down to the bottom.
Functions that arise from actual optimization problems, however, are much more complex than the depicted one, with several local minima, and repeatedly "throwing a ball" takes a long time before a good solution is found.
Most real-world problems, however, have some "structure" in them, so that local minima are gathered together and around the global minimum; therefore, a mechanism to continue the search after a local minimum is found can give better results. One possible family of methods is the
prohibition-based search (or "tabu search"): keep searching even when no improvement is possible (i.e. let the ball go upwards) in order to overcome barriers between minima, but prevent the system from going back on its track. In other words, once a move is made, in the next steps that move cannot be undone (as suggested by the figure).
![[Reactive Search Logo]](images/logo.jpg)

