Optimization problems arise everywhere in our everyday lives, across all geographic regions, industrial branches, and academic disciplines. The better the methods we have to tackle these problems, the more efficiently we can use our limited resources. Well-designed networks reduce fuel consumption, harmonized schedules coordinate interacting processes, and careful planning of medical treatments reduces unwanted side effects, to name only a few examples.
Unfortunately, most real-world optimization challenges are notoriously difficult. They often:
- are difficult to model as explicit optimization problems,
- require a lot of data to describe them in full detail,
- exhibit complex interactions between the individual decision variables,
- need to be solved in a short amount of time, with limited computational resources, or with insufficient expert knowledge.
Many practical problems therefore cannot be solved by exact analytical means. In these situations, one has to resort to heuristic approaches, which trade the accuracy of a solution for a simple algorithmic structure, fast running times, or otherwise efficient use of computational resources. In many circumstances, heuristics offer the only way to obtain a feasible solution at all.
A particularly important class of heuristic optimizers is formed by algorithms that seek the best possible solutions through an iterative process. This class of iterative optimization heuristics subsumes local search algorithms (including first/steepest ascent, variable neighborhood search, simulated annealing, Metropolis algorithms, etc.) and global search heuristics such as evolutionary algorithms, (quasi-)Monte Carlo algorithms, Bayesian optimization approaches, swarm intelligence, differential evolution, estimation-of-distribution algorithms, efficient global optimization, etc.
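To make the shared structure of these methods concrete, the following minimal sketch implements a (1+1) evolutionary algorithm, one of the simplest iterative optimization heuristics: it repeatedly perturbs an incumbent solution and keeps the better of the two points. The sphere objective, search bounds, and parameter values are illustrative assumptions, not taken from any of the cited works.

```python
import numpy as np

def sphere(x: np.ndarray) -> float:
    """Illustrative black-box objective: minimize the sum of squares."""
    return float(np.sum(x ** 2))

def one_plus_one_ea(f, dim=5, budget=1000, sigma=0.1, seed=42):
    """(1+1) evolutionary algorithm: perturb the incumbent with Gaussian
    noise and keep the better of parent and offspring (elitist selection)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, dim)                   # random initialization
    fx = f(x)
    for _ in range(budget - 1):
        y = x + sigma * rng.standard_normal(dim)  # sample a new candidate
        fy = f(y)
        if fy <= fx:                              # keep the better point
            x, fx = y, fy
    return x, fx

best_x, best_f = one_plus_one_ea(sphere)
print(f"best value found: {best_f:.4e}")
```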
Ideally, one would like to develop algorithms that perform well across broad sets of problems and instances. In practice, however, we observe a distinct complementarity between different algorithmic ideas: algorithms performing very well on some problems may perform poorly on others. Users therefore need to carefully select which heuristic to apply to the problem they wish to solve. This algorithm selection problem is complicated by the fact that iterative heuristics are parametrized and composed of several building blocks (e.g., different ways to initialize the search, different distributions from which new candidates are sampled, different strategies to decide which points to keep in memory). Different parameter settings and different combinations of these algorithmic building blocks can result in vastly different performance. Users therefore need to decide not only which algorithm design to apply, but also how to configure it.
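The size of the resulting design space is easy to underestimate. The sketch below enumerates a hypothetical space of modular choices; all block names and option values are invented for illustration, but the combinatorial growth they exhibit is exactly what makes algorithm selection and configuration hard.

```python
from itertools import product

# Hypothetical building-block choices for a modular heuristic; the
# cross-product of the options spans the space of algorithm designs.
design_space = {
    "initialization": ["uniform", "latin_hypercube", "sobol"],
    "sampling":       ["gaussian", "cauchy", "uniform_ball"],
    "selection":      ["elitist", "comma", "metropolis"],
    "step_size":      [0.01, 0.1, 0.5],  # a numerical parameter
}

configs = [dict(zip(design_space, choice))
           for choice in product(*design_space.values())]
print(f"{len(configs)} candidate algorithm designs")  # 3 * 3 * 3 * 3 = 81
print(configs[0])
```

Even this toy space contains 81 designs; real modular frameworks combine many more such choices, and numerical parameters are continuous rather than discretized.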
To explore and eventually exploit the complementarity of these heuristics and their interactions, but also to design new heuristics that are even better suited to a given problem instance, supervised machine learning techniques can be applied [1,2]. Such machine-learning approaches require meaningful features that quantify different aspects of the problem instance at hand, as well as information about parameter configurations and the performance of the heuristics. More generally, automated algorithm design requires an understanding of the problem space, the parameter space, and the performance space. With regard to the problem space, exploratory landscape analysis (ELA) [3] was introduced to support automated algorithm design techniques through sets of features that measure various properties of the optimization problem at hand. In the context of numerical black-box optimization, a large number of such features have been suggested [4], each measuring a different characteristic of the problem. Several shortcomings of various feature sets have been discussed in the literature, among them a lack of robustness with respect to the sample size [5] and to isomorphic transformations of the problem instance [6]. It is furthermore well known that proper feature selection is essential for peak performance of supervised learning approaches [1], not only to discard features carrying limited or even deceptive information, but also to avoid overestimating the relevance of mutually correlated features [10].

The performance space, in turn, should also be explored with great care, in order to understand the relations between the feature space and the performance space. Our teams have developed different statistical approaches to describe the performance space, for example Deep Statistical Comparison [7] and our tool performance2vec [8]. IOHprofiler [11], and more specifically its analytical component IOHanalyzer [12], is an interactive software environment for exploring the complex performance space of iterative optimization heuristics.
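The pipeline described above (sample the problem, compute landscape features, learn a mapping from features to performance) can be sketched in a few lines. The features below are a small subset of the classical ELA "y-distribution" group (statistics of sampled objective values); dedicated packages such as flacco or pflacco compute the full feature sets. The ten shifted-sphere "instances" and the random performance targets are placeholders: a real study would use logged benchmark performance, e.g., from IOHprofiler runs.

```python
import numpy as np
from scipy.stats import skew, kurtosis
from sklearn.ensemble import RandomForestRegressor

def y_distribution_features(f, dim, n_samples, rng):
    """ELA-style 'y-distribution' features: statistics of the objective
    values observed on a uniform random sample of the search space."""
    X = rng.uniform(-5, 5, size=(n_samples, dim))
    y = np.apply_along_axis(f, 1, X)
    return [skew(y), kurtosis(y), np.std(y)]

rng = np.random.default_rng(0)

# Placeholder problem instances (shifted sphere functions) and placeholder
# performance targets; in practice these come from benchmarking data.
problems = [lambda x, s=s: float(np.sum((x - s) ** 2)) for s in range(10)]
observed_perf = rng.uniform(0, 1, len(problems))

features = np.array([y_distribution_features(f, 5, 200, rng) for f in problems])
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(features, observed_perf)   # performance regression model
print(model.predict(features[:2]))   # predictions for the first two instances
```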
The goals of our project are:
1. To investigate different problem representations that go beyond the ELA-based ones, using matrix factorization methods and deep learning models (extending our joint SSCI 2020 paper [10]).
2. To develop methodologies that allow us to select a more representative set of problem instances based on these novel representations, which can then be used in benchmarking studies to produce reproducible and transferable results (extending our upcoming GECCO 2022 paper [14]).
3. To use and evaluate novel problem representations in automated algorithm selection, configuration, and design by personalizing performance regression models to black-box optimization problems (extending our published GECCO 2021 paper [15]).
4. To extend our work on per-run algorithm selection presented at EvoSTAR 2021 [16] (best paper award) and currently under submission at IEEE WCCI 2022 and PPSN 2022. Here, the goal is to compute meaningful features from the samples that the algorithms evaluate during their optimization process, and to use feature-based approaches to decide whether or not to switch to a different solver (see the sketch after this list). To this end, reliable performance prediction is needed, a task complicated by the fact that the feature computation has to rely on highly stochastic samples.
5. To make benchmarking data interoperable and reusable by developing a semantic data model (i.e., an ontology) to store and share the algorithms' hyper-parameters, the problems' landscape features, and the achieved performance data. Our recently published work at GECCO 2021 showed that this kind of data management can help in selecting diverse algorithm and problem portfolios [13].
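As a companion to goal 4, the following sketch illustrates the per-run switching idea: features are computed from the trajectory an algorithm has already evaluated, and a classifier trained on past runs decides whether switching to a second solver is worthwhile. The feature set and the training data here are hypothetical stand-ins; note also that trajectory samples are biased toward good regions of the search space, which is exactly the stochasticity issue mentioned in goal 4.

```python
import numpy as np
from scipy.stats import skew
from sklearn.ensemble import RandomForestClassifier

def trajectory_features(X, y):
    """Features computed from the points an algorithm has already evaluated
    (its trajectory), rather than from a dedicated uniform sample."""
    return [np.mean(y), np.std(y), skew(y),
            float(np.min(y)), float(np.mean(np.std(X, axis=0)))]

rng = np.random.default_rng(1)

# Hypothetical training data: trajectory features of past runs, labelled
# with whether switching to the second solver paid off (1 = switch).
train_feats = rng.normal(size=(50, 5))
switch_was_better = rng.integers(0, 2, size=50)
selector = RandomForestClassifier(random_state=1).fit(train_feats, switch_was_better)

# During a new run: after the initial budget, decide whether to switch.
X_run = rng.uniform(-5, 5, size=(100, 5))  # points evaluated so far
y_run = np.sum(X_run ** 2, axis=1)         # stand-in fitness values
if selector.predict([trajectory_features(X_run, y_run)])[0] == 1:
    print("switch to the second solver")
else:
    print("keep the current solver")
```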
References:
[1] Kerschke, P., Hoos, H. H., Neumann, F., & Trautmann, H. (2019). Automated algorithm selection: Survey and perspectives. Evolutionary Computation, 27(1), 3-45.
[2] Muñoz, M. A., Sun, Y., Kirley, M., & Halgamuge, S. K. (2015). Algorithm selection for black-box continuous optimization problems: A survey on methods and challenges. Information Sciences, 317, 224-245.
[3] Mersmann, O., Bischl, B., Trautmann, H., Preuss, M., Weihs, C., & Rudolph, G. (2011). Exploratory landscape analysis. In Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation (pp. 829-836).
[4] Derbel, B., Liefooghe, A., Verel, S., Aguirre, H., & Tanaka, K. (2019). New features for continuous exploratory landscape analysis based on the SOO tree. In Proceedings of the 15th ACM/SIGEVO Conference on Foundations of Genetic Algorithms (pp. 72-86).
[5] Renau, Q., Dreo, J., Doerr, C., & Doerr, B. (2019). Expressiveness and robustness of landscape features. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (pp. 2048-2051).
[6] Škvorc, U., Eftimov, T., & Korošec, P. (2020). Understanding the problem space in single-objective numerical optimization using exploratory landscape analysis. Applied Soft Computing, 90, 106138.
[7] Eftimov, T., Korošec, P., & Seljak, B. K. (2017). A novel approach to statistical comparison of meta-heuristic stochastic optimization algorithms using deep statistics. Information Sciences, 417, 186-215.
[8] Eftimov, T., Popovski, G., Kocev, D., & Korošec, P. (2020). Performance2vec: A step further in explainable stochastic optimization algorithm performance. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO '20 Companion).
[9] Bengio, Y., Courville, A., & Vincent, P. (2013). Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798-1828.
[10] Eftimov, T., Popovski, G., Renau, Q., Korošec, P., & Doerr, C. (2020). Linear matrix factorization embeddings for single-objective optimization landscapes. In 2020 IEEE Symposium Series on Computational Intelligence (SSCI) (pp. 775-782). IEEE.
[11] Doerr, C., Ye, F., Horesh, N., Wang, H., Shir, O. M., & Bäck, T. (2020). Benchmarking discrete optimization heuristics with IOHprofiler. Applied Soft Computing, 88, 106027.
[12] Wang, H., Vermetten, D., Ye, F., Doerr, C., & Bäck, T. (2020). IOHanalyzer: Performance analysis for iterative optimization heuristics. CoRR, abs/2007.03953.
[13] Kostovska, A., Vermetten, D., Doerr, C., Džeroski, S., Panov, P., & Eftimov, T. (2021). OPTION: Optimization algorithm benchmarking ontology. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (pp. 239-240).
[14] Cenikj, G., Lang, R., Engelbrecht, A., Doerr, C., Korošec, P., & Eftimov, T. (2022). SELECTOR: Selecting a representative benchmark suite for reproducible statistical comparison. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2022), Boston, USA, July 9-13, 2022. In press.
[15] Eftimov, T., Jankovic, A., Popovski, G., Doerr, C., & Korošec, P. (2021). Personalizing performance regression models to black-box optimization problems. In Proceedings of the Genetic and Evolutionary Computation Conference (pp. 669-677).
[16] Jankovic, A., Eftimov, T., & Doerr, C. (2021). Towards feature-based performance regression using trajectory data. In Applications of Evolutionary Computation (EvoApplications 2021) (pp. 601-617). [Best paper award]