From Big Data to Good Data: Smart-sized Benchmarking for Trustworthy Artificial Intelligence
Acronym
DATA-TRUST
Type
research
Duration
2026 - 2029
Content
Most studies in the field of AI analyze all available datasets for a specific learning task, or even collect additional data to include in the study. This, however, leads to models that overfit the selected datasets and cannot be applied to new datasets whose characteristics differ from those involved in the learning process. In addition, sound performance assessment of algorithm results is crucial for trustworthy performance, a requirement set out in the Ethics Guidelines for Trustworthy Artificial Intelligence (AI) published by the European Commission. In most cases, evaluating the same algorithm portfolio (i.e., set of algorithms) on different sets of benchmark datasets/problems yields different winning algorithms. The selection of benchmark datasets/problems can therefore bias the performance analysis (i.e., benchmark problems can be selected in favor of the winning algorithm). This allows researchers to present results that make their newly developed algorithm look superior to the others, and it indirectly limits the generalization of the learned knowledge.
Within the DATA-TRUST project, we will design, develop, implement, and evaluate a benchmarking framework that will increase trust in AI results and make them transferable and generalizable. The framework will consist of methodologies for i) defining a unified representation (i.e., vectors of meta-features) of the problems/datasets that describes their characteristics, ii) handling the inherent bias that originates from the quality of the data involved in the learning process, through the selection of a more diverse set of problems/datasets (i.e., a small portion of the data used for learning), and iii) determining quantitative indicators that enable estimating the level of trust/confidence when a developed AI model is applied to new, unseen data.
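As an illustration of step ii), selecting a diverse subset of datasets in meta-feature space could be sketched as a greedy farthest-point search; this is only one possible selection strategy, not the project's actual method, and the function name and inputs are illustrative assumptions (each dataset is assumed to be already described by a meta-feature vector):

```python
import numpy as np

def select_diverse_subset(meta_features, k, seed=0):
    """Greedy farthest-point selection of k datasets in meta-feature space.

    meta_features: (n_datasets, n_features) array of meta-feature vectors.
    Returns indices of a subset whose members are mutually far apart,
    i.e., a subset that covers the diversity of dataset characteristics.
    """
    X = np.asarray(meta_features, dtype=float)
    # Standardize so no single meta-feature dominates the distances.
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(X)))]       # arbitrary starting dataset
    # Distance of every dataset to its nearest already-selected dataset.
    dist = np.linalg.norm(X - X[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(dist))               # farthest from current subset
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(X - X[nxt], axis=1))
    return selected
```

Each iteration adds the dataset farthest (in standardized meta-feature distance) from the subset chosen so far, so the selected benchmarks spread out over the space of dataset characteristics instead of clustering around one region.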
Performing all steps of the DATA-TRUST framework will allow us to select more representative learning data (a subset of all available datasets that uniformly covers the diversity of dataset characteristics), which will increase the robustness, reproducibility, and generalizability of the AI models. The development of such methodologies is strongly motivated by the continuous growth of real-world optimization problems and time-series data, which requires transferring the knowledge gained from benchmarking studies to industry (i.e., finding the best algorithm to solve a new real-world problem).
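The trust indicator in step iii) could, under one simple set of assumptions, compare a new dataset's meta-feature vector against the benchmark datasets used for learning: the farther the new dataset lies from its nearest benchmark, the lower the confidence. The sketch below is hypothetical (the distance-based score, function name, and scaling choice are all assumptions, not the project's defined indicator):

```python
import numpy as np

def confidence_indicator(benchmark_meta, new_meta):
    """Distance-based trust score for applying a model to a new dataset.

    benchmark_meta: (n_benchmarks, n_features) meta-feature vectors of the
    datasets used for learning; new_meta: meta-feature vector of the new
    dataset. Returns a score in (0, 1]: values near 1 mean the new dataset
    resembles the benchmarks, values near 0 mean it lies far outside the
    characteristics covered during learning.
    """
    B = np.asarray(benchmark_meta, dtype=float)
    x = np.asarray(new_meta, dtype=float)
    mu, sigma = B.mean(axis=0), B.std(axis=0) + 1e-12
    Bz, xz = (B - mu) / sigma, (x - mu) / sigma   # shared standardization
    nearest = np.linalg.norm(Bz - xz, axis=1).min()
    # Distance scale: typical nearest-neighbour spacing within the benchmarks.
    pair = np.linalg.norm(Bz[:, None, :] - Bz[None, :, :], axis=2)
    np.fill_diagonal(pair, np.inf)
    scale = np.median(pair.min(axis=1)) + 1e-12
    return float(np.exp(-nearest / scale))
```

A new dataset that coincides with a benchmark scores 1; a dataset many benchmark-spacings away scores close to 0, signalling that the model's benchmarked performance may not transfer.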
Funding
ARIS