Algorithm selection (AS) is the task of choosing the best optimization algorithm for a particular problem. It typically involves training a model on problem landscape features to predict algorithm performance, but a key challenge remains: making AS models generalize to benchmark suites that were not seen during training. This study assesses the generalizability of AS models in single-objective numerical optimization across diverse benchmark suites. Using Exploratory Landscape Analysis (ELA) features and transformer-based (TransOpt) features, it investigates their individual and combined effectiveness for AS across four benchmark suites: BBOB, AFFINE, RANDOM, and ZIGZAG. How well the AS models perform depends on how similar the benchmark suites are in their algorithm performance distributions and single-best solvers. When the suites align in these respects, the models underperform a baseline that predicts mean algorithm performance; when the suites differ in performance distributions and single-best solvers, the models outperform the baseline. AS models trained on ELA features outperform those trained on TransOpt features on the BBOB and AFFINE benchmark suites, while the opposite holds for the RANDOM and ZIGZAG benchmark suites. Combining ELA and TransOpt features improves results only for the RANDOM and ZIGZAG benchmark suites. Overall, the study shows that problem landscape features (ELA or TransOpt) struggle to capture algorithm performance accurately, which limits the applicability of AS models.
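To make the evaluation setup above concrete, the following is a minimal sketch of a cross-suite comparison between a feature-based AS model and the mean-performance baseline. It is not the study's implementation: the ridge regressor, the `selection_loss` metric, the suite names `suite_a` and `suite_b`, and the synthetic data standing in for ELA or TransOpt features are all assumptions made for illustration, and lower performance values are assumed to be better.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Fit a multi-output ridge regressor (the bias is regularized too, for simplicity)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)

def predict_ridge(W, X):
    """Predict one performance value per algorithm for each problem."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W

def selection_loss(Y_true, Y_pred):
    """Mean gap between the selected algorithm and the per-problem best one."""
    chosen = np.argmin(Y_pred, axis=1)  # algorithm with the best predicted performance
    picked = Y_true[np.arange(len(Y_true)), chosen]
    return float(np.mean(picked - Y_true.min(axis=1)))

def cross_suite_eval(suites, train_name, test_name):
    """Train on one suite, test on another, and compare with the mean baseline."""
    X_tr, Y_tr = suites[train_name]
    X_te, Y_te = suites[test_name]
    W = fit_ridge(X_tr, Y_tr)
    model_loss = selection_loss(Y_te, predict_ridge(W, X_te))
    # Baseline: predict each algorithm's mean training performance for every test problem.
    mean_pred = np.tile(Y_tr.mean(axis=0), (len(Y_te), 1))
    baseline_loss = selection_loss(Y_te, mean_pred)
    return model_loss, baseline_loss

# Hypothetical data: 5 landscape features, 3 algorithms, 2 synthetic suites.
rng = np.random.default_rng(0)

def make_suite(n_problems, shift):
    X = rng.normal(loc=shift, size=(n_problems, 5))
    true_W = rng.normal(size=(5, 3))
    Y = X @ true_W + rng.normal(scale=0.1, size=(n_problems, 3))
    return X, Y

suites = {"suite_a": make_suite(60, 0.0), "suite_b": make_suite(60, 1.0)}
model_loss, baseline_loss = cross_suite_eval(suites, "suite_a", "suite_b")
print(f"AS model loss: {model_loss:.3f}, mean baseline loss: {baseline_loss:.3f}")
```

Because the baseline predicts the same values for every test problem, it always selects the training suite's best algorithm on average. This is why it is hard to beat when the training and test suites share a single-best solver.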