
JCTAM OPEN ACCESS
Journal of Computer Technology and Applied Mathematics
ISSN:3007-4126 (print) | ISSN:3007-4134 (online) | Publication Frequency: Bimonthly
Benchmarking Learned Cardinality Estimation Techniques for Analytical Query Processing in Data Warehouses
* Corresponding Author: Jiacheng Hu, E-Mail: jessicar@gmail.com
Publication
Accepted 2026 May 14; Published 2026 May 18
Journal of Computer Technology and Applied Mathematics, 2026, 3(3), 3007-4126.
Abstract
Cardinality estimation remains one of the most critical yet error-prone components of query optimization in modern data warehouses. Recent advances in machine learning have produced a diverse family of learned cardinality estimators that demonstrate substantial accuracy improvements on standard benchmarks. Yet existing evaluations predominantly rely on third-normal-form schemas, leaving their effectiveness on star and snowflake schemas, the backbone of analytical data warehousing, largely unexplored. This paper presents a systematic empirical evaluation of seven representative learned cardinality estimation methods spanning three paradigmatic categories: query-driven, data-driven, and hybrid approaches. All methods are benchmarked against the PostgreSQL histogram-based estimator on three complementary datasets: TPC-DS with its native snowflake schema, STATS-CEB with real-world relational data, and IMDB/JOB as the established cross-study reference. The evaluation covers estimation accuracy measured by Q-Error and P-Error, inference latency, training cost, model compactness, end-to-end query execution time, and robustness under simulated ETL batch insertions. Results indicate that hybrid methods, particularly FactorJoin, achieve the strongest accuracy on data warehouse workloads, with a median Q-Error of 1.74 on TPC-DS, while data-driven methods such as FLAT and BayesCard offer a favorable balance between accuracy and inference speed. BayesCard and FactorJoin exhibit the highest resilience to data updates, with median Q-Error increasing by less than 1.5 after a 50% data insertion. These findings provide actionable guidance for practitioners seeking to deploy learned cardinality estimation in production data warehouse environments.
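For readers unfamiliar with the accuracy metric quoted above, the following is a minimal sketch of Q-Error as it is commonly defined in the cardinality estimation literature: the larger of the two ratios between estimated and true cardinality, so that a perfect estimate scores 1. The function names, the sample pairs, and the choice to floor both values at 1 (to avoid division by zero) are assumptions made for illustration; the abstract does not state how the paper handles empty results or which aggregation code it uses.

```python
from statistics import median


def q_error(estimated: float, actual: float, floor: float = 1.0) -> float:
    """Return max(est/act, act/est); 1.0 means a perfect estimate.

    Both values are clamped to `floor` first. This clamping is an
    assumption of this sketch, not something stated in the abstract.
    """
    est = max(estimated, floor)
    act = max(actual, floor)
    return max(est / act, act / est)


def median_q_error(pairs):
    """Median Q-Error over (estimated, actual) cardinality pairs."""
    return median(q_error(est, act) for est, act in pairs)


# Hypothetical (estimated, actual) row counts for three queries.
sample = [(100, 100), (200, 100), (10, 40)]
print(median_q_error(sample))
```

Because the metric is symmetric, an overestimate by a factor of 2 and an underestimate by a factor of 2 both score 2, which is why a median close to 1 indicates that typical estimates are within a small multiplicative factor of the truth.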
Keywords
Learned Cardinality Estimation, Data Warehouse, Query Optimization, Benchmark Evaluation.
Metadata
Pages: 1-8
References: 21
Disciplines: Software Systems
Subjects: Other
Cite This Article
APA Style
Hu, J., Wang, X., & Lai, J. (2026). Benchmarking learned cardinality estimation techniques for analytical query processing in data warehouses. Journal of Computer Technology and Applied Mathematics, 3(3), 1-8. https://doi.org/10.70393/6a6374616d.343134
Acknowledgments
Not Applicable.
FUNDING
Not Applicable.
INSTITUTIONAL REVIEW BOARD STATEMENT
Not Applicable.
DATA AVAILABILITY STATEMENT
Not Applicable.
INFORMED CONSENT STATEMENT
Not Applicable.
CONFLICT OF INTEREST
Not Applicable.
AUTHOR CONTRIBUTIONS
Not Applicable.
PUBLISHER'S NOTE
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Copyright © 2025 The Author(s). Published by Southern United Academy of Sciences. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.