Details

SyntheRela: a benchmark for synthetic relational database generation
Hudovernik, Valter (Author), Jurkovič, Martin (Author), Štrumbelj, Erik (Author)

PDF - Presentation file (13.13 MB)
MD5: 95EE2739509E0982590930C03B450B2F
URL - Source URL: https://openreview.net/pdf?id=Mi8XioazWy

Abstract
Synthesizing relational databases has started to receive more attention from researchers, practitioners, and industry. The task is more difficult than synthesizing a single table due to the added complexity of relationships between tables. For the same reasons, benchmarking methods for synthesizing relational databases introduces new challenges. Our work is motivated by a lack of an empirical evaluation of state-of-the-art methods and by gaps in the understanding of how such an evaluation should be done. We review related work on relational database synthesis, common benchmarking datasets, and approaches to measuring the fidelity and utility of synthetic data. We combine best practices, a novel robust detection metric, and a novel approach to evaluating utility with graph neural networks into a benchmarking tool. We use this benchmark to compare 6 open-source methods over 8 real-world databases, with a total of 39 tables. The open-source SyntheRela benchmark is available on GitHub with a public leaderboard.

Language:English
Keywords:relational databases, synthetic data, benchmarking, evaluation
Work type:Article
Typology:1.01 - Original Scientific Article
Organization:FRI - Faculty of Computer and Information Science
Publication status:Published
Publication version:Version of Record
Year:2026
Number of pages:37
PID:20.500.12556/RUL-182333
UDC:004.65
ISSN on article:2835-8856
COBISS.SI-ID:276886531
Publication date in RUL:07.05.2026

Record is a part of a journal

Title:Transactions on machine learning research
Shortened title:Transact. mach. learn. res.
Publisher:OpenReview.net
ISSN:2835-8856
COBISS.SI-ID:233590019

Licences

License:CC BY 4.0, Creative Commons Attribution 4.0 International
Link:http://creativecommons.org/licenses/by/4.0/
Description:This is the standard Creative Commons license that gives others maximum freedom to do what they want with the work as long as they credit the author.

Secondary language

Language:Slovenian
Keywords:relacijske baze, sintetični podatki, primerjalna analiza, ocenjevanje
