The full publication list can be found on my Google Scholar profile. Below are selected publications.

Quantum Data Management

2026

  • (Poster) Giancarlo Gatti and Rihan Hai. Private quantum database. The 29th Annual Quantum Information Processing Conference, January 2026.
    paper poster

2025

  • Rihan Hai, Shih-Han Hung, Tim Coopmans, Tim Littau, and Floris Geerts. Quantum data management in the NISQ era. PVLDB, 18(6):1720–1729, 2025.
    paper code tech report slides poster

  • (Demo) Tim Littau and Rihan Hai. Qymera: Simulating quantum circuits using RDBMS. In SIGMOD, pages 179–182, 2025.
    paper video

2024

  • Rihan Hai, Shih-Han Hung, and Sebastian Feld. Quantum data management: From theory to opportunities. In ICDE, pages 5376–5381, 2024.
    paper slides

AI in Data Lakes

2026

  • Wenbo Sun, Qiming Guo, Wenlu Wang, and Rihan Hai. TranSQL+: Serving large language models with SQL on low-resource hardware. SIGMOD, 3(6):1–27, December 2025.
    paper

  • Aditya Shankar, Lydia Chen, Arie van Deursen, and Rihan Hai. WaveStitch: Flexible and fast conditional time series generation with diffusion models. SIGMOD, 3(6):1–25, December 2025.
    paper code

  • Aditya Shankar, Yuandou Wang, Rihan Hai, and Lydia Y. Chen. Harpoon: Generalised Manifold Guidance for Conditional Tabular Diffusion. ICLR, 2026. To appear.
    paper

2025

  • (🏆 Best SIGMOD demo runner-up) Wenbo Sun, Ziyu Li, and Rihan Hai. Database as runtime: Compiling LLMs to SQL for in-database model serving. In SIGMOD, pages 231–234, 2025.
    paper video

2024

  • Ziyu Li, Wenbo Sun, Danning Zhan, Yan Kang, Lydia Chen, Alessandro Bozzon, and Rihan Hai. Amalur: The convergence of data integration and machine learning. TKDE, pages 1–14, 2024.
    paper

  • Ziyu Li, Hilco Van Der Wilk, Danning Zhan, Megha Khosla, Alessandro Bozzon, and Rihan Hai. Model selection with model zoo via graph learning. In ICDE, pages 1296–1309, 2024.
    paper code slides

  • Andra Ionescu, Kiril Vasilev, Florena Buse, Rihan Hai, and Asterios Katsifodimos. AutoFeat: Transitive feature discovery over join paths. In ICDE, pages 1861–1873, 2024.
    paper code slides

  • Aditya Shankar, Hans Brouwer, Rihan Hai, and Lydia Chen. SiloFuse: Cross-silo synthetic data generation with latent tabular diffusion models. In ICDE, pages 110–123, 2024.
    paper slides

  • (Demo) Ziyu Li, Wenjie Zhao, Asterios Katsifodimos, and Rihan Hai. LLM-PQA: LLM-enhanced prediction query answering. In CIKM, pages 5239–5243, 2024.
    paper code

  • (Demo) Andra Ionescu, Zeger Mouw, Efthimia Aivaloglou, Rihan Hai, and Asterios Katsifodimos. Human-in-the-loop feature discovery for tabular data. In CIKM, pages 5215–5219, 2024.
    paper video

2023

  • Rihan Hai, Christos Koutras, Christoph Quix, and Matthias Jarke. Data lakes: A survey of functions and systems. TKDE, 35(12):12571–12590, 2023.
    paper poster

  • Rihan Hai, Christos Koutras, Andra Ionescu, Ziyu Li, Wenbo Sun, Jessie van Schijndel, Yan Kang, and Asterios Katsifodimos. Amalur: Data integration meets machine learning. In ICDE, pages 3729–3739, 2023.
    paper slides

During PhD

  • Rihan Hai and Christoph Quix. Rewriting of Plain SO Tgds into Nested Tgds. In VLDB, pages 1526–1538, 2019.
    paper slides

  • (Demo) Rihan Hai, Sandra Geisler, and Christoph Quix. Constance: An intelligent data lake system. In SIGMOD, pages 2097–2100, 2016.
    paper

  • (Book) Mathieu D’Aquin, Alessandro Adamou, Nicolas Arnaud, Houssem Chihoub, Christine Collet, Rihan Hai, et al. Data Lakes. ISTE / Wiley, 2020.
    paper