
DiaData: an integrated large dataset for type 1 diabetes and hypoglycemia research

Publication date
2025-11-14
Document type
Conference paper
Author
Cinar, Beyza  
Maleshkova, Maria  
Organisational unit
Data Engineering  
DOI
10.1051/bioconf/202519503001
URI
https://openhsu.ub.hsu-hh.de/handle/10.24405/21630
Conference
9th International Conference on Biomedical Engineering and Bioinformatics (ICBEB 2025) ; Prague, Czech Republic ; September 19-21, 2025
Publisher
EDP Sciences
Series or journal
BIO Web of Conferences
ISSN
2117-4458
Periodical volume
195
Article ID
03001
Is referenced by
https://openhsu.ub.hsu-hh.de/handle/10.24405/20048
Part of the university bibliography
✅
Additional Information
Language
English
Abstract
Type 1 diabetes (T1D) is an incurable autoimmune disorder that requires attentive monitoring to avoid large glucose variations. Affected individuals cannot produce sufficient insulin and depend on external insulin injections. Multiple factors affect glucose levels and can lead to dangerous episodes of hyperglycemia (≥ 180 mg/dL) and hypoglycemia (≤ 70 mg/dL). Data analysis can significantly enhance diabetes care by discovering individual trends and enabling tailored decision support. In particular, machine learning (ML) approaches can provide early alerts and predict glucose levels. However, the main limitation in diabetes research is the unavailability of large datasets. Therefore, this study systematically integrates 15 datasets to create a comprehensive database of 2510 subjects with glucose measurements recorded every 5 minutes. In total, 149 million measurements are included: euglycemia (58.3%), hyperglycemia (37.5%), and hypoglycemia (4.2%). Moreover, two sub-databases are extracted that include demographics or heart-rate data. The integrated dataset provides an equal distribution of sex and a variety of age levels. As a further contribution, data quality is assessed, revealing that missing values and data imbalance present significant challenges. Thus, the application of ML models necessitates appropriate preprocessing methods.
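The glycemic ranges named in the abstract can be illustrated with a minimal Python sketch. Only the two thresholds (≤ 70 mg/dL and ≥ 180 mg/dL) come from the abstract; the function names, the sample readings, and the share calculation are hypothetical and are not taken from the DiaData pipeline.

```python
from collections import Counter

# Thresholds as stated in the abstract (mg/dL).
HYPO_MAX_MG_DL = 70    # hypoglycemia: readings <= 70
HYPER_MIN_MG_DL = 180  # hyperglycemia: readings >= 180


def classify_glucose(mg_dl: float) -> str:
    """Label a single glucose reading (hypothetical helper)."""
    if mg_dl <= HYPO_MAX_MG_DL:
        return "hypoglycemia"
    if mg_dl >= HYPER_MIN_MG_DL:
        return "hyperglycemia"
    return "euglycemia"


def class_shares(readings) -> dict:
    """Return the percentage of readings per class (hypothetical helper)."""
    counts = Counter(classify_glucose(v) for v in readings)
    total = sum(counts.values())
    if total == 0:
        return {}  # no readings, so no shares to report
    return {label: 100.0 * n / total for label, n in counts.items()}


# Made-up readings for illustration only, not values from the dataset.
sample = [65, 70, 110, 150, 180, 240, 95, 130]
shares = class_shares(sample)
print(shares)
```

Applied to a full set of readings, a calculation of this kind would yield class percentages comparable in form to those reported in the abstract, and it makes the imbalance between hypoglycemic and other readings easy to see.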
Description
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0 (https://creativecommons.org/licenses/by/4.0/).
Version
Published version
Access right on openHSU
Metadata only access
