publications
2026
- On-device cough detection and respiratory disease classification enhanced by generative data augmentation. George Kontogiannis, Pantelis Tzamalis, Anastases Giannikopoulos, and Sotiris Nikoletseas. Computers in Biology and Medicine, 2026.
Background: Cough sounds are accessible, non-invasive biomarkers for respiratory disease assessment and can be captured using consumer-grade smartphones. Existing approaches typically focus solely on cough detection or rely on server-based deep learning for disease classification, which limits deployability and raises privacy concerns. Small, imbalanced cough datasets further hinder model generalization. Objective: To develop a multilayer, smartphone-compatible AI framework for automated cough detection and respiratory disease classification, and to propose a pioneering generative augmentation strategy utilizing a suite of five Variational Autoencoder (VAE) variants and a probabilistic cough-level fusion mechanism to improve disease classification under severe data scarcity and the limitations of conventional audio augmentation techniques. Methods: The proposed framework consists of three AI modules: (1) A Cough Detection Module (CDM) that performs real-time cough event detection and segmentation from continuous audio using lightweight models optimized for on-device execution. (2) A Disease Analysis Module (DAM) that classifies cough events into asthma, COVID-19, or healthy classes using parallel Support Vector Machine classifiers and a probabilistic cough-level fusion strategy. (3) A Generative Augmentation Module (GAM) employing five distinct VAE architectures. This module uniquely operates across the time-frequency domain for latent feature optimization while reconstructing samples in the time-domain to ensure acoustic verifiability. Results: The CDM provides reliable segmentation across heterogeneous recording conditions. The DAM achieves strong discriminability between asthma, COVID-19, and healthy coughs using compact cepstral and spectral features. 
The multi-variant GAM framework demonstrates superior efficacy in alleviating class imbalance; specifically, the cross-domain (time-frequency to time-domain) reconstruction allows for the clinical verification of synthetic biomarkers. All system components operate in real time on commodity Android hardware. Conclusion: This integrated framework addresses key limitations in dataset availability, model generalizability, and deployability, introducing an interpretable generative approach that demonstrates the feasibility of smartphone-based acoustic sensing as a scalable and privacy-preserving tool for respiratory health monitoring.
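The abstract does not specify how the cough-level fusion is computed. As a rough illustration only, the sketch below assumes the simplest probabilistic scheme: average the per-cough class probabilities over a recording and pick the most likely class. The function name `fuse_cough_probabilities`, the class labels, and the toy numbers are hypothetical and not taken from the paper.

```python
from typing import Dict, List

# Hypothetical class labels for illustration; the paper's label names may differ.
CLASSES = ("asthma", "covid19", "healthy")


def fuse_cough_probabilities(cough_probs: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-cough class probabilities into one recording-level distribution.

    Assumption: mean fusion. The published fusion rule may differ.
    """
    if not cough_probs:
        raise ValueError("at least one cough event is required")
    n = len(cough_probs)
    return {c: sum(p[c] for p in cough_probs) / n for c in CLASSES}


# Toy per-cough outputs, e.g. from calibrated classifiers (made-up values).
coughs = [
    {"asthma": 0.5, "covid19": 0.25, "healthy": 0.25},
    {"asthma": 0.75, "covid19": 0.0, "healthy": 0.25},
]
fused = fuse_cough_probabilities(coughs)
label = max(fused, key=fused.get)  # recording-level decision
```

Averaging keeps the fused output a valid probability distribution, so a single noisy cough cannot dominate the recording-level decision.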
@article{KONTOGIANNIS2026111784,
  title = {On-device cough detection and respiratory disease classification enhanced by generative data augmentation},
  journal = {Computers in Biology and Medicine},
  volume = {213},
  pages = {111784},
  year = {2026},
  issn = {0010-4825},
  doi = {10.1016/j.compbiomed.2026.111784},
  url = {https://www.sciencedirect.com/science/article/pii/S0010482526003483},
  author = {Kontogiannis, George and Tzamalis, Pantelis and Giannikopoulos, Anastases and Nikoletseas, Sotiris},
  keywords = {Cough detection, Cough sound analysis, Generative data augmentation, Support vector machines, Convolutional neural networks, Variational autoencoders, On-device machine learning, Respiratory health monitoring, Ubiquitous computing}
}
- Metadata-Conditioned Audio Transformers for Adaptive Respiratory Sound Classification. George Kontogiannis, Pantelis Tzamalis, and Sotiris Nikoletseas. In Proceedings of the 34th European Signal Processing Conference (EUSIPCO), 2026. In Press.
Respiratory sound classification from auscultation recordings is confounded by heterogeneous acquisition conditions, including recording device, auscultation site, and patient characteristics, that introduce systematic variability unrelated to the underlying pathology. We propose a family of metadata-conditioning mechanisms for the Audio Spectrogram Transformer (AST), progressing from gated residual fusion to Feature-wise Linear Modulation (FiLM) adapted to the Transformer architecture, including two novel variants: Token-Aware FiLM (TAFiLM) and Soft-Factorized FiLM (SoftFiLM). Experiments on the ICBHI benchmark demonstrate consistent gains across all conditioning strategies. SoftFiLM achieves 64.11% on the official 4-class task (+3.94% over the baseline) and 72.40% on the 2-class task, establishing a new state-of-the-art on the 2-class task.
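For readers unfamiliar with FiLM, its basic operation scales and shifts each feature channel using parameters derived from a conditioning input such as recording metadata. The plain-Python sketch below shows only this generic form. The function name `film`, the hard-coded `gamma` and `beta`, and the toy values are illustrative assumptions, and the sketch does not reproduce the TAFiLM or SoftFiLM variants.

```python
from typing import List


def film(tokens: List[List[float]], gamma: List[float], beta: List[float]) -> List[List[float]]:
    """Generic FiLM: out[t][c] = gamma[c] * tokens[t][c] + beta[c].

    In a real model, gamma and beta would be predicted from metadata by a
    small network; here they are passed in directly (illustrative assumption).
    """
    for tok in tokens:
        if not (len(tok) == len(gamma) == len(beta)):
            raise ValueError("token, gamma, and beta dimensions must match")
    return [[g * x + b for x, g, b in zip(tok, gamma, beta)] for tok in tokens]


# Two tokens with two feature channels each (toy values).
tokens = [[1.0, 2.0], [3.0, 4.0]]
gamma = [2.0, 0.5]  # per-channel scale
beta = [0.0, 1.0]   # per-channel shift
modulated = film(tokens, gamma, beta)
```

Here the same scale and shift are applied to every token, which is the plain form of the technique.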
@inproceedings{kontogiannis2026eusipco,
  author = {Kontogiannis, George and Tzamalis, Pantelis and Nikoletseas, Sotiris},
  title = {Metadata-Conditioned Audio Transformers for Adaptive Respiratory Sound Classification},
  booktitle = {Proceedings of the 34th European Signal Processing Conference (EUSIPCO)},
  year = {2026},
  note = {In Press},
  url = {https://github.com/kontogiannisg/Metadata-Conditioned_Audio_Transformers}
}
- AI-Based Jaundice Assessment from Ocular Images: A Structured Survey of Methods, Datasets, and Open Challenges. George Kontogiannis, Pantelis Tzamalis, Dimitrios Velissaris, and Sotiris Nikoletseas. Healthcare Informatics Research, 2026. Under Review.
We provide the first systematic, PRISMA-guided survey of AI-based methods for non-invasive jaundice detection and bilirubin prediction from adult ocular images. Following PRISMA guidelines, 47 studies were included from 520 retrieved records. A five-axis taxonomy categorises studies by task type, AI methodology, sclera segmentation strategy, input acquisition and calibration, and deployment setting. The best-performing bilirubin regression systems achieved Pearson correlations of 0.89 and R² of 0.956; state-of-the-art sclera segmentation reached F1-scores of 96.66% and IoU of 93.59%. Five critical gaps were identified: small datasets, absent standardised evaluation protocols, dominance of binary classification, disconnect between sclera segmentation and jaundice detection communities, and absence of prospective clinical validation.
@article{kontogiannis2026jaundice,
  author = {Kontogiannis, George and Tzamalis, Pantelis and Velissaris, Dimitrios and Nikoletseas, Sotiris},
  title = {{AI}-Based Jaundice Assessment from Ocular Images: A Structured Survey of Methods, Datasets, and Open Challenges},
  journal = {Healthcare Informatics Research},
  year = {2026},
  note = {Under Review},
}
- Transforming Smartphones into Stethoscopes: Edge AI for Asthma and COPD. George Kontogiannis, Pantelis Tzamalis, Sotiris Nikoletseas, Theodora Karamanidou, and Thanos G. Stavropoulos. IEEE Journal of Biomedical and Health Informatics, 2026. Under Review.
@article{kontogiannis2026stethoscope,
  author = {Kontogiannis, George and Tzamalis, Pantelis and Nikoletseas, Sotiris and Karamanidou, Theodora and Stavropoulos, Thanos G.},
  title = {Transforming Smartphones into Stethoscopes: Edge {AI} for Asthma and {COPD}},
  journal = {IEEE Journal of Biomedical and Health Informatics},
  year = {2026},
  note = {Under Review},
}
2025
- Spectral clustering and query expansion using embeddings on the graph-based extension of the set-based information retrieval model. Nikitas-Rigas Kalogeropoulos, George Kontogiannis, and Christos Makris. Expert Systems with Applications, 2025.
This paper presents a straightforward yet novel approach to enhance graph-based information retrieval models, by calibrating the relationships between node terms, leading to better evaluation metrics at the retrieval phase, and by reducing the total size of the graph. This is achieved by integrating spectral clustering, embedding-based graph pruning and term re-weighting. Spectral clustering assigns each term to a specific cluster, allowing us to propose two pruning methods: out-cluster and in-cluster pruning based on node similarities. In-cluster pruning refers to pruning edges between terms within the same cluster, while out-cluster pruning refers to pruning edges that connect different clusters. Both methods utilize spectral embeddings to assess node similarities, resulting in more manageable clusters termed concepts. These concepts are likely to contain semantically similar terms, with each term’s concept defined as the centroid of its cluster. We show that this graph pruning strategy significantly enhances the performance and effectiveness of the overall model while also reducing the graph's density. Moreover, during the retrieval phase, the conceptually calibrated centroids are used to re-weight terms generated by user queries, and the precomputed embeddings enable efficient query expansion through a k-Nearest Neighbors (k-NN) approach, offering substantial enhancement with minimal additional time cost. To the best of our knowledge, this is the first application of spectral clustering and embedding-based conceptualization to prune graph-based IR models. Our approach enhances both retrieval efficiency and performance while enabling effective query expansion with minimal additional computational overhead. Our proposed technique is applied across various graph-based information retrieval models, improving evaluation metrics and producing sparser graphs.
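The k-NN query expansion step mentioned in the abstract can be pictured with a small sketch. It assumes cosine similarity over precomputed term embeddings and a brute-force neighbor search. The function names, the toy two-dimensional vectors, and the choice of cosine similarity are illustrative assumptions, not details confirmed by the paper.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(query_terms: List[str], embeddings: Dict[str, List[float]], k: int) -> List[str]:
    """Append the k nearest vocabulary terms (by cosine) of each query term."""
    expanded = list(query_terms)
    for term in query_terms:
        if term not in embeddings:
            continue  # out-of-vocabulary terms are left unexpanded
        scored = [
            (cosine(embeddings[term], vec), other)
            for other, vec in embeddings.items()
            if other != term
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        for _, other in scored[:k]:
            if other not in expanded:
                expanded.append(other)
    return expanded


# Toy precomputed embeddings (made-up values).
embeddings = {
    "car": [1.0, 0.0],
    "automobile": [0.9, 0.1],
    "banana": [0.0, 1.0],
}
expanded = expand_query(["car"], embeddings, k=1)
```

Because the embeddings are computed ahead of time, the only query-time cost in this sketch is the neighbor lookup, which is consistent with the low overhead the abstract describes.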
2024
- Respiratory diseases diagnosis using audio analysis and artificial intelligence: a systematic review. Panagiotis Kapetanidis, Fotios Kalioras, Constantinos Tsakonas, and 5 more authors. Sensors, 2024.
- Exploring the Impact of Synthetic Data on Human Gesture Recognition Tasks Using GANs. George Kontogiannis, Pantelis Tzamalis, and Sotiris Nikoletseas. In Proceedings of the 2024 20th International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT), 2024.
In the evolving domain of Human Activity Recognition (HAR) using Internet of Things (IoT) devices, there is an emerging interest in employing Deep Generative Models (DGMs) to address data scarcity, enhance data quality, and improve classification metrics. Among these types of models, Generative Adversarial Networks (GANs) have arisen as a powerful tool for generating synthetic data that mimic real-world scenarios with high fidelity. However, Human Gesture Recognition (HGR), a subset of HAR, particularly in healthcare applications using time series data such as allergic gestures, remains largely unexplored. In this paper, we examine and evaluate the performance of two GANs in the generation of synthetic gesture motion data that compose a part of an open-source benchmark dataset. The data is related to the disease identification domain and healthcare, specifically to allergic rhinitis. We also focus on these AI models’ performance in terms of fidelity, diversity, and privacy. Furthermore, we examine whether synthetic data can substitute for real data in training, and how well models trained on synthetic data generalize to the allergic rhinitis gestures. In our work, these gestures are related to 6-axis accelerometer and gyroscope data, serving as multi-variate time series instances, and retrieved from smart wearable devices. To the best of our knowledge, this study is the first to explore the feasibility of synthesizing motion gestures for allergic rhinitis from wearable IoT device data using GANs and testing their impact on the generalization of gesture recognition systems. It is worth noting that, even though our method has been applied to a specific category of gestures, it is designed to be general and can also be applied to other motion data in the HGR domain.
@inproceedings{kontogiannis2024dcoss,
  author = {Kontogiannis, George and Tzamalis, Pantelis and Nikoletseas, Sotiris},
  title = {Exploring the Impact of Synthetic Data on Human Gesture Recognition Tasks Using {GANs}},
  booktitle = {Proceedings of the 2024 20th International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT)},
  pages = {384--391},
  publisher = {IEEE},
  year = {2024},
}