The complete list of publications can be found on IRIS UNIL, Google Scholar, or ORCID. Only the most recent publications are shown below.
- Deep Learning for fODF Estimation in Infant Brains: Model Comparison, Ground-Truth Impact, and Domain Shift Mitigation, by Lin, Rizhong on 1 October 2025
Authors: Lin, Rizhong; Kebiri, Hamza; Gholipour, Ali; Chen, Yufei; Thiran, Jean-Philippe; Karimi, Davood; Bach Cuadra, Meritxell
Abstract: The accurate estimation of fiber orientation distribution functions (fODFs) in diffusion magnetic resonance imaging (MRI) is crucial for understanding early brain development and its potential disruptions. Although supervised deep learning (DL) models have shown promise in fODF estimation from neonatal diffusion MRI (dMRI) data, the out-of-domain (OOD) performance of these models remains largely unexplored, especially under diverse domain shift scenarios. This study evaluated the robustness of three state-of-the-art DL architectures: multilayer perceptron (MLP), transformer, and U-Net/convolutional neural network (CNN) on fODF predictions derived from dMRI data. Using 488 subjects from the developing Human Connectome Project (dHCP) and the Baby Connectome Project (BCP) datasets, we reconstructed reference fODFs from the full dMRI series using single-shell three-tissue constrained spherical deconvolution (SS3T-CSD) and multi-shell multi-tissue CSD (MSMT-CSD) to generate reference fODF reconstructions for model training, and systematically assessed the impact of age, scanner/protocol differences, and input dimensionality on model performance. Our findings reveal that U-Net consistently outperformed other models when fewer diffusion gradient directions were used, particularly with the SS3T-CSD-derived ground truth, which showed superior performance in capturing crossing fibers. However, as the number of input diffusion gradient directions increased, MLP and the transformer-based model exhibited steady gains in accuracy. Nevertheless, performance nearly plateaued from 28 to 45 input directions in all models. Age-related domain shifts showed asymmetric patterns, being less pronounced in late developmental stages (late neonates and babies), with SS3T-CSD demonstrating greater robustness to variability compared to MSMT-CSD. To address inter-site domain shifts, we implemented two adaptation strategies: the Method of Moments (MoM) and fine-tuning. Both strategies achieved significant improvements (p < 0.05) in over 95% of tested configurations, with fine-tuning consistently yielding superior results and U-Net benefiting the most from increased target subjects. This study represents the first systematic evaluation of OOD settings in DL applications to fODF estimation, providing critical insights into model robustness and adaptation strategies for diverse clinical and research applications. © 2025 The Author(s). Human Brain Mapping published by Wiley Periodicals LLC.
- ConfLUNet: Multiple sclerosis lesion instance segmentation in presence of confluent lesions, by Wynen, Maxence on 1 October 2025
Authors: Wynen, Maxence; Gordaliza, Pedro M; Istasse, Maxime; Stölting, Anna; Maggi, Pietro; Macq, Benoît; Bach Cuadra, Meritxell
Abstract: Accurate lesion-level segmentation on MRI is critical for multiple sclerosis (MS) diagnosis, prognosis, and disease monitoring. However, current evaluation practices largely rely on semantic segmentation post-processed with connected components (CC), which cannot separate confluent lesions (aggregates of confluent lesion units, CLUs) due to reliance on spatial connectivity. To address this misalignment with clinical needs, we introduce formal definitions of CLUs and associated CLU-aware detection metrics, and include them in an exhaustive instance segmentation evaluation framework. Within this framework, we systematically evaluate CC and post-processing-based Automated Confluent Splitting (ACLS), the only existing methods for lesion instance segmentation in MS. Our analysis reveals that CC consistently underestimates CLU counts, while ACLS tends to oversplit lesions, leading to overestimated lesion counts and reduced precision. To overcome these limitations, we propose ConfLUNet, the first end-to-end instance segmentation framework for MS lesions. ConfLUNet jointly optimizes lesion detection and delineation from a single FLAIR image. Trained on 50 patients, ConfLUNet significantly outperforms CC and ACLS on the held-out test set (n = 13) in instance segmentation (Panoptic Quality: 42.0% vs. 37.5%/36.8%; p = 0.017/0.005) and lesion detection (F1: 67.3% vs. 61.6%/59.9%; p = 0.028/0.013). For CLU detection, ConfLUNet achieves the highest F1^CLU (81.5%), improving recall over CC (+12.5%, p = 0.015) and precision over ACLS (+31.2%, p = 0.003). By combining rigorous definitions, new CLU-aware metrics, a reproducible evaluation framework, and the first dedicated end-to-end model, this work lays the foundation for lesion instance segmentation in MS. Copyright © 2025 The Authors. Published by Elsevier Ltd. All rights reserved.
- Biometry and volumetry in multi-centric fetal brain magnetic resonance imaging: assessing the bias of super-resolution reconstruction, by Sanchez, Thomas on 1 September 2025
Authors: Sanchez, Thomas; Mihailov, Angeline; Koob, Mériam; Girard, Nadine; Manchon, Aurélie; Valenzuela, Ignacio; Gómez-Chiari, Marta; Martí Juan, Gerard; Pron, Alexandre; Eixarch, Elisenda; Piella, Gemma; Gonzalez Ballester, Miguel Angel; Camara, Oscar; Dunet, Vincent; Auzias, Guillaume; Bach Cuadra, Meritxell
Abstract: Fetal brain MRI is increasingly used to complement ultrasound imaging. Images are processed using complex super-resolution reconstruction pipelines, which may bias biometric and volumetric measurements. This study aimed to assess the consistency of two-dimensional (2-D) biometric and three-dimensional (3-D) volumetric measurements across three hospitals using three widely used super-resolution reconstruction pipelines. This retrospective multi-centric study used T2-weighted fetal brain MRI scans acquired at three hospitals between 2009 and 2023. MRIs from each subject were reconstructed with each super-resolution reconstruction method, and biometric measurements were performed by four experts. Automated 3-D volumetry was performed using a state-of-the-art segmentation method. A qualitative evaluation assessed the clinicians' likelihood of using super-resolution reconstructed volumes in their practice. Eighty-four healthy subjects were included. Biometric measurements revealed statistically significant changes that consistently remained below voxel width (0.8 mm; P < 0.001). Automated 3-D volumetry revealed small systematic effects (<2.8%; P < 0.001). The qualitative evaluation showed systematic differences between super-resolution reconstruction methods for the perception of white matter intensity (P = 0.02) and sharpness of the image (P = 0.01). Variations in 2-D and 3-D quantitative measurements did not show any large systematic bias when using different super-resolution reconstruction methods for clinical radiological assessment across centers, scanners, and raters. © 2025. The Author(s).
- A dataset of synthetic, maturation-informed magnetic resonance images of the human fetal brain, by Lajous, H. on 10 April 2025
Authors: Lajous, H.; Le Boeuf Fló, A.; Gordaliza, P.M.; Esteban, O.; Marques, F.; Dunet, V.; Koob, M.; Bach Cuadra, M.
Abstract: Magnetic resonance imaging (MRI) is a powerful modality for investigating abnormal developmental patterns in utero. However, since it is not the first-line diagnostic tool in this sensitive population, data remain scarce and heterogeneous across scanners and hospitals. To address this, we present a novel dataset of synthetic images representative of real fetal brain MRI. Our dataset comprises 594 two-dimensional, low-resolution series of T2-weighted images corresponding to 78 developing human fetal brains between 20.0 and 34.8 weeks of gestational age. Data are generated using a new version of the Fetal Brain MR Acquisition Numerical phantom (FaBiAN) to account for local white matter heterogeneities throughout maturation. Both healthy and pathological anatomies are simulated with standard clinical settings. Two independent radiologists qualitatively assessed the realism of the simulated images. A quantitative analysis confirms an enhanced fidelity compared to the original version of the software, with further validation through its applicability to fetal brain tissue segmentation. The cohort is publicly available to support the continuous endeavor of developing advanced post-processing methods as well as cutting-edge artificial intelligence models.
- Multi-Center Fetal Brain Tissue Annotation (FeTA) Challenge 2022 Results, by Payette, K. on 1 March 2025
Authors: Payette, K.; Steger, C.; Licandro, R.; Dumast, P.; Li, H.B.; Barkovich, M.; Li, L.; Dannecker, M.; Chen, C.; Ouyang, C.; McConnell, N.; Miron, A.; Li, Y.; Uus, A.; Grigorescu, I.; Gilliland, P.R.; Siddiquee, MMR; Xu, D.; Myronenko, A.; Wang, H.; Huang, Z.; Ye, J.; Alenya, M.; Comte, V.; Camara, O.; Masson, J.B.; Nilsson, A.; Godard, C.; Mazher, M.; Qayyum, A.; Gao, Y.; Zhou, H.; Gao, S.; Fu, J.; Dong, G.; Wang, G.; Rieu, Z.; Yang, H.; Lee, M.; Plotka, S.; Grzeszczyk, M.K.; Sitek, A.; Daza, L.V.; Usma, S.; Arbelaez, P.; Lu, W.; Zhang, W.; Liang, J.; Valabregue, R.; Joshi, A.A.; Nayak, K.N.; Leahy, R.M.; Wilhelmi, L.; Dandliker, A.; Ji, H.; Gennari, A.G.; Jakovcic, A.; Klaic, M.; Adzic, A.; Markovic, P.; Grabaric, G.; Kasprian, G.; Dovjak, G.; Rados, M.; Vasung, L.; Cuadra, M.B.; Jakab, A.
Abstract: Segmentation is a critical step in analyzing the developing human fetal brain. There have been vast improvements in automatic segmentation methods in the past several years, and the Fetal Brain Tissue Annotation (FeTA) Challenge 2021 helped to establish an excellent standard of fetal brain segmentation. However, FeTA 2021 was a single-center study, limiting real-world clinical applicability and acceptance. The multi-center FeTA Challenge 2022 focused on advancing the generalizability of fetal brain segmentation algorithms for magnetic resonance imaging (MRI). In FeTA 2022, the training dataset contained images and corresponding manually annotated multi-class labels from two imaging centers, and the testing data contained images from these two centers as well as two additional unseen centers. The multi-center data included different MR scanners, imaging parameters, and fetal brain super-resolution algorithms applied. Sixteen teams participated and 17 algorithms were evaluated. Here, the challenge results are presented, focusing on the generalizability of the submissions. Both in- and out-of-domain, the white matter and ventricles were segmented with the highest accuracy (top Dice scores: 0.89 and 0.87, respectively), while the most challenging structure remains the grey matter (top Dice score: 0.75) due to anatomical complexity. The top 5 average Dice scores ranged from 0.81 to 0.82, the top 5 average percentile Hausdorff distance values ranged from 2.3 to 2.5 mm, and the top 5 volumetric similarity scores ranged from 0.90 to 0.92. The FeTA Challenge 2022 was able to successfully evaluate and advance generalizability of multi-class fetal brain tissue segmentation algorithms for MRI, and it continues to benchmark new algorithms.
- Assessment of fetal corpus callosum biometry by 3D super-resolution reconstructed T2-weighted magnetic resonance imaging, by Lamon, Samuel on 4 February 2025
Authors: Lamon, Samuel
Abstract: The corpus callosum (CC) is a key brain structure that plays a fundamental role in integrating lateralized brain functions. In the 0.3 to 0.7% of people born with a totally or partially absent corpus callosum, neurological development is strikingly variable. Prenatal assessment of the CC is essential and depends on precise biometric evaluation, which helps establish a prognosis and anticipate antenatal and postnatal care. This assessment relies on two complementary imaging modalities: ultrasound (US), which is used for screening, and T2-weighted (T2WS) magnetic resonance imaging (MRI), which completes the work-up. Clinical fetal MRI nevertheless has certain limitations, such as artifacts caused by unpredictable fetal movements and limited spatial resolution outside the acquisition plane. To overcome these obstacles, a new technique, super-resolution (SR) MRI, has been developed. It relies on motion-correction and data-interpolation algorithms and yields a 3D volume with improved resolution in all three spatial orientations. Although the technique is promising, integrating it into clinical practice remains a challenge. Validation against current references, notably ultrasound, is necessary before its implementation can be considered. This study contributes to that validation by evaluating the accuracy of CC biometry using SR, compared with ultrasound and conventional MRI (T2WS). To this end, we retrospectively selected a population of 57 fetuses who underwent brain MRI (and US) between 2014 and 2021 at CHUV.
The length of the CC, the height of its four segments (rostrum, genu, body, and splenium), and its area were measured by two observers: the first, a junior observer directly supervised for reading the US images, and the second, a senior observer. The statistical analysis focused mainly on comparisons between modalities and between observers. Our results show that SR, thanks to its multiplanar approach and its better isotropic spatial resolution of 1 mm, has a markedly lower rate of non-visualized CC segments than US and T2WS. These omissions mainly concerned the rostral segment of the CC. In terms of biometry, measurements of the CC length (LCC) and of the posterior and middle segments were similar across the imaging modalities studied. However, differences were observed for the anterior segments (rostrum and genu), which remain difficult to measure even with SR. We put forward several hypotheses to explain this persistent difficulty, such as the millimetric size of the rostrum, which runs up against the partial volume effect, and the role of the variable fluid environment of the fetal brain. The study also validated linear regressions for the LCC and introduced previously unavailable normative values for the heights of the CC sub-segments in T2WS and SR. Although this study presents promising results, the scope of its conclusions is limited by its retrospective nature, its small sample size, the absence of postnatal imaging data and long-term clinical follow-up, and a tolerated interval of 2 weeks between US and MRI examinations. Looking ahead, future research should include larger cohorts, notably of patients with partial callosal agenesis, and prolonged clinical follow-up. From an engineering standpoint, progress toward full automation of the SR reconstruction process should be encouraged.
In conclusion, this study shows that SR allows a reliable and more frequently feasible assessment of the CC length and of the heights of the posterior and middle CC segments, even though it cannot yet be applied in routine clinical practice. When US assessment is difficult, SR could offer a valuable alternative. Further research is nevertheless needed to validate this technique and facilitate its integration into routine clinical practice. SR could thus become an essential tool for better understanding CC anomalies and their impact on overall neurological development. As a closing note, we highlight the excellent example of collaboration between clinicians and engineers that this study represented.
- BrainAgeNeXt: Advancing brain age modeling for individuals with multiple sclerosis, by La Rosa, Francesco on 1 January 2025
Authors: La Rosa, Francesco; Dos Santos Silva, Jonadab; Dereskewicz, Emma; Invernizzi, Azzurra; Cahan, Noa; Galasso, Julia; Garcia, Nadia; Graney, Robin; Levy, Sarah; Verma, Gaurav; Balchandani, Priti; Reich, Daniel S.; Horton, Megan; Greenspan, Hayit; Sumowski, James; Bach Cuadra, Meritxell; Beck, Erin S.
Abstract: Aging is associated with structural brain changes, cognitive decline, and neurodegenerative diseases. Brain age, an imaging biomarker sensitive to deviations from healthy aging, offers insights into structural aging variations and is a potential prognostic biomarker in neurodegenerative conditions. This study introduces BrainAgeNeXt, a novel convolutional neural network inspired by the MedNeXt framework, designed to predict brain age from T1-weighted magnetic resonance imaging (MRI) scans. BrainAgeNeXt was trained and validated on 11,574 MRI scans from 33 private and publicly available datasets of healthy volunteers, aged 5 to 95 years, imaged with 3T and 7T MRI. Performance was compared against three state-of-the-art brain age prediction methods. BrainAgeNeXt achieved a mean absolute error (MAE) of 2.78 ± 3.64 years, lower than the compared methods (MAE range 3.55-4.16 years). We also tested all methods across different levels of image quality, and BrainAgeNeXt performed well even with motion artifacts and less common 7T MRI data. In three longitudinal multiple sclerosis (MS) cohorts (273 individuals), brain age was, on average, 4.21 ± 6.51 years greater than chronological age. Longitudinal analysis indicated that brain age increased by 1.15 years per chronological year in individuals with MS (95% CI = [1.05, 1.26]). Moreover, in early MS, individuals with worsening disability had a higher annual increase in brain age compared to those with stable clinical assessments (1.24 vs. 0.75, p < 0.01). These findings suggest that brain age is a promising prognostic biomarker for MS progression and potentially a valuable endpoint for clinical trials. © 2025 The Authors. Published under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
- Structural-based uncertainty in deep learning across anatomical scales: Analysis in white matter lesion segmentation, by Molchanova, N. on 1 January 2025
Authors: Molchanova, N.; Raina, V.; Malinin, A.; Rosa, F.; Depeursinge, A.; Gales, M.; Granziera, C.; Müller, H.; Graziani, M.; Cuadra, M.B.
Abstract: This paper explores uncertainty quantification (UQ) as an indicator of the trustworthiness of automated deep-learning (DL) tools in the context of white matter lesion (WML) segmentation from magnetic resonance imaging (MRI) scans of multiple sclerosis (MS) patients. Our study focuses on two principal aspects of uncertainty in structured output segmentation tasks. First, we postulate that a reliable uncertainty measure should indicate predictions likely to be incorrect with high uncertainty values. Second, we investigate the merit of quantifying uncertainty at different anatomical scales (voxel, lesion, or patient). We hypothesize that uncertainty at each scale is related to specific types of errors. Our study aims to confirm this relationship by conducting separate analyses for in-domain and out-of-domain settings. Our primary methodological contributions are (i) the development of novel measures for quantifying uncertainty at lesion and patient scales, derived from structural prediction discrepancies, and (ii) the extension of an error retention curve analysis framework to facilitate the evaluation of UQ performance at both lesion and patient scales. The results from a multi-centric MRI dataset of 444 patients demonstrate that our proposed measures more effectively capture model errors at the lesion and patient scales compared to measures that average voxel-scale uncertainty values. We provide the UQ protocols code at https://github.com/Medical-Image-Analysis-Laboratory/MS_WML_uncs.
- Interpretability of Uncertainty: Exploring Cortical Lesion Segmentation in Multiple Sclerosis, by Molchanova, Nataliia on 1 January 2025
Authors: Molchanova, Nataliia; Cagol, Alessandro; Gordaliza, Pedro M.; Ocampo-Pineda, Mario; Lu, Po-Jui; Weigel, Matthias; Chen, Xinjie; Depeursinge, Adrien; Granziera, Cristina; Müller, Henning; Cuadra, Meritxell Bach
Abstract: Uncertainty quantification (UQ) has become critical for evaluating the reliability of artificial intelligence systems, especially in medical image segmentation. This study addresses the interpretability of instance-wise uncertainty values in deep learning models for focal lesion segmentation in magnetic resonance imaging, specifically cortical lesion (CL) segmentation in multiple sclerosis. CL segmentation presents several challenges, including the complexity of manual segmentation, high variability in annotation, data scarcity, and class imbalance, all of which contribute to aleatoric and epistemic uncertainty. We explore how UQ can be used not only to assess prediction reliability but also to provide insights into model behavior, detect biases, and verify the accuracy of UQ methods. Our research demonstrates the potential of instance-wise uncertainty values to offer post hoc global model explanations, serving as a sanity check for the model. The implementation is available at https://github.com/NataliiaMolch/interpret-lesion-unc.
- Exploiting XAI Maps to Improve MS Lesion Segmentation and Detection in MRI, by Spagnolo, Federico on 1 January 2025
Authors: Spagnolo, Federico; Molchanova, Nataliia; Ocampo-Pineda, Mario; Melie-Garcia, Lester; Bach Cuadra, Meritxell; Granziera, Cristina; Andrearczyk, Vincent; Depeursinge, Adrien
Abstract: To date, several methods have been developed to explain deep learning algorithms for classification tasks. Recently, an adaptation of two such methods has been proposed to generate instance-level explainable maps in a semantic segmentation scenario, such as multiple sclerosis (MS) lesion segmentation. In that work, a 3D U-Net was trained and tested for MS lesion segmentation, yielding an F1 score of 0.7006 and a positive predictive value (PPV) of 0.6265. The distribution of values in explainable maps exposed some differences between maps of true and false positive (TP/FP) examples. Inspired by those results, we explore in this paper the use of characteristics of lesion-specific saliency maps to refine segmentation and detection scores. We generate around 21,000 maps from as many TP/FP lesions in a batch of 72 patients (training set) and 4,868 from the 37 patients in the test set. Ninety-three radiomic features extracted from the first set of maps were used to train a logistic regression model and classify TP versus FP. On the test set, F1 score and PPV were improved by a large margin when compared to the initial model, reaching 0.7450 and 0.7817, with 95% confidence intervals of [0.7358, 0.7547] and [0.7679, 0.7962], respectively. These results suggest that saliency maps can be used to refine prediction scores, boosting a model's performance.
