TY - JOUR AB - Learning the causes of time-series data is a fundamental task in many applications, ranging from finance to the earth sciences and biomedicine. Common approaches for this task are based on vector auto-regression, and they do not take into account unknown confounding between potential causes. However, in settings with many potential causes and noisy data, these approaches may be substantially biased. Furthermore, potential causes may be correlated in practical applications or even contain cycles. To address these challenges, we propose a new double machine learning based method for structure identification from temporal data (DR-SIT). We provide theoretical guarantees, showing that our method asymptotically recovers the true underlying causal structure. Our analysis extends to cases where the potential causes have cycles, and they may even be confounded. We further perform extensive experiments to showcase the superior performance of our method. Code: https://github.com/sdi1100041/TMLR_submission_DR_SIT. AU - Angelis, E. AU - Quinzan, F.* AU - Soleymani, A.* AU - Jail, P.J.* AU - Bauer, S. C1 - 75684 C2 - 58076 TI - Double machine learning based structure identification from temporal data. JO - Trans. Machine Learn. Res. VL - 2025 PY - 2025 SN - 2835-8856 ER - TY - JOUR AB - The field of deep generative modeling has grown rapidly in the last few years. With the availability of massive amounts of training data coupled with advances in scalable unsupervised learning paradigms, recent large-scale generative models show tremendous promise in synthesizing high-resolution images and text, as well as structured data such as videos and molecules. However, we argue that current large-scale generative AI models exhibit several fundamental shortcomings that hinder their widespread adoption across domains.
In this work, our objective is to identify these issues and highlight key unresolved challenges in modern generative AI paradigms that should be addressed to further enhance their capabilities, versatility, and reliability. By identifying these challenges, we aim to provide researchers with insights for exploring fruitful research directions, thus fostering the development of more robust and accessible generative AI solutions. AU - Manduchi, L.* AU - Meister, C.* AU - Pandey, K.C.* AU - Bamler, R.* AU - Cotterell, R.* AU - Däubener, S.* AU - Fellenz, S.* AU - Fischer, A.* AU - Gärtner, T.* AU - Kirchler, M.* AU - Kloft, M.* AU - Li, Y.* AU - Lippert, C.* AU - de Melo, G.* AU - Nalisnick, E.* AU - Ommer, B.* AU - Ranganath, R.* AU - Waldron, M.* AU - Ullrich, K.* AU - Van den Broeck, G.* AU - Vogt, J.E.* AU - Wang, Y.* AU - Wenzel, F.* AU - Wood, F.* AU - Mandt, S.* AU - Fortuin, V. C1 - 75499 C2 - 58088 TI - On the challenges and opportunities in generative AI. JO - Trans. Machine Learn. Res. VL - 2025 PY - 2025 SN - 2835-8856 ER - TY - JOUR AB - Population graphs and their use in combination with graph neural networks (GNNs) have demonstrated promising results for multi-modal medical data integration and improving disease diagnosis and prognosis. Several different methods for constructing these graphs and advanced graph learning techniques have been established to maximise the predictive power of GNNs on population graphs. However, in this work, we raise the question of whether existing methods are really strong enough by showing that simple baseline methods, such as random forests or linear regressions, perform on par with advanced graph learning models on several population graph datasets for a variety of different clinical applications. We use the commonly used public population graph datasets TADPOLE and ABIDE, a brain age estimation and a cardiac dataset from the UK Biobank, and a real-world in-house COVID dataset.
We (a) investigate the impact of different graph construction methods, graph convolutions, and dataset size and complexity on GNN performance and (b) discuss the utility of GNNs for multi-modal data integration in the context of population graphs. Based on our results, we argue that “better” graph construction methods or innovative applications for population graphs are needed to render them beneficial. AU - Mueller, T.T.* AU - Starck, S.* AU - Bintsi, K.M.* AU - Ziller, A.* AU - Braren, R.* AU - Kaissis, G. AU - Rueckert, D.* C1 - 73645 C2 - 57277 TI - Are Population Graphs Really as Powerful as Believed? JO - Trans. Machine Learn. Res. VL - 2024 PY - 2024 SN - 2835-8856 ER - TY - JOUR AB - Existing convolutional neural network architectures frequently rely upon batch normalization (BatchNorm) to effectively train the model. BatchNorm, however, performs poorly with small batch sizes, and is inapplicable to differential privacy. To address these limitations, we propose the kernel normalization (KernelNorm) and kernel normalized convolutional layers, and incorporate them into kernel normalized convolutional networks (KNConvNets) as the main building blocks. We implement KNConvNets corresponding to the state-of-the-art ResNets while forgoing the BatchNorm layers. Through extensive experiments, we illustrate that KNConvNets achieve higher or competitive performance compared to their BatchNorm counterparts in image classification and semantic segmentation. They also significantly outperform their batch-independent competitors, including those based on layer and group normalization, in non-private and differentially private training. KernelNorm thus combines the batch-independence property of layer and group normalization with the performance advantage of BatchNorm. AU - Nasirigerdeh, R.* AU - Torkzadehmahani, R.* AU - Rueckert, D.* AU - Kaissis, G. C1 - 73647 C2 - 57279 SP - 107-118 TI - Kernel normalized convolutional networks. JO - Trans. Machine Learn.
Res. VL - 2024 PY - 2024 SN - 2835-8856 ER - TY - JOUR AB - Notions of counterfactual invariance (CI) have proven essential for predictors that are fair, robust, and generalizable in the real world. We propose graphical criteria that yield a sufficient condition for a predictor to be counterfactually invariant in terms of a conditional independence in the observational distribution. In order to learn such predictors, we propose a model-agnostic framework, called Counterfactually Invariant Prediction (CIP), building on the Hilbert-Schmidt Conditional Independence Criterion (HSCIC), a kernel-based conditional dependence measure. Our experimental results demonstrate the effectiveness of CIP in enforcing counterfactual invariance across various simulated and real-world datasets, including scalar and multivariate settings. AU - Quinzan, F.* AU - Casolo, C. AU - Muandet, K.* AU - Luo, Y.* AU - Kilbertus, N. C1 - 73644 C2 - 57276 TI - Learning counterfactually invariant predictors. JO - Trans. Machine Learn. Res. VL - 2024 PY - 2024 SN - 2835-8856 ER - TY - JOUR AB - Conventional Bayesian Neural Networks (BNNs) are unable to leverage unlabelled data to improve their predictions. To overcome this limitation, we introduce Self-Supervised Bayesian Neural Networks, which use unlabelled data to learn models with suitable prior predictive distributions. This is achieved by leveraging contrastive pretraining techniques and optimising a variational lower bound. We then show that the prior predictive distributions of self-supervised BNNs capture problem semantics better than conventional BNN priors. In turn, our approach offers improved predictive performance over conventional BNNs, especially in low-budget regimes. AU - Sharma, M.* AU - Rainforth, T.* AU - Teh, Y.W.* AU - Fortuin, V. C1 - 73643 C2 - 57275 TI - Incorporating unlabelled data into Bayesian neural networks. JO - Trans. Machine Learn. Res.
VL - 2024 PY - 2024 SN - 2835-8856 ER - TY - JOUR AB - Computational topology recently started to emerge as a novel paradigm for characterising the ‘shape’ of high-dimensional data, leading to powerful algorithms in (un)supervised representation learning. While capable of capturing prominent features at multiple scales, topological methods cannot readily be used for Bayesian inference. We develop a novel approach that bridges this gap, making it possible to perform parameter estimation in a Bayesian framework, using topology-based loss functions. Our method affords easy integration into topological machine learning algorithms. We demonstrate its efficacy for parameter estimation in different simulation settings. AU - von Rohrscheidt, J.C. AU - Rieck, B. AU - Schmon, S.M.* C1 - 73646 C2 - 57278 TI - Bayesian computation meets topology. JO - Trans. Machine Learn. Res. VL - 2024 PY - 2024 SN - 2835-8856 ER -