TY  - JOUR
AB  - With widespread availability of omics profiling techniques, the analysis and interpretation of high-dimensional omics data, for example, for biomarkers, is becoming an increasingly important part of clinical medicine because such datasets constitute a promising resource for predicting survival outcomes. However, early experience has shown that biomarkers often generalize poorly. Thus, it is crucial that models are not overfitted and give accurate results with new data. In addition, reliable detection of multivariate biomarkers with high predictive power (feature selection) is of particular interest in clinical settings. We present an approach that addresses both aspects in high-dimensional survival models. Within a nested cross-validation (CV), we fit a survival model, evaluate a dataset in an unbiased fashion, and select features with the best predictive power by applying a weighted combination of CV runs. We evaluate our approach using simulated toy data, as well as three breast cancer datasets, to predict the survival of breast cancer patients after treatment. In all datasets, we achieve more reliable estimation of predictive power for unseen cases and better predictive performance compared to the standard CoxLasso model. Taken together, we present a comprehensive and flexible framework for survival models, including performance estimation, final feature selection, and final model construction. The proposed algorithm is implemented in an open source R package (SurvRank) available on CRAN.
AU  - Laimighofer, M.
AU  - Krumsiek, J.
AU  - Buettner, F.
AU  - Theis, F.J.
C1  - 48008
C2  - 39840
CY  - New Rochelle
SP  - 279-290
TI  - Unbiased prediction and feature selection in high-dimensional survival regression.
JO  - J. Comput. Biol.
VL  - 23
IS  - 4
PB  - Mary Ann Liebert, Inc
PY  - 2016
SN  - 1066-5277
ER  - 

TY  - JOUR
AB  - Quorum sensing, a special kind of cell-cell communication, has originally been described for well-mixed homogeneous bacterial cultures. However, recent perception supports its ecological relevance for spatially heterogeneous distributed cells, like colonies and biofilms. New experimental techniques allow for single cell analysis under these conditions, which is crucial to understanding the effect of chemical gradients and intercell variations. Based on a reaction-diffusion system, we develop a method that drastically reduces the computational complexity of the model. In comparison to similar former approaches, handling and scaling is much easier. Via a suitable scaling, this approach leads to approximative algebraic equations for the stationary case. This approach can be easily used for numerical situations.
AU  - Gölgeli Matur, M.
AU  - Müller, J.
AU  - Kuttler, C.
AU  - Hense, B.A.
C1  - 32650
C2  - 35205
CY  - New Rochelle
SP  - 227-235
TI  - An approximative approach for single cell spatial modeling of quorum sensing.
JO  - J. Comput. Biol.
VL  - 22
IS  - 3
PB  - Mary Ann Liebert, Inc
PY  - 2015
SN  - 1066-5277
ER  - 

TY  - JOUR
AB  - In biology, more and more information about the interactions in regulatory systems becomes accessible, and this often leads to prior knowledge for recent data interpretations. In this work we focus on multivariate signaling data, where the structure of the data is induced by a known regulatory network. To extract signals of interest we assume a blind source separation (BSS) model, and we capture the structure of the source signals in terms of a Bayesian network. To keep the parameter space small, we consider stationary signals, and we introduce the new algorithm emGrade, where model parameters and source signals are estimated using expectation maximization. For network data, we find an improved estimation performance compared to other BSS algorithms, and the flexible Bayesian modeling enables us to deal with repeated and missing observation values. The main advantage of our method is the statistically interpretable likelihood, and we can use model selection criteria to determine the (in general unknown) number of source signals or decide between different given networks. In simulations we demonstrate the recovery of the source signals dependent on the graph structure and the dimensionality of the data.
AU  - Illner, K.
AU  - Fuchs, C.
AU  - Theis, F.J.
C1  - 32522
C2  - 35101
CY  - New Rochelle
SP  - 855-865
TI  - Bayesian blind source separation for data with network structure.
JO  - J. Comput. Biol.
VL  - 21
IS  - 11
PB  - Mary Ann Liebert, Inc
PY  - 2014
SN  - 1066-5277
ER  - 

TY  - JOUR
AB  - Diffusion geometry techniques are useful to classify patterns and visualize high-dimensional datasets. Building upon ideas from diffusion geometry, we outline our mathematical foundations for learning a function on high-dimension biomedical data in a local fashion from training data. Our approach is based on a localized summation kernel, and we verify the computational performance by means of exact approximation rates. After these theoretical results, we apply our scheme to learn early disease stages in standard and new biomedical datasets.
AU  - Ehler, M.
AU  - Filbir, F.
AU  - Mhaskar, H.N.*
C1  - 11261
C2  - 30593
SP  - 1251-1264
TI  - Locally learning biomedical data using diffusion frames.
JO  - J. Comput. Biol.
VL  - 19
IS  - 11
PB  - Mary Ann Liebert Inc.
PY  - 2012
SN  - 1066-5277
ER  - 

TY  - JOUR
AB  - Somitogenesis describes the segmentation of vertebrate embryonic bodies, which is thought to be induced by ultradian clocks (i.e., clocks with relatively short cycles compared to circadian clocks). One candidate for such a clock is the bHLH factor Hes1, forming dimers which repress the transcription of its own encoding gene. Most models for such small autoregulative networks are based on delay equations where a Hill function represents the regulation of transcription. The aim of the present paper is to estimate the Hill coefficient in the switch of an Hes1 oscillator and to suggest a more detailed model of the autoregulative network. The promoter of Hes1 consists of three to four binding sites for Hes1 dimers. Using the sparse data from literature, we find, in contrast to other statements in literature, that there is not much evidence for synergistic binding in the regulatory region of Hes1, and that the Hill coefficient is about three. As a model for the negative feedback loop, we use a Goodwin system and find sustained oscillations for systems with a large enough number of linear differential equations. By a suitable variation of the number of equations, we provide a rational lower bound for the Hill coefficient for such a system. Our results suggest that there exist additional nonlinear processes outside of the regulatory region of Hes1.
AU  - Zeiser, S.
AU  - Müller, J.*
AU  - Liebscher, V.*
C1  - 238
C2  - 24965
SP  - 984-1000
TI  - Modeling the Hes1 oscillator.
JO  - J. Comput. Biol.
VL  - 14
IS  - 7
PB  - Liebert
PY  - 2007
SN  - 1066-5277
ER  - 