Neural-network-based outcome predictions may enable further treatment personalization for patients with head and neck cancer. However, developing neural networks is challenging when only a limited number of cases is available. Therefore, we investigated whether multitask learning strategies, implemented through the simultaneous optimization of two distinct outcome objectives (multi-outcome) and combined with a tumor segmentation task, can improve the performance of convolutional neural networks (CNNs) and vision transformers (ViTs). Model training was conducted on two distinct multicenter datasets for the endpoints loco-regional control (LRC) and progression-free survival (PFS), respectively. The first dataset consisted of pre-treatment computed tomography (CT) imaging of 290 patients and the second dataset contained combined positron emission tomography (PET)/CT data of 224 patients. Discriminative performance was assessed by the concordance index (C-index). Risk stratification was evaluated using log-rank tests. Across both datasets, CNN and ViT model ensembles achieved similar results. Multitask approaches performed favorably in most investigations. Multi-outcome CNN models trained with segmentation loss were identified as the optimal strategy across cohorts. On the PET/CT dataset, an ensemble of multi-outcome CNNs trained with segmentation loss achieved the best discrimination (C-index: 0.29, 95% confidence interval (CI): 0.22-0.36) and successfully stratified patients into groups with low and high risk of disease progression (p=0.003). On the CT dataset, ensembles of multi-outcome CNNs and of single-outcome ViTs trained with segmentation loss performed best (C-index: 0.26 and 0.26, CI: 0.18-0.34 and 0.18-0.35, respectively), both with significant risk stratification for LRC in independent validation (p=0.002 and p=0.011, respectively).
Further validation of the developed multitask-learning models is planned in a prospective validation study, which has recently completed recruitment.
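
For readers unfamiliar with the C-index used above, the following is a minimal Python sketch of Harrell's concordance index for right-censored data. It is an illustration only, not the evaluation code of this study: the function name `concordance_index` and the convention that a higher score means a higher risk are assumptions made here. Under that convention, 0.5 corresponds to a random ordering, and reversing the sign of the scores turns a value c into 1 - c (when there are no tied scores), so values far from 0.5 in either direction reflect strong discrimination.

```python
from itertools import combinations


def concordance_index(times, events, scores):
    """Harrell's C-index (illustrative sketch).

    times  : observed follow-up times
    events : 1/True if the event was observed, 0/False if censored
    scores : model outputs; ASSUMPTION: higher score = higher risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the shorter
        # time belongs to an observed event (not a censored case).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if scores[a] > scores[b]:
            concordant += 1.0  # earlier event received the higher risk score
        elif scores[a] == scores[b]:
            concordant += 0.5  # tied scores count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from the study).
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.7, 0.5, 0.1]
c_risk = concordance_index(toy_times, toy_events, toy_scores)
c_inverted = concordance_index(toy_times, toy_events, [-s for s in toy_scores])
```

In this toy example the pair formed by the last two patients is skipped because the shorter time is censored, which leaves five comparable pairs.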