OBJECTIVE: Reliable image quality assessment is crucial for evaluating new motion correction methods for magnetic resonance imaging. We compare the performance of common reference-based and reference-free image quality metrics on unique datasets with real motion artifacts, and analyze the metrics' robustness to typical pre-processing techniques. MATERIALS AND METHODS: We compared five reference-based and five reference-free metrics on brain data acquired with and without intentional motion (2D and 3D sequences). The metrics were recalculated seven times with varying pre-processing steps. Spearman correlation coefficients were computed to assess the relationship between image quality metrics and radiological evaluation. RESULTS: All reference-based metrics showed strong correlations with radiological evaluation. Among the reference-free metrics, Average Edge Strength offered the most promising results, consistently displaying stronger correlations across all sequences than the other reference-free metrics. The strongest correlations were achieved when percentile normalization was applied and the metrics were computed only within the skull-stripped brain region. In contrast, correlations were weaker when no brain mask was applied and min-max or no normalization was used. DISCUSSION: Reference-based metrics reliably correlate with radiological evaluation across different sequences and datasets. Pre-processing significantly influences correlation values. Future research should focus on refining pre-processing techniques and exploring approaches for automated image quality evaluation.
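The pre-processing and correlation steps summarized above can be illustrated with a minimal Python sketch. It is not the study's implementation. The function names, the 1st/99th percentile bounds, the option to compute percentiles inside the brain mask, and the masked mean absolute error (a hypothetical stand-in for a reference-based metric) are all assumptions made for illustration. Spearman correlation is computed here as the Pearson correlation of tie-averaged ranks.

```python
import numpy as np


def percentile_normalize(image, mask=None, lower=1.0, upper=99.0):
    """Rescale intensities so the [lower, upper] percentiles map to [0, 1].

    The default bounds (1 and 99) are assumed values, not taken from the study.
    If a boolean mask is given, the percentiles are computed inside it only.
    """
    image = np.asarray(image, dtype=float)
    values = image[mask] if mask is not None else image.ravel()
    lo, hi = np.percentile(values, [lower, upper])
    if hi <= lo:
        raise ValueError("degenerate intensity range")
    # Values outside the percentile window are clipped to 0 or 1.
    return np.clip((image - lo) / (hi - lo), 0.0, 1.0)


def masked_mae(image, reference, mask):
    """Hypothetical stand-in metric: mean absolute error inside the mask."""
    return float(np.mean(np.abs(image[mask] - reference[mask])))


def average_ranks(x):
    """Rank values from 1 to n, giving tied values their mean rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    for value in np.unique(x):
        tied = x == value
        ranks[tied] = ranks[tied].mean()
    return ranks


def spearman_rho(a, b):
    """Spearman correlation as the Pearson correlation of the rank vectors."""
    return float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])
```

In use, each image would be normalized, a metric such as `masked_mae` evaluated inside the brain mask, and `spearman_rho` applied to the per-image metric values and the corresponding observer scores. Where SciPy is available, `scipy.stats.spearmanr` provides the same coefficient together with a p-value.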