TY - JOUR
AB - The current generation of large language models (LLMs) has limited chemical knowledge. Recently, it has been shown that these LLMs can learn and predict chemical properties through fine-tuning. Using natural language to train machine learning models opens doors to a wider chemical audience, as field-specific featurization techniques can be omitted. In this work, we explore the potential and limitations of this approach. We studied the performance of fine-tuning three open-source LLMs (GPT-J-6B, Llama-3.1-8B, and Mistral-7B) for a range of different chemical questions. We benchmark their performances against "traditional" machine learning models and find that, in most cases, the fine-tuning approach is superior for a simple classification problem. Depending on the size of the dataset and the type of questions, we also successfully address more sophisticated problems. The most important conclusions of this work are that, for all datasets considered, their conversion into an LLM fine-tuning training set is straightforward and that fine-tuning with even relatively small datasets leads to predictive models. These results suggest that the systematic use of LLMs to guide experiments and simulations will be a powerful technique in any research study, significantly reducing unnecessary experiments or computations.
AU - Van Herck, J.*
AU - Gil, M.V.*
AU - Jablonka, K.M.*
AU - Abrudan, A.*
AU - Anker, A.S.*
AU - Asgari, M.*
AU - Blaiszik, B.*
AU - Buffo, A.*
AU - Choudhury, L.*
AU - Corminboeuf, C.*
AU - Daglar, H.*
AU - Elahi, A.M.*
AU - Foster, I.T.*
AU - García, S.A.*
AU - Garvin, M.*
AU - Godin, G.*
AU - Good, L.L.*
AU - Gu, J.*
AU - Xiao Hu, N.*
AU - Jin, X.*
AU - Junkers, T.*
AU - Keskin, S.*
AU - Knowles, T.P.J.*
AU - Laplaza, R.*
AU - Lessona, M.*
AU - Majumdar, S.K.*
AU - Mashhadimoslem, H.*
AU - McIntosh, R.D.*
AU - Moosavi, S.M.*
AU - Mouriño, B.*
AU - Nerli, F.*
AU - Pevida, C.*
AU - Poudineh, N.*
AU - Rajabi-Kochi, M.*
AU - Saar, K.L.*
AU - Hooriabad Saboor, F.*
AU - Sagharichiha, M.*
AU - Schmidt, K.J.*
AU - Shi, J.*
AU - Simone, E.*
AU - Svatunek, D.*
AU - Taddei, M.*
AU - Tetko, I.V.
AU - Tolnai, D.*
AU - Vahdatifar, S.*
AU - Whitmer, J.*
AU - Wieland, D.C.F.*
AU - Willumeit-Römer, R.*
AU - Züttel, A.*
AU - Smit, B.*
C1 - 72738
C2 - 56722
CY - Thomas Graham House, Science Park, Milton Rd, Cambridge CB4 0WF, Cambs, England
SP - 670-684
TI - Assessment of fine-tuned large language models for real-world chemistry and material science applications.
JO - Chem. Sci.
VL - 16
IS - 2
PB - Royal Soc Chemistry
PY - 2025
SN - 2041-6520
ER -

TY - JOUR
AB - Automated synthesis planning is key for efficient generative chemistry. Since reactions of given reactants may yield different products depending on conditions such as the chemical context imposed by specific reagents, computer-aided synthesis planning should benefit from recommendations of reaction conditions. Traditional synthesis planning software, however, typically proposes reactions without specifying such conditions, relying on human organic chemists who know the conditions to carry out suggested reactions. In particular, reagent prediction for arbitrary reactions, a crucial aspect of condition recommendation, has been largely overlooked in cheminformatics until recently. Here we employ the Molecular Transformer, a state-of-the-art model for reaction prediction and single-step retrosynthesis, to tackle this problem. We train the model on the US patents dataset (USPTO) and test it on Reaxys to demonstrate its out-of-distribution generalization capabilities. Our reagent prediction model also improves the quality of product prediction: the Molecular Transformer is able to substitute the reagents in the noisy USPTO data with reagents that enable product prediction models to outperform those trained on plain USPTO. This makes it possible to improve upon the state-of-the-art in reaction product prediction on the USPTO MIT benchmark.
AU - Andronov, M.*
AU - Voinarovska, V.
AU - Andronova, N.*
AU - Wand, M.*
AU - Clevert, D.A.*
AU - Schmidhuber, J.*
C1 - 67661
C2 - 53968
CY - Thomas Graham House, Science Park, Milton Rd, Cambridge CB4 0WF, Cambs, England
SP - 3235-3246
TI - Reagent prediction with a molecular transformer improves reaction data quality.
JO - Chem. Sci.
VL - 14
IS - 12
PB - Royal Soc Chemistry
PY - 2023
SN - 2041-6520
ER -

TY - JOUR
AB - X-pyrene is a new nucleic acid duplex stabilizing cytosine analogue that combines enhanced π-stacking, hydrogen bonding and electrostatic interactions to greatly increase the stability of bulged DNA duplexes and DNA/RNA hybrids. X-pyrene is highly selective for guanine as a partner and duplex stability is reduced dramatically when X-pyrene or a neighboring base is mismatched. An NMR study indicates that the pyrene moiety stacks within the helix, and large changes in fluorescence emission on duplex formation are consistent with this. These favorable properties make X-pyrene a promising cytosine analogue for use in a variety of biological applications.
AU - Lou, C.*
AU - Dallmann, A.
AU - Geo, R.*
AU - Brown, T.C.*
C1 - 32107
C2 - 34973
CY - Cambridge
SP - 3836-3844
TI - Enhanced H-bonding and π-stacking in DNA: A potent duplex-stabilizing and mismatch sensing nucleobase analogue.
JO - Chem. Sci.
VL - 5
IS - 10
PB - Royal Soc Chemistry
PY - 2014
SN - 2041-6520
ER -

TY - JOUR
AB - A series of N-methyl-D-aspartate (NMDA) receptor-targeted MRI contrast agents has been developed, based on the known competitive NMDA antagonist, 3,4-diamino-3-cyclobutene-1,2-dione. Their use as responsive MR imaging probes has been evaluated in vitro and two contrast agents showed 170–176% enhancements in relaxation rate, following incubation with a neuronal cell line model. A derivative of the lead compound was prepared containing a biotin moiety, and both the specificity and reversibility of binding to the NMDA cell surface receptors demonstrated using confocal microscopy.
AU - Sim, N.*
AU - Gottschalk, S.
AU - Pal, R.*
AU - Engelmann, J.*
AU - Parker, D.*
AU - Mishra, A.
C1 - 28499
C2 - 33429
SP - 3148-3153
TI - Responsive MR-imaging probes for N-methyl-D-aspartate receptors and direct visualisation of the cell-surface receptors by optical microscopy.
JO - Chem. Sci.
VL - 4
PB - Royal Soc Chemistry
PY - 2013
SN - 2041-6520
ER -