Fig. 1 (A) The three databases for evaluating theoretical methods against experimental solubilities. (B) Accuracy comparisons of CombiSolv-QM and QM-DB for the data points overlapping with Exp-DB. (C) Architecture of the graph neural network for solubility. ASA: Accessible surface area, TPSA: Topological polar surface area.
Yeonjoon, Hojin, and Sabari of the Prof. Kim Group and Prof. Paton had their work on graph neural networks for solubility prediction accepted for publication in Chemical Science. The work reconciles the differing error magnitudes and uncertainties of experimental and computational solubility databases through a semi-supervised distillation (SSD) scheme built on a graph neural network model. This approach maximized both the database size and the model's prediction accuracy, and it was applied to two practical examples of solvent design: (1) the relationship between solvation free energy and reaction rates, and (2) log P prediction for lignin-derived monomers and drug-like molecules.