Consequently, because of the computational cost, conventional docking of ultra-large libraries remains unaffordable for most of the research community. Hence, various machine learning techniques that emulate docking have been proposed to perform such tasks without large computing resources. In Supplementary Table 1, we list several such methods that have been developed as proofs of concept to approximate docking scores from molecular structural features (descriptors)9,18,19,20,21,22. Although these methodologies cannot be easily compared (owing to the use of different benchmark sets and docking libraries), it is reasonable to state that DD is one of the fastest AI-enabled docking platforms and the only method that has been extensively tested on 1B+ libraries. In addition, the DD protocol does not rely on a particular docking program, and thus it is compatible with emerging large-scale docking methods and can improve their high-throughput capabilities.
A test recall value that differs by >0.015 from the predefined value indicates poor model generalizability caused by validation and test sets that are not sufficiently large. As already indicated in the Critical step of Step 14, the user should generate validation and test sets that are as large as possible in the first iteration.
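The recall check described above can be written as a short helper. This is a minimal sketch, not part of the DD scripts: the function name and arguments are hypothetical, and only the 0.015 tolerance comes from the text.

```python
def recall_gap_is_acceptable(test_recall, target_recall, tolerance=0.015):
    """Return True when the test recall is within `tolerance` of the predefined recall.

    The 0.015 default is the threshold quoted in the text; the function name
    and signature are illustrative assumptions.
    """
    return abs(test_recall - target_recall) <= tolerance


if __name__ == "__main__":
    # Example: predefined recall of 0.90, observed test recall of 0.87.
    if not recall_gap_is_acceptable(0.87, 0.90):
        print("Recall gap exceeds 0.015: enlarge the validation and test sets.")
```

A failed check points to validation and test sets that are too small, not to a problem with the model architecture itself.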
(If needed) When during the first iteration the precision value is 40% of the starting number in the database, restart the iteration and increase the training set size. If the training set cannot be further increased or its increase does not have a significant effect on the number of remaining molecules, we recommend rerunning model training from Step 22 by decreasing the recall value at Steps 22 and 24 by 0.05 and repeating this process until 50% of the original molecules in the library are retained. Setting the size of the molecular sets to 700,000 molecules (Example 2) resulted in substantially higher precision and AUC values, which allowed us to discard ~79% of the original molecules, and improved model generalizability as well.
(Recommended) When inference is finished, compare the number of positively predicted molecules in morgan_1024_predictions with the Total Left value in best_model_stats.txt to confirm model generalizability. If the values are substantially different, larger validation and test sets must be regenerated, and training and inference must be repeated, as described in Step 26.
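The comparison between the two counts could be scripted as follows. This is a sketch under stated assumptions: it assumes that each file in morgan_1024_predictions holds one predicted molecule per non-empty line, that the Total Left value has already been read from best_model_stats.txt by hand, and that a 10% relative difference is a reasonable cutoff. The text does not define what "substantially different" means, so the cutoff and all function names are hypothetical.

```python
from pathlib import Path


def count_predicted(pred_dir):
    """Count non-empty lines across all files in the prediction folder.

    Assumes one positively predicted molecule per line (an assumption
    about the file layout, not a documented format).
    """
    total = 0
    for path in sorted(Path(pred_dir).iterdir()):
        if path.is_file():
            with path.open() as handle:
                total += sum(1 for line in handle if line.strip())
    return total


def relative_difference(predicted_hits, total_left):
    """Relative deviation of the inference count from the Total Left value."""
    if total_left <= 0:
        raise ValueError("total_left must be positive")
    return abs(predicted_hits - total_left) / total_left


def counts_agree(predicted_hits, total_left, max_rel_diff=0.1):
    """True when the two counts differ by at most `max_rel_diff` (assumed cutoff)."""
    return relative_difference(predicted_hits, total_left) <= max_rel_diff
```

For example, `counts_agree(count_predicted("morgan_1024_predictions"), total_left)` would flag an iteration whose inference output deviates from the estimate made on the test set, with `total_left` supplied by the user.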
Using a text editor, create a logs.txt file (Box 3) in the project folder, listing the path to the project folder, the project folder name, the location of the docking grid, the location of the fingerprint library, the location of the SMILES library, the docking program (either Glide or FRED), the number of models to train at each iteration (the recommended value is 24; allowed values are 16, 24, 48, 72 and 144), the desired size of validation and test sets and the location of the Glide template docking script (not required if a different docking program is used).
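For users who prefer to generate logs.txt programmatically, the sketch below assembles the entries in the order listed above and checks the constrained values. The layout of one value per line is an assumption (Box 3 defines the actual format), and the function names and any paths passed in are hypothetical.

```python
from pathlib import Path

# Values permitted by the text for the number of models per iteration.
ALLOWED_MODEL_COUNTS = {16, 24, 48, 72, 144}
DOCKING_PROGRAMS = {"Glide", "FRED"}


def build_logs_lines(project_path, project_name, grid_path, fingerprint_dir,
                     smiles_dir, docking_program, n_models, set_size,
                     glide_template=None):
    """Return the logs.txt entries in the order given in the text.

    Assumes one value per line; consult Box 3 for the authoritative layout.
    """
    if docking_program not in DOCKING_PROGRAMS:
        raise ValueError("docking_program must be 'Glide' or 'FRED'")
    if n_models not in ALLOWED_MODEL_COUNTS:
        raise ValueError("n_models must be one of 16, 24, 48, 72 or 144")
    if docking_program == "Glide" and glide_template is None:
        raise ValueError("a Glide template docking script is required for Glide")
    lines = [str(project_path), str(project_name), str(grid_path),
             str(fingerprint_dir), str(smiles_dir), docking_program,
             str(n_models), str(set_size)]
    if glide_template is not None:
        lines.append(str(glide_template))
    return lines


def write_logs(project_folder, lines):
    """Write the entries to logs.txt inside the project folder and return its path."""
    path = Path(project_folder) / "logs.txt"
    path.write_text("\n".join(lines) + "\n")
    return path
```

Validating the model count and docking program before writing the file catches the most common configuration mistakes before any jobs are submitted.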
When inference jobs are completed, we suggest comparing the number of positively predicted molecules in morgan_1024_predictions with the Total Left value in best_model_stats.txt, to confirm model generalizability. If the values are significantly different, validation and test sets must be regenerated with a larger size, and training and inference must be repeated, as described in Step 26 and the related Troubleshooting for Procedure 1.