Abstract
Machine learning-based methods have shown potential for optimizing existing molecules towards more desirable properties, a critical step towards accelerating new chemical discovery. Here we propose QMO, a generic query-based molecule optimization framework that exploits latent embeddings from a molecule autoencoder. QMO improves the desired properties of an input molecule based on efficient queries, guided by a set of molecular property predictions and evaluation metrics. We show that QMO outperforms existing methods in the benchmark tasks of optimizing small organic molecules for drug-likeness and solubility under similarity constraints. We also demonstrate substantial property improvement using QMO on two new and challenging tasks that are also important in real-world discovery problems: (1) optimizing existing potential SARS-CoV-2 main protease inhibitors towards higher binding affinity and (2) improving known antimicrobial peptides towards lower toxicity. Results from QMO show high consistency with external validations, suggesting an effective means to facilitate material optimization problems with design constraints.
Main
Molecule optimization (MO) for improving the structural and/or functional profile of a molecule is an essential step for many scientific and engineering applications, including chemistry, drug discovery, bioengineering and material science. Without further modelling or use of prior knowledge, the challenge of MO lies in searching over the prohibitively large space composed of all possible molecules and generating new, valid and optimal ones. In recent years, machine learning has proven to be a promising tool for MO by combining domain knowledge and data-driven learning for efficient discovery^{1,2,3,4}. Compared to traditional high-throughput wet-lab experiments or computer simulations, which are time-consuming and expensive^{5,6}, machine learning can accelerate MO by enabling iterative improvements based on instant feedback from real-time model prediction and analysis^{7,8}, thereby reducing the gap between initial discovery and subsequent optimization and production of materials for various applications. For example, machine learning-driven MO can enable the prompt design of optimized candidates starting from existing lead molecules, leading to potentially better inhibition of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) proteins. It is now well accepted that the majority of existing drugs fail to show the desired binding (and inhibition) to SARS-CoV-2 targets, mostly due to the novel nature of the SARS-CoV-2 virus^{9,10}. Therefore, optimizing existing lead molecules towards better SARS-CoV-2 target binding affinity while keeping molecular similarity high appears to be a promising first step towards optimal drug design for coronavirus disease 2019 (COVID-19). Similarly, an efficient MO method can guide the design of antimicrobials with lower toxicity to fight resistant pathogens, one of the biggest threats to global health^{11}.
Without loss of generality, we refer to a lead molecule as the starting molecule to be optimized to meet a set of desired properties and constraints. Many recent studies on machine learning-enabled MO represent a molecule as a string consisting of chemical units. For small organic molecules, the SMILES representation^{12} is widely used, whereas for peptide sequences, a text string composed of amino-acid characters is a popular representation. Often, for efficiency, the optimization is performed on a learned representation space of the system of interest, which describes molecules as embedding vectors in a low-dimensional continuous space. A sequence-to-sequence encoder–decoder model, such as a (variational) autoencoder, can be used to learn continuous representations of the molecules in a latent space. Moreover, different optimization or sampling techniques based on the latent representation can be used to improve a molecule with external guidance from a set of molecular property predictors and simulators. The external guidance can be either explicitly obtained from physics-based simulations, (chem/bio)informatics and wet-lab experiments, or implicitly learned from a chemical database.
Based on methodology, related work on machine learning for MO can be divided into two categories: guided search and translation. Guided search uses guidance from predictive models and/or evaluations from statistical models, where the search can be either in the discrete molecule sequence space or through a continuous latent space (or distribution) learned by an encoder–decoder. Genetic algorithms^{13,14,15} and Bayesian optimization (BO)^{16} have been proposed for searching in the discrete sequence space, but their efficiency can be low when the search dimension is high. Recent works have exploited latent representation learning and different optimization/sampling techniques to achieve an efficient search. Examples include the combined use of a variational autoencoder (VAE) and BO^{17,18,19,20}, a VAE and Gaussian sampling^{21}, a VAE and sampling guided by a predictor^{22,23}, a VAE and evolutionary algorithms^{24}, deep reinforcement learning and/or a generative network^{25,26,27,28,29}, and attribute-guided rejection sampling on an autoencoder^{30}. The translation-based approach, on the other hand, treats molecule generation as a sequence-to-sequence translation problem^{31,32,33,34}. Examples include translation with junction trees^{35,36}, shape features^{18}, hierarchical graphs^{37} and transfer learning^{38}. Compared to guided search, translation-based approaches require additional knowledge of paired sequences for learning to translate a lead molecule into an improved molecule. This knowledge may not be available for new MO tasks with limited information. For example, in the task of optimizing a set of known inhibitor molecules to better bind to the SARS-CoV-2 target protein sequence while preserving the desired drug properties, a sufficient number of such paired molecule sequences is unavailable. We also note that these two categories are not exclusive: guided search can be used jointly with translation.
In this Article, we propose a novel query-based molecule optimization (QMO) framework, as illustrated in Fig. 1. In this context, a query to a designed loss function for QMO gives the corresponding numerical value obtained through the associated property evaluations. Efficiency refers to the performance of the optimization results given a query budget. QMO uses an encoder–decoder and external guidance, but it differs from existing works in the following aspects. (1) QMO is a generic end-to-end optimization framework that reduces the problem complexity by decoupling representation learning and guided search. It applies to any plug-in (pretrained) encoder–decoder with continuous latent representations. It is also a unified and principled approach that incorporates multiple predictions and evaluations made directly at the molecule sequence level into guided search, without further model fitting. (2) To achieve efficient end-to-end optimization with discrete molecule sequences and their continuous latent representations, QMO adopts a novel query-based guided search method based on zeroth-order optimization^{39,40}, a technique that performs efficient mathematical optimization using only function evaluations (more details are provided in Supplementary Section 1). Its query-based guided search enables direct optimization over the property evaluations provided by chemical informatics/simulation software packages or prediction application programming interfaces (APIs), and it supports guided search with exact property evaluations that operate only at the molecular sequence level rather than on latent representations or surrogate models. To the best of our knowledge, this work is the first study that facilitates molecule optimization by disentangling molecule representation learning and guided search, and by exploiting zeroth-order optimization for efficient search in the molecular property landscape.
The success of QMO can be attributed to its data efficiency, achieved by exploiting the latent representations learned from abundant unlabelled data and the guidance for property prediction trained on relatively limited labelled data.
We first demonstrate the effectiveness of QMO on two sets of standard benchmarks. On two existing (and simpler) MO benchmark tasks—optimizing QED (quantitative estimate of drug-likeness)^{41} and penalized logP (reflecting the octanol–water partition coefficient)^{22} under similarity constraints—QMO attains superior performance over existing baselines, showing at least 15% higher success on QED optimization and an absolute improvement of 1.7 on penalized logP. The performance of QMO on these molecular physical property benchmarks shows its potential for optimizing material design before synthesis, which is critical in many applications and fields, such as the food industry, agrochemicals, pesticides, drugs, catalysts and waste chemicals.
Next, as a motivating discovery use case that also, at least to some extent, reflects the complexity of real discovery problems^{42}, we demonstrate how QMO can be used to improve the binding affinity of a number of existing inhibitor molecules for the SARS-CoV-2 main protease (M^{pro}), one of the most extensively studied drug targets of SARS-CoV-2. As an illustration, Fig. 2 shows the top docking poses of dipyridamole and its QMO-optimized variant with SARS-CoV-2 M^{pro}. We formulate this task as an optimization over predicted binding affinity (obtained using a pretrained machine learning model) starting from an existing molecule of interest (a lead molecule). Because experimental half-maximal inhibitory concentration (IC_{50}) values are widely available, we use them as a measure of protein–ligand binding affinity. The pIC_{50} of the optimized molecule (where \({{{\rm{pIC}}}}_{50}=-{{{\mathrm{log}}}\,}_{10}({{{\rm{IC}}}}_{50})\)) is constrained to be above 7.5, a sign of good affinity, while the Tanimoto similarity between the optimized molecule and the original molecule is maximized. Retaining high similarity while optimizing the initial lead molecule means that important chemical characteristics can be maximally preserved. Moreover, high similarity to existing leads is important for a rapid response to a novel pathogen such as SARS-CoV-2, as it makes it more likely that existing knowledge and manufacturing pipelines can be leveraged for the synthesis and wet-lab evaluation of the optimized variants; the chance of optimized variants inducing adverse effects is also potentially lower. Our results show that QMO can find molecules with high similarity and improved affinity, while preserving other properties of interest such as drug-likeness.
We also consider the task of optimizing existing antimicrobial peptides towards lower selective toxicity, which is critical for accelerating safe antimicrobial discovery. In this task, QMO shows a high success rate (~72%) in improving the toxicity of antimicrobial peptides, and the properties of optimized molecules are consistent with external toxicity and antimicrobial activity classifiers. Finally, we perform property landscape visualization and trajectory analysis of QMO to illustrate its efficiency and diversity in finding improved molecules with desired properties.
We emphasize that QMO is a general-purpose optimization algorithm that enables optimization over discrete spaces (for example, sequences and graphs) by searching over a latent space of the system using guidance from (expensive) black-box function evaluations. Beyond the organic and biological molecule optimization applications considered in this study, QMO can be applied to the optimization of other classes of material, such as inorganic solid-state materials like metal alloys or metal oxides.
Results
Representations of molecules
In our QMO framework, we model a molecule as a discrete string of chemical or amino-acid characters (a sequence). Depending on the downstream MO tasks, the sequence representation can either be a string of natural amino acids^{30,43} or a string designed for encoding small organic chemicals. In particular, the simplified molecular-input line-entry system (SMILES) representation^{12} describes the structure of small organic molecules using short ASCII strings. Without loss of generality, we define \({{\mathbb{X}}}^{m}:= {\mathbb{X}}\times {\mathbb{X}}\cdots \times {\mathbb{X}}\) as the product space containing every possible molecule sequence of length m, where \({\mathbb{X}}\) denotes the set of all chemical characters. To elucidate the problem complexity, taking the 20 protein-building amino acids as characters in a peptide sequence, the number of possible candidates in the space of sequences of length m = 60 already approaches the number of atoms in the known universe (~10^{80}). Similarly, the space of small molecules with therapeutic potential is estimated to be on the order of 10^{60} (refs. ^{44,45}). Therefore, the problem of MO in the ambient space \({{\mathbb{X}}}^{m}\) can be computationally inefficient, as the search space grows combinatorially with the sequence length m.
Encoder–decoder for learning latent molecule representations
To address the issue of the large search space for molecule sequences, QMO adopts an encoder–decoder framework. The encoder \({{{\rm{Enc}}}}:{{\mathbb{X}}}^{m}\mapsto {{\mathbb{R}}}^{d}\) encodes a sequence \({x}\in {{\mathbb{X}}}^{m}\) to a low-dimensional continuous real-valued representation of dimension d, denoted by an embedding vector z = Enc(x). The decoder \({{{\rm{Dec}}}}:{{\mathbb{R}}}^{d}\mapsto {{\mathbb{X}}}^{{m}^{\prime}}\) decodes the latent representation z of x back to a sequence representation, denoted by \({\widehat{x}}={{{\rm{Dec}}}}{(z)}\). We note that, depending on the encoder–decoder implementation, the input sequence x and the decoded sequence \({\widehat{x}}\) may have different lengths. On the other hand, the latent dimension d is universal (fixed) for all sequences. Winter et al.^{46} proposed a novel molecular descriptor and used it in an autoencoder to learn latent representations featuring high similarity between the original and reconstructed sequences. QMO applies to any plug-in (pretrained) encoder–decoder with continuous latent representations and thus decouples representation learning from guided search, reducing the problem complexity of MO.
MO formulation via guided search
In addition to leveraging learned latent representations from a molecule encoder–decoder, our QMO framework incorporates molecular property prediction models and similarity metrics at the sequence level as external guidance. Specifically, for any given sequence \({x}\in {{\mathbb{X}}}^{m}\), we use a set of I separate prediction models \({\{{f}_{i}{(x)}\}}_{i = 1}^{I}\) to evaluate the properties of interest for MO. In principle, for a candidate sequence x, a set of thresholds \({\{{\tau }_{i}\}}_{i = 1}^{I}\) on its property predictions \({\{{f}_{i}{(x)}\}}_{i = 1}^{I}\) is used for validating the condition f_{i}(x) ≥ τ_{i} for all i ∈ [I], where [I] denotes the integer set {1, 2, …, I}. Moreover, we can simultaneously impose a set of J separate constraints \({\{{g}_{j}{(x\mid {\mathsf{S}})}\ge {\eta }_{j}\}}_{j = 1}^{J}\) in the optimization process, such as molecular similarity relative to a set of reference molecule sequences denoted by S.
Our QMO framework covers two practical cases in MO: (1) optimizing molecular similarity while satisfying the desired chemical properties and (2) optimizing chemical properties under similarity constraints. It can easily be extended to other MO settings that can be formulated via \({\{{f}_{i}{(x)}\}}_{i = 1}^{I}\) and \({\{{g}_{j}{(x\mid {\mathsf{S}})}\}}_{j = 1}^{J}\).
In what follows, we formally define the designed loss function of QMO for case (1). Given a starting molecule sequence x_{0} (a lead molecule) and a pretrained encoder–decoder, let x = Dec(z) denote a candidate sequence decoded from a latent representation \({z}\in {{\mathbb{R}}}^{d}\). Our QMO framework aims to find an optimized sequence by solving the following continuous optimization problem:

$$\mathop{\min }\limits_{z\in {{\mathbb{R}}}^{d}}\ \mathop{\sum }\limits_{i=1}^{I}\max \{{\tau }_{i}-{f}_{i}({{{\rm{Dec}}}}(z)),0\}-\mathop{\sum }\limits_{j=1}^{J}{\lambda }_{j}\cdot {g}_{j}({{{\rm{Dec}}}}(z)\mid {\mathsf{S}})\quad (1)$$
The first term, \(\mathop{\sum }\nolimits_{i = 1}^{I}\max \{{\tau }_{i}-{f}_{i}({{{\rm{Dec}}}}(z)),0\}\), quantifies the loss of the property constraints and is presented as the sum of hinge losses over all property predictions, which approximates the binary property validation relative to the required thresholds \({\{{\tau }_{i}\}}_{i = 1}^{I}\). It achieves the optimal value (that is, 0) only when the candidate sequence x = Dec(z) satisfies all the desired properties, which is equivalent to the condition that f_{i}(Dec(z)) ≥ τ_{i} for all i ∈ [I]. The second term, \(\mathop{\sum }\nolimits_{j = 1}^{J}{\lambda }_{j}\cdot {g}_{j}({{{\rm{Dec}}}}(z)\mid {\mathsf{S}})\), corresponds to a set of molecular similarity scores to be maximized (therefore a minus sign in the minimization formulation). The reference sequence set S can be the starting sequence alone, such that S = {x_{0}}, or a set of molecules. The positive coefficients \({\{{\lambda }_{j}\}}_{j = 1}^{J}\) are associated with the respective molecular similarity scores \({\{{{g}_{j}}({{{\rm{Dec}}}}{(z)}\mid {\mathsf{S}})\}}_{j = 1}^{J}\). It is worth mentioning that the use of the latent representation z as the optimization variable in a low-dimensional continuous space greatly facilitates the original MO problem in a high-dimensional discrete space. The optimization variable z can be initialized as the latent representation of x_{0}, denoted by z_{0} = Enc(x_{0}).
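As a concrete illustration, the case-(1) objective can be sketched in a few lines of Python. The decoder, the property predictors f_i and the similarity scores g_j below are hypothetical toy stand-ins, not the pretrained models used in this work:

```python
# Minimal sketch of the case-(1) QMO loss: a sum of hinge penalties on the
# property thresholds minus a weighted sum of similarity scores. `decode`,
# `props` (the f_i) and `sims` (the g_j) are toy placeholders.
def qmo_loss_case1(z, decode, props, taus, sims, lambdas):
    x = decode(z)  # decode the latent point into a candidate "sequence"
    hinge = sum(max(tau - f(x), 0.0) for f, tau in zip(props, taus))
    similarity = sum(lam * g(x) for g, lam in zip(sims, lambdas))
    return hinge - similarity  # similarity is maximized, hence the minus sign
```

The hinge term vanishes exactly when every property prediction clears its threshold, matching the binary validation described above.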
Similarly, for case (2), the optimization problem is formulated as

$$\mathop{\min }\limits_{z\in {{\mathbb{R}}}^{d}}\ \mathop{\sum }\limits_{j=1}^{J}\max \{{\eta }_{j}-{g}_{j}({{{\rm{Dec}}}}(z)\mid {\mathsf{S}}),0\}-\mathop{\sum }\limits_{i=1}^{I}{\gamma }_{i}\cdot {f}_{i}({{{\rm{Dec}}}}(z))\quad (2)$$
where \({\{{\eta }_{j}\}}_{j = 1}^{J}\) are the similarity score constraints and \({\{{\gamma }_{i}\}}_{i = 1}^{I}\) are positive coefficients of the property scores \({\{{f}_{i}({{{\rm{Dec}}}}(z))\}}_{i = 1}^{I}\).
QMO procedure
Although we formulate MO as an unconstrained continuous minimization problem, we note that solving it for a feasible candidate sequence x = Dec(z) is not straightforward, for two reasons. (1) The output of the decoder, x = Dec(z), is a discrete sequence, which imposes challenges on any gradient-based (and higher-order) optimization method because acquiring the gradient with respect to z becomes non-trivial. Even resorting to the Gumbel-softmax sampling trick for discrete outputs^{47}, the large output space of the decoder may render it ineffective. (2) In practice, many molecular property prediction models and molecular metrics are computed in an access-limited environment, such as prediction APIs and chemical software, which only allow inference on a queried sequence and prohibit other functionalities such as gradient computation. To address these two issues, we use zeroth-order optimization in our QMO framework (details of the procedure are provided in the Methods) to provide a generic and model-agnostic approach for solving the problem formulations in equations (1) and (2), using only the inference results of \({\{{f}_{i}\}}_{i = 1}^{I}\) and \({\{{g}_{j}\}}_{j = 1}^{J}\) on queried sequences.
Let Loss(z) denote the objective function to be minimized, as defined in either equation (1) or equation (2). Our QMO framework uses zeroth-order gradient descent to find a solution, which mimics the descent steps on the loss landscape in gradient-based solvers but uses only the function values Loss(⋅) of queried sequences. Specifically, at the tth iteration of the zeroth-order optimization process, the iterate (candidate embedding vector) z^{(t + 1)} is updated by

$${z}^{(t+1)}={z}^{(t)}-{\alpha }_{t}\cdot \widehat{\nabla }{\mathtt{Loss}}({z}^{(t)})\quad (3)$$
where α_{t} ≥ 0 is the step size at the tth iteration, and the true gradient ∇Loss(z^{(t)}) (which is challenging or infeasible to compute) is approximated by the pseudo gradient \({\widehat{\nabla }}{\mathtt{Loss}}({z}^{(t)})\). The pseudo gradient \({\widehat{\nabla }}{\mathtt{Loss}}({z}^{(t)})\) is estimated from Q independent random directional queries, defined as

$$\widehat{\nabla }{\mathtt{Loss}}({z}^{(t)})=\frac{d}{\beta \cdot Q}\mathop{\sum }\limits_{q=1}^{Q}\left[{\mathtt{Loss}}({z}^{(t)}+\beta \cdot {u}^{(q)})-{\mathtt{Loss}}({z}^{(t)})\right]\cdot {u}^{(q)}\quad (4)$$
where d is the dimension of the latent space of the encoder–decoder used in QMO, and β > 0 is a smoothing parameter used to perturb the embedding vector z^{(t)} for neighbourhood sampling with Q random directions \({\{{u}^{(q)}\}}_{q = 1}^{Q}\) that are independently and identically sampled on a d-dimensional unit sphere. Figure 1 presents an illustration of random neighbourhood sampling. In our implementation, we sample \({\{{u}^{(q)}\}}_{q = 1}^{Q}\) using a zero-mean d-dimensional isotropic Gaussian random vector divided by its Euclidean norm, such that the resulting samples are drawn uniformly from the unit sphere. Intuitively, the gradient estimator in equation (4) can be viewed as an average of Q random directional derivatives along the sampled directions \({\{{u}^{(q)}\}}_{q = 1}^{Q}\). The constant \({\frac{d}{\beta \cdot Q}}\) in equation (4) ensures that the norm of the estimated gradient is of the same order as that of the true gradient^{39,40}.
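For a generic black-box Loss function, the random directional estimator of equation (4) can be sketched as follows (a minimal illustration, not the paper's implementation):

```python
import numpy as np

def zo_gradient(loss, z, beta=0.01, Q=200, rng=None):
    """Equation-(4)-style zeroth-order gradient estimate of a black-box `loss`
    at point z: average Q random directional differences, scaled by d / beta.
    Directions are Gaussian vectors normalized to the unit sphere, as in the
    sampling scheme described in the text."""
    if rng is None:
        rng = np.random.default_rng(0)
    d = z.shape[0]
    base = loss(z)  # the single unperturbed query
    grad = np.zeros(d)
    for _ in range(Q):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)  # uniform direction on the d-dimensional sphere
        grad += (loss(z + beta * u) - base) * u
    return (d / (beta * Q)) * grad
```

For example, with Loss(z) = ||z||² the true gradient is 2z, and the estimate approaches it as Q grows.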
A schematic example of the QMO procedure is illustrated in Fig. 1 using binding affinity and Tanimoto similarity as property evaluation criteria. Note that, based on the iterative optimization step in equation (3), QMO only uses function values queried at the original and perturbed sequences for optimization. The query count made on the Loss function for computing \({\widehat{\nabla }}{\mathtt{Loss}}({z}^{(t)})\) is Q + 1 per iteration. A larger Q further reduces the gradient estimation error at the price of increased query complexity. When solving equation (1), an iterate z^{(t)} is considered a valid solution if its decoded sequence Dec(z^{(t)}) satisfies the property conditions f_{i}(Dec(z^{(t)})) ≥ τ_{i} for all i ∈ [I]. Similarly, when solving equation (2), a valid solution z^{(t)} means g_{j}(Dec(z^{(t)})∣S) ≥ η_{j} for all j ∈ [J]. Finally, QMO returns the set of found solutions (returning null if none is found). Detailed descriptions of the QMO procedure are provided in the Methods.
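Putting the descent step and the validity check together, the overall loop can be sketched as below; this is a minimal numerical illustration in which `loss` and `is_valid` are hypothetical stand-ins for the sequence-level property evaluations:

```python
import numpy as np

def qmo_descent(loss, is_valid, z0, alpha=0.05, beta=0.01, Q=100, iters=200, seed=0):
    """Minimal sketch of the QMO loop: zeroth-order gradient descent on a
    black-box Loss (Q + 1 queries per iteration), keeping every iterate that
    passes the validity check. Returns (best valid iterate or None, all valid
    iterates). Toy stand-in for the full procedure described in the Methods."""
    rng = np.random.default_rng(seed)
    z, d = z0.astype(float).copy(), z0.shape[0]
    solutions = []
    for _ in range(iters):
        base = loss(z)                       # 1 query
        grad = np.zeros(d)
        for _ in range(Q):                   # Q perturbed queries
            u = rng.standard_normal(d)
            u /= np.linalg.norm(u)
            grad += (loss(z + beta * u) - base) * u
        z = z - alpha * (d / (beta * Q)) * grad   # descent step, as in eq. (3)
        if is_valid(z):
            solutions.append(z.copy())
    best = min(solutions, key=loss) if solutions else None
    return best, solutions
```

On a toy quadratic loss, the loop reliably drives the iterate into the valid region despite using only function-value queries.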
Three sets of MO tasks with multiple property evaluation criteria
In what follows, we demonstrate the performance of our proposed QMO framework on three sets of tasks that aim to optimize molecular properties with constraints, including standard MO benchmarks and challenging tasks relating to realworld discovery problems. The pretrained encoder–decoder and the hyperparameters of QMO for each task are specified in the Methods and in Supplementary Section 3.
Benchmarks on QED and penalized logP optimization
We start by testing QMO on two single-property targets: penalized logP and QED^{41}. LogP is the logarithm of the partition ratio of a solute between octanol and water. Penalized logP is defined as logP minus the synthetic accessibility (SA) score^{22}. Given a similarity constraint, finding an optimized molecule that improves the drug-likeness of compounds in terms of the QED score (from a range of [0.7, 0.8] to [0.9, 1.0])^{41} or improves the penalized logP score^{22} are two widely used benchmarks. For a pair of original and optimized sequences (x_{0}, x), we use the QMO formulation in equation (2) with the Tanimoto similarity (ranging from 0 to 1) over Morgan fingerprints^{48} as g_{Tanimoto}(x∣x_{0}) and the property score of interest (QED or penalized logP) as f_{score}(x). Following the same setting as existing works, the threshold δ for g_{Tanimoto}(x∣x_{0}) is set to either 0.4 or 0.6. We use RDKit (open-source cheminformatics; http://www.rdkit.org) to compute QED and logP, and MOSES^{49} to compute SA, where f_{penalized logP}(x) = logP(x) − SA(x).
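For intuition, the Tanimoto similarity itself is simple to state; the dependency-free sketch below treats a fingerprint as a set of "on" bit indices, whereas in practice RDKit's Morgan fingerprints and its built-in Tanimoto routine would be used:

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints represented as sets of
    'on' bit indices: |intersection| / |union|.
    Defined here as 1.0 for two empty fingerprints (a convention chosen for
    this sketch)."""
    if not fp_a and not fp_b:
        return 1.0
    inter = len(fp_a & fp_b)
    return inter / (len(fp_a) + len(fp_b) - inter)
```

The score ranges from 0 (no shared bits) to 1 (identical fingerprints), matching the [0, 1] range used as the similarity constraint above.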
In our experiments, we use as starting sequences the same set of 800 molecules with low penalized logP scores and 800 molecules with QED ∈ [0.7, 0.8] chosen from the ZINC test set^{50} as in Jin et al.^{22}. We compare QMO with various guided-search and translation-based methods in Extended Data Figs. 1 and 2. Baseline results are obtained from prior works^{35,38} that use machine learning to solve the same tasks.
For the QED optimization task, the success rate, defined as the percentage of improved molecules having similarity greater than δ = 0.4, is shown in Extended Data Fig. 1. QMO outperforms all baselines by at least 15%. For the penalized logP task, the molecules optimized by QMO outperform the baseline results by a notable margin, as shown in Extended Data Fig. 2. The increased standard deviation for QMO is an artefact of some molecules having much-improved penalized logP scores (Supplementary Section 4).
Although the above-mentioned molecular property optimization tasks provide well-defined benchmarks for testing our QMO algorithm, it is well recognized that such tasks are easy to solve and do not capture the complexity associated with real-world discovery^{51}. For example, it is trivial to achieve state-of-the-art results for logP optimization by generating long saturated hydrocarbon chains^{52}. Coley et al.^{42} have proposed that MO goals that better reflect the complexity of real discovery tasks might include binding or selectivity attributes. Therefore, in the remainder of this Article we consider two such tasks: (1) optimizing the binding affinity of existing SARS-CoV-2 M^{pro} inhibitor molecules and (2) lowering the toxicity of known antimicrobial peptides.
Optimizing existing SARS-CoV-2 main protease inhibitor molecules towards better IC_{50}
To provide a timely solution and accelerate drug discovery against a new virus such as SARS-CoV-2, it is a sensible practice to optimize known leads, to facilitate design and production as well as to minimize the emergence of adverse effects. Here we focus on the task of optimizing the parent-molecule structure of a set of existing SARS-CoV-2 M^{pro} inhibitors. Specifically, we use the QMO formulation in equation (1), a pretrained binding affinity predictor^{53} f_{affinity} (whose output is the pIC_{50} value) and the Tanimoto similarity g_{Tanimoto} between the original and optimized molecules. Given a known inhibitor molecule x_{0}, we aim to find an optimized molecule x such that f_{affinity}(x) ≥ τ_{affinity} while g_{Tanimoto}(x∣x_{0}) is maximized.
For this task, we start by assembling 23 existing molecules shown to have weak to moderate affinity with SARS-CoV-2 M^{pro} (refs. ^{54,55}). Their IC_{50} values, a measure of inhibitory potency, are generally in the micromolar range (Supplementary Section 3 describes the experimental IC_{50} values). We choose the target affinity threshold τ_{affinity} as pIC_{50} ≥ 7.5, which implies strong affinity. Table 1 shows the final optimized molecules compared to their initial state (that is, the original lead molecule). We highlight common substructures and show a similarity map to emphasize the changes. The results for all 23 inhibitors are summarized in Supplementary Table 2.
Because all 23 of these inhibitors are reported to bind to the substrate-binding pocket of M^{pro}, we investigate possible binding-mode alterations of the QMO-optimized molecules. It should be noted that a direct comparison of IC_{50} with binding free energy (BFE) is not always possible, as the relationship between binding affinity and IC_{50} for a given compound varies depending on the assay conditions and the compound's mechanism of inhibition^{56}. Furthermore, high-fidelity BFE estimation requires accounting for factors such as conformational entropy and the explicit presence of the solvent. Nevertheless, we report the BFE and binding mode for the QMO-optimized variants. For simplicity, we limit the analysis to achiral molecules. First, we run blind docking simulations using AutoDock Vina^{57} over the entire structure of M^{pro} with the exhaustiveness parameter set to 8. We further rescore the top three docking poses for each of the original and QMO-optimized molecules using the molecular mechanics/Poisson–Boltzmann surface area (MM/PBSA) method and the AMBER force field^{58}, which is known to be more rigorous and accurate than the scoring function used in docking. Next, we inspect whether any of the top three docking poses of the original molecules, as well as of the QMO-optimized variants, involve the substrate-binding pocket of M^{pro}, as favourable interaction with that pocket is crucial for inhibition of M^{pro} function.
As an illustration, Fig. 2 shows the top docking pose of dipyridamole and its QMO-optimized variant in the M^{pro} substrate-binding pocket. Consistent with the more favourable MM/PBSA BFE, the QMO-optimized variant forms 14% more contacts (with a 5-Å distance cutoff between heavy atoms) with the M^{pro} substrate-binding pocket compared to dipyridamole. Some of the M^{pro} residues that explicitly form contacts with the dipyridamole variant are Leu167, Asp187, Arg188 and Gln192. Similar observations, for example a higher number of contacts with the M^{pro} substrate-binding pocket involving Tyr54, were found for other exemplar QMO variants, such as those of favipiravir and umifenovir. Supplementary Section 4 provides an extended analysis of blind docking.
Optimization of existing antimicrobial peptides towards improved toxicity
As an additional motivating use case, we consider antimicrobial peptide design. Discovering new antibiotics rapidly is critical for tackling the looming crisis of a global increase in antimicrobial resistance^{11}. Antimicrobial peptides (AMPs) are considered promising candidates for next-generation antibiotics. Optimal AMP design requires balancing multiple, tightly interacting attribute objectives^{59,60}, such as high potency and low toxicity. In an attempt to address this challenge, we show how QMO can be used to find improved variants of known AMPs with reported/predicted toxicity, such that the variants have lower predicted toxicity and high sequence similarity compared with the original AMPs.
For the AMP optimization task, a peptide molecule is represented as a sequence of the 20 natural amino-acid characters. Using the QMO formulation in equation (1), subject to constraints on the toxicity prediction value (f_{tox}) and the AMP prediction value (f_{AMP}), we aim to find the most similar molecules for a set of toxic AMPs. The sequence similarity score (g_{sim}) to be maximized is computed using Biopython (http://www.biopython.org), which uses a global alignment between two sequences (normalized by the length of the starting sequence) to evaluate the best concordance of their characters. Detailed descriptions are provided in the Methods.
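This normalized global-alignment score can be illustrated without Biopython. Assuming a match score of 1 with no mismatch or gap penalties (the behaviour of Biopython's `pairwise2.align.globalxx` scoring, which is our assumption here rather than a detail stated in the text), the optimal global alignment score reduces to the longest-common-subsequence length:

```python
def alignment_similarity(start, candidate):
    """Global-alignment similarity normalized by the starting-sequence length.
    With match = 1 and no mismatch or gap penalties, the optimal global
    alignment score equals the longest-common-subsequence (LCS) length,
    computed here by the standard dynamic programme."""
    n, m = len(start), len(candidate)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if start[i - 1] == candidate[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1  # extend a matched character
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m] / n
```

Identical sequences score 1.0, and the score decreases as substitutions and deletions accumulate, which is the quantity QMO maximizes in this task.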
The objective of QMO is to search for improved AMP sequences by maximizing similarity while satisfying the AMP activity and toxicity predictions (that is, classified as being an AMP and nontoxic based on predictions from pretrained deep learning models^{30}).
In our experiments, we use QMO to optimize 150 experimentally verified toxic AMPs, collected from public databases^{61,62} by Das et al.^{30}, as starting sequences. Note that the toxic annotation here does not depend on a specific type of toxicity, such as haemolytic toxicity. Extended Data Fig. 3 shows the cumulative success rate (turning toxic AMPs into nontoxic AMPs) using QMO up to the tth iteration. Within the first few iterations, more than 60% of the molecules are successfully optimized. Eventually, ~72.67% (109/150) of the molecules are successfully optimized. Analysis of all 109 original–improved pairs reveals notable physicochemical changes, for example, a lowering of hydrophobicity and hydrophobic moment in the QMO-optimized AMP sequences (Fig. 3a,b and Supplementary Table 8). This trend is consistent with the reported positive correlation of hydrophobicity and hydrophobic moment with cytotoxicity and haemolytic activity^{63,64}. Figure 3c shows examples of known AMPs and their QMO-optimized variant sequences. The sequence alignment and similarity ratio relative to the original sequence are also shown, indicating that sequences resulting from QMO can differ widely from the initial ones. Supplementary Fig. 3 depicts the optimization process for some AMP sequences. After the first success, QMO can further improve similarity while maintaining low predicted toxicity and high AMP values for the specified thresholds.
We perform additional validation by evaluating the QMO-optimized sequences with a number of state-of-the-art AMP and toxicity predictors that are external classifiers not used in the QMO framework. Extended Data Fig. 4 shows the external classifiers’ prediction results on the 109 original and improved sequence pairs that are successfully optimized by QMO. We note that these external classifiers vary in training data size and type, as well as in model architecture, and report a range of accuracies. Data and models for the toxicity prediction task are scarcer than those for the AMP classification problem. Furthermore, external toxicity classifiers such as HAPPENN^{65} and HLPpred-Fuse^{66} explicitly target prediction of haemolytic activity. For these reasons, the predictions of the external classifiers on the original lead sequences may vary when compared to ground-truth labels (third column in Supplementary Table 7). Nonetheless, predictions on the QMO-optimized sequences using external classifiers show high consistency in terms of toxicity improvement when compared with the predictors used in QMO.
Specifically, the predictions from iAMP-2L^{67} and HAPPENN^{65} (haemolytic toxicity prediction) show that 56.88% (62/109) of QMO-optimized molecules are predicted as non-toxic AMPs. In Supplementary Section 7.1, we also show that the use of a better encoder–decoder helps the optimization performance of QMO.
Property landscape visualization and trajectory analysis
To gain a better understanding of how QMO optimizes a lead molecule with respect to the property constraints and objectives, we provide a visual illustration of the property landscapes and search trajectories of QMO via a two-dimensional (2D) local interpolation on the molecule embedding space. Specifically, given the original embedding z_{0} and the embedding of the best candidate z* returned by QMO, we perform local grid sampling along two selected directions v_{x} and v_{y}, and then evaluate the properties of the decoded sequences from the sampled embeddings for property landscape analysis. To visualize the property landscape in low dimensions, we project the high-dimensional search trajectories \({\{{z}_{t}\}}_{t = 1}^{T}\) onto the two directions v_{x} and v_{y}. Figure 4a shows the landscape of Tanimoto similarity versus binding affinity prediction when using remdesivir as the lead molecule, with the optimization objective of maximizing Tanimoto similarity while ensuring that the predicted binding affinity is above a defined threshold of 7.5. The two directions are the principal vector z* − z_{0} and a random vector orthogonal to the principal vector (more details are provided in the Methods). The trajectory shows how QMO leverages the evaluations of similarity and binding affinity for optimizing the lead molecule. Figure 4b displays the common substructure of candidate molecules in comparison to the remdesivir molecule in terms of subgraph similarity, and their predicted properties over sampled iterations in QMO.
In addition to demonstrating the efficiency in optimizing lead molecules, we also study the diversity of the optimized molecules by varying the random seed used in QMO for query-based guided search. Figure 5 shows three different sets of trajectories on the landscape of predicted binding affinity when using remdesivir as the lead molecule (more details are provided in the Methods). The optimization objective is the same as that of Fig. 4. The visualization shows that the trajectories are distinct and that the best candidate molecules in each trajectory are distant from each other in the embedding space, suggesting that QMO can find a diverse set of improved molecules with the desired properties. In Supplementary Section 6.1 we also provide a quantitative study on the diversity and novelty of the QMO-optimized sequences when varying the similarity threshold. Setting a lower similarity threshold in QMO results in more novel and diverse sequences.
Discussion and conclusion
In this Article we have proposed QMO, a generic MO framework that readily applies to any pretrained molecule encoder–decoder with continuous latent molecule embeddings and any set of property predictions and evaluation metrics. It features efficient guided search with molecular property evaluations and constraints obtained using predictive models and cheminformatics software. More broadly, QMO is a machine learning tool that can be incorporated into different scientific discovery pipelines with deep generative models, such as generative adversarial networks, for efficient guided optimization with constraints. As a demonstration, Supplementary Sections 6.2 and 6.3 show the QMO results on the SARS-CoV-2 main protease inhibitor optimization task, with alternative objectives and randomly generated lead sequences, respectively. QMO is able to perform successful optimization with respect to different objectives, constraints and starting sequences. The proposed QMO framework can be applied, in principle, to other classes of material, such as metal oxides, alloys and genes.
On the simpler benchmark tasks of optimizing drug-likeness and penalized logP scores under similarity constraints, QMO demonstrates superior performance over baseline results. We also apply QMO to improve the binding affinity of existing inhibitors of the SARS-CoV-2 main protease and to reduce the toxicity of AMPs. The QMO-optimized variants of existing drug molecules show a favourable BFE with the SARS-CoV-2 main protease upon blind docking and MM/PBSA rescoring, whereas the QMO-optimized peptides are consistently predicted to be antimicrobial and non-toxic by external peptide property predictors. The property landscape analysis and low-dimensional visualization of the optimization trajectories provide insights into how QMO efficiently navigates the property space to find a diverse set of improved molecules with the desired properties. Our results provide strong evidence that QMO can serve as a novel and practical tool for molecule optimization, as well as for other process/product design problems, to aid in accelerating chemical discovery with constraints. In Supplementary Section 7 we provide an ablation study of QMO for additional performance analysis, including the effect of the encoder–decoder, the difference between sequence-level and latent-space classifiers, and a comparison between different gradient-free optimizers. The results show that QMO is a query-efficient end-to-end MO framework, and that a better encoder–decoder can further improve its performance. Future work will include integrating multi-fidelity expert feedback into the QMO framework for human–AI collaborative material optimization, and using QMO to accelerate the discovery of novel, high-performance and low-cost materials.
Methods
Procedure descriptions for the QMO framework

Procedure inputs: pretrained encoder–decoder; molecular property predictors \({\{{f}_{i}\}}_{i = 1}^{I}\) and thresholds \({\{{\tau }_{i}\}}_{i = 1}^{I}\); molecular similarity metrics \({\{{g}_{j}\}}_{j = 1}^{J}\) and thresholds \({\{{\eta }_{j}\}}_{j = 1}^{J}\); total number of search iterations T; step sizes \({\{{\alpha }_{t}\}}_{t = 0}^{T-1}\); starting lead molecule sequence x_{0}; reference sequence set S; loss function from equation (1) or (2)

Procedure initialization: z^{(0)} = Enc(x_{0}); \({{\mathbb{Z}}}_{{{\rm{solution}}}}\leftarrow \varnothing\)

Repeat the following steps T times, starting from t = 0:

Gradient estimation: generate Q random unit-norm perturbations \({\{{u}^{(q)}\}}_{q = 1}^{Q}\) and compute \(\widehat{\nabla }{\mathtt{Loss}}({z}^{(t)})=\frac{d}{\beta \cdot Q}\mathop{\sum }\nolimits_{q = 1}^{Q}\left[{\mathtt{Loss}}({z}^{(t)}+\beta {u}^{(q)})-{\mathtt{Loss}}({z}^{(t)})\right]\cdot {u}^{(q)}\), where d is the latent embedding dimension and β > 0 is a smoothing parameter

Pseudo gradient descent: \({z}^{(t+1)}={z}^{(t)}-{\alpha }_{t}\cdot \widehat{\nabla }{\mathtt{Loss}}({z}^{(t)})\)

Molecular property and constraint verification: if solving for formulation (1), check f_{i}(Dec(z^{(t)})) ≥ τ_{i} for all i ∈ [I]. If solving for formulation (2), check g_{j}(Dec(z^{(t)})∣S) ≥ η_{j} for all j ∈ [J].

Update valid molecule sequences: if all constraints in the verification step are satisfied, \({{\mathbb{Z}}}_{{{\rm{solution}}}}\leftarrow {{\mathbb{Z}}}_{{{\rm{solution}}}}\cup \{{z}^{(t)}\}\)
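The three steps above can be sketched in NumPy as follows. This is an illustrative reimplementation, not the released QMO code: the `loss` callable and the constraint predicates below are hypothetical stand-ins for the decode-then-predict pipeline (decoder, property predictors f_i and similarity metrics g_j) of the procedure.

```python
import numpy as np

def zo_gradient(loss, z, Q, beta, rng):
    """Zeroth-order gradient estimate: average
    (d / beta) * [Loss(z + beta * u) - Loss(z)] * u
    over Q random unit-norm perturbations u."""
    d = z.shape[0]
    base = loss(z)  # Loss(z^(t)) evaluated once, giving Q + 1 queries per iteration
    grad = np.zeros(d)
    for _ in range(Q):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)  # unit-norm perturbation
        grad += (d / (beta * Q)) * (loss(z + beta * u) - base) * u
    return grad

def qmo_search(loss, z0, constraints, T=100, alpha=0.05, Q=10, beta=0.1, seed=0):
    """Pseudo gradient descent on the latent embedding, collecting every
    iterate that passes all constraint checks."""
    rng = np.random.default_rng(seed)
    z = z0.astype(float).copy()
    solutions = []
    for _ in range(T):
        z = z - alpha * zo_gradient(loss, z, Q, beta, rng)  # pseudo gradient descent
        if all(check(z) for check in constraints):          # constraint verification
            solutions.append(z.copy())
    return z, solutions
```

For instance, minimizing the quadratic loss ‖z − z_target‖² with a trivially satisfied constraint drives z towards z_target using only loss-value queries.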
Procedure convergence guarantee and implementation details for QMO
Inherited from zeroth-order optimization, QMO has algorithmic convergence guarantees. Under mild conditions on the true gradient (Lipschitz continuity and bounded gradient norm), zeroth-order gradient descent following equation (3) ensures that, after T iterations, QMO is within \(O(\frac{d}{T})\) (in terms of squared gradient norm) of a local optimum in the loss landscape of a non-convex objective function^{39,40}, where T is the number of iterations and d is the latent embedding dimension. In addition to the standard zeroth-order gradient descent method, our QMO algorithm can naturally adopt different zeroth-order solvers, such as zeroth-order stochastic and accelerated gradient descent. Our implementation of gradient estimation incurs Q + 1 loss function queries per iteration. If the decoder outputs a SMILES string, we pass the string to RDKit for validity verification and disregard invalid strings.
In our QMO implementation, we use the zeroth-order version of the popular Adam optimizer^{68}, which automatically adjusts the step sizes \({\{{\alpha }_{t}\}}_{t = 1}^{T-1}\) given an initial learning rate α_{0} (further details are provided in Supplementary Section 2). Empirically, we find that Adam performs better than stochastic gradient descent in our tasks. The convergence of the zeroth-order Adam-type optimizer is described in ref. ^{69}. We specify the experimental settings, data descriptions and QMO hyperparameters for each task below. In all settings, the QMO hyperparameters were tuned to a narrow range and then all the reported combinations were tried for each starting sequence. Among all feasible solutions returned by QMO, we report the one having the best molecular score given the required constraints. The stability of QMO is studied in Supplementary Section 5.
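A minimal sketch of one zeroth-order Adam step, assuming the standard Adam defaults for β₁, β₂ and ε (the paper does not list them); in QMO, the gradient estimate fed in would be the query-based estimate from equation (3).

```python
import numpy as np

def zo_adam_step(z, grad_est, state, alpha0=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update driven by a (zeroth-order) gradient estimate.
    state = (first moment m, second moment v, step count t)."""
    m, v, t = state
    t += 1
    m = b1 * m + (1 - b1) * grad_est        # first-moment accumulator
    v = b2 * v + (1 - b2) * grad_est ** 2   # second-moment accumulator
    m_hat = m / (1 - b1 ** t)               # bias correction
    v_hat = v / (1 - b2 ** t)
    z_new = z - alpha0 * m_hat / (np.sqrt(v_hat) + eps)  # adaptive step
    return z_new, (m, v, t)
```

The moment accumulators rescale each coordinate's step, which is what makes the effective step sizes α_t adapt automatically from the initial learning rate α_0.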
Machine learning experimental settings
In our experiments, we run the QMO procedure based on the reported hyperparameter values and report the results of the best molecule found in the search process. The procedure will return null (that is, an unsuccessful search) if it fails to find a valid molecule sequence.
Benchmarks on QED and penalized logP
The pretrained encoder–decoder by Winter et al.^{46} is used, with the latent dimension d = 512. For the penalized logP optimization task, we use Q = 100, β = 10, α_{0} = 2.5, γ_{penalized logP} = 0.04 and T = 80. For the QED task, we use Q = 50, β = 10, α_{0} = 0.05, γ_{QED} = 4 and T = 20, and report the best results among 50 restarts. We find that, for the QED task, using multiple restarts can further improve the performance (Supplementary Section 5 provides a detailed discussion). For penalized logP, there is no reason to continue optimizing past 80 iterations because, as we find, penalized logP can be increased almost arbitrarily, even under similarity constraints, without making the resulting molecule more useful for drug discovery^{29}. Therefore, we set T = 80 for the penalized logP task.
Optimizing existing inhibitor molecules for SARS-CoV-2 main protease
The pretrained encoder–decoder from Winter et al.^{46} is used, with the latent dimension d = 512. The hyperparameters of QMO are Q = 10, T = 2,000, β = {10, 25}, α_{0} = {0.1, 0.05} and λ_{Tanimoto} = {1, 10}.
Optimization of AMPs for improved toxicity
The pretrained predictors for toxicity and AMP activity by Das et al.^{30} are used, with the latent dimension d = 100. The similarity between the original sequence x_{0} and the improved sequence x is computed using the global alignment function in Biopython, formally defined as g_{sim}(x∣x_{0}) = globalalignment(x, x_{0})/log(length(x_{0})), where globalalignment(x, x_{0}) is the value returned by the function pairwise2.align.globalds(x, x_{0}, matlist.blosum62, −10, −1) and log(length(x_{0})) is the log value of the sequence length of x_{0}. BLOSUM62 is the weight matrix for estimating the alignment score^{70}, and −10/−1 is the penalty for opening/continuing a gap. The QMO parameters are Q = 100, β = {1, 10}, α_{0} = {0.1, 0.05, 0.01}, λ_{sim} = 0.01 and T = 5,000. The toxicity property constraint is set as f_{tox}(x) ≤ 0.1529 and the AMP constraint as f_{AMP}(x) ≥ 0.9998. Binary classification at these thresholds gives 93.7% accuracy for toxicity prediction and 88.00% for AMP prediction on a large peptide database^{30}.
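The normalized similarity can be illustrated with a self-contained sketch. Note the simplifications: a plain match/mismatch scheme with a linear gap penalty stands in for the BLOSUM62 matrix and the affine −10/−1 gap costs that Biopython's pairwise2.align.globalds applies in the paper, and the natural logarithm is assumed.

```python
import math

def nw_score(x, y, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty
    (dynamic programming over prefix pairs, keeping one row at a time)."""
    n, m = len(x), len(y)
    prev = [j * gap for j in range(m + 1)]  # aligning the empty prefix of x
    for i in range(1, n + 1):
        cur = [i * gap]
        for j in range(1, m + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            cur.append(max(prev[j - 1] + s,    # match/substitution
                           prev[j] + gap,      # gap in y
                           cur[j - 1] + gap))  # gap in x
        prev = cur
    return prev[m]

def g_sim(x, x0):
    """Alignment score normalized by the log sequence length of the starting
    peptide: g_sim(x | x0) = globalalignment(x, x0) / log(len(x0))."""
    return nw_score(x, x0) / math.log(len(x0))
```

Under this scheme, an identical sequence scores strictly higher than an unrelated one of the same length, as expected of a similarity objective.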
Trajectory visualization
In Figs. 4 and 5, the optimization trajectory achieved by QMO is visualized by projection on two selected directions v_{x} and v_{y} originating from the starting embedding z_{0}. Specifically, in Fig. 4 we set v_{x} = z* − z_{0} and set v_{y} as a unit-norm randomly generated vector that is orthogonal to v_{x}. The 2D local grid in the embedding space is then sampled according to z_{grid}(x, y) = z_{0} + x ⋅ v_{x} + y ⋅ ∥z*∥_{2} ⋅ v_{y}, where ∥⋅∥_{2} denotes the Euclidean norm, and we sample x and y uniformly from [−0.5, 1.5] and [−2, 2], respectively. Note that, by construction, z_{grid}(0, 0) = z_{0} and z_{grid}(1, 0) = z*. We then evaluate the Tanimoto similarity and binding affinity prediction on the grid and present the results in Fig. 4. Similarly, in Fig. 5, we set v_{x} and v_{y} to be two unit-norm randomly generated vectors, and set z_{grid}(x, y) = z_{0} + x ⋅ ∥z_{0}∥_{2} ⋅ v_{x} + y ⋅ ∥z_{0}∥_{2} ⋅ v_{y}, where x and y are sampled uniformly from [−1.6, 1.6].
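The Fig. 4 grid construction can be sketched as follows (synthetic embeddings stand in for the encoder outputs; the Gram-Schmidt step realizes "a unit-norm randomly generated vector that is orthogonal to v_x"):

```python
import numpy as np

def make_grid_fn(z0, z_star, rng=None):
    """Return z_grid(x, y) = z0 + x * v_x + y * ||z*||_2 * v_y, with
    v_x = z* - z0 and v_y a random unit vector orthogonalized against v_x."""
    if rng is None:
        rng = np.random.default_rng(0)
    v_x = z_star - z0
    v_y = rng.standard_normal(z0.shape[0])
    v_y = v_y - (v_y @ v_x) / (v_x @ v_x) * v_x  # Gram-Schmidt: remove v_x component
    v_y = v_y / np.linalg.norm(v_y)              # unit norm
    scale = np.linalg.norm(z_star)               # y-axis scaling from the Methods
    def z_grid(x, y):
        return z0 + x * v_x + y * scale * v_y
    return z_grid
```

By construction z_grid(0, 0) = z_0 and z_grid(1, 0) = z*; decoding each grid point and scoring it with the property models then yields the landscape shown in Fig. 4.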
Data availability
Data for the benchmark molecule optimization tasks (QED and penalized logP) are available at https://github.com/IBM/QMO^{71}. For other enquiries contact the corresponding authors.
Code availability
Code for the benchmark molecule optimization tasks (QED and penalized logP) is available at https://github.com/IBM/QMO^{71}. For other enquiries contact the corresponding authors.
References
Bartók, A. P. et al. Machine learning unifies the modeling of materials and molecules. Sci. Adv. 3, e1701816 (2017).
Tkatchenko, A. Machine learning for chemical discovery. Nat. Commun. 11, 4125 (2020).
Button, A., Merk, D., Hiss, J. A. & Schneider, G. Automated de novo molecular design by hybrid machine intelligence and rule-driven chemical synthesis. Nat. Mach. Intell. 1, 307–315 (2019).
Kotsias, P.-C. et al. Direct steering of de novo molecular generation with descriptor conditional recurrent neural networks. Nat. Mach. Intell. 2, 254–265 (2020).
Polishchuk, P. G., Madzhidov, T. I. & Varnek, A. Estimation of the size of drug-like chemical space based on GDB-17 data. J. Comput. Aided Mol. Des. 27, 675–679 (2013).
Zhavoronkov, A. Artificial intelligence for drug discovery, biomarker development and generation of novel chemistry. Mol. Pharm. 15, 4311–4313 (2018).
Sun, W. et al. Machine learning-assisted molecular design and efficiency prediction for high-performance organic photovoltaic materials. Sci. Adv. 5, eaay4275 (2019).
Ekins, S. et al. Exploiting machine learning for end-to-end drug discovery and development. Nat. Mater. 18, 435–441 (2019).
Wu, C. et al. Analysis of therapeutic targets for SARS-CoV-2 and discovery of potential drugs by computational methods. Acta Pharm. Sin. B 10, 766–788 (2020).
Yang, J. et al. Molecular interaction and inhibition of SARS-CoV-2 binding to the ACE2 receptor. Nat. Commun. 11, 4541 (2020).
Coates, A. R., Halls, G. & Hu, Y. Novel classes of antibiotics or more of the same? Br. J. Pharm. 163, 184–194 (2011).
Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 28, 31–36 (1988).
Reutlinger, M., Rodrigues, T., Schneider, P. & Schneider, G. Multi-objective molecular de novo design by adaptive fragment prioritization. Angew. Chem. Int. Ed. 53, 4244–4248 (2014).
Yuan, Y., Pei, J. & Lai, L. LigBuilder 2: a practical de novo drug design approach. J. Chem. Inf. Model. 51, 1083–1091 (2011).
Nigam, A., Friederich, P., Krenn, M. & Aspuru-Guzik, A. Augmenting genetic algorithms with deep neural networks for exploring the chemical space. In Proc. International Conference on Learning Representations (2020).
Korovina, K. et al. ChemBO: Bayesian optimization of small organic molecules with synthesizable recommendations. In Proc. International Conference on Artificial Intelligence and Statistics 3393–3403 (PMLR, 2020).
Gómez-Bombarelli, R. et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4, 268–276 (2018).
Skalic, M., Jiménez, J., Sabbadin, D. & De Fabritiis, G. Shape-based generative modeling for de novo drug design. J. Chem. Inf. Model. 59, 1205–1214 (2019).
Griffiths, R.-R. & Hernández-Lobato, J. M. Constrained Bayesian optimization for automatic chemical design using variational autoencoders. Chem. Sci. 11, 577–586 (2020).
Jiménez-Luna, J. et al. DeltaDelta neural networks for lead optimization of small molecule potency. Chem. Sci. 10, 10911–10918 (2019).
Boitreaud, J., Mallet, V., Oliver, C. & Waldispühl, J. OptiMol: optimization of binding affinities in chemical space for drug discovery. J. Chem. Inf. Model. 60, 5658–5666 (2020).
Jin, W., Barzilay, R. & Jaakkola, T. Junction tree variational autoencoder for molecular graph generation. In Proc. International Conference on Machine Learning 2323–2332 (PMLR, 2018).
Fu, T., Xiao, C. & Sun, J. CORE: automatic molecule optimization using copy & refine strategy. In Proc. AAAI Conference on Artificial Intelligence 638–645 (AAAI, 2020).
Winter, R. et al. Efficient multi-objective molecular optimization in a continuous latent space. Chem. Sci. 10, 8016–8024 (2019).
Olivecrona, M., Blaschke, T., Engkvist, O. & Chen, H. Molecular de-novo design through deep reinforcement learning. J. Cheminform. 9, 48 (2017).
Guimaraes, G. L., Sanchez-Lengeling, B., Outeiral, C., Farias, P. L. C. & Aspuru-Guzik, A. Objective-reinforced generative adversarial networks (ORGAN) for sequence generation models. Preprint at https://arxiv.org/abs/1705.10843 (2017).
Sanchez-Lengeling, B., Outeiral, C., Guimaraes, G. L. & Aspuru-Guzik, A. Optimizing distributions over molecular space. An objective-reinforced generative adversarial network for inverse-design chemistry (ORGANIC). Preprint at https://doi.org/10.26434/chemrxiv.5309668.v3 (2017).
You, J., Liu, B., Ying, Z., Pande, V. & Leskovec, J. Graph convolutional policy network for goal-directed molecular graph generation. In Advances in Neural Information Processing Systems 6410–6421 (NIPS, 2018).
Zhou, Z., Kearnes, S., Li, L., Zare, R. N. & Riley, P. Optimization of molecules via deep reinforcement learning. Sci. Rep. 9, 10752 (2019).
Das, P. et al. Accelerated antimicrobial discovery via deep generative models and molecular dynamics simulations. Nat. Biomed. Eng. 5, 613–623 (2021).
Griffen, E., Leach, A. G., Robb, G. R. & Warner, D. J. Matched molecular pairs as a medicinal chemistry tool: miniperspective. J. Med. Chem. 54, 7739–7750 (2011).
Dossetter, A. G., Griffen, E. J. & Leach, A. G. Matched molecular pair analysis in drug discovery. Drug Discov. Today 18, 724–731 (2013).
Dalke, A., Hert, J. & Kramer, C. mmpdb: an open-source matched molecular pair platform for large multi-property data sets. J. Chem. Inf. Model. 58, 902–910 (2018).
Bahdanau, D., Cho, K. & Bengio, Y. Neural machine translation by jointly learning to align and translate. In Proc. International Conference on Learning Representations (2015).
Jin, W., Yang, K., Barzilay, R. & Jaakkola, T. Learning multimodal graph-to-graph translation for molecule optimization. In Proc. International Conference on Learning Representations (2019).
Yang, K., Jin, W., Swanson, K., Barzilay, R. & Jaakkola, T. Improving molecular design by stochastic iterative target augmentation. In Proc. International Conference on Machine Learning 10716–10726 (PMLR, 2020).
Jin, W., Barzilay, R. & Jaakkola, T. Hierarchical graph-to-graph translation for molecules. Preprint at https://arxiv.org/abs/1907.11223 (2019).
Maragakis, P., Nisonoff, H., Cole, B. & Shaw, D. E. A deep-learning view of chemical space designed to facilitate drug discovery. J. Chem. Inf. Model. 60, 4487–4496 (2020).
Ghadimi, S. & Lan, G. Stochastic first- and zeroth-order methods for nonconvex stochastic programming. SIAM J. Optim. 23, 2341–2368 (2013).
Liu, S. et al. A primer on zeroth-order optimization in signal processing and machine learning. In IEEE Signal Processing Magazine 43–54 (IEEE, 2020).
Bickerton, G. R., Paolini, G. V., Besnard, J., Muresan, S. & Hopkins, A. L. Quantifying the chemical beauty of drugs. Nat. Chem. 4, 90–98 (2012).
Coley, C. W., Eyke, N. S. & Jensen, K. F. Autonomous discovery in the chemical sciences. Part II: outlook. Angew. Chem. Int. Ed. 59, 23414–23436 (2019).
Qin, Z. et al. Artificial intelligence method to design and fold alpha-helical structural proteins from the primary amino acid sequence. Extreme Mech. Lett. 36, 100652 (2020).
Bohacek, R. S., McMartin, C. & Guida, W. C. The art and practice of structure-based drug design: a molecular modeling perspective. Med. Res. Rev. 16, 3–50 (1996).
Reymond, J.-L., Ruddigkeit, L., Blum, L. & van Deursen, R. The enumeration of chemical space. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2, 717–733 (2012).
Winter, R., Montanari, F., Noé, F. & Clevert, D.-A. Learning continuous and data-driven molecular descriptors by translating equivalent chemical representations. Chem. Sci. 10, 1692–1701 (2019).
Jang, E., Gu, S. & Poole, B. Categorical reparameterization with Gumbel-Softmax. In Proc. International Conference on Learning Representations (2017).
Rogers, D. & Hahn, M. Extended-connectivity fingerprints. J. Chem. Inf. Model. 50, 742–754 (2010).
Polykovskiy, D. et al. Molecular sets (MOSES): a benchmarking platform for molecular generation models. Front. Pharmacol. 11, 1931 (2020).
Sterling, T. & Irwin, J. J. ZINC 15: ligand discovery for everyone. J. Chem. Inf. Model. 55, 2324–2337 (2015).
Brown, N., Fiscato, M., Segler, M. H. & Vaucher, A. C. GuacaMol: benchmarking models for de novo molecular design. J. Chem. Inf. Model. 59, 1096–1108 (2019).
Renz, P., Van Rompaey, D., Wegner, J. K., Hochreiter, S. & Klambauer, G. On failure modes in molecule generation and optimization. Drug Discov. Today Technol. 32–33, 55–63 (2020).
Chenthamarakshan, V. et al. CogMol: target-specific and selective drug design for COVID-19 using deep generative models. Adv. Neural Inf. Process. Syst. 33, 4320–4332 (2020).
Jin, Z. et al. Structure of Mpro from SARS-CoV-2 and discovery of its inhibitors. Nature 582, 289–293 (2020).
Huynh, T., Wang, H. & Luan, B. In silico exploration of the molecular mechanism of clinically oriented drugs for possibly inhibiting SARS-CoV-2’s main protease. J. Phys. Chem. Lett. 11, 4413–4420 (2020).
Cournia, Z., Allen, B. & Sherman, W. Relative binding free energy calculations in drug discovery: recent advances and practical considerations. J. Chem. Inf. Model. 57, 2911–2937 (2017).
Trott, O. & Olson, A. J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 31, 455–461 (2010).
Wang, Z. et al. farPPI: a webserver for accurate prediction of protein–ligand binding structures for small-molecule PPI inhibitors by MM/PB(GB)SA methods. Bioinformatics 35, 1777–1779 (2019).
Tallorin, L. et al. Discovering de novo peptide substrates for enzymes using machine learning. Nat. Commun. 9, 5253 (2018).
Porto, W. F. et al. In silico optimization of a guava antimicrobial peptide enables combinatorial exploration for peptide design. Nat. Commun. 9, 1490 (2018).
Singh, S. et al. SATPdb: a database of structurally annotated therapeutic peptides. Nucleic Acids Res. 44, D1119–D1126 (2015).
Pirtskhalava, M. et al. DBAASP v.2: an enhanced database of structure and antimicrobial/cytotoxic activity of natural and synthetic peptides. Nucleic Acids Res. 44, D1104–D1112 (2016).
Hawrani, A., Howe, R. A., Walsh, T. R. & Dempsey, C. E. Origin of low mammalian cell toxicity in a class of highly active antimicrobial amphipathic helical peptides. J. Biol. Chem. 283, 18636–18645 (2008).
Sun, C. et al. Characterization of the bioactivity and mechanism of bactenecin derivatives against food-pathogens. Front. Microbiol. 10, 2593 (2019).
Timmons, P. B. & Hewage, C. M. HAPPENN is a novel tool for hemolytic activity prediction for therapeutic peptides which employs neural networks. Sci. Rep. 10, 10869 (2020).
Hasan, M. M. et al. HLPpred-Fuse: improved and robust prediction of hemolytic peptide and its activity by fusing multiple feature representation. Bioinformatics 36, 3350–3356 (2020).
Xiao, X., Wang, P., Lin, W.-Z., Jia, J.-H. & Chou, K.-C. iAMP-2L: a two-level multi-label classifier for identifying antimicrobial peptides and their functional types. Anal. Biochem. 436, 168–177 (2013).
Kingma, D. & Ba, J. Adam: a method for stochastic optimization. In Proc. International Conference on Learning Representations (2015).
Chen, X. et al. ZO-AdaMM: zeroth-order adaptive momentum method for black-box optimization. In Advances in Neural Information Processing Systems 7202–7213 (NIPS, 2019).
Henikoff, S. & Henikoff, J. G. Amino acid substitution matrices from protein blocks. Proc. Natl Acad. Sci. USA 89, 10915–10919 (1992).
Hoffman, S. & Martinelli, S. IBM/QMO: v1. Zenodo https://doi.org/10.5281/zenodo.5562908 (2021).
Acknowledgements
We thank T. Hou and Z. Wang for help with binding free-energy calculations using the FarPPI webserver. We also thank B. Sharma for help with improving the presentation of the system plot (Fig. 1).
Author information
Affiliations
Contributions
S.C.H. and V.C. contributed to the SARS-CoV-2, QED and penalized logP optimization experiments. K.W. contributed to the AMP experiment. S.C.H. and P.D. contributed to the docking simulations. P.-Y.C. contributed to QMO methodology design and property landscape analysis. V.C. contributed to common substructure analysis. All authors conceived and designed the research, analysed the results and contributed to writing the manuscript.
Corresponding authors
Ethics declarations
Competing interests
The authors have submitted a patent application on the method and system for querybased molecule optimization (US patent application number 17/016640).
Peer review information
Nature Machine Intelligence thanks Amir Barati Farimani and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1
Performance of the drug-likeness (QED) task with Tanimoto similarity constraint δ = 0.4.
Extended Data Fig. 2
Performance of the penalized logP task at various Tanimoto similarity constraint values δ.
Extended Data Fig. 3
Cumulative success rate of antimicrobial peptide (AMP) sequence optimization vs. iterations using QMO.
Extended Data Fig. 4 AMP and toxicity analysis.
Reported accuracy, prediction rate and property improvement for 109 pairs of starting and QMO-optimized sequences based on different AMP (a) and toxicity (b) classifiers. The 109 starting sequences are experimentally verified toxic AMPs and are correctly predicted by the AMP and toxicity classifiers used in QMO. The external classifiers have varying prediction accuracy as they may yield incorrect predictions on some of the starting sequences. The prediction rate on QMO-optimized sequences is defined as the fraction of AMP and/or toxin predictions. About 56.88% of QMO-optimized sequences are predicted as non-toxic AMPs by iAMP-2L + HAPPENN, showing high agreement with the classifiers used in QMO. The complete results are reported in Supplementary Table 7.
Supplementary information
Supplementary Information
Supplementary Figs. 1–10, Discussion and Tables 1–17.
Rights and permissions
About this article
Cite this article
Hoffman, S. C., Chenthamarakshan, V., Wadhawan, K. et al. Optimizing molecules using efficient queries from property evaluations. Nat. Mach. Intell. 4, 21–31 (2022). https://doi.org/10.1038/s42256-021-00422-y
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1038/s42256-021-00422-y