Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rdkit coordination "->" after SMILES sanitization creates invalid reaction #50

Open
frederik-sandfort1 opened this issue May 24, 2024 · 0 comments

Comments

@frederik-sandfort1
Copy link

frederik-sandfort1 commented May 24, 2024

RDKit changed it's behavior to display coordination of metal ions (e.g. carboxylate cations). As this includes a "->", it is incompatible with rxnmappers splitting of the reaction at ">" (see here).

Example for Reaction sanitization:

from rdkit.Chem import AllChem
smi = "COC(=O)CCBr.O=C([O-][K+])c1ccccc1>>COC(=O)CCOC(=O)Cc1ccccc1"
rxn = AllChem.ReactionFromSmarts(smi, useSmiles=True)
AllChem.SanitizeRxnAsMols(rxn)
print(AllChem.ReactionToSmiles(rxn))
>>> "COC(=O)CCBr.O=C([O-]->[K+])c1ccccc1>>COC(=O)CCOC(=O)Cc1ccccc1"

Example for molecule sanitization:

from rdkit import Chem
smi = "COC(=O)CCBr.O=C([O-][K+])c1ccccc1>>COC(=O)CCOC(=O)Cc1ccccc1"
components = smi.split(">")
[Chem.MolToSmiles(Chem.MolFromSmiles(component)) for component in components]
>>> ['COC(=O)CCBr.O=C([O-]->[K+])c1ccccc1', '', 'COC(=O)CCOC(=O)Cc1ccccc1']

While likely the "->" is not in the vocab (and thus not compatible with the current model version), you could just check for the sign and replace it with an empty string, to stay compatible with the sanitization.
Additionally, it could make sense to use rdkits rxn.GetReactants(), rxn.GetAgents() and rxn.GetProducts() function instead of the string splitting.

Just wanted to make you aware of this issue.
Best
Frederik

@avaucher

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant