Skip to content

Can't construct surface rmgpy.molecule.Molecule() #2603

@sevyharris

Description

@sevyharris

Bug Description

I'm trying to create a rmgpy.molecule.Molecule() using smiles

rmgpy.molecule.Molecule(smiles='O=C=[X]')

But it fails from an RDKit error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_18284/1886713532.py in <module>
----> 1 rmgpy.molecule.Molecule(smiles='O=C=[X]')
~/rmg/RMG-Py/rmgpy/molecule/molecule.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.molecule.Molecule.__init__()
~/rmg/RMG-Py/rmgpy/molecule/molecule.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.molecule.Molecule.from_smiles()
~/rmg/RMG-Py/rmgpy/molecule/translator.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.translator.from_smiles()
~/rmg/RMG-Py/rmgpy/molecule/translator.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.translator.from_smiles()
~/rmg/RMG-Py/rmgpy/molecule/translator.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.translator._read()
~/rmg/RMG-Py/rmgpy/molecule/translator.cpython-37m-x86_64-linux-gnu.so in rmgpy.molecule.translator._rdkit_translator()
ValueError: Could not interpret the identifier 'O=C=[X]'

Possible Solution

I'm wondering if we can get around RDKit's limitations with the following procedure:

  • replace the surface sites X with a nonreactive species like Ar
  • generate that molecule's adjacency list
  • replace Ar with X and delete extra pairs of electrons on the X

It would look something like adding this to the _rdkit_translator() function in RMG-Py/rmgpy/molecule/translator.py:

from rdkit import Chem
import rmgpy.molecule

COX_smiles = 'O=C=[X]'

rdkitmol = Chem.MolFromSmiles(COX_smiles.replace('X', 'Ar'))
mol = rmgpy.molecule.molecule.Molecule()
output = rmgpy.molecule.converter.from_rdkit_mol(mol, rdkitmol)

lines = output.to_adjacency_list().split('\n')

for i, line in enumerate(lines):
    if 'Ar' in line:
        lines[i] = lines[i].replace('Ar', 'X')
        # remove any extra electron pairs...
        lines[i] = lines[i].replace('p3', 'p0')
        lines[i] = lines[i].replace('p2', 'p0')
        lines[i] = lines[i].replace('p1', 'p0')
adj_list = '\n'.join(lines)
COX = rmgpy.molecule.molecule.Molecule().from_adjacency_list(adj_list)

print(COX.smiles)
print(COX.to_adjacency_list())

Other Possible Solution

One workaround is to just instantiate the molecule using the adjacency list. But sometimes that's annoying- especially since we can use smiles for gas phase molecules without any issue. At the very least, a more descriptive error message is probably required, telling the user that you have to use the adjacency list.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions