Discussion:
[Rdkit-discuss] reaction fingerprint as bitstring
Ambrish
2017-03-23 17:10:24 UTC
Hi RDKitters,

I am trying to calculate reaction fingerprints and store them in a database.
The transformation fingerprint created by the routine below is an
IntSparseIntVect, and I would like to convert it to a BitString of a
particular length. How can I do that?

from rdkit import Chem
from rdkit.Chem import AllChem

def create_transformation_FP(rxn, fp_size, fp_type):
    rkfp = None
    rfp = None
    pfp = None
    # Sum the fingerprints of all reactant templates
    for react in range(rxn.GetNumReactantTemplates()):
        mol = rxn.GetReactantTemplate(react)
        mol.UpdatePropertyCache(strict=False)
        Chem.GetSSSR(mol)
        fp = None  # reset so a failure cannot reuse the previous fingerprint
        try:
            if fp_type == 'AP':
                fp = AllChem.GetAtomPairFingerprint(mol=mol, maxLength=fp_size)
            elif fp_type == 'Morgan':
                fp = AllChem.GetMorganFingerprint(mol=mol, radius=fp_size)
            elif fp_type == 'Topological':
                fp = AllChem.GetTopologicalTorsionFingerprint(mol=mol)
            else:
                print("Unsupported fingerprint type")
        except Exception:
            print("Cannot build reactant fingerprint")
        if fp is None:
            continue
        if rfp is None:
            rfp = fp
        else:
            rfp += fp

    # Sum the fingerprints of all product templates
    for product in range(rxn.GetNumProductTemplates()):
        mol = rxn.GetProductTemplate(product)
        mol.UpdatePropertyCache(strict=False)
        Chem.GetSSSR(mol)
        fp = None
        try:
            if fp_type == 'AP':
                fp = AllChem.GetAtomPairFingerprint(mol=mol, maxLength=fp_size)
            elif fp_type == 'Morgan':
                fp = AllChem.GetMorganFingerprint(mol=mol, radius=fp_size)
            elif fp_type == 'Topological':
                fp = AllChem.GetTopologicalTorsionFingerprint(mol=mol)
            else:
                print("Unsupported fingerprint type")
        except Exception:
            print("Cannot build product fingerprint")
        if fp is None:
            continue
        if pfp is None:
            pfp = fp
        else:
            pfp += fp

    # Transformation fingerprint: products minus reactants
    if pfp is not None and rfp is not None:
        rkfp = pfp - rfp

    return rkfp

Thanks.
Greg Landrum
2017-03-24 07:07:56 UTC
Hi Ambrish,

Assuming that I understand correctly what you want to do, here's an example
using built-in RDKit functionality that generates a reaction fingerprint
(using default parameters; you can change these) and then converts it into
a bit vector using a simple rule: "if the bit is set in the original
fingerprint, set it in the bit vector":

In [3]: from rdkit.Chem import rdChemReactions

In [4]: fp = rdChemReactions.CreateDifferenceFingerprintForReaction(rxn)

In [5]: fp
Out[5]: <rdkit.DataStructs.cDataStructs.UIntSparseIntVect at 0x10bb4ff30>

In [6]: from rdkit import DataStructs

In [7]: ebv = DataStructs.ExplicitBitVect(2048)

In [8]: for bit in fp:
   ...:     ebv.SetBit(bit%ebv.GetNumBits())
   ...:

In [9]: ebv.GetNumOnBits()
Out[9]: 5


I don't think this is the best strategy since it treats positive and
negative values the same, but without more information on what you want to
do it's the best I can do.

Best,
-greg



_______________________________________________
Rdkit-discuss mailing list
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Ambrish
2017-03-24 13:29:07 UTC
Thanks Greg.
I am trying to pre-calculate reaction fingerprints for all the reactions in
my database and store them there, so that for any new reaction I can run a
Tanimoto similarity (or similar) calculation and pick out similar reactions.
That is why I decided to convert the fingerprint to a BitString of fixed
length, but I take your point that I am losing information this way. Any
suggestions on how this could be done?
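
One possible way to keep the sign information when folding to a fixed-length bit string is sketched below. It is not something proposed in this thread: the idea (an assumption) is to send positive counts to the lower half of the bit string and negative counts to the upper half, so that features gained and features lost no longer collide. The sketch is plain Python and works on a dict of index -> signed count, which is the shape returned by GetNonzeroElements() on an RDKit sparse int vector. The function names fold_signed, to_bitstring and tanimoto are hypothetical, and the example indices and counts are made up.

```python
def fold_signed(nonzero, n_bits=2048):
    """Fold a signed sparse count vector into a set of on-bit positions.

    `nonzero` maps feature index -> signed count. Positive counts go to
    the lower half of the bit string, negative counts to the upper half,
    so the sign survives the folding. Count magnitudes are still lost.
    """
    if n_bits % 2:
        raise ValueError("n_bits must be even")
    half = n_bits // 2
    on_bits = set()
    for index, count in nonzero.items():
        if count > 0:
            on_bits.add(index % half)
        elif count < 0:
            on_bits.add(half + index % half)
    return on_bits


def to_bitstring(on_bits, n_bits=2048):
    """Render on-bit positions as a fixed-length '0'/'1' string for storage."""
    bits = ["0"] * n_bits
    for pos in on_bits:
        bits[pos] = "1"
    return "".join(bits)


def tanimoto(a, b):
    """Tanimoto similarity of two sets of on-bit positions."""
    if not a and not b:
        return 0.0
    common = len(a & b)
    return common / (len(a) + len(b) - common)


# Two made-up difference fingerprints (hypothetical indices and counts).
rxn_a = {17: 2, 4100: -1, 90210: 1}
rxn_b = {17: 1, 4100: 1, 555: -3}

# A tiny 16-bit string is used here only to keep the output readable.
bits_a = fold_signed(rxn_a, n_bits=16)
bits_b = fold_signed(rxn_b, n_bits=16)
print(to_bitstring(bits_a, 16))
print(to_bitstring(bits_b, 16))
print(tanimoto(bits_a, bits_b))
```

With RDKit, the input dict would come from fp.GetNonzeroElements(), and the resulting positions could be set on a DataStructs.ExplicitBitVect with SetBit() and compared with DataStructs.TanimotoSimilarity() instead of the helper above. Note that feature 4100 has opposite signs in the two example reactions and therefore lands on different bits, which is exactly the distinction that unsigned folding would lose.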