Calculate molecular weight #5

Open
woodthom2 opened this issue Aug 7, 2024 · 3 comments

@woodthom2
Member

We already have molecular structure in .mol format provided by Drugbank data.

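from drug_named_entity_recognition.drugs_finder import find_drugs

# Tokenise on spaces and ask for the molecular structure of each match.
drugs = find_drugs("i bought some Bivalirudin".split(" "), is_include_structure=True)

# Plain asserts so the snippet also runs outside a unittest.TestCase.
assert len(drugs) == 1
assert drugs[0][0]['name'] == "Bivalirudin"
# The structure is a .mol string, so it contains atom lines such as "0.0000 C".
assert "0.0000 C" in drugs[0][0]['structure_mol']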

Can we convert to SMILES on the fly? Can we calculate molecular weight on the fly?

Ideally can you do this without adding anything more to requirements.txt? There are some chemistry libraries but they can be quite heavy.
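
One dependency-free route is to sum atomic weights straight from the .mol string. The following is only a minimal sketch, not the library's method. It assumes the string is a V2000 MOL block with the usual fixed-width columns, that bonds are written as explicit single, double, or triple bonds, that all atoms are neutral, and that hydrogens not listed in the file can be guessed from a small valence table. The function name `molecular_weight_from_mol` and both lookup tables are hypothetical and cover only a handful of elements.

```python
# Standard atomic weights in g/mol (rounded); extend for other elements.
ATOMIC_WEIGHTS = {
    "H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "F": 18.998,
    "P": 30.974, "S": 32.06, "Cl": 35.45, "Br": 79.904, "I": 126.904,
}

# Assumed neutral valences, used only to guess implicit hydrogens.
VALENCES = {
    "C": (4,), "N": (3,), "O": (2,), "S": (2, 4, 6), "P": (3, 5),
    "F": (1,), "Cl": (1,), "Br": (1,), "I": (1,),
}


def molecular_weight_from_mol(mol_block):
    """Estimate molecular weight from a V2000 MOL block (hypothetical helper)."""
    lines = mol_block.splitlines()
    # Line 4 is the counts line: atom count in columns 1-3, bond count in 4-6.
    n_atoms, n_bonds = int(lines[3][0:3]), int(lines[3][3:6])
    # The element symbol sits in columns 32-34 of each atom line.
    symbols = [line[31:34].strip() for line in lines[4:4 + n_atoms]]

    # Sum the bond orders attached to each atom.
    bond_order_sum = [0] * n_atoms
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        first, second, order = int(line[0:3]), int(line[3:6]), int(line[6:9])
        if order not in (1, 2, 3):
            raise ValueError(f"unsupported bond type {order}")
        bond_order_sum[first - 1] += order
        bond_order_sum[second - 1] += order

    weight = 0.0
    for symbol, used in zip(symbols, bond_order_sum):
        weight += ATOMIC_WEIGHTS[symbol]
        # Fill the smallest valence that fits with implicit hydrogens.
        for valence in VALENCES.get(symbol, ()):
            if valence >= used:
                weight += (valence - used) * ATOMIC_WEIGHTS["H"]
                break
    return weight
```

With the snippet above, the input would be `drugs[0][0]['structure_mol']`. Charged atoms, aromatic bond types, and elements missing from the tables are not handled, so results should be checked against a trusted source before relying on them.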

@abdullahwaqar
Member

Hey @woodthom2, wanted to check if it is still open for contributions. If so, I would like to contribute.

@woodthom2
Member Author

Hi @abdullahwaqar yes this is still open! Can you see a way to add molecular weight, or a data source which will give us the molecular weight?
Here's an example for one single drug:
https://www.opnme.com/molecules/khk-inhibitor-bi-9787

There we have Weight: 489.6 Da, as well as properties such as tmax and Cmax. I don't know whether such a database exists. It's possible that DrugBank gives us the data, but we need to check the licences.

Also, as you can see in the link that I pasted, there is a nice moving 3D image of the drug. The positions of the atoms are now returned by the library in a string format (you can see my example for paracetamol here: https://fastdatascience.com/ai-in-pharma/drug-named-entity-recognition-update-2/#molecular-structures ). It would be nice to have a Jupyter notebook (or Colab notebook) or an in-browser example that renders this molecular structure, either as a static image or as a dynamic view of some kind. I definitely do not want to add any graphics libraries as dependencies to the project, but having this as an example would be great.
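
A static rendering along these lines could be done without any graphics library by writing SVG by hand. The sketch below is one possible approach, not an existing feature of the library. It assumes the structure string is a V2000 MOL block with fixed-width columns, uses only the x and y coordinates (z is ignored), and draws every bond as a single line regardless of bond order. The function name `mol_block_to_svg` is hypothetical.

```python
def mol_block_to_svg(mol_block, size=300, pad=30):
    """Draw a V2000 MOL block as a flat SVG string (hypothetical helper)."""
    lines = mol_block.splitlines()
    # Line 4 is the counts line: atom count in columns 1-3, bond count in 4-6.
    n_atoms, n_bonds = int(lines[3][0:3]), int(lines[3][3:6])
    # Atom lines hold x and y in two 10-character columns; symbol in columns 32-34.
    atoms = [(float(l[0:10]), float(l[10:20]), l[31:34].strip())
             for l in lines[4:4 + n_atoms]]
    # Bond lines reference atoms by 1-based index; convert to 0-based.
    bonds = [(int(l[0:3]) - 1, int(l[3:6]) - 1)
             for l in lines[4 + n_atoms:4 + n_atoms + n_bonds]]

    xs = [a[0] for a in atoms]
    ys = [a[1] for a in atoms]
    min_x, min_y = min(xs), min(ys)
    # Scale the larger extent to the drawable area; avoid dividing by zero.
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    scale = (size - 2 * pad) / span

    def project(x, y):
        # SVG's y axis points down, so flip it.
        return pad + (x - min_x) * scale, size - pad - (y - min_y) * scale

    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">']
    for i, j in bonds:
        x1, y1 = project(atoms[i][0], atoms[i][1])
        x2, y2 = project(atoms[j][0], atoms[j][1])
        parts.append(f'<line x1="{x1:.1f}" y1="{y1:.1f}" '
                     f'x2="{x2:.1f}" y2="{y2:.1f}" stroke="black"/>')
    for x, y, symbol in atoms:
        if symbol == "C":
            continue  # carbons are left unlabelled, as in skeletal formulas
        px, py = project(x, y)
        parts.append(f'<circle cx="{px:.1f}" cy="{py:.1f}" r="9" fill="white"/>')
        parts.append(f'<text x="{px:.1f}" y="{py + 5:.1f}" text-anchor="middle" '
                     f'font-size="14">{symbol}</text>')
    parts.append("</svg>")
    return "\n".join(parts)
```

In a Jupyter or Colab notebook the returned string can be shown with `IPython.display.SVG(data=svg_string)`, which keeps the drawing code in the example notebook rather than in the project's requirements.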

The molecular structure data that we have could also be a shortcut to getting the molecular weight of a drug.

Thanks!

@woodthom2
Member Author

woodthom2 commented Oct 20, 2024

https://pubchem.ncbi.nlm.nih.gov/docs/downloads#section=Individual-Record-Download <- this is a good source for molecular weight which appears to be allowed for our use. I cannot take the molecular weights from Drugbank because it is not allowed under the license.

If we use PubChem we can also take the SMILES value, which will be useful, e.g.

CN1C2=C(C=C(C=C2)C(=O)N(CCC(=O)O)C3=CC=CC=N3)N=C1CNC4=CC=C(C=C4)C(=N)N
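
Besides the record downloads linked above, PubChem also offers a REST interface (PUG REST) that can return selected properties by compound name. The sketch below uses that interface rather than the download page, and relies only on the standard library. The helper names are hypothetical, and the response layout (`PropertyTable` containing a `Properties` list) is an assumption to verify against PubChem's documentation. PubChem has also revised its SMILES property names over time, so the `CanonicalSMILES` default may need adjusting.

```python
import json
import urllib.parse
import urllib.request

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_property_url(name, properties=("MolecularWeight", "CanonicalSMILES")):
    """Build a PUG REST URL for a compound name (hypothetical helper)."""
    quoted = urllib.parse.quote(name)
    return f"{PUG_REST}/compound/name/{quoted}/property/{','.join(properties)}/JSON"


def first_record(payload):
    """Return the first property record; the layout is an assumption."""
    return payload["PropertyTable"]["Properties"][0]


def fetch_properties(name):
    """Query PubChem over the network and return the first matching record."""
    with urllib.request.urlopen(build_property_url(name), timeout=30) as response:
        return first_record(json.load(response))
```

Calling `fetch_properties("Bivalirudin")` needs network access at run time. If that is undesirable, the same lookup could be run once offline and the results stored alongside the existing drug data.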
