Posts Tagged ‘cheminformatics’

I met with Jean-Claude Bradley yesterday and we had a pretty useful hack session, which now lets him easily incorporate chemical and cheminformatics functionality into a Google Docs spreadsheet.

A common task that Jean-Claude wanted to automate was calculating the milligrams (or milliliters) of a chemical required for a given molarity. This calculation needs the compound name, the desired molarity, the molecular weight and the density. Importantly, the people who’d like to use this will provide compound names rather than a directly parseable SMILES string, so we’d also like to (optionally) look up the SMILES. Finally, he wanted to be able to do this in a Google spreadsheet – rather than on a specific web page or in a standalone program.

It turns out that with a liberal helping of Python, a dash of ChemSpider and a pinch of PubChem, all of this can be done in a half-hour hack session.
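To make the arithmetic concrete, here is a minimal sketch of the calculation itself. It is not the code from the hack session: the function name and parameters are made up for illustration, and it assumes you also know the volume of solution you want to make up (molarity alone isn't enough to get a mass). The name-to-SMILES and molecular weight lookups via ChemSpider and PubChem are left out entirely.

```python
def amount_needed(molarity_mol_per_l, solution_volume_ml,
                  mol_weight_g_per_mol, density_g_per_ml=None):
    """Return (mass in mg, liquid volume in mL or None).

    Hypothetical helper for illustration only.
    solution_volume_ml is an assumed input: the final volume of
    solution being prepared at the desired molarity.
    """
    # moles = molarity (mol/L) * volume (L)
    moles = molarity_mol_per_l * (solution_volume_ml / 1000.0)
    # mass (g) = moles * molecular weight (g/mol)
    mass_g = moles * mol_weight_g_per_mol
    mass_mg = mass_g * 1000.0
    # For liquids, convert mass to volume using the density (g/mL)
    liquid_ml = mass_g / density_g_per_ml if density_g_per_ml else None
    return mass_mg, liquid_ml


# Made-up numbers: 10 mL of a 0.5 M solution of a compound with
# MW 100 g/mol and density 0.8 g/mL
mass_mg, liquid_ml = amount_needed(0.5, 10.0, 100.0, 0.8)
print(mass_mg, liquid_ml)
```

With those made-up inputs the answer works out to 500 mg, or 0.625 mL if the compound is measured out as a liquid. In a spreadsheet, the same two multiplications and one division would sit in cells next to the looked-up molecular weight and density.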


Read Full Post »

Pub3D is a 3D version of PubChem, in which we have generated a single conformer for 99% of PubChem using the smi23d suite of programs. The structures are then stored in a PostgreSQL database along with their distance moment shape descriptors, as described by Ballester and Richards. This allows us to perform shape similarity queries against a user-supplied 3D structure. By partitioning the database (thanks to the CGL folks at IU) and using a spatial index, performance is quite snappy. (I had briefly mentioned this in a presentation at the ACS meeting last spring.)
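For readers who haven't seen distance moment descriptors before, here is a rough sketch of the idea in plain Python. It is not the Pub3D code. It assumes the usual recipe: four reference points (the centroid, the atom closest to it, the atom farthest from it, and the atom farthest from that one) and three moments of the atom distances to each, giving 12 numbers per structure. The exact moment definitions and the similarity score used here are assumptions for illustration, and all function names are made up.

```python
import math


def _moments(values):
    """Mean, variance and third central moment of a list of distances."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    return [mean, var, third]


def shape_descriptor(coords):
    """12-element descriptor for a list of (x, y, z) atom coordinates."""
    n = len(coords)
    centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))
    d_centroid = [math.dist(c, centroid) for c in coords]
    closest = coords[d_centroid.index(min(d_centroid))]
    farthest = coords[d_centroid.index(max(d_centroid))]
    d_farthest = [math.dist(c, farthest) for c in coords]
    far_from_farthest = coords[d_farthest.index(max(d_farthest))]

    descriptor = []
    for ref in (centroid, closest, farthest, far_from_farthest):
        descriptor.extend(_moments([math.dist(c, ref) for c in coords]))
    return descriptor


def shape_similarity(d1, d2):
    """Score in (0, 1]; 1.0 means the two descriptors are identical."""
    mean_abs_diff = sum(abs(a - b) for a, b in zip(d1, d2)) / len(d1)
    return 1.0 / (1.0 + mean_abs_diff)


# Made-up coordinates, just to exercise the functions
query = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.0, 0.0),
         (0.2, 0.3, 2.0), (-1.0, -0.5, 0.3)]
rod = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
       (3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(shape_similarity(shape_descriptor(query), shape_descriptor(rod)))
```

Since each structure collapses to a short fixed-length vector, a shape search becomes a nearest-neighbour lookup in 12 dimensions, which is the kind of query a spatial index can speed up.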

The database had been down for some time, so today I got it back up and running and AJAX’ified the interface, to make it look a little nicer.  jQuery rocks! (OK, the color scheme sucks)

There are obvious drawbacks to the current database – single conformer shape search is not very rigorous, especially since the stored structures are not necessarily the minimum energy conformers. However, we have started generating multiple conformers, so hopefully we’ll address this issue in time. The bigger issue is how this approach to shape similarity compares to other well-known approaches such as ROCS. Clearly, a shape descriptor approach is lower resolution than a volumetric approach such as ROCS, so in that sense the results are ‘rougher’. However, visual inspection of some searches seems to indicate that it isn’t too bad. The paper describing these shape descriptors didn’t do a rigorous comparison – that’s on our TODO list.

OK, the fun part (a.k.a. coding) is done for now – got to get back to the paper.

Read Full Post »

I’m in academia and I do cheminformatics. Recent collaborations, papers and funding issues in this field have made me think about the future of this kind of research in an academic setting. This, and a thread discussing David Leahy’s talk on InkSpot Science at the Soton Open Science Workshop, got me started on this post.

There are currently a number of groups and collaborations that are attempting to perform drug discovery without the large centralized infrastructure that is characteristic of this process. Examples include Jean-Claude Bradley, who runs the UsefulChem project, the Synaptic Leap, and various academic labs. Also see Kozikowski et al.

Cheminformatics plays a key role in drug discovery efforts at various stages: for example, identifying or prioritizing compounds from virtual libraries, predicting ADME profiles and side effects (e.g., hERG inhibition) and so on. I should stress that such computational methods don’t replace bench work – but they can certainly enhance it. More generally, we’re now faced with a deluge of data – and human eyeballs are not going to be able to handle it. And this is exactly where cheminformatics does its stuff.


Read Full Post »