Posts Tagged ‘postgres’
Getting a CAS Number from a PubChem CID
Posted in software, tagged cas, postgres, pubchem, python on December 12, 2008 | 1 Comment »
A few days back, Hari on FriendFeed asked how one could get a CAS number from a PubChem compound ID (CID). The reverse, that is, finding a CID for a given CAS number, is generally quite easy, as shown by Rich here and here. Since I was trying to get some writing done, [...]
Brute Force – Inelegant, But Sometimes Useful
Posted in research, software, tagged benchmark, database, nearest neighbor, performance, postgres, similarity, spatial index on November 20, 2008 | 1 Comment »
A few days back I posted on improving query times in Pub3D by going from a monolithic database (17M rows) to a partitioned version (~3M rows in each of 6 separate databases) and then performing queries in parallel. I also noted that we were improving query times by making use of an R-tree spatial index.
Andrew Dalke [...]
Multi-threaded Database Access with Python
Posted in software, tagged database, parallel, performance, postgres, python, threads on November 14, 2008 | 8 Comments »
Pub3D contains about 17.3 million 3D structures for PubChem compounds, stored in a Postgres database. One of the things we wanted to do was 3D similarity searching, and to achieve that we’ve been employing the Ballester and Graham-Richards method. In this post I’m going to talk about performance – how we went from a single [...]
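As a rough illustration of the idea of querying several partitions at once from Python, here is a minimal sketch. It is not the code from the post: `query_partitions`, `run_query`, `fake_query` and the `FAKE_DATA` values are all hypothetical stand-ins, and a real version would have `run_query` open its own Postgres connection per thread.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq


def query_partitions(run_query, partitions, k=10):
    """Run run_query(partition) in one thread per partition and merge
    the per-partition hits into the k smallest-distance hits overall.

    run_query is a hypothetical callable that returns a list of
    (distance, cid) tuples for a single partition.
    """
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        per_partition = list(pool.map(run_query, partitions))
    all_hits = (hit for hits in per_partition for hit in hits)
    return heapq.nsmallest(k, all_hits)


# Made-up data standing in for real database partitions (assumption).
FAKE_DATA = {
    "part1": [(0.30, 101), (0.10, 102)],
    "part2": [(0.20, 201), (0.50, 202)],
    "part3": [(0.05, 301)],
}


def fake_query(name):
    # Stand-in for a real per-partition database query.
    return FAKE_DATA[name]


top3 = query_partitions(fake_query, list(FAKE_DATA), k=3)
print(top3)
```

Threads are a reasonable fit for this kind of work because each one spends most of its time waiting on the database rather than computing in Python.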
AJAX’ified Pub3D
Posted in cheminformatics, research, software, tagged cheminformatics, database, postgres, pubchem, ROCS, shape similarity, spatial on October 3, 2008 | Leave a Comment »
Pub3D is a 3D version of PubChem, in which we have generated a single conformer for 99% of PubChem using the smi23d suite of programs. The structures are then stored in a PostgreSQL database along with their distance moment shape descriptors described by Ballester and Graham-Richards. This allows us to perform shape similarity queries against [...]