This repository has multiple goals, the most important of which is mapping between MeSH, PubChem, and standard chemical identifiers. Currently, this information is scattered and difficult to access. As a starting point, the following resources are available:
- ftp://ftp.ncbi.nih.gov/pubchem/Compound/Extras/README-Extras
    - Map to MeSH with ftp://ftp.ncbi.nih.gov/pubchem/Compound/Extras/CID-MeSH
    - Map to Substance with ftp://ftp.ncbi.nih.gov/pubchem/Compound/Extras/CID-SID.gz
    - Map to InChI with ftp://ftp.ncbi.nih.gov/pubchem/Compound/Extras/CID-InChI-Key.gz
- ftp://ftp.ncbi.nih.gov/pubchem/Substance/Extras/README-Extras
    - Map to MeSH with ftp://ftp.ncbi.nih.gov/pubchem/Substance/Extras/SID-MeSH
    - Map to other databases with ftp://ftp.ncbi.nih.gov/pubchem/Substance/Extras/SID-Map.gz
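As a rough illustration of how one of these mapping files might be consumed, the sketch below parses rows into a dictionary. It assumes each row is tab-separated, with the identifier in the first column and one or more MeSH terms after it; check the relevant README-Extras before relying on that layout. The function name `parse_cid_mesh` and the sample rows are hypothetical and are not part of bio2bel_pubchem.

```python
import io
from collections import defaultdict
from typing import Dict, Iterable, List


def parse_cid_mesh(lines: Iterable[str]) -> Dict[int, List[str]]:
    """Parse rows of the ASSUMED form ``CID<TAB>term[<TAB>term...]``."""
    mapping: Dict[int, List[str]] = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # First column is the identifier, the rest are MeSH terms.
        cid, *terms = line.split("\t")
        mapping[int(cid)].extend(terms)
    return dict(mapping)


# Placeholder rows for illustration only, not an excerpt from the real file.
sample = io.StringIO("1\tTerm A\n2\tTerm B\tTerm C\n")
cid_to_mesh = parse_cid_mesh(sample)
print(cid_to_mesh)
```

For the gzipped files, `gzip.open(path, "rt")` from the standard library yields text lines that could be iterated in the same way, although their column layouts differ and would need their own parsing.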
bio2bel_pubchem can be installed easily from PyPI with the following code in your favorite terminal:
$ python3 -m pip install bio2bel_pubchem
or from the latest code on GitHub with:
$ python3 -m pip install git+https://github.com/bio2bel/pubchem.git@master
PubChem can be downloaded and populated from either the Python REPL or the automatically installed command line utility.
>>> import bio2bel_pubchem
>>> pubchem_manager = bio2bel_pubchem.Manager()
>>> pubchem_manager.populate()
$ bio2bel_pubchem populate