demo_6zcy: ccd prediction (demo_6zcy.json) by helixfold3 is ok, but smiles (demo_6zcy_smiles.json) failed #337

Open
Samuel-gwb opened this issue Sep 3, 2024 · 2 comments

Comments

@Samuel-gwb

Great work!
I successfully predicted the protein-small molecule (SM) complex structure using demo_6zcy.json, and the result is similar to the predicted structure provided in demo_output. However, running with demo_6zcy_smiles.json failed. Could you help me find a solution? I ran "sh run_infer.sh", and
the error message is:
##################################################################

PaddlePaddle commit: fbf852dd832bc0e63ae31cd4aa37defd829e4c03
FLAGS_new_einsum: True
args:
Namespace(bf16_infer=False, seed=None, logging_level='DEBUG', model_name='allatom_demo', init_model='init_models/HelixFold3-240814.pdparams', precision='fp32', amp_level='O1', infer_times=1, diff_batch_size=1, input_json='data/demo_6zcy_smiles.json', output_dir='./output', ccd_preprocessed_path='/data/database/ccd_preprocessed_etkdg.pkl.gz', jackhmmer_binary_path='/home/gwb/miniconda3/envs/helixfold/bin/jackhmmer', hhblits_binary_path='/home/gwb/miniconda3/envs/helixfold/bin/hhblits', hhsearch_binary_path='/home/gwb/miniconda3/envs/helixfold/bin/hhsearch', kalign_binary_path='/home/gwb/miniconda3/envs/helixfold/bin/kalign', hmmsearch_binary_path='/home/gwb/miniconda3/envs/helixfold/bin/hmmsearch', hmmbuild_binary_path='/home/gwb/miniconda3/envs/helixfold/bin/hmmbuild', nhmmer_binary_path='/home/gwb/miniconda3/envs/helixfold/bin/nhmmer', uniprot_database_path='/data/database/uniprot/uniprot.fasta', pdb_seqres_database_path='/data/database/pdb_seqres/pdb_seqres.txt', uniref90_database_path='/data/database/uniref90/uniref90.fasta', mgnify_database_path='/data/database/mgnify/mgy_clusters_2022_05.fa', bfd_database_path='/data/database/small_bfd/bfd-first_non_consensus_sequences.fasta', small_bfd_database_path='/data/database/small_bfd/bfd-first_non_consensus_sequences.fasta', uniclust30_database_path='/data/database/uniclust30/uniclust30_2018_08/uniclust30_2018_08', rfam_database_path='/data/database/Rfam-14.9_rep_seq.fasta', template_mmcif_dir='/data/database/pdb_mmcif/mmcif_files', max_template_date='2020-05-14', obsolete_pdbs_path='/data/database/pdb_mmcif/obsolete.dat', preset='reduced_dbs', maxit_binary='/home/gwb/RationalDesign/helixfold3/maxit-v11.100-prod-src/bin/maxit')
[OBABEL] Temporary file created: /tmp/tmpjptejgmh.mol2
Failed to convert ligand entity 1: {'type': 'ligand', 'smiles': 'CNC(=O)c1nn(C)c2ccc(Nc3nccc(n3)n4cc(N[C@@h]5CCNC5)c(C)n4)cc12', 'count': 1}, Python argument types in
rdkit.Chem.rdmolops.RemoveAllHs(NoneType)
did not match C++ signature:
RemoveAllHs(RDKit::ROMol mol, bool sanitize=True)
Traceback (most recent call last):
  File "/home/gwb/RationalDesign/helixfold3/inference.py", line 637, in <module>
    main(args)
  File "/home/gwb/RationalDesign/helixfold3/inference.py", line 442, in main
    all_entitys = preprocess_json_entity(args.input_json, args.output_dir)
  File "/home/gwb/RationalDesign/helixfold3/inference.py", line 87, in preprocess_json_entity
    all_entitys = preprocess.online_json_to_entity(json_path, out_dir)
  File "/home/gwb/RationalDesign/helixfold3/infer_scripts/preprocess.py", line 290, in online_json_to_entity
    raise RuntimeError(f'[Error] Failed to convert {len(error_ids)}/{len(entities)} entities')
RuntimeError: [Error] Failed to convert 1/2 entities
(helixfold) gwb@node01:~/RationalDesign/helixfold3$ vi /home/gwb/RationalDesign/helixfold3/inference.py
(helixfold) gwb@node01:~/RationalDesign/helixfold3$ vi /home/gwb/RationalDesign/helixfold3/infer_scripts/preprocess.py
(helixfold) gwb@node01:~/RationalDesign/helixfold3$ python
Python 3.9.19 | packaged by conda-forge | (main, Mar 20 2024, 12:50:21)
[GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from rdkit import Chem
>>> smiles = 'CNC(=O)c1nn(C)c2ccc(Nc3nccc(n3)n4cc(N[C@@h]5CCNC5)c(C)n4)cc12'
>>> mol = Chem.MolFromSmiles(smiles)
>>> if mol is None:
...     print("Failed to create molecule from SMILES")
... else:
...     print("Molecule created successfully")
...
Molecule created successfully

########################################################
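
The error message indicates that `RemoveAllHs` received `None`, meaning some earlier step returned no molecule, while the interpreter check above exercises only `Chem.MolFromSmiles`. Below is a minimal sketch, assuming RDKit is installed, that checks parsing, 3D embedding, and hydrogen removal one stage at a time. It is not HelixFold3's own preprocessing code (which, judging by the `[OBABEL]` log line, also involves Open Babel), and `check_ligand_stages` is a hypothetical helper name.

```python
def check_ligand_stages(smiles, seed=0):
    """Return (stage, ok) pairs, stopping at the first stage that fails."""
    # Imported here so the failure point is obvious if RDKit itself is missing.
    from rdkit import Chem
    from rdkit.Chem import AllChem

    stages = []

    mol = Chem.MolFromSmiles(smiles)  # returns None when the SMILES cannot be parsed
    stages.append(("parse", mol is not None))
    if mol is None:
        return stages

    mol_h = Chem.AddHs(mol)  # explicit hydrogens are needed for sensible 3D geometry
    conf_id = AllChem.EmbedMolecule(mol_h, randomSeed=seed)  # -1 signals embedding failure
    stages.append(("embed", conf_id != -1))
    if conf_id == -1:
        return stages

    heavy = Chem.RemoveAllHs(mol_h)  # the call that raised in the log when given None
    stages.append(("remove_hs", heavy.GetNumConformers() == 1))
    return stages


if __name__ == "__main__":
    # Ethanol is used as a stand-in; substitute the ligand SMILES from the input JSON.
    for stage, ok in check_ligand_stages("CCO"):
        print(stage, "ok" if ok else "FAILED")
```

Because embedding is stochastic, calling the helper with several different `seed` values can help distinguish a one-off embedding failure from a systematic one.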

@RyanGarciaLI
Collaborator

Hi,

It seems that your program failed to generate a ligand conformation from the SMILES string via openbabel or rdkit, so the model hasn't started running yet. Which openbabel version are you using? Also, tools like openbabel and rdkit generate conformations stochastically. Did you try running it multiple times?

If the conformation is generated successfully, you should see logs like the following:
[image: logs from a successful conformation generation]

@Samuel-gwb
Author

Yes, ligand conformation generation from SMILES failed. I tried quite a few times and got the same failure message each time.
The openbabel and rdkit versions are:
openbabel 3.1.1 py39h2d01fe1_9 conda-forge
rdkit 2024.3.5 pypi_0 pypi
rdkit-pypi 2022.9.5 pypi_0 pypi

BTW: I'm using CUDA 11.8 rather than 12.0, with cuDNN 8.4.0 and Paddle 2.6.1.
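
The package listing in this comment shows two pip-installed distributions, `rdkit` 2024.3.5 and `rdkit-pypi` 2022.9.5. Both install the same top-level `rdkit` module, so which copy Python actually imports depends on install order. Whether this contributes to the failure is not established in this thread. As a minimal, standard-library-only sketch for checking it, the hypothetical helper `distributions_providing` below lists every installed distribution whose recorded files include a given top-level package.

```python
from importlib import metadata


def distributions_providing(module_name):
    """Return sorted (name, version) pairs for installed distributions
    whose recorded files include a top-level `module_name` package."""
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if not name:
            continue  # skip distributions with broken or partial metadata
        files = dist.files or []  # `files` is None when no file record exists
        if any(f.parts and f.parts[0] == module_name for f in files):
            found.add((name, dist.version))
    return sorted(found)


if __name__ == "__main__":
    # More than one line of output means several distributions ship `rdkit`.
    for name, version in distributions_providing("rdkit"):
        print(name, version)
```

Comparing the output with `rdkit.__version__` and `rdkit.__file__` in the same environment shows which copy is actually imported.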
