FoldTree keeps crashing in the Run FoldTree step. Facing the same issue in the pipeline. #18

Open
kaustubh-amritkar opened this issue Mar 6, 2024 · 3 comments

Comments

@kaustubh-amritkar

I am running the FoldTree pipeline on my custom dataset of PDB files, and it keeps crashing.
This is the output from the .log file. Can you help me understand why this is happening?

Config file /mnt/researchdrive/Kaustubh/RbcS_Origin/data/Structures/Colabfold_Phylogeny_Seqs/FoldTree_Phylogeny/fold_tree/workflow/config/config_vars.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
Relative file path './results/plddt.json' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/foldtree_struct_tree.PP.nwk.rooted.final' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/alntmscore_struct_tree.PP.nwk.rooted.final' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/lddt_struct_tree.PP.nwk.rooted.final' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/finalset.csv' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/plddt.json' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/plddt.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/sequence_dataset.csv' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/sequences.fst' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/finalset.csv' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/dlstructs.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/identifiers.txt' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/sequence_dataset.csv' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/dlsequences.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/foldtree_struct_tree.PP.nwk' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/foldtree_struct_tree.PP.nwk.rooted.final' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/foldtree_struct_madroot_post.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/foldtree_struct_tree.nwk' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/foldtree_struct_tree.PP.nwk' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/foldtree_struct_postprocess.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/foldtree_fastmemat.txt' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/foldtree_struct_tree.nwk' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/foldtree_quicktree.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/allvall_1.csv' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/foldtree_fastmemat.txt' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/alntmscore_fastmemat.txt' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/lddt_fastmemat.txt' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/foldseek2distmat.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/finalset.csv' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/allvall_1.csv' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/foldseekallvall.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/alntmscore_struct_tree.PP.nwk' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/alntmscore_struct_tree.PP.nwk.rooted.final' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/alntmscore_struct_madroot_post.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/alntmscore_struct_tree.nwk' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/alntmscore_struct_tree.PP.nwk' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/alntmscore_struct_postprocess.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/alntmscore_fastmemat.txt' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/alntmscore_struct_tree.nwk' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/alntmscore_quicktree.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/lddt_struct_tree.PP.nwk' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/lddt_struct_tree.PP.nwk.rooted.final' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/lddt_struct_madroot_post.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/lddt_struct_tree.nwk' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/lddt_struct_tree.PP.nwk' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/lddt_struct_postprocess.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/lddt_fastmemat.txt' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/lddt_struct_tree.nwk' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/lddt_quicktree.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/plddt.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/dlstructs.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/dlsequences.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/foldtree_struct_madroot_post.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/foldtree_struct_postprocess.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/foldtree_quicktree.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/foldseek2distmat.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/foldseekallvall.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/alntmscore_struct_madroot_post.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/alntmscore_struct_postprocess.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/alntmscore_quicktree.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/lddt_struct_madroot_post.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/lddt_struct_postprocess.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Relative file path './results/logs/lddt_quicktree.log' starts with './'. This is redundant and strongly discouraged. It can also lead to inconsistent results of the file-matching approach used by Snakemake. You can simply omit the './' for relative file paths.
Using shell: /usr/bin/bash
Provided cores: 12
Rules claiming more threads will be scaled down.
Job stats:
job                   count
------------------  -------
all                       1
dl_ids_sequences          1
dl_ids_structs            1
foldseek2distmat          1
foldseek_allvall_1        1
mad_root_post             3
plddt                     1
postprocess               3
quicktree                 3
total                    15

Select jobs to execute...

[Tue Feb 27 18:00:39 2024]
rule dl_ids_sequences:
    input: ./results/identifiers.txt
    output: ./results/sequence_dataset.csv
    log: ./results/logs/dlsequences.log
    jobid: 3
    reason: Missing output files: ./results/sequence_dataset.csv
    wildcards: folder=./results
    resources: tmpdir=/tmp

Activating conda environment: foldtree
Activating conda environment: foldtree
[Tue Feb 27 18:00:42 2024]
Finished job 3.
1 of 15 steps (7%) done
Select jobs to execute...

[Tue Feb 27 18:00:42 2024]
rule dl_ids_structs:
    input: ./results/sequence_dataset.csv
    output: ./results/sequences.fst, ./results/finalset.csv
    log: ./results/logs/dlstructs.log
    jobid: 2
    reason: Missing output files: ./results/finalset.csv; Input files updated by another job: ./results/sequence_dataset.csv
    wildcards: folder=./results
    resources: tmpdir=/tmp

Activating conda environment: foldtree
Activating conda environment: foldtree
[Tue Feb 27 18:00:43 2024]
Finished job 2.
2 of 15 steps (13%) done
Select jobs to execute...

[Tue Feb 27 18:00:43 2024]
rule plddt:
    input: ./results/finalset.csv
    output: ./results/plddt.json
    log: ./results/logs/plddt.log
    jobid: 1
    reason: Missing output files: ./results/plddt.json; Input files updated by another job: ./results/finalset.csv
    wildcards: folder=./results
    resources: tmpdir=/tmp

[Tue Feb 27 18:00:43 2024]
rule foldseek_allvall_1:
    input: ./results/finalset.csv
    output: ./results/allvall_1.csv
    log: ./results/logs/foldseekallvall.log
    jobid: 8
    reason: Missing output files: ./results/allvall_1.csv; Input files updated by another job: ./results/finalset.csv
    wildcards: folder=./results
    resources: tmpdir=/tmp

Activating conda environment: foldtree
Activating conda environment: foldtree
Activating conda environment: foldtree
[Tue Feb 27 18:00:43 2024]
Error in rule foldseek_allvall_1:
    jobid: 8
    input: ./results/finalset.csv
    output: ./results/allvall_1.csv
    log: ./results/logs/foldseekallvall.log (check log file(s) for error details)
    conda-env: foldtree
    shell:
        foldseek easy-search ./results/structs/ ./results/structs/ ./results/allvall_1.csv ./results/tmp --format-output 'query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits,lddt,lddtfull,alntmscore' --exhaustive-search --alignment-type 2 -e inf --threads 1
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

[Tue Feb 27 18:01:00 2024]
Finished job 1.
3 of 15 steps (20%) done
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-02-27T180039.114096.snakemake.log

I also tried it on Google Colab, and this is the error I get:

CalledProcessError                        Traceback (most recent call last)

<ipython-input-3-4387aeaa2f49> in <cell line: 1>()
----> 1 get_ipython().run_cell_magic('bash', '-s $jobname $input_type', 'JOBNAME=$1\nINPUT_TYPE=$2\nSUFFIX=""\nif [[ $INPUT_TYPE = "custom" ]]; then\n  mkdir -p "${JOBNAME}/structs"\n  mv "${JOBNAME}/"*.pdb "${JOBNAME}/"*.cif "${JOBNAME}/structs"\n  SUFFIX="custom_structs=True"\nfi\nsnakemake --cores $(nproc --all) --use-conda -s fold_tree/workflow/fold_tree --config folder="./${JOBNAME}" filter=False $SUFFIX  #> /dev/null 2>&1\n#snakemake --cores 4 --use-conda -s fold_tree/workflow/fold_tree --config folder=./${jobname} filter=False\n')

4 frames

<decorator-gen-103> in shebang(self, line, cell)

/usr/local/lib/python3.10/dist-packages/IPython/core/magics/script.py in shebang(self, line, cell)
    243             sys.stderr.flush()
    244         if args.raise_error and p.returncode!=0:
--> 245             raise CalledProcessError(p.returncode, cell, output=out, stderr=err)
    246 
    247     def _run_script(self, p, cell, to_close):

CalledProcessError: Command 'b'JOBNAME=$1\nINPUT_TYPE=$2\nSUFFIX=""\nif [[ $INPUT_TYPE = "custom" ]]; then\n  mkdir -p "${JOBNAME}/structs"\n  mv "${JOBNAME}/"*.pdb "${JOBNAME}/"*.cif "${JOBNAME}/structs"\n  SUFFIX="custom_structs=True"\nfi\nsnakemake --cores $(nproc --all) --use-conda -s fold_tree/workflow/fold_tree --config folder="./${JOBNAME}" filter=False $SUFFIX  #> /dev/null 2>&1\n#snakemake --cores 4 --use-conda -s fold_tree/workflow/fold_tree --config folder=./${jobname} filter=False\n'' returned non-zero exit status 1.
@cactuskid
Contributor

If I had to venture a guess, foldseek is throwing an error in both cases. We have seen this during the all-vs-all comparison with some previous versions. Make sure the structs folder contains valid PDB files. If it still doesn't work, I would try running a foldseek all-vs-all command on the command line, outside of snakemake, using the foldseek conda environment. This may be more of a problem to report to the foldseek repository. Since I wasn't involved in the development of that tool, I really can't be of much more use. Sorry.
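
The two checks suggested above could be sketched roughly as follows. This is only an illustration, not part of FoldTree or foldseek: the helper names `find_suspect_structures` and `build_foldseek_command` are hypothetical, the "valid PDB" test is a crude heuristic (non-empty file, `.pdb` or `.cif` extension, at least one ATOM/HETATM record), and the command arguments are copied from the failing `foldseek easy-search` call in the log rather than from foldseek's documentation.

```python
from pathlib import Path

# Output columns copied verbatim from the failing command in the Snakemake log.
FORMAT_OUTPUT = (
    "query,target,fident,alnlen,mismatch,gapopen,qstart,qend,"
    "tstart,tend,evalue,bits,lddt,lddtfull,alntmscore"
)


def find_suspect_structures(structs_dir):
    """Return (file name, reason) pairs for files that look unusable.

    Heuristic only (an assumption, not foldseek's own validation): a file is
    flagged if it has an unexpected extension, is empty, or contains no
    ATOM/HETATM coordinate records.
    """
    suspects = []
    for path in sorted(Path(structs_dir).iterdir()):
        if not path.is_file():
            continue
        if path.suffix.lower() not in {".pdb", ".cif"}:
            suspects.append((path.name, "unexpected extension"))
            continue
        text = path.read_text(errors="replace")
        if not text.strip():
            suspects.append((path.name, "empty file"))
        elif not any(
            line.startswith(("ATOM", "HETATM")) for line in text.splitlines()
        ):
            suspects.append((path.name, "no ATOM/HETATM records"))
    return suspects


def build_foldseek_command(folder, threads=1):
    """Rebuild the all-vs-all command from the log as an argument list."""
    folder = Path(folder)
    structs = str(folder / "structs") + "/"
    return [
        "foldseek", "easy-search", structs, structs,
        str(folder / "allvall_1.csv"), str(folder / "tmp"),
        "--format-output", FORMAT_OUTPUT,
        "--exhaustive-search", "--alignment-type", "2",
        "-e", "inf", "--threads", str(threads),
    ]
```

For example, `find_suspect_structures("results/structs")` lists files worth inspecting by hand, and passing `build_foldseek_command("results")` to `subprocess.run` from inside the activated conda environment runs the same step outside of snakemake, so foldseek's own error message is printed directly to the terminal.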

@metalichen

Hey @kaustubh-amritkar, were you able to resolve it? I have the same issue when running the FoldTree Colab on the same set of PDB files that were analyzed successfully just last week.

@kaustubh-amritkar
Author

Hi @metalichen, I was able to run FoldTree locally on the custom set of PDB files. But yes, it kept crashing on Google Colab.
