Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pocket mode=specific is not callable #9

Open
bertadenes opened this issue Nov 2, 2020 · 7 comments
Open

pocket mode=specific is not callable #9

bertadenes opened this issue Nov 2, 2020 · 7 comments

Comments

@bertadenes
Copy link

Exploring the module, looks pretty decent. However, every attempt running specific mode reverts back to run 'largest' instead:

  • from pymol GUI, it creates a config file saying mode=largest and runs accordingly
  • from pymol cmd, it does the same
  • trying to run any config as python3 -m pyvol exits without creating anythin (this one might be down to my setup)

Running with opensource pymol 2.3.0 on Ubuntu 20.04.1, msms v2.6.1

@rhs2132
Copy link
Collaborator

rhs2132 commented Jun 18, 2021

This is quite late, but I finally have some time to push some changes and wanted to follow up on this. If you don't mind looking back, what options did you supply to run in specific mode?

I think I will add an option to prevent falling back to default behavior and add better log information to explain unexpected behavior like this.

@bertadenes
Copy link
Author

Sure here is an example with the GUI v1.7.8 (on top of pymol 2.3.0)
I fetched a random PDB (9ICV), in which there are four pockets above 100 A^3, in default:

pyvol.pymol_interface:    INFO     Pocket 0 (105920-983185_9ICV_p0)  Volume: 680.0 A^3
pyvol.pymol_interface:    INFO     Pocket 1 (105920-983185_9ICV_p1)  Volume: 651.0 A^3
pyvol.pymol_interface:    INFO     Pocket 2 (105920-983185_9ICV_p2)  Volume: 188.0 A^3
pyvol.pymol_interface:    INFO     Pocket 3 (105920-983185_9ICV_p3)  Volume: 148.0 A^3

The smallest one is close to the residue Lys68, therefore I tried this:
image
Which returns the largest pocket only. The corresponding config file is generated wrongly in the first place:

[General]
prot_file = /home/berta/temp/110246-476122_9ICV.pyvol/110246-476122_9ICV_prot.pdb
min_rad = 1.4
max_rad = 3.4
constrain_radii = False

[Specification]
mode = largest

[Partitioning]
subdivide = False

[Output]
output_dir = /home/berta/temp/110246-476122_9ICV.pyvol
prefix = 110246-476122_9ICV
logger_stream_level = INFO
logger_file_level = DEBUG

[PyMOL]
protein = (9ICV) and (poly)
protein_only = True
display_mode = solid
alpha = 1.0

Same behaviour using the command:
pocket protein=9ICV, mode=specific, residue="i. 68"
I am fairly confident you can reproduce this issue, we did encounter it on three or four different computers.

@Christinele14
Copy link

I also meet the issue when trying to use mode = specific. I tried pyvol in command line with the input_template.cfg file. I tried to put ligand (in PDB format) together with protein (PDB format) to calculate the pocket volume in which the ligand binds. However, it returns ValueError: number of xyz array columns must equal 3. I tried with the test case of 1uwh_B_lig.pdb and 1uwh_B_prot.pdb in pyvol/tests folder but it still returns the mentioned ValueError as well. Please take a look at it. Thank you.

@rhs2132
Copy link
Collaborator

rhs2132 commented Oct 19, 2022

I realize that the code is unhelpfully opaque here. Any PyMOL inputs get cleaned up to remove improperly formatted inputs and then are processed with calls into PyMOL. The 'largest' mode is the default, so if the specification fails it reverts to that behavior but apparently without enough transparency.

@bertadenes in your case it doesn't like the residue selection. That must be a PyMOL selection string that specifies the residue. If you want to use the residue number, the 'resid' parameter is the correct alternative.

@Christinele14 can you provide the generated config file?

@Christinele14
Copy link

Christinele14 commented Oct 19, 2022

Because it met error, the config file cannot be generated. Here is the cfg file I used to run pyvol with specific mode together with a ligand which binds with protein in the pocket I would like to calculate volume.

[General]
prot_file = 1uwh_B_prot.pdb
lig_file = 1uwh_B_lig.pdb
min_rad = 1.4
max_rad = 3.4
constrain_radii = True

[Specification]
mode = specific
# lig_excl_rad = 3.5
# lig_incl_rad = 5.2
min_volume = 200

[Partitioning]
subdivide = False
min_subpocket_rad = 1.7
max_subpocket_rad = 3.4
min_subpocket_surf_rad = 1.0
radial_sampling = 0.1
inclusion_radius_buffer = 1.0
min_cluster_size = 50

[Output]
project_dir = /home/christine/pyvol/tests
logger_stream_level = INFO
logger_file_level = DEBUG

@alephreish
Copy link

alephreish commented Jul 16, 2023

Hey all, I've stumbled upon this issue today. The documentation is indeed pretty opaque about this. The relevant piece of code is in identify.py:

        elif opts.get("mode") == "specific":
            if opts.get("coordinates") is not None:
                coordinate = opts.get("coordinates")
                logger.info("Specific pocket identified from coordinate: {0}".format(opts.get("coordinates")))
            elif opts.get("resid") is not None:
                resid = str(opts.get("resid"))
                chain = None
                if not resid[0].isdigit():
                    chain = resid[0]
                    resid = int(resid[1:])
                else:
                    resid = int(resid)
                coordinate = utilities.coordinates_for_resid(opts.get("prot_file"), resid=resid, chain=chain)
                logger.info("Specific pocket identified from residue: {0} -> {1} (truncated)".format(opts.get("resid"), coordinate[0,:]))
            elif l_s is not None:
                lig_coords = l_s.xyz
                coordinate = np.mean(l_s.xyz, axis=0).reshape(1, -1)
                logger.info("Specific pocket identified from mean ligand position: {0}".format(coordinate))
            else:
                logger.error("A coordinate, ligand, or residue must be supplied to run in specific mode")
                return None

I haven't tried to make it work with resid, but coordinates works fine with the CLI and the GUI:

[Specification]
mode = specific
coordinates = -2.335 -2.01 -3.167

The coordinates should be delimited with a single space (opts["coordinates"] = np.asarray([float(x) for x in opts.get("coordinates").split(" ")]).reshape(-1,3)).

Another important thing is that scipy should be pinned to a version before 1.9.0 (e.g. scipy=1.8.1 from conda-forge works fine) since this is when the n_jobs argument to scipy.spatial.cKDTree.query was removed, otherwise you get an exception:

pyvol.identify:   	INFO    	Specific pocket identified from coordinate: [[-2.335 -2.01  -3.167]]
Traceback (most recent call last):
  File "/bigdata/andrey/antenna-alphafold/pyvol_env/bin/pyvol", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/bigdata/andrey/antenna-alphafold/pyvol_env/lib/python3.11/site-packages/pyvol/__main__.py", line 17, in main
    identify.pocket_wrapper(**configuration.file_to_opts(args.cfg_file))
  File "/bigdata/andrey/antenna-alphafold/pyvol_env/lib/python3.11/site-packages/pyvol/identify.py", line 186, in pocket_wrapper
    all_pockets, output_opts = pocket(**opts)
                               ^^^^^^^^^^^^^^
  File "/bigdata/andrey/antenna-alphafold/pyvol_env/lib/python3.11/site-packages/pyvol/identify.py", line 142, in pocket
    id_coord = p_bs.nearest_coord_to_external(coordinate).reshape(1, -1)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/bigdata/andrey/antenna-alphafold/pyvol_env/lib/python3.11/site-packages/pyvol/spheres.py", line 329, in nearest_coord_to_external
    dist, indices = kdtree.query(coordinates, n_jobs=-1)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "_ckdtree.pyx", line 786, in scipy.spatial._ckdtree.cKDTree.query
  File "_ckdtree.pyx", line 388, in scipy.spatial._ckdtree.get_num_workers
TypeError: Unexpected keyword argument {'n_jobs': -1}

@alephreish
Copy link

Weird enough though, there are cases in which searching around a coordinate fails with "Binding pocket not correctly identified--try an alternative method to specify the binding pocket" even if the coordinate is inside the pocket. Try https://alphafold.ebi.ac.uk/files/AF-A0A1L2YW92-F1-model_v4.pdb with coordinates = 2, -1, -1. This seems to be caused by p_bs.nearest_coord_to_external(coordinate).reshape(1, -1) returning a coordinate that is well outside of the protein. The pocket is identified correctly with mode=all or mode=largest.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants