read2tree can't find corresponding CDS for each OMA group #33
For additional information, an example of a protein sequence within an OMA group and its corresponding CDS (located in a single file containing all CDS).
Apologies for commenting so much on my own post. It appears the issue was similar to #20, where manually deleting all underscores ("_") fixed the problem. The program is currently running; I will update when it completes.
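For anyone hitting the same problem, the manual underscore deletion can be scripted. The following is a minimal sketch, not part of read2tree: it assumes standard FASTA files whose header lines start with ">", and the function names are hypothetical. It removes "_" from header lines only and leaves sequence lines untouched.

```python
from pathlib import Path


def strip_underscores_from_headers(fasta_text: str) -> str:
    """Return FASTA text with every "_" removed from header lines only."""
    cleaned = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Only headers are edited; sequence lines pass through unchanged.
            line = line.replace("_", "")
        cleaned.append(line)
    return "\n".join(cleaned) + "\n" if cleaned else ""


def clean_fasta_file(src: Path, dst: Path) -> None:
    """Write a copy of src to dst with underscores stripped from headers."""
    dst.write_text(strip_underscores_from_headers(src.read_text()))
```

The same cleaning would need to be applied to every marker file and to the CDS file, otherwise the headers no longer match each other.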
I've selected a subset of 69 OMA groups (chosen because they include sequences from all genomes of interest), built from 22 genomes using the OMA standalone package. I've also made a FASTA file with the corresponding CDS sequences, using the same headers as in the OMA groups. However, I'm running into issues that I'm finding hard to overcome.
Formatting examples:
(Marker gene)
Protein 1 [Animal 1]
DVAEKCRVL
Protein 1 [Animal 2]
DVAEKCRVL
(Corresponding CDS file)
Protein 1 [Animal 1]
ATCGATCGATCG
Protein 1 [Animal 2]
ATCGATCGATCG
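The header matching described above can be checked before running read2tree. This is a rough sketch under stated assumptions: files are standard FASTA with ">" headers, and a CDS is expected to be three times the protein length, optionally plus one stop codon. That length rule is my assumption, not read2tree's documented behaviour, and the function names are hypothetical.

```python
def read_fasta(text: str) -> dict:
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def check_cds_matches(proteins: dict, cds: dict) -> list:
    """List (header, reason) for proteins without a usable CDS entry."""
    problems = []
    for header, aa in proteins.items():
        dna = cds.get(header)
        if dna is None:
            problems.append((header, "no CDS with this header"))
        # Assumed rule: 3 nt per residue, with or without a stop codon.
        elif len(dna) not in (3 * len(aa), 3 * len(aa) + 3):
            problems.append(
                (header, f"CDS length {len(dna)} does not fit {len(aa)} residues")
            )
    return problems
```

Running `check_cds_matches` on each marker file against the single CDS file gives a list of headers to fix; an empty list means every protein found a CDS of a plausible length.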
However, when I start Read2Tree with the command below (all files and the test_markers folder are in the directory from which I run read2tree):
read2tree --reference --standalone ./test_markers --output_path output_v1 --dna_reference total_orths_cds.fa
I get the error:
--- Load OGs with min 0 species from oma test_markers - mode = marker_genes ---
Loading files for pre-filter: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 69/69 [00:00<00:00, 2053.57 OGs/s]
2023-07-12 15:42:14,120 - read2tree.OGSet - INFO - --- Load ogs and find their corresponding DNA seq from total_orths_cds.fa ---
2023-07-12 15:42:14,121 - read2tree.OGSet - INFO - Loading total_orths_cds.fa into memory. This might take a while . . .
Loading OGs: 0%| | 0/69 [00:00<?, ? OGs/s]
Loading OGs: 0%| | 0/69 [06:01<?, ? OGs/s]
Traceback (most recent call last):
File "/home/youseuf/miniconda3/envs/read2tree2/bin/read2tree", line 4, in <module>
__import__('pkg_resources').run_script('read2tree==0.1.4', 'read2tree')
File "/home/youseuf/miniconda3/envs/read2tree2/lib/python3.8/site-packages/pkg_resources/__init__.py", line 720, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/home/youseuf/miniconda3/envs/read2tree2/lib/python3.8/site-packages/pkg_resources/__init__.py", line 1570, in run_script
exec(script_code, namespace, namespace)
File "/home/youseuf/miniconda3/envs/read2tree2/lib/python3.8/site-packages/read2tree-0.1.4-py3.8.egg/EGG-INFO/scripts/read2tree", line 16, in <module>
File "/home/youseuf/miniconda3/envs/read2tree2/lib/python3.8/site-packages/read2tree-0.1.4-py3.8.egg/read2tree/main.py", line 289, in main
File "/home/youseuf/miniconda3/envs/read2tree2/lib/python3.8/site-packages/read2tree-0.1.4-py3.8.egg/read2tree/OGSet.py", line 79, in __init__
File "/home/youseuf/miniconda3/envs/read2tree2/lib/python3.8/site-packages/read2tree-0.1.4-py3.8.egg/read2tree/OGSet.py", line 192, in _load_ogs
File "/home/youseuf/miniconda3/envs/read2tree2/lib/python3.8/site-packages/read2tree-0.1.4-py3.8.egg/read2tree/OGSet.py", line 337, in _check_dna_aa_length_consistency
File "/home/youseuf/miniconda3/envs/read2tree2/lib/python3.8/site-packages/read2tree-0.1.4-py3.8.egg/read2tree/OGSet.py", line 337, in
AttributeError: 'NoneType' object has no attribute 'id'
When I look into the mplog.log file, I see:
2023-07-12 15:42:14,120 - read2tree.OGSet - INFO - --- Load ogs and find their corresponding DNA seq from total_orths_cds.fa ---
2023-07-12 15:42:14,121 - read2tree.OGSet - INFO - Loading total_orths_cds.fa into memory. This might take a while . . .
2023-07-12 15:42:14,146 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): omabrowser.org:80
2023-07-12 15:42:14,200 - urllib3.connectionpool - DEBUG - http://omabrowser.org:80 "GET /api/protein/XP/ HTTP/1.1" 301 162
2023-07-12 15:42:14,202 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): omabrowser.org:443
2023-07-12 15:43:14,326 - urllib3.connectionpool - DEBUG - https://omabrowser.org:443 "GET /api/protein/XP/ HTTP/1.1" 504 160
2023-07-12 15:43:14,329 - read2tree.OGSet - DEBUG - DNA not found for XP_046914939.1_OG24421.
2023-07-12 15:43:14,331 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): omabrowser.org:80
2023-07-12 15:43:14,384 - urllib3.connectionpool - DEBUG - http://omabrowser.org:80 "GET /api/protein/XP/ HTTP/1.1" 301 162
2023-07-12 15:43:14,387 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): omabrowser.org:443
2023-07-12 15:44:14,524 - urllib3.connectionpool - DEBUG - https://omabrowser.org:443 "GET /api/protein/XP/ HTTP/1.1" 504 160
2023-07-12 15:44:14,526 - read2tree.OGSet - DEBUG - DNA not found for XP_027206261.1_OG24421.
2023-07-12 15:44:14,529 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): omabrowser.org:80
2023-07-12 15:44:14,583 - urllib3.connectionpool - DEBUG - http://omabrowser.org:80 "GET /api/protein/XP/ HTTP/1.1" 301 162
2023-07-12 15:44:14,586 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): omabrowser.org:443
2023-07-12 15:45:14,724 - urllib3.connectionpool - DEBUG - https://omabrowser.org:443 "GET /api/protein/XP/ HTTP/1.1" 504 160
2023-07-12 15:45:14,727 - read2tree.OGSet - DEBUG - DNA not found for XP_029824739.1_OG24421.
2023-07-12 15:45:14,935 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): omabrowser.org:80
2023-07-12 15:45:14,988 - urllib3.connectionpool - DEBUG - http://omabrowser.org:80 "GET /api/protein/XP/ HTTP/1.1" 301 162
2023-07-12 15:45:14,991 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): omabrowser.org:443
2023-07-12 15:46:15,132 - urllib3.connectionpool - DEBUG - https://omabrowser.org:443 "GET /api/protein/XP/ HTTP/1.1" 504 160
2023-07-12 15:46:15,135 - read2tree.OGSet - DEBUG - DNA not found for XP_054162837.1_OG24421.
2023-07-12 15:46:15,137 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): omabrowser.org:80
2023-07-12 15:46:15,190 - urllib3.connectionpool - DEBUG - http://omabrowser.org:80 "GET /api/protein/XP/ HTTP/1.1" 301 162
2023-07-12 15:46:15,193 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): omabrowser.org:443
2023-07-12 15:47:15,314 - urllib3.connectionpool - DEBUG - https://omabrowser.org:443 "GET /api/protein/XP/ HTTP/1.1" 504 160
2023-07-12 15:47:15,317 - read2tree.OGSet - DEBUG - DNA not found for XP_053212400.1_OG24421.
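To see which sequences are affected without reading the whole log, the "DNA not found for" lines can be collected with a short script. This is a sketch only: it assumes the log line format shown above, and the function names are hypothetical. It also reports the text before the first underscore of each ID, which here is "XP", the same string that appears in the `/api/protein/XP/` requests in the log. That appears consistent with the IDs being cut at the first underscore, though I have not confirmed this in the read2tree source.

```python
import re

# Matches the DEBUG lines that read2tree writes when no CDS is found.
NOT_FOUND = re.compile(r"DNA not found for (\S+)")


def missing_dna_ids(log_text: str) -> list:
    """Return the sequence IDs reported as 'DNA not found', in log order."""
    ids = []
    for line in log_text.splitlines():
        match = NOT_FOUND.search(line)
        if match:
            # Drop the sentence-ending period that follows the ID in the log.
            ids.append(match.group(1).rstrip("."))
    return ids


def prefixes_before_underscore(ids: list) -> set:
    """Return the part of each ID that precedes its first underscore."""
    return {seq_id.split("_", 1)[0] for seq_id in ids}
```

For example, `missing_dna_ids(open("mplog.log").read())` would list the IDs from the log above, and `prefixes_before_underscore` on that list shows what the lookup would see if IDs were split on "_".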
Any help would be extremely appreciated.