Simbad: refactor to use TAP #2954
Hello @ManonMarchand! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2024-04-16 08:53:18 UTC |
On codestyle: tox -e codestyle tells me:
and here I should break before? Which one do you prefer? I'm happy with both.
Edit: I applied the "line break after operator" rule to reduce the number of issues reported by the PEP8 detector here.
On criteria_lextab.py and criteria_parser.py
These files are automatically generated. Similar files are not linted in the astropy code base, e.g. https://github.com/astropy/astropy/blob/main/astropy/units/format/generic_parsetab.py
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2954 +/- ##
==========================================
+ Coverage 66.81% 67.09% +0.27%
==========================================
Files 237 239 +2
Lines 18324 18279 -45
==========================================
+ Hits 12244 12264 +20
+ Misses 6080 6015 -65
I thought yacc and lex had just changed places between astropy 4.2.1 and astropy 4.3, but some method signatures changed too, see the commit astropy/astropy@85bffd9. The minimum astropy version currently supported in astroquery is 4.2.1. Is the jump negotiable? I don't see many deprecations in the change log between these two versions.
This won't make the cut for 0.4.7, and after that one is out I'm planning to bump the minimum required versions (at least to astropy 5.0, but maybe even something newer), so there is no need to add workarounds to support old versions.
We enforce W504 and ignore W503, as breaking before the operator is a bit more legible.
Hello! I'm almost in an un-draft state :) Here are the remaining points that I'm unsure about:
- add a construct_query method that reads the columns_in_output, join, and criteria attributes
- support the functionalities of the removed query_criteria method by adding a criteria attribute that should be a valid ADQL clause. The utils CriteriaTranslator can translate between the old and new syntax.
- make ROW_LIMIT = -1 return all lines, because TOP 0 or maxrec = 0 are the dedicated ways to retrieve table metadata in TAP
- fix the usage of lru_cache on class methods, which can cause memory leaks (see bugbear rule B019)
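As a rough illustration of the first and third points, here is a minimal sketch of how an ADQL string could be assembled from columns, joins, and criteria, with a negative row limit meaning "no limit". It is not the code from this PR: the function name build_adql, its parameters, and the way criteria are combined with AND are assumptions made for the example.

```python
def build_adql(columns, joins=(), criteria=(), row_limit=-1, table="basic"):
    """Assemble a SELECT statement (hypothetical helper, not the PR's method).

    row_limit < 0 means "no limit" (no TOP clause at all).
    row_limit == 0 produces TOP 0, which TAP uses to return table metadata only.
    """
    top = "" if row_limit < 0 else f"TOP {row_limit} "
    query = f"SELECT {top}{', '.join(columns)} FROM {table}"
    for join in joins:
        # Each join is expected to be a complete JOIN clause.
        query += f" {join}"
    if criteria:
        # Assumption: several criteria are combined with AND.
        query += " WHERE " + " AND ".join(f"({c})" for c in criteria)
    return query


print(build_adql(["main_id", "ra", "dec"]))
print(build_adql(["main_id"], row_limit=0))
```

With this convention, 0 keeps its TAP meaning (metadata only) and -1 is free to mean "return everything".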
these are the files of the simbad.utils.CriteriaTranslator parser
and move from os to pathlib
this commit also adds a patch to simbad's query_objects in the tests
simbad calls lru_cache from python core library, so no cache_location
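For context on the lru_cache fix mentioned in the commit messages (bugbear rule B019), the sketch below shows why decorating a method directly can leak memory: the cache lives on the class and keeps self as part of the key, so instances are never released. The class names and the module-level helper are made up for the example, and this is only one possible workaround, not necessarily the one used in this PR.

```python
import gc
import weakref
from functools import lru_cache


class LeakyClient:
    """The cache is attached to the class and stores `self` in every key."""

    @lru_cache(maxsize=None)
    def columns(self, table):
        return (f"{table}.col1", f"{table}.col2")


@lru_cache(maxsize=None)
def _cached_columns(table):
    # Module-level helper: the key only contains the hashable argument.
    return (f"{table}.col1", f"{table}.col2")


class SaferClient:
    """Delegates to the module-level cache, so no instance is kept alive."""

    def columns(self, table):
        return _cached_columns(table)


def is_collected_after_use(cls):
    """Return True if an instance can be garbage collected after one call."""
    instance = cls()
    instance.columns("basic")
    ref = weakref.ref(instance)
    del instance
    gc.collect()
    return ref() is None


print(is_collected_after_use(LeakyClient))
print(is_collected_after_use(SaferClient))
```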
Sorry for the delay in responding:
Next release, 0.4.8. At some point I'll refactor the versioning and related infrastructure, but I don't have an ETA for when that happens.
I suspect this also ran into the same pyvo-related RTD failure and was unrelated to your changes.
Switching back to draft, I'm tweaking the support of the legacy 'query_criteria' a bit.
Hi astroquery,
[draft] Docs are not entirely clean yet, but the code should stay more or less like this before I un-draft. Here are some changes and questions about them:
List of changes:
Adding criteria
Formerly, there was only one way to enter criteria on the output, and it was the query_criteria method. This does not exist anymore, but is replaced by a new criteria argument in every query_*** method (except query_tap because people can directly write that into their ADQL string).
This new interface looks like this:
Where the criteria string can be translated from the old syntax to the new one with the helper class:
This could also be done automatically at each criteria insertion (like detecting if this is the old or new format and translating, maybe with a warning indicating the new ADQL syntax?). What are your thoughts?
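To make the question concrete, here is a toy sketch of what such automatic detection might look like. It is not the CriteriaTranslator from this PR: it assumes, for illustration only, that legacy criteria can be recognised by `&` and `|` operators outside quoted strings, and the function name to_adql_criteria is hypothetical. A real implementation would have to handle the full legacy grammar.

```python
import re
import warnings

# Capturing group so that re.split keeps the quoted literals in the result.
_QUOTED = re.compile(r"('[^']*')")


def to_adql_criteria(criteria):
    """Translate legacy boolean operators to ADQL and warn when that happens."""
    parts = _QUOTED.split(criteria)
    translated = []
    legacy = False
    for index, part in enumerate(parts):
        if index % 2 == 0:  # even indices are outside quoted strings
            new_part = re.sub(r"\s*&\s*", " AND ", part)
            new_part = re.sub(r"\s*\|\s*", " OR ", new_part)
            legacy = legacy or new_part != part
            part = new_part
        translated.append(part)
    if legacy:
        warnings.warn(
            "legacy criteria syntax detected and translated to ADQL",
            DeprecationWarning,
            stacklevel=2,
        )
    return "".join(translated)


print(to_adql_criteria("otype = 'A&B' AND ra > 10"))
```

The heuristic is deliberately naive: any ADQL expression that legitimately contains `|` outside a string literal would be misdetected, which is one argument for keeping the translation explicit.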
On the output
Adding columns to the output
This was done with add_votable_fields before. This is replaced by add_to_output, where the arguments are reproduced to fit as closely as possible to the votable fields. Some fields were still documented here although they have been deprecated for years in Simbad; since no issue about them has been opened in this repo, they are presumably not used.
That'd look like this:
Every other possibility can be listed with:
(note that the number of possible options went from 107 to 97, but 12 of the former ones were tables that have not existed in Simbad for years. So the set of usable options is now slightly bigger 🙂 )
Changes in output style
All of these could be hidden from users by modifying the output table in query_tap on the Python side. Is it worth it?
Empty result
An empty result is valid in TAP, so the default ROW_LIMIT could not stay at zero to mean infinity. I copied the VizieR module API and went with -1.
It also means that a query with no answer returns an empty table and not None like before. This broke the JWST module, see later in this long PR description.
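For downstream code, the practical consequence is that a check such as `result is None` no longer detects "no match". A minimal sketch of a check that tolerates both behaviours is shown below; the function name summarize is made up, and a plain list stands in for the returned table (the only assumption is that the result supports len()).

```python
def summarize(result):
    """Describe a query result that may be None (old) or empty (new)."""
    if result is None:
        # Legacy behaviour: a query without a match returned None.
        return "no match"
    if len(result) == 0:
        # TAP behaviour: an empty table is a valid answer.
        return "no match"
    return f"{len(result)} row(s)"


print(summarize(None))
print(summarize([]))
print(summarize([{"main_id": "M 1"}]))
```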
Caching
Caching in the simbad module is now handled by Python's built-in lru_cache. This might be reverted if we add a caching mechanism to BaseVOQuery (something that'd keep things in a votable-xml format in the default astroquery cache folder maybe?). But I did not go all this way yet.
Changes to query_*** methods (except the criteria argument covered above)
Query_object
The script number ID in query_object is replaced by matched_id, which contains the ID that corresponded to the wildcard expression.
It looks like this:
Query_objects
It now has a typed_id column as requested in #967. The object_number_id replaces script_number_id (but this could be reverted, it's just strange as there is really one script)
Looks like this:
Note that the requested feature (I could not find the issue anymore) that objects not found would return an empty line is there: M503 obviously does not exist.
Query_bibcode
The output is very different.
Former output:
New output:
I know it's a very different output but I really hated the former one. I can try to reproduce the old one but ☹️
Also, it is now possible to retrieve the abstract with abstract=True.
Query_region, query_catalog, query_bibobj, query_object_ids
No big changes
In JWST
As Simbad is used in the tests, I just patched what I could how I could, may not be the prettiest way to go :/
Also, now an empty response returns an empty table and not None, so I reflected that in jwst core.py.
Issues linked to this PR
Fixes: #2198
Fixes: #1468
Partially: #967 (It is done for query_objects, but not for query_region yet)