Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GaiaClass.ROW_LIMIT prevents effective searching #1760

Open
lib-j opened this issue Jul 6, 2020 · 6 comments · May be fixed by #2184
Open

GaiaClass.ROW_LIMIT prevents effective searching #1760

lib-j opened this issue Jul 6, 2020 · 6 comments · May be fixed by #2184

Comments

@lib-j
Copy link

lib-j commented Jul 6, 2020

The GaiaClass.ROW_LIMIT put in from this PR:
#1641

Now limits all queries to return only the top 50 objects. There is no way to adjust this row limit without breaking python conventions over class attributes.

A better solution would be to use this attribute (set to unlimited) as a default to all the search methods, with a parameter to these methods for overriding the default.

@keflavich
Copy link
Contributor

What do you mean "breaking python conventions over class attributes"?

You should use a pattern like:

from astroquery.gaia import Gaia
Gaia.ROW_LIMIT = 1000
Gaia.query...

in the present implementation.

I think I follow the general premise here, and it would be nice to have Gaia behave like Vizier, e.g.:

from astroquery.vizier import Vizier
vv = Vizier(row_limit=100)
vv.query...

but that is presently not implemented.

Note that we will not consider changes to set defaults to "unlimited", as that is more likely to result in dangerous undesired requests (the worst case might be requesting terabytes of data unwittingly and the server not rejecting the request; the best case is the server rejecting the request)

@lib-j
Copy link
Author

lib-j commented Jul 6, 2020

I understand the worries about large queries overloading the servers. However at minimum it should be made clear to the user in the documentation that a default is set and how to override this easily. The current default number being limited to 50 is far too small for general users.

This made understanding the results from your documented examples harder to understand as they return more sources than the default size.

Regarding python conventions PEP8 states that constants are the only type of variable that should be fully capitalised, as shown by https://www.python.org/dev/peps/pep-0008/#constants. These should not be generally changed by users, and should be only set during builds. Hence my confusion when this constant attribute had to be changed to return queries with result sizes over 50. Like almost all Gaia queries.

@keflavich
Copy link
Contributor

OK, there's a very good point here that the documentation has not kept up with code changes. There should have been accompanying documentation changes in #1641 that were left out.

Re: constants, it's not clear to me that we violate PEP8; these are class attributes, not global constants. I've seen class-level constants used in this way in other parts of the python ecosystem, but if we are in violation of a specific part of the PEP, I think a refactor and eventual deprecation would be in order. @bsipocz @ceb8 , could you chime in on this?

@lib-j thanks for raising these points, they're important and it's likely we simply overlooked recommendations in the original design and clearly had some oversights in recent changes. We welcome documentation fix PRs, if you're willing. We'd also welcome help on refactoring to meet PEP requirements, if that's the path we decide on (but let's discuss a little more before anyone takes the time to make a sweeping change).

@lib-j
Copy link
Author

lib-j commented Jul 6, 2020

I wasn't intending to do any documentation PR's; I was only looking through the code to try an understand why my search results were not producing the expected values or the same as what the examples returned, hence raising this issue to prevent other users loosing time trying to figure this out.

On the class attributes they are commonly not capitalised as stated in the PEP8 link.

Cheers for looking into this for me and responding so quickly.

@ceb8
Copy link
Member

ceb8 commented Jul 7, 2020

While the capital letter convention does not strictly follow pep8 it is common enough throughout the python ecosystem that I am comfortable leaving it as is. Strict best practice would be to add a getter, but that also seems minor.

The lack of documentation is important, and ideally the function would produce a warning if results have been truncated due to hitting the row limit.

@keflavich
Copy link
Contributor

keflavich commented Jul 16, 2020

To summarize the current state of this issue:

  • Gaia documentation doesn't clearly specify how you should change the ROW_LIMIT
  • Gaia documentation has examples that don't match default behaviors
  • A more convenient mechanism to change the ROW_LIMIT would be helpful

mfonism added a commit to mfonism/astroquery that referenced this issue Oct 24, 2020
mfonism added a commit to mfonism/astroquery that referenced this issue Nov 4, 2020
bsipocz added a commit that referenced this issue Nov 4, 2020
Update documentation for the gaia query object (#1760)
@eerovaher eerovaher linked a pull request Oct 20, 2021 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants