
test: switch from curl to urllib for HTTP requests #29970

Open · wants to merge 1 commit into master from feature/requests-library-usage
Conversation


@iw4p iw4p commented Apr 26, 2024

Switched from invoking curl via subprocess for HTTP requests to using the Python requests library, improving cleanliness and maintainability.

@DrahtBot
Contributor

DrahtBot commented Apr 26, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK AngusP

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Comment on lines 143 to 145
response = requests.get(tarballUrl)
with open(tarball, 'wb') as file:
file.write(response.content)
Member

This is not a refactor. It is a behavior change when an error occurs, for example a network error or a file error.

Member

If you're going to read the entire data into memory, you might as well do the hashing at the same time? Currently, it reads back the same tarball it wrote and passes it to the hasher, which seems kind of a waste.
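The one-pass approach suggested here could look roughly like the following sketch (`copy_and_hash` and `download_and_hash` are hypothetical names for illustration, not functions from the PR):

```python
import hashlib
from urllib.request import urlopen

def copy_and_hash(src, dst, chunk_size=64 * 1024):
    """Stream src to dst while feeding the same bytes to a SHA-256
    hasher, so the file never has to be read back just for hashing."""
    hasher = hashlib.sha256()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
        dst.write(chunk)
    return hasher.hexdigest()

def download_and_hash(url, dest_path):
    # One pass over the data: write to disk and hash at the same time.
    with urlopen(url) as response, open(dest_path, 'wb') as out:
        return copy_and_hash(response, out)
```

This also avoids holding the whole tarball in memory, unlike `response.content` in the snippet under review.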

Author

You're right about the behavior change. I am still not sure whether to use the script or util tag.

Member

It's important to be careful about behavior changes, but I do think this is an improvement: using requests.head is better practice than calling curl and then looking for a string in the output.

Member

laanwj commented Apr 26, 2024

This is a script used for the tests, so adding test tag.

@iw4p iw4p changed the title from "refactor: switch from curl to requests for HTTP requests" to "test: switch from curl to requests for HTTP requests" Apr 26, 2024
Author

iw4p commented Apr 26, 2024

The title's tag has also been changed from refactor to test.

@iw4p iw4p force-pushed the feature/requests-library-usage branch from 38176a7 to ddd27cb on April 26, 2024 17:42
Author

iw4p commented Apr 26, 2024

@laanwj I investigated and found that you modified import requests here. Is there a shell script where I can add pip install requests so that the previous releases, depends DEBUG job passes?

edit: It seems bitcoin/ci/test/00_setup_env_native_previous_releases.sh is responsible for providing everything the environment needs. Shall I add pip install requests there?

Member

laanwj commented Apr 26, 2024

Yes, that's usually how it'd be done.

But gah, requests is an external dependency? That's unfortunate, i don't think we should be adding dependencies unless absolutely necessary. Probably better to do this with Python's built-in urllib or http.client functionality, or keep it as it is.
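A dependency-free existence check using only the standard library might look like this sketch (`tarball_exists` is a hypothetical name; per the review comment above, the existing code calls curl and looks for a string in the output):

```python
from urllib.error import URLError
from urllib.request import Request, urlopen

def tarball_exists(url):
    """HEAD request replacing the curl-and-grep check: any HTTP error
    (e.g. 404) or network failure is treated as 'not there'."""
    try:
        with urlopen(Request(url, method='HEAD')) as response:
            return response.status == 200
    except URLError:  # HTTPError is a subclass of URLError
        return False
```

`response.status` requires Python 3.9+; older versions would use `response.getcode()` instead.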

@iw4p iw4p force-pushed the feature/requests-library-usage branch from ddd27cb to 29e9377 on April 27, 2024 05:30
@iw4p iw4p changed the title from "test: switch from curl to requests for HTTP requests" to "test: switch from curl to urllib for HTTP requests" Apr 27, 2024
@iw4p iw4p force-pushed the feature/requests-library-usage branch from 29e9377 to 7fe94f7 on April 27, 2024 05:34
Author

iw4p commented Apr 27, 2024

I replaced requests with urllib and tested the script by downloading a .tar.gz file; it works fine for me.

['curl', '--remote-name', tarballUrl]
]
try:
response = urlopen(Request(tarballUrl, method='HEAD'))
Member

Is it required to check for 404 in a separate, additional request?

Member

That's how the current code works too: it first does a HEAD request to check that the file is there, then goes on to download it. i don't know the original reasoning behind this, though.
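Collapsed into a single request, the check-then-download logic could be sketched as follows (`fetch_tarball` is a hypothetical helper, not the script's actual code):

```python
from urllib.error import HTTPError
from urllib.request import urlopen

def fetch_tarball(url, dest_path):
    """Single GET: a missing file surfaces as an HTTPError on the same
    request that downloads the data, so no separate HEAD is needed."""
    try:
        with urlopen(url) as response, open(dest_path, 'wb') as out:
            out.write(response.read())
    except HTTPError as err:
        if err.code == 404:
            return False  # mirrors the old "tarball not found" branch
        raise
    return True
```

Since `urlopen` raises before the destination file is opened, a 404 leaves no partial file behind.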

Author

No, I could write it more pythonically with one request, but I wrote it this way to stay close to the previous code. If you agree with one request, I can change it.
Should this change be in a new commit, or should I force-push it onto the previous commit?

Member

Should this change be in a new commit or should I force it on the previous commit?

Squashing seems fine?

Author

I force-pushed to rewrite the git history so that the final change sits in a single commit, ignoring the old ones; before that, I had only used squashing for merging.

Contributor

@AngusP AngusP left a comment

crACK 7fe94f7

if ret:
return ret
try:
response = urlopen(tarballUrl)
Contributor

AFAIK there's a minor behaviour change here: urlopen will follow redirects, whereas curl won't by default

$ curl -I https://httpbin.org/absolute-redirect/3
HTTP/2 302
# ...
>>> from urllib.request import urlopen
>>> response = urlopen("https://httpbin.org/absolute-redirect/3")
>>> response.code
200  # Not 302 because redirects were followed

This should be fine, but worth a mention.

Author

Great point, thank you.

@@ -5,7 +5,7 @@
# file COPYING or http://www.opensource.org/licenses/mit-license.php.
#
# Download or build previous releases.
# Needs curl and tar to download a release, or the build dependencies when
# Needs urllib built-in python library and tar to download a release, or the build dependencies when
Member

If this is part of the standard library, I don't think you need to list it as a requirement.
