Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace awscli v1 with boto3 during environment bootstrapping #1778

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

trhodeos
Copy link
Contributor

Fixes #387

@trhodeos trhodeos changed the title Replace awscliv1 with boto3 Replace awscli v1 with boto3 during environment bootstrapping Mar 27, 2024
madhur-ob
madhur-ob previously approved these changes Mar 27, 2024
Copy link
Collaborator

@madhur-ob madhur-ob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@trhodeos
Copy link
Contributor Author

@madhur-ob sorry I had to fix some formatting / precommit issues. When you get the chance, would you mind re-reviewing?

@trhodeos
Copy link
Contributor Author

Thanks @madhur-ob .. It looks like some tests failed.. Perhaps they are flaky?

Also, would it be possible to run the metaflow.s3-tests? Seems like this'd be a useful gauge to make sure this works as intended!

@@ -4,6 +4,22 @@
from metaflow.exception import MetaflowException


def parse_s3_full_path(s3_uri):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trhodeos is there any specific reason to create a utility function? maybe we can have this snippet live right next to L:91 in metaflow_environment.py. it will help in making the code much more readable (avoiding an extra jump to a different file)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mirrors what we do for azure and gcs. I was just following convention here, happy to pull this inline if you think it's better.


# <scheme>://<netloc>/<path>;<params>?<query>#<fragment>
scheme, netloc, path, _, _, _ = urlparse(s3_uri)
assert scheme == "s3"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

within metaflow code base, we have elected to not use assert statements. also, when the assert statement fails, the user is left no better in their understanding of how to fix the issue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is copied from parse_gcs_full_path.. Happy to remove it though if we don't want it!

bucket = netloc
path = path.lstrip("/").rstrip("/")
if path == "":
path = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't expect this to be None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, removing it makes sense for this use case.. I sound like a broken record at this point, but I just copied this wholesale from parse_gcs_full_path.

assert netloc is not None

bucket = netloc
path = path.lstrip("/").rstrip("/")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the rstrip needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need it for this case, but just keeping this in line with the other parse functions.. We can probably remove if we just inline this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use boto3 instead of awscli python module to copy packages
3 participants