Use boto3 instead of awscli to download code package #474

Closed
wants to merge 4 commits into from

@tkislan tkislan commented Apr 12, 2021

Updated PR from #389
Resolves #387

@tkislan
Author

tkislan commented Apr 12, 2021

No idea why the build failed

Downloading https://homebrew.bintray.com/bottles/qpdf-10.3.1.catalina.bottle.tar.gz
curl: (22) The requested URL returned error: 403 Forbidden
Error: Failed to download resource "qpdf"

@savingoyal
Collaborator

I launched the tests again. Hopefully, it should resolve now. The existing test suite doesn't exercise the files that the PR touches.

@savingoyal
Collaborator

@tkislan Have you tested this PR? I am running into this error - boto3;: 1: boto3;: Syntax error: end of file unexpected (expecting "done")

@tkislan
Author

tkislan commented Apr 19, 2021

I only tested the generated command that downloads the file

Can you please provide more context?
What the whole command looks like, etc

"%s -c \"" % self._python()
+ "import boto3; "
+ "exec('try:\\n from urlparse import urlparse\\nexcept:\\n from urllib.parse import urlparse');"
+ "parsed = urlparse('%s');" % s3_path
Collaborator

@tkislan This parsing can happen locally, while we are generating the get_s3_download_command, rather than in the remote environment. That will save us precious entry point characters.
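
A minimal sketch of what parsing locally could look like, assuming Python 3 on the machine that generates the command. `split_s3_path` is a hypothetical helper for illustration, not something in the PR, and the example URL is the one from the command posted later in this thread.

```python
from urllib.parse import urlparse


def split_s3_path(s3_path):
    """Split an s3:// URL into (bucket, key) before the command is generated."""
    parsed = urlparse(s3_path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("expected an s3://bucket/key URL, got %r" % s3_path)
    # netloc is the bucket; the path has a leading slash that S3 keys do not.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key = split_s3_path(
    "s3://def3b3-mf-test/ParameterFlow/data/24/243633f93d432b2431cfac4518a44f4e9eb68881"
)
```

With the bucket and key resolved up front, the remote snippet no longer needs the urlparse import or the Python 2/3 fallback.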

"""

return (
"%s -c \"" % self._python()
Collaborator

I doubt \" would work here given that we use shlex downstream for parsing the entry point.

@savingoyal
Collaborator

I only tested the generated command that downloads the file

Can you please provide more context?
What the whole command looks like, etc

The issue is with incorrect quoting given that we use shlex downstream. This is what the command looks like -

"/bin/sh","-c","set -e && echo 'Setting up task environment.' && python -m pip install click requests boto3 -qqq && mkdir metaflow && cd metaflow && mkdir .metaflow && i=0; while [ $i -le 5 ]; do echo 'Downloading code package.'; python -c import","boto3;","exec(try:\\n from urlparse import urlparse\\nexcept:\\n from urllib.parse import urlparse);parsed","=","urlparse(s3://oss-dev-metaflows3bucket-13oayj2nh8n4l/metaflow/ForeachFlow2q/data/95/950443769eee3cac5d35c5fa724bf390d5093b2b);boto3.client(s3).download_file(parsed.netloc,parsed.path.lstrip(/),","job.tar); >/dev/null &&                         echo 'Code package downloaded.' && break; sleep 10;...
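
The token list above is consistent with unescaped double quotes closing an outer double-quoted string early. The sketch below reproduces the same pattern with `shlex.split` on a shortened, made-up entry point. The strings are illustrative assumptions, not the real Metaflow entry point, and the exact downstream code path is not shown in this thread.

```python
import shlex

# Hypothetical, shortened stand-in for the generated download step.
inner = 'python -c "import boto3; print(1)"'

# Wrapping the script in double quotes: the inner quotes end the outer string.
broken = '/bin/sh -c "echo start && %s && echo done"' % inner
broken_tokens = shlex.split(broken)
# The script handed to sh now stops at "python -c import", and "boto3;"
# becomes a separate argument.

# Quoting the whole script once with shlex.quote keeps it as a single token.
script = "echo start && %s && echo done" % inner
fixed = "/bin/sh -c %s" % shlex.quote(script)
fixed_tokens = shlex.split(fixed)

print(broken_tokens)
print(fixed_tokens)
```

When sh receives the broken form, the stray token after the script becomes `$0`, which would explain why the error message is prefixed with `boto3;:`.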

@savingoyal savingoyal self-assigned this Apr 20, 2021
@tkislan
Author

tkislan commented Apr 20, 2021

@savingoyal Thanks for the pointer to shlex. Honestly, I didn't delve into the code that much, as I had hoped the previous PR was almost done.

This is the new full command that it generates, and I was able to run it locally

['/bin/sh', '-c', "set -e && echo 'Setting up task environment.' && python -m pip install click requests boto3 -qqq && mkdir metaflow && cd metaflow && mkdir .metaflow && i=0; while [ $i -le 5 ]; do echo 'Downloading code package.'; python -c 'import sys; import boto3; boto3.client(sys.argv[1]).download_file(sys.argv[2], sys.argv[3], sys.argv[4])' s3 'def3b3-mf-test' 'ParameterFlow/data/24/243633f93d432b2431cfac4518a44f4e9eb68881' job.tar >/dev/null &&                         echo 'Code package downloaded.' && break; sleep 10; i=$((i+1)); done && if [ $i -gt 5 ]; then echo 'Failed to download code package from s3://def3b3-mf-test/ParameterFlow/data/24/243633f93d432b2431cfac4518a44f4e9eb68881 after 6 tries. Exiting...' && exit 1; fi && tar xf job.tar && echo 'Task is starting.' && sleep 10"]

I had to go for the sys.argv solution, as I wasn't able to get the quote escaping right inside the single-quoted string.
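
For illustration, a sketch of how the sys.argv form can be built so that the Python snippet itself contains no quotes. `build_download_command` is a hypothetical name, not the function in the PR, and it only assumes `shlex.quote` from the standard library plus the `download_file(bucket, key, filename)` call already used in the command above.

```python
import shlex


def build_download_command(python, bucket, key, dest="job.tar"):
    """Return a shell-safe command string that downloads s3://bucket/key to dest."""
    # No string literals in the snippet: every value arrives through sys.argv.
    snippet = (
        "import sys; import boto3; "
        "boto3.client(sys.argv[1]).download_file(sys.argv[2], sys.argv[3], sys.argv[4])"
    )
    args = [python, "-c", snippet, "s3", bucket, key, dest]
    # Quote each argument separately so the whole command survives shell parsing.
    return " ".join(shlex.quote(a) for a in args)


cmd = build_download_command(
    "python",
    "def3b3-mf-test",
    "ParameterFlow/data/24/243633f93d432b2431cfac4518a44f4e9eb68881",
)
print(cmd)
```

Because the snippet contains no quote characters, it can be wrapped in single quotes without any escaping, and `shlex.split(cmd)` returns the original seven arguments.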

@tkislan tkislan requested a review from savingoyal April 23, 2021 07:54
@tkislan
Author

tkislan commented Apr 30, 2021

@savingoyal is anything still blocking this?

@tkislan
Author

tkislan commented Aug 9, 2021

Closing, as nobody cares ..

@tkislan tkislan closed this Aug 9, 2021
@tkislan tkislan deleted the s3-boto3-download branch August 9, 2021 06:18

Successfully merging this pull request may close these issues.

Use boto3 instead of awscli python module to copy packages
2 participants