Build Docker image and push to GHCR #230

br3ndonland · 2024-04-19T01:51:01Z

Description

Closes #58

Up to this point, the project has been set up as a Docker action referencing the Dockerfile.

Lines 86 to 88 in 3fbcf7c

 runs: 

 using: docker 

 image: Dockerfile

The downside to using the Dockerfile for the action is that the Docker image must be built every time the action is used (#58).

This PR will set up the project to build the Docker image and push it to GitHub Container Registry (GHCR). This change will speed up user workflows every time the action is used because the workflows will simply pull the Docker image from GHCR instead of building again.

Changes

Build container image with GitHub Actions

This PR will build Docker images with the Docker CLI (docker build). Builds will include inline cache metadata so layers can be reused by future builds.

This PR only proposes to build container images for x86_64 (linux/amd64) because GitHub Actions Linux runners currently only support x86_64 CPU architectures (actions/runner-images#5631), and this project only supports GitHub Actions Linux runners. The README explains:

Since this GitHub Action is docker-based, it can only be used from within GNU/Linux based jobs in GitHub Actions CI/CD workflows. This is by design and is unlikely to change due to a number of considerations we rely on.

Push container image to GHCR

The workflow will log in to GHCR using the built-in GitHub token and push the Docker image. Workflow runs triggered by pull requests will build the Docker image and run the smoke tests but will not push the Docker image.

Update action to pull container image from GHCR

Docker actions support pulling in pre-built Docker images by supplying a registry address to the image: key. The downside to this syntax is that there's no way to specify the correct Docker tag because the GitHub Actions image: and uses: keys don't accept any context. For example, if a user's workflow has uses: pypa/gh-action-pypi-publish@release/v1.8, then the action should pull in a Docker image built from the release/v1.8 ref, something like ghcr.io/pypa/gh-action-pypi-publish:release-v1.8 (Docker tags can't have /).

# this works but the image tag can't be customized
runs:
  using: docker
  image: docker://ghcr.io/pypa/gh-action-pypi-publish:release-v1.8

# this doesn't work because `image:` doesn't support context
runs:
  using: docker
  image: docker://ghcr.io/pypa/gh-action-pypi-publish:${{ github.action_ref }}

The workaround is to switch the top-level action.yml to a composite action that then calls the Docker action, substituting the correct image name and tag.

Originally I thought that I'd have a workflow where I trigger a release, that release adds a commit that hardcodes an update to action.yml with the "future" version tag, tags that commit and pushes it (post docker publish). It wouldn't be on the main branch, the tags would be on the orphaned leaves.

This looks like a better idea so far. Thanks again!

pypa#230 (comment)

br3ndonland · 2024-04-26T23:12:16Z

That sounds great. Take your time. Thanks for your consideration.

If you do decide to accept this change, I'm happy to help maintain the workflows in the future. Feel free to mention me @br3ndonland and I will help address any issues that come up.

pypa#230 (comment)

webknjaz · 2024-05-16T15:56:33Z

Thanks! I've hit "rebase" on the UI to get this on top of the recent changes/linting/lockfile bumps but haven't yet looked into it deeper.

.github/workflows/build-and-push-docker-image.yml

webknjaz · 2024-05-16T19:10:30Z

.github/actions/run-docker-container/action.yml

+ args:
+ - ${{ inputs.user }}
+ - ${{ inputs.password }}
+ - ${{ inputs.repository-url }}
+ - ${{ inputs.packages-dir }}
+ - ${{ inputs.verify-metadata }}
+ - ${{ inputs.skip-existing }}
+ - ${{ inputs.verbose }}
+ - ${{ inputs.print-hash }}


These aren't actually used. Somebody contributed them, but I don't see them being needed. So perhaps we shouldn't keep them around anymore.

Aren't the inputs used by the entrypoint script?

gh-action-pypi-publish/twine-upload.sh

Lines 37 to 41 in 699cd61

INPUT_REPOSITORY_URL="$(get-normalized-input 'repository-url')"

INPUT_PACKAGES_DIR="$(get-normalized-input 'packages-dir')"

INPUT_VERIFY_METADATA="$(get-normalized-input 'verify-metadata')"

INPUT_SKIP_EXISTING="$(get-normalized-input 'skip-existing')"

INPUT_PRINT_HASH="$(get-normalized-input 'print-hash')"

.github/workflows/build-and-push-docker-image.yml

webknjaz · 2024-05-16T19:18:18Z

.github/workflows/self-smoke-test-action.yml

 pull_request:
+ workflow_run:


Two separate workflow runs is often hard to track. Instead, I adopted a practice of modularizing the workflow pieces as reusable workflows having the reusable- prefix in their names. This allows embedding everything in all the right places. Let's try this, WDYT?

I will return to this suggestion at a later time.

.github/workflows/self-smoke-test-action.yml

.pre-commit-config.yaml

Dockerfile

webknjaz · 2024-05-16T19:37:27Z

action.yml

+ steps:
+ - name: Reset path if needed
+ run: |
+ # Reset path if needed


Why would this be needed outside the container?

Because you've set up a test that modifies the $PATH (#112, 1350b8b).

gh-action-pypi-publish/.github/workflows/self-smoke-test-action.yml

Lines 85 to 90 in 699cd61

- name: ✅ Smoke-test the locally checked out action

uses: ./test

env:

DEBUG: >-

true

PATH: utter-nonsense

action.yml

webknjaz · 2024-05-16T19:45:43Z

action.yml

+ - name: Set Docker image name and tag
+ run: |
+ # Set Docker image name and tag
+ # if action run was triggered by a pull request to this repo,
+ # build image from Dockerfile because it has not been pushed to GHCR,
+ # else pull image from GHCR
+ if [[ $GITHUB_EVENT_NAME == "pull_request" ]] &&
+ [[ $GITHUB_REPOSITORY == "pypa/gh-action-pypi-publish" ]]; then
+ IMAGE="../../../Dockerfile"
+ else
+ REF=${{ steps.set-repo-and-ref.outputs.ref }}
+ REPO=${{ steps.set-repo-and-ref.outputs.repo }}
+ IMAGE="docker://ghcr.io/$REPO:${REF/'/'/'-'}"
+ fi
+ FILE=".github/actions/run-docker-container/action.yml"
+ sed -i -e "s|{{image}}|$IMAGE|g" "$FILE"
+ shell: bash


So I was thinking… Why do we need to check in action.yml to Git even if this modifies it anyway? Why not generate it all, then?

Also, I'd use Python instead of Bash here. Then, it'd be possible to have a dict with data and write it as YAML using json.dump() (because JSON is valid YAML, almost always).

That could work. I've added a Python script that generates the Docker action (9ae1850).

Regarding your related comment:

I think that generating the file is a good idea. It should be possible to write the file without bringing in the PyYAML dependency. But it's not that easy for reading it. Can we make use of yq somehow, and convert YAML to JSON this way, maybe?

I've started by using PyYAML but will think about a way to do this with yq.

.github/actions/run-docker-container/action.yml

webknjaz · 2024-05-16T19:50:43Z

action.yml

+ user: ${{ inputs.user }}
+ password: ${{ inputs.password }}
+ repository-url: ${{ inputs.repository-url || inputs.repository_url }}
+ packages-dir: ${{ inputs.packages-dir || inputs.packages_dir }}


This would break the deprecation messages, it seems. Have you checked?

I don't think so. The new inputs don't have defaults, so for example, ${{ inputs.repository-url || inputs.repository_url }} will default to inputs.repository_url and deprecationMessage will be logged.

webknjaz · 2024-05-16T20:01:27Z

I'm done with the initial review. More is needed, but I'd rather accept what I can through separate PRs to make this one smaller. And the suggested refactoring could be done in parallel. I think that generating the file is a good idea. It should be possible to write the file without bringing in the PyYAML dependency. But it's not that easy for reading it. Can we make use of yq somehow, and convert YAML to JSON this way, maybe?

br3ndonland · 2024-05-27T20:43:49Z

@webknjaz thank you for your detailed review. I've addressed most of your comments so far.

webknjaz · 2024-05-29T14:40:43Z

Commits like 213c885ac41d769527ac150e2e633bb1ccd886d4 aren't really necessary since Git would automatically absorb changes applied separately. FYI. In fact, this one may have a harmful effect — when the label change is merged into the default branch, and this one rebased on top, Git will keep the removal commit and attempt deleting the label 🤷‍♂️

Anyway, we'll address this later on.

webknjaz · 2024-05-29T14:49:15Z

create-docker-action.py

+
+
+def set_image(event: str, ref: str, repo: str) -> str:
+ if event == 'pull_request' and 'gh-action-pypi-publish' in repo:


Why do you check for 'gh-action-pypi-publish' in repo? How about forks that renamed it?

If the workflow was triggered by a PR to this repo (pypa/gh-action-pypi-publish), then the action should build the Docker image. Else, the action should be using the GHCR Docker image.

This is important because workflows in other repos might run the marketplace action on PRs. For example, they might want to smoke test package uploads using DevPI like this:

name: Smoke test workflow in a different repo on: pull_request: env: devpi-password: abcd1234 devpi-username: root devpi-port: 3141 jobs: smoke-test: runs-on: ubuntu-latest services: devpi: image: muccg/devpi env: DEVPI_PASSWORD: ${{ env.devpi-password }} ports: - 3141 timeout-minutes: 2 steps: - name: Check out the action locally uses: actions/checkout@v4 - name: Install the packaging-related tools run: python3 -m pip install build twine - name: Create the stub package importable directory run: mkdir -pv src/test_package - name: Populate the stub package `__init__.py` run: echo '__version__ = "0.1"' > src/test_package/__init__.py - name: Populate the stub package `README.md` run: echo "# Test Package" > README.md - name: Populate the stub package `pyproject.toml` run: echo "$CONTENTS" > pyproject.toml env: CONTENTS: | [build-system] requires = [ "setuptools == 65.6.3", ] build-backend = "setuptools.build_meta" [project] name = "test-package" version = "0.1" readme = "README.md" - name: Build the stub package sdist and wheel distributions run: python3 -m build - name: Register the stub package in devpi run: twine register dist/*.tar.gz env: TWINE_USERNAME: ${{ env.devpi-username }} TWINE_PASSWORD: ${{ env.devpi-password }} TWINE_REPOSITORY_URL: >- http://localhost:${{ job.services.devpi.ports[env.devpi-port] }}/${{ env.devpi-username }}/public/ - name: Smoke-test the marketplace action uses: pypa/gh-action-pypi-publish@unstable/v1 with: user: ${{ env.devpi-username }} password: ${{ env.devpi-password }} repository-url: http://devpi:${{ env.devpi-port }}/${{ env.devpi-username }}/public/

In that case, github.event_name would be pull_request, but the repo name would be different.

As far as handling PRs to/from renamed forks of this repo (pypa/gh-action-pypi-publish), if you have some other way of specifying "this repo" that doesn't include gh-action-pypi-publish, I'm happy to consider it.

webknjaz · 2024-05-29T14:56:15Z

action.yml

+ - name: Create Docker container action
+ run: |
+ # Create Docker container action
+ python -m pip install -r requirements/github-actions.txt


This is not the right way of using constraint files. Here's what should be done instead:

Suggested change

python -m pip install -r requirements/github-actions.txt

PIP_CONSTRAINT=requirements/github-actions.txt python -Im pip install --only-binary=:all: -r requirements/github-actions.in

If we go with the latest approach of dumping JSON to a .yml file, we won't need to deal with the PyYAML requirement.

webknjaz · 2024-05-29T15:04:32Z

requirements/github-actions.txt

+# This file is autogenerated by pip-compile with Python 3.12
+# by the following command:
+#
+# pip-compile --allow-unsafe --config=../.pip-tools.toml --output-file=github-actions.txt --strip-extras github-actions.in


Compiling PyYAML on various VMs might be problematic, so I'd pass --only-binary=:all: via pip-opts or even put it into the config. This way, installing would be more predictable on different versions of Ubuntu VMs that GitHub has.

Another concern is that originally, I opted for a docker-based action so that the external runtime is not mutated (that's one of a few reasons). By installing things with pip we violate that promise. I was kinda hoping that it'd be possible to avoid using a third-party dependency at all.

If we go with the latest approach of dumping JSON to a .yml file, we won't need to deal with the PyYAML requirement.

webknjaz · 2024-05-29T15:05:49Z

action.yml

+ - name: Create Docker container action
+ run: |
+ # Create Docker container action
+ python -m pip install -r requirements/github-actions.txt


Another thing to consider, provided that it's actually needed (see https://github.com/pypa/gh-action-pypi-publish/pull/230/files#r1619055808), is making a virtualenv managed by this action (in its checkout/tmp dir so that the end-user's projects aren't polluted).

If we go with the latest approach of dumping JSON to a .yml file, we won't need to deal with the PyYAML requirement.

webknjaz · 2024-05-29T15:07:56Z

create-docker-action.py

+with action_path.open(mode='w', encoding='utf-8') as file:
+ yaml.dump(action, file, allow_unicode=True, sort_keys=False)


Could you check if using just

Suggested change

with action_path.open(mode='w', encoding='utf-8') as file:

yaml.dump(action, file, allow_unicode=True, sort_keys=False)

action_path.write_text(json.dumps(action), encoding='utf-8')

would work?

Most of the time, any valid JSON is also valid YAML so GHA should be able to read it...

Alternatively, we could have a template for the YAML file that would be post-processed by Template strings, str.format_map() or even just f-strings.

Any of these would let us avoid having an external dependency.

I was able to get it to work by dumping JSON to a .yml file.

webknjaz · 2024-05-29T20:03:55Z

@br3ndonland I'm sure my review is incomplete, but hopefully it gives you enough ideas to try out until the next time.

I think it'd be nice to get this in before #236 if at all possible.

webknjaz · 2024-05-29T20:28:13Z

@br3ndonland could you also rebase this branch locally? The GH button doesn't work, meaning there's going to be some conflicts to resolve.

Up to this point, the project has been set up as a Docker action referencing the Dockerfile. The downside to using the Dockerfile for the action is that the Docker image must be built every time the action is used. This commit will set up the project to build the Docker image and push it to GitHub Container Registry (GHCR). This change will speed up user workflows every time the action is used because the workflows will simply pull the Docker image from GHCR instead of building again. Changes: - Add required metadata to Dockerfile - Build container image with GitHub Actions - Push container image to GHCR Docker actions support pulling in pre-built Docker images. The downside is that there's no way to specify the correct Docker tag because the GitHub Actions `image` and `uses:` keys don't accept any context. For example, if a user's workflow has `uses: pypa/gh-action-pypi-publish@release/v1.8`, then the action should pull in a Docker image built from the `release/v1.8` branch, something like `ghcr.io/pypa/gh-action-pypi-publish:release-v1.8` (Docker tags can't have `/`). The workaround is to switch the top-level `action.yml` to a composite action that then calls the Docker action, substituting the correct image name and tag.

pypa#230 (comment)

br3ndonland · 2024-05-31T01:09:00Z

@br3ndonland could you also rebase this branch locally? The GH button doesn't work, meaning there's going to be some conflicts to resolve.

No problem. I'm not sure what "the GH button" is (some GitHub Mobile thing?) but I've rebased the branch.

I will address the other comments soon. Looks like there's a small git fetch issue in the smoke test workflow so I will look into that as well.

actions/checkout#27 (comment) actions/checkout#1108

br3ndonland · 2024-06-01T21:50:55Z

I was able to get it to work by dumping JSON to a .yml file. The file must have a .yml or .yaml extension (GitHub Actions will raise "Error: Can't find 'action.yml', 'action.yaml' or 'Dockerfile'" on .json files), but the file contents can be JSON.

webknjaz reviewed Apr 25, 2024

View reviewed changes

.github/workflows/build-and-push-docker-image.yml Show resolved Hide resolved

webknjaz mentioned this pull request Apr 26, 2024

[Roadmap] Implement PEP 740 pypi/warehouse#15871

Open

16 tasks

br3ndonland added a commit to br3ndonland/gh-action-pypi-publish that referenced this pull request Apr 26, 2024

Separate docker login and docker push

bb0315f

pypa#230 (comment)

webknjaz pushed a commit to br3ndonland/gh-action-pypi-publish that referenced this pull request May 16, 2024

Separate docker login and docker push

3384fad

pypa#230 (comment)

webknjaz force-pushed the ghcr branch from bb0315f to 3384fad Compare May 16, 2024 15:55