Setting up GitHub actions and repo2docker for multiple images #25

jmunroe · 2023-11-14T21:00:59Z

Community maintained images currently are based on a template where a GitHub action builds one docker image per repository.

Is it possible to set up GitHub actions to allow for one docker image PER repository subdirectory?
Can these community maintained images import from each other?

This kind of functionality is already available in jupyter-docker-stacks and pangeo-docker projects. Can it be set up within this repository so that more than one image environment can be maintained in the same repository?

Or, is it a built-in requirement of how repo2docker works that there is a 1:1 mapping between the repository and the docker image?


jmunroe · 2023-11-14T21:07:19Z

@sgibson91 I am hoping you'll either be interested in taking up this issue or #21 during Sprint 3. I think it may be too much to tackle both #21 and #25 in Sprint 3 but I invite your input on which task makes more sense for you to be involved in.

sgibson91 · 2023-11-15T17:03:47Z

> Is it possible to set up GitHub actions to allow for one docker image PER repository subdirectory? [...] This kind of functionality is already available in jupyter-docker-stacks and pangeo-docker projects. Can it be set up within this repository so that more than one image environment can be maintained in the same repository?

There's no technical reason why we can't do this. Is the scope of this task for this repo only? I ask because multiple images in a repo may be a complex thing for communities, especially if they're new to the concept. But I'm happy for it to be just this one for now.

> Or, is it a built-in requirement of how repo2docker works that there is a 1:1 mapping between the repository and the docker image?

I don't think repo2docker knows the concept of repos, it only knows the concept of a filesystem. So the command would look like repo2docker <subdir> instead of repo2docker .

> Can these community maintained images import from each other?

Traditionally, if you wanted to build a Docker image using another one as a base, you had to provide a Dockerfile with a FROM statement in it. And that is already "here be dragons" territory for repo2docker. This is why jupyter-docker-stacks and pangeo-docker don't actually build their images using repo2docker, but docker directly.

There was recently some work done in repo2docker that would permit defining a base image to use (jupyterhub/repo2docker#909), so hooray, we can continue using repo2docker! However, I'm not sure if that work has percolated upwards such that the setting is exposed in the repo2docker action we use in GitHub Actions. Either the action has an ability to pass arbitrary args to the repo2docker command, which will solve our problem, or some upstream dev work in the action will be required to support this. A little bit of research is needed.

jmunroe · 2023-11-15T21:10:42Z

Understood about the complexity of having one repo2docker workflow build an image that in turn would be the base image for another repo2docker image, especially if they exist in the same repository. Let's exclude that requirement from this issue.

The scope is for this repo only. For the purposes of this Showcase hub, I want to be able to demonstrate a few different repositories and it seems confusing to me to create a separate repo for each one. But, I will commit to each repo being defined entirely from the contents of a subdirectory (following the specification given here) with any image dependency (FROM <base_image> type usage) only being used when that base_image is fully defined outside of this repo.

It is exactly because I know I can use repo2docker <sub_dir> that I was hoping this would not be difficult. The challenge for me is getting the GitHub action set up correctly.

I am imagining a file structure for this repo like

README.md
.github/workflows/
handbook/
  *.yml
  *.md
  *.ipynb
images/
  image1/
    environment.yml
  image2/
    Dockerfile
  image3/
    install.R
    apt.txt
  image4/
    requirements.txt
    postBuild

I can see that this could be done by having a separate, dedicated GitHub action for each image. But I also suspect that, with some sort of arg passing, there may be a way to set it up so that one GitHub action is triggered and each imageN/ subdirectory is built independently. All images can use the same quay.io secrets and be pushed to the same container registry.

There doesn't appear to be a way of setting the image name within the repo2docker configuration specification (a possible upstream contribution?). But I think using the directory name and reponame so images get called something like {QUAYID}/{REPONAME}-{subdir} might be reasonable (e.g., quay.io/2i2c/community-showcase-image1). It also could be fine if some configuration of the GitHub action needs to occur if a new image subdirectory is created.
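As a rough illustration of the naming idea above, here is a minimal Python sketch that derives an image name from a subdirectory path. The function name `image_name` and its parameters are hypothetical, not part of repo2docker or the action; the lowercasing is there because Docker image names must be lowercase.

```python
from pathlib import PurePosixPath


def image_name(registry_org: str, repo_name: str, subdir: str) -> str:
    """Return '<registry_org>/<repo_name>-<leaf>' for an image subdirectory.

    Hypothetical helper for illustration only.
    """
    # Keep only the last path component, so 'images/image1/' becomes 'image1'.
    leaf = PurePosixPath(subdir).name
    # Docker image names must be lowercase.
    return f"{registry_org}/{repo_name}-{leaf}".lower()


if __name__ == "__main__":
    print(image_name("quay.io/2i2c", "community-showcase", "images/image1/"))
```

With the example values from the comment above, this prints `quay.io/2i2c/community-showcase-image1`.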

@sgibson91 is this specified clearly enough for you? If there is any confusion, please throw something in my calendar so we can chat about it.

sgibson91 · 2023-11-16T10:24:09Z

> There doesn't appear to be a way of setting the image name within the repo2docker configuration specification

You can set the image name, both in repo2docker and the repo2docker-action, but it is a command line argument, not something in the config spec:
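For illustration, a small Python sketch of assembling such a command line for one image subdirectory. It only builds the argument list and does not run anything; the helper name `repo2docker_command` is hypothetical, while `--image-name`, `--no-run`, and `--push` are existing repo2docker CLI flags.

```python
import shlex


def repo2docker_command(subdir: str, image_name: str, push: bool = False) -> list[str]:
    """Build (but do not run) a repo2docker invocation for one subdirectory.

    Hypothetical helper for illustration only.
    """
    # --no-run: build the image without starting a container afterwards.
    cmd = ["repo2docker", "--no-run", "--image-name", image_name]
    if push:
        # --push: push the built image to the registry named in image_name.
        cmd.append("--push")
    # The positional argument is the directory to build from.
    cmd.append(subdir)
    return cmd


if __name__ == "__main__":
    cmd = repo2docker_command(
        "images/image1", "quay.io/2i2c/community-showcase-image1"
    )
    print(shlex.join(cmd))
```

The resulting list could be handed to `subprocess.run` on a machine where repo2docker and Docker are installed.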

> is this specified clearly enough for you?

Yes. I have a couple of research questions I can pursue on my own (indeed around figuring out arg passing so we only require one workflow file for the repo, rather than a workflow file per image subdir), but other than that I think I can make progress on this.

sgibson91 · 2023-11-16T12:55:32Z

Bookmarking some info I gathered on changing the base image for repo2docker. I know this is no longer a requirement but wanted to make sure the record is kept somewhere.

https://2i2c.slack.com/archives/CKJS000F4/p1700138248103149

CLI invocation: repo2docker --Repo2Docker.base_image IMAGE_NAME:TAG .
Can also include in the folder a repo2docker_config.py file containing c.Repo2Docker.base_image = "IMAGE_NAME:TAG"

sgibson91 · 2023-11-21T10:23:45Z

Successful matrix deployment based on files changed in sub-paths!

PR: [DO NOT MERGE] Testing matrix workflow #35
Related GitHub Action run: https://github.com/2i2c-org/community-showcase/actions/runs/6942368076?pr=35

Now to add the repo2docker bit in
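The workflow in the PR is not reproduced in this thread, so the following is only a sketch of the general idea: map a list of changed file paths to the image subdirectories they touch, and emit that as JSON that a GitHub Actions matrix could consume via `fromJSON`. The function names, the `images` root, and the `image` matrix key are assumptions for illustration.

```python
import json
from pathlib import PurePosixPath


def changed_image_dirs(changed_files: list[str], images_root: str = "images") -> list[str]:
    """Return the sorted, de-duplicated image subdirectories touched by changed_files."""
    dirs = set()
    for changed in changed_files:
        parts = PurePosixPath(changed).parts
        # Only count files inside '<images_root>/<image>/...'.
        if len(parts) >= 3 and parts[0] == images_root:
            dirs.add(f"{images_root}/{parts[1]}")
    return sorted(dirs)


def build_matrix(changed_files: list[str]) -> str:
    """Serialise the touched image directories as a JSON matrix definition."""
    return json.dumps({"image": changed_image_dirs(changed_files)})


if __name__ == "__main__":
    files = [
        "README.md",
        "images/image1/environment.yml",
        "images/image3/apt.txt",
        "images/image3/install.R",
    ]
    print(build_matrix(files))
```

Files outside the images directory, such as the README or the handbook, produce no matrix entries, so no image is rebuilt for them.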

sgibson91 · 2023-11-21T16:20:27Z

Tomorrow, I'll add a section to the readme documenting how to add an image to the repo given the current workflow and the assumptions it makes.

jmunroe · 2023-11-21T17:56:47Z

Looking forward to trying it out!

The quick test I did triggered the workflow and built a new image but didn't appear to actually push it to quay.io. I verified that the corresponding quay.io repository of the same name already existed but that didn't make a change. Perhaps there is a step I am missing to get the workflow to push to quay.io?

sgibson91 · 2023-11-22T11:04:04Z

Yep - in my experience, merging a workflow and having it work first time is a holy grail quest 😄 I'll dig in today

sgibson91 · 2023-11-22T11:52:06Z

Ok, so it seems there was something iffy about setting the NO_PUSH variable with a conditional as this commit got us to the pushing step.

However, there is then a permissions error, even though I have given the bot account write permissions on the repo to push to: https://github.com/2i2c-org/community-showcase/actions/runs/6956812443/job/18928433370?pr=40#step:6:3849

sgibson91 · 2023-11-22T12:14:29Z

I think there is now something up with the credentials (but I don't know what) as I tried to add an explicit login step before running the repo2docker action (which should not be required) and that failed as well

sgibson91 · 2023-11-22T12:45:06Z

I forgot secrets are not available in PRs

sgibson91 · 2023-11-28T11:37:31Z

If jupyterhub/repo2docker-action#107 is merged, I will be able to achieve a workflow where images are built in both PRs and pushes to the default branch, but are only pushed to the registry when the PR is merged to the default branch, without having duplicate workflow files.
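The build-everywhere, push-only-on-merge rule described above can be sketched as a small decision function. This is an assumption-laden illustration, not the action's implementation: `should_push` is a hypothetical name, and the inputs mirror the values GitHub Actions exposes as `GITHUB_EVENT_NAME` and `GITHUB_REF`.

```python
def should_push(event_name: str, ref: str, default_branch: str = "main") -> bool:
    """Return True only for push events on the default branch.

    Pull request builds still run, but their images stay local.
    """
    return event_name == "push" and ref == f"refs/heads/{default_branch}"


if __name__ == "__main__":
    # A pull request build: build only.
    print(should_push("pull_request", "refs/pull/40/merge"))
    # A merge to the default branch: build and push.
    print(should_push("push", "refs/heads/main"))
```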

jmunroe · 2023-11-28T15:44:16Z

I have had success in using these new actions to make modifications to images and maintain multiple repo2docker based images from the same repository. It is working as I expect!

One last step in my workflow is to recover the tag that is automatically generated in the build process and pushed to quay.io. I can manually look it up in either the GitHub action log, or go to quay.io for that image. Would it be possible to extract this tag from the log and make it more easily discoverable? I don't know "where" this information would go ... I most immediately need it when I am spinning up a new instance in the hub and use the 'Other...' image dialog to verify that the new image has the new functionality I intended. Here's the workflow I am currently using:

1. An image needs a new/updated package, extension, or some other feature added to it.
2. Clone the community-showcase hub within an existing JupyterHub instance.
3. Make changes in the configuration files for building the image and commit the change.
4. Push the changes back to GH and merge into the main branch to determine
   a. if the modified image actually built without errors
   b. what the newly assigned tag is for the image that has been pushed to quay.io
   [ If the build-but-not-push option for a PR was set up, I suppose I could first push to a PR, wait for the build to occur without error, then merge into main, then wait for the build to occur again, but that seems like an extra step I am not sure I would actually want to do! ]
5. Browse to https://showcase.2i2c.cloud, select a custom image via 'Other...', and test this newly built image.
6. Go back to step 3 and make another configuration change because I almost invariably missed something.
7. Having verified that the new image is working as expected, send a note to my community to use this new image tag via the 'Other...' UI (if it requires feedback from only a few people) or make a PR on the infrastructure repo to update the image under the ProfileList so that the image is available to all hub users.

In some future state, I can imagine an improvement to the ProfileList that goes and looks up recent tags for a given image.

Since the change-build-test loop for making a change to an image repository is slow, I also experimented with building and testing changes to an image on a local computer using repo2docker. With cached build files, this can be a faster process when I am making configuration changes to solve a problem whose full solution I do not already know.

sgibson91 · 2023-11-28T16:59:07Z

I think the simplest implementation to unearth the built and pushed tag would be to:

1. Output the tag from the build command into the GitHub Actions context
2. Within this context, make some GitHub API calls to leave a comment on the PR that effected the change in the image, reflecting the new image tag

This will be a little bit like the workflow in the infrastructure repo that extracts the GitHub Actions run that was triggered by a merge into the default branch, and comments on the PR the merge commit came from with a link to the running workflow triggered by the merge.
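A minimal sketch of the first step and of the comment text, assuming the full image reference (name plus tag) is already known after the build. The helper names and the `image_tag` output key are hypothetical; the only GitHub-specific mechanism relied on is that a step sets outputs by appending `name=value` lines to the file named by the `GITHUB_OUTPUT` environment variable. Posting the comment through the GitHub API is left out.

```python
import os


def split_image_ref(image_ref: str) -> tuple[str, str]:
    """Split 'registry/org/name:tag' into (name, tag); tag is '' if absent."""
    name, sep, tag = image_ref.rpartition(":")
    # A '/' after the last ':' means the colon belonged to a registry port.
    if not sep or "/" in tag:
        return image_ref, ""
    return name, tag


def write_output(key: str, value: str, output_path: str) -> None:
    """Append a 'key=value' line to a GitHub Actions output file."""
    with open(output_path, "a", encoding="utf-8") as handle:
        handle.write(f"{key}={value}\n")


def comment_body(name: str, tag: str) -> str:
    """Format the text of a PR comment announcing the pushed image."""
    return f"Image built and pushed: `{name}:{tag}`"


if __name__ == "__main__":
    # 'abc123' is a made-up tag used purely for illustration.
    name, tag = split_image_ref("quay.io/2i2c/community-showcase-image1:abc123")
    output_file = os.environ.get("GITHUB_OUTPUT")
    if output_file:
        write_output("image_tag", tag, output_file)
    print(comment_body(name, tag))
```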

jmunroe · 2024-01-08T19:13:33Z

Based on @sgibson91's efforts, I think we can mark this issue as 'Completed'!

jmunroe mentioned this issue Nov 14, 2023

Q4 Community Building Sprint 3 #13

Closed

jmunroe assigned sgibson91 Nov 14, 2023

jmunroe mentioned this issue Nov 16, 2023

reorganize environment images into a common subdirectory #28

Merged

sgibson91 mentioned this issue Nov 20, 2023

First pass of parallel repo2docker builds of images in subfolders #34

Merged

sgibson91 mentioned this issue Nov 21, 2023

Finalise the repo2docker workflow #36

Merged

This was referenced Nov 29, 2023

Add docs on how to add new images to the repo #41

Merged

Add scripts/workflows that will comment the image name and tag of an image that was built and pushed by repo2docker #42

Merged

jmunroe mentioned this issue Jan 8, 2024

[Goal] Build a Community of Hub Champions #10

Open

jmunroe closed this as completed Jan 8, 2024
