Setting up Github actions and repo2docker for multiple images #25

Closed
1 of 2 tasks
Tracked by #13
jmunroe opened this issue Nov 14, 2023 · 16 comments

@jmunroe
Collaborator

jmunroe commented Nov 14, 2023

Community maintained images currently are based on a template where a GitHub action builds one docker image per repository.

  • Is it possible to set up GitHub actions to allow for one docker image PER repository subdirectory?
  • Can these community maintained images import from each other?

This kind of functionality is already available in the jupyter-docker-stacks and pangeo-docker projects. Can it be set up within this repository so that more than one image environment can be maintained in the same repository?

Or, is it a built-in requirement of how repo2docker works that there is a 1:1 mapping between the repository and the docker image?

@jmunroe
Collaborator Author

jmunroe commented Nov 14, 2023

@sgibson91 I am hoping you'll either be interested in taking up this issue or #21 during Sprint 3. I think it may be too much to tackle both #21 and #25 in Sprint 3 but I invite your input on which task makes more sense for you to be involved in.

@sgibson91
Member

Is it possible to set up GitHub actions to allow for one docker image PER repository subdirectory? [...] This kind of functionality is already available in the jupyter-docker-stacks and pangeo-docker projects. Can it be set up within this repository so that more than one image environment can be maintained in the same repository?

There's no technical reason why we can't do this. Is the scope of this task for this repo only? I ask because multiple images in a repo may be a complex thing for communities, especially if they're new to the concept. But I'm happy for it to be just this one for now.

Or, is it a built-in requirement of how repo2docker works that there is a 1:1 mapping between the repository and the docker image?

I don't think repo2docker knows the concept of repos, it only knows the concept of a filesystem. So the command would look like repo2docker <subdir> instead of repo2docker .
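As a rough illustration of the point above, here is a minimal sketch of assembling a per-subdirectory repo2docker invocation. The subdirectory path and image name are hypothetical placeholders; `--no-run` and `--image-name` are existing repo2docker CLI flags, but this is only one way the command could be put together, not what any workflow in this repo necessarily does.

```python
import shlex


def build_command(subdir: str, image_name: str) -> list[str]:
    """Return the argv for building one image from one subdirectory."""
    return [
        "repo2docker",
        "--no-run",                  # build only, do not start a container afterwards
        "--image-name", image_name,  # name given to the resulting image
        subdir,                      # repo2docker only sees this path, not the whole repo
    ]


# Hypothetical subdirectory and image name, for illustration only.
cmd = build_command("images/image1", "example-org/example-image1")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would actually run the build, assuming repo2docker and Docker are installed.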

Can these community maintained images import from each other?

Traditionally, if you wanted to build a Docker image using another one as a base, you had to provide a Dockerfile with a FROM statement in it. And that is already "here be dragons" territory for repo2docker. This is why jupyter-docker-stacks and pangeo-docker don't actually build their images using repo2docker, but docker directly.

There was recently some work done in repo2docker that would permit defining a base image to use jupyterhub/repo2docker#909 - so hooray, we can continue using repo2docker! However, I'm not sure if that work has percolated upwards such that the setting is exposed in the repo2docker action we use in GitHub Actions. Either the action has an ability to pass arbitrary args to the repo2docker command which will solve our problem, or some upstream dev work in the action will be required to support this. A little bit of research needed.

@jmunroe
Collaborator Author

jmunroe commented Nov 15, 2023

Understood about the complexity of having one repo2docker workflow build an image that in turn would be the base image for another repo2docker image, especially if they exist in the same repository. Let's exclude that requirement from this issue.

The scope is for this repo only. For the purposes of this Showcase hub, I want to be able to demonstrate a few different repositories, and it seems confusing to me to create a separate repo for each one. But I will commit to each repo being defined entirely from the contents of a subdirectory (following the specification given here) with any image dependency (FROM <base_image> type usage) only being used if that base_image is fully defined outside of this repo.

It is exactly because I know I can use repo2docker <sub_dir> that I was hoping this would not be difficult. The challenge for me is getting the GitHub action set up correctly.

I am imagining a file structure for this repo like

README.md
.github/workflows/
handbook/
  *.yml
  *.md
  *.ipynb
images/
  image1/
    environment.yml
  image2/
    Dockerfile
  image3/
    install.R
    apt.txt
  image4/
    requirements.txt
    postBuild

I can see that this could be done by having a separate, dedicated GitHub action for each image. But I also suspect that with some sort of arg passing there may be a way to set it up so that one GitHub action is triggered and each imageN/ subdirectory is built independently. All images can use the same quay.io secrets and be pushed to the same container registry.

There doesn't appear to be a way of setting the image name within the repo2docker configuration specification (a possible upstream contribution?). But I think using the directory name and reponame so images get called something like {QUAYID}/{REPONAME}-{subdir} might be reasonable (e.g., quay.io/2i2c/community-showcase-image1). It also could be fine if some configuration of the GitHub action needs to occur if a new image subdirectory is created.
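To make the naming idea concrete, here is a small sketch that derives {QUAYID}/{REPONAME}-{subdir} names from the subdirectories found under images/. The helper names `image_name` and `discover_images` are made up for this example, and the directory tree is created in a temporary folder purely so the snippet runs on its own.

```python
import tempfile
from pathlib import Path


def image_name(registry_org: str, repo: str, subdir: str) -> str:
    """Combine registry/org, repo name and subdirectory into one image name."""
    return f"{registry_org}/{repo}-{subdir}"


def discover_images(images_root: Path) -> list[str]:
    """List the image subdirectories directly under images_root, sorted by name."""
    return sorted(p.name for p in images_root.iterdir() if p.is_dir())


# Build a throwaway copy of the proposed layout so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "images"
    for name in ("image1", "image2"):
        (root / name).mkdir(parents=True)
    names = [
        image_name("quay.io/2i2c", "community-showcase", subdir)
        for subdir in discover_images(root)
    ]

print(names)
```

With this convention, adding a new image subdirectory would not need any extra naming configuration, assuming the registry organisation and repo name stay fixed.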

@sgibson91 is this specified clearly enough for you? If there is any confusion, please put something in my calendar so we can chat about it.

@sgibson91
Member

sgibson91 commented Nov 16, 2023

There doesn't appear to be a way of setting the image name within the repo2docker configuration specification

You can set the image name, both in repo2docker and the repo2docker-action, but it is a command line argument, not something in the config spec:

is this specified clearly enough for you?

Yes. I have a couple of research questions I can pursue on my own (indeed around figuring out arg passing so we only require one workflow file for the repo, rather than a workflow file per image subdir), but other than that I think I can make progress on this.

@sgibson91
Member

Bookmarking some info I gathered on changing the base image for repo2docker. I know this is no longer a requirement but wanted to make sure the record is kept somewhere.

https://2i2c.slack.com/archives/CKJS000F4/p1700138248103149

  • CLI invocation: repo2docker --Repo2Docker.base_image IMAGE_NAME:TAG .
  • Alternatively, include a repo2docker_config.py file in the folder containing c.Repo2Docker.base_image = "IMAGE_NAME:TAG" (the value must be a quoted string, since the config file is Python)

@sgibson91
Member

sgibson91 commented Nov 21, 2023

Successful matrix deployment based on files changed in sub-paths!

Now to add the repo2docker bit in
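For anyone reading along, the idea of a matrix driven by changed sub-paths can be sketched roughly as follows. This assumes the changed file paths are already available as a list (for example from a diff step) and that images live under images/<name>/; the function name `changed_images` and the sample paths are hypothetical, and the actual workflow may well do this differently.

```python
import json
from pathlib import PurePosixPath


def changed_images(changed_files, images_root="images"):
    """Return the sorted, de-duplicated image subdirectories touched by a change."""
    images = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        # Only count files that sit inside images/<name>/...
        if len(parts) >= 3 and parts[0] == images_root:
            images.add(parts[1])
    return sorted(images)


# Hypothetical list of files changed in a commit.
files = [
    "images/image1/environment.yml",
    "images/image3/apt.txt",
    "images/image3/install.R",
    "README.md",
]

# JSON in the shape a job matrix could consume, one entry per image to rebuild.
matrix = json.dumps({"image": changed_images(files)})
print(matrix)
```

Only image1 and image3 end up in the matrix here, so untouched images are not rebuilt.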

@sgibson91
Member

Tomorrow, I'll add a section to the readme documenting how to add an image to the repo given the current workflow and the assumptions it makes.

@jmunroe
Collaborator Author

jmunroe commented Nov 21, 2023

Looking forward to trying it out!

The quick test I did triggered the workflow and built a new image but didn't appear to actually push it to quay.io. I verified that the corresponding quay.io repository of the same name already existed but that didn't make a change. Perhaps there is a step I am missing to get the workflow to push to quay.io?

@sgibson91
Member

Yep - in my experience, merging a workflow and having it work first time is a holy grail quest 😄 I'll dig in today

@sgibson91
Member

Ok, so it seems there was something iffy about setting the NO_PUSH variable with a conditional as this commit got us to the pushing step.

However, there is then a permissions error even though I have given the bot account write permissions on the repo to push to https://github.com/2i2c-org/community-showcase/actions/runs/6956812443/job/18928433370?pr=40#step:6:3849

@sgibson91
Member

I think there is now something up with the credentials (but I don't know what) as I tried to add an explicit login step before running the repo2docker action (which should not be required) and that failed as well

@sgibson91
Member

I forgot secrets are not available in PRs

@sgibson91
Member

sgibson91 commented Nov 28, 2023

If jupyterhub/repo2docker-action#107 is merged, I will be able to achieve a workflow where images are built in both PRs and pushes to the default branch, but are only pushed to the registry when the PR is merged to the default branch, without having duplicate workflow files.
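The build-everywhere, push-only-on-default-branch rule can be sketched as a small decision function. This is just an illustration of the logic, assuming the default branch is called main; `GITHUB_EVENT_NAME` and `GITHUB_REF` are standard GitHub Actions environment variables, while `should_push` is a name invented for this example.

```python
import os


def should_push(event_name: str, ref: str, default_branch: str = "main") -> bool:
    """Push to the registry only for push events on the default branch."""
    return event_name == "push" and ref == f"refs/heads/{default_branch}"


# Inside a workflow run these variables are set by GitHub; elsewhere they are empty.
event_name = os.environ.get("GITHUB_EVENT_NAME", "")
ref = os.environ.get("GITHUB_REF", "")

# Express the result as a NO_PUSH-style flag: "true" means build without pushing.
no_push = "false" if should_push(event_name, ref) else "true"
print(f"NO_PUSH={no_push}")
```

A pull request event therefore always builds without pushing, which also sidesteps the missing-secrets problem in PRs.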

@jmunroe
Collaborator Author

jmunroe commented Nov 28, 2023

I have had success in using these new actions to make modifications to images and maintain multiple repo2docker based images from the same repository. It is working as I expect!

One last step in my workflow is to recover the tag that is automatically generated in the build process and pushed to quay.io. I can manually look it up in either the GitHub action log, or go to quay.io for that image. Would it be possible to extract this tag from the log and make it more easily discoverable? I don't know "where" this information would go ... I most immediately need it when I am spinning up a new instance in the hub and use the 'Other...' image dialog to verify that the new image has the new functionality I intended. Here's the workflow I am currently using:

  1. An image needs a new/updated package, extension, or some other feature added to it.
  2. Clone the community-showcase hub within an existing JupyterHub instance
  3. Make changes in the configuration files for building the image and commit the change.
  4. Push the changes back to GH and merge into the main branch to determine
    a. if the modified image actually built without errors
    b. what the newly assigned tag is for the image that has been pushed to quay.io
  5. [ If the build-but-not-push option for a PR was set up, I suppose I could first push to a PR, wait for the build to occur without error, then merge into main, then wait for the build to occur again -- but that seems like an extra step I am not sure I would actually want to do! ]
  6. Browse to https://showcase.2i2c.cloud and select a custom image 'Other...' and test this newly built image.
  7. Go back to step 3 and make another configuration change because I almost invariably missed something
  8. Having verified that the new image is working as expected, send a note to my community to use this new image tag via the Other... UI (if it requires feedback from only a few people) or make a PR on the infrastructure repo to update the image under the ProfileList so that the image is available to all hub users.

In some future state, I can imagine an improvement to the ProfileList that goes and looks up recent tags for a given image.

Since the change, build, test loop for making a change to an image repository is slow, I also experimented with building and testing changes to an image on a local computer using repo2docker. With cached build files this can be a faster process when I am making configuration changes to solve a problem that I do not already know the full solution to.

@sgibson91
Member

I think the simplest implementation to unearth the built and pushed tag would be to:

  1. Output the tag from the build command into the GitHub Actions context
  2. Within this context, make some GitHub API calls to leave a comment on the PR that made the change to the image, reporting the new image tag

This will be a little bit like the workflow in the infrastructure repo that extracts the GitHub Actions run that was triggered by a merge into the default branch, and comments on the PR the merge commit came from with a link to the running workflow triggered by the merge.
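A rough sketch of those two steps, under stated assumptions: step outputs are written as name=value lines to the file that GitHub Actions exposes via `GITHUB_OUTPUT`, and PR comments are created through the REST endpoint for issue comments. The helper names, the tag value and the PR number below are hypothetical, and the request is only assembled here, not sent.

```python
import json
import os
import tempfile


def write_output(name: str, value: str, output_path: str) -> None:
    """Append a name=value line, the format GitHub Actions reads step outputs from."""
    with open(output_path, "a", encoding="utf-8") as fh:
        fh.write(f"{name}={value}\n")


def build_comment_request(owner, repo, pr_number, image, tag):
    """Return the URL and JSON body for a PR comment announcing the new tag."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues/{pr_number}/comments"
    body = json.dumps({"body": f"New image pushed: `{image}:{tag}`"})
    return url, body


# Outside of a workflow run GITHUB_OUTPUT is unset, so fall back to a temp file.
output_path = os.environ.get("GITHUB_OUTPUT") or os.path.join(
    tempfile.mkdtemp(), "github_output.txt"
)

# Hypothetical tag and PR number, for illustration only.
write_output("image_tag", "abc1234", output_path)
url, body = build_comment_request(
    "2i2c-org", "community-showcase", 40,
    "quay.io/2i2c/community-showcase-image1", "abc1234",
)
print(url)
print(body)
```

Sending the request would additionally need an authenticated POST with a token that has permission to comment on the PR.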

@jmunroe
Collaborator Author

jmunroe commented Jan 8, 2024

Based on @sgibson91's efforts, I think we can mark this issue as 'Completed'!

@jmunroe jmunroe closed this as completed Jan 8, 2024