Let FROM <base_image> in the Dockerfile template be configurable #909

Merged: 34 commits, Jun 9, 2023
Changes from 33 commits
Commits
592ccb7
Set ubuntu 20.04 as new base image
yuvipanda Jun 9, 2020
b724c02
Switch RSPM to use focal, not bionic
yuvipanda Jun 3, 2022
8086937
Don't hardcode ubuntu codename when getting r packages
yuvipanda Jun 22, 2022
20b0815
Make base_image configurable
yuvipanda Jun 25, 2022
f35a948
Set base image for default buildpack too
yuvipanda Jul 23, 2022
446e678
Add test fixture for setting base image
yuvipanda Jul 23, 2022
9ff14c1
Remove accidental import vscode inserted
yuvipanda Jul 24, 2022
b58fd15
Fix rspm test for VERSION_CODENAME
yuvipanda Jul 25, 2022
7eb143e
Set base_image in a few more places
yuvipanda Jul 25, 2022
c21374d
Fix typo
yuvipanda Jul 26, 2022
f776d8e
Use full image tag
yuvipanda Jul 26, 2022
4d83abc
Add note about what base images are supported
yuvipanda Jul 26, 2022
389334a
Added missing base_image argument
kardasbart Dec 10, 2022
90c375c
Add setuptools to built image
yuvipanda Dec 10, 2022
aae6a71
Try using py3-pip when installing in Alpine
yuvipanda Dec 10, 2022
8a359f9
Readded setuptools
kardasbart Dec 11, 2022
c3d2575
Fixd shell command for buildpacks/r
kardasbart Dec 11, 2022
76731c8
Added setuptools
kardasbart Dec 11, 2022
21ad4ca
Bump linux alpine to 3.17 due to setuptools bug
kardasbart Dec 11, 2022
52e8076
Renamed branch to main
kardasbart Dec 11, 2022
b85df8f
Refactored R command
kardasbart Dec 11, 2022
0b9154a
Updated master/main branches error info
kardasbart Dec 11, 2022
2b17eca
Merge pull request #6 from kardasbart/feat/new-base
yuvipanda Dec 12, 2022
5e75258
Document base image
choldgraf Dec 13, 2022
4eaf096
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 13, 2022
4a5ff3c
Add note on reproducibility
choldgraf Dec 13, 2022
e956501
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Dec 13, 2022
99125ab
Merge branch 'main' into feat/new-base
yuvipanda Jan 7, 2023
0a3846f
Support installing RStudio on distros without openssl1.1
yuvipanda Jan 7, 2023
01c142b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 7, 2023
a6ccf81
Fix mismatched quotes
yuvipanda Jan 7, 2023
f848c81
Install libssl-dev unconditionally for R
yuvipanda Mar 23, 2023
5894f63
Merge remote-tracking branch 'upstream/main' into feat/new-base
yuvipanda Mar 23, 2023
e1051c3
Merge branch 'main' into feat/new-base
minrk Jun 9, 2023
6 changes: 3 additions & 3 deletions Dockerfile
@@ -1,8 +1,8 @@
# syntax = docker/dockerfile:1.3
ARG ALPINE_VERSION=3.16
ARG ALPINE_VERSION=3.17
FROM alpine:${ALPINE_VERSION}

RUN apk add --no-cache git python3 python3-dev py-pip build-base
RUN apk add --no-cache git python3 python3-dev py3-pip py3-setuptools build-base

# build wheels in first image
ADD . /tmp/src
@@ -16,7 +16,7 @@ RUN mkdir /tmp/wheelhouse \
FROM alpine:${ALPINE_VERSION}

# install python, git, bash, mercurial
RUN apk add --no-cache git git-lfs python3 py-pip bash docker mercurial
RUN apk add --no-cache git git-lfs python3 py3-pip py3-setuptools bash docker mercurial

# install hg-evolve (Mercurial extensions)
RUN pip3 install hg-evolve --user --no-cache-dir
33 changes: 33 additions & 0 deletions docs/source/howto/base_image.md
@@ -0,0 +1,33 @@
# Change the base image used by Docker

You can change the base image used in the `Dockerfile` that repo2docker generates to build images.
This is equivalent to changing the `FROM <base_image>` in the Dockerfile.

To do so, use the `base_image` traitlet when invoking `repo2docker`.
Note that this is not configurable by individual repositories; it is set when you invoke the `repo2docker` command.

```{note}
By default repo2docker builds on top of the `buildpack-deps:bionic` base image, an Ubuntu-based image.
```
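
As a rough illustration of setting the traitlet at invocation time, the sketch below assembles a `repo2docker` command line in Python. It assumes the usual traitlets `--Class.trait=value` syntax applies to the `Repo2Docker` application class; the repository URL and the `focal` tag are placeholders, not recommendations.

```python
import shlex


def repo2docker_command(repo, base_image):
    """Build an argv list that invokes repo2docker with a custom base image."""
    return [
        "repo2docker",
        # Assumed traitlets-style override of the `base_image` trait
        f"--Repo2Docker.base_image={base_image}",
        repo,
    ]


cmd = repo2docker_command(
    "https://github.com/example/repo",  # hypothetical repository URL
    "docker.io/library/buildpack-deps:focal",  # placeholder base image
)
# Print a copy-pasteable shell command
print(shlex.join(cmd))
```

The same value could presumably be set in a traitlets configuration file instead of on the command line.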

## Requirements for your base image

`repo2docker` will only work if a specific set of packages exists in the base image.
Only images that match the following criteria are supported:

- Ubuntu based distributions (minimum `18.04`)
- Contains a set of base packages installed with [the `buildpack-deps` image family](https://hub.docker.com/_/buildpack-deps).

Other images _may_ work, but are not officially supported.

## This will affect reproducibility 🚨

Changing the base image may have an impact on the reproducibility of repositories that are built.
There are **no guarantees that repositories will behave the same way as other repo2docker builds if you change the base image**.
For example, these are two scenarios that would make your repositories non-reproducible:

- **Your base image is different from `Ubuntu:bionic`.**
If you change the base image in a way that is different from repo2docker's default (the Ubuntu `bionic` image), then repositories that **you** build with repo2docker may be significantly different from those that **other** instances of repo2docker build (e.g., those from [`mybinder.org`](https://mybinder.org)).
- **Your base image changes over time.**
If you choose a base image that changes its composition over time (e.g., an image provided by some other community), then it may cause repositories built with your base image to change in unpredictable ways.
We recommend choosing a base image that you know to be stable and trustworthy.
1 change: 1 addition & 0 deletions docs/source/howto/index.rst
@@ -15,3 +15,4 @@ Select from the pages listed below to get started.
lab_workspaces
jupyterhub_images
deploy
base_image
21 changes: 19 additions & 2 deletions repo2docker/app.py
@@ -447,6 +447,21 @@ def _dry_run_changed(self, change):
""",
)

base_image = Unicode(
"docker.io/library/buildpack-deps:bionic",
config=True,
help="""
Base image to use when building docker images.

Only images that match the following criteria are supported:
- Ubuntu based distributions, minimum 18.04
- Contains set of base packages installed with the buildpack-deps
image family: https://hub.docker.com/_/buildpack-deps

Other images *may* work, but are not officially supported.
""",
)

def get_engine(self):
"""Return an instance of the container engine.

@@ -793,12 +808,14 @@ def build(self):

with chdir(checkout_path):
for BP in self.buildpacks:
bp = BP()
bp = BP(base_image=self.base_image)
if bp.detect():
picked_buildpack = bp
break
else:
picked_buildpack = self.default_buildpack()
picked_buildpack = self.default_buildpack(
base_image=self.base_image
)

picked_buildpack.platform = self.platform
picked_buildpack.appendix = self.appendix
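
The selection logic in this hunk can be sketched in isolation as follows. The classes here are simplified stand-ins written for illustration, not repo2docker's real buildpacks, and `pick_buildpack` is a hypothetical helper; the point is only that every candidate, and the fallback, now receives `base_image`.

```python
class BuildPack:
    """Stand-in for a buildpack; it only stores the base image."""

    def __init__(self, base_image):
        self.base_image = base_image

    def detect(self):
        # The stand-in never claims a repository
        return False


class PythonBuildPack(BuildPack):
    """Stand-in that inherits the 'never detects' behaviour."""


class DefaultBuildPack(BuildPack):
    """Stand-in for the fallback buildpack."""


def pick_buildpack(buildpacks, default_buildpack, base_image):
    """Return the first buildpack that detects the repo, else the default."""
    for BP in buildpacks:
        bp = BP(base_image=base_image)
        if bp.detect():
            return bp
    # No buildpack matched: fall back, still passing the base image along
    return default_buildpack(base_image=base_image)


picked = pick_buildpack([PythonBuildPack], DefaultBuildPack, "buildpack-deps:focal")
print(type(picked).__name__, picked.base_image)
```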
29 changes: 23 additions & 6 deletions repo2docker/buildpacks/_r_base.py
@@ -14,9 +14,19 @@ def rstudio_base_scripts(r_version):
shiny_proxy_version = "1.1"
shiny_sha256sum = "80f1e48f6c824be7ef9c843bb7911d4981ac7e8a963e0eff823936a8b28476ee"

rstudio_url = "https://download2.rstudio.org/server/bionic/amd64/rstudio-server-2022.02.1-461-amd64.deb"
rstudio_sha256sum = (
"239e8d93e103872e7c6d827113d88871965f82ffb0397f5638025100520d8a54"
# RStudio server has different builds based on whether OpenSSL 3 or 1.1 is available in the base
# image. 3 is present in Jammy+, 1.1 until then. Instead of hardcoding URLs based on distro, we actually
# check for the dependency itself directly in the code below. You can find these URLs in
# https://posit.co/download/rstudio-server/, toggling between Ubuntu 22 (for openssl3) vs earlier versions (openssl 1.1)
# you may forget about openssl, but openssl never forgets you.
rstudio_openssl3_url = "https://download2.rstudio.org/server/jammy/amd64/rstudio-server-2022.12.0-353-amd64.deb"
rstudio_openssl3_sha256sum = (
"a5aa2202786f9017a6de368a410488ea2e4fc6c739f78998977af214df0d6288"
)

rstudio_openssl1_url = "https://download2.rstudio.org/server/bionic/amd64/rstudio-server-2022.12.0-353-amd64.deb"
rstudio_openssl1_sha256sum = (
"bb88e37328c304881e60d6205d7dac145525a5c2aaaf9da26f1cb625b7d47e6e"
)
rsession_proxy_version = "2.0.1"

@@ -27,11 +37,18 @@ def rstudio_base_scripts(r_version):
# but here it's important because these recommend r-base,
# which will upgrade the installed version of R, undoing our pinned version
rf"""
curl --silent --location --fail {rstudio_url} > /tmp/rstudio.deb && \
apt-get update > /dev/null && \
if apt-cache search libssl3 > /dev/null; then \
RSTUDIO_URL="{rstudio_openssl3_url}" ;\
RSTUDIO_HASH="{rstudio_openssl3_sha256sum}" ;\
else \
RSTUDIO_URL="{rstudio_openssl1_url}" ;\
RSTUDIO_HASH="{rstudio_openssl1_sha256sum}" ;\
fi && \
curl --silent --location --fail ${{RSTUDIO_URL}} > /tmp/rstudio.deb && \
curl --silent --location --fail {shiny_server_url} > /tmp/shiny.deb && \
echo '{rstudio_sha256sum} /tmp/rstudio.deb' | sha256sum -c - && \
echo "${{RSTUDIO_HASH}} /tmp/rstudio.deb" | sha256sum -c - && \
echo '{shiny_sha256sum} /tmp/shiny.deb' | sha256sum -c - && \
apt-get update > /dev/null && \
apt install -y --no-install-recommends /tmp/rstudio.deb /tmp/shiny.deb && \
rm /tmp/*.deb && \
apt-get -qq purge && \
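
The pattern in this hunk (choose a build by OpenSSL availability, then verify the download against a matching checksum) can be sketched in Python. This is an illustration under assumptions, not the project's code: `select_build` and `sha256_matches` are hypothetical helpers, and the `has_libssl3` flag stands in for the shell check.

```python
import hashlib

# (url, sha256) pairs copied from the diff above
OPENSSL3_BUILD = (
    "https://download2.rstudio.org/server/jammy/amd64/rstudio-server-2022.12.0-353-amd64.deb",
    "a5aa2202786f9017a6de368a410488ea2e4fc6c739f78998977af214df0d6288",
)
OPENSSL1_BUILD = (
    "https://download2.rstudio.org/server/bionic/amd64/rstudio-server-2022.12.0-353-amd64.deb",
    "bb88e37328c304881e60d6205d7dac145525a5c2aaaf9da26f1cb625b7d47e6e",
)


def select_build(has_libssl3):
    """Pick the (url, sha256) pair matching the available OpenSSL version."""
    return OPENSSL3_BUILD if has_libssl3 else OPENSSL1_BUILD


def sha256_matches(data, expected_hex):
    """Return True if the SHA-256 digest of `data` equals `expected_hex`."""
    return hashlib.sha256(data).hexdigest() == expected_hex


url, expected = select_build(has_libssl3=True)
print(url)

# Demonstrate the checksum check on an in-memory payload rather than a download
payload = b"example payload"
digest = hashlib.sha256(payload).hexdigest()
print(sha256_matches(payload, digest), sha256_matches(payload + b"x", digest))
```

Keeping the URL and hash together as one pair makes it harder to verify a download against the checksum of the other build.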
12 changes: 9 additions & 3 deletions repo2docker/buildpacks/base.py
@@ -13,7 +13,7 @@

# Only use syntax features supported by Docker 17.09
TEMPLATE = r"""
FROM buildpack-deps:bionic
FROM {{base_image}}

# Avoid prompts from apt
ENV DEBIAN_FRONTEND=noninteractive
@@ -210,7 +210,6 @@ class BuildPack:
Specifically used for creating Dockerfiles for use with repo2docker only.

Things that are kept constant:
- base image
- some environment variables (such as locale)
- user creation & ownership of home directory
- working directory
@@ -220,9 +219,13 @@ class BuildPack:

"""

def __init__(self):
def __init__(self, base_image):
"""
base_image specifies the base image to use when building docker images
"""
self.log = logging.getLogger("repo2docker")
self.appendix = ""
self.base_image = base_image
self.labels = {}
if sys.platform.startswith("win"):
self.log.warning(
@@ -254,6 +257,8 @@ def get_base_packages(self):
# Utils!
"less",
"unzip",
# Gives us envsubst
"gettext-base",
}

def get_build_env(self):
@@ -521,6 +526,7 @@ def render(self, build_args=None):
appendix=self.appendix,
# For docker 17.09 `COPY --chown`, 19.03 would allow using $NBUSER
user=build_args.get("NB_UID", DEFAULT_NB_UID),
base_image=self.base_image,
)

@staticmethod
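
To make the effect of the `FROM {{base_image}}` placeholder concrete, here is a minimal sketch of the substitution. It deliberately uses plain string replacement from the standard library rather than the project's actual template rendering, and the `render` helper and shortened template are hypothetical.

```python
# Shortened stand-in for the Dockerfile template shown in the diff
TEMPLATE = """FROM {{base_image}}

# Avoid prompts from apt
ENV DEBIAN_FRONTEND=noninteractive
"""


def render(template, **values):
    """Replace each `{{name}}` placeholder with its value (no escaping, no logic)."""
    rendered = template
    for name, value in values.items():
        rendered = rendered.replace("{{" + name + "}}", value)
    return rendered


dockerfile = render(TEMPLATE, base_image="docker.io/library/buildpack-deps:bionic")
print(dockerfile)
```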
3 changes: 3 additions & 0 deletions repo2docker/buildpacks/legacy/__init__.py
@@ -16,6 +16,9 @@ class LegacyBinderDockerBuildPack:
This buildpack has been deprecated.
"""

def __init__(self, *args, **kwargs):
pass

def detect(self):
"""Check if current repo should be built with the Legacy BuildPack."""
log = logging.getLogger("repo2docker")
21 changes: 15 additions & 6 deletions repo2docker/buildpacks/r.py
@@ -191,6 +191,7 @@ def get_packages(self):
"libapparmor1",
"sudo",
"lsb-release",
"libssl-dev",
]

return super().get_packages().union(packages)
@@ -210,7 +211,10 @@ def get_rspm_snapshot_url(self, snapshot_date, max_days_prior=7):
# Construct a snapshot URL that will give us binary packages for Ubuntu Bionic (18.04)
if "upsi" in snapshots:
return (
"https://packagemanager.rstudio.com/all/__linux__/bionic/"
# Env variables here are expanded by envsubst in the Dockerfile, after sourcing
# /etc/os-release. This allows us to use distro specific variables here to get
# appropriate binary packages without having to hard code version names here.
"https://packagemanager.rstudio.com/all/__linux__/${VERSION_CODENAME}/"
+ snapshots["upsi"]
)
raise ValueError(
@@ -253,7 +257,10 @@ def get_devtools_snapshot_url(self):
# Hardcoded rather than dynamically determined from a date to avoid extra API calls
# Plus, we can always use packagemanager.rstudio.com here as we always install the
# necessary apt packages.
return "https://packagemanager.rstudio.com/all/__linux__/bionic/2022-01-04+Y3JhbiwyOjQ1MjYyMTU7NzlBRkJEMzg"
# Env variables here are expanded by envsubst in the Dockerfile, after sourcing
# /etc/os-release. This allows us to use distro specific variables here to get
# appropriate binary packages without having to hard code version names here.
return "https://packagemanager.rstudio.com/all/__linux__/${VERSION_CODENAME}/2022-06-03+Y3JhbiwyOjQ1MjYyMTU7RkM5ODcwN0M"

def get_build_scripts(self):
"""
@@ -333,16 +340,18 @@ def get_build_scripts(self):
rf"""
R RHOME && \
mkdir -p /etc/rstudio && \
echo 'options(repos = c(CRAN = "{cran_mirror_url}"))' > /opt/R/{self.r_version}/lib/R/etc/Rprofile.site && \
echo 'r-cran-repos={cran_mirror_url}' > /etc/rstudio/rsession.conf
EXPANDED_CRAN_MIRROR_URL="$(. /etc/os-release && echo {cran_mirror_url} | envsubst)" && \
echo "options(repos = c(CRAN = \"${{EXPANDED_CRAN_MIRROR_URL}}\"))" > /opt/R/{self.r_version}/lib/R/etc/Rprofile.site && \
echo "r-cran-repos=${{EXPANDED_CRAN_MIRROR_URL}}" > /etc/rstudio/rsession.conf
""",
),
(
"${NB_USER}",
# Install a pinned version of devtools, IRKernel and shiny
rf"""
R --quiet -e "install.packages(c('devtools', 'IRkernel', 'shiny'), repos='{self.get_devtools_snapshot_url()}')" && \
R --quiet -e "IRkernel::installspec(prefix='$NB_PYTHON_PREFIX')"
export EXPANDED_CRAN_MIRROR_URL="$(. /etc/os-release && echo {cran_mirror_url} | envsubst)" && \
R --quiet -e "install.packages(c('devtools', 'IRkernel', 'shiny'), repos=Sys.getenv(\"EXPANDED_CRAN_MIRROR_URL\"))" && \
R --quiet -e "IRkernel::installspec(prefix=Sys.getenv(\"NB_PYTHON_PREFIX\"))"
""",
),
]
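
The intended effect of reading `/etc/os-release` and expanding `${VERSION_CODENAME}` in the snapshot URL can be approximated in Python. This sketch only mimics the result of the shell expansion; `parse_os_release`, `expand_url`, and the sample file contents are hypothetical. Unlike `envsubst`, `safe_substitute` leaves unknown variables untouched instead of replacing them with an empty string.

```python
from string import Template


def parse_os_release(text):
    """Parse KEY=value lines in the style of /etc/os-release into a dict."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes
        values[key] = value.strip("\"'")
    return values


def expand_url(url, os_release_text):
    """Expand ${VAR} references in `url` using values from os-release text."""
    return Template(url).safe_substitute(parse_os_release(os_release_text))


# Sample contents resembling an Ubuntu 20.04 os-release file (illustrative only)
SAMPLE_OS_RELEASE = 'NAME="Ubuntu"\nVERSION_ID="20.04"\nVERSION_CODENAME=focal\n'

snapshot_url = (
    "https://packagemanager.rstudio.com/all/__linux__/${VERSION_CODENAME}/"
    "2022-06-03+Y3JhbiwyOjQ1MjYyMTU7RkM5ODcwN0M"
)
print(expand_url(snapshot_url, SAMPLE_OS_RELEASE))
```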
6 changes: 3 additions & 3 deletions repo2docker/contentproviders/git.py
@@ -45,10 +45,10 @@ def fetch(self, spec, output_dir, yield_output=False):
self.log.error(
f"Failed to check out ref {ref}", extra=dict(phase=R2dState.FAILED)
)
if ref == "master":
if ref == "master" or ref == "main":
msg = (
"Failed to check out the 'master' branch. "
"Maybe the default branch is not named 'master' "
f"Failed to check out the '{ref}' branch. "
f"Maybe the default branch is not named '{ref}' "
"for this repository.\n\nTry not explicitly "
"specifying `--ref`."
)
8 changes: 8 additions & 0 deletions tests/conftest.py
@@ -100,6 +100,14 @@ def run_test(args):
return run_test


@pytest.fixture()
def base_image():
"""
Base ubuntu image to use when testing specific BuildPacks
"""
return "buildpack-deps:bionic"


def _add_content_to_git(repo_dir):
"""Add content to file 'test' in git repository and commit."""
# use append mode so this can be called multiple times
8 changes: 4 additions & 4 deletions tests/unit/test_binder_dir.py
@@ -6,21 +6,21 @@


@pytest.mark.parametrize("binder_dir", ["binder", ".binder", ""])
def test_binder_dir(tmpdir, binder_dir):
def test_binder_dir(tmpdir, binder_dir, base_image):
tmpdir.chdir()
if binder_dir:
os.mkdir(binder_dir)

bp = buildpacks.BuildPack()
bp = buildpacks.BuildPack(base_image)
assert binder_dir == bp.binder_dir
assert bp.binder_path("foo.yaml") == os.path.join(binder_dir, "foo.yaml")


def test_exclusive_binder_dir(tmpdir):
def test_exclusive_binder_dir(tmpdir, base_image):
tmpdir.chdir()
os.mkdir("./binder")
os.mkdir("./.binder")

bp = buildpacks.BuildPack()
bp = buildpacks.BuildPack(base_image)
with pytest.raises(RuntimeError):
_ = bp.binder_dir
16 changes: 8 additions & 8 deletions tests/unit/test_buildpack.py
@@ -7,41 +7,41 @@
from repo2docker.utils import chdir


def test_legacy_raises():
def test_legacy_raises(base_image):
# check legacy buildpack raises on a repo that triggers it
with TemporaryDirectory() as repodir:
with open(pjoin(repodir, "Dockerfile"), "w") as d:
d.write("FROM andrewosh/binder-base")

with chdir(repodir):
bp = LegacyBinderDockerBuildPack()
bp = LegacyBinderDockerBuildPack(base_image)
with pytest.raises(RuntimeError):
bp.detect()


def test_legacy_doesnt_detect():
def test_legacy_doesnt_detect(base_image):
# check legacy buildpack doesn't trigger
with TemporaryDirectory() as repodir:
with open(pjoin(repodir, "Dockerfile"), "w") as d:
d.write("FROM andrewosh/some-image")

with chdir(repodir):
bp = LegacyBinderDockerBuildPack()
bp = LegacyBinderDockerBuildPack(base_image)
assert not bp.detect()


def test_legacy_on_repo_without_dockerfile():
def test_legacy_on_repo_without_dockerfile(base_image):
# check legacy buildpack doesn't trigger on a repo w/o Dockerfile
with TemporaryDirectory() as repodir:
with chdir(repodir):
bp = LegacyBinderDockerBuildPack()
bp = LegacyBinderDockerBuildPack(base_image)
assert not bp.detect()


@pytest.mark.parametrize("python_version", ["2.6", "3.0", "4.10", "3.99"])
def test_unsupported_python(tmpdir, python_version):
def test_unsupported_python(tmpdir, python_version, base_image):
tmpdir.chdir()
bp = PythonBuildPack()
bp = PythonBuildPack(base_image)
bp._python_version = python_version
assert bp.python_version == python_version
with pytest.raises(ValueError):
8 changes: 4 additions & 4 deletions tests/unit/test_cache_from.py
@@ -12,7 +12,7 @@
)


def test_cache_from_base(tmpdir):
def test_cache_from_base(tmpdir, base_image):
cache_from = ["image-1:latest"]
fake_log_value = {"stream": "fake"}
fake_client = MagicMock(spec=docker.APIClient)
@@ -21,7 +21,7 @@ def test_cache_from_base(tmpdir):

# Test base image build pack
tmpdir.chdir()
for line in BaseImage().build(
for line in BaseImage(base_image).build(
fake_client, "image-2", 100, {}, cache_from, extra_build_kwargs
):
assert line == fake_log_value
@@ -30,7 +30,7 @@ def test_cache_from_base(tmpdir):
assert called_kwargs["cache_from"] == cache_from


def test_cache_from_docker(tmpdir):
def test_cache_from_docker(tmpdir, base_image):
cache_from = ["image-1:latest"]
fake_log_value = {"stream": "fake"}
fake_client = MagicMock(spec=docker.APIClient)
@@ -42,7 +42,7 @@ def test_cache_from_docker(tmpdir):
with tmpdir.join("Dockerfile").open("w") as f:
f.write("FROM scratch\n")

for line in DockerBuildPack().build(
for line in DockerBuildPack(base_image).build(
fake_client, "image-2", 100, {}, cache_from, extra_build_kwargs
):
assert line == fake_log_value