Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DO NOT MERGE: buildah vendor treadmill #13808

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

edsantiago
Copy link
Collaborator

Followup to Cabal discussion 2022-04-07, re: letting us vendor in buildah on a moment's notice.

Instead of cron'ing this, let's see if this works: keep this PR open perpetually, with daily updates to get latest podman and buildah.

See individual commit messages for details.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 7, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 7, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: edsantiago

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 7, 2022
@edsantiago edsantiago added do-not-merge/blocked-paths Indicates that a PR should not merge because it touches files in blocked paths. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/cherry-pick-not-approved Indicates that a PR is not yet approved to merge into a release branch. do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. labels Apr 7, 2022
@edsantiago
Copy link
Collaborator Author

The first commit is the important one. This one must be hand-crafted by a human, there is no possible way to automate it. This is the one that, in case of emergency vendor, can be grabbed into a real PR.

The second commit is a throwaway, ignore it. That's just automated fetch-the-latest-buildah.

@openshift-ci openshift-ci bot removed the do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. label Apr 7, 2022
@edsantiago
Copy link
Collaborator Author

bud test failed, remote only, because the new --identity-label=false option is not getting passed through. Probably an easy fix, but not this late in my day.

This is still a proof-of-concept PR, and my feeling so far is positive: it has quickly identified multiple problems in the buildah merge that are easy to solve iteratively. A nightly cron job, or a non-gating step in buildah CI, would be utterly useless because neither one would have the crucial first-commit. Without that, there's no way to make it past the first or second or third failure.

@rhatdan
Copy link
Member

rhatdan commented Apr 8, 2022

I just pushed a fix for this.

@edsantiago edsantiago force-pushed the vendor_buildah_latest branch 2 times, most recently from 4c3987c to a58b300 Compare April 8, 2022 17:43
@Luap99
Copy link
Member

Luap99 commented Apr 11, 2022

How is this PR regularly updated?

@edsantiago
Copy link
Collaborator Author

By a human, presumably me. There is actually no other way: this is not possible to solve via cron or PRs.

I am working on a script to automate as much of it as possible.

@edsantiago edsantiago changed the title [WIP] constant blend of latest-podman + latest-buildah DO NOT MERGE: buildah vendor treadmill Apr 12, 2022
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 12, 2022
@edsantiago edsantiago marked this pull request as draft April 13, 2022 11:51
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 13, 2022
@edsantiago
Copy link
Collaborator Author

Followup, because I don't want anyone to think this is abandonware: it's not. I'm running my script at least twice a day, one of those times in the evening. One of the nice features I added is:

+---> Podman has bumped, but Buildah is unchanged. There's probably not much point to testing this.

So I'm choosing to save CI cycles by not re-pushing today. Should anyone need to re-vendor buildah tomorrow or this weekend (when I'm unavailable), just cherry-pick the first commit from this PR. It's still valid.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 15, 2022
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 18, 2022
@edsantiago
Copy link
Collaborator Author

Yippee, script has caught a bug! containers/buildah#3919 breaks when vendoring into podman:

+---> Running 'make' to confirm that podman builds cleanly...
CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go build \
        -mod=vendor  \
        -ldflags '-X github.com/containers/podman/v4/libpod/define.gitCommit=3bf05d3113df29addd392032797f4fdf5dae56b7 -X github.com/containers/podman/v4/libpod/define.buildInfo=1650539506 -X github.com/containers/podman/v4/libpod/config._installPrefix=/usr/local -X github.com/containers/podman/v4/libpod/config._etcDir=/usr/local/etc -X github.com/containers/common/pkg/config.additionalHelperBinariesDir= ' \
        -tags "   selinux systemd  exclude_graphdriver_devicemapper seccomp" \
        -o bin/podman ./cmd/podman
# github.com/containers/podman/v4/libpod
libpod/events.go:18:3: cannot use r.config.Engine.EventsLogFileMaxSize (type "github.com/containers/common/pkg/config".eventsLogMaxSize) as type uint64 in field value
make: *** [Makefile:321: bin/podman] Error 2
buildah-vendor-treadmill-x: 'make' failed with new buildah. Cannot continue.

I could just force-push right now and let y'all see the breakage in CI, or I could just report it like I am right here. Or I could file an issue in containers/common since that's where the offending code seems to be?

Anyhow, @rhatdan et al, PTAL because this is not going to vendor into podman.

@edsantiago
Copy link
Collaborator Author

What the heck, I pushed anyway in case someone wants to pull & play.

@edsantiago edsantiago force-pushed the vendor_buildah_latest branch 2 times, most recently from c63d8a0 to b8aa0c7 Compare February 29, 2024 15:56
Copy link

Cockpit tests failed for commit c63d8a0. @martinpitt, @jelly, @mvollmer please check.

Copy link

Cockpit tests failed for commit b8aa0c7. @martinpitt, @jelly, @mvollmer please check.

Copy link

Cockpit tests failed for commit 34fcb47. @martinpitt, @jelly, @mvollmer please check.

Copy link

Ephemeral COPR build failed. @containers/packit-build please check.

@edsantiago edsantiago force-pushed the vendor_buildah_latest branch 2 times, most recently from 0078ee8 to c39b80f Compare April 25, 2024 11:24
@edsantiago edsantiago force-pushed the vendor_buildah_latest branch 2 times, most recently from c1a3213 to 31f1fb8 Compare May 22, 2024 15:20
@edsantiago
Copy link
Collaborator Author

@containers/buildah-maintainers PTAL. Something in latest buildah is breaking podman int tests. As best I can tell, the common factor is that buildah is now re-pulling images that are already present:

old (good):
$ podman build ...
STEP 1 2 3 etc
new (bad):
$ podman build ...
Trying to pull .....  <<<<<<<<<< SHOULD NOT HAPPEN

(I may be wrong in my analysis. Maybe the common factor is something else I didn't notice)

@nalind
Copy link
Member

nalind commented May 22, 2024

@containers/buildah-maintainers PTAL. Something in latest buildah is breaking podman int tests. As best I can tell, the common factor is that buildah is now re-pulling images that are already present:

It looks like the image caching save/load sequence is changing the compression used for the image's layer, and thus the manifest, and thus the manifest's digest, and when build checks if the image in the registry has a different digest when compared to the one it has on disk, it sees that it is and pulls it again.

To verify, compare the RepoDigests values you get from podman inspect quay.io/libpod/alpine after freshly pulling the image, and after a podman save/podman rmi/podman load cycle.

@edsantiago
Copy link
Collaborator Author

Thank you. Looks like a reasonable analysis. How do we fix that?

@Luap99
Copy link
Member

Luap99 commented May 22, 2024

@containers/buildah-maintainers PTAL. Something in latest buildah is breaking podman int tests. As best I can tell, the common factor is that buildah is now re-pulling images that are already present:

It looks like the image caching save/load sequence is changing the compression used for the image's layer, and thus the manifest, and thus the manifest's digest, and when build checks if the image in the registry has a different digest when compared to the one it has on disk, it sees that it is and pulls it again.

To verify, compare the RepoDigests values you get from podman inspect quay.io/libpod/alpine after freshly pulling the image, and after a podman save/podman rmi/podman load cycle.

But nothing here changed that looking at the diff, it used too work fine and I cannot see any config/compression chnages here

What I do see however is containers/buildah#5478

if platform.OS == "" {
	platform.OS = runtime.GOOS
}

AFAIK the libimage code treats a non empty platform as must pull or something like that if I remember correctly
However I haven't verified that.

@nalind
Copy link
Member

nalind commented May 22, 2024

Hmm, this has been happening for a while, since I see it happening with both podman 4.9.4 and 5.0.0 as well, so it might not be the only cause.
But to your question, buildah uses a "dir:" location for the cache to avoid changing the image while caching it, and a custom "copy" test helper since it doesn't want to depend on having skopeo available at test time. But podman already expects skopeo to be present, so skopeo copy should be up to the task.

@nalind
Copy link
Member

nalind commented May 22, 2024

AFAIK the libimage code treats a non empty platform as must pull or something like that if I remember correctly However I haven't verified that.

I had suspected as much, but at most, that should be changing the pull policy to "if-newer" under the covers, and I'm pretty sure that's already the value that's being passed in.

@Luap99
Copy link
Member

Luap99 commented May 22, 2024

AFAIK the libimage code treats a non empty platform as must pull or something like that if I remember correctly However I haven't verified that.

I had suspected as much, but at most, that should be changing the pull policy to "if-newer" under the covers, and I'm pretty sure that's already the value that's being passed in.

Yeah not sure, but the fact is main is passing but this PR is not so the problem must be in the diff here and thankfully the diff is small so this is my best guess as of why. I can try to reproduce tomorrow.

@edsantiago
Copy link
Collaborator Author

What I do see however is containers/buildah#5478

if platform.OS == "" {
platform.OS = runtime.GOOS
}

Nice catch, @Luap99 . Removing those three lines (and the import runtime) appears to fix the problem.

@rhatdan
Copy link
Member

rhatdan commented May 22, 2024

That was added to make podman build --arch arm64 work

@Luap99
Copy link
Member

Luap99 commented May 23, 2024

That was added to make podman build --arch arm64 work

Sure but it fixed it in broken way, the default pull policy is ifmissing and not ifnewer which the code uses now due the well other not so great behavior in libimage.
And reading your PR your intention is to only set TARGETARCH not change the pull policy so this is definitely broken and must be fixed before the buildah release

This is a JUNK COMMIT from buildah-vendor-treadmill-x v0.3.

DO NOT MERGE! This is just a way to keep the buildah-podman
vendoring in sync. Refer to:

   https://github.com/containers/podman/wiki/Buildah-Vendor-Treadmill

Signed-off-by: Ed Santiago <[email protected]>
As you run --sync, please update this commit message with your
actual changes.

Changes since 2024-05-21:

Signed-off-by: Ed Santiago <[email protected]>
@edsantiago
Copy link
Collaborator Author

Tests now passing. This is safe to vendor into podman; no changes needed. Thank you @nalind for the quick fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bloat_approved Approve a PR in which binary file size grows by over 50k do-not-merge/blocked-paths Indicates that a PR should not merge because it touches files in blocked paths. do-not-merge/cherry-pick-not-approved Indicates that a PR is not yet approved to merge into a release branch. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants