This repository has been archived by the owner on Jun 4, 2021. It is now read-only.

Allow pushing when layers are from a remote location (RBE) #95

Open
ittaiz opened this issue Aug 16, 2018 · 11 comments

Comments

@ittaiz

ittaiz commented Aug 16, 2018

Hi,
I’m not a container expert so I might be way off but I’d really like to be able to use the pusher from a machine which doesn’t have the layers on it.

Context:
I’m using GCB+RBE+Bazel (rules_docker) to build and test our code.
We’re currently integrating publishing from rules_docker and its current assumption is that I’ve built locally and so the layers are available.
I’d like to be able to run the pusher on different machines and so parallelize publishing as well as be able to start before the build for the entire repo finishes.
ResultStore gives us the ability to poll a bazel build and know what’s ready as well as the URLs on RBE.

Is there a technical option of using the pusher with URLs from RBE?

cc @nlopezgi since he often has very wise insights :)

@ittaiz
Author

ittaiz commented Aug 20, 2018

@nlopezgi are you by any chance relevant? If not, do you know who might be?

@nlopezgi

This sounds like a valid use case, but I don't know enough yet about how container registry works to know if it's feasible/simple to do this. Maybe @mattmoor or @dlorenc can comment about this?

@ittaiz
Author

ittaiz commented Aug 28, 2018

@mattmoor @dlorenc any thoughts?

@ittaiz
Author

ittaiz commented Sep 12, 2018

@nlopezgi I think @mattmoor is working on other stuff. From the contributors view it seems @KaylaNguyen and @dekkagaijin are very active. Any chance you can contribute here, or point me to the relevant person?
Thanks!

@mattmoor
Contributor

Sorry, GitHub notifications now get lost in the noise of my day job (Knative), so I missed this.

Wearing my idealist hat
IIUC what you are asking for is effectively distributed execution of the push, which is contrary to my mental model of Blaze's distributed execution, which must be hermetic and happens in a network jail.

Wearing my pragmatist hat
I probably wouldn't try to make a single push straddle multiple machines (actions), that seems like it's asking for trouble. Instead what I'd probably do is leverage push incrementality to pre-push layers in a distributed fashion so that the ultimate push never needs to download them because existence checks succeed.
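The existence checks mentioned above are, concretely, HEAD requests against the registry's v2 blob endpoint: a 200 means the layer is already there and the push can skip the upload. A minimal sketch, assuming hypothetical registry/repository names (real registries such as GCR also require a Bearer token, omitted here):

```python
import urllib.error
import urllib.request


def blob_url(registry: str, repository: str, digest: str) -> str:
    """Build the Docker registry v2 API URL for a layer blob."""
    return f"https://{registry}/v2/{repository}/blobs/{digest}"


def blob_exists(registry: str, repository: str, digest: str) -> bool:
    """HEAD the blob endpoint; HTTP 200 means the layer is already present,
    so an incremental push never needs to upload (or download) it."""
    request = urllib.request.Request(
        blob_url(registry, repository, digest), method="HEAD"
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False


if __name__ == "__main__":
    # Placeholder names/digest; requires network access and auth in practice.
    print(blob_exists("gcr.io", "my-project/my-image", "sha256:" + "0" * 64))
```

Pre-pushing each layer in a distributed fashion would amount to running the upload path for any digest where this check returns False.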

The basic way of doing this would be to wrap individual layers in a dummy image and leverage that to get the layer published to the registry. The problem is knowing where to publish, and when it is appropriate to do so (on every build?).

I've never played with aspects, but it is possible that they might allow you to walk the action graph (when executing a push) and decorate builds with the kind of actions described above, including where.

FWIW, we don't do anything special here internally. Basically, given build-to-build reproducibility and incremental uploads, this should only ever be a problem once per delta in the output. Granted even those deltas can be big.

@ittaiz
Author

ittaiz commented Sep 14, 2018 via email

@helenalt

@nlopezgi how do you recommend we proceed with this?

@nlopezgi

nlopezgi commented Sep 20, 2018

It sounds like the work needed to get this use case supported is mostly related to making the script in this repo work with contents that are in the CAS. I don't have enough expertise wrt how container_push works or how the CAS works to be able to provide many meaningful insights (but I'll be happy to comment on any design someone produces for this feature). I think if this is a use case that Wix wants supported, and it's one we want to prioritize, Wix would need to work with owners of container registry to figure out better what effort is required to build this feature (i.e., produce a design?).

@ittaiz
Author

ittaiz commented Sep 20, 2018 via email

@EricBurnett

EricBurnett commented Sep 25, 2018

Discussed this with Ittai today. Attempting to summarize my understanding (Ittai, correct anything I've messed up):

  • Wix currently uses GCB to run a bazel build that builds containers via rules_docker, and then as a subsequent GCB step, pushes images to a registry.
  • They do not like the serialization of having the "push" step follow after the entire build, when it could have happened as soon as the relevant tests pass.
  • Ittai is considering ways to streamline this so that pushes happen as near in time as possible to the relevant unit tests having passed:
    1. Writing a microservice that gets the information from the Build Results UI to know when the relevant steps are completed, and to subsequently pull the relevant files to trigger a push from there.
      • The challenge here is in getting the relevant information to some microservice that can then trigger the push. This involves marshalling (a) some top-level identifier for the docker image to push, which a tool can operate on to complete the push; (b) a script that embeds that data, e.g. produced by rules_docker itself; or (c) a list of digests and whatnot, so that the files can be pulled and laid out sufficiently that rules_docker can be executed to finish the push.
    2. Partitioning the build such that GCB runs multiple steps: an initial (minimal) bazel build, then any push steps, then a bazel build for the remainder of the work.
      • This partitions the bazel build into multiple pieces, making the e2e time longer (due to worse parallelism, etc.) and invalidating some assumptions Ittai currently holds about "single bazel".
      • This doesn't scale to building multiple containers.
    3. As a third option, I've suggested that rules_docker could consider supporting doing a push as a target within the build, rather than as a subsequent bazel run step.
      • This target could have as dependencies all the relevant tests that should pass before it gets executed, and/or could be set up to push unilaterally to some scratch registry - e.g. if a microservice is still going to be used to check out-of-band criteria, and then re-push from there to the appropriate output registry.
      • Note that this target would have side-effects, but would still be deterministic/idempotent/cacheable: once a specific image has been successfully pushed to a registry, it can be assumed it does not need to be pushed again.
      • I (naively) think that that's reasonable to allow/support from a rules_docker perspective. Though users should take care to ensure that this target does not get triggered outside of CI scenarios, since it would presumably fail when run by general developers from their own workstations. (For example, I think the target could be marked 'manual' so it had to be listed explicitly on command-lines to get executed in builds).
      • Note also that this requires the build runner to have permission to push to said registry. (For RBE this is possible today with GCR; unclear how easily it could be done for other registries that need other credentials). Otherwise, the action being marked 'no-remote' and run on the GCB worker directly may suffice.
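Option iii might look roughly like the following BUILD fragment, assuming rules_docker's container_push rule; the target names, image label, and registry path are placeholders, and note that container_push is normally invoked with bazel run, so push-at-build-time would still need additional support from the rules:

```python
load("@io_bazel_rules_docker//container:container.bzl", "container_push")

# Hypothetical push target (option iii). The "manual" tag keeps it out of
# wildcard builds like //..., so it only runs when named explicitly in CI.
container_push(
    name = "push_app",
    image = ":app_image",  # some container_image target in this package
    format = "Docker",
    registry = "gcr.io",
    repository = "my-project/app",
    tag = "latest",
    tags = ["manual"],
)
```

Test targets that must pass first could be modeled as dependencies of the CI invocation that builds this target, per the sub-bullets above.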

(All of these options have drawbacks; having written them out I'm not necessarily sure which is best for Ittai to pursue).

In any case, I consider the crux of this problem to be around what information is passed and where - questions on tooling should follow after figuring out what model seems most reasonable. (E.g. it'd be relatively straightforward for someone to write a tool that pulls layers from the RBE CAS directly, if they knew which digests to pull and what to do with them after.)
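For the "pull layers from the RBE CAS directly" tool mentioned above, the Remote Execution API reads blobs over the ByteStream API, addressed by a resource name of the form "{instance_name}/blobs/{hash}/{size_bytes}". A tiny sketch of that naming convention (instance name below is a placeholder):

```python
def cas_read_resource(instance_name: str, sha256_hex: str, size_bytes: int) -> str:
    """Build the ByteStream read resource name for a CAS blob, per the
    Remote Execution API convention {instance_name}/blobs/{hash}/{size}."""
    return f"{instance_name}/blobs/{sha256_hex}/{size_bytes}"


# A real tool would pass this name to ByteStream.Read over gRPC against the
# RBE endpoint and stream the layer bytes to disk before laying them out for
# the pusher.
```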

@nlopezgi

About option iii: it should be feasible to provide a way to push with rules_docker using bazel build. Wrt authentication, please see bazelbuild/rules_docker#526 for an open thread on providing better support for this kind of use case (@ittaiz, please comment specifically on whether a rule that reads secrets from the environment and outputs a file to override $HOME/.docker/config.json would work for your use case). Please let me know if option iii is what you think will work best so I can plan accordingly to work on the features. (But, iiuc, it should not be hard to implement exposing the push script to execute with bazel build via an additional output of the push rule, so if anyone wants to volunteer a PR to rules_docker to do this it would be great!)
